Visual Testing & Regression Artifacts

This document explains the technical reasons behind visual differences observed in automated regression reports, particularly between local hardware rendering and CI software rendering.

The Core Challenge: Software vs. Hardware

The project's CI (GitHub Actions) uses Mesa llvmpipe, a software OpenGL implementation that runs on the CPU. While highly conformant, it differs from hardware GPUs (NVIDIA, Intel, AMD) in several ways:

1. Floating Point Precision

  • No Bit-Level Parity: Even with standards-compliant drivers, transcendental functions such as pow(), exp(), sin(), and cos() vary slightly in their low-level implementations.
  • Accumulated Errors: In complex PBR shaders, these micro-discrepancies accumulate, leading to "salt and pepper" noise in difference maps.
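To make the accumulation concrete, here is a minimal sketch that evaluates the same GGX-style distribution term twice: once in double precision and once with every intermediate rounded to IEEE-754 float32, mimicking a GPU computing at 32-bit precision. The GGX formula and the input values are illustrative, not taken from our shaders.

```python
import math
import struct

def f32(x):
    """Round a Python double to the nearest IEEE-754 float32."""
    return struct.unpack('f', struct.pack('f', x))[0]

def ggx_term_f64(n_dot_h, roughness):
    # GGX normal distribution, evaluated in double precision
    a = roughness * roughness
    a2 = a * a
    d = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * d * d)

def ggx_term_f32(n_dot_h, roughness):
    # Same math, but every intermediate rounded to float32
    a = f32(f32(roughness) * f32(roughness))
    a2 = f32(a * a)
    d = f32(f32(f32(n_dot_h) * f32(n_dot_h)) * f32(a2 - 1.0) + 1.0)
    return f32(a2 / f32(math.pi * f32(d * d)))

delta = abs(ggx_term_f64(0.999, 0.2) - ggx_term_f32(0.999, 0.2))
# delta is tiny but nonzero; summed over many shader operations and
# pixels, such discrepancies become the "salt and pepper" noise
```

A single term already disagrees between the two precisions; two *different* 32-bit implementations (llvmpipe vs. a hardware GPU) disagree in the same way, just less predictably.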

2. The "Sphere Center" Artifact

A common finding in our PBR tests is a concentric ring pattern at the center of spheres.

  • Cause: At the center of a sphere, the dot product N · V is exactly 1.0.
  • Sensitivity: The BRDF LUT (Look-Up Table) is sampled using (N · V, roughness). Any tiny floating-point variation in the calculation of N · V or the texture sampling coordinates causes the hardware to pick a different pixel in the LUT than the software renderer.
  • Result: High-contrast deltas in the difference map, even if the visual change is imperceptible to the human eye.
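The texel-flip mechanism can be sketched in a few lines. The LUT resolution and the two N · V values below are hypothetical; the point is only that near a texel boundary, a difference far below perceptual thresholds selects a different LUT entry.

```python
LUT_SIZE = 512  # hypothetical BRDF LUT resolution

def lut_texel(n_dot_v):
    # Nearest-texel index for a coordinate in [0, 1]
    return min(int(n_dot_v * LUT_SIZE), LUT_SIZE - 1)

hw = 0.998046875   # e.g. hardware result for N·V near the sphere center
sw = 0.99804685    # software rasterizer result, a hair lower
# hw and sw differ by ~2.5e-8, yet they land on different texels:
# lut_texel(hw) == 511, lut_texel(sw) == 510
```

Because the diff map amplifies per-pixel deltas, this single-texel disagreement shows up as a bright ring wherever N · V crosses a texel boundary, which near the center of the sphere happens in concentric circles.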

PBR Engine Evolution

Recent improvements to the rendering engine also contribute to deltas compared to older "Master" references:

1. Multiple Scattering (Kulla-Conty)

We implemented a compensation term for energy loss on rough surfaces. This increases the brightness of metallic/rough materials, changing the overall luminance compared to a "Standard" PBR implementation.
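For intuition, here is a sketch of one common real-time formulation of the energy-compensation term (the Fdez-Agüera / Filament-style approximation of Kulla-Conty); the exact term used by our engine may differ, and the LUT values below are illustrative.

```python
def energy_compensation(f0, dfg_scale, dfg_bias):
    # dfg_scale + dfg_bias is the single-scatter directional albedo
    # E_ss, read from the BRDF LUT at (N·V, roughness). The factor
    # boosts the specular lobe to restore the energy lost to
    # multiple scattering on rough surfaces.
    e_ss = dfg_scale + dfg_bias
    return 1.0 + f0 * (1.0 / e_ss - 1.0)

# A rough metal loses much single-scatter energy (small E_ss),
# so the compensation brightens it noticeably:
rough_metal = energy_compensation(1.0, 0.55, 0.05)   # E_ss = 0.60
smooth_metal = energy_compensation(1.0, 0.94, 0.04)  # E_ss = 0.98
```

This is exactly why rough metallic spheres are brighter than in "Standard" PBR references: the factor is largest where roughness is high and F0 is close to 1.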

2. Analytic Roughness Clamping

To ensure portability across vendors, we replaced derivative-based smoothing (fwidth) with Analytic Roughness Clamping (MIN_ROUGHNESS = 0.03).

  • Benefit: Identical results on Intel, NVIDIA, and Mesa.
  • Delta: Reference images captured with older versions using fwidth will show significant edge deltas.
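The clamp itself is trivial, which is the point: unlike fwidth(), whose screen-space derivatives are computed differently by each vendor, a plain analytic clamp is bit-identical everywhere. A one-line sketch:

```python
MIN_ROUGHNESS = 0.03  # the clamp floor from the section above

def clamp_roughness(r):
    # Analytic clamp: replaces derivative-based (fwidth) smoothing,
    # which varies with each vendor's derivative implementation
    return min(max(r, MIN_ROUGHNESS), 1.0)
```

Perfectly smooth inputs (roughness 0.0) are lifted to 0.03, which tames specular aliasing without any cross-vendor divergence.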

Naming Convention & Multi-View

To ensure full coverage and clear categorization, we use a structured naming convention for reference images:

Format: ref_<view>_<mode>_<effect>.png

  • View: front, back, left, right, top, bottom.
  • Mode: default, subtle.
  • Effect: none, bloom, fxaa, dof, auto_exposure, motion_blur, etc.
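A small parser for the convention can double as its machine-checkable definition. The helper name and the regex below are a sketch of the format stated above, not an existing utility in the repository.

```python
import re

# Hypothetical helper mirroring ref_<view>_<mode>_<effect>.png
PATTERN = re.compile(
    r"^ref_(?P<view>front|back|left|right|top|bottom)"
    r"_(?P<mode>default|subtle)"
    r"_(?P<effect>[a-z_]+)\.png$"
)

def parse_ref_name(name):
    # Returns {'view': ..., 'mode': ..., 'effect': ...} or None
    m = PATTERN.match(name)
    return m.groupdict() if m else None
```

Note that multi-word effects such as motion_blur parse cleanly because view and mode are closed vocabularies, so the remaining underscored segment is unambiguously the effect.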

Subtle Mode Strategy

By default, visual regression tests for post-processing effects run in Subtle mode (PRESET_SUBTLE). Keeping effect intensity low prevents strong effects from overwhelming the diff maps and keeps the comparison focused on the high-frequency impact of the effect itself.

Motion Blur Testing Strategy

Since motion blur relies on frame-to-frame camera velocity (derived from the previousViewProj matrix), testing it with a single static render yields no blur. We employ a Double Frame Sequence strategy:

  1. Frame N-1 (Warmup): Render the static view to initialize the previousViewProj matrix in the shader.
  2. Frame N (Motion): Apply a deterministic camera rotation (yaw or pitch, depending on the ViewPoint) and render the second frame.
  3. Capture: The resulting image captures a consistent, deterministic blur driven entirely by spatial data (not time).

We favor rotation over translation to avoid introducing parallax and depth-weighting discontinuities that occur when moving around the pillar of spheres, ensuring 100% stable test references across CI environments.
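The velocity the motion-blur pass derives from previousViewProj can be sketched by reprojecting one world-space point through both frames' matrices and taking the screen-space delta. The matrices below are toy pure-yaw transforms (no projection) and the rotation angle is illustrative; they only demonstrate that a deterministic yaw produces a nonzero, repeatable velocity.

```python
import math

def mat_vec(m, v):
    # 4x4 row-major matrix times 4-vector
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def yaw_viewproj(yaw):
    # Toy "view-projection": pure yaw rotation about Y
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def ndc(m, p):
    x, y, z, w = mat_vec(m, p + [1.0])
    return (x / w, y / w)

point = [1.0, 0.5, -5.0]                # a point on the pillar of spheres
prev = yaw_viewproj(0.0)                # frame N-1 (warmup)
curr = yaw_viewproj(math.radians(2.0))  # frame N, deterministic yaw
velocity = tuple(c - p for c, p in zip(ndc(curr, point), ndc(prev, point)))
# nonzero, frame-rate-independent velocity for the blur kernel
```

Because the delta depends only on the two matrices, the resulting blur is identical on every run and every machine, which is what makes the references stable in CI.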

Interactive Visual Report

The visual report (index.html) generated by CI provides advanced comparison tools:

  1. Selective Menu: Filter by View, Mode, and Effect to quickly isolate regressions.
  2. Interactive Split Slider: Drag a handle across the image to wipe between the Reference (Baseline) and the PR Render.
  3. Effect Visualization: A dedicated mode to see the delta between "None" and "With Effect", helping developers understand the exact contribution of a specific shader pass.
  4. Magnifier Lens: A toggleable zoom tool (2.5x) that follows the cursor, allowing for pixel-perfect inspection of rendering details (AA, noise, banding) across both the slider and the diff maps.
  5. Difference Maps: High-contrast (x5 intensity) maps are automatically generated for both PR regressions and effect visualization.
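The x5 amplification in the difference maps is a straightforward gain-and-clamp; a sketch over flat 8-bit grayscale buffers (the real pipeline operates on PNG images, so the representation here is simplified):

```python
def diff_map(reference, render, gain=5):
    # Amplify per-pixel absolute differences so sub-perceptual deltas
    # become visible, clamping to the displayable 8-bit range
    return [min(abs(a - b) * gain, 255) for a, b in zip(reference, render)]

diff_map([100, 200, 30], [102, 200, 0])
# → [10, 0, 150]
```

The gain is why one-texel LUT flips or float-precision noise, invisible in a side-by-side view, appear as bright speckles in the report.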

Guidelines for Reference Updates

1. ISO Verification

When creating or updating references, we ensure visual consistency by:

  • Standardizing on Billboard + Raytraced Impostors for geometric primitives (spheres).
  • Removing redundant buffer/mesh updates (like icosphere_generate) to minimize noise from CPU-side triangulation discrepancies.
  • Using a fixed camera distance (25.0) centered at the origin.

2. Updating References

When a Visual Regression Report shows failures:

  1. Inspect with the Magnifier: Determine if deltas are concentrated at geometric edges (precision/AA) or in noise patterns (sampling).
  2. Verify PR Intent: If the PR modified PBR math, lighting, or post-processing, a delta is expected.
  3. Regenerate locally:
    • Run GEN_REFS=1 .github/workflows/scripts/run_test_with_xvfb.sh build/tests/test_app to regenerate the baseline.
    • Commit the updated PNG files.

See also: Shader Cross-GPU Compatibility