Fix: Async Loader Deadlock

Date: 2026-02-17
Files: src/async_loader.c
Symptom: Application freezes during HDR environment map loading

Root Cause

A mutex double-lock deadlock in async_worker_func().

The worker thread flow was:

pthread_mutex_unlock(&request_mutex);   // (1) Unlock for heavy work

if (has_work) {
    async_handle_io(path_to_load);      // (2) Returns with mutex HELD
}

pthread_mutex_lock(&request_mutex);     // (3) DEADLOCK — already held!
has_pending_work = false;

async_handle_io() re-acquires the mutex internally before returning, on every code path (success, failure, cancel). Step (3) then tried to lock a mutex that the same thread already held. Relocking a default-type POSIX mutex from the owning thread is undefined behavior; on Linux (glibc), where the default type behaves as a normal, non-recursive mutex, it manifests as a permanent self-deadlock.

Once the worker thread deadlocked on itself, the main thread's async_loader_poll() also tried to acquire the same mutex and blocked forever, freezing the entire application.
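
As a debugging aid (a sketch, not part of the actual fix), the self-relock described above can be made visible with an error-checking mutex. With PTHREAD_MUTEX_ERRORCHECK, a second pthread_mutex_lock() from the owning thread returns EDEADLK instead of hanging. The helper name relock_result is hypothetical; the real request_mutex may be initialized differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: locks an error-checking mutex twice from the same
 * thread and returns the result of the second attempt. */
int relock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&m);          /* first lock succeeds */
    assert(rc == 0);
    (void)rc;                             /* silence warnings under NDEBUG */

    int second = pthread_mutex_lock(&m);  /* relock reports an error, no hang */

    pthread_mutex_unlock(&m);             /* held once, so unlock once */
    pthread_mutex_destroy(&m);
    return second;
}
```

Calling relock_result() and comparing the result against EDEADLK shows the double lock as an error code. In a debug build, the same attribute on the loader's mutex would turn the freeze at step (3) into a checkable return value.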

Fix

Made the re-lock conditional on whether async_handle_io() was called:

if (has_work) {
    async_handle_io(path_to_load);
    /* Returns with mutex HELD — no re-lock needed */
} else {
    /* No work dispatched, re-acquire for next iteration */
    pthread_mutex_lock(&request_mutex);
}
has_pending_work = false;
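
The lock hand-off can be checked in isolation with a small, self-contained sketch. Everything here other than request_mutex, has_pending_work, has_work, and path_to_load is an assumption: handle_io_stub stands in for async_handle_io(), worker_iteration and run_two_iterations are hypothetical names, and "env.hdr" is a placeholder path. The invariant it illustrates is that each iteration starts and ends with the mutex held, whichever branch runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t request_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool has_pending_work = false;

/* Stand-in for async_handle_io(): works unlocked, then re-acquires
 * request_mutex before returning, as the note describes. */
static void handle_io_stub(const char *path)
{
    (void)path;                           /* IO and conversion would run here */
    pthread_mutex_lock(&request_mutex);   /* returns with mutex HELD */
}

/* One worker iteration. Precondition: request_mutex is held.
 * Postcondition: request_mutex is held. Returns whether work ran. */
bool worker_iteration(const char *path_to_load)
{
    bool has_work = has_pending_work;     /* assumed: read under the lock */

    pthread_mutex_unlock(&request_mutex); /* unlock for heavy work */

    if (has_work) {
        handle_io_stub(path_to_load);     /* comes back holding the mutex */
    } else {
        pthread_mutex_lock(&request_mutex); /* nothing dispatched: re-lock */
    }
    has_pending_work = false;             /* safe: mutex is held here */
    return has_work;
}

/* Hypothetical driver: one busy and one idle iteration, then a trylock
 * to observe whether the mutex is still held by this thread. */
int run_two_iterations(void)
{
    pthread_mutex_lock(&request_mutex);
    has_pending_work = true;

    bool first = worker_iteration("env.hdr");   /* takes the has_work branch */
    bool second = worker_iteration("env.hdr");  /* takes the else branch */
    assert(first && !second);
    (void)first;
    (void)second;

    int held = pthread_mutex_trylock(&request_mutex);
    if (held == 0) {
        pthread_mutex_unlock(&request_mutex);   /* undo an unexpected acquire */
    }
    pthread_mutex_unlock(&request_mutex);
    return held;
}
```

run_two_iterations() returns the pthread_mutex_trylock() result; EBUSY indicates the mutex was still locked after both branches, which is the state the loop expects at the top of the next iteration.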

Why It Appeared Intermittently

The deadlock only triggers when:

1. The worker thread finishes all internal work (IO + conversion)
2. The main thread happens to call async_loader_poll() after the worker re-acquires the mutex at step (3)

On faster GPUs/CPUs, timing differences could mask the bug. On an Intel HD 4600 (Haswell) loading a 4K HDR map (64 MB PBO), the IO + SIMD conversion took ~55 ms, long enough that the race window was hit consistently.

Backward Compatibility

This fix is unconditionally correct on all platforms. The previous code always invoked undefined behavior; it simply happened not to deadlock on some systems because of timing.