┌─ vision thread ────────────────────────────┐   ┌─ audio thread (host-owned) ──┐
│ camera ─► GPU inference ─► manifold filter │──▶│ ring-buffer read (wait-free) │
│                                            │   │ interpolate to block time    │
│                                            │   │ write automation events      │
└─────────────────────┬──────────────────────┘   └──────────────────────────────┘
                      │
                      └──► UI thread (viewport, config) — read-only snapshot
The audio thread
This is where 90% of all real-time audio bugs live. Non-negotiable rules:
- No allocation. Not `Vec::push`, not `Box::new`, not `format!`, not a `HashMap::insert`. Preallocate everything.
- No locks. Not `Mutex`, not `RwLock`, not `parking_lot`. Use `rtrb` (wait-free SPSC) and atomic primitives.
- No blocking I/O. No file I/O, no network, no logging that writes to a file. `tracing` events on the audio thread must route through a lock-free channel.
- No panics. Panics in the audio thread are UB for the host.
- Bounded work. The callback must return in less than the block time, every call, even in the worst case. Think in 99.9th-percentile terms.
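As a concrete shape for these rules, here is a minimal sketch of a callback that allocates once in its constructor and does only bounded, lock-free, panic-free work per block. `AudioState`, `gain_bits`, and the gain-ramp behavior are illustrative assumptions, not this project's actual types:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

// Hypothetical processor state; every buffer is allocated once, up front.
struct AudioState {
    // Parameter shared with another thread as raw f32 bits --
    // an atomic load instead of a Mutex.
    gain_bits: Arc<AtomicU32>,
    prev_gain: f32,
    // Scratch sized for the largest block the host may deliver.
    scratch: Vec<f32>,
}

impl AudioState {
    fn new(max_block: usize, gain_bits: Arc<AtomicU32>) -> Self {
        // The only allocation happens here, off the real-time path.
        Self { gain_bits, prev_gain: 1.0, scratch: vec![0.0; max_block] }
    }

    // Real-time callback: no allocation, no locks, no I/O, no panic
    // paths, and O(block length) work on every call.
    fn process(&mut self, block: &mut [f32]) {
        let target = f32::from_bits(self.gain_bits.load(Ordering::Relaxed));
        // Clamp to the preallocated size instead of indexing past it.
        let n = block.len().min(self.scratch.len());
        // Ramp the gain across the block into scratch (avoids zipper noise).
        for i in 0..n {
            let t = i as f32 / n as f32;
            self.scratch[i] = self.prev_gain + (target - self.prev_gain) * t;
        }
        for i in 0..n {
            block[i] *= self.scratch[i];
        }
        self.prev_gain = target;
    }
}
```

The atomic-bits trick (`f32::to_bits` / `f32::from_bits` through an `AtomicU32`) is one common way to share a float parameter without a lock; relaxed ordering suffices because each block only needs *some* recent value, not a synchronized one.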
Cross-thread communication
- Vision → Audio: an `rtrb` ring buffer carrying `Sample<G, C>` values. Writer is the single vision-thread producer; reader is the audio thread. Wait-free on the reader side.
- Audio → UI: atomic snapshots of parameter state.
- UI → Vision: a separate `rtrb` ring carrying commands (reconfigure, reset, change model).
Timestamp discipline
Every sample carries a host-monotonic `Timestamp`. The audio thread knows
`now_audio - sample_timestamp` and interpolates accordingly, so latency is
measured, not guessed. Interpolation happens in the tangent space of
the relevant manifold — you cannot slerp a homography by blending matrix
entries.
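The smallest manifold where this bites is the circle: naively averaging the raw angles 350° and 10° gives 180°, while blending in the tangent space gives the correct 0°. A sketch of that log/blend/exp pattern, with made-up timestamp plumbing:

```rust
use std::f32::consts::PI;

// Wrap an angle to (-pi, pi].
fn wrap_pi(x: f32) -> f32 {
    let mut r = x % (2.0 * PI);
    if r <= -PI { r += 2.0 * PI; }
    if r >   PI { r -= 2.0 * PI; }
    r
}

// Interpolate between two timestamped angle samples (nanoseconds,
// radians) at the audio block's time. Field layout is illustrative.
fn interp_angle(a: (u64, f32), b: (u64, f32), t_audio_ns: u64) -> f32 {
    let t = (t_audio_ns - a.0) as f32 / (b.0 - a.0) as f32;
    // "log": displacement in the tangent space, along the shortest arc.
    let delta = wrap_pi(b.1 - a.1);
    // "exp": walk a fraction of that displacement from a, back onto S1.
    wrap_pi(a.1 + t * delta)
}
```

The same pattern scales up: on SO(3) it becomes quaternion slerp; on homographies it is the matrix exponential and logarithm on SL(3). Blending matrix entries skips the log step entirely, which is what makes it wrong.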
Why not async
`tokio` and similar runtimes spawn worker threads that can starve an audio
callback on a shared core and allocate unpredictably. They have no place in
the real-time path. The vision thread can use a minimal runtime for GPU
futures, but the audio thread sees none of it.