Until I Find You

Platform	Camera	Inference EP	Plugin
macOS	AVFoundation	CoreML, ORT CPU	CLAP, AU, VST3
iOS	AVCaptureSession	CoreML	AUv3 (via clap-wrappers)
Windows	Media Foundation	DirectML, ORT CPU	CLAP, VST3
Linux	V4L2	ORT CPU, CUDA	CLAP, VST3
Android	Camera2 (JNI)	NNAPI, XNNPACK	N/A
Web	`getUserMedia`	WebGPU / ORT-web	N/A

macOS / iOS

Apple framework dependencies live in nix/uify.nix under stdenv.isDarwin. The build picks up AVFoundation, CoreMedia, CoreVideo, CoreML, Metal, AudioToolbox.

Windows

Media Foundation + DirectML. Tested under MSVC; MinGW is not supported.

Linux

V4L2 + ALSA for mic-adjacent cameras. nokhwa is the default abstraction.

Android / iOS

Bindings generated by UniFFI into bindings/kotlin/ and bindings/swift/. Ship as AAR / XCFramework.

Web

The core builds to wasm32-unknown-unknown. The runtime camera backend uses getUserMedia; inference uses ORT-web with WebGPU when available.

23 APR 2026

platform behavior

macOS / iOS

Windows

Linux

Android / iOS

Web