| Platform | Camera | Inference EP | Plugin |
|---|---|---|---|
| macOS | AVFoundation | CoreML, ORT CPU | CLAP, AU, VST3 |
| iOS | AVCaptureSession | CoreML | AUv3 (via clap-wrappers) |
| Windows | Media Foundation | DirectML, ORT CPU | CLAP, VST3 |
| Linux | V4L2 | ORT CPU, CUDA | CLAP, VST3 |
| Android | Camera2 (JNI) | NNAPI, XNNPACK | N/A |
| Web | getUserMedia | WebGPU / ORT-web | N/A |
macOS / iOS
Apple framework dependencies live in nix/uify.nix under
stdenv.isDarwin. The build picks up AVFoundation, CoreMedia,
CoreVideo, CoreML, Metal, AudioToolbox.
Windows
Media Foundation + DirectML. Tested under MSVC; MinGW is not supported.
Linux
V4L2 + ALSA for mic-adjacent cameras. nokhwa is the default abstraction.
Android / iOS
Bindings generated by UniFFI into bindings/kotlin/ and bindings/swift/.
Ship as AAR / XCFramework.
Web
The core builds to wasm32-unknown-unknown. The runtime camera backend uses
getUserMedia; inference uses ORT-web with WebGPU when available.