PitchDetection

Public facade for pitch detection — realtime and batch (ADR-001).

What is Pitch Detection?

Pitch detection estimates the fundamental frequency (F0) of a monophonic audio signal. VoxaTrace offers two algorithms:

  • YIN — classic DSP autocorrelation, no ML dependency, frame-by-frame.

  • SwiftF0 — lightweight neural network (95k params), batch-oriented, higher accuracy for vocals at the cost of requiring an ONNX model bundle.

This facade provides factory methods for two workflows:

When to Use

ScenarioUseWhy
Live singing feedback (karaoke, practice)createDetectorFrame-by-frame with live contour
Offline melody scoring / intonation analysiscreateContourExtractorFull contour with cleanup
Post-process an existing contourPitchProcessing.process()Not detection — uses this facade's output
Histogram / transcription on a contourPitchAnalysis.*Also not detection — downstream

Quick Start

Kotlin

// Realtime detection
val detector = PitchDetection.createDetector()
val point = detector.detect(audioBuffer, sampleRate = 16000)
detector.close()

// Batch extraction
val extractor = PitchDetection.createContourExtractor(
ContourExtractorConfig.SCORING,
modelProvider = { ModelLoader.loadSwiftF0() }
)
val contour = extractor.extract(audioSamples, sampleRate = 16000)
extractor.release()

Swift

// Realtime detection (accepts native [Float], any sample rate — ADR-017)
let detector = PitchDetection.createDetector()
let point = detector.detect(samples: audioBuffer, sampleRate: 48000)
detector.close()

// Batch extraction
let extractor = PitchDetection.createContourExtractor(config: .scoring) {
ModelLoader.shared.loadSwiftF0()
}
let contour = extractor.extract(audio: audioSamples, sampleRate: 48000)
extractor.release()

Common Pitfalls

  1. SwiftF0 requires a model: Pass modelProvider explicitly or register via AIModelRegistry.registerSwiftF0 { ... } at startup. Forgetting this throws at create-time.

  2. Release when done: Both PitchDetector and PitchContourExtractor hold native resources. Call close() / release() to avoid leaks.

  3. Mono input only: Both algorithms expect single-channel audio. SonixDecoder.decode() auto-converts stereo to mono (ADR-017).

  4. PRECISE is not for realtime: PitchDetectorConfig.PRECISE uses a 4096-sample buffer — too expensive per frame for real-time use (ADR-020).

See also

The realtime detector interface returned by createDetector

The batch extractor returned by createContourExtractor

For post-processing a contour (smoothing, octave correction)

For histogram, quantization, and transcription on a contour

Configuration for detection (presets, builder, .copy)

Configuration for extraction

Functions

Link copied to clipboard
fun createContourExtractor(config: ContourExtractorConfig = ContourExtractorConfig.DEFAULT, modelProvider: () -> ByteArray? = null): PitchContourExtractor

Create a pitch contour extractor for batch (offline) processing.

Link copied to clipboard
fun createDetector(config: PitchDetectorConfig = PitchDetectorConfig.BALANCED, modelProvider: () -> ByteArray? = null): PitchDetector

Create a realtime pitch detector.