Tona Overview
Tona is the pitch module of VoxaTrace. It owns three responsibilities, each with its own facade:
| Facade | Purpose |
|---|---|
| `PitchDetection` | Realtime detection (per frame) and batch contour extraction (whole recording) |
| `PitchProcessing` | Cleanup of an existing contour: octave correction, smoothing, blip removal |
| `PitchAnalysis` | Histograms, tuning estimation, quantization, melodic transcription |
Tona produces and consumes a single shared data type, `PitchContour`, which downstream modules (Tessera, Accura, Calibra) read.
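To make the octave correction listed for `PitchProcessing` more concrete, here is a minimal sketch of one common approach: fold a frame back by an octave when it jumps roughly 12 semitones away from the previous voiced frame. This is not Tona's implementation. The function name `correctOctaveJumps`, the plain `List<Double>` input, and the convention that `0.0` marks an unvoiced frame are all assumptions made for illustration.

```kotlin
import kotlin.math.abs
import kotlin.math.log2

// Hypothetical helper, not part of Tona. Folds frames that sit about one
// octave away from the previous voiced frame back toward it.
// Assumption of this sketch: unvoiced frames are encoded as 0.0.
fun correctOctaveJumps(hz: List<Double>, toleranceSemitones: Double = 1.0): List<Double> {
    val out = ArrayList<Double>(hz.size)
    var reference = 0.0 // last accepted voiced pitch, 0.0 until one is seen
    for (f in hz) {
        if (f <= 0.0) {
            out.add(f) // unvoiced: pass through and keep the reference
            continue
        }
        var corrected = f
        if (reference > 0.0) {
            val semitones = 12.0 * log2(f / reference)
            // A jump close to +12 or -12 semitones is treated as an octave error.
            if (abs(semitones - 12.0) <= toleranceSemitones) {
                corrected = f / 2.0
            } else if (abs(semitones + 12.0) <= toleranceSemitones) {
                corrected = f * 2.0
            }
        }
        out.add(corrected)
        reference = corrected
    }
    return out
}

fun main() {
    val raw = listOf(220.0, 221.0, 440.0, 222.0, 0.0, 110.5, 220.0)
    println(correctOctaveJumps(raw))
}
```

A rule this simple would also fold a genuine octave leap in the melody, so a production cleanup stage would likely need more context (for example a longer median window) before deciding.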
Pipeline

    audio samples ─► PitchDetection ──► PitchContour ──► PitchProcessing ──► cleaned PitchContour
                                             │                                       │
                                             └────────────────────► PitchAnalysis ◄──┘
                                                                          │
                                                                          ▼
                                                          PitchHistogram, TonalSegment[], …

- Live use case: `PitchDetection.createDetector()`. Feed audio buffer by buffer: call `detect(samples, sampleRate)` for a `PitchPoint` per frame, and/or `feedContour(samples, sampleRate, anchorTime)` to accumulate the session contour, then read `pitchContour.recent(seconds)` for visualization.
- Offline use case: `PitchDetection.createContourExtractor()`. Pass a complete recording and get back a full `PitchContour` with cleanup baked in (configurable via `ContourExtractorConfig`).
- Clean up an existing contour: `PitchProcessing.process(contour, PitchProcessingConfig.SCORING)`.
- Histogram or transcription: `PitchAnalysis.computeHistogram(contour, tonicHz)`, `PitchAnalysis.quantize(...)`, `PitchAnalysis.labelByMeanPitch(...)`.
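The `tonicHz` argument to `computeHistogram` suggests a tonic-relative histogram. As a rough illustration of that idea, and not a description of what Tona actually computes, the sketch below converts each voiced frame to cents above the tonic with the standard formula `1200 * log2(f / tonic)`, folds the result into one octave, and counts frames per bin. The name `centsHistogram`, the `binsPerOctave` parameter, and the `0.0`-means-unvoiced convention are assumptions.

```kotlin
import kotlin.math.floor
import kotlin.math.log2

// Hypothetical sketch, not Tona's implementation: counts voiced frames by
// their distance from the tonic in cents, folded into a single octave.
fun centsHistogram(hz: List<Double>, tonicHz: Double, binsPerOctave: Int = 12): IntArray {
    require(tonicHz > 0.0) { "tonicHz must be positive" }
    require(binsPerOctave > 0) { "binsPerOctave must be positive" }
    val bins = IntArray(binsPerOctave)
    val binWidth = 1200.0 / binsPerOctave
    for (f in hz) {
        if (f <= 0.0) continue // skip unvoiced frames (assumed encoding)
        val cents = 1200.0 * log2(f / tonicHz) // signed distance from the tonic
        val folded = ((cents % 1200.0) + 1200.0) % 1200.0 // wrap into [0, 1200)
        // Centre each bin on its nominal pitch: with 12 bins, bin 0 covers
        // -50 to +50 cents around the tonic.
        val index = floor((folded + binWidth / 2) / binWidth).toInt() % binsPerOctave
        bins[index]++
    }
    return bins
}

fun main() {
    // Tonic at 220 Hz: 440 Hz and 110 Hz fold onto the tonic bin,
    // 330 Hz (a fifth above) falls in bin 7.
    val histogram = centsHistogram(listOf(220.0, 440.0, 330.0, 0.0, 110.0), tonicHz = 220.0)
    println(histogram.joinToString())
}
```

Folding to one octave is a design choice that suits tuning and scale analysis; a range-oriented histogram would keep the octave information instead.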
Algorithms
| Algorithm | Best for | Cost | Dependencies |
|---|---|---|---|
| `PitchAlgorithm.YIN` | Realtime, pure DSP, no model bundle | Lower per frame | None |
| `PitchAlgorithm.SWIFT_F0` | Higher accuracy on vocals; batch | Needs ONNX model | `swift_f0.onnx` (95k params) |
`PitchDetectorConfig` defaults to YIN (realtime). `ContourExtractorConfig` defaults to SWIFT_F0 (batch). To use SwiftF0 in either path, register a model provider once at startup or pass it explicitly:

    AIModelRegistry.registerSwiftF0 { ModelLoader.loadSwiftF0() }

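For readers unfamiliar with why YIN counts as "pure DSP", the following is a simplified sketch of the core YIN steps: a squared difference function, cumulative mean normalization, and an absolute threshold. It is not Tona's implementation and omits the parabolic interpolation step of the published algorithm, so estimates are quantized to whole-sample lags. The function name `yinPitchHz` and all default parameter values are assumptions chosen for illustration.

```kotlin
import kotlin.math.PI
import kotlin.math.sin

// Simplified YIN sketch, not Tona's implementation.
// Returns an estimated fundamental in Hz, or null when no clear period is found.
fun yinPitchHz(
    samples: FloatArray,
    sampleRate: Int,
    threshold: Double = 0.15,
    minHz: Double = 60.0,
    maxHz: Double = 1000.0,
): Double? {
    val maxLag = minOf((sampleRate / minHz).toInt(), samples.size / 2)
    val minLag = maxOf((sampleRate / maxHz).toInt(), 1)
    if (maxLag <= minLag) return null
    val window = samples.size - maxLag

    // Step 1: squared difference between the frame and a copy shifted by each lag.
    val diff = DoubleArray(maxLag + 1)
    for (lag in 1..maxLag) {
        var sum = 0.0
        for (i in 0 until window) {
            val d = (samples[i] - samples[i + lag]).toDouble()
            sum += d * d
        }
        diff[lag] = sum
    }

    // Step 2: cumulative mean normalized difference,
    // d'(lag) = d(lag) * lag / sum(d[1..lag]).
    val cmnd = DoubleArray(maxLag + 1) { 1.0 }
    var running = 0.0
    for (lag in 1..maxLag) {
        running += diff[lag]
        cmnd[lag] = if (running > 0.0) diff[lag] * lag / running else 1.0
    }

    // Step 3: take the first lag under the threshold, then walk down to the
    // local minimum of the dip.
    var lag = minLag
    while (lag <= maxLag) {
        if (cmnd[lag] < threshold) {
            while (lag + 1 <= maxLag && cmnd[lag + 1] < cmnd[lag]) lag++
            return sampleRate.toDouble() / lag
        }
        lag++
    }
    return null // no clear periodicity: treat the frame as unvoiced
}

fun main() {
    val sampleRate = 16_000 // assumed for this example
    val frame = FloatArray(1024) { i -> sin(2.0 * PI * 220.0 * i / sampleRate).toFloat() }
    println(yinPitchHz(frame, sampleRate))
}
```

On the 220 Hz test tone the estimate lands within about a hertz of the true pitch; the residual error comes from rounding the period to a whole number of samples, which is what the interpolation step in full YIN addresses.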
Performance budget
Per ADR-020, the per-buffer processing cost on mid-tier hardware must stay under 40 ms when used inside CalibraLiveEval. Defaults are tuned to that budget:
- `PitchDetectorConfig.BALANCED.bufferSize = 1024` (was 2048 before 1.0.0)
- `SessionConfig.hopSize = 320` (was 160 before 1.0.0)
- `PitchDetectorConfig.PRECISE.bufferSize = 4096`: not for realtime; use only for offline analysis.
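To relate these sample counts to the 40 ms budget, it can help to convert them into durations and to time a candidate processing step locally. The sketch below does both. The 16 kHz sample rate is an assumption (no sample rate is stated here), and `framesToMillis` and `withinBudget` are hypothetical helpers, not part of Tona or Calibra.

```kotlin
import kotlin.system.measureNanoTime

// Duration of audio covered by `frames` samples at `sampleRate`, in milliseconds.
fun framesToMillis(frames: Int, sampleRate: Int): Double = frames * 1000.0 / sampleRate

// Runs `process` once and reports whether it finished within `budgetMs`.
// Hypothetical helper for local experiments only; a single run is noisy,
// so repeated measurements would be needed for a real benchmark.
fun withinBudget(budgetMs: Double = 40.0, process: () -> Unit): Boolean {
    val elapsedMs = measureNanoTime(process) / 1_000_000.0
    return elapsedMs < budgetMs
}

fun main() {
    val sampleRate = 16_000 // assumed; substitute the rate your session uses
    for (size in listOf(160, 320, 1024, 2048, 4096)) {
        println("$size samples = ${framesToMillis(size, sampleRate)} ms of audio")
    }
}
```

Under the 16 kHz assumption, a 1024-sample buffer spans 64 ms of audio and a 320-sample hop spans 20 ms; at other sample rates the durations scale inversely with the rate.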
See also
- Calibra Live Evaluation: consumes a `PitchDetector` for realtime scoring.
- Calibra Melody Eval: consumes a `PitchContourExtractor` for batch scoring.
- Accura: consumes a `PitchContour` for intonation analysis.
- Tessera: consumes a `PitchContour` for breath / agility / range metrics.