Tona Overview

Tona is the pitch module of VoxaTrace. It owns three responsibilities, each with its own facade:

| Facade | Purpose |
| --- | --- |
| PitchDetection | Realtime detection (per frame) and batch contour extraction (whole recording) |
| PitchProcessing | Cleanup of an existing contour: octave correction, smoothing, blip removal |
| PitchAnalysis | Histograms, tuning estimation, quantization, melodic transcription |

Tona produces and consumes a single shared data type — PitchContour — which downstream modules (Tessera, Accura, Calibra) read.

Pipeline

audio samples ─► PitchDetection ──► PitchContour ──► PitchProcessing ──► cleaned PitchContour
                                         │                                       │
                                         └────────────────────► PitchAnalysis ◄──┘
                                                                      │
                                                                      ▼
                                                      PitchHistogram, TonalSegment[], …

  • Live use case: create a detector with PitchDetection.createDetector() and feed it audio buffer by buffer. Call detect(samples, sampleRate) to get a PitchPoint per frame, call feedContour(samples, sampleRate, anchorTime) to accumulate the session contour, or do both. Then read pitchContour.recent(seconds) for visualization.
  • Offline use case: PitchDetection.createContourExtractor() — pass a complete recording, get back a full PitchContour with cleanup baked in (configurable via ContourExtractorConfig).
  • Clean up an existing contour: PitchProcessing.process(contour, PitchProcessingConfig.SCORING).
  • Histogram or transcription: PitchAnalysis.computeHistogram(contour, tonicHz), PitchAnalysis.quantize(...), PitchAnalysis.labelByMeanPitch(...).
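
The live use case above amounts to a loop that hands fixed-size buffers to the detector. The following is a minimal sketch of that feeding loop only, not of Tona itself. forEachBuffer is a hypothetical helper, the 16 kHz sample rate and the FloatArray sample type are assumptions, and treating the anchor time as the buffer's start time in seconds is a guess at what anchorTime means. The callback marks where a call such as detect(samples, sampleRate) or feedContour(samples, sampleRate, anchorTime) would go.

```kotlin
// Hypothetical helper: calls `onBuffer` once per full buffer of `bufferSize`
// samples, in order. `startSec` is the buffer's start time in seconds, which
// is only an assumed interpretation of an "anchor time".
fun forEachBuffer(
    samples: FloatArray,
    sampleRate: Int,
    bufferSize: Int,
    onBuffer: (buffer: FloatArray, startSec: Double) -> Unit,
) {
    require(sampleRate > 0 && bufferSize > 0) { "sampleRate and bufferSize must be positive" }
    var offset = 0
    while (offset + bufferSize <= samples.size) {
        val buffer = samples.copyOfRange(offset, offset + bufferSize)
        onBuffer(buffer, offset.toDouble() / sampleRate)
        offset += bufferSize
    }
    // A trailing partial buffer is dropped here; a real capture loop would
    // instead wait for more audio to arrive.
}

fun main() {
    val sampleRate = 16_000                 // assumed; the source does not state a rate
    val recording = FloatArray(sampleRate)  // one second of silence as placeholder input
    var fed = 0
    forEachBuffer(recording, sampleRate, bufferSize = 1024) { buffer, startSec ->
        // In an app, the detector call would go here instead of the println.
        fed++
        println("buffer $fed: ${buffer.size} samples starting at ${startSec}s")
    }
    println("buffers fed: $fed")
}
```

In a realtime app the buffers would come from the audio capture callback rather than from a pre-filled array, so the slicing step would usually disappear and only the body of the callback would remain.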

Algorithms

| Algorithm | Best for | Cost | Dependencies |
| --- | --- | --- | --- |
| PitchAlgorithm.YIN | Realtime, pure DSP, no model bundle | Lower per frame | None |
| PitchAlgorithm.SWIFT_F0 | Higher accuracy on vocals; batch | Needs ONNX model | swift_f0.onnx (95k params) |

PitchDetectorConfig defaults to YIN (realtime). ContourExtractorConfig defaults to SWIFT_F0 (batch). To use SwiftF0 in either path, register a model provider once at startup or pass it explicitly:

AIModelRegistry.registerSwiftF0 { ModelLoader.loadSwiftF0() }

Performance budget

Per ADR-020, the per-buffer processing cost on mid-tier hardware must stay under 40 ms when used inside CalibraLiveEval. Defaults are tuned to that budget:

  • PitchDetectorConfig.BALANCED.bufferSize = 1024 (was 2048 before 1.0.0)
  • SessionConfig.hopSize = 320 (was 160 before 1.0.0)
  • PitchDetectorConfig.PRECISE.bufferSize = 4096. Not for realtime; use only for offline analysis.
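
To put the sizes above in context, it can help to convert them from samples to milliseconds of audio. The sketch below does that arithmetic. The sample rates are assumptions, since this page does not state which rate the defaults target, and durationMs is a hypothetical helper rather than part of Tona.

```kotlin
// Hypothetical helper: milliseconds of audio covered by `samples` at `sampleRateHz`.
fun durationMs(samples: Int, sampleRateHz: Int): Double =
    samples * 1000.0 / sampleRateHz

fun main() {
    // Assumed sample rates; the source does not specify one.
    val sampleRates = listOf(16_000, 44_100, 48_000)
    // Sizes taken from the list above.
    val sizes = mapOf(
        "BALANCED.bufferSize" to 1024,
        "SessionConfig.hopSize" to 320,
        "PRECISE.bufferSize" to 4096,
    )
    for ((name, n) in sizes) {
        for (rate in sampleRates) {
            println("%-22s %5d samples @ %5d Hz = %7.2f ms".format(name, n, rate, durationMs(n, rate)))
        }
    }
}
```

Under an assumed 16 kHz rate, a 320-sample hop spans 20 ms of audio and a 1024-sample buffer spans 64 ms. These figures are audio spans, which is a different quantity from the 40 ms processing-cost budget; the conversion only helps relate the two.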

See also

  • Calibra Live Evaluation — consumes a PitchDetector for realtime scoring.
  • Calibra Melody Eval — consumes a PitchContourExtractor for batch scoring.
  • Accura — consumes a PitchContour for intonation analysis.
  • Tessera — consumes a PitchContour for breath / agility / range metrics.