CalibraPitch

Real-time pitch detection and batch pitch extraction for singing and speech analysis. Detects the fundamental frequency (F0) of audio with two algorithm backends.

Quick Start

Kotlin

val detector = CalibraPitch.createDetector()
val point = detector.detect(audioBuffer, sampleRate = 16000)
println("Pitch: ${point.pitch} Hz, Confidence: ${point.confidence}")
detector.close()

Swift

let detector = CalibraPitch.createDetector()
let point = detector.detect(samples: audioBuffer, sampleRate: 16000)
print("Pitch: \(point.pitch) Hz, Confidence: \(point.confidence)")
detector.close()

When to Use

| Use Case | API | Example |
| --- | --- | --- |
| Live tuner display | createDetector() | Show pitch meter in real-time |
| Karaoke scoring | createDetector() + CalibraLiveEval | Score singing while playing |
| Analyze recorded audio | createContourExtractor() | Extract pitch from audio files |
| Post-process contours | PostProcess | Clean up octave errors, smooth noise |

Detector Configuration

Presets

| Preset | Kotlin | Swift | Description |
| --- | --- | --- | --- |
| Balanced | PitchDetectorConfig.BALANCED | .balanced | Default for most use cases |
| Precise | PitchDetectorConfig.PRECISE | .precise | Larger buffer, stricter thresholds |
| Relaxed | PitchDetectorConfig.RELAXED | .relaxed | More forgiving, higher recall |

Builder

Kotlin

val config = PitchDetectorConfig.Builder()
    .algorithm(PitchAlgorithm.SWIFT_F0)
    .voiceType(VoiceType.carnaticMale)
    .quietHandling(QuietHandling.SENSITIVE)
    .enableProcessing()
    .build()

val detector = CalibraPitch.createDetector(config, modelProvider = { ModelLoader.loadSwiftF0() })

Swift

let config = PitchDetectorConfig.Builder()
    .algorithm(.swiftF0)
    .voiceType(.carnaticMale)
    .quietHandling(.sensitive)
    .enableProcessing()
    .build()

let detector = CalibraPitch.createDetector(config: config, modelProvider: { ModelLoader.loadSwiftF0() })

Config Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| algorithm | PitchAlgorithm | YIN | Detection algorithm (YIN or SWIFT_F0) |
| bufferSize | Int | 2048 | Audio buffer size for analysis |
| hopSize | Int | 160 | Hop size between frames in samples |
| tolerance | Float | 0.15 | YIN algorithm tolerance (lower = more accurate) |
| minFreq | Float | 80 | Minimum detectable frequency in Hz |
| maxFreq | Float | 1000 | Maximum detectable frequency in Hz |
| amplitudeGateDb | Float | -40 | RMS threshold in dB for gating quiet frames |
| confidenceThreshold | Float | 0.75 | Minimum confidence to accept pitch (0.0–1.0) |
| enableSmoothing | Boolean | false | Enable pitch smoothing filter |
| enableOctaveCorrection | Boolean | false | Enable octave error correction |

Builder Methods

| Method | Description |
| --- | --- |
| preset(config) | Start from a preset configuration |
| algorithm(algo) | Set pitch detection algorithm |
| voiceType(type) | Set frequency range for a voice type |
| quietHandling(handling) | Set amplitude gate threshold |
| strictness(strictness) | Set confidence threshold |
| enableProcessing() | Enable smoothing + octave correction |
| bufferSize(size) | Set buffer size for analysis |
| hopSize(samples) | Set hop size between frames |
| tolerance(value) | Set YIN algorithm tolerance |
| swiftF0BatchSize(samples) | Set SwiftF0 batch size |

Algorithms

| Algorithm | Best For | Latency | Dependencies |
| --- | --- | --- | --- |
| YIN | Real-time, low power, edge cases | ~50ms | None (pure DSP) |
| SWIFT_F0 | Vocals, high accuracy | ~16ms per frame | ONNX Runtime |

SwiftF0 achieves 91.80% harmonic mean accuracy at 10dB SNR with only 95k parameters. It requires a model provider:

// Register globally at app startup
AIModelRegistry.registerSwiftF0 { ModelLoader.loadSwiftF0() }

// Or pass per-instance
val detector = CalibraPitch.createDetector(config, modelProvider = { ModelLoader.loadSwiftF0() })

Voice Types

Optimize frequency range for different singing styles:

| Voice Type | Kotlin | Swift | Range (Hz) |
| --- | --- | --- | --- |
| Auto | VoiceType.auto | .auto | 80–1000 |
| Western Soprano | VoiceType.westernSoprano | .westernSoprano | 250–1000 |
| Western Alto | VoiceType.westernAlto | .westernAlto | 180–700 |
| Western Tenor | VoiceType.westernTenor | .westernTenor | 130–500 |
| Western Bass | VoiceType.westernBass | .westernBass | 80–350 |
| Western Child | VoiceType.westernChild | .westernChild | 200–1200 |
| Carnatic Male | VoiceType.carnaticMale | .carnaticMale | 90–450 |
| Carnatic Female | VoiceType.carnaticFemale | .carnaticFemale | 140–900 |
| Carnatic Child | VoiceType.carnaticChild | .carnaticChild | 200–1000 |
| Hindustani Male | VoiceType.hindustaniMale | .hindustaniMale | 90–450 |
| Hindustani Female | VoiceType.hindustaniFemale | .hindustaniFemale | 180–900 |
| Hindustani Child | VoiceType.hindustaniChild | .hindustaniChild | 200–1000 |
| Pop Male | VoiceType.popMale | .popMale | 100–500 |
| Pop Female | VoiceType.popFemale | .popFemale | 180–800 |
| Pop Child | VoiceType.popChild | .popChild | 200–1000 |
| Indian Film Male | VoiceType.indianFilmMale | .indianFilmMale | 100–500 |
| Indian Film Female | VoiceType.indianFilmFemale | .indianFilmFemale | 180–900 |
| Indian Film Child | VoiceType.indianFilmChild | .indianFilmChild | 200–1000 |

Quiet Handling

| Level | Kotlin | Swift | Gate (dB) | Description |
| --- | --- | --- | --- | --- |
| Sensitive | QuietHandling.SENSITIVE | .sensitive | -50 | Soft singing, quiet room |
| Normal | QuietHandling.NORMAL | .normal | -40 | Typical environment (default) |
| Noisy | QuietHandling.NOISY | .noisy | -30 | Loud environment |
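The gate compares each frame's RMS level, expressed in dB relative to full scale, against the threshold above. A minimal sketch of that standard computation (not the library's internals, just the conventional RMS-to-dB formula):

```kotlin
import kotlin.math.log10
import kotlin.math.sqrt

// RMS level of a frame in dBFS (0 dB = full-scale). This is the kind of
// quantity an amplitude gate compares against thresholds like -50/-40/-30 dB.
fun rmsDb(samples: FloatArray): Float {
    if (samples.isEmpty()) return Float.NEGATIVE_INFINITY
    val meanSquare = samples.map { it * it }.average()
    val rms = sqrt(meanSquare)
    return (20.0 * log10(rms + 1e-12)).toFloat()  // epsilon avoids log(0)
}

// True means the frame falls below the gate and is treated as silence.
fun isGated(samples: FloatArray, gateDb: Float = -40f): Boolean =
    rmsDb(samples) < gateDb

fun main() {
    val quiet = FloatArray(1024) { 0.001f }  // ~-60 dBFS: gated at NORMAL
    val loud = FloatArray(1024) { 0.5f }     // ~-6 dBFS: passes the gate
    println(isGated(quiet)) // true
    println(isGated(loud))  // false
}
```

With the default -40 dB (Normal) gate, soft singing around -45 dBFS would be dropped; switching to Sensitive (-50 dB) keeps it.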

Detection Strictness

| Level | Kotlin | Swift | Threshold | Description |
| --- | --- | --- | --- | --- |
| Strict | DetectionStrictness.STRICT | .strict | 0.85 | Fewer false positives |
| Balanced | DetectionStrictness.BALANCED | .balanced | 0.75 | Default setting |
| Lenient | DetectionStrictness.LENIENT | .lenient | 0.65 | Catches more notes |

PitchPoint

Each detection returns a PitchPoint with:

| Property | Type | Description |
| --- | --- | --- |
| pitch | Float | Frequency in Hz (-1 if unvoiced) |
| confidence | Float | Detection confidence (0.0–1.0) |
| timeSeconds | Float | Timestamp in seconds |
| isSinging | Boolean | Whether pitch was detected (pitch > 0) |
| midiNote | Int | MIDI note number (69 = A4) |
| note | String? | Note name with octave (e.g., "A4", "C#5") |
| centsOff | Int | Cents deviation from nearest note (-50 to +50) |
| tuning | Tuning | SILENT, FLAT, IN_TUNE, or SHARP |
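The midiNote, note, and centsOff fields follow the standard equal-temperament mapping from frequency (A4 = 440 Hz = MIDI 69). A sketch of that conventional math — the library's own rounding and tuning reference may differ:

```kotlin
import kotlin.math.log2
import kotlin.math.pow
import kotlin.math.roundToInt

// Equal-temperament pitch math: Hz -> MIDI note, note name, cents deviation.
// Conventional formulas only; not necessarily the library's exact implementation.
private val NOTE_NAMES = listOf("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")

fun midiNote(hz: Float): Int = (69 + 12 * log2(hz / 440f)).roundToInt()

// MIDI 69 -> "A4"; octave numbering where C4 = MIDI 60
fun noteName(midi: Int): String = NOTE_NAMES[midi % 12] + (midi / 12 - 1)

fun centsOff(hz: Float): Int {
    val midi = midiNote(hz)
    val nearestHz = (440.0 * 2.0.pow((midi - 69) / 12.0)).toFloat()
    return (1200 * log2(hz / nearestHz)).roundToInt()  // in -50..+50
}

fun main() {
    println(midiNote(440f))  // 69
    println(noteName(69))    // A4
    println(centsOff(445f))  // 20 (sharp)
}
```

A detection of 445 Hz would therefore report note "A4" with centsOff +20, i.e. a SHARP tuning result.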

Real-time Detection

Detector Methods

| Method | Description |
| --- | --- |
| detect(samples, sampleRate) | Detect pitch from audio buffer. Resamples to 16kHz internally. |
| getAmplitude(samples, sampleRate) | Get RMS amplitude of audio buffer |
| reset() | Reset state and internal buffer |
| release() / close() | Release all resources |
| duplicate() | Create independent copy with same config |
| setContourMaxDuration(seconds) | Set max duration for live pitch contour |
| clearPitchContour() | Clear accumulated pitch contour |

Detector Properties

| Property | Type | Description |
| --- | --- | --- |
| config | PitchDetectorConfig | Configuration used to create this detector |
| latencyMs | Float | Detection latency in milliseconds |
| hasProcessing | Boolean | Whether post-processing is available |
| processingEnabled | Boolean | Enable/disable post-processing at runtime |
| livePitchContour | StateFlow<PitchContour> | Accumulated pitch contour for visualization |

Observing Live Pitch Contour

Kotlin (StateFlow)

detector.livePitchContour.collect { contour ->
    // Update scrolling pitch display
    updatePitchVisualization(contour)
}

Swift (Observer)

let task = detector.observeLivePitchContour { contour in
    self.pitchContour = contour
}

// Cancel when done
task.cancel()

Batch Extraction

Extract a complete pitch contour from recorded audio.

Contour Extractor Presets

| Preset | Kotlin | Swift | Description |
| --- | --- | --- | --- |
| Default | ContourExtractorConfig.DEFAULT | .default | Balanced with scoring cleanup |
| Scoring | ContourExtractorConfig.SCORING | .scoring | Optimized for melody evaluation |
| Display | ContourExtractorConfig.DISPLAY | .display | Optimized for visualization |
| Raw | ContourExtractorConfig.RAW | .raw | No post-processing |

Kotlin

val extractor = CalibraPitch.createContourExtractor(
    ContourExtractorConfig.SCORING,
    modelProvider = { ModelLoader.loadSwiftF0() }
)
val contour = extractor.extract(audioSamples, sampleRate = 44100)
println("Duration: ${contour.duration}s, Voiced: ${(contour.voicedRatio * 100).toInt()}%")
extractor.release()

Swift

let extractor = CalibraPitch.createContourExtractor(
    config: .scoring,
    modelProvider: { ModelLoader.loadSwiftF0() }
)
let contour = extractor.extract(audio: audioSamples, sampleRate: 44100)
print("Duration: \(contour.duration)s, Voiced: \(Int(contour.voicedRatio * 100))%")
extractor.release()

Contour Extractor Builder

Kotlin

val config = ContourExtractorConfig.Builder()
    .pitchPreset(PitchPreset.PRECISE)
    .algorithm(PitchAlgorithm.SWIFT_F0)
    .sampleRate(16000)
    .hopMs(10)
    .cleanup(ContourCleanup.SCORING)
    .voiceType(VoiceType.carnaticMale)
    .build()

val extractor = CalibraPitch.createContourExtractor(config, modelProvider = { ModelLoader.loadSwiftF0() })

Swift

let config = ContourExtractorConfig.Builder()
    .pitchPreset(.precise)
    .algorithm(.swiftF0)
    .sampleRate(16000)
    .hopMs(10)
    .cleanup(.scoring)
    .voiceType(.carnaticMale)
    .build()

let extractor = CalibraPitch.createContourExtractor(config: config, modelProvider: { ModelLoader.loadSwiftF0() })

Contour Cleanup Presets

| Preset | Kotlin | Swift | Description |
| --- | --- | --- | --- |
| Raw | ContourCleanup.RAW | .raw | No post-processing |
| Scoring | ContourCleanup.SCORING | .scoring | Octave + boundary + blip removal |
| Display | ContourCleanup.DISPLAY | .display | Scoring + smoothing for visualization |

Post-Processing

Clean up pitch contours with CalibraPitch.PostProcess.

Contour-Level Methods

Kotlin

// Apply a cleanup preset
val cleaned = CalibraPitch.PostProcess.cleanup(contour, ContourCleanup.SCORING)

// Individual operations
val fixed = CalibraPitch.PostProcess.fixOctaveErrors(contour)
val noBoundary = CalibraPitch.PostProcess.fixBoundaryOctaves(contour)
val noBlips = CalibraPitch.PostProcess.removeBlips(contour, minDurationMs = 80f)
val smoothed = CalibraPitch.PostProcess.smooth(contour)

Swift

// Apply a cleanup preset
let cleaned = CalibraPitch.PostProcess.cleanup(contour, options: .scoring)

// Individual operations
let fixed = CalibraPitch.PostProcess.fixOctaveErrors(contour)
let noBoundary = CalibraPitch.PostProcess.fixBoundaryOctaves(contour)
let noBlips = CalibraPitch.PostProcess.removeBlips(contour, minDurationMs: 80)
let smoothed = CalibraPitch.PostProcess.smooth(contour)

Array-Level Methods

| Method | Description |
| --- | --- |
| process(pitchesHz) | Full processing (smoothing + octave correction) |
| smooth(pitchesHz, windowSize) | Smoothing filter only |
| correctOctaveErrors(pitchesHz, thresholdCents, referencePitchHz) | Fix octave jumps |
| medianFilter(pitchesHz, kernelSize) | Median filter for spike removal |
| rejectOutliers(pitchesHz, hopMs, minDurationMs) | Remove short pitch runs (blips) |
| correctBoundaryOctaves(pitchesHz, hopMs, boundaryWindowMs) | Fix octave errors at phrase edges |
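To illustrate the kind of spike removal medianFilter performs, here is a generic sliding-window median over a pitch track. This is a standalone sketch of the technique, not the library's implementation (edge handling in particular may differ):

```kotlin
// Generic median filter over a pitch track: each output value is the median
// of a small window centered on the input sample, which suppresses isolated
// spikes (e.g. single-frame octave jumps) while preserving steady pitch.
fun medianFilter(pitchesHz: FloatArray, kernelSize: Int = 3): FloatArray {
    val half = kernelSize / 2
    return FloatArray(pitchesHz.size) { i ->
        // Window is clamped at the edges, so boundary samples use fewer neighbors.
        val lo = maxOf(0, i - half)
        val hi = minOf(pitchesHz.lastIndex, i + half)
        val window = pitchesHz.copyOfRange(lo, hi + 1).sorted()
        window[window.size / 2]
    }
}

fun main() {
    // A single octave spike at index 2 is replaced by the surrounding pitch.
    val track = floatArrayOf(220f, 220f, 440f, 220f, 220f)
    println(medianFilter(track).toList()) // [220.0, 220.0, 220.0, 220.0, 220.0]
}
```

A kernel of 3 removes single-frame spikes; larger odd kernels remove longer blips at the cost of smearing fast legitimate pitch movement, which is why rejectOutliers (duration-based) exists as a separate operation.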

PitchContour

A sequence of PitchPoint values over time.

| Property | Type | Description |
| --- | --- | --- |
| samples | List<PitchPoint> | Pitch points in chronological order |
| sampleRate | Int | Original audio sample rate |
| duration | Float | Duration in seconds |
| voicedRatio | Float | Ratio of voiced to total samples (0.0–1.0) |
| size | Int | Number of samples |
| isEmpty | Boolean | True if contour has no samples |

| Method | Description |
| --- | --- |
| slice(startTime, endTime, relativeTimes) | Extract a time range |
| fromArrays(times, pitches, sampleRate) | Create from parallel arrays |
| fromPoints(points, sampleRate) | Create from a list of PitchPoints |
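A short Kotlin usage sketch of the factories and slice above. Parameter names follow the table but are illustrative; times and pitches are assumed to be parallel arrays, with -1 marking unvoiced frames as in PitchPoint:

```kotlin
// Build a contour from parallel time/pitch arrays, then cut out a phrase.
// Sketch only: argument names mirror the table above and are not verified.
val times = floatArrayOf(0.00f, 0.01f, 0.02f, 0.03f)
val pitches = floatArrayOf(220f, 221f, -1f, 223f) // -1 = unvoiced frame

val contour = PitchContour.fromArrays(times, pitches, sampleRate = 16000)
println(contour.voicedRatio) // 3 of 4 frames voiced

// Extract 10-30 ms; relativeTimes re-bases timestamps to start at zero
val phrase = contour.slice(startTime = 0.01f, endTime = 0.03f, relativeTimes = true)
```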

Common Patterns

Pitch Detector ViewModel

class PitchViewModel : ViewModel() {
    private var detector: CalibraPitch.Detector? = null

    val currentPitch = MutableStateFlow(PitchPoint.EMPTY)

    fun startDetection() {
        detector = CalibraPitch.createDetector(PitchDetectorConfig.BALANCED)

        viewModelScope.launch {
            recorder.audioBuffers.collect { buffer ->
                val point = detector?.detect(buffer.samples, buffer.sampleRate) ?: return@collect
                currentPitch.value = point
            }
        }
    }

    override fun onCleared() {
        detector?.close()
        detector = null
    }
}

Platform Notes

  • iOS: Audio input is typically 48kHz; resampled internally to 16kHz. SwiftF0 requires ai-models module with ONNX Runtime.
  • Android: Audio input is typically 44.1kHz or 16kHz depending on device. SwiftF0 uses ONNX Runtime for Android. YIN has no external dependencies.

Next Steps

  • CalibraVAD — Detect when someone is singing vs. silence
  • CalibraLiveEval — Live singing evaluation with scoring
  • CalibraMelodyEval — Evaluate singing accuracy against a reference
  • Utilities — Shared model types (PitchPoint, PitchContour, VoiceType)