Detecting Pitch

A complete guide to pitch detection with PitchDetection (in tona). For the full type reference see PitchDetection and PitchProcessing; this guide is task-oriented.

What You'll Learn

Detect pitch in real-time from microphone input
Extract pitch contours from recorded audio
Choose between YIN and SwiftF0 algorithms
Handle octave errors and noise
Convert frequency to musical notes

Prerequisites

VoxaTrace installed

Quick Start

Kotlin

val detector = PitchDetection.createDetector()

recorder.audioBuffers.collect { buffer ->
    val samples = buffer.toFloatArray()
    val point = detector.detect(samples, buffer.sampleRate)

    if (point.pitch > 0) {
        println("Pitch: ${point.pitch} Hz, Confidence: ${point.confidence}")
    }
}

detector.close()

Swift

let detector = PitchDetection.createDetector()

for await buffer in recorder.audioBuffers {
    let samples = buffer.samples
    let point = detector.detect(samples: samples, sampleRate: Int(buffer.sampleRate))

    if point.pitch > 0 {
        print("Pitch: \(point.pitch) Hz, Confidence: \(point.confidence)")
    }
}

detector.close()

Real-time Detection

Basic Setup

// Create detector with defaults
val detector = PitchDetection.createDetector()

// Or with preset
val detector = PitchDetection.createDetector(PitchDetectorConfig.BALANCED)

Processing Audio

// From recorder
recorder.audioBuffers.collect { buffer ->
    val samples = FloatArray(buffer.sampleCount)
    buffer.fillFloatSamples(samples)

    val point = detector.detect(samples, buffer.sampleRate)
    updateUI(point)
}

// From any audio source
val point = detector.detect(audioSamples, sampleRate = 48000)

Understanding Results

data class PitchPoint(
    val pitch: Float,      // Frequency in Hz (-1 if unvoiced)
    val confidence: Float, // 0.0 to 1.0
    val timeSeconds: Float // Timestamp (for contours)
)

// Check if voiced
if (point.pitch > 0) {
    // Valid pitch detected
} else {
    // Unvoiced (silence, noise, or consonant)
}

// Check confidence
if (point.confidence > 0.7f) {
    // High confidence - trust this pitch
}
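
The two checks above are often combined into a single gate. The following is a minimal sketch of that idea, not part of the library: `reliablePitchOrNull` is a hypothetical helper, and the local `PitchPoint` is only a stand-in that mirrors the fields listed above so the snippet compiles on its own.

```kotlin
// Stand-in mirroring the PitchPoint fields shown above (assumption: same field names).
data class PitchPoint(val pitch: Float, val confidence: Float, val timeSeconds: Float)

/**
 * Hypothetical helper: returns the pitch in Hz when the frame is voiced
 * and its confidence exceeds minConfidence, otherwise null.
 */
fun reliablePitchOrNull(point: PitchPoint, minConfidence: Float = 0.7f): Float? =
    if (point.pitch > 0f && point.confidence > minConfidence) point.pitch else null
```

Returning null rather than -1 lets callers use `?.let { ... }` or `?:` and makes it harder to draw or score an unvoiced frame by accident.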

Choosing an Algorithm

YIN (Default)

Classic DSP algorithm. No external dependencies.

val detector = PitchDetection.createDetector(
    PitchDetectorConfig.Builder()
        .algorithm(PitchAlgorithm.YIN)
        .build()
)

Best for:

Simple tuner apps
Low-resource devices
When you can't add ONNX Runtime

SwiftF0 (Neural Network)

Deep learning model with better accuracy in noisy conditions.

val detector = PitchDetection.createDetector(
    PitchDetectorConfig.Builder()
        .algorithm(PitchAlgorithm.SWIFT_F0)
        .build(),
    modelProvider = { ModelLoader.loadSwiftF0() }
)

Best for:

Singing apps (more robust to vibrato, noise)
Production apps where accuracy matters
When ONNX Runtime is acceptable

Configuration Options

Presets

PitchDetectorConfig.BALANCED  // Good tradeoff (default)
PitchDetectorConfig.PRECISE   // Higher accuracy, more CPU
PitchDetectorConfig.RELAXED   // Lower latency, lower accuracy

Voice Type

Optimize for specific vocal ranges:

val config = PitchDetectorConfig.Builder()
    // Call voiceType(...) once, with the option that matches the singer:
    //   VoiceType.WesternSoprano  - High female voice
    //   VoiceType.WesternTenor    - High male voice
    //   VoiceType.WesternBass     - Low male voice
    //   VoiceType.Auto            - Detect automatically
    //   VoiceType.carnaticMale    - Indian classical male
    .voiceType(VoiceType.WesternSoprano)
    .build()

Processing (Smoothing + Octave Correction)

val config = PitchDetectorConfig.Builder()
    .enableProcessing()  // Enable smoothing and octave correction
    .build()

// Or control at runtime
detector.processingEnabled = true

Quiet Handling

How to handle low-amplitude audio:

val config = PitchDetectorConfig.Builder()
    // Call quietHandling(...) once, with one of:
    //   QuietHandling.NORMAL     - Standard gating
    //   QuietHandling.SENSITIVE  - More sensitive, for soft singers
    //   QuietHandling.NOISY      - For noisy environments
    .quietHandling(QuietHandling.NORMAL)
    .build()

Batch Extraction

Extract complete pitch contours from recorded audio:

val extractor = PitchDetection.createContourExtractor(
    ContourExtractorConfig.SCORING,
    modelProvider = { ModelLoader.loadSwiftF0() }
)

// Load audio (anything other than 16 kHz will be resampled)
val audio = decoder.decode("recording.m4a")

// Extract contour
val contour = extractor.extract(audio.samples, audio.sampleRate)

// Access points
contour.samples.forEach { point ->
    println("Time: ${point.timeSeconds}s, Pitch: ${point.pitch} Hz")
}

extractor.release()

Cleanup Options

// For scoring - removes artifacts, corrects octaves
val config = ContourExtractorConfig.SCORING

// For display - includes smoothing
val config = ContourExtractorConfig.DISPLAY

// Raw - no processing
val config = ContourExtractorConfig.RAW

// Custom
val config = ContourExtractorConfig.Builder()
    .preset(ContourExtractorConfig.SCORING)
    .hopMs(10)  // 10ms between pitch samples
    .cleanup(PitchProcessingConfig.SCORING)
    .build()

The cleanup field on ContourExtractorConfig is typed as PitchProcessingConfig. Available presets: PitchProcessingConfig.RAW, SCORING, DISPLAY.
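
To get a feel for what a hop size means for contour length, the arithmetic can be sketched as below. This assumes one pitch sample per hop starting at time zero; the extractor's exact framing is not documented here, so treat the counts as an estimate. `frameCount` and `frameTimeSeconds` are hypothetical helpers, not library functions.

```kotlin
/** Estimated number of pitch samples for a clip, assuming one sample every hopMs starting at t = 0. */
fun frameCount(durationMs: Int, hopMs: Int): Int {
    require(durationMs >= 0) { "durationMs must not be negative" }
    require(hopMs > 0) { "hopMs must be positive" }
    return (durationMs + hopMs - 1) / hopMs  // integer ceiling of durationMs / hopMs
}

/** Timestamp in seconds of the frame at the given index, under the same assumption. */
fun frameTimeSeconds(index: Int, hopMs: Int): Double = index * hopMs / 1000.0
```

Under these assumptions a 3-minute recording at a 10 ms hop yields about 18,000 points, which is worth keeping in mind before drawing or storing a full contour.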

Post-Processing

Apply cleanup to existing contours via PitchProcessing:

// Apply preset pipeline (octave correction → blip removal → smoothing)
val cleaned = PitchProcessing.process(contour, PitchProcessingConfig.SCORING)

// Or build a custom config
val config = PitchProcessingConfig.Builder()
    .fixOctaveErrors(true)
    .removeSpuriousJumps(false)
    .smoothingWindowSize(9)
    .build()
val cleaned = PitchProcessing.process(contour, config)

Working with Arrays

val pitches = floatArrayOf(440f, 442f, 880f, 438f)  // Has octave error

// Full processing
val processed = PitchProcessing.process(pitches, PitchProcessingConfig.SCORING)

// Individual operations
val smoothed = PitchProcessing.smooth(pitches, windowSize = 5)
val corrected = PitchProcessing.correctOctaveErrors(pitches)
val filtered = PitchProcessing.medianFilter(pitches, kernelSize = 3)
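
For intuition about why a median filter removes an isolated outlier such as the 880 Hz value above, here is a minimal standalone sketch. It is not the library's implementation: `medianFilterSketch` is a hypothetical name, edges are handled by repeating the first and last values (an assumption), and unvoiced markers are treated like any other number.

```kotlin
/** Sliding median with an odd kernel size; indices past the ends are clamped to the nearest valid one. */
fun medianFilterSketch(values: FloatArray, kernelSize: Int = 3): FloatArray {
    require(kernelSize > 0 && kernelSize % 2 == 1) { "kernelSize must be a positive odd number" }
    if (values.isEmpty()) return FloatArray(0)
    val half = kernelSize / 2
    return FloatArray(values.size) { i ->
        // Collect the neighbourhood around i, then take its middle value after sorting.
        val window = FloatArray(kernelSize) { k ->
            values[(i - half + k).coerceIn(0, values.lastIndex)]
        }
        window.sort()
        window[half]
    }
}
```

With a kernel of 3, the input `[440, 442, 880, 438]` becomes `[440, 442, 442, 438]`: the single-frame spike is replaced by a neighbouring value instead of being averaged into it.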

Live Pitch Contour

For visualization, access the growing contour:

// Read the trailing window for drawing — call once per render frame
val contour = detector.pitchContour.recent(seconds = 10f)
drawPitchCurve(contour.samples)

// The window span is chosen at read time (for scrolling displays):
// pass the number of trailing seconds you want to recent(...).

// Clear when starting new recording
detector.clearPitchContour()

Converting to Notes

fun pitchToNote(frequency: Float): NoteInfo {
    if (frequency <= 0) return NoteInfo.UNVOICED

    val a4 = 440.0
    // Fractional MIDI note number (A4 = 69)
    val semitones = 12 * kotlin.math.log2(frequency / a4) + 69

    val noteNames = arrayOf("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")
    // Round to the nearest note (not truncate) so centsOff stays within -50..+50
    val midiNumber = kotlin.math.round(semitones).toInt()
    val noteIndex = (midiNumber % 12 + 12) % 12
    val octave = (midiNumber / 12) - 1
    val centsOff = kotlin.math.round((semitones - midiNumber) * 100).toInt()

    return NoteInfo(
        name = "${noteNames[noteIndex]}$octave",
        midi = midiNumber,
        centsOff = centsOff,
        frequency = frequency
    )
}

data class NoteInfo(
    val name: String,
    val midi: Int,
    val centsOff: Int,  // -50 to +50
    val frequency: Float
) {
    companion object {
        val UNVOICED = NoteInfo("--", 0, 0, -1f)
    }
}
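
The conversion above rests on the equal-temperament relation between frequency and MIDI note number, with A4 (MIDI 69) at 440 Hz. The inverse direction and a cents distance can be sketched as follows; `midiToFrequency` and `centsBetween` are hypothetical helper names, not library functions.

```kotlin
import kotlin.math.log2
import kotlin.math.pow

/** Frequency in Hz of a MIDI note in 12-tone equal temperament, with A4 (MIDI 69) tuned to a4Hz. */
fun midiToFrequency(midi: Int, a4Hz: Double = 440.0): Double =
    a4Hz * 2.0.pow((midi - 69) / 12.0)

/** Signed distance in cents from referenceHz to frequencyHz (100 cents = 1 semitone, 1200 = 1 octave). */
fun centsBetween(frequencyHz: Double, referenceHz: Double): Double =
    1200.0 * log2(frequencyHz / referenceHz)
```

For example, `centsBetween(detectedHz, midiToFrequency(targetMidi))` gives how far a sung pitch is from a target note, which is handy when scoring against a known melody rather than the nearest note.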

Common Patterns

Tuner App

class TunerViewModel : ViewModel() {
    private var detector: PitchDetector? = null
    private var recorder: SonixRecorder? = null

    val note = MutableStateFlow("--")
    val frequency = MutableStateFlow(0f)
    val centsOff = MutableStateFlow(0)

    fun start() {
        detector = PitchDetection.createDetector()
        recorder = SonixRecorder.createTemporary()

        recorder?.start()

        viewModelScope.launch {
            recorder?.audioBuffers?.collect { buffer ->
                val samples = FloatArray(buffer.sampleCount)
                buffer.fillFloatSamples(samples)

                val point = detector?.detect(samples, buffer.sampleRate) ?: return@collect

                if (point.pitch > 0 && point.confidence > 0.6f) {
                    val noteInfo = pitchToNote(point.pitch)
                    note.value = noteInfo.name
                    frequency.value = noteInfo.frequency
                    centsOff.value = noteInfo.centsOff
                } else {
                    note.value = "--"
                    frequency.value = 0f
                    centsOff.value = 0
                }
            }
        }
    }

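    fun stop() {
        recorder?.stop()
        recorder?.release()
        recorder = null
        detector?.close()
        detector = null  // Late buffers then skip detect() instead of using a closed detector
    }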
}
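
A tuner UI usually turns the cents offset into a flat / in-tune / sharp indicator. One way to sketch that, assuming `centsOff` is signed with negative values meaning flat (as the -50 to +50 range on `NoteInfo` suggests); `TuningStatus`, `tuningStatus`, and the 5-cent default tolerance are illustrative choices, not library values.

```kotlin
enum class TuningStatus { FLAT, IN_TUNE, SHARP }

/** Classifies a signed cents offset; anything within +/- toleranceCents counts as in tune. */
fun tuningStatus(centsOff: Int, toleranceCents: Int = 5): TuningStatus {
    require(toleranceCents >= 0) { "toleranceCents must not be negative" }
    return when {
        centsOff < -toleranceCents -> TuningStatus.FLAT
        centsOff > toleranceCents -> TuningStatus.SHARP
        else -> TuningStatus.IN_TUNE
    }
}
```

A wider tolerance makes the indicator steadier for beginners; a narrower one suits instrument tuning.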

Pitch Visualization

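import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.util.AttributeSet
import android.view.View

class PitchGraphView(context: Context, attrs: AttributeSet? = null) : View(context, attrs) {
    private val pitchHistory = mutableListOf<Float>()
    private val maxHistory = 100
    private val paint = Paint(Paint.ANTI_ALIAS_FLAG).apply { color = Color.BLUE }

    fun addPitch(pitch: Float) {
        pitchHistory.add(pitch)
        if (pitchHistory.size > maxHistory) {
            pitchHistory.removeAt(0)  // Drop the oldest point so the graph scrolls
        }
        invalidate()
    }

    override fun onDraw(canvas: Canvas) {
        super.onDraw(canvas)
        // Draw pitch history as scrolling graph; unvoiced points (pitch <= 0) are skipped
        pitchHistory.forEachIndexed { index, pitch ->
            if (pitch > 0) {
                val x = index * (width / maxHistory.toFloat())
                val y = height - (pitch / 1000f * height)  // Linear map: 0 Hz at bottom, 1000 Hz at top
                canvas.drawCircle(x, y, 4f, paint)
            }
        }
    }
}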
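
The view above maps frequency to height linearly. Because musical pitch is perceived logarithmically (each octave doubles the frequency), a log-scaled axis gives every octave the same vertical space. A minimal sketch of such a mapping follows; `pitchToY` and the 80 to 1000 Hz default range are illustrative assumptions, not library values.

```kotlin
import kotlin.math.ln

/**
 * Maps a frequency to a y coordinate on a logarithmic axis:
 * minHz lands at the bottom (y = viewHeight), maxHz at the top (y = 0).
 * Frequencies outside the range are clamped to it.
 */
fun pitchToY(pitchHz: Float, viewHeight: Float, minHz: Float = 80f, maxHz: Float = 1000f): Float {
    require(minHz > 0f && maxHz > minHz) { "need 0 < minHz < maxHz" }
    val clamped = pitchHz.coerceIn(minHz, maxHz)
    val fraction = ln(clamped / minHz) / ln(maxHz / minHz)  // 0 at minHz, 1 at maxHz
    return viewHeight * (1f - fraction)
}
```

With this mapping the step from 220 Hz to 440 Hz covers the same number of pixels as the step from 440 Hz to 880 Hz, so melodic intervals look the same size wherever they sit in the range.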

Troubleshooting

No Pitch Detected

Check microphone permission
Verify audio is reaching detector (print buffer sizes)
Try lowering confidence threshold
Check if audio is too quiet (use QuietHandling.SENSITIVE)

Wrong Octave

Enable processing: .enableProcessing()
Use post-processing: PitchProcessing.correctOctaveErrors(pitches, OctaveCorrectionConfig.FULL)
Try SwiftF0 algorithm (better octave handling)
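
To see what an octave error looks like in data and how a correction pass can work in principle, here is a deliberately simple sketch. It is not the library's algorithm: `fixOctaveJumps` is a hypothetical function that compares each voiced frame with the previous accepted one and folds it back when the jump is within a tolerance of exactly one octave.

```kotlin
import kotlin.math.abs
import kotlin.math.log2

/** Folds frames that sit about one octave above or below the previous voiced frame back into its octave. */
fun fixOctaveJumps(pitches: FloatArray, toleranceSemitones: Double = 1.0): FloatArray {
    val out = pitches.copyOf()
    var reference = -1f  // Last accepted voiced pitch; -1 until one is seen
    for (i in out.indices) {
        val p = out[i]
        if (p <= 0f) continue  // Leave unvoiced frames untouched
        if (reference > 0f) {
            val jump = 12.0 * log2((p / reference).toDouble())  // Jump size in semitones
            when {
                abs(jump - 12.0) <= toleranceSemitones -> out[i] = p / 2f  // About an octave too high
                abs(jump + 12.0) <= toleranceSemitones -> out[i] = p * 2f  // About an octave too low
            }
        }
        reference = out[i]
    }
    return out
}
```

On `[440, 442, 880, 438]` this returns `[440, 442, 440, 438]`. The sketch has obvious limits: it trusts the first voiced frame, and it also folds a genuine octave leap in the melody, which is why a production corrector needs more context than one previous frame.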

Noisy/Jumpy Readings

Enable smoothing: .enableProcessing()
Filter by confidence (> 0.7)
Use median filter in post-processing

Next Steps

Pitch Detection Concepts - Theory and background
Tuner App Recipe - Complete example
