TesseraSpeakingPitch
Detects a speaker's natural speaking pitch (the median F0 of voiced frames in conversational speech). Useful for voice profiling and shruti suggestion.
Quick Start
Kotlin
val hz = TesseraSpeakingPitch.detectFromAudio(audioSamples, sampleRate = 16000)
if (hz > 0) println("Speaking pitch: $hz Hz")
Swift
let hz = TesseraSpeakingPitch.detectFromAudio(audioMono: samples, sampleRate: 16000)
if hz > 0 { print("Speaking pitch: \(hz) Hz") }
Methods
| Method | Description |
|---|---|
| `detect(contour: PitchContour): Float` | Median F0 from a pre-extracted contour. Throws IllegalArgumentException if the contour is empty (per ADR-022). |
| `detectFromAudio(audioMono, sampleRate = 16000): Float` | One-shot from raw audio. Resamples to 16 kHz internally (per ADR-017). Throws if audioMono is empty or sampleRate <= 0. |
| `detectFromPitch(pitchesHz): Float` | One-shot from a raw pitch array. Returns 0 instead of throwing for empty input. |
All three return 0 on detection failure (e.g., not enough voiced frames). Always check > 0 before using the result.
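
To make the "median F0 of voiced frames" idea and the 0 sentinel concrete, here is a minimal sketch of that computation. It is not Tessera's implementation: `medianVoicedPitchHz` and `minVoicedFrames` are hypothetical names, and the sketch assumes unvoiced frames are encoded as non-positive (or non-finite) values.

```kotlin
// Hypothetical helper for illustration only; not part of the Tessera API.
// Assumption: unvoiced frames are encoded as values <= 0 (or NaN).
fun medianVoicedPitchHz(pitchesHz: FloatArray, minVoicedFrames: Int = 1): Float {
    // Keep only voiced frames: finite, strictly positive F0 estimates.
    val voiced = pitchesHz.filter { it.isFinite() && it > 0f }.sorted()

    // Mirror the documented convention: 0 means "detection failed".
    if (voiced.isEmpty() || voiced.size < minVoicedFrames) return 0f

    // Median: middle element, or the mean of the two middle elements.
    val mid = voiced.size / 2
    return if (voiced.size % 2 == 1) voiced[mid] else (voiced[mid - 1] + voiced[mid]) / 2f
}

fun main() {
    // Two unvoiced frames (0) and one octave-error outlier (240 Hz).
    val frames = floatArrayOf(0f, 118f, 122f, 0f, 120f, 240f)
    println(medianVoicedPitchHz(frames))            // prints 121.0
    println(medianVoicedPitchHz(floatArrayOf()))    // prints 0.0
}
```

In this example the median stays near 120 Hz even though one frame reads 240 Hz, which is one reason a median is a reasonable summary for a noisy pitch track.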
Common Pitfalls
- Returns 0, not -1, on failure. The 0 is a sentinel value, not an exception. The empty-input throw from `detect` and `detectFromAudio` is a separate case that signals a caller bug.
- Tuned for speech, not singing. For singing, use `TesseraBreath` or `Tessera.analyze`.
- Don't pass an empty `PitchContour` to `detect`. It throws, because an empty contour is treated as a caller bug. `detectFromPitch` is the non-throwing variant: it returns 0 for empty input.
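
One way to keep the two failure modes above apart is to wrap the call and map each outcome to its own type. The sketch below is a suggestion, not part of the library: `PitchOutcome` and `classifyPitch` are hypothetical names, and it assumes the throwing methods raise IllegalArgumentException (documented for `detect`, assumed here for `detectFromAudio`).

```kotlin
// Hypothetical types for illustration only; not part of the Tessera API.
sealed class PitchOutcome {
    data class Detected(val hz: Float) : PitchOutcome()
    object NoPitch : PitchOutcome()                       // the 0 sentinel
    data class InvalidInput(val reason: String) : PitchOutcome()
}

// Takes the detection call as a lambda so the sketch stays self-contained.
fun classifyPitch(detect: () -> Float): PitchOutcome =
    try {
        val hz = detect()
        // A non-positive result is the documented "detection failed" sentinel.
        if (hz > 0f) PitchOutcome.Detected(hz) else PitchOutcome.NoPitch
    } catch (e: IllegalArgumentException) {
        // Assumed exception type for empty input or a bad sample rate.
        PitchOutcome.InvalidInput(e.message ?: "invalid input")
    }

fun main() {
    val outcomes = listOf(
        classifyPitch { 121f },
        classifyPitch { 0f },
        classifyPitch { throw IllegalArgumentException("contour is empty") },
    )
    for (outcome in outcomes) {
        when (outcome) {
            is PitchOutcome.Detected -> println("Speaking pitch: ${outcome.hz} Hz")
            is PitchOutcome.NoPitch -> println("No pitch detected")
            is PitchOutcome.InvalidInput -> println("Caller bug: ${outcome.reason}")
        }
    }
}
```

In application code the lambda would wrap the real call, for example `classifyPitch { TesseraSpeakingPitch.detectFromAudio(audioSamples, sampleRate = 16000) }`.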
Use with shruti derivation
The speaking-pitch reading feeds directly into MusicTheory.deriveUserShruti:
val nspHz = TesseraSpeakingPitch.detectFromAudio(speechAudio)
val derivation = MusicTheory.deriveUserShruti(
nspHz = nspHz.takeIf { it > 0 },
rangeLowHz = userRange?.lower?.frequencyHz,
rangeHighHz = userRange?.upper?.frequencyHz,
)
println("Practice shruti: ${derivation.targetHz} Hz (source = ${derivation.source})")
See also
- MusicTheory.deriveUserShruti
- TesseraRange — for vocal range
- Tessera (multi-metric)