CalibraBreath

Breath capacity and control analysis for vocal performance assessment. Measures how well a singer manages their breathing by analyzing pitch contour data.

Quick Start

Kotlin

// Extract pitch contour first
val contour = pitchExtractor.extract(audio, sampleRate = 16000)
val times = contour.toTimesArray()
val pitches = contour.toPitchesArray()

// Check if enough data (needs 5+ seconds of voiced audio)
if (CalibraBreath.hasEnoughData(times, pitches)) {
    val capacity = CalibraBreath.computeCapacity(times, pitches)
    println("Breath capacity: $capacity seconds")
}

// Get total voiced time
val voicedTime = CalibraBreath.getCumulativeVoicedTime(times, pitches)
println("Total sung time: $voicedTime seconds")

Swift

// Extract pitch contour first
let contour = pitchExtractor.extract(audio: audio, sampleRate: 16000)
let times = contour.toTimesArray()
let pitches = contour.toPitchesArray()

// Check if enough data (needs 5+ seconds of voiced audio)
if CalibraBreath.hasEnoughData(times: times, pitchesHz: pitches) {
    let capacity = CalibraBreath.computeCapacity(times: times, pitchesHz: pitches)
    print("Breath capacity: \(capacity) seconds")
}

// Get total voiced time
let voicedTime = CalibraBreath.getCumulativeVoicedTime(times: times, pitchesHz: pitches)
print("Total sung time: \(voicedTime) seconds")

When to Use

| Scenario | Use This? | Why |
| --- | --- | --- |
| Analyze sustained note ability | Yes | Core use case |
| Track breath improvement | Yes | Compare over time |
| Compare student vs. reference | Yes | Use computeMetrics |
| Real-time breath feedback | No | Use pitch contour length instead |
| Detect breathing moments | Partially | Use unvoiced gaps in pitch data |
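
For the "Detect breathing moments" row, one way to read unvoiced gaps off a pitch contour is sketched below. This is an illustration only and not part of CalibraBreath: findUnvoicedGaps, UnvoicedGap, and the 0.3-second minimum gap are hypothetical names and values chosen for the example.

```kotlin
// Hypothetical helper types and names; not part of the CalibraBreath API.
data class UnvoicedGap(val startSec: Float, val endSec: Float) {
    val durationSec: Float get() = endSec - startSec
}

/**
 * Returns the unvoiced runs that sit between two voiced frames and last at
 * least [minGapSec]. A frame counts as unvoiced when its pitch is <= 0
 * (the contour uses -1). Leading and trailing silence is ignored.
 */
fun findUnvoicedGaps(
    times: FloatArray,
    pitchesHz: FloatArray,
    minGapSec: Float = 0.3f // assumed threshold; tune for your material
): List<UnvoicedGap> {
    require(times.size == pitchesHz.size) { "times and pitchesHz must have the same length" }
    val gaps = mutableListOf<UnvoicedGap>()
    var lastVoiced = -1 // index of the most recent voiced frame
    for (i in pitchesHz.indices) {
        if (pitchesHz[i] <= 0f) continue
        // Frame i is voiced; if frames were skipped since lastVoiced, a gap lies between them.
        if (lastVoiced >= 0 && i - lastVoiced > 1) {
            val gap = UnvoicedGap(times[lastVoiced], times[i])
            if (gap.durationSec >= minGapSec) gaps += gap
        }
        lastVoiced = i
    }
    return gaps
}
```

Each returned gap is a candidate breathing moment. Short gaps may be consonants or pitch-tracker dropouts, which is why the sketch filters by duration.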

Methods

All methods are static on the CalibraBreath object (Kotlin) / class (Swift).

hasEnoughData

Check if there is enough data for breath analysis. Requires at least 5 seconds of cumulative voiced audio to produce meaningful results.

Kotlin

val enough: Boolean = CalibraBreath.hasEnoughData(times, pitchesHz)

Swift

let enough: Bool = CalibraBreath.hasEnoughData(times: times, pitchesHz: pitches)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| times | FloatArray / [Float] | Timestamps in seconds |
| pitchesHz | FloatArray / [Float] | Pitch values in Hz (-1 for unvoiced frames) |

Returns: Boolean / Bool -- true if there are at least 5 seconds of voiced audio.

computeCapacity

Compute breath capacity from a pitch contour. Measures the maximum duration of sustained voiced segments, indicating how long the singer can hold notes without breathing.

Internally resamples the pitch contour to a 10 Hz feature rate using SonixResampler (libsamplerate), eliminates short non-breath silences, and models breath reserve as an exponential function.

Kotlin

val capacity: Float = CalibraBreath.computeCapacity(times, pitchesHz)

Swift

let capacity: Float = CalibraBreath.computeCapacity(times: times, pitchesHz: pitches)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| times | FloatArray / [Float] | Timestamps in seconds |
| pitchesHz | FloatArray / [Float] | Pitch values in Hz (-1 for unvoiced frames) |

Returns: Float -- Breath capacity in seconds (longest sustained phrase). Returns -1 on failure. Returns at least 1 on success.
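
To build intuition for "longest sustained phrase", the sketch below finds the longest voiced stretch while bridging short unvoiced gaps. It is a deliberately simplified stand-in and not the library's algorithm: it skips the 10 Hz resampling and the exponential breath-reserve model, and longestSustainedPhraseSec and the 0.25-second bridging threshold are assumptions made for illustration.

```kotlin
/**
 * Simplified illustration (not CalibraBreath's implementation).
 * Returns the longest voiced stretch in seconds, treating two voiced frames
 * that are at most [maxBridgedGapSec] apart as part of the same phrase.
 * Returns -1f when the arrays differ in length or hold fewer than 2 samples.
 */
fun longestSustainedPhraseSec(
    times: FloatArray,
    pitchesHz: FloatArray,
    maxBridgedGapSec: Float = 0.25f // assumed value, not taken from the library
): Float {
    if (times.size != pitchesHz.size || times.size < 2) return -1f
    var best = 0f
    var phraseStart = -1 // index where the current phrase began
    var lastVoiced = -1  // index of the most recent voiced frame
    for (i in pitchesHz.indices) {
        if (pitchesHz[i] <= 0f) continue
        // Start a new phrase on the first voiced frame or after a long silence.
        val startsNewPhrase =
            lastVoiced < 0 || times[i] - times[lastVoiced] > maxBridgedGapSec
        if (startsNewPhrase) phraseStart = i
        best = maxOf(best, times[i] - times[phraseStart])
        lastVoiced = i
    }
    return best
}
```

Note that the frame interval must be smaller than maxBridgedGapSec, otherwise every voiced frame starts its own phrase.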

getCumulativeVoicedTime

Calculate the total amount of time where pitch was detected (i.e., the singer was producing voiced sound).

Kotlin

val voicedTime: Float = CalibraBreath.getCumulativeVoicedTime(times, pitchesHz)

Swift

let voicedTime: Float = CalibraBreath.getCumulativeVoicedTime(times: times, pitchesHz: pitches)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| times | FloatArray / [Float] | Timestamps in seconds |
| pitchesHz | FloatArray / [Float] | Pitch values in Hz (-1 for unvoiced frames) |

Returns: Float -- Total voiced time in seconds. Returns -1 on failure (fewer than 2 samples or mismatched array lengths).
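
As a rough mental model of what this value represents, voiced time can be approximated by counting voiced frames and multiplying by the frame interval. The function below is a hypothetical sketch written for this page, not the library's code; it assumes evenly spaced timestamps and mirrors the documented -1 failure cases.

```kotlin
/**
 * Illustrative approximation (hypothetical helper, not the CalibraBreath API).
 * Counts frames with a positive pitch and multiplies by the frame interval,
 * which is taken from the first two timestamps.
 */
fun approximateVoicedTimeSec(times: FloatArray, pitchesHz: FloatArray): Float {
    // Mirror the documented failure cases: too few samples or mismatched lengths.
    if (times.size < 2 || times.size != pitchesHz.size) return -1f
    val frameStepSec = times[1] - times[0]
    if (frameStepSec <= 0f) return -1f
    val voicedFrames = pitchesHz.count { it > 0f }
    return voicedFrames * frameStepSec
}
```

Comparing this estimate against the 5-second requirement of hasEnoughData is a quick sanity check when a recording is unexpectedly rejected.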

computeMetrics

Compute comprehensive breath metrics comparing a student performance to a reference. Internally merges consecutive feedback segments into sung regions, then computes per-region breath capacity and control scores using FFT-based cross-correlation alignment and peak detection.

Kotlin

val metrics: BreathMetrics = CalibraBreath.computeMetrics(
    refTimes = refTimes,
    refPitchesHz = refPitchesHz,
    studentTimes = studentTimes,
    studentPitchesHz = studentPitchesHz,
    feedbackSegmentIndices = feedbackSegmentIndices,
    feedbackStartTimes = feedbackStartTimes,
    feedbackEndTimes = feedbackEndTimes,
    refSegmentStarts = refSegmentStarts,
    refSegmentEnds = refSegmentEnds
)

Swift

let metrics: BreathMetrics = CalibraBreath.computeMetrics(
    refTimes: refTimes,
    refPitchesHz: refPitchesHz,
    studentTimes: studentTimes,
    studentPitchesHz: studentPitchesHz,
    feedbackSegmentIndices: feedbackSegmentIndices,
    feedbackStartTimes: feedbackStartTimes,
    feedbackEndTimes: feedbackEndTimes,
    refSegmentStarts: refSegmentStarts,
    refSegmentEnds: refSegmentEnds
)

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| refTimes | FloatArray / [Float] | Reference pitch timestamps in seconds |
| refPitchesHz | FloatArray / [Float] | Reference pitches in Hz |
| studentTimes | FloatArray / [Float] | Student's pitch timestamps in seconds |
| studentPitchesHz | FloatArray / [Float] | Student's pitches in Hz |
| feedbackSegmentIndices | IntArray / [Int] | Indices of feedback segments |
| feedbackStartTimes | FloatArray / [Float] | Start times of feedback segments |
| feedbackEndTimes | FloatArray / [Float] | End times of feedback segments |
| refSegmentStarts | FloatArray / [Float] | Reference segment start times |
| refSegmentEnds | FloatArray / [Float] | Reference segment end times |

Returns: BreathMetrics with capacity, control, and validity.
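
The segment-merging step mentioned in the description can be pictured roughly as follows. The exact rule the library uses is not documented here, so this sketch assumes that "consecutive" means consecutive segment indices; SungRegion and mergeConsecutiveSegments are hypothetical names introduced for the example.

```kotlin
// Hypothetical type and function; the library's actual merge rule may differ.
data class SungRegion(val startSec: Float, val endSec: Float)

/**
 * Merges feedback segments whose indices follow one another (n, n + 1, ...)
 * into a single region that spans from the first start to the last end.
 */
fun mergeConsecutiveSegments(
    indices: IntArray,
    starts: FloatArray,
    ends: FloatArray
): List<SungRegion> {
    require(indices.size == starts.size && starts.size == ends.size) {
        "indices, starts and ends must have the same length"
    }
    val regions = mutableListOf<SungRegion>()
    for (i in indices.indices) {
        val last = regions.lastOrNull()
        if (last != null && i > 0 && indices[i] == indices[i - 1] + 1) {
            // Same sung region: extend the end time of the region being built.
            regions[regions.lastIndex] = last.copy(endSec = maxOf(last.endSec, ends[i]))
        } else {
            // Index jump (or first segment): start a new region.
            regions += SungRegion(starts[i], ends[i])
        }
    }
    return regions
}
```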

Result Types

BreathMetrics

| Property | Type | Description |
| --- | --- | --- |
| capacity | Float | Breath capacity in seconds -- longest sustained phrase |
| control | Float | Breath control score (0.0 to 1.0) -- breathing pattern consistency vs. reference |
| isValid | Boolean / Bool | Whether the result is valid (enough data was available) |

When data is insufficient or computation fails, computeMetrics returns BreathMetrics(capacity = -1, control = -1, isValid = false).

Understanding Breath Capacity

Breath capacity represents the longest sustained phrase duration:

| Capacity | Level | Interpretation |
| --- | --- | --- |
| < 3 seconds | Needs work | Short breath support |
| 3–5 seconds | Beginner | Average, typical for untrained singers |
| 5–8 seconds | Good | Solid breath control |
| > 8 seconds | Excellent | Trained singer level |
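
If an app needs these levels in code, a small mapping like the one below may help. CapacityLevel and classifyCapacity are hypothetical names, and because the table's ranges share their boundaries, the sketch assumes that exactly 5 seconds counts as Good and exactly 8 seconds still counts as Good.

```kotlin
// Hypothetical enum and function; not part of the CalibraBreath API.
enum class CapacityLevel { NEEDS_WORK, BEGINNER, GOOD, EXCELLENT }

/**
 * Maps a capacity in seconds to the levels in the table above.
 * Returns null for negative values, since -1 signals a failed computation.
 */
fun classifyCapacity(capacitySec: Float): CapacityLevel? = when {
    capacitySec < 0f -> null
    capacitySec < 3f -> CapacityLevel.NEEDS_WORK
    capacitySec < 5f -> CapacityLevel.BEGINNER
    capacitySec <= 8f -> CapacityLevel.GOOD // assumption: boundaries 5 and 8 fall here
    else -> CapacityLevel.EXCELLENT
}
```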

Understanding Breath Control

Breath control (from computeMetrics) measures how well the student's breathing patterns match the reference performance:

| Score | Interpretation |
| --- | --- |
| 0.8–1.0 | Excellent match to reference breathing |
| 0.5–0.8 | Moderate alignment, room for improvement |
| < 0.5 | Poor match, breathing patterns differ significantly |

The score is computed by aligning the student's breath function against the reference using FFT cross-correlation, then comparing peak positions and amplitudes with a 0.5-second tolerance and 30% amplitude similarity threshold.
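
The peak-comparison idea can be illustrated with a small sketch. It assumes the alignment step has already happened and that peaks have been extracted from both breath functions. It also interprets the 30% threshold as "amplitudes differ by at most 30% of the reference amplitude", which is only one possible reading. Peak and peakMatchScore are hypothetical names, and the library's actual scoring may differ.

```kotlin
import kotlin.math.abs

// Hypothetical types and scoring rule, written for illustration only.
data class Peak(val timeSec: Float, val amplitude: Float)

/**
 * Fraction of reference peaks that have a student peak within
 * [timeToleranceSec] whose amplitude differs by at most
 * [maxRelativeAmplitudeDiff] of the reference amplitude.
 * Each student peak can be matched at most once. Returns -1f if there are
 * no reference peaks.
 */
fun peakMatchScore(
    refPeaks: List<Peak>,
    studentPeaks: List<Peak>,
    timeToleranceSec: Float = 0.5f,
    maxRelativeAmplitudeDiff: Float = 0.3f
): Float {
    if (refPeaks.isEmpty()) return -1f
    val unused = studentPeaks.toMutableList()
    var matched = 0
    for (ref in refPeaks) {
        // Pick the closest-in-time student peak that passes both checks.
        val candidate = unused
            .filter { abs(it.timeSec - ref.timeSec) <= timeToleranceSec }
            .filter { abs(it.amplitude - ref.amplitude) <= maxRelativeAmplitudeDiff * abs(ref.amplitude) }
            .minByOrNull { abs(it.timeSec - ref.timeSec) }
        if (candidate != null) {
            matched++
            unused.remove(candidate)
        }
    }
    return matched.toFloat() / refPeaks.size
}
```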

Common Patterns

Post-Lesson Breath Report

class BreathReportViewModel(
    private val pitchExtractor: CalibraPitch.ContourExtractor
) : ViewModel() {

    fun analyzeRecording(audio: FloatArray, sampleRate: Int) {
        viewModelScope.launch {
            val contour = pitchExtractor.extract(audio, sampleRate)
            val times = contour.toTimesArray()
            val pitches = contour.toPitchesArray()

            if (!CalibraBreath.hasEnoughData(times, pitches)) {
                showMessage("Not enough singing data. Need at least 5 seconds.")
                return@launch
            }

            val capacity = CalibraBreath.computeCapacity(times, pitches)
            val voicedTime = CalibraBreath.getCumulativeVoicedTime(times, pitches)

            showResults(
                capacitySeconds = capacity,
                totalSungTime = voicedTime
            )
        }
    }
}

Compare Student to Reference

fun evaluateBreath(
    refTimes: FloatArray, refPitches: FloatArray,
    studentTimes: FloatArray, studentPitches: FloatArray,
    feedbackIndices: IntArray,
    feedbackStarts: FloatArray, feedbackEnds: FloatArray,
    refStarts: FloatArray, refEnds: FloatArray
) {
    val metrics = CalibraBreath.computeMetrics(
        refTimes = refTimes,
        refPitchesHz = refPitches,
        studentTimes = studentTimes,
        studentPitchesHz = studentPitches,
        feedbackSegmentIndices = feedbackIndices,
        feedbackStartTimes = feedbackStarts,
        feedbackEndTimes = feedbackEnds,
        refSegmentStarts = refStarts,
        refSegmentEnds = refEnds
    )

    if (metrics.isValid) {
        println("Breath capacity: ${metrics.capacity}s")
        println("Breath control: ${(metrics.control * 100).toInt()}%")
    }
}

Common Pitfalls

  1. Not enough data -- Need 5+ seconds of actual singing (not silence). Always check with hasEnoughData first.
  2. Wrong pitch format -- Use -1 for unvoiced frames, not 0. Values at or below 50 Hz are treated as silence internally.
  3. Mismatched array lengths -- times and pitchesHz must be the same length. Methods that return a Float return -1 if they differ.
  4. Timestamps must be evenly spaced -- The sample rate is inferred from the gap between the first two timestamps. Irregular spacing will produce incorrect results.
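
A pre-flight check along the following lines can catch most of these pitfalls before calling into CalibraBreath. It is a sketch only: findContourProblem is a hypothetical helper, and the 10% spacing tolerance is an assumed value rather than anything the library specifies.

```kotlin
import kotlin.math.abs

/**
 * Hypothetical validation helper (not part of CalibraBreath).
 * Returns a description of the first problem found, or null if the
 * contour looks usable.
 */
fun findContourProblem(
    times: FloatArray,
    pitchesHz: FloatArray,
    spacingTolerance: Float = 0.1f // assumed: allow 10% deviation from the first frame step
): String? {
    if (times.size != pitchesHz.size) return "times and pitchesHz differ in length"
    if (times.size < 2) return "need at least 2 samples"
    val step = times[1] - times[0]
    if (step <= 0f) return "timestamps must increase"
    // Every later interval should match the first one, within tolerance.
    for (i in 2 until times.size) {
        val delta = times[i] - times[i - 1]
        if (abs(delta - step) > spacingTolerance * step) {
            return "uneven timestamp spacing at index $i"
        }
    }
    if (pitchesHz.any { it == 0f }) return "found 0 Hz frames; use -1 for unvoiced frames"
    return null
}
```

Running this before hasEnoughData makes a rejected recording easier to diagnose, since the message says whether the input was malformed or simply too short.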
