Histograms, tuning estimation, quantization, and melodic transcription on a PitchContour. Use after detection (PitchDetection) and optional cleanup (PitchProcessing).
For full intonation scoring (per-note deviation plus a 0–100 score), use Accura.analyzePitching; Accura wraps this facade.
Quick Start
Kotlin
val extractor = PitchDetection.createContourExtractor(ContourExtractorConfig.SCORING)
val contour = extractor.extract(audioSamples, 16000)
extractor.release()
val histogram = PitchAnalysis.computeHistogram(contour, tonicHz = 196f)
val tuningOffset = PitchAnalysis.estimateTuningOffset(contour, refFreqHz = 196f)
val targetIntervals = MusicTheory.EQ_TEMPERED_INTERVALS_CENTS_BASE
    .map { it.toFloat() }.toFloatArray()
val segments = PitchAnalysis.labelByMeanPitch(contour, tonicHz = 196f, targetIntervals)
for (seg in segments) {
    println("${seg.label}: ${seg.startSeconds}s – ${seg.endSeconds}s")
}
Swift
let histogram = PitchAnalysis.computeHistogram(contour: contour, tonicHz: 196)
let offset = PitchAnalysis.estimateTuningOffset(contour: contour, refFreqHz: 196)
let targets = MusicTheory.eqTemperedIntervalsCentsBase.map { Float($0) }
let segments = PitchAnalysis.labelByMeanPitch(
    contour: contour, tonicHz: 196, targetIntervalsCents: targets
)
Methods
Histogram
| Method | Description |
|---|---|
| computeHistogram(contour, tonicHz, config = HistogramConfig.DEFAULT): PitchHistogram | Bins pitch in cents relative to tonic. Optional fold-octaves, density normalization, smoothing. |
| estimateTuningOffset(contour, refFreqHz, centTolerance: Float = 50f): Float | Aligns histogram peaks to the 12-TET grid; returns offset in cents. |
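The histogram bins pitch in cents relative to the tonic, optionally folded into one octave. As a rough sketch of that arithmetic, assuming the standard 1200 · log2 ratio formula (the helper names hzToCents and foldToOctave are hypothetical and not part of PitchAnalysis):

```kotlin
import kotlin.math.log2

// Hypothetical helper, not a PitchAnalysis API: cents of a frequency relative to the tonic.
fun hzToCents(freqHz: Float, tonicHz: Float): Float {
    require(tonicHz > 0f) { "tonicHz must be > 0" }
    require(freqHz > 0f) { "freqHz must be > 0" }
    return 1200f * log2(freqHz / tonicHz)
}

// Hypothetical helper: fold a cents value into a single octave, [0, 1200).
// Float.mod (Kotlin 1.5+) is a floored modulo, so negative inputs wrap upward.
fun foldToOctave(cents: Float): Float = cents.mod(1200f)
```

With a tonic of 196 Hz, a frame at 392 Hz sits one octave up (about 1200 cents) and folds back to the tonic bin.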
Transcription
| Method | Description |
|---|---|
| quantize(contour, tonicHz, targetIntervalsCents, config = QuantizationConfig.DEFAULT): PitchContour | Snaps stable frames to the nearest target interval; non-stable frames become unvoiced. |
| labelByMeanPitch(contour, tonicHz, targetIntervalsCents, config = LabellingConfig.DEFAULT): List<TonalSegment> | Sliding-window mean-pitch labelling against target intervals. |
| fitLinearSegments(contour, tonicHz, config = LinearFitConfig.DEFAULT): List<TonalSegment> | Piecewise linear regression (gamaka / ornament analysis). |
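The "snap to nearest target interval" step can be illustrated with a minimal sketch. It assumes that a distance gate like maxDeviationCents rejects frames that are too far from every target, which the table above does not spell out, and it ignores the stability (slope) test entirely. The name snapToNearest is hypothetical.

```kotlin
import kotlin.math.abs

// Hypothetical helper, not a PitchAnalysis API: returns the target interval (cents)
// closest to the given pitch, or null when no target lies within maxDeviationCents.
fun snapToNearest(
    cents: Float,
    targetsCents: FloatArray,
    maxDeviationCents: Float = 50f
): Float? {
    val nearest = targetsCents.minByOrNull { abs(it - cents) } ?: return null
    return if (abs(nearest - cents) <= maxDeviationCents) nearest else null
}
```

A null result here plays the role of an unvoiced frame in the quantized contour.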
Config classes
HistogramConfig
| Property | Type | Default | Description |
|---|---|---|---|
| numBins | Int? | null (auto) | Number of bins; null selects from range |
| density | Boolean | true | Normalize so total area = 1 |
| foldOctaves | Boolean | false | Fold all pitches into one octave |
| mode | HistogramMode | DURATION | DURATION (time-weighted) or INSTANCE_COUNT |
| smoothSigma | Float? | 5f | Gaussian smoothing sigma; null = no smoothing |
Presets: DEFAULT, FOLDED (foldOctaves=true), RAW (density=false, smoothSigma=null).
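To make the density and smoothSigma options concrete, here is a minimal sketch of area normalization and Gaussian smoothing over an array of bin values. It is not the library's implementation: the function names are hypothetical, sigma is assumed to be measured in bins (the source does not state the unit), and edge bins are handled by renormalizing the truncated kernel, which is one choice among several.

```kotlin
import kotlin.math.exp

// Hypothetical helper: scale values so that sum(values) * binWidthCents == 1.
fun normalizeArea(values: FloatArray, binWidthCents: Float): FloatArray {
    val area = values.sum() * binWidthCents
    if (area <= 0f) return values.copyOf() // nothing to normalize
    return FloatArray(values.size) { values[it] / area }
}

// Hypothetical helper: Gaussian smoothing with sigma expressed in bins (an assumption).
fun gaussianSmooth(values: FloatArray, sigmaBins: Float): FloatArray {
    require(sigmaBins > 0f) { "sigmaBins must be > 0" }
    val radius = (3 * sigmaBins).toInt().coerceAtLeast(1)
    val kernel = FloatArray(2 * radius + 1) { i ->
        val d = (i - radius).toFloat()
        exp(-(d * d) / (2f * sigmaBins * sigmaBins))
    }
    return FloatArray(values.size) { i ->
        var weighted = 0f
        var weightSum = 0f
        for (k in kernel.indices) {
            val j = i + k - radius
            if (j in values.indices) {
                weighted += kernel[k] * values[j]
                weightSum += kernel[k]
            }
        }
        // Divide by the weights actually used so edge bins are not attenuated.
        weighted / weightSum
    }
}
```

Under these assumptions a flat histogram stays flat after smoothing, and normalization leaves the shape unchanged while fixing the total area at 1.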
QuantizationConfig
| Property | Type | Default |
|---|---|---|
| slopeThresholdCentsPerSec | Float | 150 |
| maxDeviationCents | Float | 50 |
| medianFilterWindowSamples | Int | 7 |
| applyMedianFilter | Boolean | true |
| minSegmentDurationMs | Int? | null |
LabellingConfig
| Property | Type | Default |
|---|---|---|
| windowSeconds | Float | 0.150 |
| hopSeconds | Float | 0.030 |
LinearFitConfig
| Property | Type | Default |
|---|---|---|
| windowSeconds | Float | 1.5 |
| breakThresholdSeconds | Float | 1.5 |
| hopSeconds | Float? | null (= window/2) |
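LabellingConfig and LinearFitConfig both describe a sliding window by its length and hop. The sketch below shows how window start times could be derived from those two values, including the null-hop fallback to half the window. The function name windowStarts is hypothetical, and dropping a trailing partial window is an assumption; the library may treat the tail differently.

```kotlin
// Hypothetical helper, not a PitchAnalysis API: start times (in seconds) of every
// full window that fits inside a contour of the given duration.
fun windowStarts(
    durationSeconds: Float,
    windowSeconds: Float,
    hopSeconds: Float? = null
): List<Float> {
    require(windowSeconds > 0f) { "windowSeconds must be > 0" }
    // A null hop falls back to half the window, as in the LinearFitConfig default.
    val hop = hopSeconds ?: (windowSeconds / 2f)
    require(hop > 0f) { "hopSeconds must be > 0" }
    val starts = mutableListOf<Float>()
    var index = 0
    while (index * hop + windowSeconds <= durationSeconds) {
        starts += index * hop
        index++
    }
    return starts
}
```

For a 3-second contour with the LinearFitConfig defaults, this yields windows starting at 0, 0.75, and 1.5 seconds.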
Result types
PitchHistogram
data class PitchHistogram(
    val binCenters: FloatArray,
    val values: FloatArray,
    val tonicHz: Float,
    val isDensity: Boolean,
    val isFolded: Boolean,
    val mode: HistogramMode
)
| Method | Description |
|---|---|
| smooth(sigma = 5f): PitchHistogram | New histogram with Gaussian-smoothed values |
| normalizeArea(): PitchHistogram | New histogram normalized so total area = 1 |
| getPeaksValleys(targetIntervalsCents, config = PeakDetectionConfig.DEFAULT): PeakData | Peak detection using the "hybrid", "slope", or "interval" method |
| computePeakStats(peakData, rawPitchCents, refIntervalsCents, config = PeakStatsConfig.DEFAULT): PeakStatsCollection | Per-peak distribution stats |
TonalSegment
data class TonalSegment(
    val startSeconds: Float,
    val endSeconds: Float,
    val label: String? = null,
    val meanCents: Float? = null,
    val slopeCentsPerSec: Float? = null,
    val interceptCents: Float? = null,
)
duration: Float (computed) returns endSeconds - startSeconds.
TonalSegment is distinct from calibra.model.Segment (which models song structure with index / lyrics / student timing).
PeakStats / PeakStatsCollection
PeakStats carries per-peak statistics: referenceInterval, peakPosition, peakAmplitude, mean, median, stdDev, variance, coeffOfVariation, skewness, kurtosis, pearsonSkew2. PeakStatsCollection is a Map<Float, PeakStats> keyed by reference interval, iterable.
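For readers unfamiliar with some of these fields, the sketch below computes a few of them for the pitch samples around one peak. It is a stand-in, not the library's PeakStats: the BasicStats class and basicStats function are hypothetical, the variance is the population form (dividing by n, which may differ from the library), and pearsonSkew2 is taken to be Pearson's second skewness coefficient, 3 · (mean − median) / stdDev.

```kotlin
import kotlin.math.sqrt

// Hypothetical stand-in for a subset of PeakStats fields.
data class BasicStats(
    val mean: Float,
    val median: Float,
    val stdDev: Float,
    val pearsonSkew2: Float
)

fun basicStats(cents: FloatArray): BasicStats {
    require(cents.isNotEmpty()) { "need at least one sample" }
    val mean = cents.average().toFloat()
    val sorted = cents.sortedArray()
    val mid = sorted.size / 2
    val median = if (sorted.size % 2 == 1) sorted[mid] else (sorted[mid - 1] + sorted[mid]) / 2f
    // Population variance (divide by n); an assumption, the library may use n - 1.
    val variance = cents.map { (it - mean) * (it - mean) }.average().toFloat()
    val stdDev = sqrt(variance)
    // Pearson's second skewness coefficient; defined as 0 here when stdDev is 0.
    val skew2 = if (stdDev > 0f) 3f * (mean - median) / stdDev else 0f
    return BasicStats(mean, median, stdDev, skew2)
}
```

A positive pearsonSkew2 means the mean sits above the median, that is, the samples around the peak lean sharp of its centre.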
PeakDetectionConfig
| Property | Default |
|---|---|
| method | "hybrid" (also "slope", "interval") |
| peakAmpThresh | 0.00005f |
| valleyThresh | 0.00003f |
| lookahead | 20 |
| avgIntervalHint | null |
| minPeakAreaFraction | 0f (relative valley-to-valley area gate; 0 = off) |
PeakStatsConfig
| Property | Default |
|---|---|
| maxPeakwidthCents | 50 |
| minPeakwidthCents | 25 |
| symmetricBounds | true |
Common Pitfalls
- tonicHz must be > 0. All methods convert Hz → cents relative to the tonic; a zero or negative tonic produces NaN.
- Clean up the contour first. Raw contours with octave errors produce misleading histograms; run them through PitchProcessing.process(contour, PitchProcessingConfig.SCORING) before analyzing.
- targetIntervalsCents is in cents, not Hz. Use MusicTheory.EQ_TEMPERED_INTERVALS_CENTS_BASE for 12-TET (returns List<Int>; convert to FloatArray).
- Histogram smoothing is on by default (smoothSigma = 5f). Use HistogramConfig.RAW to disable it.
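The tonic and target-interval pitfalls can both be guarded against before any analysis call. The helper below is a hypothetical convenience, not part of the library: it rejects a non-positive or non-finite tonic up front and converts integer cent intervals to the FloatArray the methods expect.

```kotlin
// Hypothetical helper, not a PitchAnalysis API: validates the tonic and converts
// integer cent intervals (e.g. a List<Int> of 12-TET steps) to a FloatArray.
fun prepareTargets(tonicHz: Float, intervalsCents: List<Int>): FloatArray {
    require(tonicHz.isFinite() && tonicHz > 0f) {
        "tonicHz must be a finite value > 0, got $tonicHz"
    }
    return FloatArray(intervalsCents.size) { intervalsCents[it].toFloat() }
}
```

In real code the list would come from MusicTheory.EQ_TEMPERED_INTERVALS_CENTS_BASE; a stand-in such as (0..11).map { it * 100 } is enough to try the helper in isolation, on the assumption that the constant holds one octave of 100-cent steps.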
See also