
PitchAnalysis

Histograms, tuning estimation, quantization, and melodic transcription on a PitchContour. Use after detection (PitchDetection) and optional cleanup (PitchProcessing).

For full intonation scoring (per-note deviation + 0–100 score), use Accura.analyzePitching — Accura wraps this facade.

Quick Start

Kotlin

val extractor = PitchDetection.createContourExtractor(ContourExtractorConfig.SCORING)
val contour = extractor.extract(audioSamples, 16000)
extractor.release()

val histogram = PitchAnalysis.computeHistogram(contour, tonicHz = 196f) // G3
val tuningOffset = PitchAnalysis.estimateTuningOffset(contour, refFreqHz = 196f)

val targetIntervals = MusicTheory.EQ_TEMPERED_INTERVALS_CENTS_BASE
    .map { it.toFloat() }.toFloatArray()
val segments = PitchAnalysis.labelByMeanPitch(contour, tonicHz = 196f, targetIntervals)
for (seg in segments) {
    println("${seg.label}: ${seg.startSeconds}s – ${seg.endSeconds}s")
}

Swift

let histogram = PitchAnalysis.computeHistogram(contour: contour, tonicHz: 196)
let offset = PitchAnalysis.estimateTuningOffset(contour: contour, refFreqHz: 196)

let targets = MusicTheory.eqTemperedIntervalsCentsBase.map { Float($0) }
let segments = PitchAnalysis.labelByMeanPitch(
    contour: contour, tonicHz: 196, targetIntervalsCents: targets
)

Methods

Histogram

| Method | Description |
| --- | --- |
| computeHistogram(contour, tonicHz, config = HistogramConfig.DEFAULT): PitchHistogram | Bins pitch in cents relative to tonic. Optional fold-octaves, density normalization, smoothing. |
| estimateTuningOffset(contour, refFreqHz, centTolerance: Float = 50f): Float | Aligns histogram peaks to the 12-TET grid; returns offset in cents. |
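
To make the idea of a tuning offset against the 12-TET grid concrete, here is a minimal standalone sketch. It does not reproduce the library's peak-alignment method: it swaps in a simpler technique, the circular mean of per-frame deviations from the nearest semitone, and it ignores centTolerance. The function name `estimateOffsetCents` and the convention that non-positive values mark unvoiced frames are assumptions made for illustration.

```kotlin
import kotlin.math.PI
import kotlin.math.atan2
import kotlin.math.cos
import kotlin.math.log2
import kotlin.math.pow
import kotlin.math.sin

// Hypothetical helper, not part of PitchAnalysis.
// Estimates how far the pitches sit from the 12-TET grid, in cents.
fun estimateOffsetCents(pitchesHz: FloatArray, refFreqHz: Float): Float {
    require(refFreqHz > 0f) { "refFreqHz must be > 0" }
    var sumSin = 0.0
    var sumCos = 0.0
    for (hz in pitchesHz) {
        if (hz <= 0f) continue // assumption: non-positive means unvoiced
        val cents = 1200.0 * log2(hz.toDouble() / refFreqHz)
        val angle = 2.0 * PI * cents / 100.0 // one full turn per semitone
        sumSin += sin(angle)
        sumCos += cos(angle)
    }
    // atan2 returns an angle in [-PI, PI], which maps back to [-50, 50] cents.
    return (atan2(sumSin, sumCos) * 100.0 / (2.0 * PI)).toFloat()
}

fun main() {
    // Four notes, each 20 cents sharp of a semitone above G3 (196 Hz).
    val notes = FloatArray(4) { k -> (196.0 * 2.0.pow((k * 100 + 20) / 1200.0)).toFloat() }
    println(estimateOffsetCents(notes, 196f))
}
```

The circular mean avoids the wrap-around problem a plain average has when deviations cluster near the ±50 cent boundary.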

Transcription

| Method | Description |
| --- | --- |
| quantize(contour, tonicHz, targetIntervalsCents, config = QuantizationConfig.DEFAULT): PitchContour | Snap stable frames to nearest target interval; non-stable frames become unvoiced. |
| labelByMeanPitch(contour, tonicHz, targetIntervalsCents, config = LabellingConfig.DEFAULT): List<TonalSegment> | Sliding-window mean-pitch labelling against target intervals. |
| fitLinearSegments(contour, tonicHz, config = LinearFitConfig.DEFAULT): List<TonalSegment> | Piecewise linear regression (gamaka / ornament analysis). |
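
The snapping step behind quantize can be pictured with the standalone sketch below. It covers only the "nearest target within a maximum deviation" part; the stability test (slope threshold, median filter) is left out, and octave handling is ignored. `snapToTarget` is a hypothetical helper, not a library function.

```kotlin
import kotlin.math.abs

// Hypothetical helper: returns the nearest target interval in cents,
// or null when the frame is farther than maxDeviationCents from every target.
fun snapToTarget(
    cents: Float,
    targetsCents: FloatArray,
    maxDeviationCents: Float = 50f
): Float? {
    val nearest = targetsCents.minByOrNull { abs(it - cents) } ?: return null
    return if (abs(nearest - cents) <= maxDeviationCents) nearest else null
}

fun main() {
    val targets = floatArrayOf(0f, 200f, 400f)
    println(snapToTarget(212f, targets)) // within 50 cents of the 200-cent target
    println(snapToTarget(300f, targets)) // 100 cents from the closest targets
}
```

Returning null here plays the role that "unvoiced" plays in the table above: a frame that cannot be snapped carries no quantized pitch.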

Config classes

HistogramConfig

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| numBins | Int? | null (auto) | Number of bins; null selects from range |
| density | Boolean | true | Normalize so total area = 1 |
| foldOctaves | Boolean | false | Fold all pitches into one octave |
| mode | HistogramMode | DURATION | DURATION (time-weighted) or INSTANCE_COUNT |
| smoothSigma | Float? | 5f | Gaussian smoothing sigma; null = no smoothing |

Presets: DEFAULT, FOLDED (foldOctaves=true), RAW (density=false, smoothSigma=null).
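
As a rough illustration of what foldOctaves and density mean together, the sketch below folds cents values into a single octave and normalizes the bins so the total area is 1. It is a simplified stand-in under stated assumptions, not the library's implementation: every frame counts once, the bin count is fixed, and the name `foldedDensityHistogram` is hypothetical.

```kotlin
// Hypothetical helper: octave-folded, area-normalized histogram over [0, 1200) cents.
fun foldedDensityHistogram(cents: FloatArray, numBins: Int = 12): FloatArray {
    require(numBins > 0) { "numBins must be > 0" }
    val binWidth = 1200f / numBins
    val counts = FloatArray(numBins)
    for (c in cents) {
        if (!c.isFinite()) continue // skip NaN / infinite frames
        // Kotlin's % keeps the sign of the dividend, so shift negatives back up.
        val folded = ((c % 1200f) + 1200f) % 1200f
        val bin = (folded / binWidth).toInt().coerceAtMost(numBins - 1)
        counts[bin] += 1f
    }
    val total = counts.sum()
    if (total == 0f) return counts
    // Density: sum(value * binWidth) == 1.
    return FloatArray(numBins) { counts[it] / (total * binWidth) }
}

fun main() {
    // Tonic in three octaves plus one fifth (700 cents).
    val hist = foldedDensityHistogram(floatArrayOf(0f, 1200f, -1200f, 700f))
    println(hist.joinToString())
}
```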

QuantizationConfig

| Property | Type | Default |
| --- | --- | --- |
| slopeThresholdCentsPerSec | Float | 150 |
| maxDeviationCents | Float | 50 |
| medianFilterWindowSamples | Int | 7 |
| applyMedianFilter | Boolean | true |
| minSegmentDurationMs | Int? | null |

LabellingConfig

| Property | Type | Default |
| --- | --- | --- |
| windowSeconds | Float | 0.150 |
| hopSeconds | Float | 0.030 |
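
To show how windowSeconds and hopSeconds interact, here is a minimal sketch of sliding-window mean pitch over a per-frame cents array. The frame spacing parameter `frameSeconds`, the use of NaN for unvoiced frames, and the function `windowMeans` are all assumptions for illustration; the library's actual windowing and labelling logic may differ.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper: returns (windowStartSeconds, meanCents) for each full window.
fun windowMeans(
    cents: FloatArray,
    frameSeconds: Float,
    windowSeconds: Float = 0.150f,
    hopSeconds: Float = 0.030f
): List<Pair<Float, Float>> {
    require(frameSeconds > 0f && windowSeconds > 0f && hopSeconds > 0f)
    val win = maxOf(1, (windowSeconds / frameSeconds).roundToInt())
    val hop = maxOf(1, (hopSeconds / frameSeconds).roundToInt())
    val out = mutableListOf<Pair<Float, Float>>()
    var start = 0
    while (start + win <= cents.size) {
        var sum = 0f
        var count = 0
        for (i in start until start + win) {
            if (!cents[i].isNaN()) { // assumption: NaN marks an unvoiced frame
                sum += cents[i]
                count++
            }
        }
        if (count > 0) out += (start * frameSeconds) to (sum / count)
        start += hop
    }
    return out
}

fun main() {
    // 0.3 s of a steady pitch at 200 cents, sampled every 10 ms.
    val means = windowMeans(FloatArray(30) { 200f }, frameSeconds = 0.010f)
    means.forEach { (t, m) -> println("t=$t s, mean=$m cents") }
}
```

With the defaults and a 10 ms frame spacing, each window spans 15 frames and advances by 3, so consecutive windows overlap heavily and the mean changes smoothly.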

LinearFitConfig

| Property | Type | Default |
| --- | --- | --- |
| windowSeconds | Float | 1.5 |
| breakThresholdSeconds | Float | 1.5 |
| hopSeconds | Float? | null (= window/2) |
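
Each piece of a piecewise linear fit comes down to an ordinary least-squares line through the (time, cents) points of one window. The sketch below shows that single-window step only; `LineFit` and `fitLine` are hypothetical names, and how the library chooses windows and breakpoints is not shown here.

```kotlin
// Hypothetical result type mirroring the slope/intercept fields of TonalSegment.
data class LineFit(val slopeCentsPerSec: Float, val interceptCents: Float)

// Ordinary least-squares fit of cents = slope * t + intercept.
fun fitLine(timesSec: FloatArray, cents: FloatArray): LineFit? {
    require(timesSec.size == cents.size) { "arrays must have equal length" }
    val n = timesSec.size
    if (n < 2) return null
    val meanT = timesSec.average()
    val meanC = cents.average()
    var sxx = 0.0
    var sxy = 0.0
    for (i in 0 until n) {
        val dt = timesSec[i] - meanT
        sxx += dt * dt
        sxy += dt * (cents[i] - meanC)
    }
    if (sxx == 0.0) return null // all timestamps identical, slope undefined
    val slope = sxy / sxx
    return LineFit(slope.toFloat(), (meanC - slope * meanT).toFloat())
}

fun main() {
    // A glide rising 100 cents per second, starting at 100 cents.
    val fit = fitLine(
        floatArrayOf(0f, 0.5f, 1f, 1.5f),
        floatArrayOf(100f, 150f, 200f, 250f)
    )
    println(fit)
}
```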

Result types

PitchHistogram

data class PitchHistogram(
    val binCenters: FloatArray,
    val values: FloatArray,
    val tonicHz: Float,
    val isDensity: Boolean,
    val isFolded: Boolean,
    val mode: HistogramMode
)

| Method | Description |
| --- | --- |
| smooth(sigma = 5f): PitchHistogram | New histogram with Gaussian-smoothed values |
| normalizeArea(): PitchHistogram | New histogram normalized so total area = 1 |
| getPeaksValleys(targetIntervalsCents, config = PeakDetectionConfig.DEFAULT): PeakData | Peak detection using the "hybrid", "slope", or "interval" method |
| computePeakStats(peakData, rawPitchCents, refIntervalsCents, config = PeakStatsConfig.DEFAULT): PeakStatsCollection | Per-peak distribution stats |
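
For readers unfamiliar with Gaussian smoothing of a histogram, the following standalone sketch convolves bin values with a normalized Gaussian kernel. It assumes sigma is expressed in bins and clamps at the edges; the source does not state the unit of sigma or the edge handling used by smooth, so treat both as assumptions. `gaussianSmooth` is a hypothetical name.

```kotlin
import kotlin.math.ceil
import kotlin.math.exp

// Hypothetical helper: Gaussian smoothing with sigma measured in bins.
fun gaussianSmooth(values: FloatArray, sigmaBins: Float): FloatArray {
    require(sigmaBins > 0f) { "sigmaBins must be > 0" }
    val radius = ceil(3f * sigmaBins).toInt() // kernel covers +/- 3 sigma
    val kernel = FloatArray(2 * radius + 1) { i ->
        val x = (i - radius).toFloat()
        exp(-(x * x) / (2f * sigmaBins * sigmaBins))
    }
    val kernelSum = kernel.sum()
    return FloatArray(values.size) { i ->
        var acc = 0f
        for (k in kernel.indices) {
            // Clamp indices at the edges (repeat the first/last bin).
            val j = (i + k - radius).coerceIn(0, values.size - 1)
            acc += values[j] * kernel[k]
        }
        acc / kernelSum
    }
}

fun main() {
    val spike = floatArrayOf(0f, 0f, 0f, 1f, 0f, 0f, 0f)
    println(gaussianSmooth(spike, sigmaBins = 1f).joinToString())
}
```

A folded histogram would more naturally wrap around at the octave boundary rather than clamp; that variation is omitted to keep the sketch short.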

TonalSegment

data class TonalSegment(
    val startSeconds: Float,
    val endSeconds: Float,
    val label: String? = null,            // labelByMeanPitch only
    val meanCents: Float? = null,         // labelByMeanPitch only
    val slopeCentsPerSec: Float? = null,  // fitLinearSegments only
    val interceptCents: Float? = null,    // fitLinearSegments only
)

duration: Float (computed) returns endSeconds - startSeconds.

TonalSegment is distinct from calibra.model.Segment (which models song structure with index / lyrics / student timing).

PeakStats / PeakStatsCollection

PeakStats carries per-peak statistics: referenceInterval, peakPosition, peakAmplitude, mean, median, stdDev, variance, coeffOfVariation, skewness, kurtosis, pearsonSkew2. PeakStatsCollection is a Map<Float, PeakStats> keyed by reference interval, iterable.
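
Several of these fields follow textbook definitions, which the sketch below computes for one set of cents samples around a peak. It assumes population variance (dividing by n) and Pearson's second skewness coefficient, 3 × (mean − median) / stdDev, for pearsonSkew2; whether the library uses exactly these conventions is not stated in the source. `SimpleStats` and `describe` are hypothetical names.

```kotlin
import kotlin.math.sqrt

// Hypothetical subset of the per-peak statistics.
data class SimpleStats(
    val mean: Double,
    val median: Double,
    val variance: Double,
    val stdDev: Double,
    val pearsonSkew2: Double
)

fun describe(samples: FloatArray): SimpleStats? {
    if (samples.isEmpty()) return null
    val sorted = samples.sortedArray()
    val n = sorted.size
    val mean = sorted.average()
    val median = if (n % 2 == 1) {
        sorted[n / 2].toDouble()
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
    // Population variance (divide by n); a sample variance would divide by n - 1.
    val variance = sorted.sumOf { (it - mean) * (it - mean) } / n
    val stdDev = sqrt(variance)
    // Pearson's second skewness coefficient; defined as 0 here when there is no spread.
    val skew2 = if (stdDev > 0.0) 3.0 * (mean - median) / stdDev else 0.0
    return SimpleStats(mean, median, variance, stdDev, skew2)
}

fun main() {
    println(describe(floatArrayOf(1f, 2f, 3f, 4f, 10f)))
}
```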

PeakDetectionConfig

| Property | Default |
| --- | --- |
| method | "hybrid" (also "slope", "interval") |
| peakAmpThresh | 0.00005f |
| valleyThresh | 0.00003f |
| lookahead | 20 |
| avgIntervalHint | null |
| minPeakAreaFraction | 0f (relative valley-to-valley area gate; 0 = off) |

PeakStatsConfig

| Property | Default |
| --- | --- |
| maxPeakwidthCents | 50 |
| minPeakwidthCents | 25 |
| symmetricBounds | true |

Common Pitfalls

  1. tonicHz must be > 0. All methods convert Hz → cents relative to tonic; a zero or negative tonic produces NaN.
  2. Clean up the contour first. Raw contours with octave errors produce misleading histograms; run the contour through PitchProcessing.process(contour, PitchProcessingConfig.SCORING) before analyzing.
  3. targetIntervalsCents is in cents, not Hz. Use MusicTheory.EQ_TEMPERED_INTERVALS_CENTS_BASE for 12-TET (returns List<Int>; convert to FloatArray).
  4. Histogram smoothing is on by default (smoothSigma = 5f). Use HistogramConfig.RAW to disable.
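
The first pitfall follows directly from the standard Hz-to-cents formula, cents = 1200 × log2(f / tonic). The standalone sketch below shows the conversion with an explicit guard; `hzToCents` is a hypothetical helper written for illustration, not a function from the library.

```kotlin
import kotlin.math.log2

// Hypothetical helper: frequency in Hz to cents relative to a tonic.
fun hzToCents(freqHz: Float, tonicHz: Float): Float {
    require(tonicHz > 0f) { "tonicHz must be > 0, was $tonicHz" }
    return 1200f * log2(freqHz / tonicHz)
}

fun main() {
    // One octave above G3 (196 Hz): close to 1200 cents.
    println(hzToCents(392f, 196f))
    // Without the guard, a negative tonic makes the ratio negative and log2 returns NaN.
    println(1200f * log2(196f / -1f))
}
```

Validating the tonic once at the call site is cheaper than tracing NaN values back through a histogram or a list of segments.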

See also