VADConfig

data class VADConfig(
    val backend: VADBackend = VADBackend.SPEECH,
    val sampleRate: Int = 16000,
    val threshold: Float = 0.5f,
    val minSpeechDuration: Float = 0.25f,
    val minSilenceDuration: Float = 0.25f,
    val windowSize: Int = 512,
    val numThreads: Int = 1,
    val modelPath: String? = null,
    val rmsThreshold: Float = 0.05f,
    val pitchProbThreshold: Float = 0.5f,
    val minPitch: Float = 50.0f
)

Configuration for Voice Activity Detection.

Constructors

constructor(
    backend: VADBackend = VADBackend.SPEECH,
    sampleRate: Int = 16000,
    threshold: Float = 0.5f,
    minSpeechDuration: Float = 0.25f,
    minSilenceDuration: Float = 0.25f,
    windowSize: Int = 512,
    numThreads: Int = 1,
    modelPath: String? = null,
    rmsThreshold: Float = 0.05f,
    pitchProbThreshold: Float = 0.5f,
    minPitch: Float = 50.0f
)
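
Because every constructor parameter has a default, a configuration can be built by naming only the fields that differ. The sketch below illustrates this under one loud assumption: the `VADBackend` and `VADConfig` declarations are local stand-ins copied from the signature on this page so that the snippet compiles on its own; in a real project they would be imported from the library, whose package name is not shown here. The helper names `defaultConfig`, `strictConfig`, and `generalVariant` are hypothetical.

```kotlin
// Stand-in declarations mirroring the signature documented on this page, so
// the sketch compiles on its own. In a real project, import the library's
// VADBackend and VADConfig instead (their package is not shown here).
enum class VADBackend { SPEECH, GENERAL, SINGING_REALTIME }

data class VADConfig(
    val backend: VADBackend = VADBackend.SPEECH,
    val sampleRate: Int = 16000,
    val threshold: Float = 0.5f,
    val minSpeechDuration: Float = 0.25f,
    val minSilenceDuration: Float = 0.25f,
    val windowSize: Int = 512,
    val numThreads: Int = 1,
    val modelPath: String? = null,
    val rmsThreshold: Float = 0.05f,
    val pitchProbThreshold: Float = 0.5f,
    val minPitch: Float = 50.0f
)

// All defaults: SPEECH backend, 16 kHz, bundled model (modelPath = null).
fun defaultConfig(): VADConfig = VADConfig()

// Named arguments override only the fields that differ from the defaults.
fun strictConfig(): VADConfig =
    VADConfig(threshold = 0.7f, minSilenceDuration = 0.5f)

// copy() is generated for every data class and leaves the original untouched.
fun generalVariant(base: VADConfig): VADConfig =
    base.copy(backend = VADBackend.GENERAL, rmsThreshold = 0.1f)

fun main() {
    val strict = strictConfig()
    println(strict)
    println(generalVariant(strict))
}
```

Named arguments keep call sites readable when a class has this many same-typed `Float` parameters, where positional arguments would be easy to swap by accident.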

Types

class Builder

Builder for VADConfig.

object Companion

Properties

val backend: VADBackend

VAD engine to use (SPEECH, GENERAL, or SINGING_REALTIME)

val minPitch: Float

Minimum pitch in Hz for GENERAL backend (default: 50)

val minSilenceDuration: Float

Minimum silence duration in seconds (default: 0.25)

val minSpeechDuration: Float

Minimum speech duration in seconds (default: 0.25)

val modelPath: String?

Custom model path, or null to use the bundled model (default: null)

val numThreads: Int

Number of inference threads (default: 1)

val pitchProbThreshold: Float

Pitch probability threshold for GENERAL backend (default: 0.5)

val rmsThreshold: Float

RMS threshold for GENERAL backend (default: 0.05)

val sampleRate: Int

Audio sample rate in Hz (default: 16000)

val threshold: Float

Detection threshold in the range 0.0 to 1.0 (default: 0.5)

val windowSize: Int

Processing window size in samples (default: 512, which is 32 ms at 16 kHz)
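
The relationship between the window size, the sample rate, and the duration settings can be checked with a little arithmetic. The sketch below is illustrative only: `windowDurationMs` and `windowsFor` are hypothetical helpers, not part of the library, and rounding up to whole windows is an assumption, since this page does not say how the library converts durations to windows.

```kotlin
import kotlin.math.ceil

// Duration of one processing window in milliseconds.
fun windowDurationMs(windowSize: Int, sampleRate: Int): Double =
    windowSize * 1000.0 / sampleRate

// Number of whole windows needed to cover a duration given in seconds.
// Assumption: round up; the library's own rounding is not documented here.
fun windowsFor(durationSeconds: Double, windowSize: Int, sampleRate: Int): Int =
    ceil(durationSeconds * sampleRate / windowSize).toInt()

fun main() {
    println(windowDurationMs(512, 16000))  // prints 32.0
    println(windowsFor(0.25, 512, 16000))  // prints 8
}
```

With the defaults, one window covers 32 ms, so the 0.25 s minimum speech and silence durations each span roughly eight windows.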