VADConfig
data class VADConfig(
    val backend: VADBackend = VADBackend.SPEECH,
    val sampleRate: Int = 16000,
    val threshold: Float = 0.5f,
    val minSpeechDuration: Float = 0.25f,
    val minSilenceDuration: Float = 0.25f,
    val windowSize: Int = 512,
    val numThreads: Int = 1,
    val modelPath: String? = null,
    val rmsThreshold: Float = 0.05f,
    val pitchProbThreshold: Float = 0.5f,
    val minPitch: Float = 50.0f
)
Configuration for Voice Activity Detection.
Constructors
constructor(
    backend: VADBackend = VADBackend.SPEECH,
    sampleRate: Int = 16000,
    threshold: Float = 0.5f,
    minSpeechDuration: Float = 0.25f,
    minSilenceDuration: Float = 0.25f,
    windowSize: Int = 512,
    numThreads: Int = 1,
    modelPath: String? = null,
    rmsThreshold: Float = 0.05f,
    pitchProbThreshold: Float = 0.5f,
    minPitch: Float = 50.0f
)
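Because every constructor parameter has a default, a configuration can be built by naming only the values that differ. The sketch below illustrates this with named arguments and `copy()`. The `VADBackend` and `VADConfig` declarations are local stand-ins reconstructed from the signature above so that the snippet compiles on its own; in a real project they would be imported from the library, and the specific override values shown are arbitrary examples, not recommendations.

```kotlin
// Stand-in declarations mirroring the documented signature.
// Assumption: in real code these come from the library, not from this file.
enum class VADBackend { SPEECH, GENERAL, SINGING_REALTIME }

data class VADConfig(
    val backend: VADBackend = VADBackend.SPEECH,
    val sampleRate: Int = 16000,
    val threshold: Float = 0.5f,
    val minSpeechDuration: Float = 0.25f,
    val minSilenceDuration: Float = 0.25f,
    val windowSize: Int = 512,
    val numThreads: Int = 1,
    val modelPath: String? = null,
    val rmsThreshold: Float = 0.05f,
    val pitchProbThreshold: Float = 0.5f,
    val minPitch: Float = 50.0f
)

fun main() {
    // All defaults: SPEECH backend, 16 kHz, 512-sample windows.
    val defaults = VADConfig()

    // Override only the parameters documented for the GENERAL backend.
    val general = VADConfig(
        backend = VADBackend.GENERAL,
        rmsThreshold = 0.08f,        // illustrative value
        pitchProbThreshold = 0.6f    // illustrative value
    )

    // data classes provide copy(), so a variant can be derived from an existing config.
    val stricter = defaults.copy(threshold = 0.7f, minSilenceDuration = 0.5f)

    println(defaults.windowSize)
    println(general.backend)
    println(stricter.threshold)
}
```

Named arguments keep call sites readable when a constructor has this many parameters of the same type, and `copy()` avoids restating values that stay the same.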
Types
Properties
val backend: VADBackend
VAD engine to use (SPEECH, GENERAL, or SINGING_REALTIME)
val minSilenceDuration: Float
Minimum silence duration in seconds (default: 0.25)
val minSpeechDuration: Float
Minimum speech duration in seconds (default: 0.25)
val numThreads: Int
Number of inference threads (default: 1)
val pitchProbThreshold: Float
Pitch probability threshold for the GENERAL backend (default: 0.5)
val rmsThreshold: Float
RMS threshold for the GENERAL backend (default: 0.05)
val sampleRate: Int
Audio sample rate in Hz (default: 16000)
val windowSize: Int
Processing window size in samples (default: 512, which is 32 ms at 16 kHz)
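The relationship between window size, sample rate, and the duration settings can be checked with a little arithmetic. The helpers below are hypothetical and not part of the library: `windowDurationMs` converts a window size to milliseconds, and `windowsFor` estimates how many windows a duration in seconds spans. Rounding up is an assumption made here; the page does not say how the library rounds.

```kotlin
import kotlin.math.ceil

// Hypothetical helper: duration of one processing window in milliseconds.
fun windowDurationMs(windowSize: Int, sampleRate: Int): Double =
    windowSize * 1000.0 / sampleRate

// Hypothetical helper: number of windows needed to cover a duration,
// rounding up (an assumption; the library's rounding is not documented here).
fun windowsFor(durationSeconds: Float, windowSize: Int, sampleRate: Int): Int =
    ceil(durationSeconds.toDouble() * sampleRate / windowSize).toInt()

fun main() {
    // Defaults from the documented signature: 512 samples at 16000 Hz.
    println(windowDurationMs(512, 16000))   // 32.0
    // Default minSpeechDuration / minSilenceDuration of 0.25 s:
    // 4000 samples / 512 samples per window = 7.8125, rounded up.
    println(windowsFor(0.25f, 512, 16000))  // 8
}
```

With the defaults, the 0.25 s minimum durations therefore correspond to roughly eight consecutive 32 ms windows.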