EUROSPEECH 2001 Scandinavia
This paper addresses robust voice activity detection (VAD) in noisy environments, particularly in cars. We compare several frame-based VAD feature extraction algorithms in combination with different classifiers. Experiments are carried out under identical test conditions using clean speech, clean speech with added car noise, and speech recorded in car environments. The lowest error rate is achieved with a perceptron classifier applied to features based on a likelihood ratio test that assumes normally distributed speech and noise. We propose modifications of this algorithm that reduce the frame error rate by approximately 30% relative to the original algorithm in our experiments.
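To make the combination of a Gaussian likelihood ratio feature and a perceptron concrete, the following is a minimal sketch and not the authors' implementation; the abstract gives no formulas. It assumes the common formulation in which each DFT coefficient is zero-mean complex Gaussian under both hypotheses, a maximum-likelihood estimate of the a priori SNR, a single scalar feature per frame, and a synthetic tone in white noise as a stand-in for speech in car noise. All function names and parameter values are hypothetical.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Hann-windowed power spectrum via a naive DFT (adequate for short frames)."""
    n = len(frame)
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2
            for k in range(n // 2 + 1)]


def log_likelihood_ratio(frame, noise_power):
    """Mean per-bin log likelihood ratio of 'speech + noise' vs. 'noise only'."""
    power = power_spectrum(frame)
    total = 0.0
    for p, lam in zip(power, noise_power):
        gamma = p / max(lam, 1e-12)   # a posteriori SNR of this bin
        xi = max(gamma - 1.0, 0.0)    # ML a priori SNR estimate (assumption)
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total / len(power)


def train_perceptron(features, labels, epochs=500, rate=0.1):
    """Classic perceptron rule on one scalar feature; labels are 0 or 1."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(features, labels):
            error = y - (1 if w * x + b > 0 else 0)
            if error:
                w += rate * error * x
                b += rate * error
                mistakes += 1
        if mistakes == 0:             # every training frame classified correctly
            break
    return w, b


# Synthetic demonstration: white noise frames with and without an added tone.
random.seed(0)
FRAME_LEN, NUM_FRAMES = 64, 50


def noise_frame():
    return [random.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]


tone = [3.0 * math.sin(2.0 * math.pi * 8 * i / FRAME_LEN) for i in range(FRAME_LEN)]

# Noise power per bin, averaged over frames known to contain only noise.
noise_spectra = [power_spectrum(noise_frame()) for _ in range(NUM_FRAMES)]
noise_power = [sum(column) / NUM_FRAMES for column in zip(*noise_spectra)]

noise_feats = [log_likelihood_ratio(noise_frame(), noise_power)
               for _ in range(NUM_FRAMES)]
speech_feats = [log_likelihood_ratio([a + s for a, s in zip(noise_frame(), tone)],
                                     noise_power)
                for _ in range(NUM_FRAMES)]

features = noise_feats + speech_feats
labels = [0] * NUM_FRAMES + [1] * NUM_FRAMES
w, b = train_perceptron(features, labels)
accuracy = sum((w * x + b > 0) == bool(y)
               for x, y in zip(features, labels)) / len(labels)
print(f"training frame accuracy: {accuracy:.2f}")
```

With a single scalar feature the perceptron reduces to a learned threshold on the likelihood ratio. A real system would use recorded speech, a tracked noise estimate, and possibly several features per frame.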
Bibliographic reference. Stadermann, J. / Stahl, V. / Rose, G. (2001): "Voice activity detection in noisy environments", In EUROSPEECH-2001, 1851-1854.