8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Improved Voice Activity Detection Combining Noise Reduction and Subband Divergence Measures

Javier Ramirez, José Carlos Segura, Carmen Benitez, Angel de la Torre, Antonio Rubio

Universidad de Granada, Spain

New trends in wireless communications demand reliable human-machine interaction in real-life environments. However, several obstacles prevent automatic speech recognition (ASR) systems from working well in noisy environments. The main difficulty is the degradation ASR systems suffer when training and test conditions are mismatched. This paper presents an improved voice activity detector (VAD) that combines noise reduction and subband divergence estimation to improve the reliability of speech recognizers operating in noisy environments. The algorithm formulates the decision rule by measuring the divergence between the subband spectral magnitudes of speech and noise, using the Kullback-Leibler (KL) distance on the denoised signal. Experiments demonstrate a sustained advantage over other VAD methods, including the standard G.729 and AMR VADs, which are used as a reference, recently reported algorithms, and the VADs of the advanced front-end (AFE) for distributed speech recognition (DSR).
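As an illustration only, the decision rule described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the four equal-width subbands, the normalisation of subband magnitudes to a discrete distribution, the symmetric form of the KL divergence, and the threshold value are all hypothetical choices, and the noise reduction stage is omitted entirely.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitude for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def subband_profile(frame, num_bands=4, floor=1e-10):
    """Average the magnitude spectrum into equal-width subbands (assumed layout)
    and normalise so the values sum to 1. Leftover top bins are ignored."""
    mag = magnitude_spectrum(frame)
    width = len(mag) // num_bands
    bands = [sum(mag[b * width:(b + 1) * width]) / width + floor
             for b in range(num_bands)]
    total = sum(bands)
    return [e / total for e in bands]


def symmetric_kl(p, q):
    """Symmetric Kullback-Leibler divergence between two discrete distributions."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def is_speech(frame, noise_profile, threshold=0.1):
    """Flag a frame as speech when its subband profile diverges from the noise
    profile by more than a threshold (hypothetical value, not from the paper)."""
    profile = subband_profile(frame, len(noise_profile))
    return symmetric_kl(profile, noise_profile) > threshold


# Usage: a flat noise profile, one tonal frame and one spectrally flat frame.
flat_noise = [0.25, 0.25, 0.25, 0.25]
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
print(is_speech(tone, flat_noise), is_speech(impulse, flat_noise))
```

Because the profile is normalised, this sketch reacts only to changes in spectral shape, not overall level; a practical detector would also need a noise profile estimated from the signal and some smoothing of the decision over neighbouring frames.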


Bibliographic reference. Ramirez, Javier / Segura, José Carlos / Benitez, Carmen / Torre, Angel de la / Rubio, Antonio (2004): "Improved voice activity detection combining noise reduction and subband divergence measures", In INTERSPEECH-2004, 961-964.