INTERSPEECH 2004 - ICSLP
New trends in wireless communications demand reliable human-machine interaction in real-life environments. However, several obstacles prevent automatic speech recognition (ASR) systems from working well in noisy environments. The main difficulty is the performance degradation that ASR systems suffer when training and test conditions are mismatched. This paper presents an improved voice activity detector (VAD) that combines noise reduction and subband divergence estimation to improve the reliability of speech recognizers operating in noisy environments. The algorithm formulates the decision rule by measuring the divergence between the subband spectral magnitudes of speech and noise, using the Kullback-Leibler (KL) distance on the denoised signal. Experiments demonstrate a sustained advantage over other VAD methods, including the standard G.729 and AMR VADs used as a reference, recently reported algorithms, and the VADs of the advanced front-end (AFE) for distributed speech recognition (DSR).
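The decision rule described above can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes that the subband magnitudes are normalised to sum to one and compared as discrete distributions with a symmetrised KL distance, and that the frame is already denoised (the noise reduction stage is not shown). The function names, the number of subbands, and the threshold value are all hypothetical.

```python
import math


def subband_magnitudes(spectrum, n_bands):
    """Average a magnitude spectrum into n_bands contiguous subbands.

    Assumes len(spectrum) >= n_bands; trailing bins that do not fill
    a whole subband are ignored.
    """
    size = len(spectrum) // n_bands
    return [sum(spectrum[k * size:(k + 1) * size]) / size
            for k in range(n_bands)]


def normalise(values, eps=1e-12):
    """Scale non-negative values to sum to one; eps avoids log(0)."""
    total = sum(values) + eps * len(values)
    return [(v + eps) / total for v in values]


def kl_distance(p, q):
    """Symmetrised KL distance between two non-negative vectors."""
    p, q = normalise(p), normalise(q)
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def is_speech(frame_spectrum, noise_spectrum, n_bands=4, threshold=0.1):
    """Flag a frame as speech when its subband profile diverges from
    the noise profile by more than a (hypothetical) threshold."""
    frame = subband_magnitudes(frame_spectrum, n_bands)
    noise = subband_magnitudes(noise_spectrum, n_bands)
    return kl_distance(frame, noise) > threshold


# Illustrative inputs only: a flat noise estimate and a frame with
# extra energy concentrated in the lowest subband.
noise_estimate = [1.0] * 16
speech_like = [10.0] * 4 + [1.0] * 12
print(is_speech(speech_like, noise_estimate))
print(is_speech(noise_estimate, noise_estimate))
```

Because of the normalisation assumed here, the distance responds to changes in spectral shape and not to overall level, so a frame that is simply a louder copy of the noise estimate yields a distance near zero.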
Bibliographic reference. Ramirez, Javier / Segura, José Carlos / Benitez, Carmen / Torre, Angel de la / Rubio, Antonio (2004): "Improved voice activity detection combining noise reduction and subband divergence measures", In INTERSPEECH-2004, 961-964.