11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Bayesian Approach to Voice Activity Detection Using Multiple Statistical Models and Discriminative Training

Tao Yu, John H. L. Hansen

University of Texas at Dallas, USA

In this study, the problem of voice activity detection (VAD) is formulated in a Bayesian hypothesis testing framework. Unlike traditional VAD schemes that employ a single statistical model, multiple models are assumed to be potentially engaged, each with an a priori probability, owing to the statistical diversity of the environmental noise degrading the speech. Moreover, the optimal a priori probabilities are explored using a discriminative-training-based method, which is proposed to directly reduce the miss-hit rate and false-alarm rate of the VAD. As shown in the evaluations, the proposed Bayesian method significantly improves VAD performance, both in absolute terms and in consistency across a diverse set of noise conditions.
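The idea of weighting several statistical models by a priori probabilities inside one hypothesis test can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar zero-mean feature, the choice of Gaussian and Laplacian densities, the additive-variance speech model, the function names, and the hand-set priors (the paper trains the priors discriminatively, which this sketch does not attempt).

```python
import math


def gaussian_pdf(x, var):
    """Zero-mean Gaussian density with variance `var`."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def laplacian_pdf(x, var):
    """Zero-mean Laplacian density with variance `var` (scale b = sqrt(var / 2))."""
    b = math.sqrt(var / 2.0)
    return math.exp(-abs(x) / b) / (2.0 * b)


def combined_likelihood_ratio(x, models, priors, noise_var, speech_var):
    """Likelihood ratio of H1 (speech + noise) to H0 (noise only).

    Each hypothesis likelihood is a prior-weighted sum over the candidate
    models. Assumption: under H1 the feature variance is the sum of the
    noise and speech variances.
    """
    h1 = sum(w * pdf(x, noise_var + speech_var) for w, pdf in zip(priors, models))
    h0 = sum(w * pdf(x, noise_var) for w, pdf in zip(priors, models))
    return h1 / h0


def is_speech(x, models, priors, noise_var, speech_var, threshold=1.0):
    """Declare speech when the combined likelihood ratio exceeds the threshold."""
    return combined_likelihood_ratio(x, models, priors, noise_var, speech_var) > threshold


# Hypothetical setup: two candidate models with hand-picked priors.
models = [gaussian_pdf, laplacian_pdf]
priors = [0.6, 0.4]

low_energy = is_speech(0.1, models, priors, noise_var=1.0, speech_var=4.0)
high_energy = is_speech(4.0, models, priors, noise_var=1.0, speech_var=4.0)
```

Setting one prior to 1 and the rest to 0 recovers a conventional single-model likelihood ratio test, which is the baseline the abstract contrasts against. A discriminative training step would adjust `priors` to lower the miss and false-alarm counts on labeled data.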


Bibliographic reference. Yu, Tao / Hansen, John H. L. (2010): "A Bayesian approach to voice activity detection using multiple statistical models and discriminative training", in INTERSPEECH-2010, 3114-3117.