In this study, the problem of voice activity detection (VAD) is formulated in a Bayesian hypothesis testing framework. Unlike traditional VAD schemes that employ a single statistical model, multiple models are assumed to be potentially engaged, each with an a priori probability, because of the statistical diversity of the environmental noise degrading the speech. Moreover, the optimal a priori probabilities are explored using a discriminative-training-based method, which is intended to directly reduce the miss-hit rate and false-alarm rate of the VAD. As shown in the evaluations, the proposed Bayesian method significantly improves VAD performance, both in absolute terms and in consistency across a diverse set of noise conditions.
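One way to read the multiple-model formulation is as a likelihood ratio test in which the likelihood under each hypothesis (speech present, speech absent) is averaged over the candidate statistical models, weighted by their a priori probabilities. The sketch below illustrates that reading only and is not the authors' implementation. The function names, the use of one shared prior vector for both hypotheses, and the decision threshold are all assumptions made for illustration. The discriminative training of the priors is not shown.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def averaged_llr(loglik_speech, loglik_noise, priors):
    """Log-likelihood ratio after averaging over candidate models.

    loglik_speech[k]: log-likelihood of the frame under model k, speech present.
    loglik_noise[k]:  log-likelihood of the frame under model k, speech absent.
    priors[k]:        a priori probability of model k (assumed shared by both
                      hypotheses, strictly positive, summing to 1).
    """
    if not (len(loglik_speech) == len(loglik_noise) == len(priors)):
        raise ValueError("all inputs must have one entry per model")
    if any(p <= 0 for p in priors) or abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must be positive and sum to 1")

    log_priors = [math.log(p) for p in priors]
    # log sum_k prior_k * p(frame | model k, hypothesis)
    log_num = logsumexp([lp + l for lp, l in zip(log_priors, loglik_speech)])
    log_den = logsumexp([lp + l for lp, l in zip(log_priors, loglik_noise)])
    return log_num - log_den


def is_speech(loglik_speech, loglik_noise, priors, threshold=0.0):
    """Declare speech when the averaged log-likelihood ratio exceeds threshold."""
    return averaged_llr(loglik_speech, loglik_noise, priors) > threshold
```

With a single model and a prior of 1, this reduces to the ordinary single-model likelihood ratio test. Raising the hypothetical `threshold` trades fewer false alarms for more missed speech frames.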
Bibliographic reference. Yu, Tao / Hansen, John H. L. (2010): "A Bayesian approach to voice activity detection using multiple statistical models and discriminative training", In INTERSPEECH-2010, 3114-3117.