8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Evaluation of Real-Time Voice Activity Detection Based on High Order Statistics

David Cournapeau, Tatsuya Kawahara

Kyoto University, Japan

We have proposed a method for real-time, unsupervised voice activity detection (VAD). This paper addresses the problems of feature selection and classification scheme. The feature is based on High Order Statistics (HOS) to discriminate between close-talk and far-field speech, and is enhanced by a feature derived from the normalized autocorrelation. The comparative effectiveness of several HOS is shown. Classification is performed in real time with a recursive, online EM algorithm. The algorithm is evaluated on the CENSREC-1-C database, which is used for VAD evaluation for automatic speech recognition (ASR) [1], and the proposed method is confirmed to significantly outperform the baseline energy-based method.
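
To make the ingredients named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not say which HOS is used or on which signal it is computed, so the choice of excess kurtosis on raw frame samples is an assumption, as are the lag range, the two-component Gaussian model, the step size, and every function, class, and parameter name below. The online EM shown is a generic stochastic-approximation form that updates running sufficient statistics one sample at a time.

```python
import math


def excess_kurtosis(frame):
    """Fourth-order statistic m4 / m2**2 - 3; close to 0 for Gaussian noise."""
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]
    m2 = sum(c * c for c in centered) / n
    if m2 == 0.0:
        return 0.0  # constant frame: kurtosis is undefined, report 0
    m4 = sum(c ** 4 for c in centered) / n
    return m4 / (m2 * m2) - 3.0


def max_normalized_autocorr(frame, min_lag, max_lag):
    """Largest autocorrelation over [min_lag, max_lag], divided by frame energy.

    Returns -1.0 if the lag range is empty for this frame length.
    """
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]
    energy = sum(c * c for c in centered)
    if energy == 0.0:
        return 0.0
    best = -1.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        acc = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        best = max(best, acc / energy)
    return best


class OnlineTwoGaussianEM:
    """Hypothetical two-class (e.g. noise / speech) 1-D Gaussian mixture
    fitted recursively, one feature value at a time."""

    def __init__(self, means, variances, var_floor=1e-3):
        self.weights = [0.5, 0.5]
        self.means = list(means)
        self.variances = list(variances)
        self.var_floor = var_floor
        # Running sufficient statistics: E[z], E[z*x], E[z*x^2] per component.
        self.s0 = list(self.weights)
        self.s1 = [w * m for w, m in zip(self.weights, self.means)]
        self.s2 = [w * (v + m * m) for w, m, v
                   in zip(self.weights, self.means, self.variances)]

    def posteriors(self, x):
        """E-step for one sample, computed in the log domain for stability."""
        logs = [math.log(w) - 0.5 * math.log(2.0 * math.pi * v)
                - 0.5 * (x - m) ** 2 / v
                for w, m, v in zip(self.weights, self.means, self.variances)]
        top = max(logs)
        unnorm = [math.exp(value - top) for value in logs]
        total = sum(unnorm)
        return [u / total for u in unnorm]

    def update(self, x, step):
        """Blend the new sample into the statistics, then re-estimate parameters."""
        resp = self.posteriors(x)
        for k in range(2):
            self.s0[k] += step * (resp[k] - self.s0[k])
            self.s1[k] += step * (resp[k] * x - self.s1[k])
            self.s2[k] += step * (resp[k] * x * x - self.s2[k])
            self.weights[k] = self.s0[k]
            self.means[k] = self.s1[k] / self.s0[k]
            self.variances[k] = max(
                self.s2[k] / self.s0[k] - self.means[k] ** 2, self.var_floor)
        return resp
```

In such a sketch, each incoming frame would be reduced to a scalar feature (for example, a combination of the two statistics above), passed to `update`, and labeled as speech or non-speech according to the larger posterior. Because each update touches only a handful of running sums, the cost per frame is constant, which is what makes a recursive scheme of this kind suitable for real-time use.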


Bibliographic reference. Cournapeau, David / Kawahara, Tatsuya (2007): "Evaluation of real-time voice activity detection based on high order statistics", in INTERSPEECH-2007, 2945-2948.