We have proposed a method for real-time, unsupervised voice activity detection (VAD). This paper addresses the problems of feature selection and the classification scheme. The main feature is based on high-order statistics (HOS), which discriminate between close-talk and far-field speech, and it is enhanced by a second feature derived from the normalized autocorrelation. The effectiveness of several HOS is compared. Classification is performed in real time with a recursive, online EM algorithm. The algorithm is evaluated on the CENSREC-1-C database, which is used for evaluating VAD for automatic speech recognition (ASR), and the proposed method is confirmed to significantly outperform the baseline energy-based method.
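The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of excess kurtosis as the HOS, the two-Gaussian mixture model, the step size, and all function and class names (`excess_kurtosis`, `autocorr_peak`, `OnlineTwoGaussianEM`) are assumptions made for illustration only.

```python
import math


def excess_kurtosis(frame):
    """Fourth-order statistic of one frame; close to 0 for Gaussian noise."""
    n = len(frame)
    mean = sum(frame) / n
    centered = [x - mean for x in frame]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 / (m2 * m2) - 3.0 if m2 > 0 else 0.0


def autocorr_peak(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over an assumed lag range."""
    n = len(frame)
    mean = sum(frame) / n
    c = [x - mean for x in frame]
    r0 = sum(v * v for v in c)
    if r0 == 0:
        return 0.0
    return max(
        sum(c[i] * c[i + lag] for i in range(n - lag)) / r0
        for lag in range(min_lag, min(max_lag, n - 1) + 1)
    )


class OnlineTwoGaussianEM:
    """Recursive EM for a 1-D mixture of two Gaussians (assumed model).

    Sufficient statistics are updated with a fixed step size after every
    sample, so no past feature values need to be stored.
    """

    def __init__(self, means, var=1.0, step=0.05):
        self.step = step
        self.w = [0.5, 0.5]
        self.mu = list(means)
        self.var = [var, var]
        # Running statistics: E[p], E[p*x], E[p*x^2] per component.
        self.s0 = [0.5, 0.5]
        self.s1 = [0.5 * m for m in means]
        self.s2 = [0.5 * (var + m * m) for m in means]

    def update(self, x):
        """Process one feature value and return the component posteriors."""
        # E-step: posterior of each component under the current parameters.
        lik = [
            self.w[k]
            * math.exp(-0.5 * (x - self.mu[k]) ** 2 / self.var[k])
            / math.sqrt(2.0 * math.pi * self.var[k])
            for k in range(2)
        ]
        total = sum(lik)
        post = [l / total for l in lik] if total > 0 else [0.5, 0.5]
        # Recursive M-step: blend new statistics in, then refresh parameters.
        for k in range(2):
            self.s0[k] += self.step * (post[k] - self.s0[k])
            self.s1[k] += self.step * (post[k] * x - self.s1[k])
            self.s2[k] += self.step * (post[k] * x * x - self.s2[k])
            self.w[k] = self.s0[k]
            self.mu[k] = self.s1[k] / self.s0[k]
            self.var[k] = max(self.s2[k] / self.s0[k] - self.mu[k] ** 2, 1e-6)
        return post
```

In such a sketch, each incoming frame would be reduced to a scalar feature (for example, the kurtosis weighted by the autocorrelation peak), passed to `update`, and labeled as speech when the posterior of the higher-mean component exceeds 0.5. How the two features are actually combined in the paper is not stated in the abstract.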
Bibliographic reference. Cournapeau, David / Kawahara, Tatsuya (2007): "Evaluation of real-time voice activity detection based on high order statistics", In INTERSPEECH-2007, 2945-2948.