Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Voice Activity Detector Based on Enhanced Cumulant of LPC Residual and On-Line EM Algorithm

David Cournapeau (1), Tatsuya Kawahara (1), Kenji Mase (2), Tomoji Toriyama (3)

(1) Kyoto University, Japan; (2) Nagoya University, Japan; (3) ATR-MIS, Japan

This paper addresses the problem of segmenting audio data recorded with embedded devices for intelligent sensing in multi-modal interactions. We propose a real-time method for robust speech detection in natural, noisy environments. It is based on a fusion of high-order statistics of the LPC residual with the autocorrelation, and uses an on-line version of the Expectation-Maximization (EM) algorithm for classification. Experimental evaluations show that the proposed method provides better detection performance under different types of natural noise, and that it works robustly against other voices in multi-speaker interactive situations. Because the proposed method relies on features with a low computational cost and has low latency, it is suitable for real-time tracking applications.
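
As a rough illustration of the kind of feature the abstract refers to, the following sketch computes the excess kurtosis (a normalized fourth-order cumulant) of the LPC residual of one audio frame. It is not the authors' enhanced cumulant, whose definition the abstract does not give, and it leaves out the autocorrelation fusion and the on-line EM classifier. The function names, the prediction order of 10, and the diagonal loading term are assumptions made for this example only.

```python
import numpy as np

def lpc_residual(frame, order=10):
    """Linear-prediction residual of one frame (autocorrelation method).

    Hypothetical helper for illustration; order=10 is an assumed default.
    """
    x = np.asarray(frame, dtype=float)
    x = x - np.mean(x)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with a small diagonal
    # loading (an assumption) so that silent frames stay solvable.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k} applied to the frame.
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse_filter)[:n]

def excess_kurtosis(e):
    """Fourth-order cumulant normalized by the squared variance."""
    e = np.asarray(e, dtype=float)
    e = e - np.mean(e)
    var = np.mean(e ** 2)
    if var == 0.0:
        return 0.0  # constant input: report no peakedness
    return np.mean(e ** 4) / var ** 2 - 3.0
```

For Gaussian-like noise the residual kurtosis stays near zero, while a pulse-excited signal such as voiced speech tends to give a strongly positive value, which is what makes statistics of this kind usable for speech detection.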

Bibliographic reference. Cournapeau, David / Kawahara, Tatsuya / Mase, Kenji / Toriyama, Tomoji (2006): "Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm", in INTERSPEECH-2006, paper 1375-Tue3A1O.1.