EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Use of a CSP-Based Voice Activity Detector for Distant-Talking ASR

Luca Armani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer

ITC-irst, Italy

This paper addresses the problem of voice activity detection for distant-talking speech recognition in noisy and reverberant environments. The proposed algorithm is based on the same Cross-power Spectrum Phase (CSP) analysis that is used for talker location and tracking. A normalized feature is derived from it and is shown to be more effective than an energy-based one. The algorithm exploits this feature by dynamically updating the threshold as a non-linear average computed during the preceding pause. On a real multichannel database, recorded with the speaker 2.5 meters from the microphones, experiments show that the proposed algorithm provides a substantial relative reduction in error rate.
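As a rough illustration of the kind of analysis the abstract refers to, the sketch below computes a generic CSP function (the inverse transform of the phase-normalized cross-power spectrum of two microphone frames) and returns its peak within a lag window. This is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `csp_peak`, the parameters `max_lag` and `eps`, and the use of the peak value as the normalized feature are all hypothetical, and the threshold-update rule described in the abstract is not reproduced here.

```python
import numpy as np

def csp_peak(x1, x2, max_lag, eps=1e-12):
    """Peak of the CSP function between two frames (hypothetical helper).

    Returns (peak_value, lag_in_samples). The peak is close to 1 when one
    frame is a clean delayed copy of the other, and small for unrelated
    signals, regardless of the signal energy.
    """
    n = len(x1) + len(x2)              # zero-pad to limit circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)           # cross-power spectrum
    # Keep only the phase: divide each bin by its magnitude.
    csp = np.fft.irfft(cross / (np.abs(cross) + eps), n=n)
    # Gather lags -max_lag .. +max_lag (negative lags sit at the end of csp).
    window = np.concatenate((csp[-max_lag:], csp[:max_lag + 1]))
    # With this convention the lag is negative when x2 is a delayed copy of x1.
    return float(window.max()), int(window.argmax()) - max_lag
```

Because only the phase of the cross-power spectrum is retained, the peak value does not scale with input level, which is one plausible reading of "normalized feature" in contrast with an energy-based one.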


Bibliographic reference.  Armani, Luca / Matassoni, Marco / Omologo, Maurizio / Svaizer, Piergiorgio (2003): "Use of a CSP-based voice activity detector for distant-talking ASR", In EUROSPEECH-2003, 501-504.