INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Voice Activity Detection in MTF-Based Power Envelope Restoration

Masashi Unoki (1), Xugang Lu (2), Rico Petrick (3), Shota Morita (1), Masato Akagi (1), Rüdiger Hoffmann (3)

(1) JAIST, Japan
(2) NICT, Japan
(3) Technische Universität Dresden, Germany

This paper reports comparative evaluations of voice activity detection (VAD) methods in reverberant environments. Both conventional methods and the standard G.729 method are considered. In general, these methods work well under clean conditions, but their performance degrades drastically under reverberation. Preliminary comparative evaluations showed that reverberation significantly increases the false acceptance rate (FAR), while the false rejection rate (FRR) increases only moderately. We therefore developed a method that uses MTF-based power envelope restoration to improve the robustness of VAD in reverberant environments. This restoration method can blindly restore the power envelope of reverberant speech based on the MTF concept. The proposed method consists of an MTF-based restoration stage as the front end and a conventional VAD method for the final decision. Experimental results demonstrated that the proposed method is superior to conventional methods in both robustness and accuracy (reducing both FAR and FRR) in reverberant environments.
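The two error rates named in the abstract can be illustrated with a minimal sketch. It assumes the usual frame-level definitions, which the abstract itself does not spell out: FAR is the fraction of non-speech frames labeled as speech, and FRR is the fraction of speech frames labeled as non-speech. The function name `far_frr` and the label sequences are hypothetical and are not taken from the paper.

```python
def far_frr(reference, decision):
    """Return (FAR, FRR) for frame-level VAD labels (1 = speech, 0 = non-speech).

    Assumed definitions (not stated in the abstract):
      FAR = non-speech frames labeled speech / all non-speech frames
      FRR = speech frames labeled non-speech / all speech frames
    """
    if len(reference) != len(decision):
        raise ValueError("label sequences must have the same length")

    n_speech = sum(1 for r in reference if r == 1)
    n_nonspeech = len(reference) - n_speech

    pairs = list(zip(reference, decision))
    false_accepts = sum(1 for r, d in pairs if r == 0 and d == 1)
    false_rejects = sum(1 for r, d in pairs if r == 1 and d == 0)

    # Guard against division by zero when one class is absent.
    far = false_accepts / n_nonspeech if n_nonspeech else 0.0
    frr = false_rejects / n_speech if n_speech else 0.0
    return far, frr


# Hypothetical labels: the detector keeps reporting speech for two frames
# after the true speech offset, as a reverberant tail might cause.
reference = [0, 0, 1, 1, 1, 0, 0, 0]
decision = [0, 0, 1, 1, 1, 1, 1, 0]
far, frr = far_frr(reference, decision)
```

In this toy case two of the five non-speech frames are accepted as speech and no speech frame is rejected, which mirrors the pattern the abstract reports: reverberation inflates FAR much more than FRR.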


Bibliographic reference.  Unoki, Masashi / Lu, Xugang / Petrick, Rico / Morita, Shota / Akagi, Masato / Hoffmann, Rüdiger (2011): "Voice activity detection in MTF-based power envelope restoration", In INTERSPEECH-2011, 2609-2612.