11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Methods for Robust Speech Recognition in Reverberant Environments: A Comparison

Rico Petrick (1), Thomas Fehér (1), Masashi Unoki (2), Rüdiger Hoffmann (1)

(1) Technische Universität Dresden, Germany
(2) JAIST, Japan

In this article the authors continue previous studies of methods that aim to increase the recognition rate (RR) of automatic speech recognition systems in reverberant environments. Previously, three robust front-end methods were tested: the harmonicity based feature analysis (HFA), the temporal power envelope feature analysis, and their combination. This paper adds two well-known methods to the comparison: the dereverberation method using the inverse modulation transfer function (IMTF) and the delay-and-sum beamformer (DSB). Recognition experiments are carried out on a command word recognition task. The results of this first comparison of such methods experimentally confirm several earlier assumptions, e.g., the IMTF method achieves robustness only in the far field, and the DSB improves the RR slightly but is outperformed by the HFA because the DSB has little directivity at low frequencies.
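The remark about the DSB and low frequencies can be illustrated with a minimal sketch of the far-field response of a delay-and-sum beamformer. This is not the authors' setup: the uniform linear array, the four microphones, the 5 cm spacing, the speed of sound, and the function name `dsb_response` are all assumptions chosen for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def dsb_response(freq_hz, angle_deg, n_mics=4, spacing_m=0.05, steer_deg=0.0):
    """Magnitude response of a delay-and-sum beamformer to a far-field plane wave.

    Assumes a uniform linear array; angles are measured from broadside.
    The array geometry is hypothetical and not taken from the paper.
    """
    # Microphone positions along the array axis, in metres.
    positions = np.arange(n_mics) * spacing_m
    # Arrival delay at each microphone for a wave coming from angle_deg.
    tau_source = positions * np.sin(np.deg2rad(angle_deg)) / SPEED_OF_SOUND
    # Delays the beamformer compensates for, given its steering direction.
    tau_steer = positions * np.sin(np.deg2rad(steer_deg)) / SPEED_OF_SOUND
    # Residual phase per microphone after delay compensation.
    phases = 2j * np.pi * freq_hz * (tau_source - tau_steer)
    # Summing (here: averaging) the aligned channels gives the array response.
    return float(abs(np.mean(np.exp(phases))))


if __name__ == "__main__":
    for freq in (200.0, 3000.0):
        on_axis = dsb_response(freq, 0.0)
        off_axis = dsb_response(freq, 90.0)
        print(f"{freq:6.0f} Hz: on-axis {on_axis:.2f}, 90 deg off-axis {off_axis:.2f}")
```

Under these assumptions, a wave arriving from the side is barely attenuated at 200 Hz, while at 3000 Hz it is strongly attenuated. This is the sense in which a small array offers little spatial selectivity at low frequencies, where much of the reverberant energy in speech lies.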


Bibliographic reference.  Petrick, Rico / Fehér, Thomas / Unoki, Masashi / Hoffmann, Rüdiger (2010): "Methods for robust speech recognition in reverberant environments: a comparison", In INTERSPEECH-2010, 582-585.