11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Feature versus Model Based Noise Robustness

Kris Demuynck, Xueru Zhang, Dirk Van Compernolle, Hugo Van hamme

Katholieke Universiteit Leuven, Belgium

Over the years, the focus in noise-robust speech recognition has shifted from noise-robust features to model-based techniques such as parallel model combination and uncertainty decoding. In this paper, we contrast prime examples of both approaches in the context of large-vocabulary recognition systems such as those used for automatic audio indexing and transcription. We examine the approximations each technique requires to keep the computational load reasonable, the resulting computational cost, and the accuracy measured on the Aurora4 benchmark. The results show that a well-designed feature-based scheme can provide recognition accuracies at least as good as those of the model-based approaches at a substantially lower computational cost.

Bibliographic reference. Demuynck, Kris / Zhang, Xueru / Van Compernolle, Dirk / Van hamme, Hugo (2010): "Feature versus model based noise robustness", In INTERSPEECH-2010, 721-724.