9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Analysis of Physiologically-Motivated Signal Processing for Robust Speech Recognition

Yu-Hsiang Bosco Chiu, Richard M. Stern

Carnegie Mellon University, USA

This paper examines the relative contribution of the different stages of a popular auditory model to the accuracy of automatic speech recognition in the presence of additive noise. Recognition accuracy is measured using the CMU SPHINX-III speech recognition system, with the DARPA Resource Management speech corpus used for training and testing. Feature extraction based on auditory processing is shown to provide better performance in additive background noise than traditional MFCC processing, and it is argued that an expansive nonlinearity in the auditory model contributes the most to noise robustness.

Bibliographic reference. Chiu, Yu-Hsiang Bosco / Stern, Richard M. (2008): "Analysis of physiologically-motivated signal processing for robust speech recognition", In INTERSPEECH-2008, 1000-1003.