Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Robust Automatic Speech Recognition Using a Perceptually-Based Optimal Spectral Amplitude Estimator Speech Enhancement Algorithm in Various Low-SNR Environments

Hesham Tolba, Zili Li, Douglas O'Shaughnessy

Université du Québec, Canada

This paper addresses the noise robustness of automatic speech recognition (ASR) systems in various noisy environments using a Minimum Mean-Square Error Short-Time Spectral Amplitude (MMSE-STSA) estimator. A Perceptual Weighting Filter (PWF) is integrated with the MMSE-STSA algorithm to improve the performance of the speech enhancement preprocessing stage. The proposed PWF-based STSA algorithm is then integrated into the front end of an ASR system to evaluate its robustness in environments with severe interfering noise. Experiments were conducted using noisy versions of speech signals extracted from the TIMIT database. The Hidden Markov Model Toolkit (HTK) was used throughout the experiments. Results show that the proposed approach, when included in the front end of an HTK-based ASR system, outperforms the conventional recognition process in interfering noisy environments over a wide range of SNRs, down to -4 dB.
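
The abstract does not say how the noisy versions of the TIMIT signals were produced. As an illustration only, the sketch below shows one common way to mix a noise signal into a clean signal at a prescribed SNR such as -4 dB: the noise is rescaled so that the ratio of mean signal power to mean noise power matches the target. The function names (`mean_power`, `mix_at_snr`, `measure_snr_db`), the sine tone standing in for speech, and the Gaussian noise are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def mean_power(x):
    """Average power (mean of squared samples) of a sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals
    `target_snr_db`, then add it to `speech`.

    Returns (noisy_signal, scaled_noise).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # Required noise power is p_speech / 10^(SNR/10); scale amplitude by the
    # square root of the ratio between required and actual noise power.
    scale = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled_noise = [scale * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measure_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Hypothetical stand-ins: a 440 Hz tone sampled at 16 kHz instead of a
# TIMIT utterance, and white Gaussian noise instead of real interference.
rng = random.Random(0)
sample_rate = 16000
speech = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(speech, noise, -4.0)
```

At a negative SNR such as -4 dB the scaled noise carries more power than the clean signal, which is why this end of the range is the hardest for the recognizer.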

Bibliographic reference. Tolba, Hesham / Li, Zili / O'Shaughnessy, Douglas (2005): "Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environments", in INTERSPEECH-2005, 937-940.