5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Improving The Noise And Spectral Robustness Of An Isolated-Word Recognizer Using An Auditory-Model Front End

Martin Hunke (1), Meeran Hyun (1), Steve Love (2), Thomas Holton (1)

(1) San Francisco State University, San Francisco, CA, USA
(2) Meridian Speech Technology, USA

In this study, the performance of an auditory-model feature-extraction 'front end' was assessed in an isolated-word speech recognition task using a common hidden Markov model (HMM) 'back end'. It was compared with the performance of other front-end feature representations, including mel-frequency cepstral coefficients (MFCC) and two variants (J- and L-) of the relative spectral amplitude (RASTA) technique. The recognition task was performed in the presence of varying levels and types of additive noise and spectral distortion, using standard HMM whole-word models with the Bellcore Digit database as the corpus. While all front ends achieved comparable recognition performance on clean speech, the performance of the auditory-model front end was generally significantly higher than that of the other methods in recognition tasks involving background noise or spectral distortion. Training HMMs with speech processed by the auditory-model or L-RASTA front end in one type of noise also improved recognition performance in other kinds of noise. This 'cross-training' effect did not occur with the MFCC or J-RASTA front ends.
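
The abstract does not say how the additive noise was mixed with the speech or which noise levels were used. As a rough illustration only, the sketch below shows one common way to add noise to a clean signal at a chosen signal-to-noise ratio (SNR). The function names `mix_at_snr` and `measured_snr_db`, the 8 kHz sine-wave stand-in for speech, the Gaussian noise, and the 10 dB target are all assumptions made for this example, not details from the paper.

```python
import math
import random


def mean_power(samples):
    """Average power (mean square) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`, then add it.

    Hypothetical helper for illustration; not the procedure used in the paper.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Desired noise power is p_speech / 10^(snr_db / 10); the amplitude gain
    # is the square root of the ratio between desired and actual noise power.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB, treating (noisy - speech) as the noise component."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


# Stand-in signals: a 440 Hz tone sampled at 8 kHz and white Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

Repeating the call with different `snr_db` values and different noise recordings would give the kind of "varying levels and types" of noise condition the abstract describes, under the assumptions stated above.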

Bibliographic reference. Hunke, Martin / Hyun, Meeran / Love, Steve / Holton, Thomas (1998): "Improving the noise and spectral robustness of an isolated-word recognizer using an auditory-model front end", in ICSLP-1998, paper 0715.