EUROSPEECH 2003 - INTERSPEECH 2003
Maintaining a high level of robustness in Automatic Speech Recognition (ASR) systems is especially challenging when the background noise varies over time. We have implemented a Model-Based Feature Enhancement (MBFE) technique that can not only be embedded easily in the feature extraction module of a recogniser, but is also intrinsically suited to the removal of non-stationary additive noise. To this end, we combine statistical models (HMMs) of the cepstral feature vectors of both clean speech and noise, using a Vector Taylor Series approximation in the power spectral domain. Based on this combined HMM, a global MMSE estimate of the clean speech is then calculated. Because the applied models are scalable, MBFE is flexible and computationally feasible. Recognition experiments with this feature enhancement technique on the Aurora2 connected digit recognition task showed significant improvements in the noise robustness of the HTK recogniser.
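The model combination and MMSE estimation outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: a single scalar log-spectral feature instead of cepstral vectors, a Gaussian mixture for clean speech instead of an HMM (so no temporal modelling), a single Gaussian for the noise, and the commonly used mismatch function y = x + log(1 + exp(n - x)) linearised by a first-order Vector Taylor Series. The function names `vts_combine` and `mmse_clean_estimate` are hypothetical.

```python
import math


def vts_combine(mu_x, var_x, mu_n, var_n):
    """Combine one speech Gaussian and one noise Gaussian (assumed model).

    Linearises y = x + log(1 + exp(n - x)) around (mu_x, mu_n) and
    returns the mean and variance of y plus the covariance of x and y.
    """
    g = 1.0 / (1.0 + math.exp(mu_n - mu_x))  # dy/dx at the expansion point
    mu_y = mu_x + math.log1p(math.exp(mu_n - mu_x))
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n  # dy/dn equals 1 - g
    cov_xy = g * var_x
    return mu_y, var_y, cov_xy


def gauss_logpdf(y, mu, var):
    """Log density of a scalar Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def mmse_clean_estimate(y, speech_gmm, noise):
    """MMSE estimate of clean speech x from one noisy observation y.

    speech_gmm: list of (weight, mean, variance) tuples (hypothetical format)
    noise:      (mean, variance) of the noise Gaussian
    """
    mu_n, var_n = noise
    log_post, cond_means = [], []
    for w, mu_x, var_x in speech_gmm:
        mu_y, var_y, cov_xy = vts_combine(mu_x, var_x, mu_n, var_n)
        # Unnormalised log posterior of this component given y.
        log_post.append(math.log(w) + gauss_logpdf(y, mu_y, var_y))
        # Conditional mean E[x | y, component] under the linearised model.
        cond_means.append(mu_x + cov_xy / var_y * (y - mu_y))
    # Normalise posteriors in a numerically stable way.
    m = max(log_post)
    post = [math.exp(lp - m) for lp in log_post]
    z = sum(post)
    return sum(p / z * c for p, c in zip(post, cond_means))
```

In this sketch the estimate is a posterior-weighted sum of per-component conditional means, so when the noise mean lies far below every speech mean the estimate reduces to the observation itself. A full MBFE front end would apply the same idea per frame to vector-valued features and would use HMM state posteriors rather than static mixture weights.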
Bibliographic reference. Stouten, Veronique / Hamme, Hugo van / Demuynck, Kris / Wambacq, Patrick (2003): "Robust speech recognition using model-based feature enhancement", In EUROSPEECH-2003, 17-20.