EUROSPEECH 2001 Scandinavia
This paper evaluates a variety of techniques for robust digit recognition in noise using the AURORA 2.0 corpus. Current recognizers perform as well as humans on small-vocabulary tasks, but their performance degrades substantially when noise is introduced into the speech, whereas human performance is much less sensitive. To make the recognizer robust, we employed several methodologies: feature processing, enhancement before recognition, and model adaptation. We considered a number of processing and adaptation scenarios depending on noise type. As expected, the best performance was obtained under matched training conditions, which in general have limited applicability to real-world problems. As a feature processing step, using root cepstrum coefficients (RCCs) instead of MFCCs gave a substantial improvement. MFCCs with front-end enhancement increased performance considerably, but the results remained far from those obtained with matched training. When we combined RCCs with enhancement, however, we obtained the best results. In the next step, we employed model adaptation techniques, which outperformed MFCCs with enhancement and came much closer to the matched-condition limits. However, MFCC adaptation could not outperform RCC parameterization with front-end enhancement, which we show is much more computationally efficient than model adaptation.
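To illustrate the feature-processing contrast mentioned above, the following is a minimal sketch of how a root cepstrum differs from a log-compressed cepstrum such as the MFCC: the logarithm applied to the filterbank energies is replaced by a power-law (root) compression before the cosine transform. The abstract does not give the authors' front-end details, so the function names, the exponent `gamma`, the number of coefficients, and the use of an unnormalized DCT-II are all assumptions made for illustration.

```python
import numpy as np

def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II of a 1-D vector, keeping the first n_coeffs terms."""
    n = len(x)
    k = np.arange(n_coeffs)[:, None]   # output coefficient index
    m = np.arange(n)[None, :]          # input band index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return basis @ x

def cepstrum(filterbank_energies, n_coeffs=13, gamma=None):
    """Cepstral coefficients from one frame of filterbank energies.

    gamma=None  -> log compression (MFCC-style).
    gamma=0.1   -> root compression (RCC-style); the value is a
                   hypothetical choice, not taken from the paper.
    """
    # Floor the energies so the logarithm is always defined.
    e = np.maximum(np.asarray(filterbank_energies, dtype=float), 1e-10)
    compressed = np.log(e) if gamma is None else e ** gamma
    return dct_ii(compressed, n_coeffs)

# Example: one frame of 23 hypothetical filterbank energies.
frame = np.linspace(1.0, 5.0, 23)
mfcc_like = cepstrum(frame)             # log compression
rcc_like = cepstrum(frame, gamma=0.1)   # root compression
```

Only the compression step changes between the two calls, so a root-compressed front end of this kind adds essentially no cost over the log-compressed one.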
Bibliographic reference. Yapanel, Umit / Hansen, John H. L. / Sarikaya, Ruhi / Pellom, Bryan (2001): "Robust digit recognition in noise: an evaluation using the AURORA corpus", In EUROSPEECH-2001, 209-212.