Sixth European Conference on Speech Communication and Technology
The performance of voice dialling systems often degrades rapidly as the intensity of the background noise increases. In this paper, we describe a neural-network-based speech enhancement technique for improving the speech recognition performance of a voice dialling system in very noisy, real-world conditions. The speech samples were recorded in laboratory conditions and afterwards corrupted by adding car noise or babble noise recorded in a cafe. These noise-corrupted speech samples were enhanced in the cepstral domain by a context-dependent multilayer perceptron (MLP) network before recognition was performed with a hidden Markov model (HMM) based speech recognition system. Recognition accuracy on the test set increased by 58%, 55% and 46% in car noise environments with SNRs of -5 dB, 0 dB and 5 dB, respectively, and by 44%, 48% and 39% in babble noise environments with SNRs of 5 dB, 10 dB and 15 dB, respectively. At an SNR of 20 dB, the accuracy remained approximately the same in both the car and babble noise environments.
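The two data-preparation ideas mentioned in the abstract, corrupting clean speech with noise at a chosen SNR and giving the network a context of neighbouring feature vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_at_snr` and `stack_context`, the average-power definition of SNR, the looping of short noise recordings, the edge padding, and the context widths are all assumptions made here for illustration, since the abstract does not specify them.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the average-power SNR equals snr_db.

    Assumption: SNR is defined over the whole utterance as
    10*log10(mean(speech^2) / mean(scaled_noise^2)).
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Loop the noise recording if it is shorter than the speech, then trim.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that brings the noise power to p_speech / 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def stack_context(frames, left=2, right=2):
    """Concatenate each feature frame with its neighbours.

    frames: array of shape (num_frames, num_coeffs), e.g. MFCC vectors.
    Returns shape (num_frames, (left + right + 1) * num_coeffs).
    Assumption: utterance edges are padded by repeating the first/last frame.
    """
    frames = np.asarray(frames, dtype=np.float64)
    padded = np.pad(frames, ((left, right), (0, 0)), mode="edge")
    n = frames.shape[0]
    return np.hstack([padded[i : i + n] for i in range(left + right + 1)])
```

In a setup of this kind, the stacked noisy vectors would serve as the MLP input and the corresponding clean vectors as the regression target; the network architecture and training procedure are described in the full paper, not here.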
Bibliographic reference. Haverinen, Hemmo / Salmela, Petri / Häkkinen, Juha / Lehtokangas, Mikko / Saarinen, Jukka (1999): "MLP network for enhancement of noisy MFCC vectors", In EUROSPEECH'99, 2371-2374.