Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Front-End Improvements to Reduce Stationary & Variable Channel and Noise Distortions in Continuous Speech Recognition Tasks

Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka

SONY US Research Labs, San Jose, CA, USA

This paper introduces our current work on front-end techniques for robust speech recognition under mismatched conditions (additive noise mismatch and channel mismatch). Two algorithms are combined to compensate for the distortions caused by differing channel characteristics and additive noise: 1) a Cepstral Mean Normalization and Variance Scaling technique (MNVS) and 2) an Adaptive Gaussian Attenuation algorithm (AGA). With both techniques combined, the channel distortion effects were reduced to 90% on the HTIMIT task and the additive noise effects were reduced to 80% on the TIMIT task corrupted with additive car noise.
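
For orientation only, the general idea behind cepstral mean and variance normalization can be sketched as follows. This is a generic per-utterance version, not the paper's MNVS algorithm, whose details are not given in this abstract. The function name `mean_variance_normalize` and the `eps` parameter are assumptions made for illustration.

```python
import math


def mean_variance_normalize(frames, eps=1e-8):
    """Generic per-utterance cepstral mean and variance normalization.

    `frames` is a list of frames, each a list of cepstral coefficients of
    the same length. Returns new frames in which every coefficient has
    (approximately) zero mean and unit variance across the utterance.
    Hypothetical helper: not the MNVS method from the paper.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])

    # Mean of each coefficient over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]

    # Population standard deviation of each coefficient.
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n_frames)
        for k in range(n_coeffs)
    ]

    # Subtract the mean and scale; eps (an assumption) avoids division by zero.
    return [
        [(f[k] - means[k]) / (stds[k] + eps) for k in range(n_coeffs)]
        for f in frames
    ]


# A stationary channel appears as a constant offset in the cepstral domain,
# so the same utterance with and without the offset normalizes identically.
clean = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
offset = [0.5, -2.0]
shifted = [[c + o for c, o in zip(frame, offset)] for frame in clean]
norm_clean = mean_variance_normalize(clean)
norm_shifted = mean_variance_normalize(shifted)
```

In this sketch the mean subtraction removes the constant offset, while the variance scaling brings each coefficient to a common dynamic range.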


Bibliographic reference.  Menéndez-Pidal, Xavier / Chen, Ruxin / Wu, Duanpei / Tanaka, Mick (1999): "Front-end improvements to reduce stationary & variable channel and noise distortions in continuous speech recognition tasks", In EUROSPEECH'99, 2849-2852.