7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Exploiting Variances in Robust Feature Extraction Based on a Parametric Model of Speech Distortion

Li Deng, Jasha Droppo, Alex Acero

Microsoft Research, USA

This paper presents a technique that exploits the variance of the denoised speech, estimated during speech feature enhancement, to improve noise-robust speech recognition. The technique provides an alternative to the Bayesian predictive classification decision rule: it integrates over the feature space instead of over the model-parameter space, which gives a much simpler system implementation and lower computational cost. We extend our earlier work [5] with a new approach to statistical feature enhancement that is based on a parametric model of speech distortion and therefore needs no stereo training data, and we develop a novel algorithm for estimating the variance of the enhanced speech features. Experimental evaluation on the full Aurora2 test sets demonstrates an 11.4% reduction in digit error rate, averaged over all noise and SNR conditions, compared with the best technique we had developed prior to this work [2], which did not exploit the variance information and likewise required no stereo training data.
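To make the idea of integrating over the feature space concrete, the sketch below works through a one-dimensional case. It is an illustration under stated assumptions, not the paper's algorithm: it assumes that both the acoustic-model density and the uncertainty of the enhanced feature are Gaussian, in which case the integral has a closed form and amounts to adding the enhancement variance to the model variance. The function names (`gaussian_logpdf`, `uncertainty_loglik`, `numeric_loglik`) and the numbers used are hypothetical, and the abstract does not say how the paper estimates the variance itself.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian N(x; mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def uncertainty_loglik(enhanced_mean, enhanced_var, model_mean, model_var):
    """Log of the integral over x of
    N(x; model_mean, model_var) * N(x; enhanced_mean, enhanced_var).

    Assumption: both densities are Gaussian, so the integral equals
    N(enhanced_mean; model_mean, model_var + enhanced_var).
    """
    return gaussian_logpdf(enhanced_mean, model_mean, model_var + enhanced_var)


def numeric_loglik(enhanced_mean, enhanced_var, model_mean, model_var, steps=20001):
    """Same integral evaluated by a plain Riemann sum, as a sanity check."""
    sd = math.sqrt(enhanced_var)
    lo = enhanced_mean - 10.0 * sd
    hi = enhanced_mean + 10.0 * sd
    dx = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * dx
        total += math.exp(
            gaussian_logpdf(x, model_mean, model_var)
            + gaussian_logpdf(x, enhanced_mean, enhanced_var)
        )
    return math.log(total * dx)


if __name__ == "__main__":
    # Hypothetical values: an enhanced feature far from the model mean.
    point = gaussian_logpdf(3.0, 0.0, 1.0)            # variance ignored
    uncertain = uncertainty_loglik(3.0, 4.0, 0.0, 1.0)  # variance used
    print("point estimate log-likelihood:", point)
    print("with enhancement variance:   ", uncertain)
```

Under these assumptions, a frame whose enhanced feature is unreliable (large variance) is penalized less for lying far from a model mean, while a zero variance reduces to ordinary scoring of the point estimate. The model parameters are left untouched, which is one way to read the abstract's claim of a simpler implementation than integrating over the model-parameter space.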

Bibliographic reference. Deng, Li / Droppo, Jasha / Acero, Alex (2002): "Exploiting variances in robust feature extraction based on a parametric model of speech distortion", in ICSLP-2002, 2449-2452.