11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A MMSE Estimator in Mel-Cepstral Domain for Robust Large Vocabulary Automatic Speech Recognition Using Uncertainty Propagation

Ramón Fernandez Astudillo, Reinhold Orglmeister

Chair of Electronics and Medical Signal Processing, Technical University Berlin, Germany

Uncertainty propagation techniques achieve more robust automatic speech recognition by modeling, in probabilistic form, the information missing after speech enhancement in the short-time Fourier transform (STFT) domain. This information is then propagated into the feature domain, where recognition takes place, and combined with observation uncertainty techniques such as uncertainty decoding. In this paper we show how uncertainty propagation can also be used to yield minimum mean square error (MMSE) estimates of the clean speech directly in the recognition domain. We develop an MMSE estimator for the Mel-cepstral features by propagating the Wiener filter posterior distribution and show that it outperforms conventional MMSE methods in the STFT domain on the AURORA4 large vocabulary test environment.
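The idea of estimating directly in the feature domain can be illustrated with a small numerical sketch. It is not the estimator developed in the paper: it approximates the posterior mean of the cepstral features by Monte Carlo sampling, whereas the abstract does not state how the propagation is carried out. It assumes the standard Wiener model (zero-mean complex Gaussian speech and noise with variances lambda_x and lambda_d), under which the posterior of the clean STFT coefficient is complex Gaussian with mean G*Y and variance G*lambda_d, where G = lambda_x / (lambda_x + lambda_d). All function names, the random toy filterbank, and the parameter values are hypothetical.

```python
import numpy as np

def wiener_posterior(Y, lambda_x, lambda_d):
    """Mean and variance of the complex Gaussian posterior p(X | Y)."""
    gain = lambda_x / (lambda_x + lambda_d)   # Wiener gain G
    return gain * Y, gain * lambda_d

def mfcc_from_stft(X, mel_fb, dct_mat, floor=1e-10):
    """Map STFT coefficients (..., K) to cepstral features (..., C)."""
    mel_energy = (np.abs(X) ** 2) @ mel_fb.T            # (..., M)
    return np.log(np.maximum(mel_energy, floor)) @ dct_mat.T

def mmse_mfcc_monte_carlo(Y, lambda_x, lambda_d, mel_fb, dct_mat,
                          n_samples=2000, seed=0):
    """Sample the Wiener posterior, transform each sample, then average.

    Returns the sample mean (an approximate MMSE feature estimate) and
    the sample variance (a feature-domain uncertainty) per coefficient.
    """
    rng = np.random.default_rng(seed)
    mean, var = wiener_posterior(Y, lambda_x, lambda_d)
    shape = (n_samples,) + Y.shape
    # Circular complex Gaussian: half the variance in each of real and imag.
    noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    samples = mean + np.sqrt(var / 2.0) * noise
    feats = mfcc_from_stft(samples, mel_fb, dct_mat)
    return feats.mean(axis=0), feats.var(axis=0)

def toy_transforms(n_bins=64, n_mel=12, n_cep=6, seed=1):
    """Hypothetical stand-ins: random non-negative filterbank, DCT-II matrix."""
    rng = np.random.default_rng(seed)
    mel_fb = rng.random((n_mel, n_bins))
    m = np.arange(n_mel)
    dct_mat = np.cos(np.pi * np.arange(n_cep)[:, None] * (m + 0.5) / n_mel)
    return mel_fb, dct_mat

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    mel_fb, dct_mat = toy_transforms()
    Y = rng.standard_normal(64) + 1j * rng.standard_normal(64)  # one noisy frame
    lambda_x = np.full(64, 1.0)   # assumed speech variance per bin
    lambda_d = np.full(64, 1.0)   # assumed noise variance per bin
    mmse_feat, feat_var = mmse_mfcc_monte_carlo(Y, lambda_x, lambda_d,
                                                mel_fb, dct_mat)
    wiener_mean, _ = wiener_posterior(Y, lambda_x, lambda_d)
    plug_in = mfcc_from_stft(wiener_mean, mel_fb, dct_mat)
    print("feature-domain estimate:", mmse_feat)
    print("features of STFT-domain Wiener estimate:", plug_in)
```

Because the feature transform is nonlinear, the mean of the transformed posterior generally differs from the features computed from the STFT-domain Wiener estimate, which is the distinction the abstract draws between the two kinds of MMSE estimate.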


Bibliographic reference.  Astudillo, Ramón Fernandez / Orglmeister, Reinhold (2010): "A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation", In INTERSPEECH-2010, 713-716.