7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Sequential MAP Noise Estimation and a Phase-Sensitive Model of the Acoustic Environment

Li Deng, Jasha Droppo, Alex Acero

Microsoft Research, USA

In this paper we present a minimum mean square error (MMSE) speech feature enhancement algorithm built on a new probabilistic, nonlinear environment model that incorporates the phase relationship between the clean speech and the corrupting noise in acoustic distortion. We derive the MMSE estimator for this phase-sensitive model; it achieves high efficiency by using a single-point Taylor series expansion to approximate the joint probability of clean and noisy speech as a multivariate Gaussian. As an integral component of the enhancement algorithm, we also present a new sequential MAP-based nonstationary noise estimator. Experimental results on the Aurora2 task demonstrate the importance of exploiting the phase relationship in the speech corruption process, as captured by the MMSE estimator. Under otherwise identical speech recognition conditions, the phase-sensitive MMSE estimator performs significantly better than phase-insensitive spectral subtraction (54% error rate reduction) and noticeably better than the phase-insensitive MMSE estimator reported in [2], our previous state-of-the-art technique (7% error rate reduction).
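
As a rough illustration of why the phase relationship matters, consider a single frequency bin in which the noisy value is the complex sum of clean speech and noise. The noisy power then contains a cross term that depends on the phase difference between the two, and a phase-insensitive model drops that term. The sketch below shows only this elementary identity; it is not the paper's environment model or its MMSE estimator, and the function names and numeric values are hypothetical.

```python
import cmath
import math


def noisy_power(x_mag, n_mag, theta):
    """Power of y = x + n for complex x, n with phase difference theta.

    Illustrative only: |y|^2 = |x|^2 + |n|^2 + 2|x||n|cos(theta).
    """
    return x_mag ** 2 + n_mag ** 2 + 2.0 * x_mag * n_mag * math.cos(theta)


def noisy_power_phase_insensitive(x_mag, n_mag):
    """Same quantity with the cross term dropped (phase ignored)."""
    return x_mag ** 2 + n_mag ** 2


# Hypothetical single-bin values: clean speech and noise as complex numbers.
x = cmath.rect(2.0, 0.3)        # magnitude 2.0, phase 0.3 rad
n = cmath.rect(1.0, 0.3 + 2.0)  # magnitude 1.0, phase offset 2.0 rad from x

direct = abs(x + n) ** 2                      # power from direct complex addition
with_phase = noisy_power(2.0, 1.0, 2.0)       # identity with the cross term
without_phase = noisy_power_phase_insensitive(2.0, 1.0)

print(direct, with_phase, without_phase)
```

The first two printed values agree, while the third differs by the cross term, which can be positive or negative depending on the phase difference.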

Bibliographic reference. Deng, Li / Droppo, Jasha / Acero, Alex (2002): "Sequential MAP noise estimation and a phase-sensitive model of the acoustic environment", in ICSLP-2002, 1813-1816.