12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Improved a posteriori Speech Presence Probability Estimation Based on Cepstro-Temporal Smoothing and Time-Frequency Correlation

Chao Li, Wenju Liu

Chinese Academy of Sciences, China

In this paper, we present a novel estimator for the speech presence probability (SPP) at each time-frequency point in the short-time Fourier transform (STFT) domain. Existing SPP estimators are not reliable in nonstationary noise environments when applied to a speech enhancement task. The proposed method addresses this limitation in two steps. First, spectral outliers are eliminated by selectively smoothing the maximum likelihood estimate of the a priori signal-to-noise ratio (SNR) in the cepstral domain. Second, an adaptive tracking method for the a priori SPP is derived by exploiting the strong correlation of speech presence in neighboring frequency bins of consecutive frames. The proposed approach outperforms state-of-the-art approaches, resulting in less noise leakage and lower speech distortion in both stationary and nonstationary noise environments.
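The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, smoothing constants, neighborhood width, and clipping bounds below are all assumptions. The a posteriori SPP uses the standard Gaussian-model expression, the cepstral smoothing is a simplified first-order recursion with less smoothing at low quefrencies, and the a priori SPP tracker is a hypothetical stand-in that averages the previous frame's a posteriori SPP over neighboring frequency bins.

```python
import numpy as np


def ml_prior_snr(gamma, xi_min=1e-3):
    """Maximum likelihood a priori SNR estimate from the a posteriori SNR gamma."""
    return np.maximum(gamma - 1.0, xi_min)


def make_alpha(n_fft, n_low=4, alpha_low=0.2, alpha_high=0.9):
    """Symmetric per-quefrency smoothing constants (values are assumptions).

    Low quefrencies (spectral envelope) are smoothed lightly, the rest heavily.
    """
    k = np.arange(n_fft)
    quefrency = np.minimum(k, n_fft - k)
    return np.where(quefrency < n_low, alpha_low, alpha_high)


def cepstral_smooth_snr(xi_ml, prev_cepstrum, alpha):
    """Recursively smooth the log a priori SNR of one frame in the cepstral domain.

    xi_ml has n_fft // 2 + 1 bins; prev_cepstrum and alpha have n_fft entries.
    Returns the smoothed SNR and the smoothed cepstrum for the next frame.
    """
    n_fft = 2 * (len(xi_ml) - 1)
    cepstrum = np.fft.irfft(np.log(xi_ml), n=n_fft)
    smoothed = alpha * prev_cepstrum + (1.0 - alpha) * cepstrum
    xi_smooth = np.exp(np.fft.rfft(smoothed).real)
    return xi_smooth, smoothed


def track_prior_spp(prev_posterior, width=3, q_min=0.05, q_max=0.95):
    """Hypothetical a priori SPP tracker (an assumption, not the paper's rule).

    Averages the previous frame's a posteriori SPP over `width` neighboring
    frequency bins and clips the result away from 0 and 1.
    """
    kernel = np.ones(width) / width
    averaged = np.convolve(prev_posterior, kernel, mode="same")
    return np.clip(averaged, q_min, q_max)


def posterior_spp(gamma, xi, q):
    """A posteriori SPP under the complex Gaussian model, with a priori SPP q."""
    v = gamma * xi / (1.0 + xi)
    return 1.0 / (1.0 + (1.0 - q) / q * (1.0 + xi) * np.exp(-v))


# Usage on synthetic a posteriori SNR values (frames x bins).
rng = np.random.default_rng(0)
n_fft = 16
gammas = rng.exponential(scale=2.0, size=(5, n_fft // 2 + 1))
alpha = make_alpha(n_fft)
cepstrum_state = np.zeros(n_fft)
spp = np.full(n_fft // 2 + 1, 0.5)
for gamma in gammas:
    xi, cepstrum_state = cepstral_smooth_snr(ml_prior_snr(gamma), cepstrum_state, alpha)
    q = track_prior_spp(spp)
    spp = posterior_spp(gamma, xi, q)
```

Smoothing in the cepstral domain lets the spectral envelope follow speech onsets quickly while isolated spectral peaks, which spread across high quefrencies, are averaged out over time.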


Bibliographic reference.  Li, Chao / Liu, Wenju (2011): "Improved a posteriori speech presence probability estimation based on cepstro-temporal smoothing and time-frequency correlation", In INTERSPEECH-2011, 1201-1204.