8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

An Optimal Speech Enhancement Under Speech Uncertainty Probability and Masking Property of Auditory System

Xiaoshan Huang, Xiaoqun Zhao

Tongji University, China

Recently, I. Cohen presented causal and noncausal algorithms that modify the classic decision-directed approach to a priori SNR estimation. It is well known that the a priori SNR estimate is critical to the trade-off between the musical noise level and speech clarity in spectral subtraction speech enhancement. However, all of these algorithms conflict, to a greater or lesser extent, with the statistical signal model. To adjust more reasonably the smoothing parameters, which play an important role in the recursive estimation of the a priori SNR and the noise spectrum, we present a novel speech uncertainty state model that capitalizes on the masking property of the auditory system, and propose a new modified approach that uses the speech uncertainty probability to adapt the smoothing parameters automatically. The novel algorithm is capable of eliminating musical noise while lowering speech distortion, because it leaves the original speech unmodified when the noise is inaudible, i.e. below the masking threshold. Experiments confirm that the novel algorithm is superior to classic methods, particularly in low-SNR environments.
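For orientation, the classic decision-directed recursion that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook estimator for a single frequency bin, not the modified method proposed in the paper. The function name, the fixed smoothing parameter `alpha`, the floor `xi_min`, and the use of a Wiener gain in place of a log-spectral amplitude gain are all assumptions made for the example. The paper's contribution is to adapt the smoothing parameters from a speech uncertainty probability, and this page does not give those details.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Classic decision-directed a priori SNR estimate for one frequency bin.

    noisy_power: per-frame noisy periodogram values |Y(l)|^2
    noise_power: per-frame noise variance estimates lambda_d(l)
    alpha:       fixed smoothing parameter (illustrative value)
    xi_min:      lower floor on the estimate (illustrative value)
    """
    xi = []
    prev_clean_power = 0.0  # |X_hat(l-1)|^2, clean-speech power estimate
    prev_noise = None       # lambda_d(l-1)
    for y2, lam in zip(noisy_power, noise_power):
        gamma = y2 / lam                    # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)     # instantaneous (ML) estimate
        if prev_noise is None:
            xi_l = ml_term                  # first frame: no history yet
        else:
            xi_l = (alpha * prev_clean_power / prev_noise
                    + (1.0 - alpha) * ml_term)
        xi_l = max(xi_l, xi_min)
        gain = xi_l / (1.0 + xi_l)          # Wiener gain (stand-in gain rule)
        prev_clean_power = (gain ** 2) * y2
        prev_noise = lam
        xi.append(xi_l)
    return xi


if __name__ == "__main__":
    # A noise-only frame followed by a strong speech onset.
    print(decision_directed_snr([1.0, 101.0], [1.0, 1.0]))
```

With `alpha` close to 1 the estimate reacts slowly to the onset in the second frame. That smoothing suppresses musical noise but also attenuates speech onsets, which is the trade-off the abstract refers to.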

Full Paper

Acoustic Material

CleanSpeech.wav This clean speech segment is extracted from a Chinese male speech data pool.
White-5dB.wav This audio file is synthesized by mixing the above clean speech with white noise extracted from a noise database. The segmental SNR is -5 dB.
Casual.wav This audio file is the result of processing the White -5 dB audio file with the causal speech enhancement method proposed by I. Cohen.
OM-MMLS.wav This audio file is the result of processing the White -5 dB audio file with the Minimum Mean-Square Error Log-Spectral Amplitude algorithm proposed by Yariv Ephraim and David Malah.
Novel.wav This audio file is the result of processing the White -5 dB audio file with the new modified method proposed by this paper's authors.
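The segmental SNR quoted for the noisy file is conventionally the average of per-frame SNR values in dB. The sketch below shows one common way to compute it from time-aligned clean and noisy sample sequences. The function name, the frame length of 256 samples, and the use of non-overlapping frames are assumptions for illustration, since the page does not state how the authors framed the signals or whether they clamped per-frame values.

```python
import math


def segmental_snr_db(clean, noisy, frame_len=256):
    """Average per-frame SNR in dB between a clean and a noisy signal.

    clean, noisy: equal-length, time-aligned sequences of samples
    frame_len:    samples per non-overlapping frame (illustrative value)
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = noisy[start:start + frame_len]
        signal_energy = sum(v * v for v in s)
        noise_energy = sum((a - b) ** 2 for a, b in zip(y, s))
        # Skip silent or noise-free frames, where the ratio is undefined.
        if signal_energy > 0.0 and noise_energy > 0.0:
            frame_snrs.append(10.0 * math.log10(signal_energy / noise_energy))
    if not frame_snrs:
        raise ValueError("no frame with non-zero signal and noise energy")
    return sum(frame_snrs) / len(frame_snrs)
```

Applied to samples loaded from CleanSpeech.wav and White-5dB.wav, a function of this kind would be expected to return a value near -5 dB, provided the framing matches the one the authors used.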

Bibliographic reference.  Huang, Xiaoshan / Zhao, Xiaoqun (2007): "An optimal speech enhancement under speech uncertainty probability and masking property of auditory system", In INTERSPEECH-2007, 862-865.