7th International Conference on Spoken Language Processing
September 16-20, 2002
To reduce the effects of additive noise, spectral subtraction (SS) is often used. We discuss SS in the power spectral domain. This method has two problems that make the estimation of clean speech difficult: (1) there is an estimation error between the true power spectrum of the noise and the estimated one, and (2) speech and noise are correlated because of the phase difference between them. To overcome these problems, we proposed a spectral subtraction method that uses smoothing in the time direction. We take the average of the estimated speech power spectra over several frames as the estimated speech power spectrum. This operation makes the noise estimate more accurate and reduces the effect of the correlation between speech and noise.
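The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. For a noisy spectrum X = S + N, the power spectrum is |X|^2 = |S|^2 + |N|^2 + 2|S||N|cos(theta), where theta is the phase difference; averaging over frames tends to shrink both the cross term and the fluctuation of the noise power around its estimate. The function names, the window width of 3 frames, the edge padding, and the placement of the zero floor after averaging are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def smooth_time(power, width=3):
    """Moving average over `width` frames along axis 0 (the time axis).

    `power` has shape (frames, bins). Edges are padded by repeating the
    first and last frame (an assumption), so the output keeps the same shape.
    """
    pad_left = width // 2
    pad_right = width - 1 - pad_left
    padded = np.pad(power, ((pad_left, pad_right), (0, 0)), mode="edge")
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda column: np.convolve(column, kernel, mode="valid"), 0, padded
    )


def smoothed_spectral_subtraction(noisy_power, noise_power, width=3):
    """Power-domain SS followed by averaging over neighbouring frames.

    noisy_power: (frames, bins) power spectra of the noisy speech.
    noise_power: (bins,) estimated noise power spectrum.
    The floor at zero is applied after averaging (an assumption).
    """
    per_frame = noisy_power - noise_power      # per-frame speech estimate
    averaged = smooth_time(per_frame, width)   # average over `width` frames
    return np.maximum(averaged, 0.0)           # power cannot be negative
```

A call such as `smoothed_spectral_subtraction(noisy_power, noise_power, width=5)` returns an array with the same shape as `noisy_power`. A wider window suppresses more of the estimation error and cross term, but it also blurs rapid changes in the speech spectrum, so the width is a trade-off.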
In this paper, we tested this method on the AURORA 2 database, which consists of English connected digits with various realistic noises added. We achieved a 47.26% relative improvement in word accuracy with acoustic models trained under the clean condition and 11.95% with models trained under the multi-condition.
Bibliographic reference. Kitaoka, Norihide / Nakagawa, Seiichi (2002): "Evaluation of spectral subtraction with smoothing of time direction on the Aurora 2 task", In ICSLP-2002, 477-480.