8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Combination of Temporal Domain SVD Based Speech Enhancement and GMM Based Speech Estimation for ASR in Noise - Evaluation on the AURORA2 Task -

Masakiyo Fujimoto, Yasuo Ariki

Ryukoku University, Japan

In this paper, we propose a noise-robust speech recognition method that combines temporal-domain singular value decomposition (SVD) based speech enhancement with Gaussian mixture model (GMM) based speech estimation. The bottleneck of the GMM-based approach is noise estimation. To address it, we incorporate adaptive noise estimation into the GMM-based approach. Furthermore, to obtain higher recognition accuracy, we employ a temporal-domain SVD-based speech enhancement method as a pre-processing module for the GMM-based approach. In addition, to reduce the influence of the noise contained in the noisy speech, we introduce an adaptive over-subtraction factor into the SVD-based speech enhancement. Noise reduction methods usually degrade the recognition rate because of spectral distortion caused by residual noise and over-estimation arising during noise reduction. To solve this problem, the acoustic model is adapted to the distorted speech signal using unsupervised MLLR. In the evaluation on the AURORA2 task, our method achieved a relative improvement on the clean-condition training task.
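
As an illustration of the kind of temporal-domain SVD enhancement with an over-subtraction factor mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a common formulation: a time-domain frame is arranged into a Hankel matrix, an estimated noise power scaled by a factor `alpha` is subtracted from the squared singular values, and the frame is rebuilt by averaging anti-diagonals. The function names, the fixed `alpha`, the white-noise assumption, and the known `noise_sigma` are all hypothetical; the abstract does not specify how the factor is adapted or how the noise is estimated.

```python
import numpy as np

def hankel_matrix(frame, rows):
    """Stack overlapping windows of a 1-D frame into a rows x cols Hankel matrix."""
    cols = len(frame) - rows + 1
    return np.array([frame[i:i + cols] for i in range(rows)])

def average_antidiagonals(H):
    """Map a matrix back to a 1-D signal by averaging each anti-diagonal."""
    rows, cols = H.shape
    out = np.zeros(rows + cols - 1)
    counts = np.zeros(rows + cols - 1)
    for i in range(rows):
        out[i:i + cols] += H[i]
        counts[i:i + cols] += 1
    return out / counts

def svd_enhance(frame, noise_sigma, rows=20, alpha=1.0):
    """Attenuate singular values by an over-subtracted noise power (illustrative).

    Assumption: the noise is white with standard deviation noise_sigma, so it
    adds roughly noise_sigma**2 * cols to every squared singular value.
    alpha > 1 subtracts more than this estimate (over-subtraction); here it is
    a fixed constant, whereas the paper describes an adaptive factor.
    """
    H = hankel_matrix(np.asarray(frame, dtype=float), rows)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    noise_power = alpha * noise_sigma ** 2 * H.shape[1]
    s_clean = np.sqrt(np.maximum(s ** 2 - noise_power, 0.0))
    return average_antidiagonals((U * s_clean) @ Vt)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(200)
    clean = np.sin(2 * np.pi * 0.05 * t)          # synthetic stand-in for speech
    noisy = clean + 0.5 * rng.standard_normal(t.size)
    enhanced = svd_enhance(noisy, noise_sigma=0.5, alpha=2.0)
    print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
    print("enhanced MSE:", np.mean((enhanced - clean) ** 2))
```

In this sketch a larger `alpha` removes more residual noise at the cost of more distortion of the signal components, which is the trade-off that the model adaptation step in the abstract is meant to compensate for.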


Bibliographic reference.  Fujimoto, Masakiyo / Ariki, Yasuo (2003): "Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -", In EUROSPEECH-2003, 1781-1784.