INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Comparative Study of Noise Estimation Algorithms for VTS-Based Robust Speech Recognition

Yong Zhao, Biing-Hwang Juang

Georgia Institute of Technology, USA

We conduct a comparative study of two noise estimation approaches, developed in the past few years, for robust speech recognition using vector Taylor series (VTS). The first approach, iterative root finding (IRF), directly differentiates the EM auxiliary function and approximates the root of the derivative function through recursive refinements. The second approach, twofold expectation maximization (TEM), estimates noise distributions by regarding them as hidden variables in a modified EM fashion. Mathematical derivations reveal a substantial connection between the two approaches. Two experiments are performed to evaluate the performance and convergence rate of the algorithms. The first fits a GMM to artificially corrupted samples generated through Monte Carlo simulation. The second performs speech recognition on the Aurora 2 database.
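
As a rough illustration of the setting, the sketch below generates artificially corrupted samples by Monte Carlo simulation and compares their moments with a first-order VTS approximation. It is not taken from the paper: it assumes a scalar log-spectral mismatch function y = log(exp(x) + exp(n)) with no channel term, Gaussian clean speech and noise, and hypothetical function names (`corrupt`, `vts_mean_var`, `monte_carlo_mean_var`). It covers neither IRF nor TEM, only the VTS linearization that both rely on.

```python
import math
import random


def corrupt(x, n):
    """Assumed log-spectral mismatch function: y = log(exp(x) + exp(n))."""
    # Numerically stable form of x + log(1 + exp(n - x)).
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def vts_mean_var(mu_x, var_x, mu_n, var_n):
    """First-order VTS approximation of the corrupted mean and variance."""
    # g is dy/dx at the expansion point (mu_x, mu_n); dy/dn equals 1 - g.
    g = 1.0 / (1.0 + math.exp(mu_n - mu_x))
    mu_y = corrupt(mu_x, mu_n)
    # Speech and noise are assumed independent, so the variances add.
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y


def monte_carlo_mean_var(mu_x, var_x, mu_n, var_n, num_samples=20000, seed=0):
    """Sample clean speech and noise, corrupt them, return sample moments."""
    rng = random.Random(seed)
    sd_x, sd_n = math.sqrt(var_x), math.sqrt(var_n)
    ys = [
        corrupt(rng.gauss(mu_x, sd_x), rng.gauss(mu_n, sd_n))
        for _ in range(num_samples)
    ]
    mean = sum(ys) / num_samples
    var = sum((y - mean) ** 2 for y in ys) / num_samples
    return mean, var


if __name__ == "__main__":
    params = (5.0, 0.01, 3.0, 0.01)  # illustrative values, not from the paper
    print("VTS approximation:", vts_mean_var(*params))
    print("Monte Carlo      :", monte_carlo_mean_var(*params))
```

With small variances the two sets of moments should nearly coincide; the gap grows as the variances increase, because the first-order expansion ignores curvature in the mismatch function.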


Bibliographic reference.  Zhao, Yong / Juang, Biing-Hwang (2010): "A comparative study of noise estimation algorithms for VTS-based robust speech recognition", In INTERSPEECH-2010, 2090-2093.