8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Estimating VTLN Warping Factors by Distribution Matching

Janne Pylkkönen

Helsinki University of Technology, Finland

Several methods exist for estimating the warping factors for vocal tract length normalization (VTLN), most of which rely on an exhaustive search over the warping factors to maximize the likelihood of the adaptation data. This paper presents a method for warping factor estimation that is based on matching Gaussian distributions by Kullback-Leibler divergence. It is computationally more efficient than most maximum likelihood methods and, above all, it allows speaker normalization to be incorporated very early in the training process, which can greatly simplify and speed up training. The estimation method is compared to the baseline maximum likelihood method in three large vocabulary continuous speech recognition tasks. The results confirm that the method performs well in a variety of tasks and configurations.
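
For illustration, the Kullback-Leibler divergence has a closed form when both distributions are Gaussian. The sketch below assumes diagonal covariances and a small hand-made set of candidate distributions keyed by hypothetical warping factors. The function names, the toy numbers, the direction of the divergence, and the pick-the-closest selection step are all assumptions made for this example; the paper's actual estimation procedure is not described in this abstract.

```python
import math


def kl_diag_gaussians(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two Gaussians with diagonal covariances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        # Per-dimension term: log variance ratio, scaled spread, minus one.
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def closest_candidate(target, candidates):
    """Return the key whose Gaussian has the smallest KL divergence to target.

    The direction KL(candidate || target) is an assumption of this sketch.
    """
    t_mean, t_var = target
    return min(
        candidates,
        key=lambda k: kl_diag_gaussians(candidates[k][0], candidates[k][1], t_mean, t_var),
    )


# Toy data (hypothetical): a target Gaussian and three candidates, each
# imagined as the speaker's data after warping with the given factor.
target = ([0.0, 1.0], [1.0, 1.0])
candidates = {
    0.9: ([0.5, 1.5], [1.0, 1.0]),
    1.0: ([0.1, 1.0], [1.0, 1.0]),
    1.1: ([-0.6, 0.4], [1.0, 1.0]),
}
best_factor = closest_candidate(target, candidates)
```

With these toy numbers the candidate keyed by 1.0 lies closest to the target, so `best_factor` is 1.0. A real system would obtain the candidate statistics from warped acoustic features instead of fixed values.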


Bibliographic reference. Pylkkönen, Janne (2007): "Estimating VTLN warping factors by distribution matching", in INTERSPEECH-2007, 270-273.