9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

A Computationally Efficient Approach to Warp Factor Estimation in VTLN Using EM Algorithm and Sufficient Statistics

P. T. Akhil, S. P. Rath, S. Umesh, D. R. Sanand

IIT Kanpur, India

In this paper, we develop a computationally efficient approach to warp factor estimation in Vocal Tract Length Normalization (VTLN). We have recently shown that warped features can be obtained by a linear transformation of the unwarped features. Using the corresponding warp matrices, we show that warp factor estimation can be performed efficiently in an EM framework: sufficient statistics are collected by aligning the unwarped utterances only once, and the likelihoods of the warped features, which are needed for warp factor estimation, are then computed by appropriately modifying these sufficient statistics with the warp matrices. Experiments on the OGI, TIDIGITS, and RM tasks show that this approach achieves recognition performance comparable to conventional VTLN while being computationally more efficient.
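The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians, frame posteriors from a single alignment of the unwarped features, and an EM auxiliary function that includes a log-determinant (Jacobian) term; the abstract states none of these details. The function names and the dictionary of candidate warp matrices are hypothetical, and the warp matrices themselves (which the paper takes from the authors' earlier work) are treated as given inputs.

```python
import numpy as np

def accumulate_stats(X, gamma, means, variances):
    """Collect sufficient statistics once from the UNWARPED features.

    X: (T, D) unwarped feature vectors
    gamma: (T, M) posterior of each Gaussian per frame (from one alignment)
    means, variances: (M, D) diagonal-covariance Gaussian parameters
    """
    inv_var = 1.0 / variances
    first = gamma.T @ X                                  # (M, D): sum_t gamma x
    second = np.einsum("tm,td,te->mde", gamma, X, X)     # (M, D, D): sum_t gamma x x^T
    # Per-output-dimension statistics (assumed form, as in row-wise linear transforms).
    G = np.einsum("mi,mde->ide", inv_var, second)        # (D, D, D)
    k = np.einsum("mi,md->id", means * inv_var, first)   # (D, D)
    return {"beta": gamma.sum(), "G": G, "k": k}

def auxiliary(A, stats, use_jacobian=True):
    """EM auxiliary value for warp matrix A, up to a constant independent of A."""
    quad = np.einsum("id,ide,ie->", A, stats["G"], A)    # sum_i a_i G_i a_i^T
    lin = np.einsum("id,id->", A, stats["k"])            # sum_i a_i k_i
    q = -0.5 * quad + lin
    if use_jacobian:  # assumption: Jacobian compensation is included
        q += stats["beta"] * np.linalg.slogdet(A)[1]
    return float(q)

def estimate_warp(stats, warp_matrices):
    """Pick the warp factor whose matrix maximizes the auxiliary value.

    warp_matrices: dict mapping warp factor -> (D, D) matrix (hypothetical input).
    """
    scores = {alpha: auxiliary(A, stats) for alpha, A in warp_matrices.items()}
    return max(scores, key=scores.get), scores
```

In this sketch the features are aligned once to obtain `gamma`, `accumulate_stats` is called once, and every candidate warp factor is then scored from the stored statistics without warping or re-aligning the utterance.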


Bibliographic reference. Akhil, P. T. / Rath, S. P. / Umesh, S. / Sanand, D. R. (2008): "A computationally efficient approach to warp factor estimation in VTLN using EM algorithm and sufficient statistics", in INTERSPEECH-2008, 1713-1716.