10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Refactoring Acoustic Models Using Variational Expectation-Maximization

Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen

IBM T.J. Watson Research Center, USA

In probabilistic modeling, it is often useful to change the structure, or refactor, a model, so that it has a different number of components, different parameter sharing, or other constraints. For example, we may wish to find a Gaussian mixture model (GMM) with fewer components that best approximates a reference model. Maximizing the likelihood of the refactored model under the reference model is equivalent to minimizing their KL divergence. For GMMs, this optimization is not analytically tractable. However, a lower bound to the likelihood can be maximized using a variational expectation-maximization algorithm. Automatic speech recognition provides a good framework to test the validity of such methods, because we can train reference models of any given size for comparison with refactored models. We show that we can efficiently reduce model size by 50%, with the same recognition performance as the corresponding model trained from data.
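
To make the bound concrete, the following Python sketch evaluates a standard Jensen-type variational lower bound on the expected log-likelihood of a smaller GMM g under a reference GMM f, and compares it with a Monte Carlo estimate of the same quantity. It is only an illustration under stated assumptions: the GMMs are one-dimensional, the function names and example parameters are hypothetical and not taken from the paper, and g is held fixed. A variational EM procedure would additionally re-estimate the parameters of g to increase the bound, which is not shown here.

```python
import numpy as np


def expected_log_gauss(mu_a, var_a, mu_b, var_b):
    """Closed form of E_{x ~ N(mu_a, var_a)}[log N(x; mu_b, var_b)] in 1-D."""
    return -0.5 * (np.log(2 * np.pi * var_b) + (var_a + (mu_a - mu_b) ** 2) / var_b)


def variational_lower_bound(pi, mu_f, var_f, omega, mu_g, var_g):
    """Lower bound on E_f[log g] for f = sum_a pi_a f_a, g = sum_b omega_b g_b.

    The variational weights phi_{b|a} are set to their optimal (softmax)
    values, which turns the bound into a log-sum-exp over components of g.
    """
    # E[a, b]: expected log-density of component b of g under component a of f
    E = expected_log_gauss(mu_f[:, None], var_f[:, None], mu_g[None, :], var_g[None, :])
    scores = np.log(omega)[None, :] + E
    m = scores.max(axis=1, keepdims=True)  # shift for numerical stability
    lse = m[:, 0] + np.log(np.exp(scores - m).sum(axis=1))
    return float(np.dot(pi, lse))


def monte_carlo_expected_loglik(pi, mu_f, var_f, omega, mu_g, var_g, n=200_000, seed=0):
    """Sampling estimate of E_f[log g], which has no closed form for GMMs."""
    rng = np.random.default_rng(seed)
    comp = rng.choice(len(pi), size=n, p=pi)           # pick a component of f
    x = rng.normal(mu_f[comp], np.sqrt(var_f[comp]))   # sample from it
    log_comp = np.log(omega)[None, :] - 0.5 * (
        np.log(2 * np.pi * var_g)[None, :]
        + (x[:, None] - mu_g[None, :]) ** 2 / var_g[None, :]
    )
    m = log_comp.max(axis=1, keepdims=True)
    log_g = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return float(log_g.mean())


# Hypothetical reference model f (4 components) and refactored model g (2 components)
pi = np.array([0.25, 0.25, 0.25, 0.25])
mu_f = np.array([-2.2, -1.8, 1.8, 2.2])
var_f = np.array([0.5, 0.5, 0.5, 0.5])
omega = np.array([0.5, 0.5])
mu_g = np.array([-2.0, 2.0])
var_g = np.array([0.54, 0.54])

bound = variational_lower_bound(pi, mu_f, var_f, omega, mu_g, var_g)
mc = monte_carlo_expected_loglik(pi, mu_f, var_f, omega, mu_g, var_g)
print("variational lower bound:", bound)
print("Monte Carlo estimate:   ", mc)
```

Up to sampling noise, the bound should not exceed the Monte Carlo estimate. Because the entropy of f does not depend on g, raising this bound with respect to the parameters of g also lowers an upper bound on the KL divergence from f to g.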

Bibliographic reference.  Dognin, Pierre L. / Hershey, John R. / Goel, Vaibhava / Olsen, Peder A. (2009): "Refactoring acoustic models using variational expectation-maximization", In INTERSPEECH-2009, 212-215.