ISCA Archive Interspeech 2009

Refactoring acoustic models using variational expectation-maximization

Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen

In probabilistic modeling, it is often useful to change the structure, or refactor, a model, so that it has a different number of components, different parameter sharing, or other constraints. For example, we may wish to find a Gaussian mixture model (GMM) with fewer components that best approximates a reference model. Maximizing the likelihood of the refactored model under the reference model is equivalent to minimizing their KL divergence. For GMMs, this optimization is not analytically tractable. However, a lower bound to the likelihood can be maximized using a variational expectation-maximization algorithm. Automatic speech recognition provides a good framework to test the validity of such methods, because we can train reference models of any given size for comparison with refactored models. We show that we can efficiently reduce model size by 50%, with the same recognition performance as the corresponding model trained from data.

doi: 10.21437/Interspeech.2009-78

Cite as: Dognin, P.L., Hershey, J.R., Goel, V., Olsen, P.A. (2009) Refactoring acoustic models using variational expectation-maximization. Proc. Interspeech 2009, 212-215, doi: 10.21437/Interspeech.2009-78

  author={Pierre L. Dognin and John R. Hershey and Vaibhava Goel and Peder A. Olsen},
  title={{Refactoring acoustic models using variational expectation-maximization}},
  booktitle={Proc. Interspeech 2009},