11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Restructuring Exponential Family Mixture Models

Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder Olsen

IBM T.J. Watson Research Center, USA

Variational KL (varKL) divergence minimization was previously applied to restructuring acoustic models (AMs) based on Gaussian mixture models, reducing their size while preserving their accuracy. In this paper, we derive a related varKL for exponential family mixture models (EMMs) and test its accuracy using the weighted local maximum likelihood agglomerative clustering technique. Minimizing varKL between a reference and a restructured AM previously led to the variational expectation maximization (varEM) algorithm, which we extend to EMMs. We present results on a clustering task using AMs trained on 50 hours of Broadcast News (BN). EMMs are trained on fMMI-PLP features combined with frame-level phone posterior probabilities given by the recently introduced sparse representation phone identification process. As we reduce model size, we measure the word error rate on the standard BN test set and compare with baseline models of the same size trained directly from data.

Bibliographic reference. Dognin, Pierre L. / Hershey, John R. / Goel, Vaibhava / Olsen, Peder (2010): "Restructuring exponential family mixture models", in INTERSPEECH-2010, 62-65.