7th International Conference on Spoken Language Processing
September 16-20, 2002
We consider a family of Gaussian mixture models for use in HMM-based speech recognition systems. These "SPAM" models have state-independent choices of subspaces to which the precision (inverse covariance) matrices and means are restricted to belong. They provide a flexible tool for robust, compact, and fast acoustic modeling. The focus of this paper is on the case where the means are unconstrained. The models in this case already generalize the recently introduced EMLLT models, which themselves interpolate between MLLT and full covariance models. We describe an algorithm to train both the state-dependent and state-independent parameters. Results are reported on one speech recognition task. The SPAM models are seen to yield significant improvements in accuracy over EMLLT models with comparable model size and runtime speed. We find that a 10% relative reduction in error rate over an MLLT model can be obtained while decreasing the acoustic modeling time by 20%.
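As a rough illustration of the subspace constraint on precision matrices, the sketch below assumes each state's precision is a linear combination of a small set of shared symmetric basis matrices, with only the combination weights depending on the state. This parameterization, the random basis, and the names `make_basis`, `precision_from_coeffs`, `quadratic_features`, and `log_gaussian` are all assumptions made for illustration; the abstract does not give the formula, and the training algorithm is not shown.

```python
import numpy as np


def make_basis(dim, num_random, seed=0):
    """Hypothetical state-independent basis: the identity plus random symmetric matrices."""
    rng = np.random.default_rng(seed)
    raw = rng.standard_normal((num_random, dim, dim))
    sym = 0.5 * (raw + raw.transpose(0, 2, 1))  # symmetrize each matrix
    return np.concatenate([np.eye(dim)[None], sym], axis=0)


def precision_from_coeffs(coeffs, basis):
    """State-dependent precision P = sum_k coeffs[k] * basis[k] (assumed form)."""
    return np.tensordot(coeffs, basis, axes=1)


def quadratic_features(x, basis):
    """Values x^T S_k x for every basis matrix; these do not depend on the state."""
    return np.einsum("i,kij,j->k", x, basis, x)


def log_gaussian(x, mean, precision):
    """Log density of a Gaussian given by its mean and precision matrix."""
    sign, logdet = np.linalg.slogdet(precision)
    if sign <= 0:
        raise ValueError("precision matrix must be positive definite")
    diff = x - mean
    dim = x.shape[0]
    return 0.5 * (logdet - dim * np.log(2.0 * np.pi) - diff @ precision @ diff)


if __name__ == "__main__":
    basis = make_basis(dim=4, num_random=3)
    # Illustrative weights: a large weight on the identity keeps P positive definite.
    coeffs = np.array([5.0, 0.05, -0.05, 0.05])
    precision = precision_from_coeffs(coeffs, basis)
    x = np.array([0.3, -1.2, 0.5, 2.0])
    mean = np.zeros(4)
    print(log_gaussian(x, mean, precision))
```

Under this assumed form, the quadratic term of the log-likelihood is a dot product between the state's weights and the shared values returned by `quadratic_features`, so those values can be computed once per frame and reused across all Gaussians. This is one plausible reading of how a subspace constraint can reduce acoustic modeling time.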
Bibliographic reference. Axelrod, Scott / Gopinath, Ramesh / Olsen, Peder (2002): "Modeling with a subspace constraint on inverse covariance matrices", In ICSLP-2002, 2177-2180.