8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Gaussian Mixture Optimization for HMM Based on Efficient Cross-Validation

Takahiro Shinozaki, Tatsuya Kawahara

Kyoto University, Japan

A Gaussian mixture optimization method is explored that uses cross-validation likelihood as the objective function instead of the conventional training set likelihood. The optimization reduces the number of mixture components step by step, selecting and merging one pair of Gaussians at a time based on the objective function, so as to remove redundant components and improve the generality of the model. Cross-validation likelihood is more appropriate for avoiding over-fitting than the conventional likelihood and can be computed efficiently using sufficient statistics. It results in a better Gaussian pair selection and provides a termination criterion that does not rely on empirical thresholds. Large-vocabulary speech recognition experiments on oral presentations show that the cross-validation method gives a smaller word error rate, with an automatically determined model size, than a baseline training procedure that does not perform the optimization.
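
The idea of scoring a candidate merge by cross-validation likelihood computed from sufficient statistics can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional data, hard assignment of samples to Gaussians, and no mixture weights, and every function name (`fold_stats`, `cv_loglik`, `merge_gain`) and the `var_floor` parameter are hypothetical. Each Gaussian keeps per-fold statistics (count, sum, sum of squares); for each fold, the Gaussian is estimated from the statistics of the other folds and scored on the held-out fold, so no pass over the raw data is repeated.

```python
import math

def fold_stats(xs):
    """Sufficient statistics (count, sum, sum of squares) of 1-D samples."""
    return (len(xs), sum(xs), sum(x * x for x in xs))

def add_stats(a, b):
    return tuple(p + q for p, q in zip(a, b))

def sub_stats(a, b):
    return tuple(p - q for p, q in zip(a, b))

def gaussian_loglik(stats, mean, var):
    """Log-likelihood of the samples summarized by `stats` under N(mean, var)."""
    n, s1, s2 = stats
    sq_dev = s2 - 2.0 * mean * s1 + n * mean * mean  # sum of (x - mean)^2
    return -0.5 * (n * math.log(2.0 * math.pi * var) + sq_dev / var)

def cv_loglik(folds, var_floor=1e-3):
    """K-fold cross-validation log-likelihood of one Gaussian.

    `folds` is a list of per-fold statistics. The variance floor is an
    assumption of this sketch, used only to keep the estimate positive.
    """
    total = (0, 0.0, 0.0)
    for stats in folds:
        total = add_stats(total, stats)
    score = 0.0
    for held_out in folds:
        n, s1, s2 = sub_stats(total, held_out)  # statistics of the other folds
        if n == 0:
            continue  # nothing to estimate from; skip this fold
        mean = s1 / n
        var = max(s2 / n - mean * mean, var_floor)
        score += gaussian_loglik(held_out, mean, var)
    return score

def merge_gain(folds_a, folds_b):
    """Change in cross-validation log-likelihood if two Gaussians are merged.

    Under hard assignment, the merged Gaussian's statistics are the
    fold-wise sums of the two components' statistics.
    """
    merged = [add_stats(a, b) for a, b in zip(folds_a, folds_b)]
    return cv_loglik(merged) - (cv_loglik(folds_a) + cv_loglik(folds_b))

# Toy data: three folds per component.
near_zero = [fold_stats(f) for f in ([-0.5, 0.5], [-0.4, 0.6], [-0.6, 0.4])]
near_ten = [fold_stats(f) for f in ([9.5, 10.5], [9.6, 10.6], [9.4, 10.4])]
print(merge_gain(near_zero, near_ten))   # well-separated pair
print(merge_gain(near_zero, near_zero))  # redundant pair
```

In a greedy loop built on this sketch, the pair with the largest gain would be merged at each step, and one natural stopping rule is to halt once no pair has a positive gain, which mirrors the threshold-free termination described above.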

Bibliographic reference. Shinozaki, Takahiro / Kawahara, Tatsuya (2007): "Gaussian mixture optimization for HMM based on efficient cross-validation", in INTERSPEECH-2007, 2061-2064.