10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Large Margin Estimation of Gaussian Mixture Model Parameters with Extended Baum-Welch for Spoken Language Recognition

Donglai Zhu, Bin Ma, Haizhou Li

Institute for Infocomm Research, Singapore

Discriminative training (DT) methods for acoustic models, such as SVMs and MMI-trained GMMs, have proven effective in spoken language recognition. In this paper we propose a DT method for GMMs based on large margin (LM) estimation. Unlike traditional MMI or MCE methods, LM estimation aims to improve the generalization ability of the GMM so that it can handle new data that is mismatched with the training data. We define the multi-class separation margin as a function of GMM likelihoods and derive update formulae for the GMM parameters with the extended Baum-Welch algorithm. Results on the NIST Language Recognition Evaluation (LRE) 2007 task show that LM estimation achieves better performance and faster convergence than MMI estimation.
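The abstract does not state the margin formula, so the following is only a minimal sketch of one common way to define a multi-class separation margin from GMM likelihoods: the log-likelihood under the correct language model minus the best log-likelihood among the competing models. The diagonal-covariance assumption, the per-frame averaging, and the names `gmm_log_likelihood` and `separation_margin` are illustrative choices, not taken from the paper.

```python
import numpy as np


def gmm_log_likelihood(x, weights, means, variances):
    """Average per-frame log-likelihood of frames x (T, D) under a
    diagonal-covariance GMM with M components.

    weights: (M,), means: (M, D), variances: (M, D).
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]              # (T, M, D)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2) + log_norm)
    log_comp = log_comp + np.log(weights)                 # (T, M)
    # Log-sum-exp over components, stabilized by subtracting the row maximum.
    peak = log_comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return float(frame_ll.mean())


def separation_margin(x, models, true_label):
    """Assumed margin: score of the true language minus the best rival score.

    models maps a language label to a (weights, means, variances) tuple.
    A positive value means the utterance is classified correctly; larger
    values mean it lies further from the decision boundary.
    """
    scores = {lang: gmm_log_likelihood(x, *params)
              for lang, params in models.items()}
    best_rival = max(s for lang, s in scores.items() if lang != true_label)
    return scores[true_label] - best_rival
```

Under this definition, a large margin training criterion would try to increase the smallest margins over the training utterances, whereas MMI maximizes the posterior of the correct class; how the paper turns its own margin into extended Baum-Welch updates is not recoverable from the abstract.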


Bibliographic reference. Zhu, Donglai / Ma, Bin / Li, Haizhou (2009): "Large margin estimation of Gaussian mixture model parameters with extended Baum-Welch for spoken language recognition", in INTERSPEECH-2009, 2179-2182.