5th International Conference on Spoken Language Processing
In this paper, we present a new discriminative training method for Gaussian Mixture Models (GMM) and its application to text-independent speaker recognition. The objective of the method is to maximize the frame-level normalized likelihoods of the training data, hence the name Maximum Normalized Likelihood Estimation (MNLE). In contrast to other discriminative algorithms, the objective function is optimized using a modified Expectation-Maximization (EM) algorithm, which greatly simplifies the training procedure. Evaluation experiments on both clean and telephone speech showed improved recognition rates compared with speaker models trained by Maximum Likelihood Estimation (MLE), especially when the mismatch between training and testing conditions is significant.
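The abstract does not state the objective function explicitly. As a minimal sketch, assuming that the frame-level normalized likelihood of a speaker model is its likelihood for a frame divided by the sum of the likelihoods of all speaker models for that frame, the quantity could be computed as below. The function names, the diagonal-covariance form, and the `(weights, means, variances)` model layout are illustrative assumptions, not the authors' implementation, and the modified EM update itself is not shown.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one frame x under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

def normalized_likelihoods(x, models):
    """Assumed normalization: each model's likelihood of frame x divided by
    the sum of the likelihoods of all models for the same frame."""
    logs = [gmm_log_likelihood(x, *model) for model in models]
    m = max(logs)
    exps = [math.exp(v - m) for v in logs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_log_normalized_likelihood(frames, models, target):
    """Average log normalized likelihood of the target model over all frames.
    A discriminative criterion of this kind is larger when the target model
    explains the frames better than the competing models do."""
    scores = [math.log(normalized_likelihoods(x, models)[target]) for x in frames]
    return sum(scores) / len(scores)

# Hypothetical toy data: two speakers, each a one-component, one-dimensional GMM.
models = [
    ([1.0], [[0.0]], [[1.0]]),  # speaker 0, mean 0
    ([1.0], [[3.0]], [[1.0]]),  # speaker 1, mean 3
]
frames = [[0.0], [0.2]]
score_speaker0 = mean_log_normalized_likelihood(frames, models, 0)
score_speaker1 = mean_log_normalized_likelihood(frames, models, 1)
```

With frames near the mean of speaker 0, `score_speaker0` exceeds `score_speaker1`. Unlike the plain likelihood used in MLE, this score depends on the competing models as well as on the target model.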
Bibliographic reference. Markov, Konstantin P. / Nakagawa, Seiichi (1998): "Discriminative training of GMM using a modified EM algorithm for speaker recognition", In ICSLP-1998, paper 0745.