ODYSSEY 2004 - The Speaker and Language Recognition Workshop
May 31 - June 3, 2004
EM training of Gaussian mixture models (GMMs) often suffers from local maxima and singularities in the likelihood space. In this paper, we present a new Modified Split-and-Merge EM algorithm (MSMEM) for speaker verification, which performs split-and-merge operations to escape local maxima and reduce the chance of generating singularities. With two modified criteria for selecting split-and-merge candidates in the speaker verification task, the overall likelihoods of both training and testing data are improved. Furthermore, a modified adaptive variance flooring is introduced into the new EM procedure. Experiments on synthetic data show the advantages of MSMEM, and global-threshold EER results on a speaker verification task using the TIMIT database confirm the improvement in system performance.
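The two ingredients named in the abstract, merge-candidate selection and variance flooring, can be illustrated with a minimal sketch for a one-dimensional GMM. This is not the MSMEM algorithm: the paper's modified criteria and its adaptive flooring rule are not given in the abstract. The sketch assumes the merge criterion of the original split-and-merge EM (the inner product of the posterior vectors of two components) and a simple fixed floor equal to a fraction of the global data variance. The function names and the `floor_ratio` parameter are hypothetical.

```python
import numpy as np

def responsibilities(x, weights, means, variances):
    """E-step for a 1-D GMM: posterior of each component for each sample."""
    x = np.asarray(x, dtype=float)[:, None]                 # shape (N, 1)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    log_pdf = -0.5 * (np.log(2 * np.pi * variances) + (x - means) ** 2 / variances)
    log_joint = np.log(weights) + log_pdf                   # shape (N, K)
    log_joint -= log_joint.max(axis=1, keepdims=True)       # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def m_step_with_floor(x, post, floor_ratio=0.01):
    """M-step with a fixed variance floor (assumption: floor_ratio * global variance)."""
    x = np.asarray(x, dtype=float)
    nk = post.sum(axis=0) + 1e-12                           # guard against empty components
    weights = nk / len(x)
    means = (post * x[:, None]).sum(axis=0) / nk
    variances = (post * (x[:, None] - means) ** 2).sum(axis=0) / nk
    floor = floor_ratio * x.var()
    # Flooring keeps a component from collapsing onto a single point (a singularity).
    return weights, means, np.maximum(variances, floor)

def merge_candidates(post):
    """Rank component pairs by the inner product of their posterior vectors.

    Assumption: this is the original split-and-merge EM merge criterion,
    not the modified criterion of the paper.
    """
    k = post.shape[1]
    scores = post.T @ post
    pairs = [(float(scores[i, j]), i, j) for i in range(k) for j in range(i + 1, k)]
    return sorted(pairs, reverse=True)
```

Pairs with a large score have similar posteriors on most samples, so they model the same region of the data and are natural merge candidates; the freed component can then be used to split a poorly fitting one.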
Bibliographic reference. Zhang, Yongxin / Scordilis, Michael S. (2004): "Optimization of GMM training for speaker verification", In ODYS-2004, 231-236.