ISCA Archive Interspeech 2005

Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition

Jing Deng, Thomas Fang Zheng, Zhanjiang Song, Jian Liu

The Gaussian mixture model-universal background model (GMM-UBM) has been dominant in text-independent speaker recognition tasks. However, the conventional GMM-UBM method assumes that each Gaussian mixture is independent, and so ignores useful high-level speaker-dependent characteristics, such as word usage or speaking habits, that exist among the Gaussian mixtures. Building on the GMM-UBM method, we propose using Gaussian mixture correlation to model this high-level information for speaker recognition tasks. In this method, we first cluster the Gaussian mixtures of the UBM into a small number of classes according to their mean vectors. We then learn a universal class transition probability matrix (UCTPM), which helps model the high-level speaker characteristics embedded in Gaussian mixture correlation. During the training phase, a speaker-dependent class transition probability matrix is adapted from the UCTPM. Experiments on two different databases show that an average error rate reduction (ERR) of 20.38% can be achieved compared with the conventional GMM-UBM method.
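
The steps named in the abstract (clustering the UBM mean vectors into classes, estimating a class transition matrix, and adapting it per speaker) can be illustrated with a minimal Python sketch. The abstract does not give the clustering algorithm, the frame-to-mixture assignment rule, or the adaptation formula, so everything below is an assumption made for illustration: plain k-means on the means, nearest-mean assignment in place of a full GMM posterior, additive flooring of the counts, and a MAP-style interpolation with a hypothetical `relevance` factor. All function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np

def kmeans(means, k, iters=20, seed=0):
    """Cluster UBM mean vectors into k classes (plain k-means; an assumption)."""
    rng = np.random.default_rng(seed)
    centers = means[rng.choice(len(means), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every mean to every center.
        dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for c in range(k):
            if np.any(labels == c):  # keep the old center if a class is empty
                centers[c] = means[labels == c].mean(0)
    return labels

def class_sequence(frames, means, labels):
    """Map each frame to the class of its nearest UBM mean (a simplification)."""
    dist = ((frames[:, None, :] - means[None, :, :]) ** 2).sum(-1)
    return labels[dist.argmin(1)]

def transition_counts(seq, k):
    """Count class-to-class transitions between consecutive frames."""
    counts = np.zeros((k, k))
    np.add.at(counts, (seq[:-1], seq[1:]), 1.0)
    return counts

def normalize(counts, floor=1e-3):
    """Turn counts into a row-stochastic matrix; the floor avoids zero rows."""
    smoothed = counts + floor
    return smoothed / smoothed.sum(1, keepdims=True)

def adapt(uctpm, speaker_counts, relevance=16.0):
    """MAP-style interpolation of speaker counts with the universal matrix
    (hypothetical rule; the paper's adaptation formula is not given here)."""
    mixed = speaker_counts + relevance * uctpm
    return mixed / mixed.sum(1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ubm_means = rng.normal(size=(32, 4))    # stand-in for trained UBM means
    background = rng.normal(size=(500, 4))  # stand-in for background frames
    speaker = rng.normal(size=(50, 4))      # stand-in for enrollment frames

    labels = kmeans(ubm_means, k=4)
    uctpm = normalize(transition_counts(
        class_sequence(background, ubm_means, labels), 4))
    speaker_tpm = adapt(uctpm, transition_counts(
        class_sequence(speaker, ubm_means, labels), 4))
    print(uctpm.shape, speaker_tpm.sum(1))
```

In this sketch, a speaker with little enrollment data stays close to the universal matrix, while more data pulls each row toward that speaker's own transition statistics. How the resulting matrix is combined with the GMM-UBM score is not specified in the abstract and is left out.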


doi: 10.21437/Interspeech.2005-636

Cite as: Deng, J., Zheng, T.F., Song, Z., Liu, J. (2005) Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition. Proc. Interspeech 2005, 2033-2036, doi: 10.21437/Interspeech.2005-636

@inproceedings{deng05c_interspeech,
  author={Jing Deng and Thomas Fang Zheng and Zhanjiang Song and Jian Liu},
  title={{Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2033--2036},
  doi={10.21437/Interspeech.2005-636}
}