ISCA Archive Interspeech 2005

Improved covariance modeling for GMM in speaker identification

Xi Zhou, Zhi-qiang Yao, Beiqian Dai

A Gaussian Mixture Model (GMM) with diagonal covariance matrices is commonly used in text-independent speaker identification. However, a diagonal covariance matrix implies the strong assumption that the feature elements are independent. Although Gaussian mixtures with diagonal covariances can model correlation to some extent, the model precision is still limited. To alleviate this problem, this paper proposes a framework for sharing linear transformations among the components and introduces a new unsupervised hierarchical clustering algorithm to implement it. In the framework, the full covariance of each component is represented by a shared linear transformation and a component-specific diagonal covariance. Different approaches to estimating the linear transformation, i.e., PCA, LDA and MLLT, are proposed and compared. Experiments show that our algorithm, with each of these approaches, achieves a significant reduction in identification error over the best diagonal covariance models.
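The covariance structure described in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes one common parameterization, which the abstract does not specify: a shared square matrix A maps a feature vector x to y = A x, each component is a diagonal Gaussian in y, and the implied full covariance in x is inv(A) diag(var) inv(A)^T. All names (A, var, mean_y, and the function names) are hypothetical, and A is random here rather than estimated by PCA, LDA or MLLT.

```python
import numpy as np


def log_gauss_diag(y, mean, var):
    """Log density of a Gaussian with diagonal covariance (variances in `var`)."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var)


def log_gauss_shared_transform(x, mean_y, var, A):
    """Log density of x when y = A @ x follows a diagonal Gaussian.

    Assumed parameterization: the covariance of x is
    inv(A) @ diag(var) @ inv(A).T, so the density of x is the diagonal
    density of y times |det A| (change of variables).
    """
    _, log_abs_det = np.linalg.slogdet(A)
    return log_abs_det + log_gauss_diag(A @ x, mean_y, var)


def log_gauss_full(x, mean, cov):
    """Reference log density of a Gaussian with a full covariance matrix."""
    dim = x.shape[0]
    diff = x - mean
    _, log_det = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (dim * np.log(2.0 * np.pi) + log_det + maha)


rng = np.random.default_rng(0)
dim = 4

# Hypothetical shared transform; adding dim * I keeps it well conditioned.
A = rng.normal(size=(dim, dim)) + dim * np.eye(dim)

# Component-specific parameters, defined in the transformed space.
var = rng.uniform(0.5, 2.0, size=dim)
mean_y = rng.normal(size=dim)

# Equivalent full-covariance parameters in the original feature space.
A_inv = np.linalg.inv(A)
cov_full = A_inv @ np.diag(var) @ A_inv.T
mean_x = A_inv @ mean_y

x = rng.normal(size=dim)
ll_shared = log_gauss_shared_transform(x, mean_y, var, A)
ll_full = log_gauss_full(x, mean_x, cov_full)
```

Under this assumption the two log-likelihoods agree up to floating-point error, while each component stores only a mean and a variance vector; the d-by-d transform is stored once for every group of components that shares it.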

doi: 10.21437/Interspeech.2005-669

Cite as: Zhou, X., Yao, Z.-q., Dai, B. (2005) Improved covariance modeling for GMM in speaker identification. Proc. Interspeech 2005, 3113-3116, doi: 10.21437/Interspeech.2005-669

  author={Xi Zhou and Zhi-qiang Yao and Beiqian Dai},
  title={{Improved covariance modeling for GMM in speaker identification}},
  booktitle={Proc. Interspeech 2005},