Interspeech'2005 - Eurospeech
Gaussian mixture models (GMMs) with diagonal covariance matrices are commonly used in text-independent speaker identification. However, a diagonal covariance matrix implies the strong assumption that the feature elements are independent. Although Gaussian mixtures with diagonal covariances can model correlation to some extent, the model precision is still limited. To alleviate this problem, this paper proposes a framework for sharing linear transformations among the components and introduces a new unsupervised hierarchical clustering algorithm to implement it. In the framework, the full covariance of each component is represented by a shared linear transformation and a component-specific diagonal covariance. Different linear transformation estimation approaches, namely PCA, LDA and MLLT, are proposed and compared. Experiments show that our algorithm, with each of the approaches, achieves a significant reduction in identification error over the best diagonal covariance models.
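The covariance structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the parameterization (covariance A^{-1} diag(var) A^{-T} for a shared transform A), the 2-D toy values, and every function name below are assumptions made for illustration, and the clustering step that decides which components share a transform is not shown. Under these assumptions, a component's log-density equals log|det A| plus a diagonal-Gaussian log-density evaluated on the transformed features.

```python
import math

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]

def det2(M):
    """Determinant of a 2x2 matrix (this sketch is 2-D only)."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def shared_transform_logpdf(x, A, mean, var):
    """Log-density of a component whose covariance is A^{-1} diag(var) A^{-T}."""
    y = matvec(A, x)      # features in the transformed space
    m = matvec(A, mean)   # component mean in the transformed space
    log_diag = sum(-0.5 * (math.log(2 * math.pi * v) + (yi - mi) ** 2 / v)
                   for yi, mi, v in zip(y, m, var))
    return math.log(abs(det2(A))) + log_diag  # Jacobian term + diagonal Gaussian

def implied_covariance(A, var):
    """Full covariance A^{-1} diag(var) A^{-T} implied by the factorization."""
    B = inv2(A)
    return [[sum(B[i][k] * var[k] * B[j][k] for k in range(2))
             for j in range(2)] for i in range(2)]

def full_cov_logpdf(x, mean, cov):
    """Reference 2-D Gaussian log-density with an explicit full covariance."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    q = sum(di * ti for di, ti in zip(d, matvec(inv2(cov), d)))
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det2(cov)) + q)

def gmm_loglik(x, A, weights, means, variances):
    """Log-likelihood of x under a GMM whose components all share A."""
    logs = [math.log(w) + shared_transform_logpdf(x, A, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

# Hypothetical toy parameters: one shared transform, two components.
A = [[1.0, 0.5], [-0.3, 1.0]]
weights = [0.4, 0.6]
means = [[0.0, 0.0], [2.0, -1.0]]
variances = [[1.0, 0.5], [0.3, 2.0]]
x = [0.7, -0.2]
```

In this sketch each component stores only a mean and a variance vector, while the transform is stored once per group of components, which is where the saving over per-component full covariances would come from.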
Bibliographic reference. Zhou, Xi / Yao, Zhi-qiang / Dai, Beiqian (2005): "Improved covariance modeling for GMM in speaker identification", In INTERSPEECH-2005, 3113-3116.