ISCA Archive ICSLP 2000

Principal mixture speaker adaptation for improved continuous speech recognition

Hui Ye, Pascale Fung, Taiyi Huang

Nowadays, almost all speaker-independent (SI) speech recognition systems use continuous-density HMMs (CDHMMs) with multivariate Gaussian mixture observation densities to cover speaker variability. It has been shown that, given sufficient training data, the more mixture components are used in the HMM observation density, the better the system performs. However, an acoustic HMM with more Gaussian densities is more complex and slows down recognition. Another efficient way to handle speaker variation is speaker adaptation (SA). Yet even though speaker adaptation of the full multivariate Gaussian mixture densities can increase recognition accuracy, it does not improve recognition speed. In this paper, we introduce a principal mixture speaker adaptation method that reduces HMM complexity by choosing only the principal mixtures corresponding to a particular speaker’s characteristics. We show that our method both improves recognition accuracy by 31.8% compared to SI models and reduces recognition time by 30% compared to full mixture SA models.
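
The abstract does not say how the principal mixtures are chosen, so the following is only an illustrative sketch of one plausible reading, not the authors' procedure. It assumes that each HMM state has a list of mixture weights and that a per-component occupancy count has been accumulated from a speaker's adaptation data; the function name `select_principal_mixtures` and the parameters `occupancy` and `keep` are hypothetical. The sketch keeps the components the speaker uses most and renormalizes their weights.

```python
from typing import List, Tuple


def select_principal_mixtures(
    weights: List[float],
    occupancy: List[float],
    keep: int,
) -> List[Tuple[int, float]]:
    """Keep the `keep` components with the highest occupancy for one state.

    Assumption (not from the paper): `occupancy[i]` is the accumulated
    posterior count of component i on the speaker's adaptation data.
    Returns (component index, renormalized weight) pairs in index order.
    """
    if len(weights) != len(occupancy):
        raise ValueError("weights and occupancy must have the same length")
    if not 1 <= keep <= len(weights):
        raise ValueError("keep must be between 1 and the number of components")

    # Rank components by how much this speaker's data used them.
    ranked = sorted(range(len(weights)), key=lambda i: occupancy[i], reverse=True)
    kept = sorted(ranked[:keep])

    # Renormalize the surviving weights so they sum to 1 again.
    total = sum(weights[i] for i in kept)
    if total <= 0.0:
        raise ValueError("selected components have zero total weight")
    return [(i, weights[i] / total) for i in kept]


# Hypothetical state with four components; this speaker mostly uses 0 and 2.
pruned = select_principal_mixtures(
    weights=[0.4, 0.3, 0.2, 0.1],
    occupancy=[5.0, 0.5, 12.0, 0.1],
    keep=2,
)
```

With fewer Gaussians per state, each frame requires fewer likelihood evaluations, which is the kind of saving the abstract attributes to the reduced model. The actual selection criterion used in the paper may differ from the occupancy ranking assumed here.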


Cite as: Ye, H., Fung, P., Huang, T. (2000) Principal mixture speaker adaptation for improved continuous speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 774-777

@inproceedings{ye00_icslp,
  author={Hui Ye and Pascale Fung and Taiyi Huang},
  title={{Principal mixture speaker adaptation for improved continuous speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 774-777}
}