ISCA Archive Interspeech 2008

Adaptive decision tree-based phone cluster models for speaker clustering

Chia-Hsin Hsieh, Chung-Hsien Wu, Han-Ping Shen

This study presents an approach to speaker clustering using adaptive decision tree-based phone cluster models (DT-PCMs). First, a large broadcast news database is used to train a set of universal phone models shared across speakers. The multi-space probability distribution hidden Markov model (MSD-HMM) is adopted for phone modeling. Confusable phone models are merged into phone clusters. Next, for each state in the phone MSD-HMMs, a decision tree is constructed to store the contextual, phonetic, and speaker characteristics for data sharing over all speakers. For speaker clustering, each input speech segment is used to retrieve the Gaussian models from the DT-PCMs to construct the initial speaker-dependent phone cluster models. Finally, all the corresponding adapted speaker-dependent phone cluster models are used for speaker clustering via a cross-likelihood ratio measure. The experimental results show that the DT-PCMs outperform the conventional GMM-based approach.
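To illustrate the cross-likelihood ratio measure named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used symmetric form of the ratio, in which each segment is scored against the other segment's model and normalized by a background model. It also assumes, purely for brevity, that each speaker-dependent model is a single diagonal Gaussian rather than an adapted phone cluster model; the function names `fit_gaussian`, `avg_loglik`, and `cross_likelihood_ratio` are hypothetical.

```python
import math

def fit_gaussian(frames):
    """Fit a diagonal Gaussian (per-dimension mean and variance) to feature frames.

    Assumption: a single Gaussian stands in for the adapted
    speaker-dependent phone cluster models described in the abstract.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Floor the variance so degenerate segments do not divide by zero.
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def avg_loglik(frames, model):
    """Average per-frame log-likelihood of the frames under a diagonal Gaussian."""
    mean, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def cross_likelihood_ratio(seg_a, seg_b, background):
    """Symmetric cross-likelihood ratio between two segments.

    Assumption: the background model normalizes both cross terms;
    the abstract does not state which variant the paper uses.
    """
    model_a = fit_gaussian(seg_a)
    model_b = fit_gaussian(seg_b)
    return (avg_loglik(seg_a, model_b) - avg_loglik(seg_a, background)
            + avg_loglik(seg_b, model_a) - avg_loglik(seg_b, background))
```

Under these assumptions, a higher score suggests that two segments come from the same speaker, so a clustering procedure could merge the pair with the highest score first.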

doi: 10.21437/Interspeech.2008-276

Cite as: Hsieh, C.-H., Wu, C.-H., Shen, H.-P. (2008) Adaptive decision tree-based phone cluster models for speaker clustering. Proc. Interspeech 2008, 861-864, doi: 10.21437/Interspeech.2008-276

  author={Chia-Hsin Hsieh and Chung-Hsien Wu and Han-Ping Shen},
  title={{Adaptive decision tree-based phone cluster models for speaker clustering}},
  booktitle={Proc. Interspeech 2008},