ISCA Archive ICSLP 2000

Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR

Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin

This paper introduces a method that better maximizes likelihood in state decision-tree clustering under a continuous-density hidden Markov model (CDHMM) framework. Under the maximum likelihood (ML) criterion, the conventional triphone clustering process based on phonetic context rules is re-examined by checking how well each triphone fits the tree-node class to which it was assigned by its yes/no answer to the phonetic context question. If a triphone fits the other class better (by a certain margin) under the ML criterion, its class membership is reassigned to the better-fitting class. This method, applied to the tree nodes at every level during the tree-building process, improves the overall likelihood of the tree and should therefore help improve final system performance. A comparison experiment on the WSJ 20k task shows that the proposed method reduces the word error rate (WER) from 11.9%, obtained with the conventional decision tree, to 11.2%, a relative reduction of about 6%.
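The reassignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact fit measure, so the sketch assumes each class is modeled by a single Gaussian estimated from pooled sufficient statistics (one-dimensional features for brevity) and moves a triphone only when the move raises the summed log-likelihood of the two classes by more than a margin. The names `Stats`, `loglik`, `pooled`, `split_with_reassignment`, `margin`, and `max_passes`, as well as the toy triphones and the nasal question, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stats:
    """Sufficient statistics of one triphone state (1-D features, an assumption)."""
    n: float   # occupancy count
    s: float   # sum of observations
    ss: float  # sum of squared observations

    def __add__(self, other):
        return Stats(self.n + other.n, self.s + other.s, self.ss + other.ss)


def loglik(st, var_floor=1e-3):
    """Log-likelihood of the pooled data under its own ML single Gaussian."""
    if st.n <= 0:
        return 0.0
    mean = st.s / st.n
    var = max(st.ss / st.n - mean * mean, var_floor)
    return -0.5 * st.n * (1.0 + math.log(2.0 * math.pi * var))


def pooled(stats, names):
    """Sum the statistics of all triphones in a class."""
    total = Stats(0.0, 0.0, 0.0)
    for name in names:
        total = total + stats[name]
    return total


def split_with_reassignment(stats, question, margin=0.0, max_passes=10):
    """Split by a yes/no question, then greedily move better-fitting members."""
    yes = {t for t in stats if question(t)}
    no = set(stats) - yes
    for _ in range(max_passes):
        moved = False
        for src, dst in ((yes, no), (no, yes)):
            for t in sorted(src):          # sorted() copies, so src may change
                if len(src) <= 1:          # never empty a class
                    break
                before = loglik(pooled(stats, src)) + loglik(pooled(stats, dst))
                after = (loglik(pooled(stats, src - {t}))
                         + loglik(pooled(stats, dst | {t})))
                if after - before > margin:
                    src.remove(t)
                    dst.add(t)
                    moved = True
        if not moved:
            break
    return yes, no


# Toy data: "ng-a+t" answers "yes" to the nasal question but its
# statistics resemble the non-nasal class (mean near 5 instead of 0).
toy = {
    "m-a+t": Stats(10, 0.0, 10.0),
    "n-a+t": Stats(10, 2.0, 10.4),
    "ng-a+t": Stats(10, 51.0, 270.1),
    "p-a+t": Stats(10, 50.0, 260.0),
    "b-a+t": Stats(10, 52.0, 280.4),
}
is_left_nasal = lambda t: t.split("-")[0] in {"m", "n", "ng"}
yes_class, no_class = split_with_reassignment(toy, is_left_nasal)
print(sorted(yes_class), sorted(no_class))
```

In this toy case the rule-based split places `ng-a+t` in the "yes" class, and the reassignment pass moves it to the "no" class because the move increases the total log-likelihood. A positive `margin` would correspond to requiring the better fit "by a certain margin" before overriding the phonetic rule.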


Cite as: Yuan, B., Zhao, Q., Guo, Q., Zhang, X., Lin, Z. (2000) Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 1035-1038

@inproceedings{yuan00_icslp,
  author={Baosheng Yuan and Qingwei Zhao and Qing Guo and Xiangdong Zhang and Zhiwei Lin},
  title={{Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={2},
  pages={1035--1038}
}