Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Optimal Maximum Likelihood on Phonetic Decision Tree Acoustic Model for LVCSR

Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin

Intel China Research Center, Beijing, China

This paper introduces a method that better maximizes likelihood in state decision-tree clustering under a continuous density hidden Markov model (CDHMM) framework. Under the maximum likelihood (ML) criterion, the conventional triphone clustering process based on phonetic context rules is re-examined by checking how well each triphone fits the tree node class to which it was assigned by its yes/no answer to the phonetic context question. If a triphone fits the other class better, by a certain margin under the ML criterion, its class membership is reassigned to the better-fitting class. Applied at every level of tree nodes during tree building, this method can improve the overall likelihood of the tree and should therefore help improve final system performance. A comparison experiment on the WSJ 20k task shows that the proposed method cuts word error rate (WER) by 6% relative, from the 11.9% obtained with a conventional decision tree to 11.2%.
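The reassignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each triphone state is summarized by sufficient statistics, that each class is modeled by a single diagonal-covariance Gaussian (a common simplification in decision-tree state clustering), and that "fits better" means moving the triphone raises the summed log-likelihood of the two classes by more than a threshold. The names `Stats`, `loglik`, `reassign`, `VAR_FLOOR`, and the toy triphone labels are all hypothetical.

```python
import math
from dataclasses import dataclass

VAR_FLOOR = 1e-4  # assumed variance floor; keeps log() finite


@dataclass(frozen=True)
class Stats:
    """Sufficient statistics of one triphone state, or of a pool of them."""
    n: float   # frame count (occupancy)
    s: tuple   # per-dimension sum of observations
    ss: tuple  # per-dimension sum of squared observations

    def __add__(self, o):
        return Stats(self.n + o.n,
                     tuple(a + b for a, b in zip(self.s, o.s)),
                     tuple(a + b for a, b in zip(self.ss, o.ss)))

    def __sub__(self, o):
        return Stats(self.n - o.n,
                     tuple(a - b for a, b in zip(self.s, o.s)),
                     tuple(a - b for a, b in zip(self.ss, o.ss)))


def stats_from(frames):
    """Accumulate statistics from a list of equal-length feature vectors."""
    dim = len(frames[0])
    return Stats(float(len(frames)),
                 tuple(sum(f[i] for f in frames) for i in range(dim)),
                 tuple(sum(f[i] ** 2 for f in frames) for i in range(dim)))


def loglik(st):
    """Log-likelihood of pooled data under its own ML diagonal Gaussian
    (exact when no variance is floored)."""
    if st.n <= 0:
        return 0.0
    log_det = 0.0
    for s, ss in zip(st.s, st.ss):
        mean = s / st.n
        log_det += math.log(max(ss / st.n - mean * mean, VAR_FLOOR))
    return -0.5 * st.n * (len(st.s) * (1.0 + math.log(2.0 * math.pi)) + log_det)


def pool(items, dim):
    """Sum the statistics of all triphones in one class."""
    total = Stats(0.0, (0.0,) * dim, (0.0,) * dim)
    for st in items:
        total = total + st
    return total


def reassign(yes, no, threshold=0.0):
    """Move a triphone across the yes/no split when the move raises the
    summed log-likelihood of both classes by more than `threshold`."""
    yes, no = dict(yes), dict(no)
    dim = len(next(iter({**yes, **no}.values())).s)
    for name in list(yes) + list(no):
        src, dst = (yes, no) if name in yes else (no, yes)
        if len(src) == 1:
            continue  # assumption: never empty a class
        st = src[name]
        src_pool, dst_pool = pool(src.values(), dim), pool(dst.values(), dim)
        gain = (loglik(src_pool - st) + loglik(dst_pool + st)
                - loglik(src_pool) - loglik(dst_pool))
        if gain > threshold:
            dst[name] = src.pop(name)
    return yes, no


# Toy 1-D data: "a-b+m" answered "yes" to the question but resembles the "no" class.
yes = {"a-b+c": stats_from([[-0.1], [0.0], [0.1]]),
       "a-b+d": stats_from([[-0.2], [0.2], [0.0]]),
       "a-b+m": stats_from([[4.9], [5.0], [5.1]])}
no = {"a-b+n": stats_from([[5.0], [5.2], [4.8]]),
      "a-b+ng": stats_from([[5.1], [4.9], [5.0]])}

total_before = loglik(pool(yes.values(), 1)) + loglik(pool(no.values(), 1))
new_yes, new_no = reassign(yes, no)
total_after = loglik(pool(new_yes.values(), 1)) + loglik(pool(new_no.values(), 1))
```

In this toy case the misplaced triphone moves to the class it resembles and the summed log-likelihood of the split increases. The `threshold` argument stands in for the abstract's "certain margin"; the value the authors used is not stated here.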


Bibliographic reference.  Yuan, Baosheng / Zhao, Qingwei / Guo, Qing / Zhang, Xiangdong / Lin, Zhiwei (2000): "Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR", In ICSLP-2000, vol.2, 1035-1038.