Sixth European Conference on Speech Communication and Technology
Phonetic decision-tree based acoustic modeling has been widely used in speech recognition systems. However, the assumption that all states clustered in the same leaf node share both their Gaussians and their mixture weights limits how far the acoustic models can be improved. In this paper, we propose a new structure called a two-level decision tree. With this structure we can make better use of the training data and improve model accuracy and robustness. Two-level decision trees also provide more flexibility in controlling the number of parameters. By tuning the balance between the numbers of first- and second-level tree nodes, we can get better performance with even fewer parameters than the traditional decision-tree based approach. Experiments on the Wall Street Journal tasks show that our approach can achieve about a 10% word error rate reduction over the conventional approach.
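The parameter trade-off can be illustrated with a rough count. This is a minimal sketch under stated assumptions, not the paper's implementation: it assumes that first-level leaves each own a pool of diagonal-covariance Gaussians and that second-level leaves each own only a mixture-weight vector over their parent's pool. The function names and all numbers (leaf counts, mixture sizes, a 39-dimensional feature vector) are hypothetical and chosen only for illustration.

```python
def conventional_params(num_leaves, mix_per_leaf, dim):
    """Parameter count when each leaf owns its Gaussians and its weights.

    Each diagonal-covariance Gaussian has `dim` means and `dim` variances,
    plus one mixture weight.
    """
    return num_leaves * mix_per_leaf * (2 * dim + 1)


def two_level_params(num_first, num_second, mix_per_first, dim):
    """Parameter count under the ASSUMED two-level sharing scheme.

    First-level leaves own the Gaussian pools; each second-level leaf owns
    only a weight vector over the pool of its first-level parent.
    """
    gaussians = num_first * mix_per_first * 2 * dim
    weights = num_second * mix_per_first
    return gaussians + weights


# Hypothetical configurations, not taken from the paper.
baseline = conventional_params(num_leaves=6000, mix_per_leaf=8, dim=39)
two_level = two_level_params(num_first=3000, num_second=12000,
                             mix_per_first=12, dim=39)

print(baseline)   # 3792000
print(two_level)  # 2952000
```

In this made-up configuration the two-level model has twice as many distinct state distributions (12000 weight vectors versus 6000 leaves) yet fewer parameters overall, because a weight vector costs far less than a set of Gaussians. Shifting nodes between the two levels moves the model along this trade-off.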
Bibliographic reference. Liu, Chaojun / Wu, Xintian / Yan, Yonghong (1999): "High accuracy acoustic modeling using two-level decision-tree based state-tying", In EUROSPEECH'99, 1703-1706.