5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

High Resolution Decision Tree based Acoustic Modeling beyond CART

Wu Chou, Wolfgang Reichl

Bell Labs., Lucent Technologies, USA

This paper describes a phonetic decision tree clustering algorithm based on m-level optimal subtrees. Unlike prior approaches, the proposed approach uses the m-level optimal subtree to generate log likelihood estimates with multiple mixture Gaussians for phonetic decision tree based state tying. This provides a more accurate model of the log likelihood variations in node splitting, and it is consistent with the partition of the acoustic space induced by the set of phonetic questions applied during decision tree state tying. To reduce the algorithmic complexity, a caching scheme based on previous search results is also described. The caching scheme significantly speeds up m-level optimal subtree construction without degrading recognition performance, which makes the proposed approach suitable for large vocabulary speech recognition tasks. Experimental results on a standard speech recognition task (Wall Street Journal) indicate that the proposed m-level optimal subtree approach outperforms the conventional approach of using single mixture Gaussians in phonetic decision tree based state tying.
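The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes 1-D features, one maximum likelihood Gaussian per subtree leaf, and a memoization cache standing in for the caching scheme. The names `STATS`, `QUESTIONS`, `VAR_FLOOR`, `gaussian_ll`, `subtree_ll`, and `best_question` are hypothetical, as are all numbers. With `m = 1` the score reduces to the conventional single-Gaussian split gain.

```python
import math
from functools import lru_cache

# Hypothetical toy statistics per context label: (frame count, sum x, sum x^2).
STATS = {
    "a": (10, 0.0, 10.0),
    "b": (10, 1.0, 10.5),
    "c": (10, 50.0, 260.0),
    "d": (10, 52.0, 280.4),
}

# Hypothetical phonetic questions: the set of labels that answer "yes".
QUESTIONS = {
    "Q_ab": frozenset({"a", "b"}),
    "Q_ac": frozenset({"a", "c"}),
    "Q_a": frozenset({"a"}),
    "Q_c": frozenset({"c"}),
}

VAR_FLOOR = 1e-3  # assumed variance floor to keep the logarithm finite


def gaussian_ll(states):
    """Log likelihood of the pooled frames under a single fitted Gaussian."""
    ordered = sorted(states)  # fixed order keeps the float sums deterministic
    n = sum(STATS[s][0] for s in ordered)
    sx = sum(STATS[s][1] for s in ordered)
    sxx = sum(STATS[s][2] for s in ordered)
    mean = sx / n
    var = max(sxx / n - mean * mean, VAR_FLOOR)
    # Exact for the ML variance; approximate if the floor is applied.
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


@lru_cache(maxsize=None)  # reuses earlier search results for repeated subsets
def subtree_ll(states, depth):
    """Best log likelihood over subtrees of at most `depth` levels.

    Each leaf of the subtree contributes one Gaussian, so a deeper subtree
    models the node with several Gaussians instead of one.
    """
    best = gaussian_ll(states)
    if depth == 0:
        return best
    for yes_set in QUESTIONS.values():
        yes, no = states & yes_set, states - yes_set
        if not yes or not no:
            continue  # the question does not split this node
        best = max(best, subtree_ll(yes, depth - 1) + subtree_ll(no, depth - 1))
    return best


def best_question(states, m):
    """Pick the question with the largest m-level log likelihood gain."""
    base = gaussian_ll(states)
    best_name, best_gain = None, float("-inf")
    for name, yes_set in QUESTIONS.items():
        yes, no = states & yes_set, states - yes_set
        if not yes or not no:
            continue
        gain = subtree_ll(yes, m - 1) + subtree_ll(no, m - 1) - base
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


root = frozenset(STATS)
for m in (1, 2):
    print(m, best_question(root, m))
```

In this sketch the cache key is the pair (state subset, remaining depth), so a subset reached through different question orders is evaluated only once. Whether this matches the caching scheme in the paper is an assumption.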

Bibliographic reference. Chou, Wu / Reichl, Wolfgang (1998): "High resolution decision tree based acoustic modeling beyond CART", in ICSLP-1998, paper 0607.