5th International Conference on Spoken Language Processing
This paper describes a phonetic decision tree clustering algorithm based on m-level optimal subtrees. Unlike prior approaches, the proposed approach uses the m-level optimal subtree to generate log likelihood estimates with multiple mixture Gaussians for phonetic decision tree based state tying. This provides a more accurate model of the log likelihood variations in node splitting, and it is consistent with the acoustic space partition introduced by the set of phonetic questions applied during the decision tree state tying process. To reduce the algorithmic complexity, a caching scheme based on previous search results is also described. The caching scheme significantly speeds up the m-level optimal subtree construction without degrading recognition performance, making the proposed approach suitable for large vocabulary speech recognition tasks. Experimental results on a standard speech recognition task (Wall Street Journal) indicate that the proposed m-level optimal subtree approach outperforms the conventional approach of using single mixture Gaussians in phonetic decision tree based state tying.
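As a rough illustration of the idea, the following Python sketch scores a candidate node split by looking ahead m levels and memoizing previous search results. It is not the authors' implementation: the abstract gives no formulas, so the recursive lookahead, the use of a memoization cache, the toy statistics in STATS, the question sets in QUESTIONS, and all function names are assumptions made for this example. With m = 1 the score reduces to the conventional single-Gaussian log likelihood gain.

```python
import math
from functools import lru_cache

# Hypothetical toy statistics for four context-dependent HMM states with
# 1-D features: state -> (frame count, sum of x, sum of x^2).
STATS = {
    "s1": (10.0, 0.0, 10.0),
    "s2": (10.0, 10.0, 20.0),
    "s3": (10.0, 50.0, 260.0),
    "s4": (10.0, 60.0, 370.0),
}

# Hypothetical phonetic questions: name -> set of states answering "yes".
QUESTIONS = {
    "left_is_vowel": frozenset({"s1", "s2"}),
    "right_is_nasal": frozenset({"s1", "s3"}),
    "left_is_stop": frozenset({"s1"}),
}

VAR_FLOOR = 1e-3  # only guards against log(0) on degenerate clusters


def pooled_loglik(states):
    """Log likelihood of the pooled data under one ML-fitted Gaussian."""
    n = sum(STATS[s][0] for s in states)
    sx = sum(STATS[s][1] for s in states)
    sxx = sum(STATS[s][2] for s in states)
    mean = sx / n
    var = max(sxx / n - mean * mean, VAR_FLOOR)
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


@lru_cache(maxsize=None)
def best_subtree_loglik(states, depth):
    """Best log likelihood reachable by a subtree of at most `depth` levels.

    The leaves of that subtree act as several Gaussians modelling the node.
    lru_cache stores results per (state set, depth), so repeated sub-searches
    are answered from the cache (an assumed stand-in for the caching scheme).
    """
    best = pooled_loglik(states)
    if depth == 0 or len(states) < 2:
        return best
    for yes_set in QUESTIONS.values():
        yes, no = states & yes_set, states - yes_set
        if not yes or not no:
            continue  # question does not split this node
        cand = (best_subtree_loglik(yes, depth - 1)
                + best_subtree_loglik(no, depth - 1))
        best = max(best, cand)
    return best


def split_gain(states, question, m):
    """Gain of splitting `states` by `question`, scored with m-level lookahead."""
    yes, no = states & QUESTIONS[question], states - QUESTIONS[question]
    if not yes or not no:
        return 0.0
    return (best_subtree_loglik(yes, m - 1)
            + best_subtree_loglik(no, m - 1)
            - pooled_loglik(states))


ALL_STATES = frozenset(STATS)

if __name__ == "__main__":
    for m in (1, 2):
        for q in QUESTIONS:
            print(m, q, round(split_gain(ALL_STATES, q, m), 3))
    print(best_subtree_loglik.cache_info())
```

Because each child is scored by the best subtree below it, the m-level gain is never smaller than the single-Gaussian gain for the same question, and the ranking of questions at a node can differ between the two. The cache is keyed on the set of states and the remaining depth, which is one plausible way to reuse earlier search results; the paper's actual caching scheme may differ.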
Bibliographic reference. Chou, Wu / Reichl, Wolfgang (1998): "High resolution decision tree based acoustic modeling beyond CART", In ICSLP-1998, paper 0607.