Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Multi-Level Decision Trees for Static and Dynamic Pronunciation Models

Eric Fosler-Lussier

University of California, Berkeley, and International Computer Science Institute, Berkeley, CA, USA

We have been focusing on improving pronunciation models for automatic transcription of television and radio news reports by modeling phone, syllable, and word pronunciation distributions with decision trees. These models were employed in two separate sets of experiments. First, decision trees facilitated the selection of word pronunciations, derived automatically from data, for use in a standard speech recognizer dictionary. We have seen a small but significant improvement with these automatically constructed dictionaries in our one-pass decoding system. In a second set of experiments, we allowed decision tree models to determine the probability of word pronunciations dynamically, dependent on the linguistic context of the word during recognition. Dynamic models provided a further decrease in error that was not significant, although the improvements were concentrated in the spontaneous speech portion of the test set.


Bibliographic reference.  Fosler-Lussier, Eric (1999): "Multi-level decision trees for static and dynamic pronunciation models", In EUROSPEECH'99, 463-466.