EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Compiling Large-Context Phonetic Decision Trees into Finite-State Transducers

Stanley F. Chen

IBM T.J. Watson Research Center, USA

Recent work has shown that the use of finite-state transducers (FSTs) has many advantages in large-vocabulary speech recognition. Most past work has focused on the use of triphone phonetic decision trees. However, numerous applications use decision trees that condition on wider contexts; for example, many systems at IBM use 11-phone phonetic decision trees. Alas, large-context phonetic decision trees cannot be compiled straightforwardly into FSTs due to memory constraints. In this work, we discuss memory-efficient techniques for manipulating large-context phonetic decision trees in the FST framework. First, we describe a lazy expansion technique that is applicable when expanding small word graphs. For general applications, we discuss how to construct large-context transducers via a sequence of simple, efficient finite-state operations; we also introduce a memory-efficient implementation of determinization.
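
To give a rough sense of the memory constraint mentioned above, the following minimal sketch counts the states a naive context-dependency transducer would need if it kept one state per possible phone history. This is an illustration under stated assumptions, not the construction used in the paper: the function name naive_context_states is hypothetical, and the inventory size of 50 phones is an assumed figure that does not come from the abstract.

```python
def naive_context_states(num_phones: int, context_width: int) -> int:
    """Count states when every (context_width - 1)-phone history gets its own state.

    This is a worst-case count for a naive construction; it ignores any
    sharing of histories that a decision tree would allow.
    """
    return num_phones ** (context_width - 1)


NUM_PHONES = 50  # assumed phone inventory size, not taken from the paper

for width in (3, 11):  # triphone versus 11-phone contexts
    states = naive_context_states(NUM_PHONES, width)
    print(f"{width}-phone context: {states:.3e} states")
```

Under these assumptions the triphone case needs a few thousand states, while the 11-phone case needs on the order of 10^16 to 10^17, which is why a direct compilation is impractical and lazy or factored constructions become attractive.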

Bibliographic reference.  Chen, Stanley F. (2003): "Compiling large-context phonetic decision trees into finite-state transducers", In EUROSPEECH-2003, 1169-1172.