Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Towards Multi-Domain Speech Understanding Using a Two-Stage Recognizer

Grace Chung, Stephanie Seneff, Lee Hetherington

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA

This paper describes our efforts in designing a two-stage recognizer with the objective of developing a multi-domain speech understanding system. We envisage one first-stage recognition engine that is domain-independent, and multiple second-stage systems specializing in individual domains. A major novelty in our initial two-stage design is a front-end that incorporates ANGIE-based hierarchical sublexical probability models encapsulated within a finite-state transducer (FST) paradigm. This first stage is a context-dependent syllable-level recognizer which outputs acoustic-phonetic networks to be processed in a second pass. The second stage incorporates higher-order linguistic knowledge, from phonological to syntactic and semantic, in a tightly coupled search. This system has yielded up to a 28.5% reduction in understanding error, compared with a single-stage context-dependent recognizer which does not use ANGIE-based probabilities.



Bibliographic reference. Chung, Grace / Seneff, Stephanie / Hetherington, Lee (1999): "Towards multi-domain speech understanding using a two-stage recognizer", In EUROSPEECH'99, 2655-2658.