This paper describes our efforts in designing a two-stage recognizer with the objective of developing a multi-domain speech understanding system. We envisage one first-stage recognition engine that is domain-independent, and multiple second-stage systems specializing in individual domains. A major novelty in our initial two-stage design is a front-end that incorporates ANGIE-based hierarchical sublexical probability models encapsulated within a finite-state transducer (FST) paradigm. This first stage is a context-dependent syllable-level recognizer which outputs acoustic-phonetic networks to be processed in a second pass. The second stage incorporates higher-order linguistic knowledge, from phonological to syntactic and semantic, in a tightly coupled search. This system has yielded up to a 28.5% reduction in understanding error, compared with a single-stage context-dependent recognizer which does not use ANGIE-based probabilities.
Cite as: Chung, G., Seneff, S., Hetherington, L. (1999) Towards multi-domain speech understanding using a two-stage recognizer. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2655-2658, doi: 10.21437/Eurospeech.1999-586
@inproceedings{chung99_eurospeech, author={Grace Chung and Stephanie Seneff and Lee Hetherington}, title={{Towards multi-domain speech understanding using a two-stage recognizer}}, year=1999, booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)}, pages={2655--2658}, doi={10.21437/Eurospeech.1999-586} }