7th International Conference on Spoken Language Processing
September 16-20, 2002
A natural language spoken dialog system includes a large vocabulary automatic speech recognition (ASR) engine, whose output is used as the input of a spoken language understanding component. Two challenges in such a framework are that the ASR component is far from perfect and that users can say the same thing in very different ways. It is therefore important to be tolerant of recognition errors and of some amount of orthographic variability. In this paper, we present our work on developing new methods and investigating various ways of robustly recognizing and understanding an utterance. To this end, we exploit word-level confusion networks (sausages), obtained from ASR word graphs (lattices), instead of the ASR 1-best hypothesis. Using sausages with an improved confidence model, we decreased the call-type classification error rate for AT&T’s How May I Help You (HMIHY) natural dialog system by 38%.
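A word confusion network compresses a lattice into a sequence of slots, each holding competing word hypotheses with posterior probabilities. The toy sketch below illustrates that structure and how a hypothesis can be read off by taking the highest-posterior word per slot; the slot contents, probabilities, and the `*delete*` marker for an empty arc are illustrative assumptions, not taken from the paper.

```python
# Toy word confusion network ("sausage"): a list of slots, each mapping
# candidate words to posterior probabilities. "*delete*" marks an empty arc.
# All words and probabilities here are made up for illustration.
sausage = [
    {"collect": 0.55, "correct": 0.40, "*delete*": 0.05},
    {"call": 0.90, "called": 0.10},
    {"please": 0.60, "*delete*": 0.40},
]

def best_hypothesis(sausage):
    """Pick the highest-posterior word in each slot, skipping deletions."""
    words = []
    for slot in sausage:
        word, _ = max(slot.items(), key=lambda kv: kv[1])
        if word != "*delete*":
            words.append(word)
    return words

print(best_hypothesis(sausage))  # → ['collect', 'call', 'please']
```

Unlike the 1-best string, the slot posteriors retained here can feed a confidence model, which is the kind of information the paper exploits for call-type classification.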
Bibliographic reference. Tur, Gokhan / Wright, Jerry / Gorin, Allen / Riccardi, Giuseppe / Hakkani-Tür, Dilek (2002): "Improving spoken language understanding using word confusion networks", In ICSLP-2002, 1137-1140.