This paper describes a language modeling and dialog management system for efficient and robust recognition of several arbitrarily ordered, inter-related components drawn from very large datasets, such as a complete address specified in a single sentence with its components in their natural sequence. It presents a new two-pass speech recognition technique based on multiple language models with embedded grammars. In tests on a complete address recognition task, the technique yielded good results, and its memory and CPU requirements are low enough to make it viable for embedded environments. The paper also presents a goal-oriented algorithm for dialog-based error recovery and disambiguation that does not require manual identification of all possible dialog situations. The combined system yields very high task completion accuracy at the cost of only a few additional turns of interaction.
Bibliographic reference. Balchandran, Rajesh / Rachevsky, Leonid / Sansone, Larry (2009): "Language modeling and dialog management for address recognition", In INTERSPEECH-2009, 288-291.