We describe work towards developing a scalable and portable framework for enabling map-based multimodal dialogue interaction over the web. Working in the context of a restaurant-guide system, we show how large information databases harvested from the web can be accommodated in our speech recognizer, parser, and web-based GUI. We compare two dynamic language modeling techniques, which calculate context-dependent weights for the large sets of proper nouns associated with geographical entities such as restaurants and streets. We show that the more fine-grained approach results in a 7.8% reduction in concept error rate.
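The abstract does not say how the two dynamic language modeling techniques compute their weights, so the following is only a minimal sketch of the general idea of context-dependent weighting within a proper-noun class. It is not the authors' method. The function names, the `boost` parameter, and the example street names are all hypothetical. It contrasts a coarse scheme, in which every name in a class is equally likely, with a finer-grained one, in which names currently in dialogue focus (for example, those visible on the map) receive more probability mass.

```python
def coarse_weights(names):
    """Assign every (unique) name in the class the same probability."""
    if not names:
        return {}
    w = 1.0 / len(names)
    return {name: w for name in names}


def context_weights(names, in_focus, boost=5.0):
    """Give names in the current focus `boost` times the mass of the rest.

    `boost` is an illustrative assumption, not a value from the paper.
    The raw scores are normalized so the weights sum to 1.
    """
    raw = {name: (boost if name in in_focus else 1.0) for name in names}
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}


# Hypothetical street-name class and dialogue focus.
streets = ["Main Street", "Elm Street", "Oak Avenue", "Pine Road"]
uniform = coarse_weights(streets)
focused = context_weights(streets, in_focus={"Elm Street"})
```

In a class-based recognizer, weights like these would scale the probability of each member of a class such as street or restaurant names. The sketch only shows the reweighting step, not its integration with a recognizer or parser.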
Cite as: Gruenstein, A., Seneff, S., Wang, C. (2006) Scalable and portable web-based multimodal dialogue interaction with geographical databases. Proc. Interspeech 2006, paper 1095-Mon2FoP.2, doi: 10.21437/Interspeech.2006-145
@inproceedings{gruenstein06_interspeech,
  author={Alexander Gruenstein and Stephanie Seneff and Chao Wang},
  title={{Scalable and portable web-based multimodal dialogue interaction with geographical databases}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1095-Mon2FoP.2},
  doi={10.21437/Interspeech.2006-145}
}