We have built a multimodal-input multimedia-output guidance system called MMGS. User input can combine speech and hand-written gestures, and the system responds with a combination of speech, three-dimensional graphics, and/or other information. The system interacts cooperatively with the user by resolving ellipsis and anaphora as well as various ambiguities, such as those caused by speech recognition errors. It is currently implemented on an SGI workstation and achieves nearly real-time processing.
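The abstract does not give implementation details, but the kind of cooperative interaction it describes, combining a spoken utterance with a pen gesture to resolve a deictic reference, can be sketched roughly as follows. All type and object names in this sketch are illustrative assumptions, not MMGS's actual interfaces.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data types: the paper does not specify MMGS's internal
# representations, so everything below is an illustrative assumption.

@dataclass
class SpeechInput:
    text: str            # recognized utterance, possibly with a deictic word
    confidence: float    # recognizer score in [0, 1]

@dataclass
class GestureInput:
    target_id: str       # object the pen gesture points at or circles

DEICTICS = {"this", "that", "here", "there"}

def resolve_referent(speech: SpeechInput,
                     gesture: Optional[GestureInput]) -> str:
    """Replace a deictic word in the utterance with the gestured object.

    If the utterance contains a deictic and a gesture is available, the
    gesture supplies the referent; otherwise the utterance is returned
    unchanged (a dialogue manager could then ask a clarifying question).
    """
    words = speech.text.split()
    if gesture is not None:
        words = [gesture.target_id if w.lower() in DEICTICS else w
                 for w in words]
    return " ".join(words)

if __name__ == "__main__":
    speech = SpeechInput(text="Tell me about this", confidence=0.82)
    gesture = GestureInput(target_id="Conference-Hall-B")
    print(resolve_referent(speech, gesture))
    # -> "Tell me about Conference-Hall-B"
```

In this toy version, a low-confidence recognition result or a missing gesture would leave the deictic unresolved, which is where a cooperative system would fall back to asking the user for clarification.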