Third International Conference on Spoken Language Processing (ICSLP 94)
It has long been a hope, expectation, and prediction that speech would be the primary medium of communication between humans and machines. To date, this dream has not been realized. We predict that exploiting the multimodal nature of spoken language will facilitate the use of this medium. We begin our paper with a general framework for the analysis of speech recognition by humans and a theoretical model. We then present a system for auditory/visual speech synthesis that performs complete text-to-speech synthesis. This system should improve the quality as well as the attractiveness of speech as one of a machine's primary output communication media. Mirroring the value of multimodal speech synthesis, multimodal channels should also enhance speech recognition by machine.
Bibliographic reference. Massaro, Dominic W. / Cohen, Michael M. (1994): "Auditory/visual speech in multimodal human interfaces", In ICSLP-1994, 531-534.