ISCA Archive ICSLP 1994

Integration themes in multimodal human-computer interaction

Sharon Oviatt, Erik Olsen

This research examines how people integrate spoken and written input during multimodal human-computer interaction. Three studies used a semi-automatic simulation technique to collect data on people's free use of spoken and written input. The studies used a within-subject repeated-measures design, with data analyzed from 44 subjects and 240 tasks. The analyses evaluated the primary factors that govern whether people choose to write or to speak at given points during a human-computer exchange. They revealed that people write digits more often than textual content, and proper names more often than other text. A form-based presentation, in comparison with an unconstrained format, also increased the likelihood of writing. However, the most influential factor in patterning people's integrated use of speech and writing is contrastive functionality, or the use of spoken and written input in a contrastive way to designate a shift in content or functionality, such as original versus corrected input, data versus command, and digits versus text. Different patterns of contrastive mode use accounted for approximately 57% of the integrated pen/voice use observed in these studies. Information is also summarized on preferential mode use and on the simultaneity of pen/voice input. One long-term goal of this research is the development of quantitative predictive models of natural modality integration, which could provide guidance on the strategic design of robust multimodal systems.

Cite as: Oviatt, S., Olsen, E. (1994) Integration themes in multimodal human-computer interaction. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 551-554

