Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Analysis of Multimodal Interaction Data in Human Communication

Keiko Watanuki, Kenji Sakamoto, Fumio Togawa

Real World Computing Partnership, Novel Functions Sharp Laboratory, Integrated Media Laboratories, Sharp Corporation, Chiba, Japan

We are developing multimodal man-machine interfaces through which users can communicate by integrating speech, gaze, facial expressions, and gestures such as nodding and finger pointing. Such multimodal interfaces are expected to make communication between humans and computers more flexible, natural, and productive. To achieve this goal, we have taken the approach of modeling human behavior in ordinary face-to-face conversation. As a first step, we have implemented a system that uses video and audio recording equipment to capture verbal and nonverbal information in interpersonal communication. Using this system, we collected data from a task-oriented conversation between a guest (subject) and a receptionist at a company reception desk, and quantitatively analyzed the data with respect to multiple modalities. This paper presents data showing that head nodding and gaze are related to speech content, acting to supplement speech information. We also discuss the timing of turn taking and listener responses, which gives human-computer interaction a natural rhythm.
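The abstract only summarizes the quantitative analysis; as a rough illustration of the kind of measurements it describes (co-occurrence of nonverbal events with speech, and listener-response latencies at turn ends), the following is a minimal Python sketch. It assumes annotations are stored as time-stamped intervals; the Event structure, function names, and toy data are hypothetical and not the authors' actual annotation format or method.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One time-stamped annotation from a single modality (times in seconds)."""
    modality: str   # e.g. "speech", "nod", "gaze"
    label: str      # e.g. utterance text or gaze target
    start: float
    end: float

def overlap(a: Event, b: Event) -> float:
    """Duration (in seconds) for which two annotated events co-occur."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))

def cooccurrence_rate(primary: list[Event], secondary: list[Event]) -> float:
    """Fraction of total primary-event time that is overlapped by secondary events."""
    total = sum(e.end - e.start for e in primary)
    shared = sum(overlap(p, s) for p in primary for s in secondary)
    return shared / total if total else 0.0

def response_latencies(turn_ends: list[float], responses: list[Event]) -> list[float]:
    """Latency from each turn end to the nearest following listener response
    (e.g. a nod or back-channel); responses starting before the turn end are ignored."""
    latencies = []
    for t in turn_ends:
        following = [r.start - t for r in responses if r.start >= t]
        if following:
            latencies.append(min(following))
    return latencies

# Toy annotations for a short receptionist/guest exchange (invented values).
speech = [Event("speech", "May I help you?", 0.0, 1.2),
          Event("speech", "I have an appointment.", 2.0, 3.5)]
nods = [Event("nod", "listener nod", 1.3, 1.6),
        Event("nod", "listener nod", 3.0, 3.4)]

print(cooccurrence_rate(speech, nods))       # share of speech time overlapped by nods
print(response_latencies([1.2, 3.5], nods))  # nod latencies after each turn end
```

In this toy run, the second nod overlaps the second utterance, and the first nod follows the first turn end by about 0.1 s; the paper's actual analysis of such timing relations is reported in the full text.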

Bibliographic reference. Watanuki, Keiko / Sakamoto, Kenji / Togawa, Fumio (1994): "Analysis of multimodal interaction data in human communication", in ICSLP-1994, 899-902.