Auditory-Visual Speech Processing (AVSP) 2010

Hakone, Kanagawa, Japan
September 30-October 3, 2010

Robot as a Multimodal Human Interface Device

Tetsunori Kobayashi

Faculty of Science and Engineering, Waseda University, Tokyo, Japan

In this talk, we introduce a robot conversation system. Generally speaking, conversation is not performed only through the exchange of speech information; it also requires the exchange of visual information: facial expressions, body poses, and gestures convey rich information needed to achieve natural conversation. In this context, the body and the vision system of the robot can be regarded as essential components of the conversational communication system. Here, we first emphasize the importance of visual information processing, especially in sending and receiving the nuance of utterances and in expressing and recognizing the role structure of participants. Then, we describe the implementation of these visual functions. We also discuss the cooperative use of visual information along with audio information. Finally, we present the conversation robot SCHEMA, which achieves natural conversation through auditory-visual information processing.


Bibliographic reference.  Kobayashi, Tetsunori (2010): "Robot as a multimodal human interface device", In AVSP-2010, paper K1.