Auditory-Visual Speech Processing (AVSP) 2011

Volterra, Italy
September 1-2, 2011

Speech-Driven Lip Motion Generation for Tele-Operated Humanoid Robots

Carlos T. Ishi (1), Chaoran Liu (1), Hiroshi Ishiguro (2), Norihiro Hagita (1)

(1) ATR Intelligent Robotics and Communication Labs.; (2) ATR Social Media Research Laboratory Group Hiroshi Ishiguro Laboratory; Kyoto, Japan

To tele-operate the lip motion of a humanoid robot (such as an android) from the operator's utterances, we developed a speech-driven lip motion generation method. The method rotates the vowel space, defined by the first and second formants, around the center vowel and maps the result to lip opening degrees. It requires calibrating only one parameter for speaker normalization, so no further model training is needed. In a pilot experiment, the proposed audio-based method was perceived as more natural than vision-based approaches, regardless of the language.
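
As a rough illustration of the idea described in the abstract, the following minimal sketch maps a formant pair (F1, F2) to a lip opening degree. It is not the authors' implementation. The function name `lip_opening`, the rotation angle, the linear mapping, the span `span_hz`, and the choice of the center vowel's F1 as the single calibrated parameter are all assumptions made for illustration; the abstract does not give these values. The center F2 is taken as three times the center F1, as for a uniform tube closed at one end, which is likewise an assumption here.

```python
import math


def lip_opening(f1, f2, center_f1=500.0, angle_deg=25.0, span_hz=400.0):
    """Map formants (Hz) to a lip opening degree in [0, 1].

    All default values are illustrative assumptions, not values
    taken from the paper.
    """
    # Assumed speaker normalization: one calibrated parameter (center F1),
    # with the center F2 derived from a uniform-tube approximation.
    center_f2 = 3.0 * center_f1

    # Translate so that the center vowel sits at the origin.
    d1 = f1 - center_f1
    d2 = f2 - center_f2

    # Rotate the vowel space around the center vowel and keep one axis
    # as a proxy for lip height (assumed direction and angle).
    theta = math.radians(angle_deg)
    height = d1 * math.cos(theta) - d2 * math.sin(theta)

    # Assumed linear mapping: the center vowel gives 0.5, clipped to [0, 1].
    opening = 0.5 + height / (2.0 * span_hz)
    return min(1.0, max(0.0, opening))


if __name__ == "__main__":
    # Rough, textbook-style formant values for three vowels (assumed).
    for name, (f1, f2) in {"a": (800, 1300), "center": (500, 1500), "i": (300, 2300)}.items():
        print(name, round(lip_opening(f1, f2), 3))
```

With these assumed settings, an open vowel such as /a/ yields a larger opening than the center vowel, and a close vowel such as /i/ yields a smaller one. In a real system the formants would come from a frame-by-frame formant tracker, and the output would be smoothed before being sent to the robot's actuators.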

Index Terms. lip motion, formant, humanoid robot, teleoperation, synchronization

Bibliographic reference.  Ishi, Carlos T. / Liu, Chaoran / Ishiguro, Hiroshi / Hagita, Norihiro (2011): "Speech-driven lip motion generation for tele-operated humanoid robots", In AVSP-2011, 131-135.