The background of the present work is the development of a tele-operation system in which the lip motion of a remote humanoid robot is automatically controlled from the operator's voice. In the present paper, we introduce an improved version of our proposed speech-driven lip motion generation method, where lip height and width degrees are estimated based on vowel formant information. The method requires the calibration of only one parameter for speaker normalization, so that no training of dedicated models is necessary. Lip height control is evaluated on the female android robot Geminoid-F and on an animated face. Subjective evaluation indicated that the naturalness of the lip motion generated in the robot is improved by the inclusion of partial lip width control (with stretching of the lip corners). The highest naturalness scores were achieved for the animated face, showing the effectiveness of the proposed method.
Index Terms: lip motion, formant, tele-operation, humanoid robot.
Bibliographic reference. Ishi, Carlos T. / Liu, Chaoran / Ishiguro, Hiroshi / Hagita, Norihiro (2012): "Evaluation of a formant-based speech-driven lip motion generation", In INTERSPEECH-2012, 114-117.