INTERSPEECH 2004 - ICSLP
A novel method is described for controlling a robot's gestures and utterances during a dialogue according to the listener's understanding and interest, which are estimated from the listener's back-channels and head gestures. "Back-channels" are defined as sounds such as "uh-huh" uttered by a listener during a dialogue, and "head gestures" are defined as nodding and tilting motions of the listener's head. The back-channels are recognized using sound features such as power and fundamental frequency. The head gestures are recognized using the movement of the skin-color area and optical flow data. Based on the estimated understanding and interest of the listener, the speed and size of the robot's motions are changed. This method was implemented in a humanoid robot called SIG2.
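As an illustration of the two sound features named in the abstract, the following is a minimal sketch of how power and fundamental frequency might be computed for a single audio frame. The abstract does not say how the authors extract these features, so the autocorrelation-based estimator, the function names `frame_power` and `estimate_f0`, and the 75-400 Hz search range are all assumptions made for this example, not the method used in SIG2.

```python
import math


def frame_power(frame):
    """Mean-square power of one frame of audio samples."""
    return sum(x * x for x in frame) / len(frame)


def estimate_f0(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate F0 in Hz from the autocorrelation peak.

    The search range [f_min, f_max] is an assumed range for speech,
    not a value taken from the paper.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)

    def autocorr(lag):
        # Unnormalized autocorrelation of the frame at the given lag.
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    # The lag with the strongest self-similarity approximates one pitch period.
    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


# Synthetic test frame: a 200 Hz tone, 50 ms long, sampled at 8 kHz.
sample_rate = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
print(frame_power(frame), estimate_f0(frame, sample_rate))
```

In a recognizer of the kind described, such per-frame values would presumably be tracked over time and passed to a classifier; that stage is not sketched here because the abstract gives no details about it.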
Bibliographic reference. Komatani, Kazunori / Ogata, Tetsuya / Okuno, Hiroshi G. / Tasaki, Tsuyoshi / Yamaguchi, Takeshi (2004): "Robot motion control using listener's back-channels and head gesture information", In INTERSPEECH-2004, 1033-1036.