Auditory-Visual Speech Processing 2005
British Columbia, Canada
This work aims to use the speech-in-noise task to assess talking head animations, and to use talking head animations to investigate the perception of speech-in-noise. The theoretical aim is to determine what visual information is important for speech perception, while the practical aim is to develop an effective talking head animation system adaptable to robots.
The first experiment used the "cuboid", a deliberately abstract face. Head, jaw and mouth movement were presented separately and in combination. Results showed an advantage of mouth movement that was independent of the other factors, indicating that even an abstract structure can carry useful facial speech information and that mouth movement is an essential component.
In this paper we test perception of facial speech using both a deliberately abstract structure and a more realistic head model. Two further experiments used ATR's in-house animation system to examine the relative contributions of face and head movement. The first of these replicated a combined head and face movement advantage. The other, a 2 (Head Movement: present/absent) x 2 (Face Movement: present/absent) design, showed a main effect of face movement but no effect of head movement and no interaction.
We conclude that abstract faces can carry useful visual speech information and that, while mouth and face movement are primary, head and jaw movement do not interfere and can help.
Bibliographic reference. Hill, Harold / Vatikiotis-Bateson, Eric (2005): "Using graphics to study the perception of speech-in-noise, and vice versa", In AVSP-2005, 63-64.