Auditory-Visual Speech Processing 2005

British Columbia, Canada
July 24-27, 2005

Using Graphics to Study the Perception of Speech-in-Noise, and vice versa

Harold Hill (1), Eric Vatikiotis-Bateson (2)

(1) Department of Vision Dynamics, Human Information Science Labs., ATR, Kyoto, Japan
(2) Department of Linguistics, University of British Columbia, Canada

This work uses the speech-in-noise task to assess talking head animations, and talking head animations to investigate the perception of speech-in-noise. The theoretical aim is to determine what visual information is important for speech perception, while the practical aim is to develop an effective talking head animation system adaptable to robots.

The first experiment used the "cuboid", a deliberately abstract face. Head, jaw, and mouth movement were presented separately and in combination. Results showed an advantage for mouth movement independent of the other factors. This demonstrates that even an abstract structure can carry useful facial speech information and that mouth movement is an essential component.

In this paper we test perception of facial speech using both a deliberately abstract structure and a more realistic head model. Two further experiments used ATR's in-house animation system [1] to examine the relative contributions of face and head movement. The first of these replicated a combined head and face movement advantage [2]. A 2 Head Movement (present/absent) x 2 Face Movement (present/absent) design showed a main effect of face movement, but no effect of head movement and no interaction.

We conclude that abstract faces can carry useful visual speech information and that, while mouth and face movement are primary, head and jaw movement do not interfere with them and can even help.


Bibliographic reference.  Hill, Harold / Vatikiotis-Bateson, Eric (2005): "Using graphics to study the perception of speech-in-noise, and vice versa", in AVSP-2005, 63-64.