AVSP 2003 - International Conference on Audio-Visual Speech Processing

September 4-7, 2003
St. Jorioz, France

Enhanced Auditory Detection with AV Speech: Perceptual Evidence for Speech and Non-Speech Mechanisms

Lynne E. Bernstein, Sumiko Takayanagi, Edward T. Auer Jr.

Department of Communication Neuroscience, House Ear Institute, Los Angeles, CA, USA

Speech in a noisy or reverberant environment is more detectable and more intelligible when the listener can see the talker. How to explain these perceptual phenomena is a fundamental problem for AV speech research. We have undertaken a series of behavioral and electrophysiological experiments to investigate the perceptual and neural bases for enhanced auditory speech detection in noise with AV stimuli. We hypothesize that the enhancement effect arises from at least two neurophysiologically distinct mechanisms, one not specialized for speech and the other specific to speech stimuli. Here we report results of a perceptual experiment in which an auditory /ba/ token was presented adaptively to obtain its 71% detection threshold [1] in white noise. Participants were tested in three conditions: auditory-only speech, audiovisual speech, and auditory speech with a visual dynamic Lissajous figure. The Lissajous figure served as a control for many of the complex visual features of speech. Evidence was obtained for two separate sources of AV detection enhancement: detection thresholds were highest for the auditory-only speech, lower for the auditory speech with the Lissajous figure, and lowest for the audiovisual speech. The Discussion section of the paper outlines the implications and limitations of the current results for explaining the AV speech detection enhancement effect.
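
As an illustration only: the abstract does not state which adaptive rule was used. One common procedure that converges near the 71% point is a two-down one-up staircase, in which the level settles where the probability p of two consecutive detections satisfies p^2 = 0.5, that is, p of about 0.707. The sketch below makes several assumptions: levels are in dB, the step size is fixed, the threshold is estimated as the mean of the reversal levels, and `detects` is a hypothetical callback standing in for a participant's response on one trial.

```python
def two_down_one_up(detects, start_level=0.0, step=2.0,
                    max_reversals=8, max_trials=1000):
    """Estimate a ~70.7% detection threshold with a two-down one-up staircase.

    detects(level) -> bool is a hypothetical stand-in for one trial:
    it returns True if the signal at `level` (assumed dB) was detected.
    """
    level = start_level
    correct_in_row = 0
    direction = 0      # -1 = last move was down, +1 = up, 0 = no move yet
    reversals = []     # levels at which the staircase changed direction

    for _ in range(max_trials):
        if len(reversals) >= max_reversals:
            break
        if detects(level):
            correct_in_row += 1
            if correct_in_row < 2:
                continue          # need two detections in a row to step down
            move = -1             # make the task harder
        else:
            move = +1             # a single miss makes the task easier
        correct_in_row = 0
        if direction and move != direction:
            reversals.append(level)
        direction = move
        level += move * step

    if not reversals:
        raise ValueError("no reversals observed; adjust start_level or step")
    return sum(reversals) / len(reversals)


if __name__ == "__main__":
    # Idealized listener (an assumption for the demo): detects at -10 dB and above.
    print(two_down_one_up(lambda level: level >= -10.0))
```

With the idealized listener in the demo, the staircase descends to the boundary and then alternates between the two levels on either side of it, so the estimate falls midway between them. A real participant responds probabilistically, and the estimate then approximates the level detected on about 71% of trials.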

Presentation. An audio-visual [ba] is presented in two conditions:
Video: Audio-Visual Speech
Video: Lissajous Figure

Bibliographic reference. Bernstein, Lynne E. / Takayanagi, Sumiko / Auer Jr., Edward T. (2003): "Enhanced auditory detection with AV speech: Perceptual evidence for speech and non-speech mechanisms", In AVSP 2003, 13-17.