AVSP 2003 - International Conference on Audio-Visual Speech Processing

September 4-7, 2003
St. Jorioz, France

Electrophysiology of Auditory-Visual Speech Integration

Virginie van Wassenhove (2), Ken W. Grant (1), David Poeppel (2)

(1) Army Audiology and Speech Center, Walter Reed Army Medical Center, Washington, DC, USA
(2) Neuroscience and Cognitive Science Program, Cognitive Neuroscience of Language Laboratory, University of Maryland, College Park, MD, USA

Abstract

Twenty-six native English speakers identified auditory (A), visual (V), and congruent and incongruent auditory-visual (AV) syllables while undergoing electroencephalography (EEG) in three experiments. In Experiment 1, unimodal (A, V) and bimodal (AV) stimuli were presented in separate blocks. In Experiment 2 the same stimuli were pseudo-randomized in the same blocks, providing a replication of Experiment 1 while testing the effect of participants' expectancy on the AV condition. In Experiment 3, McGurk fusion (audio /pa/ dubbed onto visual /ka/, eliciting the percept /ta/) and combination (audio /ka/ dubbed onto visual /pa/) stimuli were tested under visual attention.

EEG recordings show early effects of visual influence on auditory evoked-related potentials (P1/N1/P2 complex). Specifically, a robust amplitude reduction of the N1/P2 complex was observed (Experiments 1 and 2) that could not be solely accounted for by attentional effects (Experiment 3). The N1/P2 reduction was accompanied by a temporal facilitation (approximately 20 ms) of the P1/N1 and N1/P2 transitions in AV conditions. Additionally, incongruent syllables showed a different profile from congruent AV /ta/ over a large latency range (~50 to 350 ms post-auditory onset), which was influenced by the accuracy of identification of the visual stimuli presented unimodally.

Our results suggest that (i) auditory processing is modulated early on by visual speech inputs, in agreement with an early locus of AV speech interaction, (ii) natural precedence of visual kinematics facilitates auditory speech processing in the time domain, and (iii) the degree of temporal gain is a function of the saliency of visual speech inputs.

Bibliographic reference.  Wassenhove, Virginie van / Grant, Ken W. / Poeppel, David (2003): "Electrophysiology of auditory-visual speech integration", In AVSP 2003, 37-42.