9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Integration of Audiovisual Speech and Priming Effects

Azra N. Ali

University of Huddersfield, UK

Humans report the fusion /da/ when presented with audio /ba/ aligned with visual /ga/. Over the last three decades, most researchers have neglected velar McGurk fusion: audio /ba/ with visual /da/ eliciting a /ga/ fusion, and audio /pa/ with visual /ta/ resulting in a /ka/ fusion. Cathiard et al. [1] claimed that these latter types of fusion are laboratory curiosities that do not occur when embedded in French VCV syllables. We conduct two experiments: in the first, the incongruent segment is embedded in real English words; in the second, we use a priming approach to bias perception towards the audio channel, the visual channel, or the expected fusion. Results show that velar fusion percepts are not just a product of isolated nonsense syllables but are robust percepts formed by integrating the audio and visual channels. Thus, both types of McGurk fusion have potential as a probing tool for exploring the phonological organization of lexical entities.


  1. Cathiard, M., Schwartz, J-L. and Abry, C. Asking a Naive Question to the McGurk Effect: Why does Audio [B] give more [D] Percepts with Visual [G] than with Visual [D]? Auditory-Visual Speech Processing Proc. 138-142, 2001 (ISCA Archive,

Bibliographic reference.  Ali, Azra N. (2008): "Integration of audiovisual speech and priming effects", In INTERSPEECH-2008, 2044-2047.