Auditory-Visual Speech Processing (AVSP) 2011

Volterra, Italy
September 1-2, 2011

Audiovisual Speech Processing in Visual Speech Noise

Jeesun Kim, Chris Davis

MARCS Auditory Laboratories, University of Western Sydney, Australia

When the talker’s face (visual speech) can be seen, speech perception is facilitated by congruent visual speech and disrupted by incongruent visual speech. The current study investigated whether the size of these visual speech effects is affected by the presence of an additional, irrelevant talking face. In two experiments, auditory speech targets (vCv syllables) were presented in noise for subsequent identification. Participants saw either the full face or only the upper half (control) of a talker uttering single syllables, presented in central vision (Exp 1) or in the visual periphery (Exp 2). In addition, a second talker silently uttering a sentence (visual speech noise) was presented in the periphery (Exp 1) or in central vision (Exp 2). Participants’ eye movements were monitored to ensure that they always fixated centrally. Congruent AV speech facilitation and incongruent McGurk effects were measured by comparing percent correct syllable identification in the full-face conditions with that in the upper-half face control conditions. Relative to the upper-half face control, full-face presentation produced more accurate identification for congruent stimuli and less accurate identification for incongruent ones. The magnitude of the McGurk effect was greater when the face articulating the syllable was presented in central vision (with visual speech noise in the periphery) than when it was presented in the periphery (with visual speech noise in central vision). The size of the congruent AV speech effect, however, did not differ between central and peripheral presentation.

Index Terms. Visual speech; AV congruency; Peripheral visual speech
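
The comparison described in the abstract (percent correct with the full face versus the upper-half face control) can be illustrated with a minimal sketch. The function names and the trial outcomes below are hypothetical and are not taken from the study. The sketch assumes the effect is computed as a simple difference score, which may differ from the analysis actually reported.

```python
def percent_correct(responses):
    """Percentage of trials identified correctly (responses: list of bools)."""
    return 100.0 * sum(responses) / len(responses)


def visual_speech_effect(full_face, upper_face):
    """Full-face accuracy minus upper-half face (control) accuracy.

    Positive values indicate facilitation; negative values indicate
    interference (as expected for incongruent, McGurk-type stimuli).
    """
    return percent_correct(full_face) - percent_correct(upper_face)


# Hypothetical trial outcomes (True = syllable identified correctly).
congruent_full = [True, True, True, False]
congruent_upper = [True, False, True, False]
incongruent_full = [False, False, True, False]
incongruent_upper = [True, False, True, False]

facilitation = visual_speech_effect(congruent_full, congruent_upper)
interference = visual_speech_effect(incongruent_full, incongruent_upper)
print(facilitation, interference)
```

Under this assumed scoring, the effect of presentation location would be assessed by computing the same difference separately for central and peripheral presentation of the syllable-articulating face and comparing the two.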

