Auditory-Visual Speech Processing (AVSP'98)

December 4-6, 1998
Terrigal - Sydney, Australia

Face Or Voice? Determinant Of Compellingness To The McGurk Effect

Kaoru Sekiyama

Kanazawa University, Japan

This study examined sources of talker differences in the McGurk effect by asking whether auditory speech or visual speech plays the larger role in determining the size of the effect. Cross-talker dubbing was performed between the faces and voices of utterances (/ba/ and /ga/) produced by two talkers: one who was compelling for the McGurk effect (CT) and one who was less compelling (LT), as measured in our previous studies. The two talkers and the 18 subjects were native speakers of Japanese. There were three presentation conditions: audio-only, video-only, and audiovisual. The results of the unimodal conditions showed that CT was easier to speechread than LT, but that CT's speech was harder to identify by listening alone. The results of the audiovisual condition showed that, although both the visual talker and the auditory talker affected the size of the McGurk effect, the audio component was more responsible than the video component for the talker differences.


Bibliographic reference. Sekiyama, Kaoru (1998): "Face or voice? Determinant of compellingness to the McGurk effect", in AVSP-1998, 33-36.