ISCA Archive Interspeech 2017

Audiovisual Recalibration of Vowel Categories

Matthias K. Franken, Frank Eisner, Jan-Mathijs Schoffelen, Daniel J. Acheson, Peter Hagoort, James M. McQueen

One of the most daunting tasks of a listener is to map a continuous auditory stream onto known speech sound categories and lexical items. A major issue with this mapping problem is the variability in the acoustic realizations of sound categories, both within and across speakers. Past research has suggested listeners may use visual information (e.g., lip-reading) to calibrate these speech categories to the current speaker. Previous studies have focused on audiovisual recalibration of consonant categories. The present study explores whether vowel categorization, which is known to show less sharply defined category boundaries, also benefits from visual cues.

Participants were exposed to videos of a speaker pronouncing one of two vowels, paired with audio that was ambiguous between the two vowels. After exposure, participants were found to have recalibrated their vowel categories. In addition, individual variability in audiovisual recalibration is discussed. It is suggested that listeners’ category sharpness may be related to the weight they assign to visual information in audiovisual speech perception. Specifically, listeners with less sharp categories may assign more weight to visual information during audiovisual speech recognition.

doi: 10.21437/Interspeech.2017-122

Cite as: Franken, M.K., Eisner, F., Schoffelen, J.-M., Acheson, D.J., Hagoort, P., McQueen, J.M. (2017) Audiovisual Recalibration of Vowel Categories. Proc. Interspeech 2017, 655-658, doi: 10.21437/Interspeech.2017-122

  author={Matthias K. Franken and Frank Eisner and Jan-Mathijs Schoffelen and Daniel J. Acheson and Peter Hagoort and James M. McQueen},
  title={{Audiovisual Recalibration of Vowel Categories}},
  booktitle={Proc. Interspeech 2017},