FAAVSP - The 1st Joint Conference on Facial Analysis, Animation, and
Auditory-Visual Speech Processing

Vienna, Austria
September 11-13, 2015

Explaining the Visual and Masked-Visual Advantage in Speech Perception in Noise: The Role of Visual Phonetic Cues

Vincent Aubanel, Chris Davis, Jeesun Kim

The MARCS Institute, University of Western Sydney, Australia

Visual enhancement of speech intelligibility, although clearly established, still resists a clear explanation. We attempt to contribute to solving that problem by proposing a simple account based on phonetically motivated visual cues. This work extends a previous study quantifying the visual advantage in sentence intelligibility across three conditions with varying degrees of available visual information: auditory-only, auditory-visual with the mouth region masked, and auditory-visual. We explore the role of lexical as well as visual factors, the latter derived from the grouping of phonemes into visemes. While lexical factors do not discriminate between modality conditions, a measure of viseme confusability appears to capture part of the performance results. A simple characterisation of the phonetic content of sentences, in terms of visual information occurring exclusively inside the masked region, was the strongest predictor for the auditory-visual masked condition only, demonstrating a direct link between localised visual information and auditory-visual speech processing performance.

Index Terms: auditory-visual speech processing, visemes, sentence intelligibility, visual advantage
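
The abstract refers to "a measure of viseme confusability" without defining it. As a purely illustrative sketch (not the authors' method), the Python snippet below shows one way such a per-sentence measure could be computed: phonemes are grouped into hypothetical viseme classes, and each phoneme is scored by how many visually indistinguishable competitors share its class. The class inventory and the (n - 1)/n scoring rule are assumptions for illustration only.

    # Illustrative sketch, not the paper's code: scoring a phoneme-transcribed
    # sentence by viseme confusability. The grouping below is a common
    # place-of-articulation clustering; the actual classes and measure
    # used in the paper may differ.

    # Hypothetical viseme classes: phonemes within a class are assumed
    # to be visually confusable with one another.
    VISEME_CLASSES = {
        "bilabial": {"p", "b", "m"},
        "labiodental": {"f", "v"},
        "dental": {"th", "dh"},
        "alveolar": {"t", "d", "n", "s", "z", "l"},
        "velar": {"k", "g", "ng"},
    }

    PHONEME_TO_VISEME = {
        ph: viseme for viseme, phones in VISEME_CLASSES.items() for ph in phones
    }

    def viseme_confusability(phonemes):
        """Mean within-class confusability of a phoneme sequence.

        Each phoneme contributes (n - 1) / n, where n is the size of its
        viseme class: a phoneme alone in its class is visually distinctive
        (score 0); one in a large class is highly confusable.
        """
        scores = []
        for ph in phonemes:
            viseme = PHONEME_TO_VISEME.get(ph)
            if viseme is None:
                continue  # phoneme outside this toy inventory
            n = len(VISEME_CLASSES[viseme])
            scores.append((n - 1) / n)
        return sum(scores) / len(scores) if scores else 0.0

    # Example: the consonants of "pat" (/p/, /t/; the toy inventory
    # covers consonants only)
    print(viseme_confusability(["p", "t"]))  # 0.75

Under this toy scheme, a sentence dominated by alveolar consonants (a large, visually homogeneous class) would score as more confusable than one rich in bilabials, which is the kind of gradient such a predictor would need in order to account for performance differences across the modality conditions.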

Bibliographic reference. Aubanel, Vincent / Davis, Chris / Kim, Jeesun (2015): "Explaining the visual and masked-visual advantage in speech perception in noise: the role of visual phonetic cues", in FAAVSP-2015, 132-136.