FAAVSP - The 1st Joint Conference on
Facial Analysis, Animation, and
Auditory-Visual Speech Processing
Visual enhancement of speech intelligibility, although well established, still resists a clear description. We attempt to contribute to solving that problem by proposing a simple account based on phonetically motivated visual cues. This work extends a previous study quantifying the visual advantage in sentence intelligibility across three conditions with varying degrees of available visual information: auditory-only, auditory-visual orally masked, and auditory-visual. We explore the role of lexical as well as visual factors, the latter derived from groupings of phonemes into visemes. While lexical factors play a non-discriminative role across modality conditions, a measure of viseme confusability appears to capture part of the performance results. A simple characterisation of the phonetic content of sentences, in terms of visual information occurring exclusively inside the mask region, was found to be the strongest predictor for the auditory-visual masked condition only, demonstrating a direct link between localised visual information and auditory-visual speech processing performance.

Index Terms: auditory-visual speech processing, visemes, sentence intelligibility, visual advantage
Bibliographic reference: Aubanel, Vincent / Davis, Chris / Kim, Jeesun (2015): "Explaining the visual and masked-visual advantage in speech perception in noise: the role of visual phonetic cues", in FAAVSP-2015, 132-136.