AVSP 2003 - International Conference on Audio-Visual Speech Processing

September 4-7, 2003
St. Jorioz, France

Auditory Syllabic Identification Enhanced by Non-Informative Visible Speech

Jean-Luc Schwartz, Frédéric Berthommier, Christophe Savariaux

Institut de la Communication Parlée (ICP), CNRS UMR 5009, INPG / Université Stendhal, Grenoble, France

Recent experiments show that seeing lip movements may improve the detection of speech sounds embedded in noise. We show here that this "speech detection" benefit may result in a "speech identification" benefit distinct from lipreading per se. The experimental trick consists of dubbing the same lip gesture onto a number of visually similar but auditorily different syllables, e.g. [y u ty tu ky ku dy du gy gu] in French. The visual stimulus does not allow the syllable to be identified, but it provides a temporal cue that improves the auditory identification of these stimuli when they are embedded in a high level of cocktail-party noise, particularly the identification of plosive voicing. Replacing the visual speech cue (the lip rounding gesture) with a nonspeech cue having the same temporal pattern (a red bar on a black background, increasing and decreasing in synchrony with the lips) removes the benefit.

Bibliographic reference. Schwartz, Jean-Luc / Berthommier, Frédéric / Savariaux, Christophe (2003): "Auditory syllabic identification enhanced by non-informative visible speech", in AVSP 2003, 19-24.