Auditory-Visual Speech Processing 2007 (AVSP2007)

Kasteel Groenendaal, Hilvarenbeek, The Netherlands
August 31 - September 3, 2007

Audiovisual Lombard speech: Reconciling Production and Perception

Eric Vatikiotis-Bateson (1), Adriano V. Barbosa (1), Cheuk Yi Chow (1), Martin Oberg (1), Johanna Tan (2), Hani C. Yehia (3)

(1) Department of Linguistics, University of British Columbia, Vancouver, Canada
(2) Clinical Audiology, Melbourne University, Melbourne, Australia
(3) Department of Electronics, Federal University of Minas Gerais, Belo Horizonte, Brazil

An earlier study compared audiovisual perception of speech 'produced in environmental noise' (Lombard speech) and speech 'produced in quiet' with the same environmental noise added. The results showed that listeners make differential use of the visual information depending on the recording condition, but gave no indication of how or why this might be so. A possible confound in that study was that the high audio presentation levels might account for the small visual enhancements observed for Lombard speech. This paper reports results for a second perception study using much lower acoustic presentation levels, compares them with the results of the previous study, and integrates the perception results with analyses of the audiovisual production data: face and head motion, audio amplitude (RMS), and parameters of the spectral acoustics (line spectrum pairs).
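For readers unfamiliar with the amplitude measure named above, the sketch below illustrates a frame-wise RMS computation of the general kind used in such production analyses. It is a minimal NumPy illustration, not the authors' analysis code; the 50 ms frame, 25 ms hop, and 16 kHz sampling rate are assumed values chosen for the example. The line spectrum pair parameters would additionally require a linear-prediction analysis step, which is not shown here.

    import numpy as np

    def frame_rms(signal, frame_len, hop_len):
        """Frame-wise RMS amplitude of a 1-D audio signal."""
        n_frames = 1 + max(0, (len(signal) - frame_len) // hop_len)
        rms = np.empty(n_frames)
        for i in range(n_frames):
            frame = signal[i * hop_len : i * hop_len + frame_len]
            rms[i] = np.sqrt(np.mean(frame.astype(float) ** 2))
        return rms

    # Example: 50 ms frames with a 25 ms hop at an assumed 16 kHz rate
    fs = 16000
    t = np.arange(fs) / fs
    signal = 0.5 * np.sin(2 * np.pi * 220 * t)  # synthetic 220 Hz tone
    envelope = frame_rms(signal, frame_len=int(0.050 * fs), hop_len=int(0.025 * fs))
    print(envelope[:5])  # roughly 0.354 (= 0.5 / sqrt(2)) per frame

The resulting RMS envelope is the kind of time-varying amplitude signal that can be correlated with face and head motion measures.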

Full Paper

Bibliographic reference.  Vatikiotis-Bateson, Eric / Barbosa, Adriano V. / Chow, Cheuk Yi / Oberg, Martin / Tan, Johanna / Yehia, Hani C. (2007): "Audiovisual Lombard speech: reconciling production and perception", In AVSP-2007, paper P41.