Auditory-Visual Speech Processing 2007 (AVSP2007)

Kasteel Groenendaal, Hilvarenbeek, The Netherlands
August 31 - September 3, 2007

Effect of Speed Difference Between Time-Expanded Speech and Talker's Moving Image on Word or Sentence Intelligibility

Shuichi Sakamoto (1), Akihiro Tanaka (2), Komi Tsumura (1), Yôiti Suzuki (1)

(1) Research Institute of Electrical Communication / Graduate School of Information Sciences, Tohoku University, Japan
(2) Department of Psychology, University of Tokyo, Japan

This study investigated the effects on speech intelligibility of asynchrony between a speech signal and a talker's moving image, where the asynchrony was induced by time-expansion of the speech signal. First, a word intelligibility test (Exp. 1) was administered to younger listeners. Words were processed using STRAIGHT software to expand the speech signal by 0 to 400 ms. The word intelligibility test was administered under three conditions: visual-only, auditory-only, and auditory-visual (AV). Results showed that intelligibility scores under the AV condition were significantly higher than those under the auditory-only condition, even when the speech signal was expanded by 400 ms. Second, a sentence intelligibility test (Exp. 2) was administered to older adults. For all sentences, each phrase was expanded by 0 to 400 ms. This test was administered under the same three conditions as Exp. 1. Results showed that sentence intelligibility scores under the AV condition were significantly higher than those under the auditory-only condition when the length of expansion was less than or equal to 200 ms. The results of Exp. 1 and Exp. 2 suggest that the talker's moving image is effective in enhancing speech intelligibility if the lag between the speech signal and the talker's moving image is less than or equal to 200 ms.


Bibliographic reference.  Sakamoto, Shuichi / Tanaka, Akihiro / Tsumura, Komi / Suzuki, Yôiti (2007): "Effect of speed difference between time-expanded speech and talker's moving image on word or sentence intelligibility", In AVSP-2007, paper P18.