Auditory-Visual Speech Processing (AVSP) 2010

Hakone, Kanagawa, Japan
September 30-October 3, 2010

Effects of Speech-Rate Conversion on Asynchrony Perception of Audio-Visual Speech

Shiho Miyazawa (1), Akihiro Tanaka (2), Shuichi Sakamoto (3), Takehiko Nishimoto (4)

(1) Graduate School of Letters, Arts and Sciences, Waseda University, Japan
(2) Waseda Institute for Advanced Study, Waseda University, Japan
(3) Research Institute of Electrical Communication and Graduate School of Information Sciences, Tohoku University, Japan
(4) Faculty of Letters, Arts and Sciences, Waseda University, Japan

Previous studies have shown that time-expanded speech and the visual cue of a moving image of the talker’s face improve speech intelligibility. Sakamoto et al. (2008) investigated the detection thresholds of auditory-visual asynchrony for time-expanded speech and a moving image of the talker’s face. Their results showed that detection thresholds for longer words were higher than those for shorter words. However, it is not clear whether this result is attributable to the difference in the number of morae or to the overall word length of the stimuli. In this study, we examined detection thresholds of auditory-visual asynchrony between time-expanded speech and a moving image of the talker’s face, using words that have different numbers of morae but the same duration. The results revealed no significant difference in detection thresholds between words with fewer morae and words with more morae. Thus, our results suggest that word length, not the number of morae, affects the detection threshold of asynchrony between auditory and visual stimuli.

Index Terms: audio-visual asynchrony, detection rate, time-expanded speech


