Third International Conference on Spoken Language Processing (ICSLP 94)
Earlier research has demonstrated that visual information provided by the movement of a talker's lips aids the perception of speech. The purpose of this paper was to explore the limits of the lip-reading effect by desynchronizing the visual and auditory information in speech. In the experiment reported here, audio-visual identification of Japanese sentences was examined as a function of desynchronization: audio-visually with 0, 120, 240, or 480 ms of audio delay or precedence, and audio-alone. The results indicated that, with one exception, every desynchronization in the range of 120-480 ms produced a significant decrement in intelligibility when compared individually with the synchronized audio-visual condition. On the other hand, when the audio was delayed or advanced by 120 ms, subjects identified the sentences better than when auditory information alone was provided, indicating that subjects benefited from the visual information even at such asynchronies. Furthermore, this asynchrony coincided with the mean mora duration of 123 ms measured for sentences from the test lists. Thus it appears that subjects may attempt to integrate the visual and auditory information in speech at the level of individual morae. These experimental results may inform consideration of the linguistic units of auditory-visual integration during speech perception.
Bibliographic reference. Hashimoto, Masahiro / Seki, Hideaki (1994): "Limitations of lip-reading advantage by desynchronizing visual and auditory information in speech", in ICSLP-1994, 1155-1158.