Visual Timing Information in Audiovisual Speech Perception: Evidence from Lexical Tone Contour

Hui Xie, Biao Zeng, Rui Wang


The present study investigated whether the duration of lip movement could improve the intelligibility of lexical pitch contours under noisy conditions. Eighteen Chinese speakers were asked to identify a Mandarin lexical tone from a pair of tones under auditory-only (AO) and audiovisual (AV) conditions. Two types of tone pairs were used: a maximally contrastive pair (falling vs. dipping tones, with a 100 ms difference in lip-movement duration) and a minimally contrastive pair (rising vs. falling tones, with a 33 ms difference). The results showed that lip-movement duration enhanced discrimination in the maximally contrastive pair, whereas the similar lengths of the rising and dipping tones attenuated this visual benefit. These findings suggest that visual timing information could be a specific cue for audiovisual lexical tone perception.


DOI: 10.21437/Interspeech.2018-1285

Cite as: Xie, H., Zeng, B., Wang, R. (2018) Visual Timing Information in Audiovisual Speech Perception: Evidence from Lexical Tone Contour. Proc. Interspeech 2018, 3781-3785, DOI: 10.21437/Interspeech.2018-1285.


@inproceedings{Xie2018,
  author={Hui Xie and Biao Zeng and Rui Wang},
  title={Visual Timing Information in Audiovisual Speech Perception: Evidence from Lexical Tone Contour},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3781--3785},
  doi={10.21437/Interspeech.2018-1285},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1285}
}