ISCA Archive Interspeech 2008

Pronunciation training: the role of eye and ear

Dominic W. Massaro, Stephanie Bigler, Trevor Chen, Marcus Perlman, Slim Ouni

We examined whether speech perception and production in a new language 1) are more easily learned by ear and eye together than by ear alone, and 2) whether viewing the tongue, palate, and velum during production is more beneficial for learning than a standard frontal view of the speaker. In addition, we determined whether differences in learning under these conditions are due to enhanced receptive learning from the additional visual information, or to more active learning motivated by the visual presentations. Test stimuli were two similar vowels in Mandarin and two similar stop consonants in Arabic, presented in different word contexts. Participants were tested with auditory speech and were trained 1) unimodally with auditory speech alone or bimodally with both auditory and visual speech, and 2) with either a standard frontal view or an inside view of the vocal tract. The visual speech was generated by the appropriate multilingual versions of Baldi [1]. The results test the effectiveness of visible speech for learning a new language. Preliminary results indicate that visible speech can contribute positively to acquiring new speech distinctions and to promoting active learning.

[1] Massaro, D. W. (1998). Perceiving talking faces: From speech perception to a behavioral principle. Cambridge, Massachusetts: MIT Press.

doi: 10.21437/Interspeech.2008-650

Cite as: Massaro, D.W., Bigler, S., Chen, T., Perlman, M., Ouni, S. (2008) Pronunciation training: the role of eye and ear. Proc. Interspeech 2008, 2623-2626, doi: 10.21437/Interspeech.2008-650
