Investigating the Role of L1 in Automatic Pronunciation Evaluation of L2 Speech

Ming Tu, Anna Grabek, Julie Liss, Visar Berisha


Automatic pronunciation evaluation plays an important role in pronunciation training and second language education. The field draws heavily on concepts from automatic speech recognition (ASR) to quantify how close non-native pronunciation is to native-like pronunciation. However, accent formation is known to be related to the pronunciation patterns of both the target language (L2) and the speaker's first language (L1). In this paper, we propose to use two native-speech acoustic models, one trained on L2 speech and the other trained on L1 speech. We develop two sets of measurements that can be extracted from the two acoustic models given accented speech. A new utterance-level feature extraction scheme converts these measurements into a fixed-dimension vector, which is used as input to a statistical model that predicts the accentedness of a speaker. On a data set of speakers from four different L1 backgrounds, we show that the proposed system yields improved correlation with human evaluators compared to systems that use only the L2 acoustic model.


DOI: 10.21437/Interspeech.2018-1350

Cite as: Tu, M., Grabek, A., Liss, J., Berisha, V. (2018) Investigating the Role of L1 in Automatic Pronunciation Evaluation of L2 Speech. Proc. Interspeech 2018, 1636-1640, DOI: 10.21437/Interspeech.2018-1350.


@inproceedings{Tu2018,
  author={Ming Tu and Anna Grabek and Julie Liss and Visar Berisha},
  title={Investigating the Role of L1 in Automatic Pronunciation Evaluation of L2 Speech},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={1636--1640},
  doi={10.21437/Interspeech.2018-1350},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1350}
}