ISCA Archive L2WS 2010

Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features

Masayuki Suzuki, Yu Qiao, Nobuaki Minematsu, Keikichi Hirose

Teachers can assess students’ pronunciation independently of extra-linguistic features, such as age and gender, that are observed in the students’ utterances. This capacity is difficult to realize on machines, however, because linguistic differences and extra-linguistic differences both alter the same acoustic features. The performance of automatic pronunciation assessment is therefore inevitably affected by extra-linguistic features. Recently, we proposed acoustic features that are independent of extra-linguistic factors, called structural features, and realized a technique for pronunciation proficiency estimation that is extremely robust to these factors. In this paper, we extend this technique with multilayer regression analysis, in which supervised learning is performed at each layer using the teachers’ scores for that layer. Proficiency estimation experiments show that this extension yields higher correlations between teachers’ scores and machine scores than our previous structure-based assessment.
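
The idea of supervising each layer with teachers' scores for that layer can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two layers, plain least-squares linear regression at each layer, and generic feature arrays standing in for structural features. The names `fit_linear`, `predict_linear`, `fit_multilayer`, and `predict_multilayer`, and the grouping of features into lower-layer units, are hypothetical choices made for illustration only.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear fit with a bias term; returns the weight vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w


def predict_linear(X, w):
    """Apply weights from fit_linear (the last weight is the bias)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def fit_multilayer(feature_groups, lower_scores, overall_scores):
    """Two-layer regression (assumed setup, not taken from the paper).

    feature_groups : list of K arrays, each (n_speakers, d_k)
    lower_scores   : (n_speakers, K) teacher scores for the lower layer
    overall_scores : (n_speakers,) teacher scores for the upper layer
    """
    # Lower layer: one regressor per feature group, trained on that
    # group's own teacher scores.
    lower_w = [
        fit_linear(X, lower_scores[:, k]) for k, X in enumerate(feature_groups)
    ]
    lower_pred = np.column_stack(
        [predict_linear(X, w) for X, w in zip(feature_groups, lower_w)]
    )
    # Upper layer: regress the overall teacher score on the lower-layer
    # predictions.
    upper_w = fit_linear(lower_pred, overall_scores)
    return lower_w, upper_w


def predict_multilayer(feature_groups, lower_w, upper_w):
    """Propagate features through both layers to get an overall score."""
    lower_pred = np.column_stack(
        [predict_linear(X, w) for X, w in zip(feature_groups, lower_w)]
    )
    return predict_linear(lower_pred, upper_w)
```

With machine scores from `predict_multilayer` and held-out teacher scores in hand, the teacher-machine correlation could be computed with `np.corrcoef(pred, overall_scores)[0, 1]`.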


Cite as: Suzuki, M., Qiao, Y., Minematsu, N., Hirose, K. (2010) Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features. Proc. Second Language Studies: Acquisition, Learning, Education and Technology (L2WS 2010), paper O2-3

@inproceedings{suzuki10_l2ws,
  author={Masayuki Suzuki and Yu Qiao and Nobuaki Minematsu and Keikichi Hirose},
  title={{Pronunciation proficiency estimation based on multilayer regression analysis using speaker-independent structural features}},
  year=2010,
  booktitle={Proc. Second Language Studies: Acquisition, Learning, Education and Technology (L2WS 2010)},
  pages={paper O2-3}
}