Vowel Characteristics in the Assessment of L2 English Pronunciation

Calbert Graham, Paula Buttery, Francis Nolan


There is considerable need to utilise linguistically meaningful measures of second language (L2) proficiency that are based on perceptual cues used by humans to assess pronunciation. Previous research on non-native acquisition of vowel systems suggests a strong link between vowel production accuracy and speech intelligibility. It is well known that the acoustic and perceptual identification of vowels relies on formant frequencies. However, formant analysis may not be viable in large-scale corpus research, given the need for manual correction of tracking errors. Spectral analysis techniques have been shown to be a robust alternative to formant tracking. This paper explores the use of one such technique, the discrete cosine transform (DCT), for modelling English vowel spectra in the productions of non-native English speakers. Mel-scaled DCT coefficients were calculated over a frequency band of 200–4000 Hz. Results show a statistically significant correlation between coefficients and the proficiency level of speakers, and suggest that this technique holds some promise for automated L2 pronunciation teaching and assessment.
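As a rough illustration of the kind of measure described above, the following minimal sketch computes a few DCT coefficients of a mel-warped spectrum restricted to 200–4000 Hz. It is not the authors' implementation: the abstract does not give these details, so the Hann window, the dB magnitude spectrum, the mel formula, the linear interpolation onto the mel axis, the DCT scaling, and the names and defaults (`mel_dct_coefficients`, `n_points`, `n_coeffs`) are all assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Hz to mel using one common convention (assumed; not taken from the paper)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_dct_coefficients(frame, sample_rate, f_lo=200.0, f_hi=4000.0,
                         n_points=64, n_coeffs=4):
    """Return the first n_coeffs DCT-II coefficients of the mel-warped
    dB spectrum of one frame, limited to the band [f_lo, f_hi] Hz.
    All defaults are hypothetical illustration values."""
    frame = np.asarray(frame, dtype=float)

    # dB magnitude spectrum of the Hann-windowed frame
    windowed = frame * np.hanning(len(frame))
    spectrum_db = 20.0 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Sample the spectrum at points equally spaced on the mel axis
    mel_grid = np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_points)
    warped = np.interp(mel_to_hz(mel_grid), freqs, spectrum_db)

    # DCT-II written out explicitly; with this scaling, coefficient 0
    # is the mean of the warped spectrum
    n = np.arange(n_points)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_points))
    return basis @ warped / n_points

# Usage on a synthetic frame: two sinusoids standing in for vowel resonances
sample_rate = 16000
t = np.arange(512) / sample_rate
frame = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
coeffs = mel_dct_coefficients(frame, sample_rate)
print(coeffs)
```

In this formulation the lowest coefficients summarise the overall shape of the spectrum (roughly its mean level, tilt, and curvature), so a vowel can be characterised without tracking individual formants.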


DOI: 10.21437/Interspeech.2016-1630

Cite as

Graham, C., Buttery, P., Nolan, F. (2016) Vowel Characteristics in the Assessment of L2 English Pronunciation. Proc. Interspeech 2016, 1127-1131.

BibTeX
@inproceedings{Graham+2016,
author={Calbert Graham and Paula Buttery and Francis Nolan},
title={Vowel Characteristics in the Assessment of L2 English Pronunciation},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1630},
url={http://dx.doi.org/10.21437/Interspeech.2016-1630},
pages={1127--1131}
}