13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

The Automatic Assessment of Non-native Prosody: Combining Classical Prosodic Analysis with Acoustic Modelling

Florian Hönig (1), Tobias Bocklet (1,2), Korbinian Riedhammer (1), Anton Batliner (1), Elmar Nöth (1)

(1) Pattern Recognition Lab, University of Erlangen-Nuremberg, Germany
(2) Department of Phoniatrics and Pediatric Audiology, University Hospital Erlangen, Germany

In earlier studies, we assessed the degree of non-nativeness using prosodic information. In this paper, we combine prosodic information with (1) features derived from a Gaussian Mixture Model used as Universal Background Model (GMM-UBM), a powerful approach used in speaker identification, and (2) features extracted with openSMILE, a standard open-source toolkit for acoustic feature extraction. We evaluate our approach on English speech from 94 non-native speakers. Used alone, GMM-UBM or openSMILE modelling performs worse than our prosodic feature vector; however, adding information from either one to the prosodic features by late fusion improves results.
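Late fusion, as mentioned in the abstract, combines the outputs of separately trained systems rather than their input features. The abstract does not state which fusion rule was used, so the following is only a minimal sketch that assumes a weighted average of per-speaker scores. The function name `late_fusion`, the system labels, and all numbers are hypothetical and not taken from the paper.

```python
import numpy as np

def late_fusion(scores_by_system, weights=None):
    """Fuse per-system predictions with a weighted average (assumed rule).

    scores_by_system: dict mapping a system name to a sequence holding
        one predicted score per speaker.
    weights: optional dict of non-negative weights per system;
        equal weights are used when omitted.
    """
    names = sorted(scores_by_system)
    # Rows are systems, columns are speakers.
    stacked = np.stack(
        [np.asarray(scores_by_system[n], dtype=float) for n in names]
    )
    if weights is None:
        w = np.ones(len(names))
    else:
        w = np.array([weights[n] for n in names], dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    return w @ stacked

# Toy scores for three speakers, purely illustrative.
prosodic = [1.0, 2.0, 3.0]
gmm_ubm = [2.0, 2.0, 5.0]

fused_equal = late_fusion({"prosodic": prosodic, "gmm_ubm": gmm_ubm})
fused_weighted = late_fusion(
    {"prosodic": prosodic, "gmm_ubm": gmm_ubm},
    weights={"prosodic": 3.0, "gmm_ubm": 1.0},
)
```

In practice the fusion weights would be estimated on held-out data, for example by fitting a regression on the individual systems' outputs; the fixed weights here only illustrate the mechanics.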

Index Terms: computer-assisted language learning, non-native prosody, rhythm, automatic assessment


Bibliographic reference. Hönig, Florian / Bocklet, Tobias / Riedhammer, Korbinian / Batliner, Anton / Nöth, Elmar (2012): "The automatic assessment of non-native prosody: combining classical prosodic analysis with acoustic modelling", in INTERSPEECH-2012, 823-826.