SLaTE 2015 - Workshop on Speech and Language Technology in Education

Leipzig, Germany
September 4-5, 2015

How Many Speakers, How Many Texts – The Automatic Assessment of Non-native Prosody

Florian Hönig (1), Anton Batliner (1,2), Elmar Nöth (1)

(1) Pattern Recognition Lab, FAU Erlangen-Nuremberg, Germany
(2) Machine Intelligence & Signal Processing Group, TUM, Munich, Germany

We present an in-depth analysis of a method for automatically scoring the prosody of non-native speech. To study its suitability for different application scenarios, we perform a systematic comparison of different evaluation schemes, such as text (in-)dependence and/or speaker (in-)dependence. The focus lies on methodological issues, with the aim of promoting the careful evaluation of automatic assessment methods. Further contributions are analyses of (1) a method that utilizes speaker IDs to improve performance, and (2) performance as a function of the number of speakers and texts used for training the system.
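
As an illustration of what speaker and text (in-)dependence mean for an evaluation split, the following is a minimal sketch and not the procedure used in the paper. It assumes each utterance is labelled with a speaker ID and a text ID; the function name `split_indices` and the labels `s1`, `t1`, etc. are hypothetical. Utterances that share a speaker or text with the test set are dropped from training whenever independence in that factor is requested.

```python
from typing import List, Optional, Set, Tuple

# Each utterance is a (speaker_id, text_id) pair; these labels are hypothetical.
Utterance = Tuple[str, str]


def split_indices(
    utterances: List[Utterance],
    held_out_speakers: Optional[Set[str]] = None,
    held_out_texts: Optional[Set[str]] = None,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices).

    Passing held_out_speakers gives a speaker-independent split,
    passing held_out_texts gives a text-independent split, and
    passing both gives a split that is independent in both factors.
    Utterances that fit neither side are discarded.
    """
    if held_out_speakers is None and held_out_texts is None:
        raise ValueError("hold out at least one of speakers or texts")

    train: List[int] = []
    test: List[int] = []
    for i, (speaker, text) in enumerate(utterances):
        # A factor that is not held out places no constraint on either side.
        spk_test = held_out_speakers is None or speaker in held_out_speakers
        txt_test = held_out_texts is None or text in held_out_texts
        spk_train = held_out_speakers is None or speaker not in held_out_speakers
        txt_train = held_out_texts is None or text not in held_out_texts

        if spk_test and txt_test:
            test.append(i)
        elif spk_train and txt_train:
            train.append(i)
        # Otherwise the utterance overlaps with the test set and is dropped.
    return train, test


utterances = [("s1", "t1"), ("s1", "t2"), ("s2", "t1"), ("s2", "t2")]

# Speaker-independent only: test speaker s2 never appears in training.
spk_train, spk_test = split_indices(utterances, held_out_speakers={"s2"})

# Speaker- and text-independent: training shares neither s2 nor t2 with the test set.
both_train, both_test = split_indices(
    utterances, held_out_speakers={"s2"}, held_out_texts={"t2"}
)
```

Note that requiring independence in both factors discards every utterance that shares only the speaker or only the text with the test set, so less of the available data is used than with a split that is independent in a single factor.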


Bibliographic reference. Hönig, Florian / Batliner, Anton / Nöth, Elmar (2015): "How many speakers, how many texts – the automatic assessment of non-native prosody", in SLaTE-2015, 1-6.