ISCA International Workshop on Speech and Language Technology in Education (SLaTE 2009)
Wroxall Abbey Estate, Warwickshire, England
An inventory was compiled of pronunciation errors frequently made by foreigners speaking Dutch. On the basis of this inventory, artificial errors were created in a native development corpus, which in turn were used to optimize thresholds for the Goodness of Pronunciation (GOP) algorithm. In the current study the GOP algorithm is evaluated in three different ways: (1) using a native test corpus with artificial errors which reflect errors frequently made by non-natives, (2) within an actual application used for practicing pronunciation, and (3) post-hoc, using the recorded interactions of the application, to determine what the performance of the algorithm would have been if optimal speaker- and phone-specific thresholds had been used.
The results show that the performance of the GOP algorithm was satisfactory and that the procedure by which thresholds were determined by simulating realistic pronunciation errors was appropriate, because performance on the artificially introduced errors closely approximated performance on real data. This finding is particularly welcome if we consider that, in general, paucity of data is a common problem in this kind of research. Furthermore, it appeared that post-hoc threshold optimization only led to a slight increase in performance.
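The threshold step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the optimization criterion, so plain classification accuracy is assumed here, and the names `pick_threshold` and `optimize_per_phone` are hypothetical. The sketch also assumes a score convention in which a higher GOP score means a more likely mispronunciation; with the opposite convention the comparison would be flipped.

```python
from collections import defaultdict


def pick_threshold(scores, labels):
    """Return (threshold, accuracy) for the rule 'flag an error when score > threshold'.

    Assumption: accuracy is the criterion; the source does not specify one.
    """
    uniq = sorted(set(scores))
    # Candidate thresholds: below all scores, between neighbours, above all scores.
    candidates = (
        [uniq[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
        + [uniq[-1] + 1.0]
    )
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        correct = sum((s > t) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:  # keep the first (lowest) threshold among ties
            best_t, best_acc = t, acc
    return best_t, best_acc


def optimize_per_phone(samples):
    """samples: iterable of (phone, gop_score, is_error) from a labelled corpus.

    In the setting of the abstract, is_error would mark the artificially
    introduced errors in the native development corpus.
    """
    by_phone = defaultdict(lambda: ([], []))
    for phone, score, is_error in samples:
        by_phone[phone][0].append(score)
        by_phone[phone][1].append(is_error)
    return {p: pick_threshold(s, l) for p, (s, l) in by_phone.items()}
```

Grouping by a (speaker, phone) pair instead of by phone alone would give the speaker- and phone-specific variant used in the post-hoc analysis, again under the same assumptions.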
Index Terms: Goodness of Pronunciation (GOP), pronunciation error detection, Computer Assisted Pronunciation Training (CAPT)
Bibliographic reference. Kanters, Sandra / Cucchiarini, Catia / Strik, Helmer (2009): "The goodness of pronunciation algorithm: a detailed performance study", In SLaTE-2009, 49-52.