Third Workshop on Spoken Language Technologies for Under-resourced Languages

Cape Town, South Africa
May 7-9, 2012

Validating Smartphone-Collected Speech Corpora

Marelie H. Davel, Charl J. van Heerden, Etienne Barnard

North-West University, Vanderbijlpark, South Africa

We investigate how effectively the accuracy of a prompted speech corpus can be validated when minimal additional speech resources are available, and specifically when no language model in the target language is available. We compare a word-based variant of Goodness of Pronunciation (GOP) with a phone-based dynamic programming (PDP) scoring technique. To generate validation scores, GOP uses the acoustic likelihood ratio, whereas PDP uses the optimal alignment between an observed phone string (generated by a speech recogniser) and a reference phone string (obtained from a dictionary). We define a new technique to obtain a PDP scoring matrix in a data-driven fashion, examine different ways of using GOP for word scoring, and find that variants of both techniques provide results that are effective for corpus validation.
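The two scoring ideas can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the flat +1/-1 substitution scores, the gap penalty and the length normalisation are all assumptions chosen for illustration, whereas the paper derives its PDP scoring matrix from data. The GOP-style function assumes that log-likelihoods from a forced alignment and from an unconstrained phone recogniser are already available for the word segment.

```python
def pdp_score(observed, reference, sub_score, gap=-1.0):
    """Align two phone strings by dynamic programming and return the
    best alignment score, normalised by the longer string's length.

    sub_score(a, b) stands in for a scoring matrix (hypothetical here);
    gap is an assumed penalty for an inserted or deleted phone.
    """
    n, m = len(observed), len(reference)
    # dp[i][j]: best score aligning observed[:i] with reference[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + sub_score(observed[i - 1], reference[j - 1]),
                dp[i - 1][j] + gap,   # extra phone in the observed string
                dp[i][j - 1] + gap,   # reference phone missing from observed
            )
    return dp[n][m] / max(n, m, 1)


def flat_scores(a, b):
    """Placeholder scoring matrix (assumption): +1 match, -1 mismatch."""
    return 1.0 if a == b else -1.0


def gop_word_score(forced_loglik, free_loglik, n_frames):
    """GOP-style word score (assumed form): per-frame log-likelihood
    ratio between the forced alignment of the prompted word and an
    unconstrained phone-loop decode of the same frames."""
    return (forced_loglik - free_loglik) / n_frames
```

For example, `pdp_score(["k", "ae", "t"], ["k", "ae", "t"], flat_scores)` scores a perfect match, while substituting or dropping a phone lowers the score. An utterance whose score falls below a chosen threshold would be flagged as a likely mismatch with its prompt; the threshold itself would have to be tuned on held-out data.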

Index Terms: speech corpora, corpus validation, goodness of pronunciation, phone-based dynamic programming scores

Bibliographic reference. Davel, Marelie H. / Heerden, Charl J. van / Barnard, Etienne (2012): "Validating smartphone-collected speech corpora", in SLTU-2012, 68-75.