12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Quality Assessment of Crowdsourcing Transcriptions for African Languages

Hadrien Gelas (1), Solomon Teferra Abate (2), Laurent Besacier (2), François Pellegrino (1)

(1) DDL (UMR 5596), France
(2) LIG (UMR 5217), France

We evaluate the quality of speech transcriptions acquired by crowdsourcing for developing ASR acoustic models (AMs) for under-resourced languages. We developed AMs for Swahili and Amharic using reference (REF) transcriptions and transcriptions obtained through crowdsourcing (TRK). Although the Amharic transcription task took much longer to complete than the Swahili one, the speech recognition systems built from REF and TRK transcriptions have very similar word error rates (40.1 vs 39.6 for Amharic and 38.0 vs 38.5 for Swahili). Moreover, the character-level disagreement rates between REF and TRK are only 3.3% and 6.1% for Amharic and Swahili, respectively. We conclude that it is possible to acquire quality transcriptions from the crowd for under-resourced languages using Amazon's Mechanical Turk. Given this potential, we also raise some legal and ethical issues to consider.
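The abstract does not spell out how the word error rate or the character-level disagreement rate is computed. A minimal sketch, assuming both are the usual Levenshtein (edit) distance between two transcriptions, normalized by the length of the reference and expressed as a percentage, could look as follows. The function names and the example strings are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions and substitutions each cost 1.
    Works on strings (characters) or lists (e.g. words).
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance as a percentage of the reference length (assumed definition)."""
    if len(ref_units) == 0:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def char_disagreement_rate(ref_text, trk_text):
    """Character-level disagreement between a REF and a TRK transcription."""
    return error_rate(list(ref_text), list(trk_text))


def word_error_rate(ref_text, hyp_text):
    """Word-level error rate, splitting on whitespace."""
    return error_rate(ref_text.split(), hyp_text.split())


if __name__ == "__main__":
    # Hypothetical transcription pair, for illustration only.
    ref = "habari ya asubuhi"
    trk = "habari za asubuhi"
    print(f"character disagreement: {char_disagreement_rate(ref, trk):.1f}%")
    print(f"word error rate: {word_error_rate(ref, trk):.1f}%")
```

Under this assumed definition, a single wrong character costs little at the character level but counts as a whole wrong word at the word level, which is one reason the two kinds of figures in the abstract are on different scales.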

Bibliographic reference. Gelas, Hadrien / Abate, Solomon Teferra / Besacier, Laurent / Pellegrino, François (2011): "Quality assessment of crowdsourcing transcriptions for African languages", In INTERSPEECH-2011, 3065-3068.