12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

A Transcription Task for Crowdsourcing with Automatic Quality Control

Chia-ying Lee, James Glass


In this paper, we propose a two-stage transcription task design for crowdsourcing, with an automatic quality control mechanism embedded in each stage. In the first stage, a support vector machine (SVM) classifier quickly filters out poor-quality transcripts based on acoustic cues and language patterns in the transcript. In the second stage, word-level confidence scores are used to estimate transcription quality and provide instantaneous feedback to the transcriber. The proposed design was evaluated using Amazon Mechanical Turk (MTurk) and tested on seven hours of academic lecture speech, which is typically conversational in nature and contains technical material. Compared to baseline transcripts, which were also collected from MTurk using a ROVER-based method, the new method produced higher-quality transcripts while requiring less transcriber effort.
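
The second-stage idea, turning word-level confidence scores into a transcript-level quality estimate with immediate feedback, can be illustrated with a minimal sketch. The abstract does not say how the scores are combined or what feedback is shown, so the simple averaging, the function names, and the threshold values below are all assumptions made for illustration, not the authors' method.

```python
from statistics import mean


def estimate_quality(word_confidences):
    """Combine word-level confidence scores into one transcript-level score.

    Plain averaging is an assumption; the paper may combine scores differently.
    """
    if not word_confidences:
        return 0.0
    return mean(word_confidences)


def low_confidence_words(words, word_confidences, word_threshold=0.5):
    """Return the words whose confidence falls below a hypothetical threshold."""
    return [w for w, c in zip(words, word_confidences) if c < word_threshold]


def feedback(word_confidences, accept_threshold=0.7):
    """Map the estimated quality to a feedback label for the transcriber.

    The labels and the 0.7 threshold are hypothetical.
    """
    if estimate_quality(word_confidences) >= accept_threshold:
        return "accept"
    return "revise"
```

In such a setup, a transcriber whose submission is labeled "revise" could be shown the output of low_confidence_words so that attention goes to the segments most likely to contain errors.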

Bibliographic reference. Lee, Chia-ying / Glass, James (2011): "A transcription task for crowdsourcing with automatic quality control", in INTERSPEECH-2011, 3041-3044.