8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Using Quick Transcriptions to Improve Conversational Speech Models

Owen Kimball, Chia-lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul

BBN Technologies, USA

Large amounts of training data may prove critical to attaining very low error rates in conversational speech recognition. Recent collection efforts by the LDC[1] have produced a large corpus of such data, but to be useful, it must be transcribed. Historically, the cost of transcribing conversational speech has been very high, leading us to consider quick transcription methods that are significantly faster and less expensive than traditional methods. We describe the conventions used in transcription and an automatic utterance segmentation algorithm that provides the necessary timing information. Experiments with models trained on a 20-hour set demonstrate that quick transcription works as well as careful transcription for model training, even though the quick transcripts are produced roughly eight times as fast. We also show that when added to a large corpus of carefully transcribed data, quickly transcribed data gives significant improvements in a state-of-the-art ASR system.

Bibliographic reference. Kimball, Owen / Kao, Chia-lin / Iyer, Rukmini / Arvizo, Teodoro / Makhoul, John (2004): "Using quick transcriptions to improve conversational speech models", in INTERSPEECH-2004, 2265-2268.