5th International Conference on Spoken Language Processing
The SWITCHBOARD (SWB) corpus is one of the most important benchmarks for large vocabulary conversational speech recognition (LVCSR). The high error rates on SWB are largely attributable to an acoustic model mismatch, the high frequency of poorly articulated monosyllabic words, and large variations in pronunciation. Improving the quality of the segmentations and transcriptions of the training data is therefore imperative for better acoustic modeling. By adapting existing acoustic models to only a small subset of such improved transcriptions, we have achieved a 2% absolute improvement in recognition performance.
Bibliographic reference. Deshmukh, Neeraj / Ganapathiraju, Aravind / Gleeson, Andi / Hamaker, Jonathan / Picone, Joseph (1998): "Resegmentation of SWITCHBOARD", In ICSLP-1998, paper 0685.