Interspeech'2005 - Eurospeech
This paper describes the BBN English Broadcast News transcription system developed for the EARS Rich Transcription 2004 (RT04) evaluation. Compared with the BBN RT03 system, we achieved a relative reduction in word error rate of around 22% on all EARS BN development test sets. The biggest contribution to the improvement came from additional acoustic training data acquired through light supervision applied to thousands of hours of found data. Better audio segmentation, obtained by using an online speaker clustering algorithm and by chopping speaker turns into moderately long utterances, also contributed substantially. Other contributions, each of modest size but significant in combination, include using discriminative training for all acoustic models, using word duration as an additional knowledge source during N-best rescoring, and using an updated lexicon and updated language models.
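For readers unfamiliar with the metric, a relative reduction in word error rate is the drop in WER divided by the baseline WER. The following is a minimal sketch of that arithmetic only; the function name and the WER values are hypothetical, chosen for illustration, and are not results reported in the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs in percent, used only to illustrate the calculation;
# they are NOT figures from the paper.
baseline = 15.0
improved = 11.7

reduction = relative_wer_reduction(baseline, improved)
print(f"Relative WER reduction: {reduction:.0%}")
```

Note that a relative reduction differs from an absolute one: in this illustrative case the absolute drop is 3.3 percentage points, while the relative drop is measured against the baseline.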
Bibliographic reference. Nguyen, Long / Xiang, Bing / Afify, Mohamed / Abdou, Sherif / Matsoukas, Spyros / Schwartz, Richard / Makhoul, John (2005): "The BBN RT04 English broadcast news transcription system", In INTERSPEECH-2005, 1673-1676.