Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Development of the 1998 OGI-FONIX Broadcast News Transcription System

Xintian Wu, Yonghong Yan

Oregon Graduate Institute of Science and Technology, Portland, OR, USA

Speech recognition systems generally require the training environment to match the decoding environment, and any mismatch between the two may degrade performance. This paper improves the performance of a speech recognition system by compensating for mismatches between training and decoding. The baseline system [1][2] is a multiple-pass decoding system capable of transcribing broadcast news, which achieved a 30.5% word error rate on the 1997 DARPA HUB4E test set. Three approaches were investigated: (1) deleting long silences in both training and decoding utterances; (2) enlarging the second-pass decoding dictionary; (3) merging utterance fragments into complete sentences. These approaches yielded absolute error reductions of 2.8%, 0.3%, and 2.3% on the 1997 test set, respectively. The combined approach achieved more than 4% absolute error reduction. On the official 1998 DARPA HUB4E evaluation, the resulting system achieved a 27.9% word error rate on the 97 part of the evaluation data and 23.6% on the 98 part.
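For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words; an absolute error reduction is the difference between two such rates. The following is a minimal sketch of that standard definition, not the scoring tool used in the evaluation. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split, with no
    text normalization of the kind an official scoring tool would apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

As a usage example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words, giving 0.5.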


Bibliographic reference.  Wu, Xintian / Yan, Yonghong (1999): "Development of the 1998 OGI-FONIX broadcast news transcription system", In EUROSPEECH'99, 683-686.