This paper describes a number of recent improvements to the HTK Broadcast News Transcription System. Changes to the system include the use of more acoustic training data; cluster-based variance normalisation and vocal tract length normalisation; interpolated language models; and enhanced adaptation using a full variance transform. These changes produce a reduction in word error rate of 13%. A simplified version of the system has also been constructed that runs in less than 10 times real-time and gives an error rate 2.3% absolute higher than that of the 300xRT full system.
Cite as: Woodland, P.C., Odell, J.J., Hain, T., Moore, G.L., Niesler, T.R., Tuerk, A., Whittaker, E.W.D. (1999) Improvements in accuracy and speed in the HTK broadcast news transcription system. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1043-1046, doi: 10.21437/Eurospeech.1999-170
@inproceedings{woodland99_eurospeech,
  author={P. C. Woodland and J. J. Odell and T. Hain and G. L. Moore and T. R. Niesler and Andreas Tuerk and E. W. D. Whittaker},
  title={{Improvements in accuracy and speed in the HTK broadcast news transcription system}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1043--1046},
  doi={10.21437/Eurospeech.1999-170}
}