In this paper we define an acoustic confidence measure based on the estimates of local posterior probabilities produced by an HMM/ANN large-vocabulary continuous speech recognition system. We use this measure to segment continuous audio into regions where it is and is not appropriate to expend recognition effort. The segmentation is computationally inexpensive and provides reductions in both overall word error rate and decoding time. The technique is evaluated using material from the Broadcast News corpus.
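The abstract does not say which confidence measure is used or how it is thresholded, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the per-frame entropy of the phone posterior distribution as the confidence score (low entropy is treated as high confidence), a moving average for smoothing, and a fixed threshold. The function names `frame_entropy`, `smooth`, and `segment`, and the parameters `threshold` and `half_window`, are hypothetical.

```python
import math


def frame_entropy(posteriors):
    """Entropy (in nats) of one frame's posterior distribution over phone classes."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def smooth(values, half_window):
    """Moving average over a window of up to 2 * half_window + 1 frames."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def segment(frames, threshold, half_window=1):
    """Group frames into (start, end, accept) runs, with end exclusive.

    accept is True where the smoothed entropy is at or below the threshold,
    which this sketch assumes marks audio worth passing to the decoder.
    """
    scores = smooth([frame_entropy(f) for f in frames], half_window)
    segments = []
    for i, score in enumerate(scores):
        accept = score <= threshold
        if segments and segments[-1][2] == accept:
            # Extend the current run when the decision is unchanged.
            segments[-1] = (segments[-1][0], i + 1, accept)
        else:
            segments.append((i, i + 1, accept))
    return segments


# Toy input: three sharply peaked frames followed by three uniform frames.
confident = [0.98, 0.01, 0.01]
uncertain = [1 / 3, 1 / 3, 1 / 3]
frames = [confident] * 3 + [uncertain] * 3
print(segment(frames, threshold=0.5, half_window=0))
```

In a setup like this, only the runs flagged as accepted would be decoded, which is where any saving in decoding time would come from. The threshold and window length would need tuning on held-out data.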
Cite as: Barker, J., Williams, G., Renals, S. (1998) Acoustic confidence measures for segmenting broadcast news. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0643, doi: 10.21437/ICSLP.1998-605
@inproceedings{barker98_icslp,
  author={Jon Barker and Gethin Williams and Steve Renals},
  title={{Acoustic confidence measures for segmenting broadcast news}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0643},
  doi={10.21437/ICSLP.1998-605}
}