Confidence-based thresholding plays an important role in practical speech recognition applications. Most previous work has focused on directly improving confidence estimation within the recognition engine. A complementary approach, which does not require access to recognizer internals, is to optimize the confidence threshold settings. This paper describes a general multi-confidence thresholding algorithm that automatically learns different confidence thresholds for different utterances, based on discrete or continuous features associated with each utterance. The algorithm can be applied to any speech recognition engine that outputs a confidence score. A learned multi-threshold setting is guaranteed to perform at least as well as a baseline single-threshold system on the training data. A significant improvement in overall accuracy can often be obtained on test data, as demonstrated by experiments on two real-world applications.
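The abstract does not spell out the learning procedure, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a single discrete feature per utterance, an accept-if-confidence-is-at-or-above-threshold rule, and a score that counts correct accepts plus correct rejects. The function names (`score`, `learn_thresholds`, `total_score`) and the toy data are hypothetical. Because the baseline threshold is always among the candidates for every feature value, the learned setting cannot score below the single-threshold baseline on the training data.

```python
from collections import defaultdict

def score(samples, threshold):
    """Count correct decisions for (confidence, is_correct) pairs:
    accepting a correct hypothesis or rejecting an incorrect one."""
    return sum((conf >= threshold) == is_correct for conf, is_correct in samples)

def learn_thresholds(data, baseline):
    """Learn one threshold per discrete feature value (assumed setup).

    data: iterable of (feature_value, confidence, is_correct) triples.
    baseline: the single threshold of the baseline system.
    """
    groups = defaultdict(list)
    for feature, conf, is_correct in data:
        groups[feature].append((conf, is_correct))

    thresholds = {}
    for feature, samples in groups.items():
        # Candidates: the baseline, every observed confidence, and
        # "reject everything" (infinity). Including the baseline is what
        # guarantees no loss on the training data.
        candidates = {baseline, float("inf")} | {conf for conf, _ in samples}
        # Prefer the highest score; break ties toward the baseline,
        # then toward the lower threshold, so the result is deterministic.
        thresholds[feature] = max(
            candidates,
            key=lambda t: (score(samples, t), t == baseline, -t),
        )
    return thresholds

def total_score(data, thresholds, default):
    """Score a whole data set, falling back to `default` for unseen features."""
    return sum(
        (conf >= thresholds.get(feature, default)) == is_correct
        for feature, conf, is_correct in data
    )

# Made-up toy data, for illustration only: (feature, confidence, is_correct).
TOY_DATA = [
    ("short", 0.90, True), ("short", 0.80, False),
    ("short", 0.70, False), ("short", 0.95, True),
    ("long", 0.40, True), ("long", 0.50, True),
    ("long", 0.30, False), ("long", 0.60, True),
]
```

Calling `learn_thresholds(TOY_DATA, baseline=0.6)` yields a separate threshold for each feature value, and `total_score` can then compare the learned setting against the baseline on the same data. Continuous features, which the abstract also mentions, are outside the scope of this sketch.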
Cite as: Chang, S. (2006) Improving speech recognition accuracy with multi-confidence thresholding. Proc. Interspeech 2006, paper 1346-Wed1CaP.3, doi: 10.21437/Interspeech.2006-450
@inproceedings{chang06b_interspeech,
  author={Shuangyu Chang},
  title={{Improving speech recognition accuracy with multi-confidence thresholding}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1346-Wed1CaP.3},
  doi={10.21437/Interspeech.2006-450}
}