Interactive vocal services are based on speech recognition systems, which must be able to efficiently reject incorrect utterances such as out-of-vocabulary words or noise tokens. One possible approach is to post-process the hypotheses delivered by the recogniser by computing a confidence measure (CM): a recognition hypothesis is rejected if its CM falls below a chosen threshold. This paper presents a new way of computing a CM for a recognition hypothesis, based on the calculation of a likelihood ratio for each acoustic frame of the utterance. Promising results are reported on a large-vocabulary telephone directory task. Significant reductions in error rates are observed compared to a reference system which includes a garbage model but applies no post-processing to the recognised words.
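The rejection rule described above can be illustrated with a minimal sketch. The abstract does not say how the frame-level likelihood ratios are combined into a single CM, so the arithmetic mean of per-frame log-likelihood ratios used here is an assumption, as are the function names and the idea of an "alternative" model that supplies the denominator scores. This is not presented as the authors' formulation.

```python
from typing import Sequence


def frame_log_likelihood_ratios(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> list[float]:
    """Per-frame log-likelihood ratio: hypothesis score minus alternative score.

    hyp_loglik: log-likelihood of each frame under the recognised hypothesis.
    alt_loglik: log-likelihood of each frame under an alternative model
                (hypothetical; the abstract does not specify this model).
    """
    if len(hyp_loglik) != len(alt_loglik):
        raise ValueError("both score sequences must cover the same frames")
    return [h - a for h, a in zip(hyp_loglik, alt_loglik)]


def confidence_measure(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> float:
    """Assumed CM: mean of the frame-level log-likelihood ratios."""
    ratios = frame_log_likelihood_ratios(hyp_loglik, alt_loglik)
    if not ratios:
        raise ValueError("utterance has no frames")
    return sum(ratios) / len(ratios)


def accept_hypothesis(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float], threshold: float
) -> bool:
    """Reject the hypothesis when its CM is below the threshold."""
    return confidence_measure(hyp_loglik, alt_loglik) >= threshold


if __name__ == "__main__":
    hyp = [-1.0, -2.0, -3.0]  # made-up frame scores, for illustration only
    alt = [-2.0, -2.0, -5.0]
    print(confidence_measure(hyp, alt))
    print(accept_hypothesis(hyp, alt, threshold=0.5))
```

Raising the threshold rejects more incorrect hypotheses (out-of-vocabulary words, noise) at the cost of rejecting more correct ones, so in practice it would be tuned on held-out data.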
Cite as: Moreau, N., Jouvet, D. (1999) Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 291-294, doi: 10.21437/Eurospeech.1999-76
@inproceedings{moreau99_eurospeech,
  author={Nicolas Moreau and Denis Jouvet},
  title={{Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={291--294},
  doi={10.21437/Eurospeech.1999-76}
}