Sixth European Conference on Speech Communication and Technology
Interactive vocal services are based on speech recognition systems which must be able to efficiently reject incorrect utterances such as out-of-vocabulary words or noise tokens. A possible approach is to post-process the hypotheses delivered by the recogniser by computing a confidence measure (CM). A recognition hypothesis is rejected if its CM is below a chosen threshold. This paper presents a new way of computing a CM on a recognition hypothesis, based on the calculation of a likelihood ratio for each acoustic frame of the utterance. Promising results are reported on a large-vocabulary telephone directory task. Significant reductions in the error rates are observed, compared to a reference system which includes a garbage model, with no post-processing of the recognised words.
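As an illustration of the general idea only, the following minimal sketch computes a confidence measure from per-frame log-likelihood ratios and applies a rejection threshold. The abstract does not say how the frame-level ratios are combined or which alternative model is used. The averaging step, the function names, and the input values below are all assumptions made for this example, not the authors' method.

```python
from typing import List, Sequence


def frame_log_likelihood_ratios(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> List[float]:
    """Per-frame log-likelihood ratio: log p(frame | hypothesis) - log p(frame | alternative).

    Both inputs are hypothetical per-frame log-likelihoods, one value per acoustic frame.
    """
    if len(hyp_loglik) != len(alt_loglik):
        raise ValueError("both sequences must cover the same frames")
    return [h - a for h, a in zip(hyp_loglik, alt_loglik)]


def confidence_measure(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> float:
    """Combine frame-level ratios into one score.

    Assumption: a plain average over frames; the paper may use another combination.
    """
    ratios = frame_log_likelihood_ratios(hyp_loglik, alt_loglik)
    if not ratios:
        raise ValueError("utterance has no frames")
    return sum(ratios) / len(ratios)


def is_rejected(cm: float, threshold: float) -> bool:
    """A hypothesis is rejected if its CM is below the chosen threshold."""
    return cm < threshold


# Made-up log-likelihoods for a three-frame utterance.
hyp = [-2.0, -1.5, -3.0]
alt = [-2.5, -2.5, -2.0]
cm = confidence_measure(hyp, alt)
print(cm, is_rejected(cm, threshold=0.0))
```

Raising the threshold rejects more incorrect hypotheses at the cost of also rejecting more correct ones, so in practice it would be tuned on held-out data.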
Bibliographic reference. Moreau, Nicolas / Jouvet, Denis (1999): "Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data", In EUROSPEECH'99, 291-294.