Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Use of a Confidence Measure Based on Frame Level Likelihood Ratios for the Rejection of Incorrect Data

Nicolas Moreau, Denis Jouvet

France Télécom, CNET/DIH/DIPS, Lannion, France

Interactive vocal services are based on speech recognition systems which must be able to efficiently reject incorrect utterances such as out-of-vocabulary or noise tokens. A possible approach is a post-processing of the hypotheses delivered by the recogniser, based on the computation of a confidence measure (CM). A recognition hypothesis is rejected if its CM is below a chosen threshold. This paper presents a new way of computing a CM on a recognition hypothesis, based on the calculation of a likelihood ratio for each acoustic frame of the utterance. Promising results are reported on a large-vocabulary telephone directory task. Significant reductions in the error rates are observed, compared to a reference system which includes a garbage model, with no post-processing of the recognised words.
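The rejection rule described above can be illustrated with a minimal sketch. The abstract does not say how the frame-level likelihood ratios are combined into a single CM, nor which alternative model supplies the denominator of each ratio. The sketch below therefore assumes a plain average of per-frame log-likelihood ratios and takes both sets of frame log-likelihoods as given inputs. The function names and the example values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def frame_log_likelihood_ratios(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> list[float]:
    """Per-frame log-likelihood ratio: hypothesis model minus alternative model.

    Both inputs hold one log-likelihood per acoustic frame of the utterance.
    The choice of alternative model is an assumption left to the caller.
    """
    if len(hyp_loglik) != len(alt_loglik):
        raise ValueError("both sequences must cover the same frames")
    return [h - a for h, a in zip(hyp_loglik, alt_loglik)]


def confidence_measure(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float]
) -> float:
    """Assumed CM: the mean of the frame-level log-likelihood ratios."""
    ratios = frame_log_likelihood_ratios(hyp_loglik, alt_loglik)
    if not ratios:
        raise ValueError("utterance has no frames")
    return sum(ratios) / len(ratios)


def accept_hypothesis(
    hyp_loglik: Sequence[float], alt_loglik: Sequence[float], threshold: float
) -> bool:
    """Reject (return False) when the CM is below the chosen threshold."""
    return confidence_measure(hyp_loglik, alt_loglik) >= threshold


# Hypothetical frame scores for a three-frame utterance.
hyp = [-1.0, -2.0, -3.0]
alt = [-2.0, -2.0, -5.0]
cm = confidence_measure(hyp, alt)
accepted = accept_hypothesis(hyp, alt, threshold=0.5)
```

Raising the threshold rejects more incorrect utterances at the cost of rejecting more correct ones, so in practice the threshold would be tuned on held-out data for the target task.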


Bibliographic reference.  Moreau, Nicolas / Jouvet, Denis (1999): "Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data", In EUROSPEECH'99, 291-294.