5th International Conference on Spoken Language Processing
In this paper we define and investigate a set of confidence measures based on hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) acoustic models. These measures use the neural network to estimate local phone posterior probabilities, which are then combined and normalized in different ways. Experimental results show that an appropriate duration normalization is very important for obtaining good estimates of the phone and word confidences. The different measures are evaluated at the phone and word levels on both isolated word (PHONEBOOK) and continuous speech (BREF) recognition tasks. We show that one of these confidence measures is well suited for utterance verification and that, as one would expect, confidence measures at the word level perform better than those at the phone level. Finally, using the resulting approach on PHONEBOOK to rescore the N-best list is shown to yield a 34% decrease in word error rate.
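As an illustration of why duration normalization matters, the following Python sketch computes a confidence score from per-frame phone posteriors. It is only one plausible reading of the abstract, not the paper's definitions: the function names, the use of log posteriors, and the averaging scheme (mean over frames within a phone, then mean over phones within a word) are all assumptions made for this example.

```python
import math

def phone_confidence(frame_posteriors):
    """Duration-normalized phone score (assumed form): mean log posterior
    over the frames aligned to one phone."""
    if not frame_posteriors:
        raise ValueError("a phone segment needs at least one frame")
    return sum(math.log(p) for p in frame_posteriors) / len(frame_posteriors)

def word_confidence(phone_segments):
    """Word score (assumed form): mean of the duration-normalized
    phone scores of the phones that make up the word."""
    if not phone_segments:
        raise ValueError("a word needs at least one phone segment")
    scores = [phone_confidence(seg) for seg in phone_segments]
    return sum(scores) / len(scores)

def unnormalized_word_score(phone_segments):
    """Baseline for comparison: plain sum of log posteriors over all frames."""
    return sum(math.log(p) for seg in phone_segments for p in seg)

# Hypothetical data: two words whose frames all have posterior 0.9,
# differing only in how many frames each phone lasts.
short_word = [[0.9, 0.9], [0.9]]
long_word = [[0.9] * 10, [0.9] * 8]

same_normalized = math.isclose(word_confidence(short_word),
                               word_confidence(long_word))
long_penalized = unnormalized_word_score(long_word) < unnormalized_word_score(short_word)
```

In this toy case the normalized scores of the two words coincide, while the unnormalized sum penalizes the longer word purely for its duration. That bias is the kind of effect a duration normalization is meant to remove.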
Bibliographic reference. Bernardis, Giulia / Bourlard, Hervé (1998): "Improving posterior based confidence measures in hybrid HMM/ANN speech recognition systems", In ICSLP-1998, paper 0318.