In the context of command-and-control applications, we use confidence measures to classify utterances into two categories: in-vocabulary utterances that are recognized correctly, and all other utterances, i.e. out-of-vocabulary (OOV) and misrecognized ones. We investigate the classification error rate (CER) of several classes of confidence measures and transformations on a database of 3345 utterances from 50 male and female speakers, employing both data-independent and data-dependent measures. The transformations we investigated include mapping to single confidence measures, LDA-transformed measures, and other linear combinations of these measures. These combinations are computed by neural networks trained with Bayes-optimal and with Gardner-Derrida-optimal criteria. Compared to a recognition system without confidence measures, selecting (various combinations of) confidence measures together with suitable neural network architectures and training methods progressively improves the CER from 16.7% to 6.6% (a 60% relative reduction). Furthermore, a linear perceptron generalizes better than a non-linear backpropagation network.
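To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that an utterance is accepted when a linear combination of its confidence measures reaches a threshold, and that the CER is the fraction of utterances whose accept/reject decision disagrees with the true category. The function names, weights, bias, and toy data are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def combined_confidence(measures: Sequence[float],
                        weights: Sequence[float],
                        bias: float = 0.0) -> float:
    """Linear combination of confidence measures (hypothetical weights)."""
    return sum(w * m for w, m in zip(weights, measures)) + bias


def classification_error_rate(scores: Sequence[float],
                              labels: Sequence[bool],
                              threshold: float = 0.0) -> float:
    """Fraction of utterances whose accept/reject decision is wrong.

    labels[i] is True for an in-vocabulary, correctly recognized utterance
    and False for an OOV or misrecognized one.
    """
    errors = sum((s >= threshold) != y for s, y in zip(scores, labels))
    return errors / len(labels)


def relative_change(baseline: float, improved: float) -> float:
    """Relative change of `improved` with respect to `baseline`."""
    return (improved - baseline) / baseline


# Toy data, invented for illustration: (two confidence measures, true category).
utterances = [
    ((0.9, 0.8), True),
    ((0.7, 0.9), True),
    ((0.2, 0.4), False),
    ((0.6, 0.1), False),
    ((0.3, 0.5), True),
]
labels = [label for _, label in utterances]

# Assumed weights and bias; in the paper such parameters are learned.
weights, bias = (1.0, 1.0), -1.0
scores = [combined_confidence(m, weights, bias) for m, _ in utterances]

# Baseline without confidence measures: every utterance is accepted.
baseline_cer = classification_error_rate(scores, labels, threshold=float("-inf"))
cer = classification_error_rate(scores, labels)

print(f"baseline CER = {baseline_cer:.2f}, CER with rejection = {cer:.2f}")
print(f"relative change for 16.7% -> 6.6%: {relative_change(16.7, 6.6):.1%}")
```

Applied to the figures reported in the abstract, `relative_change(16.7, 6.6)` comes out at roughly -0.60, which matches the stated 60% relative reduction.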
Cite as: Dolfing, J.G.A., Wendemuth, A. (1998) Combination of confidence measures in isolated word recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0481, doi: 10.21437/ICSLP.1998-815
@inproceedings{dolfing98_icslp,
  author={J. G. A. Dolfing and Andreas Wendemuth},
  title={{Combination of confidence measures in isolated word recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0481},
  doi={10.21437/ICSLP.1998-815}
}