This paper presents a novel unsupervised calibration framework for word confidence measures in automatic speech recognition. The framework improves the quality of confidence measures in situations where the training of parametric models is hindered by a lack of human-labeled in-domain data. The proposed method calibrates confidence scores by utilizing recognition results stored in deployed systems rather than human-labeled data. To stabilize correct/incorrect decisions for recognized words, the confidence score of a target word is calibrated based on the confidence scores of identical words, called "examples," found in the stored recognition results. The confidence scores of the examples are weighted according to the importance of each example, and the calibrated confidence score of the target word is calculated as the importance-weighted average of the example scores. The importance of each example is determined by the context similarity between the target word and the example. Experiments confirm that the proposed calibration method improves the correct/incorrect decisions of recognized words compared to word posterior probabilities and a conventional calibration method on an unknown-domain call center task.
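The importance-weighted averaging described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and cosine similarity over bag-of-words contexts is assumed here as a stand-in for the paper's actual context-similarity measure.

```python
import math
from collections import Counter

def context_similarity(ctx_a, ctx_b):
    """Cosine similarity between bag-of-words contexts.
    Hypothetical choice; the paper defines its own similarity measure."""
    a, b = Counter(ctx_a), Counter(ctx_b)
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def calibrate(target_word, target_ctx, stored_results):
    """Calibrate the target word's confidence as the importance-weighted
    average of the confidence scores of identical words ("examples")
    found in stored recognition results."""
    examples = [(score, ctx) for word, score, ctx in stored_results
                if word == target_word]
    if not examples:
        return None  # no examples of this word in the stored results
    weights = [context_similarity(target_ctx, ctx) for _, ctx in examples]
    total = sum(weights)
    if total == 0.0:
        # fall back to an unweighted average when no context matches
        return sum(s for s, _ in examples) / len(examples)
    return sum(w * s for (s, _), w in zip(examples, weights)) / total

# Usage: stored results are (word, confidence, context-words) triples.
stored = [("call", 0.9, ["please", "center"]),
          ("call", 0.4, ["bird", "sky"]),
          ("center", 0.7, ["call", "agent"])]
score = calibrate("call", ["please", "center"], stored)
```

Here the example whose context matches the target context dominates the weighted average, pulling the calibrated score toward its confidence.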
Bibliographic reference. Asami, Taichi / Kobashikawa, Satoshi / Masataki, Hirokazu / Yoshioka, Osamu / Takahashi, Satoshi (2013): "Unsupervised confidence calibration using examples of recognized words and their contexts", In INTERSPEECH-2013, 2217-2221.