Selecting well-recognized transcripts is critical if information retrieval systems are to extract business intelligence from massive spoken document databases. To achieve this goal, we target spoken document confidence measures that represent the recognition rate of each document. We focus on the incoherent word occurrences that span several utterances in poorly recognized transcripts of spoken documents. The proposed method uses contextual coherence as a measure of spoken document confidence. The contextual coherence is formulated as the mean of pointwise mutual information (PMI). We also propose a smoothing method for PMI that deals with the data sparseness problem. Compared to the conventional method, our smoothing technique improves the correlation coefficient between spoken document confidence scores and recognition rates from 0.573 to 0.672. Moreover, an even higher correlation coefficient, 0.710, is achieved by combining the contextual-based and decoder-based confidence measures.
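As a rough illustration of the idea of scoring a transcript by the mean PMI of its word pairs, the following minimal sketch may help. It is not the authors' implementation: the abstract does not specify how co-occurrence is counted or how PMI is smoothed, so document-level co-occurrence and simple additive (add-k) smoothing are used here purely as assumptions, and the names `train_counts`, `smoothed_pmi`, `coherence`, and the parameter `k` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def train_counts(documents):
    """Count document frequencies of words and of unordered word pairs.

    Assumption: two words co-occur if they appear in the same training document.
    """
    word_df = Counter()
    pair_df = Counter()
    for doc in documents:
        vocab = sorted(set(doc))
        word_df.update(vocab)
        # Pairs are stored as (a, b) with a < b because vocab is sorted.
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df, len(documents)


def smoothed_pmi(w1, w2, word_df, pair_df, n_docs, k=0.5):
    """PMI with add-k smoothing (an assumed stand-in for the paper's smoothing)."""
    a, b = sorted((w1, w2))
    # A word either occurs in a document or not: 2 outcomes.
    p_a = (word_df[a] + k) / (n_docs + 2 * k)
    p_b = (word_df[b] + k) / (n_docs + 2 * k)
    # A word pair has 4 joint occurrence outcomes.
    p_ab = (pair_df[(a, b)] + k) / (n_docs + 4 * k)
    return math.log(p_ab / (p_a * p_b))


def coherence(transcript, word_df, pair_df, n_docs, k=0.5):
    """Mean smoothed PMI over all pairs of distinct words in a transcript."""
    vocab = sorted(set(transcript))
    pairs = list(combinations(vocab, 2))
    if not pairs:
        return 0.0  # fewer than two distinct words: no pair to score
    total = sum(smoothed_pmi(a, b, word_df, pair_df, n_docs, k) for a, b in pairs)
    return total / len(pairs)


# Toy training corpus (made up for illustration only).
corpus = [
    ["stock", "market", "price"],
    ["stock", "market", "trade"],
    ["cat", "dog", "pet"],
    ["cat", "dog", "food"],
]
word_df, pair_df, n_docs = train_counts(corpus)

coherent_score = coherence(["stock", "market"], word_df, pair_df, n_docs)
incoherent_score = coherence(["stock", "dog"], word_df, pair_df, n_docs)
```

In this toy setup, a transcript whose words tend to co-occur in the training corpus receives a higher score than one mixing unrelated words, which is the intuition behind using coherence as a proxy for recognition quality. Without smoothing, the unseen pair would yield a PMI of negative infinity, which is the sparseness issue that any smoothing scheme has to address.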
Bibliographic reference. Asami, Taichi / Nomoto, Narichika / Kobashikawa, Satoshi / Yamaguchi, Yoshikazu / Masataki, Hirokazu / Takahashi, Satoshi (2011): "Spoken document confidence estimation using contextual coherence", In INTERSPEECH-2011, 1961-1964.