In this paper, we propose a novel confidence measure to improve the performance of spoken term detection (STD). The proposed confidence measure is based on the context consistency between a hypothesized word and its context in the word lattice. When calculating the context consistency of a hypothesized word, the proposed measure considers not only the semantic similarity between words but also the uncertainty of the context. To measure the uncertainty of the context, we employ the word occurrence probability, which is obtained by combining the overlapping hypotheses in the word posterior lattice. We also use two effective measures of semantic similarity to obtain a more accurate context consistency for the confidence measure. Experiments conducted on the Hub-4NE Mandarin database show that the proposed confidence measure improves on a confidence measure that ignores the word occurrence probability of the context words.
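To make the idea concrete, the following is a minimal Python sketch of how a context consistency score could weight semantic similarity by the occurrence probability of each context word. It is not the authors' formulation: the `Hypothesis` record, the rule of summing the posteriors of time-overlapping hypotheses of the same word (capped at 1), the time `window`, the exclusion of hypotheses that overlap the target, and the weighted-average form are all assumptions made for illustration, and `similarity` stands in for whichever semantic similarity measure is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One word arc in a word posterior lattice (hypothetical structure)."""
    word: str
    start: float      # start time in seconds
    end: float        # end time in seconds
    posterior: float  # posterior probability of this arc


def overlaps(a, b):
    """True if the two hypotheses share some time span."""
    return a.start < b.end and b.start < a.end


def occurrence_probability(target, lattice):
    """Assumed rule: sum the posteriors of same-word hypotheses that
    overlap the target in time, capped at 1.0."""
    total = sum(h.posterior for h in lattice
                if h.word == target.word and overlaps(h, target))
    return min(total, 1.0)


def context_consistency(target, lattice, similarity, window=5.0):
    """Assumed form: similarity to each context word, averaged with the
    context word's occurrence probability as the weight."""
    best = {}  # context word -> highest occurrence probability seen
    for h in lattice:
        # Competing hypotheses for the target's time span are not context.
        if h.word == target.word or overlaps(h, target):
            continue
        # Ignore hypotheses that lie outside the time window.
        if h.start > target.end + window or h.end < target.start - window:
            continue
        p = occurrence_probability(h, lattice)
        best[h.word] = max(best.get(h.word, 0.0), p)
    weight = sum(best.values())
    if weight == 0.0:
        return 0.0  # no usable context
    return sum(p * similarity(target.word, w) for w, p in best.items()) / weight
```

Under these assumptions, a context word that the recognizer is unsure about contributes less to the score than one that is almost certainly present, whereas a measure that ignores the occurrence probability would treat both equally.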
Index Terms: confidence measure, spoken term detection, context consistency, semantic similarity, word occurrence probability
Bibliographic reference. Li, Haiyang / Han, Jiqing / Zheng, Tieran / Zheng, Guibin (2012): "A novel confidence measure based on context consistency for spoken term detection", In INTERSPEECH-2012, 2430-2433.