In the spoken term detection (STD) task, a confidence measure is used to assess the reliability of detected terms. The most widely used confidence measure in STD is based on the normalized lattice posterior probability. In this paper, several distinct confidence estimation methods are investigated to improve on this baseline lattice confidence: the acoustic confidence is estimated by a hybrid Hidden Markov Model / Artificial Neural Network (HMM/ANN), and the duration confidence by a phonetic duration model. These two confidences and the lattice confidence are linearly interpolated to produce a more reliable confidence measure. The experimental results show the feasibility and effectiveness of this combination approach. The proposed method substantially improves STD performance, yielding a 4.8% to 11.1% relative equal error rate (EER) reduction on three evaluation sets compared with the baseline lattice confidence.
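The linear interpolation of the three confidences can be illustrated with a minimal sketch. The function name, the default weights, and the assumptions that each score lies in [0, 1] and that the weights sum to 1 are hypothetical choices made for illustration; the abstract does not state the weights or how they are tuned.

```python
def fuse_confidences(lattice, acoustic, duration, weights=(0.6, 0.2, 0.2)):
    """Linearly interpolate three confidence scores for one detected term.

    The weights are illustrative placeholders, not values from the paper.
    They are assumed to sum to 1 so the fused score stays in [0, 1]
    whenever the three inputs do.
    """
    w_lat, w_ac, w_dur = weights
    if abs(w_lat + w_ac + w_dur - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_lat * lattice + w_ac * acoustic + w_dur * duration


# Example: a detection with a high lattice posterior but weaker
# acoustic and duration evidence.
fused = fuse_confidences(lattice=0.8, acoustic=0.5, duration=0.6)
```

In practice the interpolation weights would presumably be tuned on held-out data, for example by minimizing EER on a development set.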
Bibliographic reference. Ma, Zejun / Wang, Xiaorui / Xu, Bo (2011): "Fusing multiple confidence measures for Chinese spoken term detection", In INTERSPEECH-2011, 1925-1928.