International Symposium on Chinese Spoken Language Processing (ISCSLP 2002)

Taipei, Taiwan
August 23-24, 2002

Comparison and Combination of Confidence Measures in Isolated Word Recognition

Zhenyu Xiong, Mingxing Xu, Wenhu Wu

Center of Speech Technology, State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, Beijing, China

In this paper, we describe our work on confidence measures for isolated word recognition systems based on hidden Markov models (HMMs). Three kinds of frame-level likelihood ratios are extracted as basic confidence features, and phone-level confidence measures are derived from these features. Word-level confidence measures are derived either from the phone-level confidence features or directly from the frame-level features. These different kinds of word-level confidence measures are compared experimentally on a Chinese name database. The experiments show that confidence measures based on phone-level features are better than those derived directly from frame-level features, and that the kind of frame-level feature based on a filler model outperforms the other two kinds. A Fisher linear discriminant projection and a non-linear backpropagation neural network are then used to combine these different kinds of word-level confidence features. An evaluation on the Chinese name database shows that the non-linear network approach outperforms the Fisher linear approach and improves performance over the baseline, in which only a single kind of word-level confidence feature is used.


Bibliographic reference.  Xiong, Zhenyu / Xu, Mingxing / Wu, Wenhu (2002): "Comparison and combination of confidence measures in isolated word recognition", In ISCSLP 2002, paper 67.