This paper examines Spoken Term Detection (STD) for the Chinese language. We propose using Linear Logistic Regression (LLR) to combine Chinese STD systems built with different decoding units, detection units, features, and phone sets. To address the missing-sample problem in STD system combination, side-information reflecting the reliability of the scores to be fused is used to condition the parameters of the standard LLR model. In addition, a two-stage combination scheme is proposed to overcome the data-sparsity problem. Experimental results show that the proposed methods significantly improve overall detection performance, achieving a relative improvement of 11.3% over the best single system.
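As a rough illustration of the fusion idea, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the side-information enters the model, so the sketch assumes the simplest possible form: a per-system presence flag saying whether that system produced a score for a candidate detection. Missing scores are masked to zero, and each presence flag gets its own weight, so the effective offset of the LLR model depends on which systems responded. The function names (`build_features`, `train_llr`, `fuse`) and the plain gradient-descent training are hypothetical choices made for illustration only.

```python
import numpy as np

def build_features(scores, present):
    """Stack masked scores, presence flags, and a constant bias column.

    scores:  (n, k) array of per-system scores; entries may be NaN when missing.
    present: (n, k) boolean array, True where a system produced a score
             (assumed form of the side-information).
    """
    masked = np.where(present, scores, 0.0)  # a missing score contributes nothing
    bias = np.ones((masked.shape[0], 1))
    return np.hstack([masked, present.astype(float), bias])

def train_llr(features, labels, lr=0.1, steps=2000):
    """Fit LLR weights by batch gradient descent on the mean cross-entropy."""
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ w)))  # posterior of a true hit
        w -= lr * features.T @ (p - labels) / len(labels)
    return w

def fuse(features, w):
    """Return the fused log-odds score for each candidate detection."""
    return features @ w
```

In this sketch, a candidate is accepted when its fused score exceeds a chosen threshold (zero corresponds to a posterior of 0.5). A richer conditioning, such as separate weights per reliability level, would follow the same pattern with more indicator columns.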
Bibliographic reference. Meng, Sha / Zhang, Wei-Qiang / Liu, Jia (2010): "Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression", In INTERSPEECH-2010, 685-688.