8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Constrained Minimization Technique for Topic Identification Using Discriminative Training and Support Vector Machines

Imed Zitouni (1), Minkyu Lee (2), Hui Jiang (3)

(1) IBM T.J. Watson Research Center, USA
(2) Lucent Technologies, USA
(3) York University, Toronto, Canada

This paper describes a constrained minimization approach for combining multiple classifiers to improve classification accuracy. Because an ensemble yields higher accuracy only when the errors of its individual classifiers are largely uncorrelated, we propose a combination strategy in which the accuracy of the combined classifier is a function of the correlation between the classification errors of the individual classifiers. To obtain powerful single classifiers, we investigate several techniques, including support vector machines and the latent semantic indexing (LSI) matrix, a popular vector-space model. We also investigate discriminative training (DT) of the LSI matrix within the constrained minimization approach. DT minimizes the classification error by increasing the score separation between the correct document and competing documents. Experimental evaluation is carried out on a banking call-routing database and on the Switchboard database, with 23 and 67 topics respectively. Results show that the proposed combined classifier outperforms the individual baseline classifiers in accuracy by 44%.
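The abstract does not state the exact objective or constraints, so the following is only an illustrative sketch of one common way to combine classifiers by constrained minimization, not the authors' formulation. It assumes the combination weights are chosen to minimize the quadratic form w^T C w, where C is the second-moment matrix of the individual classifiers' errors on held-out data, subject to the weights summing to one. The function names `combination_weights` and `combine_scores`, the ridge term, and the toy data are all hypothetical.

```python
import numpy as np

def combination_weights(errors, ridge=1e-8):
    """Weights minimizing w^T C w subject to sum(w) == 1.

    errors: (n_samples, n_classifiers) array of per-sample errors
            (for example 0/1 misclassification indicators) on held-out data.
    This is an assumed formulation, not the one used in the paper.
    """
    errors = np.asarray(errors, dtype=float)
    # Second-moment matrix of the errors: off-diagonal entries capture
    # how often two classifiers err on the same samples.
    C = errors.T @ errors / errors.shape[0]
    # Small ridge term (assumption) keeps C invertible.
    C = C + ridge * np.eye(C.shape[0])
    # Closed-form solution of the equality-constrained problem:
    # w is proportional to C^{-1} 1, rescaled to sum to one.
    w = np.linalg.solve(C, np.ones(C.shape[0]))
    return w / w.sum()

def combine_scores(score_list, weights):
    """Weighted sum of per-classifier topic scores, then pick the best topic.

    score_list: list of (n_docs, n_topics) score arrays, one per classifier.
    """
    stacked = np.stack(score_list, axis=0)            # (n_clf, n_docs, n_topics)
    combined = np.tensordot(weights, stacked, axes=1)  # (n_docs, n_topics)
    return combined.argmax(axis=1)

# Toy held-out errors: the two classifiers never err on the same sample,
# and the second one makes larger errors, so it should get less weight.
toy_errors = np.array([[1, 0], [0, 2], [1, 0], [0, 2]])
toy_weights = combination_weights(toy_errors)

# Toy scores for one document over two topics.
scores_a = np.array([[0.9, 0.1]])
scores_b = np.array([[0.2, 0.8]])
toy_topics = combine_scores([scores_a, scores_b], toy_weights)
```

In this sketch, classifiers whose errors overlap heavily produce large off-diagonal entries in C, which reduces the benefit of weighting them together; this mirrors the abstract's point that uncorrelated errors are what make a combination worthwhile.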

Bibliographic reference.  Zitouni, Imed / Lee, Minkyu / Jiang, Hui (2004): "Constrained minimization technique for topic identification using discriminative training and support vector machines", In INTERSPEECH-2004, 181-184.