INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Large Margin Multinomial Mixture Model for Text Categorization

Zhen-Yu Pan, Hui Jiang

York University, Canada

In this paper, we present a novel discriminative training method for multinomial mixture models (MMMs) in text categorization, based on the large margin principle. Under some approximation and relaxation conditions, large margin estimation (LME) of MMMs can be formulated as linear programming (LP) problems, which general-purpose optimization tools can solve efficiently and reliably even for very large models. Text categorization experiments on the standard RCV1 text corpus show that LME of MMMs substantially improves classification accuracy over traditional training based on the EM algorithm. Compared with the EM method, the proposed LME method achieves over 20% relative error reduction on three independent test sets of RCV1.
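To make the quantities in the abstract concrete, the following is a minimal Python sketch of how a multinomial mixture model might score a document and how a classification margin could be computed from those scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy class models, the omission of class priors and of the multinomial coefficient, and the margin definition (true-class log-likelihood minus the best competing class) are all assumptions made here. The LP formulation itself is not reproduced.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mmm_log_likelihood(counts, weights, thetas):
    """Log-likelihood of a bag-of-words document under one class's MMM.

    counts  : dict mapping word index -> count in the document
    weights : mixture weights of the class (assumed to sum to 1)
    thetas  : one word distribution per mixture component
    The multinomial coefficient is dropped; it is the same for every class.
    """
    per_component = [
        math.log(w) + sum(c * math.log(theta[v]) for v, c in counts.items())
        for w, theta in zip(weights, thetas)
    ]
    return log_sum_exp(per_component)

def margin(counts, models, true_label):
    """Assumed margin: true-class score minus the best rival-class score.

    A positive value means the document is classified correctly; large
    margin training aims to push the smallest such values upward.
    """
    scores = {
        label: mmm_log_likelihood(counts, w, t)
        for label, (w, t) in models.items()
    }
    best_rival = max(s for label, s in scores.items() if label != true_label)
    return scores[true_label] - best_rival

# Hypothetical toy models over a 3-word vocabulary (illustration only).
models = {
    "sports": ([0.5, 0.5], [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]),
    "finance": ([1.0], [[0.1, 0.2, 0.7]]),
}
doc = {0: 2, 1: 1}  # word 0 appears twice, word 1 once
doc_margin = margin(doc, models, "sports")
```

In this toy setup the document contains mostly words that the hypothetical "sports" components favour, so its margin is positive. EM training would fit each class model to its own documents only, whereas a margin-based criterion looks at how the class scores compare for each training document.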

Bibliographic reference. Pan, Zhen-Yu / Jiang, Hui (2008): "Large margin multinomial mixture model for text categorization", in INTERSPEECH-2008, 1566-1569.