ISCA Archive Interspeech 2008

Large margin multinomial mixture model for text categorization

Zhen-Yu Pan, Hui Jiang

In this paper, we present a novel discriminative training method for multinomial mixture models (MMMs) in text categorization, based on the large margin principle. Under some approximation and relaxation conditions, large margin estimation (LME) of MMMs can be formulated as a linear programming (LP) problem, which can be solved efficiently and reliably by many general-purpose optimization tools, even for very large models. Text categorization experiments on the standard RCV1 text corpus show that the LME method for MMMs substantially improves classification accuracy over the traditional training method based on the EM algorithm. Compared with the EM method, the proposed LME method achieves over 20% relative error reduction on three independent test sets of RCV1.
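To make the terminology concrete, the following is a minimal sketch of how a multinomial mixture model might score a document and how a classification margin could be computed from those scores. It is an illustration only, not the authors' implementation: the function names, the toy models, the uniform class priors, and the margin definition (true-class score minus the best rival score) are all assumptions, and the LP-based training step described in the abstract is not shown.

```python
import math

def mmm_log_likelihood(counts, weights, components):
    """Log-likelihood of a bag-of-words document under one class's
    multinomial mixture, up to the constant multinomial coefficient.

    counts:     dict mapping word -> count in the document
    weights:    list of mixture weights (assumed to sum to 1)
    components: list of dicts mapping word -> probability
    """
    log_terms = []
    for weight, comp in zip(weights, components):
        log_prob = sum(n * math.log(comp[word]) for word, n in counts.items())
        log_terms.append(math.log(weight) + log_prob)
    # Log-sum-exp over mixture components for numerical stability.
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))

def classify(counts, models):
    """Pick the class with the highest score (uniform priors assumed)."""
    return max(models, key=lambda c: mmm_log_likelihood(counts, *models[c]))

def margin(counts, models, true_label):
    """Assumed margin: true-class score minus the best competing score.
    A positive value means the document is classified correctly."""
    scores = {c: mmm_log_likelihood(counts, *models[c]) for c in models}
    best_rival = max(s for c, s in scores.items() if c != true_label)
    return scores[true_label] - best_rival

# Hypothetical two-class toy models over a two-word vocabulary.
models = {
    "sports": ([0.5, 0.5], [{"goal": 0.7, "stock": 0.3},
                            {"goal": 0.6, "stock": 0.4}]),
    "finance": ([1.0], [{"goal": 0.2, "stock": 0.8}]),
}
doc = {"goal": 3, "stock": 1}
predicted = classify(doc, models)
doc_margin = margin(doc, models, "sports")
```

In a large margin setting, training would adjust the model parameters so that margins of this kind on the training documents become as large as possible, rather than maximizing likelihood as EM does.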

doi: 10.21437/Interspeech.2008-258

Cite as: Pan, Z.-Y., Jiang, H. (2008) Large margin multinomial mixture model for text categorization. Proc. Interspeech 2008, 1566-1569, doi: 10.21437/Interspeech.2008-258

  author={Zhen-Yu Pan and Hui Jiang},
  title={{Large margin multinomial mixture model for text categorization}},
  booktitle={Proc. Interspeech 2008},