Symposium on Machine Learning in Speech and Language Processing (MLSLP)

Portland, Oregon, USA
September 14, 2012

Semi-Supervised Learning for Text Classification using Feature Affinity Regularization

Bin Zhang, Mari Ostendorf

University of Washington, Seattle, WA, USA

Most conventional semi-supervised learning methods attempt to incorporate unlabeled data directly into the training objective. This paper presents an alternative approach that learns feature affinity information from unlabeled data and incorporates it into the training objective as a regularizer for a maximum entropy model. The regularizer favors models in which correlated features have similar weights. The method is evaluated on text classification, where feature affinity can be computed from feature co-occurrences in unlabeled data. Experimental results show that this method consistently outperforms baseline methods.

Index Terms: semi-supervised learning, text classification, maximum entropy, feature affinity matrix, regularization
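
The abstract does not give the exact form of the affinity matrix or of the regularizer, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a cosine-normalized co-occurrence affinity computed from a binary document-term matrix, and a graph-Laplacian penalty that is small when features with high affinity have similar weight vectors. All function names (cooccurrence_affinity, affinity_penalty, objective) and the trade-off parameter lam are hypothetical.

```python
import numpy as np

def cooccurrence_affinity(X_unlabeled):
    """Feature affinity from unlabeled documents (assumed form: cosine of co-occurrence)."""
    X = (np.asarray(X_unlabeled) > 0).astype(float)   # binary doc-term matrix
    C = X.T @ X                                       # C[i, j] = docs containing both i and j
    df = np.diag(C).copy()                            # document frequency of each feature
    norm = np.sqrt(np.outer(df, df))
    A = np.divide(C, norm, out=np.zeros_like(C), where=norm > 0)
    np.fill_diagonal(A, 0.0)                          # no self-affinity
    return A

def affinity_penalty(W, A):
    """0.5 * sum_ij A[i, j] * ||W[i] - W[j]||^2, computed as trace(W^T L W)."""
    L = np.diag(A.sum(axis=1)) - A                    # graph Laplacian of the affinity matrix
    return float(np.trace(W.T @ L @ W))

def objective(W, X, y, A, lam):
    """Mean negative log-likelihood of a maximum entropy (softmax) model plus the penalty."""
    scores = np.asarray(X, dtype=float) @ W           # shape (n_docs, n_classes)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(y)), y].mean()
    return nll + lam * affinity_penalty(W, A)

# Toy data: features 0 and 1 always co-occur; feature 2 appears alone.
X_u = np.array([[1, 1, 0],
                [1, 1, 0],
                [0, 0, 1]])
A = cooccurrence_affinity(X_u)

W_similar = np.array([[1.0, 0.0], [1.0, 0.0], [5.0, 5.0]])    # features 0 and 1 agree
W_different = np.array([[1.0, 0.0], [0.0, 0.0], [5.0, 5.0]])  # features 0 and 1 disagree
print(affinity_penalty(W_similar, A), affinity_penalty(W_different, A))
```

In this toy setting the penalty is zero when the two co-occurring features share a weight vector and positive when they differ, which is the behavior the abstract describes. The objective could be passed to any gradient-based optimizer; the choice of affinity measure and penalty is an assumption here.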

Bibliographic reference. Zhang, Bin / Ostendorf, Mari (2012): "Semi-supervised learning for text classification using feature affinity regularization", in MLSLP-2012, 26-29.