15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Learning Phrase Patterns for Text Classification Using a Knowledge Graph and Unlabeled Data

Alex Marin (1), Roman Holenstein (2), Ruhi Sarikaya (2), Mari Ostendorf (1)

(1) University of Washington, USA
(2) Microsoft, USA

This paper explores a novel method for learning phrase pattern features for text classification. The method maps selected words into a knowledge graph and applies self-training over unlabeled data. Using support vector machine (SVM) classification, we obtain improvements over lexical and fully supervised phrase pattern features in domain and intent detection for language understanding, particularly when unlabeled data is also used. Our best results are obtained when the unlabeled data used for both model training and feature learning is filtered by the confidence of the baseline classifiers.
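As a rough illustration of confidence-filtered self-training in general, the sketch below pseudo-labels unlabeled utterances with a baseline classifier and keeps only those it labels confidently before retraining. It is not the authors' implementation. The toy word-count scorer stands in for the SVM, and the function names, the 0.8 threshold, and the example utterances are all assumptions made for illustration. The knowledge graph mapping and phrase pattern learning are not modeled.

```python
from collections import Counter


def featurize(text):
    # Bag-of-words counts; a stand-in for lexical or phrase pattern features.
    return Counter(text.lower().split())


def train(examples):
    # Toy model (NOT an SVM): summed word counts per label.
    model = {}
    for text, label in examples:
        model.setdefault(label, Counter()).update(featurize(text))
    return model


def predict(model, text):
    # Score each label by length-normalized word overlap, then return the
    # winning label and its share of the total score as a confidence in [0, 1].
    feats = featurize(text)
    scores = {
        label: sum(counts[w] * n for w, n in feats.items()) / sum(counts.values())
        for label, counts in model.items()
    }
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    return best, (scores[best] / total if total > 0 else 0.0)


def self_train(labeled, unlabeled, threshold=0.8):
    # Hypothetical threshold: keep only pseudo-labels the baseline is sure of.
    baseline = train(labeled)
    pseudo = []
    for text in unlabeled:
        label, confidence = predict(baseline, text)
        if confidence >= threshold:
            pseudo.append((text, label))
    # Retrain on the labeled data plus the confidently pseudo-labeled data.
    return train(labeled + pseudo), pseudo


# Made-up utterances, for illustration only.
labeled = [
    ("play some jazz music", "music"),
    ("what is the weather today", "weather"),
]
unlabeled = ["play rock music", "weather forecast today", "book a table"]

baseline = train(labeled)
model, pseudo = self_train(labeled, unlabeled)
print(pseudo)
```

In this toy run, "book a table" shares no words with either class, so its confidence is zero and it is filtered out. The retrained model picks up "rock" and "forecast" from the retained utterances, words the baseline had never seen.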

Bibliographic reference. Marin, Alex / Holenstein, Roman / Sarikaya, Ruhi / Ostendorf, Mari (2014): "Learning phrase patterns for text classification using a knowledge graph and unlabeled data", in INTERSPEECH-2014, 253-257.