This paper presents an unsupervised method that uses a limited amount of labeled data and a large pool of unlabeled data to improve natural language call routing performance. The method uses multiple classifiers to select a subset of the unlabeled data to augment the limited labeled data. We evaluated four widely used text classification algorithms: Naive Bayes Classification (NBC), Support Vector Machines (SVM), Boosting, and Maximum Entropy (MaxEnt). The NBC method is found to be the poorest performer of the four. Combining SVM, Boosting, and MaxEnt yields significant improvements in call classification accuracy over any single classifier across varying amounts of labeled data.
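The core idea of augmenting labeled data with classifier-selected unlabeled examples can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the helper `select_agreed` and the toy rule-based classifiers are hypothetical stand-ins for trained SVM, Boosting, and MaxEnt models, and the selection criterion shown here is simple unanimous agreement.

```python
def select_agreed(classifiers, unlabeled):
    """Return (utterance, label) pairs on which all classifiers agree.

    Unlabeled utterances that multiple classifiers label identically
    are treated as reliably auto-labeled and can be added to the
    labeled training pool.
    """
    selected = []
    for x in unlabeled:
        preds = [clf(x) for clf in classifiers]
        if all(p == preds[0] for p in preds):
            selected.append((x, preds[0]))
    return selected

# Toy stand-ins for three trained classifiers (hypothetical rules).
clf_a = lambda x: "billing" if "bill" in x else "tech_support"
clf_b = lambda x: "billing" if "bill" in x or "pay" in x else "tech_support"
clf_c = lambda x: "billing" if "bill" in x else "tech_support"

unlabeled = ["my bill is wrong", "pay my account", "internet is down"]
agreed = select_agreed([clf_a, clf_b, clf_c], unlabeled)
# "pay my account" is excluded: clf_b disagrees with the other two.
```

The agreed pairs would then be merged with the original labeled set and the classifiers retrained, repeating as needed.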
Cite as: Sarikaya, R., Kuo, H.-K.J., Goel, V., Gao, Y. (2005) Exploiting unlabeled data using multiple classifiers for improved natural language call-routing. Proc. Interspeech 2005, 433-436, doi: 10.21437/Interspeech.2005-301
@inproceedings{sarikaya05_interspeech,
  author={Ruhi Sarikaya and Hong-Kwang Jeff Kuo and Vaibhava Goel and Yuqing Gao},
  title={{Exploiting unlabeled data using multiple classifiers for improved natural language call-routing}},
  year={2005},
  booktitle={Proc. Interspeech 2005},
  pages={433--436},
  doi={10.21437/Interspeech.2005-301}
}