12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Shrinkage-Based Features for Natural Language Call Routing

Ruhi Sarikaya, Stanley F. Chen, Bhuvana Ramabhadran

IBM T.J. Watson Research Center, USA

The feature set used with a classifier can have a large impact on classification performance. This paper presents a set of shrinkage-based features for Maximum Entropy and other classifiers in the exponential family. These features are inspired by the exponential class-based language model, Model M. We motivate the use of these features for the task of text classification and evaluate them on a natural language call routing task. The proposed features, together with a new word clustering method, yield significant improvements in action classification accuracy over typical word-based features, particularly for small amounts of training data.
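The abstract does not spell out the feature templates, so the following is only an illustrative sketch of the general idea behind class-based features: word n-gram features are supplemented with n-gram features over word classes, so that utterances with different wording can still share features. The function name `extract_features`, the feature-name prefixes, and the hand-made `word_to_class` mapping are all assumptions made for illustration; they are not the paper's feature set or its clustering method.

```python
from collections import Counter


def extract_features(tokens, word_to_class, unk_class="C_UNK"):
    """Return a bag of word-level and class-level unigram/bigram features.

    Hypothetical feature templates, not taken from the paper.
    """
    # Map each word to its class; unseen words fall back to a shared class.
    classes = [word_to_class.get(w, unk_class) for w in tokens]
    feats = Counter()
    for i, (w, c) in enumerate(zip(tokens, classes)):
        feats["w1=" + w] += 1  # word unigram
        feats["c1=" + c] += 1  # class unigram
        if i > 0:
            feats["w2=" + tokens[i - 1] + "_" + w] += 1   # word bigram
            feats["c2=" + classes[i - 1] + "_" + c] += 1  # class bigram
    return feats


# Hypothetical clustering, written by hand for this example only.
word_to_class = {
    "bill": "C_BILLING",
    "invoice": "C_BILLING",
    "my": "C_POSS",
    "pay": "C_ACTION",
    "check": "C_ACTION",
}

a = extract_features("pay my bill".split(), word_to_class)
b = extract_features("check my invoice".split(), word_to_class)

# Features common to both utterances.
shared = sorted(set(a) & set(b))
print(shared)
```

In this toy case the two utterances share only one word-level feature (`w1=my`), but they share every class-level feature. That overlap is the kind of generalization that could matter most when training data is scarce. Either feature bag could be passed to any Maximum Entropy (multinomial logistic regression) trainer.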

Bibliographic reference. Sarikaya, Ruhi / Chen, Stanley F. / Ramabhadran, Bhuvana (2011): "Shrinkage-based features for natural language call routing", In INTERSPEECH-2011, 1309-1312.