7th International Conference on Spoken Language Processing
September 16-20, 2002
Natural language call classification can be performed using a latent semantic indexing (LSI) matrix, a popular vector-space model used in information retrieval. Traditionally, this matrix was constructed by counting the words or word sequences found in requests for each destination, with heuristic weightings that emphasize words salient for classification. At ICSLP’2000, we introduced discriminative training (DT) of the LSI matrix. DT considers both positive and negative examples during training in order to minimize the classification error and to increase the score separation between the correct hypothesis and its competitors. Some parameters become negative after DT, a result of suppressive learning that was not traditionally possible; important anti-features are thus obtained. DT improves portability by making the classifier robust to different feature selections and by decreasing the amount of training data needed. Results are reported for call routing and for the Switchboard topic identification task, where a 70% relative improvement was obtained after DT.
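The contrast between count-based construction and discriminative training can be illustrated with a minimal sketch. It is not the authors' implementation: the toy requests, the two destinations, the sigmoid-smoothed misclassification loss, the single best rival, the per-sample gradient steps, and all names and hyperparameters (`TRAIN`, `count_matrix`, `train`, `gamma`, `lr`) are assumptions made for illustration, and the LSI dimensionality reduction is omitted. The sketch starts from a non-negative count matrix and then moves weights toward the correct destination and away from its closest competitor, which is how negative entries (anti-features) can appear.

```python
import math

# Hypothetical toy data: (request words, destination index).
TRAIN = [
    (["check", "balance"], 0),
    (["account", "balance"], 0),
    (["pay", "bill"], 1),
    (["bill", "account"], 1),
]
VOCAB = sorted({w for words, _ in TRAIN for w in words})
NUM_DEST = 2


def featurize(words):
    """Unit-length bag-of-words vector over VOCAB."""
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def count_matrix():
    """Baseline: row-normalized word counts per destination (all >= 0)."""
    rows = [[0.0] * len(VOCAB) for _ in range(NUM_DEST)]
    for words, dest in TRAIN:
        for w in words:
            rows[dest][VOCAB.index(w)] += 1.0
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        row[:] = [v / norm for v in row]
    return rows


def scores(matrix, x):
    """Dot product of the request vector with each destination row."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def sample_loss(matrix, x, dest, gamma):
    """Sigmoid of (best rival score - correct score); also returns the rival."""
    s = scores(matrix, x)
    rival = max((j for j in range(NUM_DEST) if j != dest), key=lambda j: s[j])
    d = s[rival] - s[dest]  # d > 0 means the request would be misrouted
    return 1.0 / (1.0 + math.exp(-gamma * d)), rival


def total_loss(matrix, gamma=4.0):
    return sum(sample_loss(matrix, featurize(w), k, gamma)[0] for w, k in TRAIN)


def train(matrix, epochs=50, lr=0.5, gamma=4.0):
    """Per-sample gradient steps on the smoothed loss (assumed optimizer)."""
    for _ in range(epochs):
        for words, dest in TRAIN:
            x = featurize(words)
            loss, rival = sample_loss(matrix, x, dest, gamma)
            step = lr * gamma * loss * (1.0 - loss)
            for i, v in enumerate(x):
                matrix[dest][i] += step * v   # reinforce the correct destination
                matrix[rival][i] -= step * v  # suppress the closest competitor
    return matrix


baseline = count_matrix()
trained = train(count_matrix())
print("loss before:", round(total_loss(baseline), 3))
print("loss after: ", round(total_loss(trained), 3))
print("weight of 'pay' for destination 0:",
      round(trained[0][VOCAB.index("pay")], 3))
```

In this toy setting the word "pay" never occurs in requests for destination 0, so its count-based weight there is zero; the suppressive update drives it below zero, so that "pay" actively votes against destination 0.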
Bibliographic reference. Kuo, Hong-Kwang Jeff / Lee, Chin-Hui / Zitouni, Imed / Fosler-Lussier, Eric / Ammicht, Egbert (2002): "Discriminative training for call classification and routing", In ICSLP-2002, 1145-1148.