8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Unsupervised Categorisation Approaches for Technical Support Automated Agents

Amparo Albalate (1), Dimitar Dimitrov (1), Roberto Pieraccini (2)

(1) University of Ulm, Germany
(2) SpeechCycle, USA

In this paper we describe an unsupervised approach to the automated categorisation of utterances into predefined categories of symptoms (or problems) within the framework of a technical support automated agent. Utterances are classified with an iterative K-means clustering method. To improve on the lower accuracy typical of unsupervised algorithms, we analyse two enhancements of the classification algorithm. The first exploits the affinity among words by automatically extracting classes of semantically equivalent terms. The second is a disambiguation technique based on a new criterion for estimating the relevance of terms to the classification. The paper concludes with an analysis of an experimental evaluation performed on a corpus of 34848 utterances.
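
As a rough illustration of the kind of iterative K-means clustering the abstract refers to, the following minimal sketch clusters a handful of toy utterances represented as bag-of-words count vectors. The abstract does not specify the features, distance measure, initialisation, or how clusters are mapped to the predefined symptom categories, so everything here is an assumption made for illustration: the function names (vectorise, kmeans), the squared Euclidean distance, the seeding of centroids from chosen utterances, and the example utterances themselves. It is not the authors' implementation.

```python
from collections import Counter


def vectorise(utterances):
    """Map each utterance to a bag-of-words count vector over a shared vocabulary."""
    vocab = sorted({w for u in utterances for w in u.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for u in utterances:
        v = [0.0] * len(vocab)
        for word, count in Counter(u.lower().split()).items():
            v[index[word]] = float(count)
        vectors.append(v)
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, seed_indices, max_iter=100):
    """Iterate assignment and centroid update until assignments stop changing."""
    # Assumption: centroids are seeded from hand-picked utterances.
    centroids = [list(vectors[i]) for i in seed_indices]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda k: sq_dist(v, centroids[k]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical utterances, not taken from the paper's corpus.
utterances = [
    "no internet connection",
    "internet connection is down",
    "no picture on tv",
    "tv picture is frozen",
]
vocab, vectors = vectorise(utterances)
assignment, _ = kmeans(vectors, seed_indices=[0, 2])
for text, cluster in zip(utterances, assignment):
    print(cluster, text)
```

In this toy setting the two seeds stand in for the predefined symptom categories. Plain word counts treat related terms as unrelated, which is the kind of limitation the two enhancements described in the abstract are intended to address.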

Bibliographic reference. Albalate, Amparo / Dimitrov, Dimitar / Pieraccini, Roberto (2007): "Unsupervised categorisation approaches for technical support automated agents", In INTERSPEECH-2007, 1625-1628.