Sixth European Conference on Speech Communication and Technology
We propose a system for the Topic Detection and Tracking (TDT) detection task, which concerns the unsupervised grouping of news stories by topic. We cluster stories with an incremental k-means algorithm. To compare stories, we use a probabilistic document similarity metric and a traditional vector-space metric. We note that the clustering algorithm requires two different types of metrics, and we adapt similarity metrics for each purpose. The system achieves a topic-weighted miss rate of 12% at a false accept rate of 0.22%.
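As a rough illustration of the kind of incremental, unsupervised grouping the abstract describes, the sketch below assigns each story, in arrival order, to its most similar existing cluster, or starts a new cluster when no similarity reaches a threshold. It is a single-pass simplification and not the system from the paper: the cosine measure over raw term counts stands in for both the vector-space and probabilistic metrics, and the names `cosine`, `incremental_cluster`, and the `threshold` value are assumptions chosen for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_cluster(stories, threshold=0.3):
    """Assign each story to the closest cluster, or open a new one.

    `threshold` is a hypothetical value picked for illustration only.
    """
    centroids = []  # one Counter of summed term counts per cluster
    labels = []
    for text in stories:
        vec = Counter(text.lower().split())
        scores = [cosine(vec, c) for c in centroids]
        # Selection step: which existing cluster is most similar?
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        # Threshold step: is that cluster similar enough to join?
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)  # fold the story into the centroid
        else:
            centroids.append(Counter(vec))
            best = len(centroids) - 1
        labels.append(best)
    return labels


stories = [
    "storm hits coast storm damage",
    "coast storm damage reported",
    "election vote results",
    "vote results election night",
]
labels = incremental_cluster(stories)
```

The loop separates two decisions, picking the closest cluster and deciding whether the story is close enough to join it. This is one plausible reading of why a clustering algorithm of this kind would need two different types of metrics; here a single cosine score serves both roles for brevity.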
Bibliographic reference. Walls, Frederick / Jin, Hubert / Sista, Sreenivasa / Schwartz, Richard (1999): "Topic detection in broadcast news", In EUROSPEECH'99, 2451-2454.