Sixth European Conference on Speech Communication and Technology
In this paper we present algorithms for story segmentation, topic detection, and topic tracking. The algorithms use a combination of machine learning, statistical natural language processing, and information retrieval techniques. The story segmentation algorithm is a two-stage algorithm that uses a decision-tree-based probabilistic model in the first stage and incorporates aspects of our topic detection system via an information-retrieval-based refinement scheme in the second stage. The topic detection and tracking algorithm is an incremental clustering algorithm that employs a novel dynamic cluster-dependent similarity measure between documents and clusters. Performance of these algorithms is measured on the 1998 DARPA-sponsored Topic Detection and Tracking Phase 2 (TDT2) evaluation task.
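To make the idea of incremental clustering concrete, the following is a minimal single-pass sketch. It is not the method of the paper: the abstract does not specify the dynamic cluster-dependent similarity measure, so this sketch assumes plain cosine similarity over raw term counts and a single fixed threshold. The names `cosine`, `incremental_cluster`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def incremental_cluster(docs, threshold=0.3):
    """Assign each document, in arrival order, to a cluster.

    ASSUMPTION: a fixed global threshold and cosine similarity stand in
    for the cluster-dependent measure, which the abstract does not define.
    """
    centroids = []  # one Counter of summed term counts per cluster
    labels = []
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            # Close enough to an existing topic: merge into that cluster.
            centroids[best].update(vec)
            labels.append(best)
        else:
            # No sufficiently similar cluster: start a new topic.
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels
```

Each document is seen once and either joins its most similar existing cluster or seeds a new one, which is what makes the procedure incremental. A cluster-dependent measure would replace the single `threshold` and the plain `cosine` call with a score that varies by cluster.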
Bibliographic reference. Dharanipragada, S. / Franz, Martin / McCarley, J. S. / Roukos, Salim / Ward, T. (1999): "Story segmentation and topic detection for recognized speech", In EUROSPEECH'99, 2435-2438.