Accessing Information in Spoken Audio

April 19-20, 1999
Cambridge, UK

Text Segmentation and Event Tracking on Broadcast News Via a Hidden Markov Model Approach

P. van Mulbregt, I. Carp, L. Gillick, S. Lowe and J. Yamron

Dragon Systems, Newton, MA, USA

Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general methodology based on Hidden Markov Models and classical language modeling techniques for automatically inferring story boundaries (segmentation) and for retrieving stories relating to a specific topic (tracking). We will present in detail the features and performance of the Segmentation and Tracking systems submitted by Dragon Systems for the 1998 Topic Detection and Tracking evaluation.
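
To illustrate the general idea of HMM-based segmentation, the following is a minimal sketch and not the system described in the paper. It assumes that each hidden state is a topic with a smoothed unigram language model, that the chance of changing topic between consecutive sentences is a single fixed probability, and that a story boundary is placed wherever the most likely (Viterbi) topic sequence changes. All names (`train_unigram`, `viterbi_topics`, `boundaries`, `p_switch`, `oov_logp`) are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def train_unigram(texts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram log-probabilities over a fixed vocabulary."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def viterbi_topics(sentences, models, p_switch=0.1, oov_logp=math.log(1e-6)):
    """Most likely topic label per sentence (needs at least two topics).

    models maps a topic name to a dict of word log-probabilities.
    p_switch is an assumed fixed probability of leaving the current topic.
    """
    topics = sorted(models)
    k = len(topics)
    stay = math.log(1.0 - p_switch)
    switch = math.log(p_switch / (k - 1))

    def emit(topic, sent):
        # Log-likelihood of the sentence under the topic's unigram model.
        m = models[topic]
        return sum(m.get(w, oov_logp) for w in sent.lower().split())

    def trans(prev, cur):
        return stay if prev == cur else switch

    # Uniform prior over topics for the first sentence.
    score = {t: -math.log(k) + emit(t, sentences[0]) for t in topics}
    back = []
    for sent in sentences[1:]:
        new_score, ptr = {}, {}
        for t in topics:
            best_prev = max(topics, key=lambda s: score[s] + trans(s, t))
            new_score[t] = score[best_prev] + trans(best_prev, t) + emit(t, sent)
            ptr[t] = best_prev
        score = new_score
        back.append(ptr)

    # Trace back from the best final state.
    path = [max(topics, key=lambda t: score[t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def boundaries(path):
    """Indices of sentences that start a new segment (topic label changes)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]
```

In this sketch, lowering `p_switch` makes the decoder more reluctant to hypothesize a boundary, so it trades missed boundaries against false alarms. Tracking could be approached in a similar spirit by scoring each story against a language model built from a few on-topic examples, but that is likewise an assumption about the general approach rather than a description of the submitted system.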


Bibliographic reference.  Mulbregt, P. van / Carp, I. / Gillick, L. / Lowe, S. / Yamron, J. (1999): "Text segmentation and event tracking on broadcast news via a Hidden Markov Model approach", In Access-Audio-1999, 90-95.