5th International Conference on Spoken Language Processing
Automatic transcription of broadcast speech has progressed to the point where the resulting transcripts can be used for information retrieval. In this paper, we first describe a corpus of automatically recognized broadcast news, then present a method for segmenting a broadcast into stories, and finally apply this method to retrieve stories relating to a specific topic. The method is based on Hidden Markov Models and is analogous to the usual implementation of HMMs in speech recognition.
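The abstract does not spell out the model, so the following is only a minimal sketch of how HMM-based story segmentation could look, not the authors' implementation. It assumes that hidden states are topics, that each topic emits words from a unigram distribution, and that changing topic between sentences costs a fixed log-probability penalty. The names `viterbi_segment`, `topic_models`, `switch_penalty`, and `floor` are hypothetical, and the toy probabilities are made up for illustration.

```python
import math


def viterbi_segment(sentences, topic_models, switch_penalty, floor=1e-6):
    """Label each sentence with a topic; topic changes mark story boundaries.

    sentences      -- list of sentences, each a list of word strings
    topic_models   -- dict: topic name -> dict of word -> probability (assumed unigram)
    switch_penalty -- log-probability cost of changing topic (assumed uniform)
    floor          -- probability assigned to words unseen in a topic model
    """
    topics = list(topic_models)

    def emit(topic, sentence):
        # Log-likelihood of the sentence under the topic's unigram model.
        model = topic_models[topic]
        return sum(math.log(model.get(word, floor)) for word in sentence)

    # best[t] is the best log score of any topic path ending in topic t.
    best = {t: emit(t, sentences[0]) for t in topics}
    backpointers = []

    for sentence in sentences[1:]:
        # With a uniform penalty, the best topic to switch from is the
        # highest-scoring topic at the previous step.
        top = max(best, key=best.get)
        new_best, pointers = {}, {}
        for t in topics:
            stay = best[t]
            switch = best[top] - switch_penalty
            if stay >= switch:
                new_best[t], pointers[t] = stay, t
            else:
                new_best[t], pointers[t] = switch, top
            new_best[t] += emit(t, sentence)
        best = new_best
        backpointers.append(pointers)

    # Trace the best path backwards from the best final topic.
    state = max(best, key=best.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()

    # A boundary is the index of the first sentence of a new story.
    boundaries = [i for i in range(1, len(path)) if path[i] != path[i - 1]]
    return path, boundaries


# Toy example with two made-up topic models.
toy_models = {
    "weather": {"rain": 0.4, "storm": 0.4, "goal": 0.1, "team": 0.1},
    "sports": {"rain": 0.1, "storm": 0.1, "goal": 0.4, "team": 0.4},
}
toy_sentences = [["rain", "storm"], ["storm", "rain"],
                 ["goal", "team"], ["team", "goal"]]
toy_path, toy_boundaries = viterbi_segment(toy_sentences, toy_models, 1.0)
```

In this sketch the decoded topic sequence plays the role that the word sequence plays in speech recognition: the Viterbi path gives a segmentation as a by-product, and the topic label of each segment could then be matched against a topic of interest for retrieval.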
Bibliographic reference. Mulbregt, Paul van / Carp, Ira / Gillick, Lawrence / Lowe, Steve / Yamron, Jon (1998): "Text segmentation and topic tracking on broadcast news via a hidden Markov model approach", In ICSLP-1998, paper 0116.