2003 ISCA Workshop on
Multilingual Spoken Document Retrieval
(MSDR2003)
Hong Kong
April 4-5, 2003
Segmentation and Indexation of Broadcast News
Rui Amaral (1), Isabel Trancoso (2)
(1) EST-Setúbal - Inst. Politécnico Setúbal, Portugal
(2) IST/INESC ID Lisboa, Portugal
This paper describes a topic segmentation and indexation
system for broadcast news that is integrated in an alert
system for selective dissemination of multimedia
information. The goal of this work is to improve retrieval
of, and navigation through, specific spoken audio segments
that have been automatically transcribed using speech
recognition. Our segmentation algorithm is based on simple
heuristics related to anchor detection. The indexation is
based on hierarchical concept trees containing 22 main
thematic domains, for which Hidden Markov models were
created. Only the top three levels of this thesaurus are
currently used for indexation. Ongoing work on the
identification of cues related to the structure of TV
broadcast news programs is also described.
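
As a rough illustration of what an anchor-based segmentation heuristic could look like, the sketch below places a candidate story boundary wherever the anchor resumes speaking after another speaker. This is not the authors' algorithm, which the abstract does not specify. The input format (speaker-labelled turns with start and end times), the rule that the anchor is the speaker with the most total speaking time, and the names `find_anchor` and `story_boundaries` are all assumptions made for this example.

```python
from collections import defaultdict


def find_anchor(turns):
    """Guess the anchor: the speaker with the largest total speaking time.

    This rule is an assumption for the sketch, not taken from the paper.
    """
    total = defaultdict(float)
    for speaker, start, end in turns:
        total[speaker] += end - start
    return max(total, key=total.get)


def story_boundaries(turns):
    """Return start times of anchor turns that follow a non-anchor turn.

    The first anchor turn in the programme also counts as a boundary.
    """
    anchor = find_anchor(turns)
    boundaries = []
    previous = None
    for speaker, start, _end in turns:
        if speaker == anchor and previous != anchor:
            boundaries.append(start)
        previous = speaker
    return boundaries


# Hypothetical speaker turns: (speaker label, start time in s, end time in s).
turns = [
    ("spk1", 0.0, 20.0),
    ("spk2", 20.0, 35.0),
    ("spk1", 35.0, 60.0),
    ("spk3", 60.0, 70.0),
    ("spk1", 70.0, 95.0),
]

print(story_boundaries(turns))
```

Each returned time marks the start of a candidate story segment, which a later stage could then assign to thematic domains.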
Bibliographic reference.
Amaral, Rui / Trancoso, Isabel (2003):
"Segmentation and indexation of broadcast news",
In MSDR-2003, 31-36.