16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Diachronic Semantic Cohesion for Topic Segmentation of TV Broadcast News

Abdessalam Bouchekif (1), Géraldine Damnati (1), Yannick Estève (2), Delphine Charlet (1), Nathalie Camelin (2)

(1) Orange Labs, France
(2) LIUM, France

This paper proposes a new way to integrate semantic relations into a topic segmentation process by defining the notion of semantic cohesion. In the context of a sliding-window-based automatic topic segmentation algorithm, semantic relations are incorporated into the similarity measure between adjacent blocks. Additionally, in the context of TV Broadcast News topic segmentation, we propose a new protocol for gathering relevant data for semantic relation computation, showing that a small set of diachronic data can be more relevant for the task than a large amount of general or asynchronous data. Experiments were carried out on a varied corpus of 86 French TV Broadcast News shows recorded during one week, with semantic relations estimated from text articles collected through the Google News homepage over the same period; they show a significant improvement in topic segmentation performance.
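
As a rough illustration of the idea described above, the sketch below scores each candidate boundary by comparing the block of sentences before it with the block after it, letting semantically related word pairs contribute to the similarity. It is not the authors' formulation: the soft-cosine-style measure, the function names (soft_cosine, boundary_scores), the window size, and the toy relation weights are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def soft_cosine(left, right, relations):
    """Cosine-like similarity in which related word pairs also contribute.

    `relations` is a hypothetical lookup: frozenset({w1, w2}) -> weight in [0, 1].
    Identical words always get weight 1.0; unlisted pairs get 0.0.
    """
    def dot(a, b):
        total = 0.0
        for wa, ca in a.items():
            for wb, cb in b.items():
                if wa == wb:
                    weight = 1.0
                else:
                    weight = relations.get(frozenset((wa, wb)), 0.0)
                total += weight * ca * cb
        return total

    denom = math.sqrt(dot(left, left) * dot(right, right))
    return dot(left, right) / denom if denom else 0.0


def boundary_scores(sentences, window, relations):
    """Similarity between the blocks on either side of each sentence gap."""
    scores = []
    for i in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, i - window):i] for w in s)
        right = Counter(w for s in sentences[i:i + window] for w in s)
        scores.append(soft_cosine(left, right, relations))
    return scores


# Toy data (assumed): two stories, two tokenized sentences each.
sentences = [
    ["election", "vote"],
    ["president", "vote"],
    ["storm", "rain"],
    ["flood", "rain"],
]
# Toy relation weights (assumed), e.g. as might be estimated from news articles.
relations = {
    frozenset(("election", "president")): 0.8,
    frozenset(("storm", "flood")): 0.7,
}

scores = boundary_scores(sentences, window=2, relations=relations)
# The gap with the lowest similarity is the most likely topic boundary.
candidate = scores.index(min(scores)) + 1
```

In this toy setting the lowest score falls at the gap between the second and third sentences, where the two blocks share neither words nor related word pairs. In a real system the relation weights would come from a corpus, and boundaries would be chosen from the local minima of the score curve rather than a single global minimum.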

Bibliographic reference.  Bouchekif, Abdessalam / Damnati, Géraldine / Estève, Yannick / Charlet, Delphine / Camelin, Nathalie (2015): "Diachronic semantic cohesion for topic segmentation of TV broadcast news", In INTERSPEECH-2015, 2932-2936.