INTERSPEECH 2014
15th Annual Conference of the International Speech Communication Association

Singapore
September 14-18, 2014

Speech Cohesion for Topic Segmentation of Spoken Contents

Abdessalam Bouchekif, Géraldine Damnati, Delphine Charlet

Orange Labs, France

In this paper, we introduce the notion of speech cohesion for topic segmentation of spoken content. The aim is to integrate speaker information and lexical information within a single cohesion value. Building on a lexical cohesion system, we propose an approach that directly integrates the speaker distribution when computing the cohesion. A potential boundary is considered effective if the joint distribution of terms and speakers differs sufficiently from one side of the boundary to the other. Beyond speaker distribution, we also propose to take speaker identification into account and to compare speaker identities with the identities mentioned in the spoken content, in order to reinforce the cohesion of a topic segment. Experiments on three corpora of various Broadcast News formats, collected from 9 French TV channels, show a significant improvement in the overall topic segmentation process.
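
The idea of scoring a candidate boundary from the joint distribution of terms and speakers can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not give the cohesion formula, so the cosine similarity over (speaker, term) counts, the whitespace tokenization, the fixed window of utterances, and all names (`joint_counts`, `cosine`, `boundary_cohesion`, `utterances`) are assumptions made purely for illustration.

```python
from collections import Counter
from math import sqrt


def joint_counts(utterances):
    """Count (speaker, term) pairs over a list of (speaker, text) utterances."""
    counts = Counter()
    for speaker, text in utterances:
        for term in text.lower().split():  # naive tokenization (assumption)
            counts[(speaker, term)] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def boundary_cohesion(utterances, i, window=2):
    """Cohesion across the candidate boundary placed before utterance i.

    A low value means the joint (speaker, term) distributions on the two
    sides differ, which makes i a plausible topic boundary.
    """
    left = joint_counts(utterances[max(0, i - window):i])
    right = joint_counts(utterances[i:i + window])
    return cosine(left, right)


# Hypothetical toy data: two utterances per topic, one speaker per topic.
utterances = [
    ("anchor", "election results tonight"),
    ("anchor", "election results update"),
    ("reporter", "weather forecast rain"),
    ("reporter", "rain forecast tomorrow"),
]

scores = [boundary_cohesion(utterances, i, window=1) for i in range(1, len(utterances))]
```

In this toy example the score is lowest at the position where both the speaker and the vocabulary change. Because the counts are keyed on (speaker, term) pairs, a speaker change alone already drives the similarity to zero; a practical system would presumably weight or smooth the lexical and speaker components rather than rely on raw joint counts.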

Bibliographic reference. Bouchekif, Abdessalam / Damnati, Géraldine / Charlet, Delphine (2014): "Speech cohesion for topic segmentation of spoken contents", in INTERSPEECH-2014, 1890-1894.