ISCA Archive ExLing 2006

Speaker based segmentation on broadcast news - on the use of ISI technique

S. Ouamour, M. Guerti, H. Sayoud

In this paper we propose a new segmentation technique called ISI, or "Interlaced Speech Indexing", developed and implemented for the task of broadcast news indexing. The task consists in finding the identity of a well-defined speaker and the moments of his interventions within an audio document, so that his speech can be accessed rapidly, directly and easily. Our segmentation procedure is based on an interlaced equidistant segmentation (IES) associated with our new ISI algorithm. This approach uses a speaker identification method based on Second Order Statistical Measures (SOSM). As the SOSM measure, we choose "µGc", which is based on the covariance matrix. However, experiments showed that this method needs a speech length of at least 2 seconds, which means that the segmentation resolution would be 2 seconds. By combining the SOSM with the new indexing technique (ISI), we demonstrate that the average segmentation error is reduced to only 0.5 second, which is more accurate and better suited to real-time applications. Results indicate that this association provides high resolution and high tracking performance: the indexing score (percentage of correctly labelled segments) is 95% on the TIMIT database and 92.4% on the Hub4 Broadcast News 96 database.
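
The abstract does not spell out the ISI algorithm, so the following is only a minimal sketch of how interlaced equidistant windows might yield a labelling finer than the 2-second minimum analysis length. It assumes 2-second windows shifted by 0.5 second and a simple majority vote per 0.5-second frame; the function names, the hop value and the voting rule are illustrative assumptions, not the authors' method. The per-window speaker labels are taken as given (for instance, from an SOSM-based identifier).

```python
from collections import Counter


def interlaced_windows(duration, win=2.0, hop=0.5):
    """Return (start, end) times of equal-length windows shifted by `hop` seconds.

    `win` and `hop` are assumed values chosen for illustration only.
    """
    if duration < win:
        return []
    # Small epsilon guards against floating-point round-off in the division.
    n = int((duration - win) / hop + 1e-9) + 1
    return [(i * hop, i * hop + win) for i in range(n)]


def label_frames(window_labels, win=2.0, hop=0.5):
    """Assign one label to each hop-sized frame by majority vote.

    Window i covers frames i .. i + per_win - 1, so frame f is covered by
    windows max(0, f - per_win + 1) .. min(last_window, f).
    This voting rule is a hypothetical stand-in for the ISI decision step.
    """
    if not window_labels:
        return []
    per_win = int(round(win / hop))  # number of frames spanned by one window
    n_frames = len(window_labels) + per_win - 1
    frames = []
    for f in range(n_frames):
        first = max(0, f - per_win + 1)
        last = min(len(window_labels) - 1, f)
        votes = Counter(window_labels[first:last + 1])
        frames.append(votes.most_common(1)[0][0])
    return frames


# Usage: a 4-second document whose windows were labelled by some identifier.
windows = interlaced_windows(4.0)
frames = label_frames(["A", "A", "A", "B", "B"])
```

Under these assumptions each decision still rests on 2 seconds of speech, while the speaker change is localised on a 0.5-second grid.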

Cite as: Ouamour, S., Guerti, M., Sayoud, H. (2006) Speaker based segmentation on broadcast news - on the use of ISI technique. Proc. First ITRW on Experimental Linguistics, 193-196

  author={S. Ouamour and M. Guerti and H. Sayoud},
  title={{Speaker based segmentation on broadcast news - on the use of ISI technique}},
  booktitle={Proc. First ITRW on Experimental Linguistics},