15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

An Iterative Speaker Re-Diarization Scheme for Improving Speaker-Based Entity Extraction in Multimedia Archives

Houman Ghaemmaghami, David Dean, Sridha Sridharan

Queensland University of Technology, Australia

In this paper, we present a novel scheme for improving speaker diarization by exploiting speakers who recur across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that the initial speaker-linked diarization outputs can be reused to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We then demonstrate their complementary characteristics and fuse the two techniques to conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach provides a relative improvement of 23% in diarization error rate (DER) over the original diarization results, and also improves the estimated number of speakers and the cluster purity and coverage metrics.
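
As a reading aid for the headline figure, the sketch below shows how a relative DER improvement is conventionally computed. The function name and the input values are hypothetical and chosen only for illustration; they are not results from the paper, which reports only the 23% relative figure in this abstract.

```python
def relative_improvement(baseline_der: float, new_der: float) -> float:
    """Return the reduction in DER as a fraction of the baseline DER."""
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical values for illustration only (NOT taken from the paper):
# lowering a baseline DER of 20.0% to 15.4% is a 23% relative improvement.
baseline, rediarized = 20.0, 15.4
print(f"{relative_improvement(baseline, rediarized):.0%}")
```

Note that a relative improvement is measured against the baseline error, so the same 23% corresponds to different absolute DER reductions depending on how high the original DER was.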

Bibliographic reference. Ghaemmaghami, Houman / Dean, David / Sridharan, Sridha (2014): "An iterative speaker re-diarization scheme for improving speaker-based entity extraction in multimedia archives", in INTERSPEECH-2014, 577-581.