16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Novel Clustering Selection Criterion for Fast Binary Key Speaker Diarization

Héctor Delgado (1), Xavier Anguera (2), Corinne Fredouille (3), Javier Serrano (1)

(1) Universidad Autónoma de Barcelona, Spain
(2) Sinkronigo, Spain
(3) LIA, France

Speaker diarization has become an important building block in many speech-related systems. Given the great increase in audiovisual media, fast systems are required to process large amounts of data in a reasonable time. In this regard, the recently proposed speaker diarization system based on binary key speaker modeling provides a very fast alternative to state-of-the-art systems at the cost of a slight decrease in performance. This decrease is mainly due to drawbacks in the final clustering selection algorithm, which is far from returning the optimum clustering the system is actually able to generate. We have also identified parts of our system that can be sped up further. This paper addresses both issues: first, by lightening the processing at the main identified bottleneck, and second, by proposing an alternative clustering selection technique capable of providing near-optimum clustering outputs. Experimental results on the REPERE test database validate the effectiveness of the proposed improvements, yielding a relative performance gain of 20% and execution times of 0.037 xRT (where xRT is the real-time factor).
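To make the 0.037 xRT figure quoted in the abstract concrete, the short sketch below applies the usual definition of the real-time factor: processing time divided by audio duration. The function names and the one-hour recording are illustrative assumptions and do not come from the paper.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor (xRT): processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def processing_time_for(audio_seconds: float, xrt: float) -> float:
    """Processing time implied by a given real-time factor."""
    return audio_seconds * xrt


# Hypothetical example: one hour of audio at the reported 0.037 xRT.
one_hour = 3600.0
seconds_needed = processing_time_for(one_hour, 0.037)
print(f"{seconds_needed:.1f} s to process {one_hour:.0f} s of audio")
```

Under this definition, a system running at 0.037 xRT would need a little over two minutes to process an hour of audio.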

Bibliographic reference. Delgado, Héctor / Anguera, Xavier / Fredouille, Corinne / Serrano, Javier (2015): "Novel clustering selection criterion for fast binary key speaker diarization", in INTERSPEECH-2015, 3091-3095.