A Study of New Approaches to Speaker Diarization

Douglas Reynolds (1), Patrick Kenny (2), Fabio Castaldo (3)

(1) MIT, USA
(2) CRIM, Canada
(3) Politecnico di Torino, Italy

This paper reports on work carried out at the 2008 JHU Summer Workshop examining new approaches to speaker diarization. Four systems were developed, and experiments were conducted using summed-channel telephone data from the 2008 NIST SRE. The systems are: a baseline agglomerative clustering system; a new Variational Bayes system using eigenvoice speaker models; a streaming system that combines low-dimensional speaker factors with classic segmentation and clustering; and a new hybrid system that combines the baseline system with a new cosine-distance speaker factor clustering. Results are reported both as diarization error rate (DER) and as equal error rate (EER) when the diarization outputs are used for a speaker detection task. The best configurations of the diarization system produced DERs of 3.5-4.6%, and we demonstrate a weak correlation between EER and DER.
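
For readers unfamiliar with the metric, the following is a minimal, simplified sketch of a frame-level DER computation. It is not the scoring procedure used in the paper: it assumes equal-length per-frame label sequences, no overlapping speech, and no scoring collar, and the function name `frame_der` and the example labels are hypothetical. DER is taken here as missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech, under the best one-to-one mapping of hypothesis speakers to reference speakers.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Simplified frame-level DER (illustrative only, not the official scorer).

    Each sequence holds one speaker label per frame; None marks non-speech.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    # Distinct speaker labels, in order of first appearance.
    ref_speakers = list(dict.fromkeys(r for r in reference if r is not None))
    hyp_speakers = list(dict.fromkeys(h for h in hypothesis if h is not None))

    total_speech = sum(r is not None for r in reference)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")

    # Placeholders let a hypothesis speaker stay unmatched to any reference speaker.
    unmatched = [("unmatched", i) for i in range(len(hyp_speakers))]
    candidates = ref_speakers + unmatched

    best_errors = None
    # Exhaustive search over one-to-one mappings; fine for a handful of speakers.
    for assignment in permutations(candidates, len(hyp_speakers)):
        mapping = dict(zip(hyp_speakers, assignment))
        errors = 0
        for r, h in zip(reference, hypothesis):
            mapped = mapping.get(h)  # non-speech (None) stays None
            # Counts misses, false alarms, and speaker confusions alike.
            if r != mapped:
                errors += 1
        if best_errors is None or errors < best_errors:
            best_errors = errors

    return best_errors / total_speech


# Hypothetical two-speaker example: one confused frame and one false-alarm frame
# against six reference speech frames.
reference = ["A", "A", "A", "B", "B", None, None, "B"]
hypothesis = ["x", "x", "y", "y", "y", None, "y", "y"]
print(round(frame_der(reference, hypothesis), 3))  # 0.333
```

The exhaustive mapping search grows factorially with the number of speakers, which is acceptable for two-speaker telephone conversations but would need an assignment algorithm for larger speaker counts.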

Bibliographic reference.  Reynolds, Douglas / Kenny, Patrick / Castaldo, Fabio (2009): "A study of new approaches to speaker diarization", In INTERSPEECH-2009, 1047-1050.