8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Speaker Diarization using bottom-up clustering based on a Parameter-derived Distance between adapted GMMs

Michael Betser (1), Frédéric Bimbot (2), Mathieu Ben (3), Guillaume Gravier (2)

(1) Campus Universitaire de Beaulieu, France
(2) CNRS, France
(3) Université de Rennes, France

In this paper, we present an approach to speaker diarization based on segmentation followed by bottom-up clustering, in which clusters are modeled with adapted Gaussian mixture models (GMMs). We propose a novel inter-cluster distance, defined in the model parameter space, which is easy to compute and can be used both as the dissimilarity measure in the clustering scheme and as the stopping criterion. Adapted Gaussian mixture models give a good description of the feature-vector distribution within a cluster, while adaptation prevents over-training for clusters with little data. Experiments carried out on French broadcast news data demonstrate the potential of the proposed approach, which performs similarly to BIC clustering. However, our clustering method appeared to be more sensitive to segmentation errors than the BIC approach.
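
The abstract does not give the form of the distance, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that every cluster model is adapted from one shared background GMM by moving the means only, so that weights and variances are common to all clusters; that the distance is a weight- and variance-normalized Euclidean distance between the adapted means; and that a merged cluster is approximated by a frame-count-weighted average of the means rather than re-adapted from pooled data. All names (`gmm_mean_distance`, `merge_means`, `bottom_up_cluster`, `threshold`) are hypothetical.

```python
import math
from itertools import combinations


def gmm_mean_distance(means_a, means_b, weights, variances):
    """Distance in parameter space between two mean-adapted GMMs.

    Assumed form (not taken from the paper): squared mean differences,
    normalized by the shared variances and weighted by the shared
    mixture weights, summed over components, then square-rooted.
    """
    total = 0.0
    for w, mu_a, mu_b, var in zip(weights, means_a, means_b, variances):
        total += w * sum((a - b) ** 2 / v for a, b, v in zip(mu_a, mu_b, var))
    return math.sqrt(total)


def merge_means(means_a, n_a, means_b, n_b):
    """Approximate the merged model by a frame-count-weighted mean average.

    A real system would more likely re-adapt the model on the pooled data.
    """
    n = n_a + n_b
    return [
        [(n_a * a + n_b * b) / n for a, b in zip(mu_a, mu_b)]
        for mu_a, mu_b in zip(means_a, means_b)
    ]


def bottom_up_cluster(segments, weights, variances, threshold):
    """Agglomerative clustering of segments given as (means, frame_count).

    The same distance drives both the choice of the pair to merge and the
    stop criterion: merging ends when the closest pair exceeds `threshold`.
    Returns a list of clusters, each a sorted list of segment indices.
    """
    clusters = [
        {"members": [i], "means": means, "count": count}
        for i, (means, count) in enumerate(segments)
    ]
    while len(clusters) > 1:
        # Find the closest pair of clusters under the parameter-space distance.
        (i, j), best = min(
            (
                ((i, j), gmm_mean_distance(clusters[i]["means"],
                                           clusters[j]["means"],
                                           weights, variances))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break  # stop criterion: no pair is close enough to merge
        a, b = clusters[i], clusters[j]
        merged = {
            "members": a["members"] + b["members"],
            "means": merge_means(a["means"], a["count"], b["means"], b["count"]),
            "count": a["count"] + b["count"],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]
```

In this sketch the threshold would have to be tuned on development data; because all models share one background GMM, components are aligned across clusters and the means can be compared component by component.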


Bibliographic reference. Betser, Michael / Bimbot, Frédéric / Ben, Mathieu / Gravier, Guillaume (2004): "Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs", in INTERSPEECH-2004, 2329-2332.