10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Signature Cluster Model Selection for Incremental Gaussian Mixture Cluster Modeling in Agglomerative Hierarchical Speaker Clustering

Kyu J. Han, Shrikanth S. Narayanan

University of Southern California, USA

Agglomerative hierarchical speaker clustering (AHSC) has been widely used for classifying speech data by speaker characteristics. However, its bottom-up, one-way structure, which merges the closest cluster pair at every recursion step, makes it difficult to recover from an incorrect merge. Making AHSC robust to incorrect merging is therefore an important issue. In this paper, we address this problem within the framework of AHSC based on incremental Gaussian mixture models, which we previously introduced to better represent clusters of variable size. Specifically, to minimize the contamination of cluster models by heterogeneous data, we select a representative (or signature) model for each cluster and keep updating it during AHSC. Experiments on meeting speech excerpts (4 hours in total) verify that the proposed approach improves average speaker clustering performance by approximately 20% (relative).
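The bottom-up merging loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes each speech segment is already summarized as a fixed-length feature vector, it substitutes the Euclidean distance between cluster centroids for the paper's incremental Gaussian mixture cluster models, and it stops at a known target number of clusters. The names `centroid`, `agglomerative_clustering`, `segments`, and `num_clusters` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def agglomerative_clustering(segments, num_clusters):
    """Greedy bottom-up clustering.

    Starts with one cluster per segment and repeatedly merges the closest
    pair until num_clusters remain. Assumption: centroid Euclidean distance
    stands in for a model-based inter-cluster distance.
    """
    clusters = [[seg] for seg in segments]
    while len(clusters) > num_clusters:
        # Find the closest pair of clusters (index pairs always have i < j).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: dist(centroid(clusters[pair[0]]),
                                  centroid(clusters[pair[1]])),
        )
        # Merge cluster j into cluster i. The merge is never undone, so an
        # incorrect merge affects every later step.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Toy 2-D "segments" forming two well-separated groups.
    segments = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    for cluster in agglomerative_clustering(segments, num_clusters=2):
        print(cluster)
```

Because the loop only ever merges, any heterogeneous data absorbed early stays in the cluster for the rest of the run, which is the weakness the signature-model selection in the abstract is meant to mitigate.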


Bibliographic reference. Han, Kyu J. / Narayanan, Shrikanth S. (2009): "Signature cluster model selection for incremental Gaussian mixture cluster modeling in agglomerative hierarchical speaker clustering", in INTERSPEECH-2009, 2547-2550.