12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Speaker Clustering Based on Non-Negative Matrix Factorization

Masafumi Nishida, Seiichi Yamamoto

Doshisha University, Japan

This paper addresses unsupervised speaker clustering for multiparty conversations. Previous studies have mainly used hierarchical clustering methods. However, these methods require many operations, such as distance calculation and cluster merging, when the conversation data contain many utterances. We propose a clustering method based on non-negative matrix factorization. The proposed method can perform fast and robust clustering by decomposing a matrix consisting of distances between models. We conducted speaker clustering experiments using a method based on the Bayesian information criterion, a method based on the likelihood ratio between Gaussian mixture models, and the proposed method. Experimental results showed that the proposed method achieved higher clustering accuracy than these conventional methods.
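The general idea of clustering by factorizing a matrix of pairwise distances can be illustrated with a minimal Python sketch. The abstract does not give the details of the proposed method, so everything here is an assumption made for illustration: the toy 2-D points stand in for per-utterance speaker models, distances are turned into similarities with exp(-d), the factorization uses the standard multiplicative updates for the Frobenius-norm objective, and the names `nmf` and `cluster_from_distances` are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factorize a non-negative matrix V (n x m) as W @ H using
    multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_from_distances(D, n_clusters, scale=1.0):
    """Assign each item to a cluster from a pairwise distance matrix D.
    Assumption: distances are mapped to similarities with exp(-D / scale)
    so that items from the same cluster form dense blocks."""
    S = np.exp(-D / scale)
    W, H = nmf(S, n_clusters)
    # Rescale each column of W by the mass of its basis row in H so that
    # the arbitrary scaling between W and H does not affect the argmax.
    weights = W * H.sum(axis=1)
    return weights.argmax(axis=1)

# Toy data (hypothetical): two well-separated groups of three points,
# standing in for utterance models of two speakers.
points = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.3],
])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
labels = cluster_from_distances(D, n_clusters=2)
print(labels)
```

In this sketch the number of clusters is supplied by the caller. Each item is assigned in a single pass after one factorization, in contrast to the repeated distance calculation and merging steps of hierarchical clustering.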

Bibliographic reference. Nishida, Masafumi / Yamamoto, Seiichi (2011): "Speaker clustering based on non-negative matrix factorization", in INTERSPEECH-2011, 949-952.