This paper introduces and evaluates a novel approach to unsupervised speaker change detection. Many unsupervised speaker change detection algorithms model each audio segment with a single multivariate Gaussian density, which assumes that the speech features of the segment are Gaussian distributed. In many cases this assumption is too strong. This paper therefore presents an alternative to the single Gaussian model: a Gaussian model in a reproducing kernel Hilbert space (RKHS), or Kernel-Gaussian model (KGM). KGM first projects the speech features into an RKHS via a nonlinear mapping and then models the mapped features with a Gaussian density. The mapping enables KGM to capture the nonlinear structure of the speech features. An implementation of KGM is proposed and evaluated. Experiments on different datasets show that KGM achieves better results than the single Gaussian model.
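The two KGM steps described above (nonlinear mapping into an RKHS, then Gaussian modelling of the mapped features) can be illustrated with a minimal sketch. It is not the implementation proposed in the paper, and everything specific in it is an assumption made for illustration: the RBF kernel, the use of kernel PCA coordinates as a finite-dimensional stand-in for the RKHS features, the generalized likelihood ratio as the segment distance, and the names `kernel_pca_coords`, `gaussian_logdet`, `kgm_distance`, `gamma`, `n_components` and `ridge`.

```python
import numpy as np


def rbf_kernel(x, gamma):
    """Gram matrix of the RBF kernel k(u, v) = exp(-gamma * ||u - v||^2)."""
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_pca_coords(x, gamma, n_components):
    """Coordinates of each frame on the leading kernel principal axes.

    Assumption: kernel PCA is used here as a finite-dimensional proxy for
    the RKHS features; the abstract does not specify the mapping.
    """
    n = x.shape[0]
    k = rbf_kernel(x, gamma)
    # Centre the implicit feature vectors in the RKHS.
    ones = np.full((n, n), 1.0 / n)
    k_centred = k - ones @ k - k @ ones + ones @ k @ ones
    eigvals, eigvecs = np.linalg.eigh(k_centred)  # eigenvalues ascending
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[top], 1e-12)
    # Projection of frame i on axis j is sqrt(lam_j) * eigvecs[i, j].
    return eigvecs[:, top] * np.sqrt(lam)


def gaussian_logdet(z, ridge=1e-6):
    """Log-determinant of the ML covariance of z, lightly regularised."""
    cov = np.atleast_2d(np.cov(z, rowvar=False, bias=True))
    cov = cov + ridge * np.eye(cov.shape[0])
    return np.linalg.slogdet(cov)[1]


def kgm_distance(seg_a, seg_b, gamma=0.1, n_components=3):
    """Gaussian log-likelihood-ratio distance computed on mapped features.

    Larger values suggest that the two segments are better explained by
    two separate Gaussians than by one, i.e. a possible speaker change.
    """
    pooled = np.vstack([seg_a, seg_b])
    z = kernel_pca_coords(pooled, gamma, n_components)
    z_a, z_b = z[: len(seg_a)], z[len(seg_a):]
    n_a, n_b = len(z_a), len(z_b)
    return 0.5 * ((n_a + n_b) * gaussian_logdet(z)
                  - n_a * gaussian_logdet(z_a)
                  - n_b * gaussian_logdet(z_b))
```

In this sketch, `seg_a` and `seg_b` are arrays of shape (frames, feature dimension) holding the features of two adjacent segments. A change would be hypothesised where `kgm_distance` exceeds a threshold, which would have to be tuned on held-out data.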
Bibliographic reference. Gao, Jie / Zhang, Xiang / Zhao, Qingwei / Yan, Yonghong (2008): "Robust speaker change detection using Kernel-Gaussian model", In INTERSPEECH-2008, 2494-2497.