ITRW on Non-Linear Speech Processing (NOLISP 05)

Barcelona, Spain
April 19-22, 2005

Speaker Change Detection using Support Vector Machines

V. Kartik, D. Srikrishna Satish, C. Chandra Sekhar

Speech and Vision Laboratory, Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India

Speaker change detection is important for automatic segmentation of multispeaker speech data into homogeneous segments, each containing the data of only one speaker. Existing approaches to speaker change detection are based on the dissimilarity of the distributions of the data before and after a speaker change point. In this paper, we propose a classification-based technique for speaker change detection. Patterns extracted from the data around the speaker change points are used as positive examples, and patterns extracted from the data between the speaker change points are used as negative examples. These examples are used to train a support vector machine (SVM) for speaker change detection. The trained SVM is then used to scan the continuous speech signal of multispeaker data and hypothesize the points of speaker change. We consider two methods for extracting the fixed-length patterns that are given as input to the SVM. In the first method, the spectral feature vectors of a fixed number of frames are concatenated to derive a pattern vector. In the second method, the sequence of feature vectors is treated as a trajectory, and the outer product matrix of the trajectory matrix is vectorized to derive a pattern vector. The performance of the proposed approach and of the two pattern extraction methods is studied on the extended data of the NIST 2003 speaker recognition evaluation database.
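
The two pattern extraction methods and the scanning step can be illustrated with a minimal sketch. The function names, the window length, the toy feature values, and the orientation of the outer product (a d x d matrix for d-dimensional features) are all assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only: names, window length, and the outer-product
# orientation are assumptions, not details taken from the paper.


def concat_pattern(frames):
    """Method 1: concatenate the feature vectors of a fixed number of frames."""
    return [value for frame in frames for value in frame]


def outer_product_pattern(frames):
    """Method 2: view the frames as a trajectory matrix X (one row per frame),
    form the d x d matrix X^T X, and flatten it row by row (assumed orientation)."""
    dim = len(frames[0])
    matrix = [
        [sum(frame[i] * frame[j] for frame in frames) for j in range(dim)]
        for i in range(dim)
    ]
    return [value for row in matrix for value in row]


def sliding_patterns(features, window, extract):
    """Scan a feature sequence and yield (centre_frame_index, pattern) pairs."""
    for start in range(len(features) - window + 1):
        yield start + window // 2, extract(features[start:start + window])


# Toy data: 5 frames of 2-dimensional features (made-up numbers).
features = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0], [2.0, 2.0], [1.0, 0.0]]
concat = list(sliding_patterns(features, 3, concat_pattern))
outer = list(sliding_patterns(features, 3, outer_product_pattern))
```

In a setup like the one described, windows centred near labelled change points would serve as positive training examples and windows lying between change points as negative ones; at test time, each scanned pattern would be passed to the trained SVM to decide whether its centre frame is a hypothesized change point.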


Bibliographic reference. Kartik, V. / Srikrishna Satish, D. / Chandra Sekhar, C. (2005): "Speaker change detection using support vector machines", in NOLISP-2005, 130-136.