Odyssey 2008: The Speaker and Language Recognition Workshop
Stellenbosch, South Africa
We present a new approach to constructing kernels for support vector machines in speaker verification. The idea is to learn a new kernel as a linear combination of several kernels, such as the Generalized Linear Discriminant Sequence (GLDS) kernel and Gaussian Mixture Model (GMM) supervector kernels. In this linear kernel combination, the weights are speaker dependent, unlike the universal weights used in score-level fusion, and no extra data are needed to estimate them. An experiment on the NIST 2006 speaker recognition evaluation dataset (all trials) was carried out using three different kernel functions (the GLDS kernel and the Gaussian and linear GMM supervector kernels). We compared our kernel combination to the optimal linear score fusion obtained using logistic regression, whose weights were trained on all 1conv4w-1conv4w trials of NIST-SRE 2005. Tested on the NIST-SRE 2006 database, the kernel combination method achieved an equal error rate of 5.9%, which is better than that of the optimal score fusion system (6.1%).
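The linear kernel combination described above can be illustrated with a minimal sketch. It only shows how a non-negative weighted sum of base kernels yields a new kernel. It does not reproduce the paper's procedure for estimating the speaker-dependent weights, which the abstract does not specify. The function names, the toy vectors, the `gamma` value, and the weights `[0.7, 0.3]` are all assumptions chosen for illustration, and a plain linear and a Gaussian kernel on short vectors stand in for the GLDS and GMM supervector kernels.

```python
import math


def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combine_kernels(kernels, weights):
    """Return the kernel k(x, y) = sum_i weights[i] * kernels[i](x, y).

    Non-negative weights keep the combination a valid (positive
    semi-definite) kernel whenever every base kernel is valid.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    def combined(x, y):
        return sum(w * k(x, y) for w, k in zip(weights, kernels))

    return combined


def gram_matrix(kernel, points):
    """Matrix of kernel values between all pairs of points."""
    return [[kernel(p, q) for q in points] for p in points]


# Toy stand-ins for per-utterance feature vectors (hypothetical data).
points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Hypothetical weights for one target speaker; in the paper these are
# speaker dependent, but the values here are made up.
speaker_weights = [0.7, 0.3]

speaker_kernel = combine_kernels([linear_kernel, gaussian_kernel], speaker_weights)
gram = gram_matrix(speaker_kernel, points)
```

A Gram matrix built this way could be handed to any SVM trainer that accepts precomputed kernels, with a different weight vector for each target speaker.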
Bibliographic reference. Dehak, Réda / Dehak, Najim / Kenny, Patrick / Dumouchel, Pierre (2008): "Kernel combination for SVM speaker verification", In Odyssey-2008, paper 021.