ISCA Archive Interspeech 2009

Optimization of discriminative kernels in SVM speaker verification

Shi-Xiong Zhang, Man-Wai Mak

An important aspect of SVM-based speaker verification systems is the design of sequence kernels. These kernels should be able to map variable-length observation sequences to fixed-size supervectors that capture the dynamic characteristics of speech utterances and allow speakers to be easily distinguished. Most existing kernels in SVM speaker verification are obtained by assuming a specific form for the similarity function of supervectors. This paper relaxes this assumption to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate a target speaker from impostors using either regression analysis or SVM training. The idea was applied to both low- and high-level speaker verification. In both cases, results show that the proposed kernels outperform the state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.
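
The central idea of the abstract, a kernel formed as a linear combination of base kernels, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the utterances have already been mapped to fixed-size supervectors, uses a linear kernel and an RBF kernel as stand-in base kernels, and takes the combination weights as given, whereas the paper obtains them by regression analysis or SVM training. The weights are restricted to be non-negative here, an assumption made so that the combination is itself a valid kernel. All function names are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Inner product of two fixed-size supervectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_kernel(x, y, kernels, weights):
    """Weighted sum of base kernels evaluated on one pair of supervectors.

    Assumption: weights are non-negative, so the weighted sum of
    positive semi-definite kernels is again positive semi-definite.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per base kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative in this sketch")
    return sum(w * k(x, y) for k, w in zip(kernels, weights))


def gram_matrix(vectors, kernels, weights):
    """Gram matrix of the combined kernel over a list of supervectors."""
    return [
        [combined_kernel(u, v, kernels, weights) for v in vectors]
        for u in vectors
    ]


# Toy supervectors (hypothetical data, not from the paper).
supervectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
base_kernels = [linear_kernel, rbf_kernel]
weights = [0.5, 0.5]  # fixed here; learned in the paper
K = gram_matrix(supervectors, base_kernels, weights)
```

A Gram matrix such as `K` could then be passed to any SVM trainer that accepts precomputed kernels. Choosing the weights to best separate a target speaker from impostors is the optimization step the paper addresses and is not reproduced here.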

doi: 10.21437/Interspeech.2009-380

Cite as: Zhang, S.-X., Mak, M.-W. (2009) Optimization of discriminative kernels in SVM speaker verification. Proc. Interspeech 2009, 1275-1278, doi: 10.21437/Interspeech.2009-380

  author={Shi-Xiong Zhang and Man-Wai Mak},
  title={{Optimization of discriminative kernels in SVM speaker verification}},
  booktitle={Proc. Interspeech 2009},