15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

SVM Based Speaker Recognition: Harnessing Trials with Multiple Enrollment Sessions

Jason Pelecanos (1), Weizhong Zhu (1), Sibel Yaman (2)

(1) IBM T.J. Watson Research Center, USA
(2) Apple, USA

In this paper we extend a variation of the trial-based SVM speaker verification work proposed by Cumani et al. to exploit multiple enrollment sessions. Specifically, Cumani et al. proposed the use of a second-order SVM kernel for the binary classification of basic trials. In this work, trials with multiple enrollment sessions are modelled by stacking the i-vectors of the test and enrollment sessions. We further exploit the fact that the score should be independent of the order of the enrollment recordings, and present a correspondingly simplified second-order polynomial kernel scoring function.
   In the second part of this work we examine the utility of enrollment pruning for multi-session enrollments. Past work demonstrates that pruning can be beneficial for PLDA-based systems. We examine the effects of enrollment pruning in the context of the proposed SVM model.
   The results demonstrate that the multi-session enrollment SVM kernel generally outperforms the model trained using single sessions. The model is also comparable in performance to the PLDA-based approach. Further gains are observed by combining the PLDA and SVM scores.
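
The stacking and order-independence ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' formulation: the abstract does not give the simplified scoring function, so the parameter-tied quadratic form below is only one plausible way to make a second-order score invariant to enrollment order. The function names and the matrices `P_tt`, `P_ee`, `P_te` and `P_cross` are hypothetical and exist only for this illustration.

```python
import numpy as np


def stack_trial(test_ivec, enroll_ivecs):
    """Stack the test i-vector on top of the enrollment i-vectors."""
    return np.concatenate([test_ivec, *enroll_ivecs])


def poly2_kernel(x, y, c=1.0):
    """Generic second-order polynomial kernel between two stacked trials."""
    return (np.dot(x, y) + c) ** 2


def order_invariant_score(test_ivec, enroll_ivecs,
                          P_tt, P_ee, P_te, P_cross, bias=0.0):
    """Second-order score whose parameters are tied across enrollment sessions.

    Assumption (not from the paper): every enrollment session shares the same
    parameter blocks, so permuting the sessions cannot change the score.
    """
    E = np.asarray(enroll_ivecs)   # shape: (num_sessions, dim)
    e_sum = E.sum(axis=0)          # a sum is unaffected by session order

    test_term = test_ivec @ P_tt @ test_ivec
    # sum_i e_i^T P_ee e_i
    same_term = np.einsum("id,de,ie->", E, P_ee, E)
    # 2 * sum_i t^T P_te e_i
    te_term = 2.0 * (test_ivec @ P_te @ e_sum)
    # sum_{i != j} e_i^T P_cross e_j
    cross_term = (e_sum @ P_cross @ e_sum
                  - np.einsum("id,de,ie->", E, P_cross, E))
    return test_term + same_term + te_term + cross_term + bias
```

Because each term depends on the enrollment sessions only through sums over sessions, calling `order_invariant_score` with the enrollment i-vectors in any order gives the same value. A generic kernel such as `poly2_kernel` applied directly to the output of `stack_trial` has no such guarantee unless its parameters are tied in a similar way.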

Bibliographic reference.  Pelecanos, Jason / Zhu, Weizhong / Yaman, Sibel (2014): "SVM based speaker recognition: harnessing trials with multiple enrollment sessions", In INTERSPEECH-2014, 691-695.