16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

JFA for Speaker Recognition with Random Digit Strings

Themos Stafylakis (1), Patrick Kenny (1), Md. Jahangir Alam (1), Marcel Kockmann (2)

(1) CRIM, Canada
(2) VoiceTrust, Canada

In this paper, we examine the use of Joint Factor Analysis (JFA) methods on the RSR2015 digits task. A tied-mixture model is used to segment the utterances into digits, while Joint Factor Analysis and a Joint Density model are deployed as feature extractor and backend, respectively. A novel approach for digit-dependent fusion of UBM-component log-likelihood ratios is introduced, yielding the best results so far. The fusion of 5 different JFA features gives an equal-error rate of 3.6%, compared to 6.3% attained by a baseline GMM-UBM model with score normalization.
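
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an equal-error rate can be estimated from verification scores. It is not taken from the paper: the function name `equal_error_rate`, the threshold sweep, and the toy scores are illustrative assumptions, and real evaluations usually interpolate the error curves rather than picking the closest threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scoring at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Made-up scores, purely for demonstration.
targets = [2.0, 1.5, 1.0, 0.2]
nontargets = [0.5, 0.1, -0.3, -1.0]
print(equal_error_rate(targets, nontargets))  # prints 0.25
```

On this reading, an equal-error rate of 3.6% corresponds to the threshold at which about 3.6% of target trials are rejected and about 3.6% of non-target trials are accepted.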

Bibliographic reference.  Stafylakis, Themos / Kenny, Patrick / Alam, Md. Jahangir / Kockmann, Marcel (2015): "JFA for speaker recognition with random digit strings", In INTERSPEECH-2015, 190-194.