Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification

Jianbo Ma, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Kong Aik Lee


Short-duration speaker verification is a challenging problem, partly due to utterance duration mismatch. This paper proposes a novel method that modifies the standard Gaussian probabilistic linear discriminant analysis (G-PLDA) to use two separate, jointly trained generative models for i-vectors from long and short utterances. The proposed twin model G-PLDA employs distinct models for i-vectors of different durations from the same speaker, with both models sharing the same latent variables. Unlike the standard G-PLDA, the twin model G-PLDA takes the differences between utterances of varying durations into account. Hyper-parameter estimation and scoring formulae for the twin model G-PLDA are presented. Experimental results obtained using NIST 2010 data show that the proposed technique leads to relative improvements of 8.5% and 15.6% when tested on utterances of 5-second and 3-second durations, respectively.
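
As a rough illustration of the idea described above, the standard G-PLDA and a twin-model variant might be written as follows. This is only a sketch under stated assumptions: the notation (i-vector w, mean m, speaker subspace Φ, latent speaker variable y, residual ε with covariance Σ, and the superscripts l and s for long and short utterances) is chosen here for illustration and is not taken from the paper. The choice to give each duration condition its own mean, subspace, and residual covariance is likewise an assumption about what "distinct models" means.

```latex
% Standard G-PLDA: a single generative model for every i-vector of speaker i
% (notation assumed for illustration)
\begin{align*}
\mathbf{w}_{ij} &= \mathbf{m} + \boldsymbol{\Phi}\,\mathbf{y}_i + \boldsymbol{\varepsilon}_{ij},
& \mathbf{y}_i &\sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
& \boldsymbol{\varepsilon}_{ij} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})
\end{align*}

% Twin-model sketch: separate parameters for long (l) and short (s) utterances,
% tied together by the same latent speaker variable y_i
\begin{align*}
\mathbf{w}^{(l)}_{ij} &= \mathbf{m}^{(l)} + \boldsymbol{\Phi}^{(l)}\,\mathbf{y}_i + \boldsymbol{\varepsilon}^{(l)}_{ij},
& \boldsymbol{\varepsilon}^{(l)}_{ij} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}^{(l)}) \\
\mathbf{w}^{(s)}_{ik} &= \mathbf{m}^{(s)} + \boldsymbol{\Phi}^{(s)}\,\mathbf{y}_i + \boldsymbol{\varepsilon}^{(s)}_{ik},
& \boldsymbol{\varepsilon}^{(s)}_{ik} &\sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}^{(s)})
\end{align*}
```

Under this reading, a long enrolment utterance and a short test utterance from the same speaker are explained by one shared y_i but pass through different duration-specific parameters, which is how the duration mismatch would enter the model.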


DOI: 10.21437/Interspeech.2016-683

Cite as

Ma, J., Sethu, V., Ambikairajah, E., Lee, K.A. (2016) Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification. Proc. Interspeech 2016, 1853-1857.

Bibtex
@inproceedings{Ma+2016,
author={Jianbo Ma and Vidhyasaharan Sethu and Eliathamby Ambikairajah and Kong Aik Lee},
title={Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-683},
url={http://dx.doi.org/10.21437/Interspeech.2016-683},
pages={1853--1857}
}