To address the degradation in speaker recognition performance caused by duration mismatch between enrollment and test utterances, a novel strategy is employed that modifies the standard normal prior distribution of the i-vector during probabilistic linear discriminant analysis (PLDA) modeling. The modified-prior PLDA model incorporates a covariance matrix scaled by the duration of each utterance for each speaker, and becomes more discriminative by learning duration variability as well as session variability in the i-vector space. Furthermore, an efficient Quality Measure Function (QMF) method, which uses duration variation as a compensation technique, is employed to eliminate the linear shift in the score domain. To evaluate the robustness of the proposed approach, experiments were conducted on the NIST SRE10 core-core task, condition 5, with varying test utterance durations: the i-vectors of test utterances were extracted from the full segments and from randomly truncated segments of 10 s and 20 s. The results demonstrate the effectiveness of modified-prior PLDA under different duration conditions, and the combined score calibration further improves speaker recognition performance.
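The score-domain compensation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the QMF formula, so the form below (a linear map of the raw score plus a log-duration term) is an assumption borrowed from common duration-aware calibration practice, not the authors' method. The function name `qmf_calibrate` and the parameters `scale`, `offset`, and `duration_weight` are hypothetical; in practice their values would be fitted on held-out development trials.

```python
import math


def qmf_calibrate(raw_score, test_duration_s,
                  scale=1.0, offset=0.0, duration_weight=0.0):
    """Shift a raw PLDA score by a term that depends on test duration.

    Assumed form (not taken from the paper):
        calibrated = scale * raw_score + offset
                     + duration_weight * log(test_duration_s)
    """
    if test_duration_s <= 0:
        raise ValueError("test_duration_s must be positive")
    # The log-duration term is the quality measure; it moves scores from
    # short and long test segments by different amounts.
    return scale * raw_score + offset + duration_weight * math.log(test_duration_s)


# Same raw score, two truncated test durations (10 s and 20 s).
# The weight 0.5 is an arbitrary illustrative value.
score_10s = qmf_calibrate(1.0, 10.0, duration_weight=0.5)
score_20s = qmf_calibrate(1.0, 20.0, duration_weight=0.5)
```

With a nonzero `duration_weight`, identical raw scores from 10 s and 20 s test segments map to different calibrated scores, which is the kind of duration-dependent shift a single decision threshold cannot absorb on its own.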
Bibliographic reference. Hong, QingYang / Li, Lin / Li, Ming / Huang, Ling / Wan, Lihong / Zhang, Jun (2015): "Modified-prior PLDA and score calibration for duration mismatch compensation in speaker recognition system", In INTERSPEECH-2015, 1037-1041.