Speaker verification systems are usually evaluated by a weighted average of their false acceptance (FA) rate and false rejection (FR) rate. The weights are known as the operating point (OP) and depend on the application. Recent research suggests that, for score calibration of speaker verification systems, it is beneficial to let discriminative training emphasize the operating points of interest, i.e., to use application-specific loss functions. In score calibration, a transformation is applied to the scores so that they better represent likelihood ratios. The same application-specific training objective can be used in discriminative training of all parameters of a speaker verification system. In this study, we apply application-specific loss functions in discriminative PLDA training. We observe an improvement in MDC for the male trials of the NIST SRE10 telephone condition at the targeted operating point compared to the baseline, discriminative PLDA training with logistic regression loss.
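As an illustration of the evaluation measure described in the abstract, the following minimal sketch computes FA and FR rates at a fixed threshold and combines them into a weighted average. The function names, the toy scores, the threshold, and the weighting by a target prior `p_target` and costs `c_fa` and `c_fr` are assumptions made for this example (they follow the common detection-cost convention) and are not taken from the paper.

```python
import numpy as np

def error_rates(target_scores, nontarget_scores, threshold):
    """Return (FA rate, FR rate) when trials scoring >= threshold are accepted."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    # False rejection: a target trial scores below the threshold.
    fr_rate = float(np.mean(target_scores < threshold))
    # False acceptance: a non-target trial scores at or above the threshold.
    fa_rate = float(np.mean(nontarget_scores >= threshold))
    return fa_rate, fr_rate

def detection_cost(fa_rate, fr_rate, p_target, c_fa=1.0, c_fr=1.0):
    """Weighted average of FR and FA rates; the weights define the operating point.

    p_target, c_fa and c_fr are assumed, application-dependent parameters.
    """
    return c_fr * p_target * fr_rate + c_fa * (1.0 - p_target) * fa_rate

# Hypothetical toy scores, for illustration only.
target_scores = [2.1, 0.4, 3.0, -0.5]
nontarget_scores = [-1.2, 0.7, -2.0, -0.3]

fa, fr = error_rates(target_scores, nontarget_scores, threshold=0.0)
cost = detection_cost(fa, fr, p_target=0.01)
print(fa, fr, cost)
```

With a small `p_target`, the cost is dominated by the FA term, which is one way an application can shift emphasis toward a particular operating point.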
Cite as: Rohdin, J., Biswas, S., Shinoda, K. (2014) Discriminative PLDA training with application-specific loss functions for speaker verification. Proc. The Speaker and Language Recognition Workshop (Odyssey 2014), 26-32, doi: 10.21437/Odyssey.2014-5
@inproceedings{rohdin14_odyssey,
  author={Johan Rohdin and Sangeeta Biswas and Koichi Shinoda},
  title={{Discriminative PLDA training with application-specific loss functions for speaker verification}},
  year=2014,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2014)},
  pages={26--32},
  doi={10.21437/Odyssey.2014-5}
}