Partial AUC Metric Learning Based Speaker Verification Back-End

Zhongxin Bai, Xiao-Lei Zhang, Jingdong Chen


Equal error rate (EER) is a widely used evaluation metric for speaker verification, but it reflects the performance of a verification system at only a single decision threshold. A threshold tuned for one application scenario is generally not optimal when the system is used in another. This motivates optimizing performance over a range of decision thresholds. To this end, we propose to optimize the parameters of a squared Mahalanobis distance metric so as to directly maximize the partial area under the ROC curve (pAUC) over a false positive rate range of interest. Experimental results on the NIST SRE 2016 dataset and the core tasks of the Speakers in the Wild (SITW) dataset illustrate the effectiveness of the proposed algorithm.
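To make the two quantities named in the abstract concrete, the following is a minimal sketch of a squared Mahalanobis distance and an empirical pAUC estimate over a false positive rate range. It is not the authors' training procedure, which the abstract does not describe. The function names, the parameters `alpha` and `beta` (the lower and upper FPR bounds), and the convention that a higher score means "same speaker" are all assumptions made for illustration.

```python
import numpy as np

def squared_mahalanobis(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y).

    M is assumed to be a symmetric positive semi-definite matrix;
    in a metric-learning back-end it would be the learned parameter.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff @ np.asarray(M, dtype=float) @ diff)

def partial_auc(target_scores, nontarget_scores, alpha=0.0, beta=0.1):
    """Empirical pAUC over FPR in [alpha, beta], normalized to [0, 1].

    Assumes higher scores indicate a target (same-speaker) trial,
    e.g. score = -squared_mahalanobis(...).
    """
    pos = np.asarray(target_scores, dtype=float)
    # Sort non-target scores from highest (hardest) to lowest.
    neg = np.sort(np.asarray(nontarget_scores, dtype=float))[::-1]
    lo = int(np.floor(alpha * len(neg)))
    hi = int(np.ceil(beta * len(neg)))
    hard_neg = neg[lo:hi]  # non-targets that fall inside the FPR range
    if pos.size == 0 or hard_neg.size == 0:
        raise ValueError("no trials fall inside the requested FPR range")
    # Fraction of (target, selected non-target) pairs ranked correctly.
    return float((pos[:, None] > hard_neg[None, :]).mean())

# Hypothetical scores, for illustration only.
targets = [0.9, 0.8, 0.3]
nontargets = [0.85, 0.5, 0.2, 0.1]
pauc_low_fpr = partial_auc(targets, nontargets, alpha=0.0, beta=0.5)
full_auc = partial_auc(targets, nontargets, alpha=0.0, beta=1.0)
```

With `alpha=0` and `beta=1` the estimate reduces to the ordinary AUC. Narrowing the range restricts the comparison to the highest-scoring non-target trials, which are the ones that determine behavior at low false positive rates.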


DOI: 10.21437/Odyssey.2020-53

Cite as: Bai, Z., Zhang, X., Chen, J. (2020) Partial AUC Metric Learning Based Speaker Verification Back-End. Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, 380-384, DOI: 10.21437/Odyssey.2020-53.


@inproceedings{Bai2020,
  author={Zhongxin Bai and Xiao-Lei Zhang and Jingdong Chen},
  title={{Partial AUC Metric Learning Based Speaker Verification Back-End}},
  year=2020,
  booktitle={Proc. Odyssey 2020 The Speaker and Language Recognition Workshop},
  pages={380--384},
  doi={10.21437/Odyssey.2020-53},
  url={http://dx.doi.org/10.21437/Odyssey.2020-53}
}