i-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification

Zhili Tan, Man-Wai Mak


This paper proposes applying multi-task learning to train deep neural networks (DNNs) for calibrating the PLDA scores of speaker verification systems in noisy environments. To help the DNNs learn the main task (calibration), several auxiliary tasks are introduced: predicting SNR and duration from i-vectors, and classifying whether an i-vector pair belongs to the same speaker. The possibility of replacing the PLDA model with a DNN during the scoring stage is also explored. Evaluations on noise-contaminated speech suggest that the auxiliary tasks are important for the DNNs to learn the main calibration task and that the uncalibrated PLDA scores are an essential input to the DNNs. Without this input, the DNNs can accurately predict only the score shifts, suggesting that the PLDA model is indispensable.
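The multi-task setup described above can be illustrated with a minimal NumPy sketch. It is not the authors' architecture: the layer sizes, the single shared tanh layer, the one-SNR-and-one-duration-per-utterance targets, the loss weighting, and all names (`init_params`, `forward`, `multitask_loss`, `aux_weight`) are assumptions made only for illustration. The sketch shows an i-vector pair and its uncalibrated PLDA score feeding a shared layer, with one main head (calibrated score) and three auxiliary heads (SNR, duration, same-speaker decision).

```python
import numpy as np

IVEC_DIM = 8     # hypothetical; real i-vectors are much larger
HIDDEN_DIM = 16  # hypothetical shared-layer width


def init_params(rng, ivec_dim=IVEC_DIM, hidden_dim=HIDDEN_DIM):
    """Random weights for one shared layer and four task heads (all assumed)."""
    in_dim = 2 * ivec_dim + 1  # two i-vectors plus the uncalibrated PLDA score

    def dense(n_in, n_out):
        return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

    return {
        "shared": dense(in_dim, hidden_dim),
        "score": dense(hidden_dim, 1),         # main task: calibrated score
        "snr": dense(hidden_dim, 2),           # auxiliary: SNR of each utterance
        "duration": dense(hidden_dim, 2),      # auxiliary: duration of each utterance
        "same_speaker": dense(hidden_dim, 1),  # auxiliary: same-speaker logit
    }


def forward(params, ivec_a, ivec_b, plda_score):
    """Shared hidden layer followed by one linear head per task."""
    x = np.concatenate([ivec_a, ivec_b, [plda_score]])
    w_sh, b_sh = params["shared"]
    h = np.tanh(x @ w_sh + b_sh)
    return {
        name: h @ w_k + b_k
        for name, (w_k, b_k) in params.items()
        if name != "shared"
    }


def multitask_loss(out, target, aux_weight=0.5):
    """Main-task squared error plus weighted auxiliary losses (weighting assumed)."""
    main = (out["score"][0] - target["score"]) ** 2
    snr = np.mean((out["snr"] - target["snr"]) ** 2)
    dur = np.mean((out["duration"] - target["duration"]) ** 2)
    z = out["same_speaker"][0]
    # Binary cross-entropy written directly on the logit for numerical stability.
    bce = np.logaddexp(0.0, z) - target["same_speaker"] * z
    return main + aux_weight * (snr + dur + bce)


rng = np.random.default_rng(0)
params = init_params(rng)
ivec_a = rng.normal(size=IVEC_DIM)
ivec_b = rng.normal(size=IVEC_DIM)
out = forward(params, ivec_a, ivec_b, plda_score=1.3)

# Made-up targets, for illustration only.
target = {
    "score": 2.0,
    "snr": np.array([6.0, 15.0]),
    "duration": np.array([10.0, 30.0]),
    "same_speaker": 1.0,
}
loss = multitask_loss(out, target)
print(float(loss))
```

In this sketch, dropping `plda_score` from the input vector would correspond to the variant in which the DNN replaces the PLDA model at the scoring stage.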


DOI: 10.21437/Interspeech.2017-656

Cite as: Tan, Z., Mak, M. (2017) i-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification. Proc. Interspeech 2017, 1562-1566, DOI: 10.21437/Interspeech.2017-656.


@inproceedings{Tan2017,
  author={Zhili Tan and Man-Wai Mak},
  title={i-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1562--1566},
  doi={10.21437/Interspeech.2017-656},
  url={http://dx.doi.org/10.21437/Interspeech.2017-656}
}