The I2R’s Submission to VOiCES Distance Speaker Recognition Challenge 2019

Hanwu Sun, Kah Kuan Teh, Ivan Kukanov, Huy Dat Tran


This paper describes I2R's submission to the VOiCES from a Distance speaker recognition challenge 2019. The submission is based on the fusion of two x-vector and two i-vector subsystems. Our main efforts focused on front-end de-reverberation processing, PLDA back-end design, score normalization, and fusion, with the aim of improving system performance on single-channel distant/far-field audio under noisy conditions. We participated in the fixed condition task, using the specified training and development datasets. The experimental results show that the de-reverberation approach achieves a 5% to 10% relative improvement in both EER and DCF for all subsystems, more than 10% relative improvement for the final fusion system on the development dataset, and more than 15% relative improvement on the final evaluation dataset. Our final fusion system achieved an EER of about 2% and a minDCF of 0.240 on the development dataset.
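The abstract reports results as EER and as relative improvements, so a minimal sketch of both calculations may help readers interpret the figures. It is not the authors' evaluation code: the function names `relative_improvement` and `equal_error_rate`, the threshold-sweep approximation of the EER, and the toy scores are all assumptions made for illustration.

```python
def relative_improvement(baseline, improved):
    """Relative reduction of an error metric (e.g. EER or minDCF), in percent."""
    return 100.0 * (baseline - improved) / baseline


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Returns the mean of the miss and false-alarm rates at the threshold
    where the two rates are closest to each other.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2.0
    return eer


# Toy scores, invented for illustration only (not data from the paper).
targets = [2.0, 1.5, 0.2, 1.8]
nontargets = [-1.0, 0.5, -0.3, 0.1]
print(equal_error_rate(targets, nontargets))  # 0.25 for these toy scores
print(relative_improvement(10.0, 9.0))        # 10.0 (percent)
```

Under this definition, a 10% relative improvement means the error metric drops to 90% of its baseline value, not that it falls by 10 absolute percentage points.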


 DOI: 10.21437/Interspeech.2019-1997

Cite as: Sun, H., Teh, K.K., Kukanov, I., Tran, H.D. (2019) The I2R’s Submission to VOiCES Distance Speaker Recognition Challenge 2019. Proc. Interspeech 2019, 2478-2482, DOI: 10.21437/Interspeech.2019-1997.


@inproceedings{Sun2019,
  author={Hanwu Sun and Kah Kuan Teh and Ivan Kukanov and Huy Dat Tran},
  title={{The I2R’s Submission to VOiCES Distance Speaker Recognition Challenge 2019}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={2478--2482},
  doi={10.21437/Interspeech.2019-1997},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1997}
}