The NEC-TT 2018 Speaker Verification System

Kong Aik Lee, Hitoshi Yamamoto, Koji Okabe, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda

This paper describes the NEC-TT speaker verification system for the 2018 NIST speaker recognition evaluation (SRE’18). We present the details of the data partitioning, x-vector speaker embedding, data augmentation, speaker diarization, and domain adaptation techniques used in the NEC-TT SRE’18 speaker verification system. For the speaker embedding front-end, we found that the amount and diversity of training data are essential to improving the robustness of the x-vector extractor. In our submission, this was achieved with data augmentation and mixed-bandwidth training. For the multi-speaker test scenario, we show that x-vector based speaker diarization is promising and holds potential for future research. For the scoring back-end, we used two variants of probabilistic linear discriminant analysis (PLDA), namely Gaussian PLDA and heavy-tailed PLDA. We show that correlation alignment (CORAL) and CORAL+ unsupervised PLDA adaptation are effective in dealing with domain mismatch.
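To make the correlation alignment idea concrete, the following is a minimal sketch of a generic CORAL feature transform: out-of-domain embeddings are whitened with their own covariance and re-colored with the in-domain covariance, so that their second-order statistics match the target domain. It is not the NEC-TT implementation. The function names (`_sym_power`, `coral_transform`), the regularization constant `reg`, and the use of NumPy are all assumptions made for illustration, and the CORAL+ variant, which adapts the PLDA model itself, is not shown.

```python
import numpy as np


def _sym_power(cov, power, eps=1e-8):
    """Raise a symmetric positive semi-definite matrix to a real power.

    Hypothetical helper: uses an eigendecomposition and clips tiny
    eigenvalues to `eps` to keep negative powers numerically stable.
    """
    vals, vecs = np.linalg.eigh(cov)
    vals = np.maximum(vals, eps)
    return (vecs * vals ** power) @ vecs.T


def coral_transform(out_domain, in_domain, reg=1e-6):
    """Align the covariance of out-of-domain embeddings to in-domain data.

    out_domain: array of shape (n_out, dim), e.g. x-vectors from the
                training domain (assumed roughly zero-mean).
    in_domain:  array of shape (n_in, dim), unlabeled target-domain
                x-vectors.
    reg:        assumed small diagonal loading added to both covariances.
    """
    dim = out_domain.shape[1]
    cov_out = np.cov(out_domain, rowvar=False) + reg * np.eye(dim)
    cov_in = np.cov(in_domain, rowvar=False) + reg * np.eye(dim)

    whiten = _sym_power(cov_out, -0.5)   # removes out-of-domain correlations
    recolor = _sym_power(cov_in, 0.5)    # imposes in-domain correlations
    return out_domain @ whiten @ recolor
```

In a pipeline of this kind, the transformed out-of-domain embeddings would typically be used to train the PLDA back-end in place of the original ones. Only the covariance is aligned here; any mean normalization is left to the surrounding pipeline.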

 DOI: 10.21437/Interspeech.2019-1517

Cite as: Lee, K.A., Yamamoto, H., Okabe, K., Wang, Q., Guo, L., Koshinaka, T., Zhang, J., Shinoda, K. (2019) The NEC-TT 2018 Speaker Verification System. Proc. Interspeech 2019, 4355-4359, DOI: 10.21437/Interspeech.2019-1517.

  author={Kong Aik Lee and Hitoshi Yamamoto and Koji Okabe and Qiongqiong Wang and Ling Guo and Takafumi Koshinaka and Jiacen Zhang and Koichi Shinoda},
  title={{The NEC-TT 2018 Speaker Verification System}},
  booktitle={Proc. Interspeech 2019},