This paper presents the development of the ZXIC speaker verification system submitted to Task 1 of the Interspeech 2022 Far-Field Speaker Verification Challenge (FFSVC2022). Deep neural network based discriminative embeddings, such as x-vectors, have been shown to perform well in speaker verification tasks. In far-field speaker verification, two kinds of mismatch substantially degrade system performance: mismatch between training and testing data, and mismatch between enrollment and authentication utterances. To alleviate these mismatches, we propose a novel multi-reader domain adaptation learning framework based on asymmetric metric learning. We also explore advanced neural network based embedding extractor structures, including ECAPA-TDNN and ResNet-SE. Experiments on these architectures show that the proposed method is effective and improves system performance. The final submitted systems are fusions of several models. In FFSVC2022, our best system achieves a minimum detection cost function (minDCF) of 0.511 and an equal error rate (EER) of 4.409% on the evaluation set.
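For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how EER and normalized minDCF are commonly computed from a list of trial scores. It is not the challenge's official scoring tool. The function names are hypothetical, and the operating point (`p_target=0.01`, unit miss and false-alarm costs) is an assumption that is not stated in the abstract. Tied scores are not handled specially.

```python
import numpy as np

def error_rates(scores, labels):
    """Miss and false-alarm rates at every threshold, from accept-all to reject-all.

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    labels = labels[np.argsort(scores, kind="mergesort")]  # ascending by score
    n_tar = labels.sum()
    n_non = labels.size - n_tar
    # Entry i corresponds to rejecting the i lowest-scoring trials.
    fnr = np.concatenate(([0.0], np.cumsum(labels) / n_tar))
    fpr = np.concatenate(([1.0], 1.0 - np.cumsum(1 - labels) / n_non))
    return fnr, fpr

def compute_eer(fnr, fpr):
    """Equal error rate: the point where miss and false-alarm rates are closest."""
    idx = np.argmin(np.abs(fnr - fpr))
    return (fnr[idx] + fpr[idx]) / 2.0

def compute_min_dcf(fnr, fpr, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized minimum detection cost (operating point is an assumption)."""
    dcf = c_miss * p_target * fnr + c_fa * (1.0 - p_target) * fpr
    # Normalize by the cost of the best trivial system (accept all or reject all).
    c_default = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return dcf.min() / c_default

# Toy usage with made-up scores; these are not challenge data.
fnr, fpr = error_rates([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
eer = compute_eer(fnr, fpr)
min_dcf = compute_min_dcf(fnr, fpr)
```

With this normalization, a minDCF of 1.0 corresponds to a system no better than always accepting or always rejecting, so lower values indicate better detection at the chosen operating point.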
Cite as: Lei, Y., Cao, Z., Kong, D., Xu, K. (2022) ZXIC Speaker Verification System for FFSVC 2022 Challenge. Proc. The 2022 Far-field Speaker Verification Challenge (FFSVC2022), 1-5, doi: 10.21437/FFSVC.2022-1
@inproceedings{lei22_ffsvc,
  author={Yuan Lei and Zhou Cao and Dehui Kong and Ke Xu},
  title={{ZXIC Speaker Verification System for FFSVC 2022 Challenge}},
  year=2022,
  booktitle={Proc. The 2022 Far-field Speaker Verification Challenge (FFSVC2022)},
  pages={1--5},
  doi={10.21437/FFSVC.2022-1}
}