Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification

Siqi Zheng, Gang Liu, Hongbin Suo, Yun Lei


This study aims to improve the performance of a speaker verification system when no labeled out-of-domain data is available. An autoencoder-based semi-supervised curriculum learning scheme is proposed to automatically cluster unlabeled data and iteratively update the training corpus during training. This new training scheme allows us to (1) progressively expand the training corpus by exploiting unlabeled data and correcting previously assigned labels at run-time; and (2) improve robustness when generalizing to multiple conditions, such as out-of-domain and text-independent speaker verification tasks. We also find that a denoising autoencoder can significantly enhance clustering accuracy when it is trained on a carefully selected subset of speakers. Our experimental results show a relative reduction of 30%–50% in EER compared to the baseline.
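The sketch below illustrates the kind of iterative pseudo-labeling loop the abstract describes: cluster unlabeled embeddings, keep only clusters that look reliable, add them to the training pool, and repeat. It is not the authors' implementation; the cosine agglomerative clustering, the purity heuristic, and all thresholds are assumptions made for illustration, and the embedding extractor is mocked with synthetic vectors.

```python
# A minimal sketch of one pseudo-labeling round in a semi-supervised
# curriculum loop (illustrative; not the paper's exact algorithm).
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.pairwise import cosine_similarity


def confident_pseudo_labels(embeddings, distance_threshold=0.6,
                            min_cluster_size=5, min_intra_sim=0.5):
    """Cluster unlabeled speaker embeddings and keep only clusters
    that appear pure enough to use as pseudo-labeled training data."""
    labels = AgglomerativeClustering(
        n_clusters=None,
        distance_threshold=distance_threshold,  # assumed cosine-distance cutoff
        metric="cosine",
        linkage="average",
    ).fit_predict(embeddings)

    keep_idx, keep_lab = [], []
    for c in np.unique(labels):
        members = np.where(labels == c)[0]
        if len(members) < min_cluster_size:
            continue                              # too small to trust
        sims = cosine_similarity(embeddings[members])
        mean_sim = sims[np.triu_indices(len(members), k=1)].mean()
        if mean_sim >= min_intra_sim:             # crude purity proxy
            keep_idx.extend(members)
            keep_lab.extend([c] * len(members))
    return np.asarray(keep_idx), np.asarray(keep_lab)


if __name__ == "__main__":
    # Mock "unlabeled out-of-domain" embeddings: 10 speakers, 20 utterances each.
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(10, 128))
    unlabeled = np.repeat(centers, 20, axis=0) + 0.3 * rng.normal(size=(200, 128))

    idx, pseudo = confident_pseudo_labels(unlabeled)
    print(f"{len(idx)} of {len(unlabeled)} utterances received pseudo-labels")
    # In the full curriculum scheme, the speaker-embedding network (and the
    # denoising autoencoder used to clean embeddings before clustering) would
    # now be retrained on labeled + pseudo-labeled data, embeddings re-extracted,
    # and clustering repeated, correcting labels assigned in earlier rounds.
```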


 DOI: 10.21437/Interspeech.2019-1440

Cite as: Zheng, S., Liu, G., Suo, H., Lei, Y. (2019) Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification. Proc. Interspeech 2019, 4360-4364, DOI: 10.21437/Interspeech.2019-1440.


@inproceedings{Zheng2019,
  author={Siqi Zheng and Gang Liu and Hongbin Suo and Yun Lei},
  title={{Autoencoder-Based Semi-Supervised Curriculum Learning for Out-of-Domain Speaker Verification}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={4360--4364},
  doi={10.21437/Interspeech.2019-1440},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1440}
}