Transfer-Representation Learning for Detecting Spoofing Attacks with Converted and Synthesized Speech in Automatic Speaker Verification System

Su-Yu Chang, Kai-Cheng Wu, Chia-Ping Chen


In this paper, we study a countermeasure module to detect spoofing attacks with converted or synthesized speech in tandem automatic speaker verification (ASV) systems. Our approach integrates representation learning and transfer learning methods. For representation learning, good embedding network functions are learned from audio signals with the goal of distinguishing different types of spoofing attacks. For transfer learning, the embedding network functions are used to initialize fine-tuning networks. We experiment with well-known neural network architectures and front-end raw features to diversify and strengthen the information source for embedding. We participate in the 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2019) and evaluate the proposed methods on the logical access condition tasks for detecting converted speech and synthesized speech. On the ASVspoof 2019 development set, our best single system achieves a minimum tandem decision cost function of nearly 0. On the ASVspoof 2019 evaluation set, our primary system achieves a minimum tandem decision cost function of 0.1791 and an equal error rate (EER) of 9.08%. Our system does not suffer from over-training, as it performs well on unseen test data of the types present in training; however, the generalization gap remains substantial for test data of mismatched types.
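
For readers unfamiliar with the equal error rate reported above, the following is a minimal sketch of how an EER can be estimated from countermeasure scores. It is not the challenge's official scoring tool: the function name `equal_error_rate`, the convention that higher scores mean "more likely bona fide", and the choice to take the threshold where the two error rates are closest (without interpolation) are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate (eer, threshold) from two lists of detection scores.

    Assumption: a higher score means the trial looks more bona fide,
    and a trial is accepted when its score is >= the threshold.
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores (hypothetical, not from the paper).
    bonafide = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.6, 0.3, 0.2, 0.1]
    eer, threshold = equal_error_rate(bonafide, spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

In the toy example, one bona fide trial falls below the threshold of 0.6 and one spoofed trial reaches it, so both error rates are 25%. Production scoring typically interpolates between thresholds, so values from this sketch may differ slightly from those of an official scorer.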


DOI: 10.21437/Interspeech.2019-2014

Cite as: Chang, S., Wu, K., Chen, C. (2019) Transfer-Representation Learning for Detecting Spoofing Attacks with Converted and Synthesized Speech in Automatic Speaker Verification System. Proc. Interspeech 2019, 1063-1067, DOI: 10.21437/Interspeech.2019-2014.


@inproceedings{Chang2019,
  author={Su-Yu Chang and Kai-Cheng Wu and Chia-Ping Chen},
  title={{Transfer-Representation Learning for Detecting Spoofing Attacks with Converted and Synthesized Speech in Automatic Speaker Verification System}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1063--1067},
  doi={10.21437/Interspeech.2019-2014},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2014}
}