ISCA Archive Interspeech 2017

Ensemble Learning for Countermeasure of Audio Replay Spoofing Attack in ASVspoof2017

Zhe Ji, Zhi-Yi Li, Peng Li, Maobo An, Shengxiang Gao, Dan Wu, Faru Zhao

To enhance the security and reliability of automatic speaker verification (ASV) systems, the ASVspoof 2017 challenge focuses on detecting known and unknown audio replay attacks. We propose an ensemble learning classifier for the CNCB team’s submitted system scores, which uses a variety of acoustic features and classifiers. An effective post-processing method is studied to improve the performance of constant Q cepstral coefficients (CQCC) and to form a base feature set together with other classical acoustic features. We also propose an ensemble classifier set that includes multiple Gaussian Mixture Model (GMM) based classifiers and two novel classifiers: GMM mean supervector-Gradient Boosting Decision Tree (GSV-GBDT) and GSV-Random Forest (GSV-RF). Experimental results show that the proposed ensemble learning system performs substantially better than the baseline. Under the common training condition of the challenge, the Equal Error Rate (EER) of the primary system on the development set is 1.5%, compared with 10.4% for the baseline. The EERs of the primary system (S02 on the ASVspoof 2017 board) on the evaluation set are 12.3% (trained on the train set only) and 10.8% (trained on train+dev), which are also much better than the baseline EERs of 30.6% and 24.8% given by the ASVspoof 2017 organizers, corresponding to relative improvements of 59.7% and 56.4%.
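The abstract reports results as Equal Error Rates and relative improvements. As a rough illustration of how such figures can be computed, here is a minimal sketch in Python. It is not the authors' or the organizers' scoring tool: the function names `equal_error_rate` and `relative_reduction` are hypothetical, the score convention (higher score means more likely genuine) is an assumption, and the EER is approximated by a simple threshold sweep without interpolation.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    Assumption: a higher score means the trial looks more genuine, and a
    trial is accepted when its score is >= the threshold.
    """
    best_gap = None
    eer = None
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        # False acceptance rate: spoofed trials accepted as genuine.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: genuine trials rejected as spoofed.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
    return eer


def relative_reduction(baseline_eer, system_eer):
    """Relative EER reduction of a system with respect to a baseline."""
    return (baseline_eer - system_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    genuine = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.6, 0.3, 0.2, 0.1]
    print("toy EER:", equal_error_rate(genuine, spoof))

    # Rounded evaluation-set EERs quoted in the abstract (percent).
    print("train only:", relative_reduction(30.6, 12.3))
    print("train+dev: ", relative_reduction(24.8, 10.8))
```

Applied to the rounded EERs quoted above, `relative_reduction` gives values close to the reported 59.7% and 56.4%; small differences in the last digit are to be expected because the published EERs are themselves rounded.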

doi: 10.21437/Interspeech.2017-1246

Cite as: Ji, Z., Li, Z.-Y., Li, P., An, M., Gao, S., Wu, D., Zhao, F. (2017) Ensemble Learning for Countermeasure of Audio Replay Spoofing Attack in ASVspoof2017. Proc. Interspeech 2017, 87-91, doi: 10.21437/Interspeech.2017-1246

  author={Zhe Ji and Zhi-Yi Li and Peng Li and Maobo An and Shengxiang Gao and Dan Wu and Faru Zhao},
  title={{Ensemble Learning for Countermeasure of Audio Replay Spoofing Attack in ASVspoof2017}},
  booktitle={Proc. Interspeech 2017},