Residual Networks for Resisting Noise: Analysis of an Embeddings-based Spoofing Countermeasure

Bence Halpern, Finnian Kelly, Rob van Son, Anil Alexander


In this paper, we propose a spoofing countermeasure based on constant-Q transform (CQT) features with a ResNet embeddings extractor and a Gaussian Mixture Model (GMM) classifier. We present a detailed analysis of this approach using the Logical Access portion of the ASVspoof2019 evaluation database, and demonstrate that it provides information complementary to that of the baseline evaluation systems. We additionally evaluate the CQT-ResNet approach in the presence of various types of real noise, and show that it is more robust than the baseline systems. Finally, we explore some explainable audio approaches to offer the human listener insight into the types of information the network exploits when discriminating spoofed speech from real speech.
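
As a rough illustration of the final stage of such a pipeline, the sketch below scores a fixed-length embedding with two diagonal-covariance GMMs and returns a log-likelihood ratio. The abstract only states that a GMM classifier is used; the two-model (bona fide vs. spoof) log-likelihood-ratio setup, the function names `gmm_log_likelihood` and `llr_score`, and the toy parameters are all assumptions made for illustration, not details taken from the paper.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            # Log-density of a one-dimensional Gaussian, summed over dimensions.
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def llr_score(embedding, bonafide_gmm, spoof_gmm):
    """Positive scores favour bona fide speech, negative scores favour spoof."""
    return (gmm_log_likelihood(embedding, *bonafide_gmm)
            - gmm_log_likelihood(embedding, *spoof_gmm))


# Toy 2-D models (hypothetical values): (weights, means, variances).
bonafide_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
spoof_gmm = ([1.0], [[4.0, 4.0]], [[1.0, 1.0]])
```

In this sketch the embedding would come from the ResNet applied to CQT features; here it is just a list of floats, so `llr_score([0.5, 0.5], bonafide_gmm, spoof_gmm)` is positive and `llr_score([4.0, 4.0], bonafide_gmm, spoof_gmm)` is negative.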


DOI: 10.21437/Odyssey.2020-46

Cite as: Halpern, B., Kelly, F., van Son, R., Alexander, A. (2020) Residual Networks for Resisting Noise: Analysis of an Embeddings-based Spoofing Countermeasure. Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, 326-332, DOI: 10.21437/Odyssey.2020-46.


@inproceedings{Halpern2020,
  author={Bence Halpern and Finnian Kelly and Rob {van Son} and Anil Alexander},
  title={{Residual Networks for Resisting Noise: Analysis of an Embeddings-based Spoofing Countermeasure}},
  year=2020,
  booktitle={Proc. Odyssey 2020 The Speaker and Language Recognition Workshop},
  pages={326--332},
  doi={10.21437/Odyssey.2020-46},
  url={http://dx.doi.org/10.21437/Odyssey.2020-46}
}