Robust Bayesian and Light Neural Networks for Voice Spoofing Detection

Radosław Białobrzeski, Michał Kośmider, Mateusz Matuszewski, Marcin Plata, Alexander Rakowski

We present a replay attack detection system consisting of two convolutional neural network models. The first model is a small Bayesian neural network, motivated by the hypothesis that Bayesian models are robust to overfitting. The second uses a larger architecture, LCNN, extended with several regularization techniques to improve generalization. Our experiments, which examined both the size of the networks and the use of the Bayesian approach, indicated that smaller networks are sufficient to achieve competitive results. To better estimate performance against unseen spoofing methods, the final models were selected using a novel Attack-Out Cross-Validation procedure, in which each model is tested on a subset of data containing not only previously unseen speakers but also unseen spoofing attacks. The system was submitted to the physical access (PA) condition of the ASVspoof 2019 challenge and achieved a t-DCF score of 0.0219 and an EER of 0.88% on the evaluation dataset, a tenfold improvement over the baseline.
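The Attack-Out Cross-Validation idea can be illustrated with a minimal sketch. The abstract states only that each test subset contains unseen speakers and unseen spoofing attacks; everything else here is an assumption made for illustration, not the authors' implementation: the `Utterance` record, the `attack_out_folds` function, the pairing of one speaker group with one attack group per fold, and the choice to discard utterances that are held out on only one of the two axes.

```python
from collections import namedtuple

# Hypothetical record type: `attack` is None for bona fide (genuine) speech.
Utterance = namedtuple("Utterance", ["utt_id", "speaker", "attack"])


def attack_out_folds(utterances, speaker_folds, attack_folds):
    """Yield (train, test) lists, one pair per fold.

    Assumption: fold i holds out speaker_folds[i] AND attack_folds[i], so the
    test set contains only held-out speakers, and its spoofed utterances use
    only held-out attacks.
    """
    for held_speakers, held_attacks in zip(speaker_folds, attack_folds):
        held_speakers = set(held_speakers)
        held_attacks = set(held_attacks)
        train, test = [], []
        for utt in utterances:
            speaker_out = utt.speaker in held_speakers
            attack_out = utt.attack in held_attacks  # always False for bona fide
            if speaker_out and (utt.attack is None or attack_out):
                # Unseen speaker, and either bona fide or an unseen attack.
                test.append(utt)
            elif not speaker_out and not attack_out:
                # Seen speaker, and either bona fide or a seen attack.
                train.append(utt)
            # Otherwise the utterance mixes a held-out axis with a seen one
            # and is dropped from this fold (an assumption of this sketch).
        yield train, test
```

Under these assumptions, a model selected by its average score across folds is never evaluated on a speaker or an attack type that it saw during training in that fold, which is the property the abstract attributes to the procedure.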

DOI: 10.21437/Interspeech.2019-2676

Cite as: Białobrzeski, R., Kośmider, M., Matuszewski, M., Plata, M., Rakowski, A. (2019) Robust Bayesian and Light Neural Networks for Voice Spoofing Detection. Proc. Interspeech 2019, 1028-1032, DOI: 10.21437/Interspeech.2019-2676.

  author={Radosław Białobrzeski and Michał Kośmider and Mateusz Matuszewski and Marcin Plata and Alexander Rakowski},
  title={{Robust Bayesian and Light Neural Networks for Voice Spoofing Detection}},
  booktitle={Proc. Interspeech 2019},