Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization

Shreya Khare, Rahul Aralikatte, Senthil Mani


Fooling deep neural networks with adversarial inputs has exposed a significant vulnerability in current state-of-the-art systems across multiple domains. Both black-box and white-box approaches have been used either to replicate the model itself or to craft examples that cause the model to fail. In this work, we propose a framework that uses multi-objective evolutionary optimization to perform both targeted and un-targeted black-box attacks on Automatic Speech Recognition (ASR) systems. We apply this framework to two ASR systems, Deepspeech and Kaldi-ASR, and it increases the Word Error Rates (WER) of these systems by up to 980%, indicating the potency of our approach. During un-targeted and targeted attacks, the adversarial samples maintain a high acoustic similarity of 0.98 and 0.97, respectively, with the original audio.
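
For readers unfamiliar with the metric, the sketch below shows one common way to compute WER (word-level edit distance divided by the number of reference words) and a relative increase in WER expressed as a percentage. It is an illustration only: the function names `word_error_rate` and `relative_increase_percent` and the example sentences are assumptions made here, not taken from the paper, and the abstract does not state exactly how the reported percentage was derived.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_increase_percent(wer_before: float, wer_after: float) -> float:
    """Relative change in WER, in percent (assumed definition, for illustration)."""
    if wer_before <= 0:
        raise ValueError("baseline WER must be positive to compute a relative increase")
    return (wer_after - wer_before) / wer_before * 100.0


if __name__ == "__main__":
    # Made-up transcripts, not data from the paper.
    reference = "the cat sat on the mat"
    clean_output = "the cat sat on the mat"
    attacked_output = "the bat sat on mat"
    print(word_error_rate(reference, clean_output))
    print(word_error_rate(reference, attacked_output))
```

Under this definition, an attack that raises a system's WER from 0.1 to 0.3 would count as a 200% increase; an un-targeted attack aims to push this number up while keeping the perturbed audio acoustically close to the original.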


 DOI: 10.21437/Interspeech.2019-2420

Cite as: Khare, S., Aralikatte, R., Mani, S. (2019) Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization. Proc. Interspeech 2019, 3208-3212, DOI: 10.21437/Interspeech.2019-2420.


@inproceedings{Khare2019,
  author={Shreya Khare and Rahul Aralikatte and Senthil Mani},
  title={{Adversarial Black-Box Attacks on Automatic Speech Recognition Systems Using Multi-Objective Evolutionary Optimization}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3208--3212},
  doi={10.21437/Interspeech.2019-2420},
  url={http://dx.doi.org/10.21437/Interspeech.2019-2420}
}