Compression of Acoustic Event Detection Models with Quantized Distillation

Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang


Acoustic Event Detection (AED), which aims to detect categories of events from audio signals, has found application in many intelligent systems. Recently, deep neural networks have significantly advanced this field and substantially reduced detection errors. However, how to execute deep models efficiently in AED has received much less attention. Meanwhile, state-of-the-art AED models are based on large deep models, which are computationally demanding and challenging to deploy on devices with constrained computational resources. In this paper, we present a simple yet effective compression approach that jointly leverages knowledge distillation and quantization to compress a larger network (the teacher model) into a compact network (the student model). Experimental results show that the proposed technique not only lowers the error rate of the original compact network by 15% through distillation but also further reduces its model size substantially (to 2% of the teacher and 12% of the full-precision student) through quantization.
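
The two ingredients named in the abstract, knowledge distillation and quantization, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the temperature and weighting parameters (`temperature`, `alpha`), the min/max uniform quantizer, and the 4-bit default are all assumptions chosen for illustration, since the abstract does not specify the loss form or the bit width.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a hard-label term and a soft-target (teacher) term.

    `temperature` and `alpha` are illustrative hyperparameters, not values
    taken from the paper.
    """
    n = student_logits.shape[0]
    # Hard-label term: standard cross-entropy at temperature 1.
    p_student = softmax(student_logits)
    hard = -np.log(p_student[np.arange(n), labels] + 1e-12).mean()
    # Soft-target term: cross-entropy against the teacher's softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student_t = np.log(softmax(student_logits, temperature) + 1e-12)
    soft = -(p_teacher * log_p_student_t).sum(axis=-1).mean()
    return alpha * hard + (1.0 - alpha) * soft


def quantize_uniform(weights, num_bits=4):
    """Map weights onto 2**num_bits evenly spaced levels between min and max.

    Returns the integer codes and the de-quantized weights.
    Assumes num_bits <= 8 so the codes fit in uint8.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    dequantized = codes * scale + w_min
    return codes, dequantized
```

In a setup like this, the student would be trained against `distillation_loss` while its weights are passed through `quantize_uniform`, so that only the low-bit codes (plus the scale and offset) need to be stored. As a rough sanity check on the reported sizes, storing 4-bit codes instead of 32-bit floats would give 4/32 = 12.5% of the full-precision size, which is close to the 12% figure, though the actual bit width used in the paper is not stated in the abstract.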


DOI: 10.21437/Interspeech.2019-1747

Cite as: Shi, B., Sun, M., Kao, C., Rozgic, V., Matsoukas, S., Wang, C. (2019) Compression of Acoustic Event Detection Models with Quantized Distillation. Proc. Interspeech 2019, 3639-3643, DOI: 10.21437/Interspeech.2019-1747.


@inproceedings{Shi2019,
  author={Bowen Shi and Ming Sun and Chieh-Chi Kao and Viktor Rozgic and Spyros Matsoukas and Chao Wang},
  title={{Compression of Acoustic Event Detection Models with Quantized Distillation}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={3639--3643},
  doi={10.21437/Interspeech.2019-1747},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1747}
}