Statistical Model Compression for Small-Footprint Natural Language Understanding

Grant P. Strimel, Kanthashree Mysore Sathyendra, Stanislav Peshterliev


In this paper, we investigate statistical model compression applied to natural language understanding (NLU) models. Small-footprint NLU models are important for enabling offline systems on hardware-restricted devices and for decreasing on-demand model loading latency in cloud-based systems. To compress NLU models, we present two main techniques: parameter quantization and perfect feature hashing. These techniques are complementary to existing model pruning strategies such as L1 regularization. We performed experiments on a large-scale NLU system. The results show that our approach achieves a 14-fold reduction in memory usage compared to the original models, with minimal impact on predictive performance.
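
The abstract names the two techniques without giving details, so the following is only a minimal illustrative sketch and not the authors' implementation. It assumes uniform 8-bit linear quantization of the weights and a brute-force seed search over CRC32 to obtain a collision-free (perfect) hash for a fixed feature set. All function names, the example weights, and the feature strings are hypothetical.

```python
import zlib
from typing import List, Sequence, Tuple


def quantize(weights: Sequence[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Map each float weight to an integer code in [0, 2**bits - 1] (assumed scheme)."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant weight vector, which has zero range.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes: Sequence[int], lo: float, scale: float) -> List[float]:
    """Recover approximate weights from their integer codes."""
    return [lo + c * scale for c in codes]


def slot(key: str, seed: int, table_size: int) -> int:
    """Seeded hash of a feature string into a table slot."""
    return zlib.crc32(f"{seed}:{key}".encode("utf-8")) % table_size


def find_perfect_seed(keys: Sequence[str], table_size: int, max_seed: int = 100_000) -> int:
    """Search for a seed under which no two keys share a slot."""
    for seed in range(max_seed):
        if len({slot(k, seed, table_size) for k in keys}) == len(keys):
            return seed
    raise ValueError("no collision-free seed found; try a larger table")


# Hypothetical model parameters and feature names, for illustration only.
weights = [-1.5, -0.25, 0.0, 0.4, 2.0]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

features = ["unigram:play", "unigram:music", "bigram:play_music", "unigram:stop"]
seed = find_perfect_seed(features, table_size=8)
```

In this sketch, each weight is stored as one byte instead of a 4- or 8-byte float, with a reconstruction error of at most about half a quantization step. Because the hash is collision-free for the known feature set, weights can in general be kept in an array indexed by slot without storing the feature strings themselves.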


DOI: 10.21437/Interspeech.2018-1333

Cite as: Strimel, G.P., Sathyendra, K.M., Peshterliev, S. (2018) Statistical Model Compression for Small-Footprint Natural Language Understanding. Proc. Interspeech 2018, 571-575, DOI: 10.21437/Interspeech.2018-1333.


@inproceedings{Strimel2018,
  author={Grant P. Strimel and Kanthashree Mysore Sathyendra and Stanislav Peshterliev},
  title={Statistical Model Compression for Small-Footprint Natural Language Understanding},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={571--575},
  doi={10.21437/Interspeech.2018-1333},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1333}
}