Structured Word Embedding for Low Memory Neural Network Language Model

Kaiyu Shi, Kai Yu


Neural network language models (NN LMs), such as the long short-term memory (LSTM) LM, have become increasingly popular due to their promising performance. However, the size of an uncompressed NN LM is still too large for embedded or portable devices. The word embedding matrix is the dominant part of the memory consumption of an NN LM, and directly compressing it usually leads to performance degradation. In this paper, a product quantization based structured embedding approach is proposed to significantly reduce the memory consumption of word embeddings without hurting LM performance. Each word embedding vector is cut into partial embedding vectors, which are then quantized separately. The word embedding matrix can then be represented by an index vector and a code-book tensor of the quantized partial embedding vectors. Experiments show that the proposed approach achieves a 10 to 20 times reduction in embedding parameters with negligible performance loss.
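The product quantization idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and hyper-parameters (`n_sub` sub-spaces, `n_codes` centroids per sub-space, a few k-means iterations) are illustrative assumptions.

```python
import numpy as np

def product_quantize(E, n_sub=4, n_codes=16, n_iter=10, seed=0):
    """Product-quantize an embedding matrix E of shape (V, d).

    Each row is cut into n_sub partial vectors; each sub-space is
    quantized separately with a simple k-means. Returns a (V, n_sub)
    index matrix and an (n_sub, n_codes, d // n_sub) code-book tensor.
    """
    rng = np.random.default_rng(seed)
    V, d = E.shape
    sub_d = d // n_sub
    codebook = np.empty((n_sub, n_codes, sub_d))
    indices = np.empty((V, n_sub), dtype=np.uint8)
    for m in range(n_sub):
        X = E[:, m * sub_d:(m + 1) * sub_d]
        # Initialize centroids from randomly chosen partial vectors.
        C = X[rng.choice(V, n_codes, replace=False)].copy()
        for _ in range(n_iter):
            # Assign each partial vector to its nearest centroid.
            dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            assign = dist.argmin(1)
            # Move each centroid to the mean of its assigned vectors.
            for k in range(n_codes):
                if (assign == k).any():
                    C[k] = X[assign == k].mean(0)
        # Final assignment against the updated centroids.
        dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        indices[:, m] = dist.argmin(1)
        codebook[m] = C
    return indices, codebook

def reconstruct(indices, codebook):
    """Rebuild the (approximate) embedding matrix from indices + code-book."""
    V, n_sub = indices.shape
    return np.concatenate(
        [codebook[m, indices[:, m]] for m in range(n_sub)], axis=1)
```

The memory saving comes from storing `V * n_sub` small integer indices plus a code-book of `n_sub * n_codes * (d / n_sub)` floats instead of `V * d` floats; since the code-book size is independent of the vocabulary size `V`, the reduction grows with the vocabulary.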


 DOI: 10.21437/Interspeech.2018-1057

Cite as: Shi, K., Yu, K. (2018) Structured Word Embedding for Low Memory Neural Network Language Model. Proc. Interspeech 2018, 1254-1258, DOI: 10.21437/Interspeech.2018-1057.


@inproceedings{Shi2018,
  author={Kaiyu Shi and Kai Yu},
  title={Structured Word Embedding for Low Memory Neural Network Language Model},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={1254--1258},
  doi={10.21437/Interspeech.2018-1057},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1057}
}