ISCA Archive Interspeech 2009

Back-off language model compression

Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat

With the availability of large amounts of training data relevant to speech recognition scenarios, scalability becomes a very productive way to improve language model performance. We present a technique that represents a back-off n-gram language model using arrays of integer values and thus renders it amenable to effective block compression. We propose a few such compression algorithms and evaluate the resulting language model along two dimensions: memory footprint, and speed reduction relative to the uncompressed model. We experimented with a model that uses a 32-bit word vocabulary (at most 4B words), with log-probabilities and back-off weights each quantized to 1 byte. The best compression algorithm achieves 2.6 bytes/n-gram at ~18X slower than uncompressed. For faster LM operation we found it feasible to represent the LM at about 4.0 bytes/n-gram, and ~3X slower than the uncompressed LM. The memory footprint of an LM containing one billion n-grams can thus be reduced to 3.4 Gbytes without impacting its speed too much.
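The abstract does not spell out the compression algorithms, so the following is only a minimal sketch of what block compression of a sorted integer array can look like, not the authors' method. It assumes a generic scheme: each block stores its first value uncompressed, and the remaining values as variable-length (base-128) encoded differences. The names `BLOCK`, `encode_varint`, `compress`, and `lookup` are hypothetical and chosen for illustration.

```python
BLOCK = 4  # hypothetical block size; larger blocks save space but slow lookups


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer in base-128, 7 payload bits per byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def compress(values, block=BLOCK):
    """Compress a non-decreasing list of non-negative integers.

    Returns (heads, offsets, data): the first value of each block, the byte
    offset where each block's deltas start, and the encoded delta stream.
    """
    heads, offsets, data = [], [], bytearray()
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        heads.append(chunk[0])
        offsets.append(len(data))
        prev = chunk[0]
        for v in chunk[1:]:
            data += encode_varint(v - prev)
            prev = v
    return heads, offsets, bytes(data)


def lookup(compressed, index, block=BLOCK):
    """Return the value at `index`, decoding only the block that contains it."""
    heads, offsets, data = compressed
    b, r = divmod(index, block)
    value, pos = heads[b], offsets[b]
    for _ in range(r):
        delta, shift = 0, 0
        while True:
            byte = data[pos]
            pos += 1
            delta |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        value += delta
    return value


values = [3, 7, 7, 300, 1000, 1001, 70000, 70001, 70005]
compressed = compress(values)
decoded = [lookup(compressed, i) for i in range(len(values))]
```

In this sketch a lookup decodes at most one block, which illustrates the kind of trade-off the abstract reports between memory footprint and lookup speed: a larger block size stores fewer uncompressed heads but requires more sequential decoding per access.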

doi: 10.21437/Interspeech.2009-113

Cite as: Harb, B., Chelba, C., Dean, J., Ghemawat, S. (2009) Back-off language model compression. Proc. Interspeech 2009, 352-355, doi: 10.21437/Interspeech.2009-113

  author={Boulos Harb and Ciprian Chelba and Jeffrey Dean and Sanjay Ghemawat},
  title={{Back-off language model compression}},
  booktitle={Proc. Interspeech 2009},