10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Back-Off Language Model Compression

Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat

Google Inc., USA

With the availability of large amounts of training data relevant to speech recognition scenarios, scalability becomes a very productive way to improve language model performance. We present a technique that represents a back-off n-gram language model using arrays of integer values and thus renders it amenable to effective block compression. We propose a few such compression algorithms and evaluate the resulting language model along two dimensions: memory footprint, and speed reduction relative to the uncompressed one. We experimented with a model that uses a 32-bit word vocabulary (at most 4B words) and logprobabilities/ back-off-weights quantized to 1 byte, respectively. The best compression algorithm achieves 2.6 bytes/n-gram at ~18X slower than uncompressed. For faster LM operation we found it feasible to represent the LM at .4.0 bytes/n-gram, and ~3X slower than the uncompressed LM. The memory footprint of a LM containing one billion n-grams can thus be reduced to 3.4 Gbytes without impacting its speed too much.

Full Paper

Bibliographic reference.  Harb, Boulos / Chelba, Ciprian / Dean, Jeffrey / Ghemawat, Sanjay (2009): "Back-off language model compression", In INTERSPEECH-2009, 352-355.