15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Backoff Inspired Features for Maximum Entropy Language Models

Fadi Biadsy, Keith Hall, Pedro J. Moreno, Brian Roark

Google, USA

Maximum Entropy (MaxEnt) language models are linear models that are typically regularized via well-known L1 or L2 terms in the likelihood objective, hence avoiding the need for the kinds of backoff or mixture weights used in smoothed n-gram language models using Katz backoff and similar techniques. Even though a backoff cost is not required to regularize the model, we investigate the use of backoff features in MaxEnt models, as well as some backoff-inspired variants. These features substantially improve model quality, yielding perplexity and word-error-rate reductions, even in very large scale training scenarios of tens or hundreds of billions of words and hundreds of millions of features.
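To make the idea concrete, here is a minimal sketch of what backoff-inspired feature extraction for a MaxEnt language model might look like. The function name, feature encoding, and the rule "fire a backoff indicator on the history when the full n-gram was unseen in training" are illustrative assumptions, not the paper's exact feature templates:

```python
def ngram_features(history, word, seen_ngrams, max_order=3):
    """Emit suffix n-gram features for a MaxEnt LM, plus backoff-inspired
    indicator features (hypothetical encoding, for illustration only).

    history:     list of preceding words
    word:        the predicted word
    seen_ngrams: set of word tuples observed in training
    """
    feats = []
    for k in range(max_order):            # k = number of history words used
        hist = tuple(history[-k:]) if k else ()
        ngram = hist + (word,)
        if ngram in seen_ngrams:
            # standard n-gram indicator feature
            feats.append(("ngram", hist, word))
        elif k > 0:
            # backoff-inspired feature: fires on the history alone
            # whenever the full n-gram was not observed in training
            feats.append(("backoff", hist))
    return feats
```

A linear MaxEnt model then scores a (history, word) pair by summing the learned weights of the features this function emits; the backoff features let the model learn, per history, how costly it is to fall back to lower-order evidence.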

Full Paper

Bibliographic reference.  Biadsy, Fadi / Hall, Keith / Moreno, Pedro J. / Roark, Brian (2014): "Backoff inspired features for maximum entropy language models", In INTERSPEECH-2014, 2645-2649.