This work introduces a new maximum entropy language model whose parameters decompose into a low-rank component that learns regularities in the training data and a sparse component that learns exceptions (e.g., keywords). The low-rank solution corresponds to a continuous-space language model. The model generalizes the standard l1-regularized maximum entropy model and admits an efficient accelerated first-order training algorithm. In conversational speech language modeling experiments, we see perplexity reductions of 2-5%.
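As a rough illustration of the decomposition idea (not the paper's exact training algorithm), the sketch below applies the two proximal operators that typically drive sparse-plus-low-rank splits in first-order methods: singular value thresholding for a nuclear-norm (low-rank) penalty and entrywise soft-thresholding for an l1 (sparse) penalty. The parameter matrix `W`, the threshold values, and the toy data are all hypothetical.

```python
import numpy as np

def prox_l1(S, tau):
    """Soft-thresholding: proximal operator of tau * ||S||_1 (sparse part)."""
    return np.sign(S) * np.maximum(np.abs(S) - tau, 0.0)

def prox_nuclear(L, tau):
    """Singular value thresholding: prox of tau * nuclear norm (low-rank part)."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Toy parameter matrix: a rank-1 "regularity" plus a few large "exceptions",
# loosely analogous to keywords that deviate from the smooth model.
rng = np.random.default_rng(0)
u, v = rng.normal(size=(8, 1)), rng.normal(size=(1, 10))
W = u @ v                      # low-rank structure
W[2, 3] += 5.0                 # sparse exceptions
W[6, 7] -= 5.0

L = prox_nuclear(W, tau=1.0)   # shrinks small singular values
S = prox_l1(W - L, tau=0.5)    # keeps only large residual entries
print(np.linalg.matrix_rank(L, tol=1e-6), int(np.count_nonzero(S)))
```

In an accelerated proximal-gradient scheme, operators like these would be applied after each gradient step on the maximum entropy training objective; here they are shown in isolation on a static matrix.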
Index Terms: language modeling, maximum entropy, sparse plus low rank decomposition
Bibliographic reference. Hutchinson, Brian / Ostendorf, Mari / Fazel, Maryam (2012): "A sparse plus low rank maximum entropy language model", In INTERSPEECH-2012, 1676-1679.