Symposium on Machine Learning in Speech and Language Processing (MLSLP)

Bellevue, WA, USA
June 27, 2011

Performance Prediction and Shrinking Language Models

Stanley Chen, Stephen Chu, Ahmad Emami, Lidia Mangu, Bhuvana Ramabhadran, Ruhi Sarikaya, Abhinav Sethy

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

In this talk, we present a simple empirical law that vastly outperforms the Akaike and Bayesian Information Criteria at predicting the test set likelihood of an exponential language model. We discuss under what conditions this relationship holds; how it can be used to improve the design of language models; and whether these ideas can be applied to other types of statistical models. Specifically, we show how this relationship led to the design of "Model M", a class-based language model that outperforms all previous models of this type.
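For readers unfamiliar with the baselines the talk compares against, the sketch below shows the standard textbook definitions of AIC and BIC, which penalize a model's log-likelihood by its parameter count. The model names, likelihoods, and parameter counts are purely illustrative and are not taken from the talk; the empirical law itself is not stated in this abstract and is not reproduced here.

```python
import math

def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood

def bic(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return num_params * math.log(num_samples) - 2 * log_likelihood

# Toy comparison of two hypothetical exponential LMs trained on the same data.
# All numbers are made up for illustration.
n = 100_000  # training tokens
models = {
    "small": {"logL": -250_000.0, "k": 50_000},
    "large": {"logL": -240_000.0, "k": 500_000},
}

for name, m in models.items():
    print(f"{name}: AIC={aic(m['logL'], m['k']):.0f}  "
          f"BIC={bic(m['logL'], m['k'], n):.0f}")
```

Because BIC's per-parameter penalty grows with the sample size (ln n versus AIC's constant 2), the two criteria can rank the same pair of models differently; the talk's point is that neither penalty predicts held-out likelihood well for exponential language models.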


Bibliographic reference.  Chen, Stanley / Chu, Stephen / Emami, Ahmad / Mangu, Lidia / Ramabhadran, Bhuvana / Sarikaya, Ruhi / Sethy, Abhinav (2011): "Performance prediction and shrinking language models", In MLSLP-2011.