This paper provides a fast alternative to Minimum Discrimination Information-based language model adaptation for statistical machine translation. Classic MDI adaptation requires computing a normalization term over full model probabilities (including back-off probabilities) for all n-grams; we avoid this step entirely. Rather than re-estimating an entire language model, our Lazy MDI approach leverages a smoothed unigram ratio between an adaptation text and the background language model to scale only the n-gram probabilities corresponding to the translation options gathered by the SMT decoder. The contribution of the unigram ratio is controlled by an additional feature weight in the log-linear discriminative model. We present results on the IWSLT 2012 TED talk translation task and show that Lazy MDI achieves language model adaptation performance comparable to classic MDI.
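To make the scaling idea concrete, the following is a minimal Python sketch under our own assumptions: add-alpha smoothing for the unigram ratio and per-word scaling of background LM log-probabilities. Function names, the smoothing scheme, and the default-to-1.0 fallback for unseen words are illustrative assumptions, not details taken from the paper.

```python
import math

def smoothed_unigram_ratio(adapt_counts, bg_counts, alpha=1.0):
    # Hypothetical add-alpha smoothing; the paper's exact smoothing
    # scheme for the unigram ratio may differ.
    vocab = set(adapt_counts) | set(bg_counts)
    adapt_total = sum(adapt_counts.values())
    bg_total = sum(bg_counts.values())
    ratio = {}
    for w in vocab:
        p_adapt = (adapt_counts.get(w, 0) + alpha) / (adapt_total + alpha * len(vocab))
        p_bg = (bg_counts.get(w, 0) + alpha) / (bg_total + alpha * len(vocab))
        ratio[w] = p_adapt / p_bg
    return ratio

def lazy_mdi_logprob(bg_logprob, word, ratio, gamma=1.0):
    # Unnormalized adapted score: log P_B(w | h) + gamma * log r(w).
    # gamma stands in for the extra log-linear feature weight; skipping
    # renormalization over the vocabulary is what makes the method "lazy".
    return bg_logprob + gamma * math.log(ratio.get(word, 1.0))

# Toy usage: words frequent in the adaptation text (relative to the
# background) get boosted, others get penalized.
adapt = {"lecture": 5, "translation": 3, "the": 20}
bg = {"lecture": 1, "translation": 2, "the": 100}
r = smoothed_unigram_ratio(adapt, bg)
print(lazy_mdi_logprob(-2.3, "lecture", r, gamma=0.5))
```

Because only the n-grams proposed as decoder translation options are rescored, this per-word shift replaces re-estimating the full adapted language model.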
Cite as: Ruiz, N., Federico, M. (2012) MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation. Proc. International Workshop on Spoken Language Translation (IWSLT 2012), 244-251
@inproceedings{ruiz12b_iwslt,
  author={Nicholas Ruiz and Marcello Federico},
  title={{MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation}},
  year={2012},
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2012)},
  pages={244--251}
}