ISCA Archive IWSLT 2012

MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation

Nicholas Ruiz, Marcello Federico

This paper presents a fast alternative to Minimum Discrimination Information-based language model adaptation for statistical machine translation. Our approach avoids computing a normalization term that requires full model probabilities (including back-off probabilities) for all n-grams. Rather than re-estimating an entire language model, our Lazy MDI approach leverages a smoothed unigram ratio between an adaptation text and the background language model to scale only the n-gram probabilities corresponding to translation options gathered by the SMT decoder. The influence of the unigram ratio is controlled by an additional feature weight in the log-linear discriminative model. We present results on the IWSLT 2012 TED talk translation task and show that Lazy MDI provides language model adaptation performance comparable to classic MDI.
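As a rough illustration of the idea described in the abstract, the Python sketch below computes additively smoothed unigram ratios between an adaptation text and the background corpus and uses them to rescale background n-gram log-probabilities without renormalization. The function names, the additive smoothing scheme, and the per-n-gram combination of ratios are illustrative assumptions, not the paper's actual implementation.

```python
import math
from collections import Counter


def unigram_ratios(adapt_tokens, background_tokens, smoothing=1.0):
    """Smoothed unigram ratio alpha(w) = P_adapt(w) / P_background(w).

    Additive (add-k) smoothing is a placeholder assumption; the paper's
    exact smoothing scheme may differ.
    """
    adapt_counts = Counter(adapt_tokens)
    bg_counts = Counter(background_tokens)
    vocab = set(adapt_counts) | set(bg_counts)
    adapt_total = sum(adapt_counts.values()) + smoothing * len(vocab)
    bg_total = sum(bg_counts.values()) + smoothing * len(vocab)
    ratios = {}
    for w in vocab:
        p_adapt = (adapt_counts[w] + smoothing) / adapt_total
        p_bg = (bg_counts[w] + smoothing) / bg_total
        ratios[w] = p_adapt / p_bg
    return ratios


def lazy_mdi_score(ngram_logprob, words, ratios, feature_weight=1.0):
    """Scale a background n-gram log-probability by the unigram ratios of
    its words, skipping renormalization of the full model.  The
    feature_weight stands in for the extra weight tuned in the decoder's
    log-linear model."""
    adjustment = sum(math.log(ratios.get(w, 1.0)) for w in words)
    return ngram_logprob + feature_weight * adjustment
```

In this reading, tuning the feature weight alongside the other log-linear features is what controls how strongly the adaptation text pulls the scores away from the background language model.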


Cite as: Ruiz, N., Federico, M. (2012) MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation. Proc. International Workshop on Spoken Language Translation (IWSLT 2012), 244-251

@inproceedings{ruiz12b_iwslt,
  author={Nicholas Ruiz and Marcello Federico},
  title={{MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation}},
  year={2012},
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2012)},
  pages={244--251}
}