International Workshop on Spoken Language Translation (IWSLT) 2012

Hong Kong
December 6-7, 2012

MDI Adaptation for the Lazy: Avoiding Normalization in LM Adaptation for Lecture Translation

Nicholas Ruiz, Marcello Federico

FBK - Fondazione Bruno Kessler, Povo (TN), Italy

This paper provides a fast alternative to Minimum Discrimination Information (MDI)-based language model adaptation for statistical machine translation. Classic MDI requires a normalization term computed from full model probabilities (including back-off probabilities) for all n-grams. Rather than re-estimating an entire language model, our Lazy MDI approach leverages a smoothed unigram ratio between an adaptation text and the background language model to scale only the n-gram probabilities corresponding to translation options gathered by the SMT decoder. The effect of the unigram ratio is controlled by an additional feature weight in the log-linear discriminative model. We present results on the IWSLT 2012 TED talk translation task and show that Lazy MDI provides language model adaptation performance comparable to classic MDI.
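The core idea in the abstract, scaling background n-gram probabilities by a smoothed unigram ratio without MDI's normalization term, can be sketched as follows. This is an illustrative reading only: the function names, add-one smoothing, and the use of the predicted word's ratio as a log-linear feature are assumptions for exposition, not the paper's exact formulation.

```python
from collections import Counter
import math


def unigram_ratios(adapt_tokens, bg_tokens, smoothing=1.0):
    """Smoothed unigram ratio between an adaptation text and the
    background corpus. Add-one smoothing is a stand-in here; the
    paper's smoothing scheme may differ."""
    a, b = Counter(adapt_tokens), Counter(bg_tokens)
    vocab = set(a) | set(b)
    v = len(vocab)
    na, nb = len(adapt_tokens), len(bg_tokens)
    return {
        w: ((a[w] + smoothing) / (na + smoothing * v))
        / ((b[w] + smoothing) / (nb + smoothing * v))
        for w in vocab
    }


def lazy_mdi_score(ngram, bg_logprob, ratios, gamma=0.5):
    """Scale a background n-gram log-probability by the log unigram
    ratio of the predicted (final) word, weighted by a feature weight
    gamma, skipping MDI's per-history normalization term."""
    w = ngram[-1]
    return bg_logprob + gamma * math.log(ratios.get(w, 1.0))
```

Because only the n-grams seen during decoding are rescored, no pass over the full language model is needed; the weight `gamma` would be tuned alongside the other log-linear features.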

Full Paper   

Bibliographic reference.  Ruiz, Nicholas / Federico, Marcello (2012): "MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation", In IWSLT-2012, 244-251.