International Workshop on Spoken Language Translation (IWSLT) 2012

Hong Kong
December 6-7, 2012

MDI Adaptation for the Lazy: Avoiding Normalization in LM Adaptation for Lecture Translation

Nicholas Ruiz, Marcello Federico

FBK - Fondazione Bruno Kessler, Povo (TN), Italy

This paper presents a fast alternative to Minimum Discrimination Information (MDI)-based language model adaptation for statistical machine translation (SMT). Classic MDI adaptation requires a normalization term whose computation involves full model probabilities (including back-off probabilities) for all n-grams; our approach avoids computing this term. Rather than re-estimating an entire language model, our Lazy MDI approach uses a smoothed unigram ratio between an adaptation text and the background language model to scale only the n-gram probabilities corresponding to translation options gathered by the SMT decoder. The effect of the unigram ratio is controlled by an additional feature weight in the log-linear discriminative model. We present results on the IWSLT 2012 TED talk translation task and show that Lazy MDI provides language model adaptation performance comparable to that of classic MDI.
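The idea of scaling background probabilities by a unigram ratio without renormalizing can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the additive smoothing, the function names (`smoothed_unigram_ratio`, `lazy_score`), and the parameters `delta` and `weight` are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def smoothed_unigram_ratio(adapt_tokens, background_unigram, vocab_size, delta=1.0):
    """Return a function mapping a word to P_adapt(word) / P_background(word).

    Assumption: the adaptation unigram distribution is smoothed additively
    (add-delta); this is a stand-in for whatever smoothing is actually used.
    Assumption: background_unigram holds a probability for every queried word.
    """
    counts = Counter(adapt_tokens)
    total = len(adapt_tokens)

    def ratio(word):
        # Additive smoothing keeps the ratio finite for unseen words.
        p_adapt = (counts[word] + delta) / (total + delta * vocab_size)
        return p_adapt / background_unigram[word]

    return ratio


def lazy_score(log_p_background, word, ratio, weight):
    """Unnormalized adapted log-score for one n-gram ending in `word`.

    The background log-probability is left untouched; the log unigram ratio
    enters as a separate term whose influence is set by `weight`, playing the
    role of the extra log-linear feature weight. No normalization term is
    computed, so the result is a score, not a log-probability.
    """
    return log_p_background + weight * math.log(ratio(word))
```

Only n-grams that the decoder actually queries need to be rescored this way, which is what makes the approach cheap. A weight of zero recovers the background model score exactly.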

