International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

Dynamic Distortion in a Discriminative Reordering Model for Statistical Machine Translation

Sirvan Yahyaei (1), Christof Monz (2)

(1) School of Electronic Engineering and Computer Science, Queen Mary University of London, UK
(2) ISLA, Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands

Most phrase-based statistical machine translation systems use a so-called distortion limit to keep the size of the search space manageable. In addition, a distance-based distortion penalty is used as a feature that encourages the decoder to translate monotonically unless other features, particularly the language models, provide sufficient support for a jump.
   To overcome the difficulty of setting optimal distortion parameters in phrase-based decoders, and to account for the fact that different sentences have different reordering requirements, a method to predict the necessary distortion limit for each sentence and each hypothesis expansion is proposed. A discriminative reordering model is built for that purpose and is also integrated into the decoder as an extra feature. Many lexicalised and syntactic features of the source sentence are employed to predict the next reordering move of the decoder. The model scores each reordering before the sentence is translated, so the optimum distortion limit can be estimated from these scores. Various experiments on Turkish-to-English and Arabic-to-English translation are performed, and substantial improvements are reported.
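
   As an illustration only, the two notions above can be sketched in a few lines of Python. The distance function follows the usual distance-based formulation, in which a monotone continuation costs zero. The function `estimate_limit`, its `threshold` parameter, and the shape of `jump_scores` are hypothetical names introduced here; they show one possible way of turning precomputed reordering scores into a per-sentence limit and are not the model described in this paper.

```python
from typing import Dict


def distortion_distance(prev_end: int, next_start: int) -> int:
    """Distance-based distortion between two consecutively translated phrases.

    prev_end   -- source index of the last word of the previous phrase
    next_start -- source index of the first word of the next phrase
    A monotone continuation (next_start == prev_end + 1) has distance 0.
    """
    return abs(next_start - prev_end - 1)


def is_allowed(prev_end: int, next_start: int, limit: int) -> bool:
    """A hypothesis expansion is pruned when its jump exceeds the distortion limit."""
    return distortion_distance(prev_end, next_start) <= limit


def estimate_limit(jump_scores: Dict[int, float], threshold: float, max_limit: int) -> int:
    """Hypothetical rule: pick the largest jump distance that the reordering
    model supports with a score of at least `threshold`, capped at `max_limit`.
    Returns 0 (monotone decoding) when no jump is supported.
    """
    supported = [d for d, score in jump_scores.items()
                 if score >= threshold and d <= max_limit]
    return max(supported, default=0)


# Assumed scores for one sentence: jump distance -> model score.
scores = {0: 0.7, 1: 0.2, 3: 0.4, 6: 0.05}
limit = estimate_limit(scores, threshold=0.3, max_limit=6)
```

   With these assumed scores the estimated limit is 3, so a jump from source position 2 to position 7 (distance 4) would be pruned, while a sentence with no well-supported jumps would be decoded monotonically.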

