International Workshop on Spoken Language Translation (IWSLT) 2006
Keihanna Science City, Kyoto, Japan
This paper proposes the use of rules automatically extracted
from word-aligned training data to model word
reordering phenomena in phrase-based statistical machine
translation. Scores computed from matching rules are used
as additional feature functions in the rescoring stage of the
automatic translation process from various languages to English,
within a popular travel-domain task. Rules
are defined over either Part-of-Speech tags or words.
Part-of-Speech rules are extracted from and applied to Chinese,
while lexicalized rules are extracted from and applied to Chinese,
Japanese and Arabic.
Both Part-of-Speech and lexicalized rules yield an absolute BLEU improvement of 0.4-0.9 points, without affecting the NIST score, on the Chinese-to-English translation task. On other language pairs whose languages differ considerably in word order, lexicalized rules also yield significant improvements.
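The general idea of extracting reordering rules from word-aligned data and using their scores as an extra rescoring feature can be illustrated with a minimal Python sketch. This is not the paper's actual rule formalism or scoring function: the rule shape (adjacent tag pairs labelled "mono" or "swap"), the one-to-one alignment, the relative-frequency score, and all function names are assumptions made only for illustration.

```python
from collections import Counter

def extract_reordering_rules(source_tags, alignment):
    """Count, for each adjacent pair of source tags, whether the aligned
    target positions keep the order ("mono") or invert it ("swap").
    `alignment` maps source index -> target index (assumed one-to-one)."""
    counts = Counter()
    for i in range(len(source_tags) - 1):
        if i in alignment and i + 1 in alignment:
            order = "swap" if alignment[i] > alignment[i + 1] else "mono"
            counts[(source_tags[i], source_tags[i + 1], order)] += 1
    return counts

def rule_score(counts, left, right, order):
    """Relative frequency of `order` for the tag pair (assumed scoring)."""
    total = counts[(left, right, "swap")] + counts[(left, right, "mono")]
    return counts[(left, right, order)] / total if total else 0.0

def rescore(nbest, weight):
    """Pick the best hypothesis from (text, base_score, reorder_feature)
    triples by adding the weighted reordering feature to the base score."""
    return max(nbest, key=lambda h: h[1] + weight * h[2])

# Toy example: three source tags whose target order is fully inverted.
counts = extract_reordering_rules(["NN", "DEG", "NN"], {0: 2, 1: 1, 2: 0})
best = rescore([("hyp a", -1.0, 0.0), ("hyp b", -1.2, 1.0)], weight=0.5)
```

In a real system the feature weight would be tuned together with the other feature functions on held-out data, and the rules would cover longer spans than adjacent pairs.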
Bibliographic reference. Chen, Boxing / Cettolo, Mauro / Federico, Marcello (2006): "Reordering rules for phrase-based statistical machine translation", In IWSLT-2006, 182-189.