International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

A Combination of Hierarchical Systems with Forced Alignments from Phrase-Based Systems

Carmen Heger, Joern Wuebker, David Vilar, Hermann Ney

Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

Currently, most state-of-the-art statistical machine translation systems exhibit a mismatch between training and generation conditions. Word alignments are computed using the well-known IBM models for single-word-based translation. Afterwards, phrases are extracted using heuristics that are unrelated to the stochastic models applied for finding the word alignment. In recent years, several research groups have tried to overcome this mismatch, but only with limited success. Recently, the technique of forced alignments has been shown to improve translation quality for a phrase-based system by applying a more statistically sound approach to phrase extraction. In this work we investigate the first steps towards combining forced alignment with a hierarchical model. Experimental results on IWSLT and WMT data show improvements in translation quality of up to 0.7% BLEU and 1.0% TER.


Bibliographic reference. Heger, Carmen / Wuebker, Joern / Vilar, David / Ney, Hermann (2010): "A combination of hierarchical systems with forced alignments from phrase-based systems", in IWSLT-2010, 291-297.