International Workshop on Spoken Language Translation (IWSLT) 2010
Most current state-of-the-art statistical machine translation systems exhibit a mismatch between training and generation conditions. Word alignments are computed using the well-known IBM models for single-word-based translation. Phrases are then extracted using heuristics that are unrelated to the stochastic models used to find the word alignment. In recent years, several research groups have tried to overcome this mismatch, but with only limited success. Recently, the technique of forced alignment has been shown to improve translation quality for a phrase-based system by applying a more statistically sound approach to phrase extraction. In this work we investigate the first steps toward combining forced alignment with a hierarchical model. Experimental results on IWSLT and WMT data show improvements in translation quality of up to 0.7% BLEU and 1.0% TER.
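For illustration only, the kind of heuristic phrase extraction mentioned in the abstract can be sketched as follows. This is a minimal sketch of the commonly used alignment-consistency criterion, not the extraction procedure of the paper. It assumes a word alignment given as a set of (source index, target index) pairs and omits the usual extension over unaligned boundary words. The function name `extract_phrases` and the parameter `max_len` are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, int, int]  # (src_start, src_end, tgt_start, tgt_end), inclusive


def extract_phrases(src_len: int,
                    alignment: Set[Tuple[int, int]],
                    max_len: int = 4) -> Set[Span]:
    """Collect phrase-pair spans that are consistent with a word alignment.

    Hypothetical helper: a span pair is kept when no target word inside
    the target span is aligned to a source word outside the source span.
    """
    phrases: Set[Span] = set()
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            # Target positions linked to any word in the source span.
            tgt_points = [t for (s, t) in alignment if s1 <= s <= s2]
            if not tgt_points:
                continue  # fully unaligned source span: skipped in this sketch
            t1, t2 = min(tgt_points), max(tgt_points)
            if t2 - t1 + 1 > max_len:
                continue
            # Consistency check: reject if a target word in [t1, t2]
            # is aligned to a source word outside [s1, s2].
            violated = any(t1 <= t <= t2 and not (s1 <= s <= s2)
                           for (s, t) in alignment)
            if not violated:
                phrases.add((s1, s2, t1, t2))
    return phrases


# Toy example: two source words aligned monotonically to two target words.
monotone = extract_phrases(2, {(0, 0), (1, 1)})
# Toy example: both source words aligned to the same single target word.
many_to_one = extract_phrases(2, {(0, 0), (1, 0)})
```

In the many-to-one toy case only the full two-word source span yields a phrase pair, because either single source word alone would leave the shared target word aligned outside the span. Forced alignment, as described in the abstract, replaces this kind of heuristic with phrase segmentations obtained from the translation model itself.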
Bibliographic reference. Heger, Carmen / Wuebker, Joern / Vilar, David / Ney, Hermann (2010): "A combination of hierarchical systems with forced alignments from phrase-based systems", In IWSLT-2010, 291-297.