International Workshop on Spoken Language Translation (IWSLT) 2011

San Francisco, CA, USA
December 8-9, 2011

The RWTH Aachen Machine Translation System for IWSLT 2011

Joern Wuebker, Matthias Huck, Saab Mansour, Markus Freitag, Minwei Feng, Stephan Peitz, Christoph Schmidt, Hermann Ney

Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen, Germany

This paper presents the statistical machine translation (SMT) systems that RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT (English-French) tracks. Both hierarchical and phrase-based SMT decoders are applied. We evaluate a number of techniques, including domain adaptation via monolingual and bilingual data selection, phrase training, different lexical smoothing methods, additional reordering models for the hierarchical system, various Arabic and Chinese segmentation methods, punctuation prediction for speech recognition output, and system combination. Applying these methods yields considerable improvements over the respective baseline systems.

Bibliographic reference. Wuebker, Joern / Huck, Matthias / Mansour, Saab / Freitag, Markus / Feng, Minwei / Peitz, Stephan / Schmidt, Christoph / Ney, Hermann (2011): "The RWTH Aachen machine translation system for IWSLT 2011", in IWSLT-2011, 106-113.