International Workshop on Spoken Language Translation (IWSLT) 2007

Trento, Italy
October 15-16, 2007

The CMU-UKA Statistical Machine Translation Systems for IWSLT 2007

Ian Lane (1), Andreas Zollmann (1), Thuy Linh Nguyen (1), Nguyen Bach (1), Ashish Venugopal (1), Stephan Vogel (1), Kay Rottmann (2), Ying Zhang (1), Alex Waibel (1,2)

InterACT Research Laboratories:
(1) Carnegie Mellon University, Pittsburgh, USA / (2) University of Karlsruhe, Karlsruhe, Germany

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language pairs: Japanese-to-English, Chinese-to-English and Arabic-to-English. All systems were based on a common phrase-based statistical machine translation (SMT) framework, but for each language pair a specific research problem was tackled. For Japanese-to-English we focused on two problems: first, punctuation recovery, and second, how to incorporate topic knowledge into the translation framework. Our Chinese-to-English submission focused on syntax-augmented SMT, and for the Arabic-to-English task we focused on incorporating morphological decomposition into the SMT framework. This research strategy enabled us to evaluate a wide variety of approaches which proved effective for the language pairs they were evaluated on.

Bibliographic reference. Lane, Ian / Zollmann, Andreas / Nguyen, Thuy Linh / Bach, Nguyen / Venugopal, Ashish / Vogel, Stephan / Rottmann, Kay / Zhang, Ying / Waibel, Alex (2007): "The CMU-UKA statistical machine translation systems for IWSLT 2007", In IWSLT-2007, 61-68.