International Workshop on Spoken Language Translation (IWSLT) 2005

Pittsburgh, PA, USA
October 24-25, 2005

The CMU Statistical Machine Translation System for IWSLT 2005

Sanjika Hewavitharana, Bing Zhao, Almut Silja Hildebrand, Matthias Eck, Chiori Hori, Stephan Vogel, Alex Waibel

Interactive Systems Laboratories, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA

In this paper we describe the CMU statistical machine translation system used in the IWSLT 2005 evaluation campaign. This system is based on phrase-to-phrase translations extracted from a bilingual corpus. We experimented with two different phrase extraction methods: PESA on-the-fly phrase extraction and an alignment-free extraction method. The translation model, language model and other features were combined in a log-linear model during decoding. We present our experiments on model adaptation for new data in a different domain, as well as on combining different translation hypotheses to obtain better translations.
   We participated in the supplied data track for manual transcriptions in the translation directions Arabic-English, Chinese-English, Japanese-English and Korean-English. For the Chinese-English direction we also worked on ASR output of the supplied data, and with additional data in the unrestricted and C-STAR tracks.
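
As an illustration of the log-linear combination mentioned in the abstract, the following is a minimal sketch, not the system's actual decoder. It assumes each hypothesis carries positive feature values (here a translation model score and a language model score) and that the weights are already given; the names `log_linear_score`, `best_hypothesis`, `tm`, `lm`, and all numbers are hypothetical and chosen only for this example.

```python
import math


def log_linear_score(features, weights):
    """Return sum_i lambda_i * log h_i for one hypothesis.

    `features` maps feature names to positive values h_i;
    `weights` maps the same names to scaling factors lambda_i.
    """
    return sum(weights[name] * math.log(value) for name, value in features.items())


def best_hypothesis(hypotheses, weights):
    """Pick the hypothesis with the highest log-linear score."""
    return max(hypotheses, key=lambda h: log_linear_score(h["features"], weights))


# Hypothetical weights and feature values, for illustration only.
weights = {"tm": 1.0, "lm": 0.5}
hypotheses = [
    {"text": "where is the station", "features": {"tm": 0.20, "lm": 0.10}},
    {"text": "where the station is", "features": {"tm": 0.25, "lm": 0.02}},
]

best = best_hypothesis(hypotheses, weights)
print(best["text"])
```

In this toy setting the second hypothesis has the higher translation model value, but its much lower language model value outweighs that once both features are combined, so the first hypothesis is selected.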

Bibliographic reference.  Hewavitharana, Sanjika / Zhao, Bing / Hildebrand, Almut Silja / Eck, Matthias / Hori, Chiori / Vogel, Stephan / Waibel, Alex (2005): "The CMU statistical machine translation system for IWSLT 2005", In IWSLT-2005, 53-60.