International Workshop on Spoken Language Translation (IWSLT) 2005
Pittsburgh, PA, USA
In this paper we describe the CMU statistical machine
translation system used in the IWSLT 2005 evaluation
campaign. This system is based on phrase-to-phrase
translations extracted from a bilingual corpus. We experimented
with two different phrase extraction methods:
PESA on-the-fly phrase extraction and an alignment-free extraction
method. The translation model, language model
and other features were combined in a log-linear model
during decoding. We present our experiments on model
adaptation for new data in a different domain, as well as
combining different translation hypotheses to obtain better translations.
We participated in the supplied data track for manual transcriptions in the following translation directions: Arabic-English, Chinese-English, Japanese-English and Korean-English. For the Chinese-English direction we also worked on ASR output of the supplied data, and with additional data in the unrestricted and C-STAR tracks.
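The log-linear combination of models mentioned in the abstract can be illustrated with a minimal sketch. It is not the CMU decoder: the feature names (`tm`, `lm`, `length`), the weights, and the probabilities below are hypothetical values chosen only to show how a weighted sum of feature scores ranks competing hypotheses.

```python
import math


def log_linear_score(features, weights):
    """Weighted sum of feature values: sum_i weights[i] * features[i]."""
    return sum(weights[name] * value for name, value in features.items())


def best_hypothesis(hypotheses, weights):
    """Return the hypothesis with the highest log-linear score."""
    return max(hypotheses, key=lambda h: log_linear_score(h["features"], weights))


# Hypothetical feature weights (in practice these would be tuned on held-out data).
weights = {"tm": 1.0, "lm": 0.5, "length": 0.1}

# Hypothetical translation hypotheses with log-probability features
# from a translation model (tm) and a language model (lm), plus a length feature.
hypotheses = [
    {"text": "hypothesis A",
     "features": {"tm": math.log(0.2), "lm": math.log(0.01), "length": 3}},
    {"text": "hypothesis B",
     "features": {"tm": math.log(0.1), "lm": math.log(0.2), "length": 3}},
]

best = best_hypothesis(hypotheses, weights)
print(best["text"])
```

In this toy setting, hypothesis A has the better translation-model score, but the language-model feature outweighs that advantage, so hypothesis B is selected. Changing the weights changes the ranking, which is why the relative weighting of the features matters in such a model.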
Bibliographic reference. Hewavitharana, Sanjika / Zhao, Bing / Hildebrand, Almut Silja / Eck, Matthias / Hori, Chiori / Vogel, Stephan / Waibel, Alex (2005): "The CMU statistical machine translation system for IWSLT 2005", In IWSLT-2005, 53-60.