ISCA Archive IWSLT 2005

The CMU statistical machine translation system for IWSLT 2005

Sanjika Hewavitharana, Bing Zhao, Almut Silja Hildebrand, Matthias Eck, Chiori Hori, Stephan Vogel, Alex Waibel

In this paper we describe the CMU statistical machine translation system used in the IWSLT 2005 evaluation campaign. The system is based on phrase-to-phrase translations extracted from a bilingual corpus. We experimented with two phrase extraction methods: PESA on-the-fly phrase extraction and an alignment-free extraction method. The translation model, language model and other features were combined in a log-linear model during decoding. We present our experiments on model adaptation for new data in a different domain, as well as on combining different translation hypotheses to obtain better translations. We participated in the supplied data track for manual transcriptions in the translation directions Arabic-English, Chinese-English, Japanese-English and Korean-English. For the Chinese-English direction we also worked on ASR output of the supplied data, and with additional data in the unrestricted and C-STAR tracks.
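
As an illustration of the log-linear combination mentioned in the abstract, the following is a minimal sketch, not the system's actual decoder. It assumes each hypothesis carries a probability-like value per feature; the feature names, values, and weights are hypothetical and chosen only for the example.

```python
import math

def loglinear_score(features, weights):
    """Return the weighted sum of log feature values for one hypothesis.

    `features` maps a feature name to a positive probability-like value;
    `weights` maps the same names to scaling factors (hypothetical here).
    """
    return sum(weights[name] * math.log(value) for name, value in features.items())

# Hypothetical feature values for two candidate translations.
hypotheses = {
    "hyp_a": {"translation_model": 0.20, "language_model": 0.05},
    "hyp_b": {"translation_model": 0.10, "language_model": 0.30},
}

# Hypothetical feature weights; in practice these would be tuned on held-out data.
weights = {"translation_model": 1.0, "language_model": 0.5}

# Pick the hypothesis with the highest combined score.
best = max(hypotheses, key=lambda h: loglinear_score(hypotheses[h], weights))
```

Working in log space turns the product of weighted feature values into a sum, which keeps the scores numerically stable and makes it straightforward to add further features.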


Cite as: Hewavitharana, S., Zhao, B., Hildebrand, A.S., Eck, M., Hori, C., Vogel, S., Waibel, A. (2005) The CMU statistical machine translation system for IWSLT 2005. Proc. International Workshop on Spoken Language Translation (IWSLT 2005), 53-60

@inproceedings{hewavitharana05_iwslt,
  author={Sanjika Hewavitharana and Bing Zhao and Almut Silja Hildebrand and Matthias Eck and Chiori Hori and Stephan Vogel and Alex Waibel},
  title={{The CMU statistical machine translation system for IWSLT 2005}},
  year=2005,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2005)},
  pages={53--60}
}