International Workshop on Spoken Language Translation (IWSLT) 2004

Keihanna Science City, Kyoto, Japan
September 30-October 1, 2004

EBMT, SMT, Hybrid and More: ATR Spoken Language Translation System

Eiichiro Sumita, Yasuhiro Akiba, Takao Doi, Andrew Finch, Kenji Imamura, Hideo Okuma, Michael Paul, Mitsuo Shimohata, Taro Watanabe

ATR Spoken Language Translation Research Laboratories, Keihanna Science City, Kyoto, Japan

This paper introduces ATR's Corpus-Centered Computation (C3) project, which aims to develop translation technology suitable for spoken language translation. C3 places corpora at the center of its technology. Translation knowledge is extracted from corpora, translation quality is gauged by referring to corpora, the best translation among multiple-engine outputs is selected based on corpora, and the corpora themselves are paraphrased or filtered by automated processes to improve the quality of the data on which the translation engines are based.
In particular, this paper reports on the architecture for hybridizing different machine translation systems, the component technologies, their performance on the IWSLT04 task, and the paraphrasing methods.

