International Workshop on Spoken Language Translation (IWSLT) 2004
Keihanna Science City, Kyoto, Japan
This paper introduces Corpus-Centered Computation (C3), an
ATR project that aims to develop translation technology
suitable for spoken language translation.
C3 places corpora at the center of its technology. Translation
knowledge is extracted from corpora, translation quality
is gauged by referring to corpora, the best translation
among multiple-engine outputs is selected based on corpora,
and the corpora themselves are paraphrased or filtered by
automated processes to improve the quality of the data on
which the translation engines are based.
In particular, this paper reports our architecture for hybridizing different machine translation systems, our technologies, their performance on the IWSLT04 task, and our paraphrasing methods.
Bibliographic reference. Sumita, Eiichiro / Akiba, Yasuhiro / Doi, Takao / Finch, Andrew / Imamura, Kenji / Okuma, Hideo / Paul, Michael / Shimohata, Mitsuo / Watanabe, Taro (2004): "EBMT, SMT, hybrid and more: ATR spoken language translation system", In IWSLT-2004, 13-20.