International Workshop on Spoken Language Translation (IWSLT) 2008
Honolulu, Hawaii, USA
This paper reports on the first participation of TCH (Toshiba (China) Research and Development Center) in the IWSLT evaluation campaign. We participated in all five translation tasks with Chinese as the source or target language. For Chinese-English and English-Chinese translation, we used hybrid systems that combine rule-based machine translation (RBMT) and statistical machine translation (SMT). For Chinese-Spanish translation, we used phrase-based SMT models. For the pivot task, we combined the translations generated by a pivot-based statistical translation model and a statistical transfer translation model, which first translates from Chinese to English and then from English to Spanish. Moreover, to improve translation performance, we enhanced each module in the MT systems as follows: adapting Chinese word segmentation to spoken language translation, selecting out-of-domain corpora to build language models, using bilingual dictionaries to correct word alignment results, handling named entity (NE) translation, and selecting translations from the outputs of multiple systems. According to the automatic evaluation results on the full test sets, our systems ranked first in all five tasks.
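The abstract does not say how the pivot-based model is built. One common formulation, shown here only as a minimal sketch and not as the authors' confirmed method, derives a Chinese-Spanish phrase table by summing over the shared English pivot phrases: p(es | zh) = sum over en of p(es | en) * p(en | zh). The function name `triangulate`, the table layout, and all probabilities below are assumptions made up for illustration.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Build a source-to-target phrase table by marginalising over pivot phrases.

    Both inputs map a phrase to {translation: probability}.
    The result gives p(tgt | src) = sum_pivot p(tgt | pivot) * p(pivot | src).
    """
    table = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                table[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in table.items()}


# Toy probabilities, invented for illustration (pinyin stands in for Chinese text).
zh_en = {"xie xie": {"thank you": 0.75, "thanks": 0.25}}
en_es = {
    "thank you": {"gracias": 1.0},
    "thanks": {"gracias": 0.5, "muchas gracias": 0.5},
}

zh_es = triangulate(zh_en, en_es)
print(zh_es)
```

The transfer (cascade) approach mentioned in the abstract differs in that it runs two full translation systems in sequence rather than merging their phrase tables before decoding.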
Bibliographic reference. Wang, Haifeng / Wu, Hua / Hu, Xiaoguang / Liu, Zhanyi / Li, Jianfeng / Ren, Dengjun / Niu, Zhengyu (2008): "The TCH machine translation system for IWSLT 2008", In IWSLT-2008, 124-131.