This paper describes the ICT statistical machine translation system used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2010. We participated in the DIALOG task in both directions, Chinese-to-English and English-to-Chinese. In both directions, our system achieved significant improvements through the following methods: 1) refining data preprocessing, including Chinese word segmentation and named entity recognition, among other steps; 2) reducing the number of out-of-vocabulary (OOV) words on the final test set by applying a fuzzy matching strategy; 3) for ASR input, treating the generation of a better decoder input from the ASR N-best lists as a special kind of translation task; and 4) improving the performance of each individual decoder and reranking the N-best lists to produce the final submitted results.
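The fuzzy matching idea in method 2 can be illustrated with a minimal sketch. The abstract does not specify the matching criterion, so everything below is an assumption made for illustration: the function name `replace_oov`, the use of Python's `difflib` string similarity, and the `cutoff` threshold are hypothetical and are not taken from the system itself.

```python
import difflib


def replace_oov(tokens, vocabulary, cutoff=0.8):
    """Map each out-of-vocabulary token to its closest in-vocabulary word.

    Hypothetical helper: similarity is difflib's SequenceMatcher ratio,
    which is an assumption, not the criterion used in the paper.
    Tokens with no sufficiently similar match are left unchanged.
    """
    vocab_list = sorted(vocabulary)  # stable candidate order
    result = []
    for tok in tokens:
        if tok in vocabulary:
            result.append(tok)
            continue
        # Ask for the single best candidate whose similarity is >= cutoff.
        matches = difflib.get_close_matches(tok, vocab_list, n=1, cutoff=cutoff)
        result.append(matches[0] if matches else tok)
    return result


vocab = {"make", "a", "reservation", "hotel", "please"}
sentence = ["make", "a", "reservaton", "please", "xyz"]
print(replace_oov(sentence, vocab))
```

In this sketch the misspelled token `reservaton` is close enough to `reservation` to be replaced, while `xyz` has no similar in-vocabulary word and passes through untouched. A higher `cutoff` trades OOV coverage for fewer wrong substitutions.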
Cite as: Xiong, H., Xie, J., Yu, H., Liu, K., Luo, W., Mi, H., Liu, Y., Lü, Y., Liu, Q. (2010) The ICT statistical machine translation system for IWSLT 2010. Proc. International Workshop on Spoken Language Translation (IWSLT 2010), 73-79
@inproceedings{xiong10_iwslt,
  author={Hao Xiong and Jun Xie and Hui Yu and Kai Liu and Wei Luo and Haitao Mi and Yang Liu and Yajuan Lü and Qun Liu},
  title={{The ICT statistical machine translation system for IWSLT 2010}},
  year=2010,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2010)},
  pages={73--79}
}