International Workshop on Spoken Language Translation (IWSLT) 2010

Paris, France
December 2-3, 2010

The MSRA Machine Translation System for IWSLT 2010

Chi-Ho Li (1), Nan Duan (1), Yinggong Zhao (1), Shujie Liu (1), Lei Cui (1), Mei-yuh Hwang (2), Amittai Axelrod (2), Jianfeng Gao (2), Yaodong Zhang (2), Li Deng (2)

(1) Natural Language Computing, Microsoft Research Asia, Beijing, China
(2) Natural Language Processing, Microsoft Research, Redmond, WA, USA

This paper describes the systems and experiments of Microsoft Research Asia (MSRA), supported by Microsoft Research (MSR), in the IWSLT 2010 evaluation campaign. We participated in all tracks of the DIALOG task (Chinese/English). While we follow the standard training and decoding pipeline of statistical machine translation (SMT) and of MT output combination, this is the first time we have experimented with post-processing the output of automatic speech recognition (ASR) before feeding it to SMT decoders. Our findings are: (1) using the complete N-best ASR output does not help; rather, the best translation performance is achieved by taking the top-ranked candidate after Minimum Bayes Risk (MBR) re-ranking of the N-best ASR output; (2) for punctuation recovery, the best performance is achieved by splitting the problem into two steps: predicting the positions of punctuation marks, and then predicting which mark to place at each position.
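
To make the first finding concrete, the following is a minimal sketch of Minimum Bayes Risk re-ranking over an N-best list. It is an illustration under stated assumptions, not the system described in the paper: the abstract does not specify the loss function or how posteriors are estimated, so this sketch assumes a word-level edit distance as the loss, posteriors obtained by normalizing log scores over the N-best list, and whitespace tokenization (Chinese ASR output would more plausibly be compared at the character or segmented-word level). The names `edit_distance`, `posteriors`, and `mbr_rerank` are hypothetical.

```python
import math
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Word-level Levenshtein distance (assumed loss function)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def posteriors(log_scores: List[float]) -> List[float]:
    """Normalize log scores into a distribution over the N-best list."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


def mbr_rerank(nbest: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return (hypothesis, expected loss) pairs sorted by increasing risk.

    `nbest` holds (hypothesis text, log score) pairs. The expected loss of
    a hypothesis is its posterior-weighted edit distance to every entry
    in the list, itself included (distance zero).
    """
    hyps = [text.split() for text, _ in nbest]
    probs = posteriors([score for _, score in nbest])
    ranked = []
    for (text, _), hyp in zip(nbest, hyps):
        risk = sum(p * edit_distance(hyp, other)
                   for other, p in zip(hyps, probs))
        ranked.append((text, risk))
    return sorted(ranked, key=lambda pair: pair[1])


# Toy N-best list: the highest-scoring entry is not the MBR choice.
nbest = [
    ("a b c", math.log(0.4)),
    ("a b d", math.log(0.3)),
    ("a b d e", math.log(0.3)),
]
best_text, best_risk = mbr_rerank(nbest)[0]
print(best_text)  # a b d
```

In this toy list the top-scoring hypothesis is "a b c", but "a b d" has the lowest expected loss because it is close to both of the other candidates. Only the single top-ranked candidate after re-ranking would then be passed to the SMT decoder, in line with finding (1).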


Bibliographic reference. Li, Chi-Ho / Duan, Nan / Zhao, Yinggong / Liu, Shujie / Cui, Lei / Hwang, Mei-yuh / Axelrod, Amittai / Gao, Jianfeng / Zhang, Yaodong / Deng, Li (2010): "The MSRA machine translation system for IWSLT 2010", in IWSLT-2010, 135-138.