This paper reports on FBK’s Machine Translation (MT) submissions to the IWSLT 2012 Evaluation on the TED talk translation tasks. We participated in the English-French and the Arabic-, Dutch-, German-, and Turkish-English translation tasks. We report several improvements over last year’s baselines. In addition to using fill-up combinations of phrase tables for domain adaptation, we explore cross-entropy-based corpus filtering to produce concise and accurate translation and language models. We also describe the challenges encountered with an under-resourced language (Turkish) and language-specific preprocessing needs.
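As an illustration of the cross-entropy filtering idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the scoring function; this sketch assumes a cross-entropy difference criterion (score each general-domain sentence by its cross-entropy under an in-domain model minus its cross-entropy under a general-domain model, and keep the lowest-scoring fraction). Add-one-smoothed unigram models stand in for real language models purely for brevity, and all function names and the `keep_fraction` parameter are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return a log-probability function for an add-one-smoothed unigram model.

    Assumption: a unigram model is a stand-in for a real n-gram LM.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens

    def logprob(tok):
        return math.log((counts.get(tok, 0) + 1) / (total + vocab))

    return logprob


def cross_entropy(sentence, logprob):
    """Per-token cross-entropy (in nats) of a non-empty sentence."""
    toks = sentence.split()
    return -sum(logprob(t) for t in toks) / len(toks)


def filter_corpus(in_domain, general, keep_fraction):
    """Keep the fraction of `general` that looks most like `in_domain`.

    Hypothetical helper: sentences are ranked by
    H_in(s) - H_general(s); lower means more in-domain-like.
    """
    lp_in = train_unigram(in_domain)
    lp_gen = train_unigram(general)
    candidates = [s for s in general if s.split()]  # skip empty lines
    ranked = sorted(
        candidates,
        key=lambda s: cross_entropy(s, lp_in) - cross_entropy(s, lp_gen),
    )
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


in_domain = ["the speaker gave a talk", "this talk is about ideas"]
general = [
    "the talk was about ideas",
    "stock markets fell sharply today",
    "the speaker shared ideas",
]
selected = filter_corpus(in_domain, general, keep_fraction=0.67)
```

With these toy inputs, the sentence about stock markets shares no vocabulary with the in-domain sample and is the one filtered out. In practice the retained fraction would be tuned on held-out in-domain data, which is one way such filtering can yield smaller models without hurting accuracy.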
Cite as: Ruiz, N., Bisazza, A., Cattoni, R., Federico, M. (2012) FBK’s machine translation systems for IWSLT 2012’s TED lectures. Proc. International Workshop on Spoken Language Translation (IWSLT 2012), 61-68
@inproceedings{ruiz12_iwslt,
  author={Nicholas Ruiz and Arianna Bisazza and Roldano Cattoni and Marcello Federico},
  title={{FBK’s machine translation systems for IWSLT 2012’s TED lectures}},
  year=2012,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2012)},
  pages={61--68}
}