Nowadays, most statistical translation systems are based on phrases (i.e., groups of words). We describe a phrase-based system that uses a modified phrase extraction method, which handles longer phrases while keeping the total number of phrases reasonable. Phrases can also be extracted from different alignments, and additional features are used, leading to a clear improvement in translation performance. Finally, the system performs reordering. We report translation accuracy on the BTEC corpus for the Chinese-to-English and Arabic-to-English tasks, within the framework of the IWSLT'05 evaluation.
Cite as: Costa-jussà, M.R., Fonollosa, J.A.R. (2005) Tuning a phrase-based statistical translation system for the IWSLT 2005 Chinese to English and Arabic to English tasks. Proc. International Workshop on Spoken Language Translation (IWSLT 2005), 175-180
@inproceedings{costajussa05_iwslt,
  author={Marta R. Costa-jussà and José A. R. Fonollosa},
  title={{Tuning a phrase-based statistical translation system for the IWSLT 2005 Chinese to English and Arabic to English tasks}},
  year=2005,
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2005)},
  pages={175--180}
}