International Workshop on Spoken Language Translation (IWSLT) 2005
Pittsburgh, PA, USA
This work summarizes a comparison between two approaches to Statistical Machine Translation (SMT): N-gram-based and phrase-based SMT. In both approaches, the translation process is based on bilingual units related by word-to-word alignments (pairs of source and target words); the main differences lie in how these units are extracted and in how the translation context is modeled statistically. The study was carried out on two translation tasks that differ in translation difficulty and in the amount of available training data, and it allows for distortion (reordering) in the decoding process. It thus extends previous work in which both approaches were compared under monotone conditions. We report comparative results in terms of translation accuracy, computation time and memory size. The results show that the N-gram-based approach outperforms the phrase-based approach, achieving similar accuracy scores with less computation time and lower memory requirements.
Bibliographic reference. Crego, Josep M. / Costa-jussà, Marta R. / Mariño, José B. / Fonollosa, José A. R. (2005): "N-gram-based versus phrase-based statistical machine translation", In IWSLT-2005, 167-174.