International Workshop on Spoken Language Translation (IWSLT) 2008

Honolulu, Hawaii, USA
October 20-21, 2008

Phrase-Based Statistical Machine Translation with Pivot Languages

Nicola Bertoldi (1), Madalina Barbaiani (2), Marcello Federico (1), Roldano Cattoni (1)

(1) FBK-irst - Ricerca Scientifica e Tecnologica, Povo (TN), Italy
(2) Research Group on Mathematical Linguistics, Rovira i Virgili University, Tarragona, Spain

Translation with pivot languages has recently gained attention as a means of circumventing the data bottleneck of statistical machine translation (SMT). This paper aims to give a mathematically sound formulation of the various approaches presented in the literature and introduces new methods for training alignment models through pivot languages. We present experimental results on Chinese-Spanish translation via English, on a popular travel-domain task. In contrast to previous work, we report results obtained with parallel corpora that are either disjoint or overlapping on the pivot-language side. Finally, our original method for generating training data through random sampling is shown to perform as well as the best methods based on the coupling of translation systems.
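As a rough illustration of the general pivot idea, the sketch below combines a source-to-pivot phrase table with a pivot-to-target phrase table by summing over shared pivot phrases, i.e. p(t|s) ≈ Σ_p p(t|p) · p(p|s). This is a generic triangulation sketch, not necessarily the formulation used in the paper; the function name `triangulate`, the table layout, and all probabilities are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Combine two phrase tables by marginalizing over shared pivot phrases.

    Both inputs map a phrase to a dict of {translation: probability}.
    Assumes the target phrase is independent of the source given the pivot.
    """
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                src_to_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_to_tgt.items()}


# Toy tables with made-up probabilities (Chinese -> English -> Spanish).
zh_en = {"谢谢": {"thank you": 0.75, "thanks": 0.25}}
en_es = {
    "thank you": {"gracias": 1.0},
    "thanks": {"gracias": 0.5, "muchas gracias": 0.5},
}

zh_es = triangulate(zh_en, en_es)
print(zh_es)
```

With these toy numbers, "gracias" receives 0.75 · 1.0 + 0.25 · 0.5 = 0.875 and "muchas gracias" receives 0.125. Pivot phrases that appear in only one table are silently dropped, which is one reason the overlap between corpora on the pivot side matters.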


Bibliographic reference. Bertoldi, Nicola / Barbaiani, Madalina / Federico, Marcello / Cattoni, Roldano (2008): "Phrase-based statistical machine translation with pivot languages", in IWSLT-2008, 143-149.