International Workshop on Spoken Language Translation (IWSLT) 2009

Tokyo, Japan
December 1-2, 2009

Enriching SCFG Rules Directly From Efficient Bilingual Chart Parsing

Martin Čmejrek, Bowen Zhou, Bing Xiang

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

In this paper, we propose a new method for training translation rules for a synchronous context-free grammar (SCFG). A bilingual chart parser generates the parse forest, and the EM algorithm estimates expected counts for each rule in the ruleset. Additional rules are constructed as combinations of reliable rules occurring in the parse forest. This method of proposing additional translation rules is independent of word alignments. We present the theoretical background for the method and initial experimental results on German-English translation of Europarl data.

