International Workshop on Spoken Language Translation (IWSLT) 2011
San Francisco, CA, USA
In statistical machine translation systems, phrases with similar meanings often have similar but not identical distributions of translations. This paper proposes a new soft clustering method to smooth the conditional translation probabilities for a given phrase with those of semantically similar phrases. We call this semantic smoothing (SS). Moreover, we fabricate new phrase pairs that were not observed in training data, but which may be used for decoding. In learning curve experiments against a strong baseline, we obtain a consistent pattern of modest improvement from semantic smoothing, and further modest improvement from phrase pair fabrication.
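As a rough illustration of the smoothing idea, the sketch below interpolates a phrase's own translation distribution with a similarity-weighted average of the distributions of similar phrases. It is not the paper's soft clustering method, which the abstract does not specify. The function name `smooth_translation_probs`, the interpolation weight `lam`, the similarity scores, and the toy phrase table are all assumptions made for this example.

```python
from collections import defaultdict


def smooth_translation_probs(phrase, trans_probs, similar, lam=0.8):
    """Interpolate p(target | phrase) with a pooled distribution from similar phrases.

    trans_probs: dict mapping source phrase -> {target phrase: probability}
    similar:     dict mapping source phrase -> {similar source phrase: weight}
    lam:         hypothetical interpolation weight on the phrase's own distribution
    """
    own = trans_probs.get(phrase, {})
    neighbours = similar.get(phrase, {})
    total_weight = sum(neighbours.values())
    if total_weight == 0:
        # No similar phrases known: return the unsmoothed distribution.
        return dict(own)

    # Similarity-weighted average of the neighbours' translation distributions.
    pooled = defaultdict(float)
    for other, weight in neighbours.items():
        for target, prob in trans_probs.get(other, {}).items():
            pooled[target] += (weight / total_weight) * prob

    # Linear interpolation over the union of target phrases.
    targets = set(own) | set(pooled)
    return {
        t: lam * own.get(t, 0.0) + (1.0 - lam) * pooled.get(t, 0.0)
        for t in targets
    }


# Toy phrase table (invented for illustration only).
trans_probs = {
    "big house": {"grande maison": 0.7, "grosse maison": 0.3},
    "large house": {"grande maison": 0.5, "vaste maison": 0.5},
}
similar = {"big house": {"large house": 1.0}}

smoothed = smooth_translation_probs("big house", trans_probs, similar, lam=0.8)
```

In this toy setup, a target phrase seen only with a similar source phrase ("vaste maison") receives a small nonzero probability for "big house". That loosely resembles a phrase pair unseen in training, though the paper's fabrication procedure may differ from this.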
Bibliographic reference. Chen, Boxing / Kuhn, Roland / Foster, George (2011): "Semantic smoothing and fabrication of phrase pairs for SMT", In IWSLT-2011, 144-150.