ISCA Archive Interspeech 2007

Bilingual LSA-based translation lexicon adaptation for spoken language translation

Yik-Cheung Tam, Tanja Schultz

We present a bilingual LSA (bLSA) framework for translation lexicon adaptation. The idea is to apply marginal adaptation to a translation lexicon so that the lexicon marginals match the in-domain marginals. In the framework of speech translation, the bLSA method transfers topic distributions from the source side to the target side, so that the translation lexicon can be adapted before translation based on the source document. We evaluated the proposed approach on our Mandarin RT04 spoken language translation system. Results showed that the conditional likelihood of the test sentence pairs improved significantly with an adapted translation lexicon compared to an unadapted baseline. The proposed approach also improved the BLEU score in SMT. When both the target-side LM and the translation lexicon were adapted and applied simultaneously in SMT decoding, the gain in BLEU score was more than additive compared with applying each adapted model individually.
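
As a rough illustration of the marginal adaptation idea described above, the sketch below rescales each lexicon entry by the ratio of an in-domain target-word marginal to a background marginal and then renormalizes. This is one common form of marginal adaptation, not necessarily the exact formulation used in the paper. The function name `adapt_lexicon`, the exponent `beta`, and the toy words and probabilities are all hypothetical; in the bLSA setting the in-domain marginal would come from the topic distribution transferred from the source document, which is simply assumed to be given here.

```python
def adapt_lexicon(lexicon, background_marginal, in_domain_marginal, beta=0.5):
    """Rescale p(target | source) toward in-domain target marginals.

    lexicon: {source_word: {target_word: p(target | source)}}
    background_marginal, in_domain_marginal: {target_word: probability}
    beta: hypothetical tuning exponent controlling adaptation strength.
    """
    adapted = {}
    for src, row in lexicon.items():
        # Scale each entry by (in-domain marginal / background marginal) ** beta.
        scaled = {
            tgt: p * (in_domain_marginal[tgt] / background_marginal[tgt]) ** beta
            for tgt, p in row.items()
        }
        # Renormalize so each source word's distribution sums to one.
        z = sum(scaled.values())
        adapted[src] = {tgt: v / z for tgt, v in scaled.items()}
    return adapted


# Toy data (invented for illustration only).
lexicon = {"s1": {"bank": 0.5, "shore": 0.5}}
background = {"bank": 0.5, "shore": 0.5}
in_domain = {"bank": 0.8, "shore": 0.2}  # e.g. a finance-heavy document

adapted = adapt_lexicon(lexicon, background, in_domain, beta=1.0)
```

With these toy numbers, the in-domain marginal favors "bank", so the adapted lexicon shifts probability mass for the source word toward "bank" while each row remains a valid distribution.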

doi: 10.21437/Interspeech.2007-647

Cite as: Tam, Y.-C., Schultz, T. (2007) Bilingual LSA-based translation lexicon adaptation for spoken language translation. Proc. Interspeech 2007, 2461-2464, doi: 10.21437/Interspeech.2007-647

  author={Yik-Cheung Tam and Tanja Schultz},
  title={{Bilingual LSA-based translation lexicon adaptation for spoken language translation}},
  booktitle={Proc. Interspeech 2007},