ISCA Archive IWSLT 2011

Named entity translation using anchor texts

Wang Ling, Pável Calado, Bruno Martins, Isabel Trancoso, Alan Black, Luísa Coheur

This work describes a process to extract Named Entity (NE) translations from the text available in web links (anchor texts). It translates an NE by retrieving a list of web documents in the target language, extracting the anchor texts from the links to those documents, and finding the best translation among the anchor texts using a combination of features, some of which are specific to anchor texts. Experiments performed on a manually built corpus suggest that over 70% of the NEs, ranging from unpopular to popular entities, can be translated correctly using solely anchor texts. Tests on a Machine Translation task indicate that the system can be used to improve the translation quality of state-of-the-art statistical machine translation systems.
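
The pipeline outlined in the abstract (collect links that point to target-language documents, extract their anchor texts, pick the best candidate) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `AnchorCollector`, `anchor_texts`, and `best_candidate` are hypothetical, document retrieval is assumed to have already produced a set of target URLs, and a plain frequency count stands in for the paper's combination of features, which the abstract does not specify.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from <a> elements of one page."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._href = None    # href of the <a> currently open, if any
        self._chunks = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize internal whitespace in the anchor text.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.pairs.append((self._href, text))
            self._href = None


def anchor_texts(pages, target_urls):
    """Return anchor texts of links in `pages` that point at `target_urls`."""
    texts = []
    for html in pages:
        collector = AnchorCollector()  # fresh parser state per page
        collector.feed(html)
        collector.close()
        texts.extend(t for href, t in collector.pairs if href in target_urls)
    return texts


def best_candidate(texts):
    """Assumption: rank candidates by case-insensitive frequency only."""
    counts = Counter(t.lower() for t in texts)
    return counts.most_common(1)[0][0] if counts else None
```

Under these assumptions, `best_candidate(anchor_texts(pages, target_urls))` returns the most frequent anchor text among links to the retrieved target-language documents, or `None` when no such link is found. A fuller system would replace the frequency count with a scoring function over several features.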

Cite as: Ling, W., Calado, P., Martins, B., Trancoso, I., Black, A., Coheur, L. (2011) Named entity translation using anchor texts. Proc. International Workshop on Spoken Language Translation (IWSLT 2011), 206-213

  author={Wang Ling and Pável Calado and Bruno Martins and Isabel Trancoso and Alan Black and Luísa Coheur},
  title={{Named entity translation using anchor texts}},
  booktitle={Proc. International Workshop on Spoken Language Translation (IWSLT 2011)},