International Workshop on Spoken Language Translation (IWSLT) 2011
San Francisco, CA, USA
This work describes a process for extracting Named Entity (NE) translations from the text of web links (anchor texts). It translates an NE by retrieving a list of web documents in the target language, extracting the anchor texts from the links to those documents, and finding the best translation among the anchor texts using a combination of features, some of which are specific to anchor texts. Experiments performed on a manually built corpus suggest that over 70% of the NEs, ranging from unpopular to popular entities, can be translated correctly using solely anchor texts. Tests on a Machine Translation task indicate that the system can be used to improve the quality of the translations of state-of-the-art statistical machine translation systems.
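The pipeline described above (collect anchor texts from links that point to target-language documents, then choose the best candidate) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `AnchorCollector`, `anchor_texts_for` and `best_translation` are hypothetical, the example URLs are made up, and raw frequency is used only as a stand-in for the combination of features, which the abstract does not specify.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None   # href of the <a> element currently open, if any
        self._text = []     # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalise internal whitespace before storing the anchor text.
            text = " ".join("".join(self._text).split())
            if text:
                self.anchors.append((self._href, text))
            self._href = None


def anchor_texts_for(target_urls, pages):
    """Return anchor texts of links in `pages` that point to a target URL."""
    targets = set(target_urls)
    found = []
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        found.extend(text for href, text in collector.anchors if href in targets)
    return found


def best_translation(candidates):
    """Pick the most frequent candidate (assumption: frequency as the only feature)."""
    if not candidates:
        return None
    return Counter(candidates).most_common(1)[0][0]
```

In this sketch, `target_urls` stands for the target-language documents retrieved for the NE and `pages` for the HTML of pages that link to them; a fuller system would replace the frequency count with a scoring function over several features.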
Bibliographic reference. Ling, Wang / Calado, Pável / Martins, Bruno / Trancoso, Isabel / Black, Alan / Coheur, Luísa (2011): "Named entity translation using anchor texts", In IWSLT-2011, 206-213.