1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages
Porto Salvo, Portugal
This paper presents research on bilingual terminology extraction from parallel corpora, based on the occurrence of bilingual morphosyntactic patterns in the probabilistic translation dictionaries generated by NATools. To evaluate this method, we carried out an experiment that took into account both the level of lexical cohesion of the term candidates and their specificity with respect to a non-terminological corpus of the target language. The evaluation results show a high degree of accuracy for terminology extraction based on probabilistic translation dictionaries complemented by bilingual syntactic patterns.
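As a rough illustration of the general idea, and not the authors' implementation, the following minimal sketch pairs source and target word sequences that match a bilingual part-of-speech pattern and whose words are linked in a probabilistic translation dictionary. The dictionary entries, probabilities, tag names, patterns, threshold and function names are all hypothetical; the data structure shown is not the NATools dictionary format.

```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tagset is an assumption

# Toy probabilistic translation dictionary: source word -> {target word: probability}.
# All entries and values are invented for illustration.
PTD: Dict[str, Dict[str, float]] = {
    "parallel": {"paralelo": 0.6, "paralelos": 0.3},
    "corpus": {"corpus": 0.9},
    "translation": {"tradução": 0.9},
    "dictionary": {"dicionário": 0.95},
}

# Hypothetical bilingual patterns: (source tags, target tags, alignment).
# alignment[i] is the source position translated by target position i,
# or None for a function word with no source counterpart.
Pattern = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[Optional[int], ...]]
PATTERNS: List[Pattern] = [
    (("ADJ", "NOUN"), ("NOUN", "ADJ"), (1, 0)),
    (("NOUN", "NOUN"), ("NOUN", "ADP", "NOUN"), (1, None, 0)),
]


def ngrams(tokens: List[Token], n: int) -> List[List[Token]]:
    """Return all contiguous n-token windows of a tagged sentence."""
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


def extract_candidates(src: List[Token], tgt: List[Token],
                       ptd: Dict[str, Dict[str, float]],
                       min_prob: float = 0.3) -> List[Tuple[str, str, float]]:
    """Collect (source term, target term, score) triples from one aligned
    sentence pair. The score is the lowest dictionary probability among the
    aligned word pairs; this scoring rule is an assumption of the sketch."""
    found = []
    for src_tags, tgt_tags, alignment in PATTERNS:
        for s in ngrams(src, len(src_tags)):
            if tuple(tag for _, tag in s) != src_tags:
                continue
            for t in ngrams(tgt, len(tgt_tags)):
                if tuple(tag for _, tag in t) != tgt_tags:
                    continue
                score = 1.0
                for (tgt_word, _), src_pos in zip(t, alignment):
                    if src_pos is None:
                        continue  # unaligned function word, e.g. "de"
                    src_word = s[src_pos][0]
                    score = min(score, ptd.get(src_word, {}).get(tgt_word, 0.0))
                if score >= min_prob:
                    found.append((" ".join(w for w, _ in s),
                                  " ".join(w for w, _ in t),
                                  score))
    return found


pair_a = extract_candidates(
    [("parallel", "ADJ"), ("corpus", "NOUN")],
    [("corpus", "NOUN"), ("paralelo", "ADJ")],
    PTD,
)
pair_b = extract_candidates(
    [("translation", "NOUN"), ("dictionary", "NOUN")],
    [("dicionário", "NOUN"), ("de", "ADP"), ("tradução", "NOUN")],
    PTD,
)
```

In this toy setting, a candidate pair is kept only when every aligned word pair appears in the dictionary with a probability of at least `min_prob`; the cohesion and specificity measures used in the evaluation are not modelled here.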
Index Terms: bilingual terminology extraction, probabilistic translation dictionaries
Bibliographic reference. Simões, Alberto / Gómez Guinovart, Xavier (2009): "Terminology extraction from English-Portuguese and English-Galician parallel corpora based on probabilistic translation dictionaries and bilingual syntactic patterns", in SLTECH-2009, 13-16.