Contribution to terminology internationalization by word alignment in parallel corpora.

Louise Deléger, Magnus Merkel, Pierre Zweigenbaum.

Abstract

BACKGROUND AND OBJECTIVES: Creating a complete translation of a large vocabulary is a time-consuming task that requires skilled and knowledgeable medical translators. Our goal is to examine to what extent this task can be alleviated by a specific natural language processing technique: word alignment in parallel corpora. We experiment with translation from English to French.
METHODS: We built a large corpus of parallel English-French documents and automatically aligned it at the document, sentence, and word levels using state-of-the-art alignment methods and tools. We then projected English terms from existing controlled vocabularies onto the aligned word pairs and examined the number and quality of the putative French translations thus obtained. We considered three American vocabularies present in the UMLS, each with a different translation status: MeSH, SNOMED CT, and the MedlinePlus Health Topics.
RESULTS: We obtained several thousand new translations of our input terms; this number is closely linked to the number of terms in the input vocabularies.
CONCLUSION: Our study shows that alignment methods can extract a number of new term translations from large bodies of text with a moderate human reviewing effort, and can thus help a human translator obtain better translation coverage of an input vocabulary. Short-term perspectives include applying these methods to a corpus 20 times larger than the one used here, together with more focused methods for term extraction.
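
The projection step described under METHODS can be illustrated with a minimal sketch. The abstract does not specify the alignment tools or the exact projection procedure, so everything below is an assumption made for illustration: the input format (token lists plus a set of word-alignment links), the function name project_terms, and the choice to take the contiguous French span covering all linked words. It is not the authors' implementation.

```python
from collections import Counter, defaultdict


def project_terms(aligned_pairs, english_terms):
    """Collect candidate French translations for each English term.

    aligned_pairs: iterable of (en_tokens, fr_tokens, links), where links is a
        set of (en_index, fr_index) word-alignment pairs (hypothetical format).
    english_terms: iterable of English terms given as strings.
    Returns a dict mapping each matched term to a Counter of French candidates.
    """
    # Index terms by their lowercased token tuple for exact matching.
    terms = {tuple(t.lower().split()): t for t in english_terms}
    max_len = max((len(key) for key in terms), default=0)
    candidates = defaultdict(Counter)

    for en_tokens, fr_tokens, links in aligned_pairs:
        en_lower = [tok.lower() for tok in en_tokens]
        for start in range(len(en_lower)):
            for length in range(1, max_len + 1):
                end = start + length
                if end > len(en_lower):
                    break
                key = tuple(en_lower[start:end])
                if key not in terms:
                    continue
                # French positions linked to any word of the English term.
                fr_idx = sorted({j for i, j in links if start <= i < end})
                if not fr_idx:
                    continue
                # Assumption: use the contiguous span covering all linked words.
                span = fr_tokens[fr_idx[0]:fr_idx[-1] + 1]
                candidates[terms[key]][" ".join(span).lower()] += 1
    return candidates


# Toy data, invented for this example only.
pairs = [
    (["Heart", "failure", "is", "common"],
     ["L'", "insuffisance", "cardiaque", "est", "fréquente"],
     {(0, 2), (1, 1), (2, 3), (3, 4)}),
    (["Chronic", "heart", "failure"],
     ["Insuffisance", "cardiaque", "chronique"],
     {(0, 2), (1, 1), (2, 0)}),
]
candidates = project_terms(pairs, ["heart failure"])
print(candidates["heart failure"].most_common(1))
```

Counting how often each candidate recurs gives a simple way to rank the putative translations before human review; a real pipeline would also need to filter out terms that already have a French translation in the target vocabulary.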

Year:  2006        PMID: 17238328      PMCID: PMC1839560     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


