
Using WordNet synonym substitution to enhance UMLS source integration.

Kuo-Chuan Huang, James Geller, Michael Halper, Yehoshua Perl, Junchuan Xu.

Abstract

OBJECTIVE: Synonym-substitution algorithms have been developed for the purpose of matching source vocabulary terms with existing Unified Medical Language System (UMLS) terms during the integration process. A drawback is the possible explosion in the number of newly generated (potential) synonyms, which can tax computational and expert review resources. Experiments are run using a synonym-substitution approach based on WordNet to see how constraining two methodological parameters, namely, "maximum number of substitutions per term" and "maximum term length," affects performance. Our hypothesis is that these values can be constrained rather tightly, thus greatly speeding up the methodology, without a marked decline in the additional matches produced. Furthermore, we investigate whether a limitation on only the first of the two parameters is sufficient to achieve the same results.
METHODS: A four-stage synonym-substitution methodology using WordNet is presented. A group of experiments is carried out in which the two methodological parameters "maximum number of substitutions per term" and "maximum term length" are varied. The purpose is to examine their effect on the growth in the number of potential synonyms generated and the associated loss of results. The experiments are based on the re-integration of the "Minimal Standard Terminology" (MST) into the UMLS. Synonym-substitution matches found to be inconsistent with the current content of the UMLS and thus deemed to be incorrect are further manually scrutinized as an audit of the original integration of the MST.
RESULTS: An increase of 11% in the number of "MST term/UMLS term" matches was achieved using the synonym-substitution methodology. Importantly, this result prevailed when tight threshold values (such as a maximum of two synonym substitutions per term) were imposed on the parameters. Furthermore, it was found that limiting only the "maximum number of substitutions per term" parameter was sufficient to obtain the performance enhancement. During the additional audit phase, a number of the reported mismatches were actually seen to be correct, representing an additional 10% increase in the number of matches obtained.
CONCLUSION: A synonym-substitution methodology that utilizes WordNet is a useful automated aid in UMLS source integration. Experiments showed that there was a significant speed-up but no degradation in match results when the methodology's "maximum number of substitutions per term" parameter was relatively tightly constrained. The methodology also helped to discover errors in the MST's original integration and to improve the quality of the UMLS's conceptual content.
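
ILLUSTRATION: The abstract does not spell out the four-stage methodology, so the following is only a minimal sketch of the general idea of bounded synonym substitution, not the authors' implementation. The toy synonym table stands in for WordNet, and the function names, the sample terms, and the reading of "maximum term length" as a word count are all assumptions made for illustration.

```python
from itertools import combinations, product

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS = {
    "stomach": ["belly", "tummy"],
    "bleeding": ["hemorrhage"],
    "lesion": ["wound"],
}


def generate_candidates(term, synonyms, max_substitutions=2, max_term_length=10):
    """Return every variant of `term` with 1..max_substitutions words replaced.

    Terms with more than `max_term_length` words are skipped entirely
    (assumed interpretation of the "maximum term length" parameter).
    """
    words = term.lower().split()
    if len(words) > max_term_length:
        return set()
    # Positions of words that have at least one synonym.
    replaceable = [i for i, w in enumerate(words) if synonyms.get(w)]
    candidates = set()
    for k in range(1, min(max_substitutions, len(replaceable)) + 1):
        for positions in combinations(replaceable, k):
            # Try every combination of synonyms for the chosen positions.
            for choice in product(*(synonyms[words[i]] for i in positions)):
                new_words = list(words)
                for i, synonym in zip(positions, choice):
                    new_words[i] = synonym
                candidates.add(" ".join(new_words))
    return candidates


def match_term(term, target_terms, synonyms, **limits):
    """Return the target terms reachable from `term` by synonym substitution."""
    normalized = {t.lower() for t in target_terms}
    return sorted(generate_candidates(term, synonyms, **limits) & normalized)


if __name__ == "__main__":
    term = "stomach bleeding lesion"
    # The candidate set grows quickly as the substitution limit is raised.
    for limit in (1, 2, 3):
        n = len(generate_candidates(term, SYNONYMS, max_substitutions=limit))
        print(limit, n)
    print(match_term("stomach bleeding", ["Belly hemorrhage"], SYNONYMS))
```

Even in this toy setting, raising the substitution limit enlarges the candidate set at every step, which is the growth that the "maximum number of substitutions per term" parameter is meant to contain.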

Year:  2008        PMID: 19117739      PMCID: PMC2755556          DOI: 10.1016/j.artmed.2008.11.008

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  11 in total

1.  A review of auditing methods applied to the content of controlled biomedical terminologies. (Review)

Authors:  Xinxin Zhu; Jung-Wei Fan; David M Baorto; Chunhua Weng; James J Cimino
Journal:  J Biomed Inform       Date:  2009-03-12       Impact factor: 6.317

2.  Auditing SNOMED Integration into the UMLS for Duplicate Concepts.

Authors:  Kuo-Chuan Huang; James Geller; Gai Elhanan; Yehoshua Perl; Michael Halper
Journal:  AMIA Annu Symp Proc       Date:  2010-11-13

3.  A comparative analysis of the density of the SNOMED CT conceptual content for semantic harmonization.

Authors:  Zhe He; James Geller; Yan Chen
Journal:  Artif Intell Med       Date:  2015-04-02       Impact factor: 5.326

4.  Topological-Pattern-Based Recommendation of UMLS Concepts for National Cancer Institute Thesaurus.

Authors:  Zhe He; Yan Chen; Sherri de Coronado; Katrina Piskorski; James Geller
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

5.  Automated mapping of clinical terms into SNOMED-CT. An application to codify procedures in pathology.

Authors:  J L Allones; D Martinez; M Taboada
Journal:  J Med Syst       Date:  2014-09-02       Impact factor: 4.460

6.  Auditing the multiply-related concepts within the UMLS.

Authors:  Fleur Mougin; Natalia Grabar
Journal:  J Am Med Inform Assoc       Date:  2014-01-24       Impact factor: 4.497

7.  Automating case definitions using literature-based reasoning.

Authors:  T Botsis; R Ball
Journal:  Appl Clin Inform       Date:  2013-10-30       Impact factor: 2.342

8.  Extended Analysis of Topological-Pattern-Based Ontology Enrichment.

Authors:  Zhe He; Vipina Kuttichi Keloth; Yan Chen; James Geller
Journal:  Proceedings (IEEE Int Conf Bioinformatics Biomed)       Date:  2019-01-24

9.  Logic-based assessment of the compatibility of UMLS ontology sources.

Authors:  Ernesto Jiménez-Ruiz; Bernardo Cuenca Grau; Ian Horrocks; Rafael Berlanga
Journal:  J Biomed Semantics       Date:  2011-03-07

10.  A new synonym-substitution method to enrich the human phenotype ontology.

Authors:  Maria Taboada; Hadriana Rodriguez; Ranga C Gudivada; Diego Martinez
Journal:  BMC Bioinformatics       Date:  2017-10-10       Impact factor: 3.169
