Jake Vasilakes (1,2), Anusha Bompelli (1), Jeffrey R Bishop (3), Terrence J Adam (1,2), Olivier Bodenreider (4), Rui Zhang (1,2)
1. Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA.
2. Department of Pharmaceutical Care and Health Systems, University of Minnesota, Minneapolis, Minnesota, USA.
3. Department of Experimental and Clinical Pharmacy, University of Minnesota, Minneapolis, Minnesota, USA.
4. Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland, USA.
Abstract
OBJECTIVE: We sought to assess the need for additional coverage of dietary supplements (DS) in the Unified Medical Language System (UMLS) by investigating (1) the overlap between the integrated DIetary Supplements Knowledge base (iDISK) DS ingredient terminology and the UMLS and (2) the coverage of iDISK and the UMLS over DS mentions in the biomedical literature.

MATERIALS AND METHODS: We estimated the overlap between iDISK and the UMLS by mapping iDISK to the UMLS using exact and normalized strings. The coverage of iDISK and the UMLS over DS mentions in the biomedical literature was evaluated via a DS named-entity recognition (NER) task within PubMed abstracts.

RESULTS: The overlap analysis revealed that only 30% of iDISK terms could be matched to the UMLS, although the matched terms cover over 99% of iDISK concepts. A manual review revealed that a majority of the unmatched terms represented new synonyms rather than lexical variants. For NER, iDISK nearly doubled the precision and achieved a higher F1 score than the UMLS while maintaining competitive recall.

DISCUSSION: While iDISK has significant concept overlap with the UMLS, it contains many novel synonyms. Furthermore, almost 3000 of the overlapping UMLS concepts are missing a DS designation, which iDISK could provide. The NER experiments show that the specialization of iDISK is useful for identifying DS mentions.

CONCLUSIONS: Our results show that the DS representation in the UMLS could be enriched by adding DS designations to many concepts and by adding new synonyms.
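The mapping step described in the methods, and the gap between term-level and concept-level overlap reported in the results, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `normalize` function is a simplified stand-in (lowercasing, punctuation removal, word sorting) and not the normalization the authors used, and the concept identifiers and term lists are hypothetical toy data, not drawn from iDISK or the UMLS.

```python
import re


def normalize(term: str) -> str:
    """Simplified normalization (an assumption, not the authors' method):
    lowercase, replace punctuation with spaces, and sort the words."""
    words = re.sub(r"[^a-z0-9]+", " ", term.lower()).split()
    return " ".join(sorted(words))


def map_terms(source_terms, target_terms):
    """Label each source term 'exact', 'normalized', or None (unmatched)."""
    exact = set(target_terms)
    normalized = {normalize(t) for t in target_terms}
    result = {}
    for term in source_terms:
        if term in exact:
            result[term] = "exact"
        elif normalize(term) in normalized:
            result[term] = "normalized"
        else:
            result[term] = None
    return result


# Hypothetical toy data: source concept id -> synonyms (terms).
source = {
    "C1": ["Vitamin C", "ascorbic acid", "L-ascorbate supplement"],
    "C2": ["Oil, Fish", "omega-3 marine oil capsules"],
}
# Hypothetical target terminology.
target_terms = ["Vitamin C", "Ascorbic Acid", "fish oil"]

all_terms = [t for terms in source.values() for t in terms]
matches = map_terms(all_terms, target_terms)

# Term-level overlap: fraction of source terms with any match.
term_rate = sum(1 for t in all_terms if matches[t]) / len(all_terms)

# Concept-level overlap: a concept counts as covered if any of its terms match.
covered = sum(1 for terms in source.values() if any(matches[t] for t in terms))
concept_rate = covered / len(source)

print(f"terms matched: {term_rate:.0%}, concepts matched: {concept_rate:.0%}")
```

In this toy case only three of five terms match while both concepts are covered, which mirrors in miniature how a low term-level match rate can coexist with near-complete concept-level overlap: one matching synonym is enough to link a concept, and the remaining unmatched synonyms are candidates for new entries.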