
Automated comparative auditing of NCIT genomic roles using NCBI.

Barry Cohen, Marc Oren, Hua Min, Yehoshua Perl, Michael Halper.

Abstract

Biomedical research has identified many human genes and accumulated a wide range of knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Given the rapid advances in this field, the NCIT's Gene hierarchy can be expected to contain role errors. We present a comparative methodology for auditing the Gene hierarchy against the National Center for Biotechnology Information's (NCBI's) Entrez Gene database. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. For chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. For biological processes, difficulties arise because genes frequently play roles in multiple processes, and a process may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors, together with corrections for them, in a terminological genomic knowledge repository, thus facilitating its overall maintenance.
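
The comparison step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `audit_locations`, the `Discrepancy` record, the whitespace-only normalization, and the placeholder gene names and locations are all assumptions made for illustration. It takes one mapping for the audited source and one for the reference source, reports genes whose chromosomal locations disagree, and proposes a corrective action for each.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Discrepancy:
    """One disagreement between the audited source and the reference source."""
    gene: str
    audited_value: Optional[str]
    reference_value: Optional[str]
    suggestion: str


def normalize(location: Optional[str]) -> Optional[str]:
    """Remove all whitespace so that ' 7q31' and '7q31' compare equal.

    Assumption: whitespace is the only formatting noise; real data may
    need richer normalization.
    """
    if location is None:
        return None
    cleaned = "".join(location.split())
    return cleaned or None


def audit_locations(
    audited: Dict[str, Optional[str]],
    reference: Dict[str, Optional[str]],
) -> List[Discrepancy]:
    """Compare locations for genes present in both sources."""
    discrepancies = []
    for gene in sorted(audited.keys() & reference.keys()):
        a = normalize(audited[gene])
        r = normalize(reference[gene])
        if a == r:
            continue  # the two sources agree; nothing to report
        if a is None:
            suggestion = f"add location {r}"
        elif r is None:
            suggestion = "no reference value; review manually"
        else:
            suggestion = f"replace {a} with {r}"
        discrepancies.append(Discrepancy(gene, a, r, suggestion))
    return discrepancies


# Hypothetical placeholder records, not taken from NCIT or Entrez Gene.
audited_source = {"GENE_A": "1p36", "GENE_B": " 7q31", "GENE_C": None}
reference_source = {"GENE_A": "1p36", "GENE_B": "7q22", "GENE_C": "Xq28"}

for d in audit_locations(audited_source, reference_source):
    print(f"{d.gene}: {d.suggestion}")
```

Treating one source as the reference is itself a design choice in this sketch; a discrepancy only marks a probable error, so each suggestion would still need review by a curator.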

Year: 2008   PMID: 18486558   PMCID: PMC2630966   DOI: 10.1016/j.jbi.2008.03.010

Source DB: PubMed   Journal: J Biomed Inform   ISSN: 1532-0464   Impact factor: 6.317


  6 in total

1.  A review of auditing methods applied to the content of controlled biomedical terminologies. (Review)

Authors:  Xinxin Zhu; Jung-Wei Fan; David M Baorto; Chunhua Weng; James J Cimino
Journal:  J Biomed Inform       Date:  2009-03-12       Impact factor: 6.317

2.  Topological-Pattern-Based Recommendation of UMLS Concepts for National Cancer Institute Thesaurus.

Authors:  Zhe He; Yan Chen; Sherri de Coronado; Katrina Piskorski; James Geller
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

3.  Extended Analysis of Topological-Pattern-Based Ontology Enrichment.

Authors:  Zhe He; Vipina Kuttichi Keloth; Yan Chen; James Geller
Journal:  Proceedings (IEEE Int Conf Bioinformatics Biomed)       Date:  2019-01-24

4.  Relationship auditing of the FMA ontology.

Authors:  Huanying Helen Gu; Duo Wei; Jose L V Mejino; Gai Elhanan
Journal:  J Biomed Inform       Date:  2009-06       Impact factor: 6.317

5.  Auditing associative relations across two knowledge sources.

Authors:  Lowell T Vizenor; Olivier Bodenreider; Alexa T McCray
Journal:  J Biomed Inform       Date:  2009-06       Impact factor: 6.317

6.  Preliminary Analysis of Difficulty of Importing Pattern-Based Concepts into the National Cancer Institute Thesaurus.

Authors:  Zhe He; James Geller
Journal:  Stud Health Technol Inform       Date:  2016
