PhenoGO: assigning phenotypic context to gene ontology annotations with natural language processing.

Yves Lussier, Tara Borlawsky, Daniel Rappaport, Yang Liu, Carol Friedman.

Abstract

Natural language processing (NLP) is a high-throughput technology because it can process vast quantities of text within a reasonable time period. It has the potential to substantially facilitate biomedical research by extracting, linking, and organizing the massive amounts of information that occur in biomedical journal articles as well as in textual fields of biological databases. Until recently, much of the work in biological NLP and text mining has revolved around recognizing the occurrence of biomolecular entities in articles and extracting particular relationships among those entities. Now, researchers have recognized a need to link the extracted information to ontologies or knowledge bases, which is a more difficult task. One such knowledge base is Gene Ontology annotations (GOA), which substantially facilitates semantic computation over the functions, cellular components, and processes of genes. For multicellular organisms, these annotations can be refined with phenotypic context, such as the cell type, tissue, and organ, because establishing the phenotypic contexts in which a gene is expressed is a crucial step for understanding development and the molecular underpinnings of the pathophysiology of diseases. In this paper, we propose a system, PhenoGO, which automatically augments annotations in GOA with additional context. PhenoGO utilizes an existing NLP system, called BioMedLEE, and an existing knowledge-based phenotype organizer system (PhenOS), in conjunction with MeSH indexing and established biomedical ontologies. More specifically, PhenoGO adds phenotypic contextual information to existing associations between gene products and GO terms as specified in GOA. The system also maps the context to identifiers that are associated with different biomedical ontologies, including the UMLS, Cell Ontology, Mouse Anatomy, NCBI taxonomy, GO, and Mammalian Phenotype Ontology. In addition, PhenoGO was evaluated for coding of anatomical and cellular information and for assigning the coded phenotypes to the correct GOA; the results show that PhenoGO has a precision of 91% and a recall of 92%, demonstrating that the PhenoGO NLP system can accurately encode a large number of anatomical and cellular ontologies to GO annotations. The PhenoGO Database may be accessed at the following URL: http://www.phenoGO.org
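
As an illustration of the ideas in the abstract, the following is a minimal sketch (not the authors' implementation) of how a gene product to GO term association might carry phenotypic context identifiers, and how precision and recall could be computed against a reference set. The class name `ContextualAnnotation`, the function `precision_recall`, and all identifiers and counts are hypothetical placeholders; they are not taken from the paper or from the PhenoGO database.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple


@dataclass(frozen=True)
class ContextualAnnotation:
    """A gene product / GO term pair plus phenotypic context identifiers.

    All field values used below are made-up placeholders, not real
    GOA, Cell Ontology, or UMLS identifiers.
    """
    gene_product: str
    go_term: str
    context_ids: FrozenSet[str] = frozenset()


def precision_recall(
    predicted: Set[ContextualAnnotation],
    gold: Set[ContextualAnnotation],
) -> Tuple[float, float]:
    """Return (precision, recall) of predicted annotations against gold."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations: same gene product and GO term, different context.
a1 = ContextualAnnotation("geneA", "GO:term1", frozenset({"CELL:type1"}))
a2 = ContextualAnnotation("geneA", "GO:term1", frozenset({"TISSUE:t1"}))
a3 = ContextualAnnotation("geneB", "GO:term2", frozenset({"ORGAN:o1"}))
a4 = ContextualAnnotation("geneC", "GO:term3", frozenset({"CELL:type2"}))
a5 = ContextualAnnotation("geneC", "GO:term3", frozenset({"CELL:type9"}))

gold = {a1, a2, a3, a4}       # reference (manually curated) set, hypothetical
predicted = {a1, a2, a3, a5}  # system output, hypothetical

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch an annotation counts as correct only if the gene product, the GO term, and the full set of context identifiers all match the reference; the evaluation protocol actually used for PhenoGO may differ.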

Year:  2006        PMID: 17094228      PMCID: PMC2906243     

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


