
Gene ontology concept recognition using named concept: understanding the various presentations of the gene functions in biomedical literature.

Chia-Jung Yang, Jung-Hsien Chiang.

Abstract

OBJECTIVE: A major challenge in precision medicine is the development of patient-specific genetic biomarkers or drug targets. Firsthand information about the genes associated with the pathologic pathways of interest is buried in the vast biomedical literature. Gene ontology concept recognition (GOCR) is a biomedical natural language processing task that extracts and normalizes mentions of gene ontology (GO), the controlled vocabulary for gene functions across many species, from biomedical text. Previous GOCR systems, whether rule-based or machine-learning based, treated GO concepts as separate terms and had no efficient way of sharing common synonyms among the concepts.
MATERIALS AND METHODS: We used the CRAFT corpus in this study. Targeting the compositional structure of the GO, we introduced the named concept, a basic conceptual unit that has a conserved name and is reused in other, more complex concepts. Using the named concepts, we separated the GOCR task into dictionary-matching and machine-learning steps. By harvesting the surface names used in the training data, we greatly expanded the synonyms of GO concepts through their connections to the named concepts, and thereby enhanced the capability to recognize more GO concepts in the text. The source code is available at https://github.com/jeroyang/ncgocr.
RESULTS: Named concept gene ontology concept recognizer (NCGOCR) achieved 0.804 precision and 0.715 recall by correctly recognizing non-standard mentions of the GO concepts.
DISCUSSION: The lack of consensus on GO naming causes diversity in the GO mentions in biomedical manuscripts. The high performance is attributable to the stability of the composing GO concepts and the lack of variance in the spelling of named concepts.
CONCLUSION: NCGOCR reduced the arduous work of GO annotation and eased the search for biomarkers or drug targets, leading to improved biomarker development and greater success in precision medicine.
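
The dictionary-matching step described under MATERIALS AND METHODS can be illustrated with a minimal sketch. It is not the NCGOCR implementation (see the repository linked above for that): the concept IDs (`EX:0001` etc.), the template format, the function names, and the example sentence are all hypothetical, and the machine-learning step is omitted. The sketch only shows how one surface name harvested for a named concept can propagate as a synonym to every composite concept that contains it.

```python
import re
from itertools import product

# Hypothetical named concept with its conserved name as the only surface form.
named_concepts = {"apoptotic process": {"apoptotic process"}}

# Hypothetical composite concepts: a template plus the named concepts it uses.
# The IDs are placeholders, not real GO identifiers.
composite_concepts = {
    "EX:0001": ("{0}", ["apoptotic process"]),
    "EX:0002": ("regulation of {0}", ["apoptotic process"]),
    "EX:0003": ("negative regulation of {0}", ["apoptotic process"]),
}


def harvest(concepts, training_pairs):
    """Add surface names observed in training data to their named concept."""
    expanded = {name: set(forms) for name, forms in concepts.items()}
    for surface, name in training_pairs:
        expanded.setdefault(name, {name}).add(surface.lower())
    return expanded


def build_dictionary(composites, surfaces):
    """Generate every phrase obtained by filling templates with surface names."""
    dictionary = {}
    for concept_id, (template, parts) in composites.items():
        for combo in product(*(sorted(surfaces[p]) for p in parts)):
            dictionary[template.format(*combo)] = concept_id
    return dictionary


def match(text, dictionary):
    """Longest-first dictionary matching that skips overlapping spans."""
    lowered = text.lower()
    taken, hits = [], []
    for phrase in sorted(dictionary, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", lowered):
            start, end = m.start(), m.end()
            if all(end <= s or start >= e for s, e in taken):
                taken.append((start, end))
                hits.append((start, text[start:end], dictionary[phrase]))
    return sorted(hits)


# One harvested pair: "apoptosis" was annotated as "apoptotic process".
surfaces = harvest(named_concepts, [("apoptosis", "apoptotic process")])
dictionary = build_dictionary(composite_concepts, surfaces)
text = "Bcl-2 drives negative regulation of apoptosis, while p53 promotes apoptosis."
hits = match(text, dictionary)
for start, mention, concept_id in hits:
    print(start, mention, concept_id)
```

In this toy setting, the single harvested name "apoptosis" lets the matcher recognize "negative regulation of apoptosis" even though that exact phrase never appeared in the training pairs, which is the synonym-sharing effect the abstract describes.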

Year:  2018        PMID: 30376045      PMCID: PMC6204799          DOI: 10.1093/database/bay115

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451

