Ontology based text mining of gene-phenotype associations: application to candidate gene prediction.

Şenay Kafkas, Robert Hoehndorf.

Abstract

Gene-phenotype associations play an important role in understanding disease mechanisms, which is a requirement for treatment development. A portion of gene-phenotype associations are observed mainly experimentally and made publicly available through standard resources such as MGI. However, a vast number of gene-phenotype associations remain buried in the biomedical literature. Given the volume of this literature, automated text mining tools are needed to reduce the burden of manually curating gene-phenotype associations and to develop comprehensive resources. In this study, we present an ontology-based approach, combined with statistical methods, to text mine gene-phenotype associations from the literature. Our method achieved AUC values of 0.90 and 0.75 in recovering known gene-phenotype associations from HPO and MGI, respectively. We posit that candidate genes and their relevant diseases should be described with similar phenotypes in publications. Thus, we demonstrate the utility of our approach by predicting disease candidate genes based on the semantic similarity of the phenotypes associated with genes and diseases. To the best of our knowledge, this is the first study to use an ontology-based approach to extract gene-phenotype associations from the literature. We evaluated our disease candidate prediction model on the gene-disease associations from MGI. Our model achieved AUC values of 0.90 and 0.87 on the OMIM (human) and MGI (mouse) datasets of gene-disease associations, respectively. Our manual analysis of the text-mined data revealed that our method can accurately extract gene-phenotype associations that are not currently covered by existing public gene-phenotype resources. Overall, the results indicate that our method can precisely extract known as well as new gene-phenotype associations from the literature. All the data and methods are available at https://github.com/bio-ontology-research-group/genepheno.
© The Author(s) 2019. Published by Oxford University Press.
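
The candidate gene prediction idea in the abstract (rank genes by how similar their phenotypes are to a disease's phenotypes, then score the ranking with AUC) can be illustrated with a minimal sketch. The abstract does not say which semantic similarity measure the authors used, so this example substitutes a simple Jaccard index over ancestor-closed term sets as a stand-in. The toy hierarchy, all term and gene identifiers, and the function names are hypothetical and are not taken from the paper or its repository.

```python
# Toy is-a hierarchy (child -> parents). All term IDs are invented for
# illustration; a real run would load HPO or MP instead.
TOY_PARENTS = {
    "P:root": [],
    "P:heart": ["P:root"],
    "P:arrhythmia": ["P:heart"],
    "P:cardiomyopathy": ["P:heart"],
    "P:skeleton": ["P:root"],
    "P:short_stature": ["P:skeleton"],
}


def ancestors(term, parents=TOY_PARENTS):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def closure(terms, parents=TOY_PARENTS):
    """Expand a set of phenotype terms with every ancestor term."""
    expanded = set()
    for term in terms:
        expanded |= ancestors(term, parents)
    return expanded


def jaccard_similarity(terms_a, terms_b, parents=TOY_PARENTS):
    """Jaccard index of two ancestor-closed term sets (assumed stand-in measure)."""
    a, b = closure(terms_a, parents), closure(terms_b, parents)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_genes(disease_terms, gene_to_terms):
    """Sort genes by decreasing phenotype similarity to the disease."""
    scores = {
        gene: jaccard_similarity(terms, disease_terms)
        for gene, terms in gene_to_terms.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    disease = {"P:arrhythmia"}
    genes = {
        "geneA": {"P:cardiomyopathy"},
        "geneB": {"P:short_stature"},
        "geneC": {"P:arrhythmia"},
    }
    for gene, score in rank_genes(disease, genes):
        print(gene, round(score, 2))
```

In this toy setup, a gene annotated with a sibling phenotype of the disease term scores higher than a gene annotated in an unrelated branch, which is the intuition behind ranking candidates by phenotype similarity. An information-content-based measure over a real phenotype ontology would be a more realistic choice.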

Year:  2019        PMID: 30809638      PMCID: PMC6391585          DOI: 10.1093/database/baz019

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


  2 in total

1.  Text Mining Protocol to Retrieve Significant Drug-Gene Interactions from PubMed Abstracts.

Authors:  Oviya Ramalakshmi Iyyappan; Sharanya Manoharan; Sadhanha Anand; Dheepa Anand; Manonmani Alvin Jose; Raja Ravi Shanker
Journal:  Methods Mol Biol       Date:  2022

2.  Creation and evaluation of full-text literature-derived, feature-weighted disease models of genetically determined developmental disorders.

Authors:  T M Yates; A Lain; J Campbell; D R FitzPatrick; T I Simpson
Journal:  Database (Oxford)       Date:  2022-06-07       Impact factor: 4.462
