OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction.

Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf.

Abstract

MOTIVATION: Ontologies are widely used in biology for data annotation, integration and analysis. In addition to formally structured axioms, ontologies contain meta-data in the form of annotation axioms which provide valuable pieces of information that characterize ontology classes. Annotation axioms commonly used in ontologies include class labels, descriptions or synonyms. Despite being a rich source of semantic information, the ontology meta-data are generally unexploited by ontology-based analysis methods such as semantic similarity measures.
RESULTS: We propose a novel method, OPA2Vec, to generate vector representations of biological entities in ontologies by combining formal ontology axioms and annotation axioms from the ontology meta-data. We apply a Word2Vec model that has been pre-trained on a corpus of either abstracts or full-text articles to produce feature vectors from our collected data. We validate our method in two different ways. First, we use the obtained vector representations of proteins in a similarity measure to predict protein-protein interaction on two different datasets. Second, we evaluate our method on predicting gene-disease associations based on phenotype similarity by generating vector representations of genes and diseases using a phenotype ontology, and applying the obtained vectors to predict gene-disease associations using mouse model phenotypes. We demonstrate that OPA2Vec significantly outperforms existing methods for predicting gene-disease associations. Using evidence from mouse models, we apply OPA2Vec to identify candidate genes for several thousand rare and orphan diseases. OPA2Vec can be used to produce vector representations of any biomedical entity given any type of biomedical ontology.
AVAILABILITY AND IMPLEMENTATION: https://github.com/bio-ontology-research-group/opa2vec.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
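
The similarity-based prediction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here. The entity names and three-dimensional vectors are hypothetical toy values, not OPA2Vec output, and `cosine_similarity` and `rank_candidates` are illustrative helper names rather than functions from the OPA2Vec repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_id, embeddings):
    """Rank every other entity by its similarity to the query entity, most similar first."""
    query = embeddings[query_id]
    scores = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real vectors would come from an embedding method.
toy_embeddings = {
    "protein_A": [1.0, 0.0, 0.0],
    "protein_B": [0.9, 0.1, 0.0],
    "protein_C": [0.0, 1.0, 0.0],
}

ranking = rank_candidates("protein_A", toy_embeddings)
for entity_id, score in ranking:
    print(entity_id, round(score, 3))
```

In a setting like the one the abstract describes, the top-ranked entities for a query protein would be treated as its candidate interaction partners; the same ranking idea would apply to genes ranked against a disease vector.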

Year:  2019        PMID: 30407490     DOI: 10.1093/bioinformatics/bty933

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  Knowledge-Based Biomedical Data Science.

Authors:  Tiffany J Callahan; Ignacio J Tripodi; Harrison Pielke-Lombardo; Lawrence E Hunter
Journal:  Annu Rev Biomed Data Sci       Date:  2020-04-07

2.  Semantic similarity and machine learning with ontologies.

Authors:  Maxat Kulmanov; Fatima Zohra Smaili; Xin Gao; Robert Hoehndorf
Journal:  Brief Bioinform       Date:  2021-07-20       Impact factor: 11.622

3.  nhKcr: a new bioinformatics tool for predicting crotonylation sites on human nonhistone proteins based on deep learning.

Authors:  Yong-Zi Chen; Zhuo-Zhi Wang; Yanan Wang; Guoguang Ying; Zhen Chen; Jiangning Song
Journal:  Brief Bioinform       Date:  2021-11-05       Impact factor: 11.622

4.  PathoPhenoDB, linking human pathogens to their phenotypes in support of infectious disease research.

Authors:  Şenay Kafkas; Marwa Abdelhakim; Yasmeen Hashish; Maxat Kulmanov; Marwa Abdellatif; Paul N Schofield; Robert Hoehndorf
Journal:  Sci Data       Date:  2019-06-03       Impact factor: 6.444

5.  BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale.

Authors:  Qingyu Chen; Kyubum Lee; Shankai Yan; Sun Kim; Chih-Hsuan Wei; Zhiyong Lu
Journal:  PLoS Comput Biol       Date:  2020-04-23       Impact factor: 4.475

6.  Dimensional reduction of phenotypes from 53 000 mouse models reveals a diverse landscape of gene function.

Authors:  Tomasz Konopka; Letizia Vestito; Damian Smedley
Journal:  Bioinform Adv       Date:  2021-10-11

7.  Evaluating semantic similarity methods for comparison of text-derived phenotype profiles.

Authors:  Luke T Slater; Sophie Russell; Silver Makepeace; Alexander Carberry; Andreas Karwath; John A Williams; Hilary Fanning; Simon Ball; Robert Hoehndorf; Georgios V Gkoutos
Journal:  BMC Med Inform Decis Mak       Date:  2022-02-05       Impact factor: 2.796

8.  Formal axioms in biomedical ontologies improve analysis and interpretation of associated data.

Authors:  Fatima Zohra Smaili; Xin Gao; Robert Hoehndorf
Journal:  Bioinformatics       Date:  2020-04-01       Impact factor: 6.937

9.  Network-based protein-protein interaction prediction method maps perturbations of cancer interactome.

Authors:  Jiajun Qiu; Kui Chen; Chunlong Zhong; Sihao Zhu; Xiao Ma
Journal:  PLoS Genet       Date:  2021-11-02       Impact factor: 5.917

10.  DeepSVP: Integration of genotype and phenotype for structural variant prioritization using deep learning.

Authors:  Azza Althagafi; Lamia Alsubaie; Nagarajan Kathiresan; Katsuhiko Mineta; Taghrid Aloraini; Fuad Almutairi; Majid Alfadhel; Takashi Gojobori; Ahmad Alfares; Robert Hoehndorf
Journal:  Bioinformatics       Date:  2021-12-24       Impact factor: 6.937
