
GO2Vec: transforming GO terms and proteins to vector representations via graph embeddings.

Xiaoshi Zhong, Rama Kaalia, Jagath C Rajapakse.

Abstract

BACKGROUND: Semantic similarity between Gene Ontology (GO) terms is a fundamental measure for many bioinformatics applications, such as determining functional similarity between genes or proteins. Most previous research exploited information content to estimate the semantic similarity between GO terms; more recently, some research has exploited word embeddings to learn vector representations for GO terms from a large-scale corpus. In this paper, we propose a novel method, named GO2Vec, that exploits graph embeddings to learn vector representations for GO terms from the GO graph. GO2Vec combines the information from both the GO graph and GO annotations, and its learned vectors can be applied to a variety of bioinformatics applications, such as calculating functional similarity between proteins and predicting protein-protein interactions.
RESULTS: We conducted two kinds of experiments to evaluate the quality of GO2Vec: (1) functional similarity between proteins on the Collaborative Evaluation of GO-based Semantic Similarity Measures (CESSM) dataset and (2) prediction of protein-protein interactions on the Yeast and Human datasets from the STRING database. Experimental results demonstrate that GO2Vec is more effective than the information content-based measures and the word embedding-based measures.
CONCLUSION: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GO and GOA graphs. Our results also demonstrate that GO annotations provide useful information for computing the similarity between GO terms and between proteins.
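
ILLUSTRATION: The abstract does not say how GO term vectors are combined into a protein-level score, so the following is only a minimal sketch of one way learned vectors could be used for functional similarity between proteins. It assumes cosine similarity between term vectors and a best-match-average aggregation over each protein's annotated terms; both choices, the function names, the toy two-dimensional vectors, and the placeholder term and protein identifiers are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def best_match_average(terms_a, terms_b, vectors):
    """Assumed aggregation: for each term of one protein, take its best match
    among the other protein's terms, average those scores, then average the
    two directions so the result is symmetric."""
    if not terms_a or not terms_b:
        return 0.0

    def directed(src, dst):
        best = [max(cosine(vectors[s], vectors[t]) for t in dst) for s in src]
        return sum(best) / len(src)

    return 0.5 * (directed(terms_a, terms_b) + directed(terms_b, terms_a))

# Toy stand-ins for learned embeddings; real vectors would come from a graph
# embedding model trained on the GO and GO annotation graphs.
term_vectors = {
    "TERM_A": [1.0, 0.0],
    "TERM_B": [0.0, 1.0],
    "TERM_C": [1.0, 1.0],
}

# Hypothetical protein-to-term annotations.
protein_annotations = {
    "P1": ["TERM_A", "TERM_C"],
    "P2": ["TERM_A", "TERM_C"],
    "P3": ["TERM_B"],
}

def protein_similarity(p, q):
    """Functional similarity between two proteins under the assumptions above."""
    return best_match_average(protein_annotations[p], protein_annotations[q], term_vectors)

if __name__ == "__main__":
    for pair in [("P1", "P2"), ("P1", "P3")]:
        print(pair, round(protein_similarity(*pair), 4))
```

Proteins with identical annotation sets score close to 1 under this scheme, while proteins whose terms point in different directions of the embedding space score lower. Other aggregations (for example, maximum or plain average over all term pairs) would fit the same interface.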

Keywords: CESSM evaluation; Gene ontology; Graph embeddings; Protein-protein interaction prediction; Vector representations

Year: 2019. PMID: 31874639. DOI: 10.1186/s12864-019-6272-2

Source DB: PubMed. Journal: BMC Genomics. ISSN: 1471-2164. Impact factor: 3.969


