Investigating Correlation between Protein Sequence Similarity and Semantic Similarity Using Gene Ontology Annotations.

Najmul Ikram, Muhammad Abdul Qadir, Muhammad Tanvir Afzal.   

Abstract

Sequence similarity is a commonly used measure for comparing proteins. With the increasing use of ontologies, semantic (functional) similarity is gaining importance. The correlation between these two measures has been applied in the evaluation of new semantic similarity methods and in protein function prediction. In this research, we investigate the relationship between the two similarity measures. The results suggest the absence of a strong correlation between sequence and semantic similarity. There are many proteins with low sequence similarity and high semantic similarity. We observe that Pearson's correlation coefficient is not sufficient to explain the nature of this relationship. Interestingly, term semantic similarity values above 0 and below 1 do not seem to play a role in improving the correlation. That is, the correlation coefficient depends only on the number of common GO terms in the proteins under comparison, and the semantic similarity measurement method does not influence it. Semantic similarity and sequence similarity behave differently. These findings have significant implications for future work on protein comparison and will help to better understand the semantic similarity between proteins.
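
The kind of comparison the abstract describes can be illustrated with a minimal sketch. It computes Pearson's correlation coefficient between a list of sequence similarity scores and a list of semantic similarity scores, one score per protein pair. The abstract does not say which semantic similarity measures were evaluated, so the Jaccard overlap of GO term sets used here is only an assumed stand-in that depends on the number of common GO terms. The function names, the GO term labels, and the toy values are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def jaccard(terms_a, terms_b):
    """Share of GO terms common to both annotation sets (assumed stand-in measure)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy protein pairs: (sequence similarity, GO terms of protein 1, GO terms of protein 2).
# All values are made up for illustration only.
toy_pairs = [
    (0.90, {"GO:a", "GO:b"}, {"GO:a", "GO:b"}),
    (0.20, {"GO:a", "GO:b"}, {"GO:a", "GO:b", "GO:c"}),  # low sequence, high semantic
    (0.15, {"GO:a"}, {"GO:d"}),
    (0.60, {"GO:b", "GO:c"}, {"GO:c"}),
]

sequence_scores = [seq for seq, _, _ in toy_pairs]
semantic_scores = [jaccard(t1, t2) for _, t1, t2 in toy_pairs]

print("semantic similarity per pair:", semantic_scores)
print("Pearson r:", pearson(sequence_scores, semantic_scores))
```

A single coefficient summarizes only the linear trend across all pairs. A pair such as the second toy entry, with low sequence similarity and high semantic similarity, weakens that trend without being visible in the coefficient itself, which is one reason to inspect the score distribution as well.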

Year:  2017        PMID: 28436885     DOI: 10.1109/TCBB.2017.2695542

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  2 in total

1.  A Collection of Benchmark Data Sets for Knowledge Graph-based Similarity in the Biomedical Domain.

Authors:  Carlota Cardoso; Rita T Sousa; Sebastian Köhler; Catia Pesquita
Journal:  Database (Oxford)       Date:  2020-01-01       Impact factor: 3.451

Review 2.  Review of computational methods for virus-host protein interaction prediction: a case study on novel Ebola-human interactions.

Authors:  Anup Kumar Halder; Pritha Dutta; Mahantapas Kundu; Subhadip Basu; Mita Nasipuri
Journal:  Brief Funct Genomics       Date:  2018-11-26       Impact factor: 4.241
