| Literature DB >> 28371781 |
Pritha Dutta, Subhadip Basu, Mahantapas Kundu.
Abstract
The semantic similarity between two interacting proteins can be estimated by combining the similarity scores of the GO terms associated with the proteins. Greater number of similar GO annotations between two proteins indicates greater interaction affinity. Existing semantic similarity measures make use of the GO graph structure, the information content of GO terms, or a combination of both. In this paper, we present a hybrid approach which utilizes both the topological features of the GO graph and information contents of the GO terms. More specifically, we 1) consider a fuzzy clustering of the GO graph based on the level of association of the GO terms, 2) estimate the GO term memberships to each cluster center based on the respective shortest path lengths, and 3) assign weightage to GO term pairs on the basis of their dissimilarity with respect to the cluster centers. We test the performance of our semantic similarity measure against seven other previously published similarity measures using benchmark protein-protein interaction datasets of Homo sapiens and Saccharomyces cerevisiae based on sequence similarity, Pfam similarity, area under ROC curve, and measure.Entities:
Mesh:
Substances:
Year: 2017 PMID: 28371781 DOI: 10.1109/TCBB.2017.2689762
Source DB: PubMed Journal: IEEE/ACM Trans Comput Biol Bioinform ISSN: 1545-5963 Impact factor: 3.710