S Srivastava1, S B Lal2, D C Mishra2, U B Angadi2, K K Chaturvedi2, S N Rai3, A Rai4.
Abstract
BACKGROUND: Protein structure comparison plays an important role in the in silico functional prediction of a new protein. It is also used to understand the evolutionary relationships among proteins. A variety of methods have been proposed in the literature for comparing protein structures, but they have limitations in accuracy and in computational complexity with respect to time and space. There is a need to improve the computational complexity of protein comparison/alignment by incorporating important biological and structural properties into the existing techniques.
Keywords: Backbone atoms; Geodesic distance; Protein structure comparison; Side chain properties
Year: 2016 PMID: 27708689 PMCID: PMC5041553 DOI: 10.1186/s13015-016-0089-1
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Fig. 1 Flowchart of the algorithm
Fig. 2 Home page of ProtSComp after the user has logged in
Confusion matrix
| Group | Predicted class 1 | Predicted class 2 | … | Predicted class i | … | Predicted class n |
|---|---|---|---|---|---|---|
| True class 1 | M11 | M12 | … | M1i | … | M1n |
| True class 2 | M21 | M22 | … | M2i | … | M2n |
| : | : | : | … | : | … | : |
| True class i | Mi1 | Mi2 | … | Mii | … | Min |
| : | : | : | … | : | … | : |
| True class n | Mn1 | Mn2 | … | Mni | … | Mnn |
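The confusion matrix above can be turned into summary scores such as those reported in the performance table. The following is a minimal sketch, not the authors' implementation: it assumes that `M[i][j]` counts items of true class i assigned to predicted class j, that precision and recall are macro-averaged over classes, that the F-measure is the harmonic mean of those two averages, and that "RI" stands for the Rand index. The function names `macro_scores` and `rand_index` are hypothetical.

```python
from math import comb
from typing import List, Tuple


def macro_scores(m: List[List[int]]) -> Tuple[float, float, float]:
    """Macro-averaged precision, recall and F-measure (assumed definitions)."""
    n = len(m)
    precisions, recalls = [], []
    for i in range(n):
        tp = m[i][i]                                # items of class i predicted as class i
        row_total = sum(m[i])                       # all items truly in class i
        col_total = sum(m[k][i] for k in range(n))  # all items predicted as class i
        precisions.append(tp / col_total if col_total else 0.0)
        recalls.append(tp / row_total if row_total else 0.0)
    precision = sum(precisions) / n
    recall = sum(recalls) / n
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def rand_index(m: List[List[int]]) -> float:
    """Rand index computed from pair counts of the contingency table."""
    total = sum(sum(row) for row in m)
    all_pairs = comb(total, 2)
    # Pairs that share both the true class and the predicted class.
    same_both = sum(comb(cell, 2) for row in m for cell in row)
    # Pairs that share the true class / the predicted class.
    same_true = sum(comb(sum(row), 2) for row in m)
    same_pred = sum(comb(sum(col), 2) for col in zip(*m))
    # Pairs that differ in both the true class and the predicted class.
    diff_both = all_pairs - same_true - same_pred + same_both
    return (same_both + diff_both) / all_pairs if all_pairs else 1.0


if __name__ == "__main__":
    example = [[2, 0], [1, 1]]  # toy 2-class matrix, not data from the paper
    print(macro_scores(example))
    print(rand_index(example))
```

Macro-averaging weights every class equally regardless of its size; a micro-averaged or cluster-matched variant would give different numbers, so the definitions in the full paper should be checked before comparing against the reported values.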
Performance measures on the 100-protein dataset for the ESA, CE and jFATCAT methods at class level, with computational time
| Method | Time (hours) for N×N comparison | Measure | Spectral K-means | K-means | Fuzzy C-means |
|---|---|---|---|---|---|
| CE | 126.18 | Precision | 0.9600 | 0.8622 | 0.7141 |
| | | Recall | 0.9333 | 0.7573 | 0.9792 |
| | | F-measure | 0.9465 | 0.8064 | 0.8259 |
| | | RI | 0.9694 | 0.9538 | 0.9226 |
| jFATCAT | 19.14 | Precision | 0.6653 | 0.4929 | 0.5058 |
| | | Recall | 0.6043 | 0.5019 | 0.6741 |
| | | F-measure | 0.6333 | 0.4974 | 0.5780 |
| | | RI | 0.8554 | 0.8430 | 0.8154 |
| Original ESA | 20.40 | Precision | 0.8396 | 0.5075 | 0.4812 |
| | | Recall | 0.7563 | 0.7744 | 0.6347 |
| | | F-measure | 0.7957 | 0.6132 | 0.5474 |
| | | RI | 0.9420 | 0.8248 | 0.8032 |
| ESA-MC-BB | 2.20 | Precision | 0.7767 | 0.5523 | 0.5710 |
| | | Recall | 0.9275 | 0.6277 | 0.5232 |
| | | F-measure | 0.8454 | 0.5876 | 0.5461 |
| | | RI | 0.9359 | 0.8440 | 0.8338 |
| ESA-MC-BB + HP | 2.20 | Precision | 0.9168 | 0.5058 | 0.5699 |
| | | Recall | 0.8400 | 0.7925 | 0.5307 |
| | | F-measure | 0.8767 | 0.6175 | 0.5496 |
| | | RI | 0.9557 | 0.8298 | 0.8369 |
| ESA-MC-BB + POL | 2.20 | Precision | 0.8974 | 0.5416 | 0.5576 |
| | | Recall | 0.8165 | 0.6000 | 0.5088 |
| | | F-measure | 0.8551 | 0.5693 | 0.5321 |
| | | RI | 0.9444 | 0.8159 | 0.8322 |
| ESA-CA | 2.20 | Precision | 0.8572 | 0.5075 | 0.5322 |
| | | Recall | 0.7621 | 0.7744 | 0.4800 |
| | | F-measure | 0.8069 | 0.6132 | 0.5048 |
| | | RI | 0.9364 | 0.8961 | 0.8234 |
| ESA-CA + HP | 2.20 | Precision | 0.8495 | 0.7588 | 0.5576 |
| | | Recall | 0.7525 | 0.6997 | 0.5088 |
| | | F-measure | 0.7981 | 0.7281 | 0.5321 |
| | | RI | 0.9411 | 0.9020 | 0.8322 |
| ESA-CA + POL | 2.20 | Precision | 0.8572 | 0.5058 | 0.5205 |
| | | Recall | 0.7621 | 0.7925 | 0.4672 |
| | | F-measure | 0.8069 | 0.6175 | 0.4924 |
| | | RI | 0.9297 | 0.8388 | 0.8194 |
Computational time (in seconds) required to compare two protein structures using different methods
| Method | ~100 residues | ~200 residues | ~300 residues |
|---|---|---|---|
| Matt | 1.300 | 3.000 | 5.100 |
| MUSTANG | 0.160 | 2.300 | 2.100 |
| ESA | 1.200 | 2.600 | 15.000 |
| Proposed method (ESA-MC-BB) | 0.740 | 1.040 | 1.540 |
| Proposed method (ESA-CA) | 0.556 | 0.745 | 1.466 |
Fig. 3 Upload file on ProtSComp server
Fig. 4 Provision for various parameter selections and options such as model, chain and auxiliary information
Fig. 5 Presentation of the final result as geodesic distance in text (left) and graphical (right) form