Wentao Dai, Tingrui Song, Xuan Wang, Xiaoyang Jin, Lizong Deng, Aiping Wu, Taijiao Jiang.
Abstract
Many template-based modeling (TBM) methods have been developed in recent years for protein structure prediction and for studying the structure-function relationships of proteins. One major problem that all TBM algorithms face, however, is their unsatisfactory performance on low-homology proteins. To improve the performance of TBM methods for such targets, we developed a novel model evaluation method named MEFTop. The method focuses on evaluating model topology using two new groups of features: secondary structure element (SSE) contact information and 3-dimensional topology information. By combining the MEFTop algorithm with FR-t5, a threading program developed by our group, we obtained a modified TBM program, named FR-t5-M, that exhibited significant improvements in predictive ability for low-homology protein targets. We further showed that MEFTop could serve as a general method for improving threading programs on low-homology protein targets. The software (FR-t5-M and MEFTop) is available to non-commercial users at our website: http://jianglab.ibp.ac.cn/lims/FRt5M/FRt5M.html.
Year: 2014 PMID: 24587135 PMCID: PMC3935967 DOI: 10.1371/journal.pone.0089935
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Comparison of the performance of MEFTop and FR-t5 in the dawn region of the SCOP1.75–500 set.
(A) The percentage of native-like Top1 models (Top1%) selected by MEFTop using P-score and by FR-t5 using Z-score. The X-axis is the Z-score cutoff and the Y-axis is the Top1%. The performances of Z-score and P-score are shown as white and black columns, respectively. (B) The TM-scores of Top1 models selected according to Z-score and P-score for the 63 targets with optimal Z-score <5.0. The X- and Y-coordinates of each point represent the TM-scores of the Top1 models selected by Z-score and P-score, respectively.
Testing results for the contribution of structural features to MEFTop in the dawn region on SCOP1.75–500 set.
| Feature | Top1%, Z-score <6.0* | Top1%, Z-score <5.0* | Top1%, Z-score <4.0* |
| T | 53.6 | 36.5 | 23.9 |
| SSE | 38.2 | 27.0 | 15.2 |
| Topology | 43.6 | 28.6 | 23.9 |
| T+SSE | 56.4 | 41.2 | 28.3 |
| T+Topology | 58.2 | 44.4 | 30.4 |
| SSE+Topology | 44.5 | 30.2 | 23.9 |
| All (T+SSE+Topology) | 62.7 | 42.9 | 30.4 |
*Targets with optimal Z-score less than this cutoff value (6.0, 5.0, or 4.0). On the SCOP1.75–500 set, the numbers of targets are 110 (Z-score <6.0), 63 (Z-score <5.0), and 46 (Z-score <4.0).
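The Top1% values per Z-score cutoff can be illustrated with a minimal Python sketch. It assumes each target is summarized by its optimal Z-score and the TM-score of its Top1 model, and that "native-like" means the TM-score reaches some threshold. That threshold is not stated in this text, so `native_like_tm` is a hypothetical parameter, and the data below are toy values rather than results from the paper.

```python
def top1_percent(targets, z_cutoff, native_like_tm):
    """Percentage of targets below the Z-score cutoff whose Top1 model is native-like.

    targets: iterable of (best_z, top1_tm) pairs, one per target
    z_cutoff: keep only targets whose optimal Z-score is below this value
    native_like_tm: hypothetical TM-score threshold for "native-like" (an assumption)
    """
    selected = [tm for best_z, tm in targets if best_z < z_cutoff]
    if not selected:
        raise ValueError("no targets below the Z-score cutoff")
    hits = sum(1 for tm in selected if tm >= native_like_tm)
    return 100.0 * hits / len(selected)


# Toy data, not taken from the paper: (optimal Z-score, TM-score of Top1 model).
toy_targets = [(3.5, 0.6), (4.5, 0.3), (5.5, 0.7), (7.0, 0.9)]
for cutoff in (6.0, 5.0, 4.0):
    print(cutoff, round(top1_percent(toy_targets, cutoff, native_like_tm=0.5), 1))
```

Because the denominator shrinks as the cutoff tightens (110, 63, and 46 targets in the table above), the percentages at different cutoffs are computed over different subsets and are not directly comparable as counts.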
Improvements of FR-t5-M over FR-t5 in the dawn region on SCOP1.75–500 set.
| Metrics | Average | Sum | CC ±σ | Top1% (<6.0) | Top1% (<5.0) | Top1% (<4.0) |
| Z-score | 11.48 | 66.41 | 0.472±0.382 | 64.5 | 41.3 | 26.1 |
| M-score | 9.14 | 69.08 | 0.556±0.344 | 69.1 | 50.8 | 37.0 |
Average is the average rank according to TM-score (over 110 decoy sets) in the absence of native structures.
Sum is the sum of TM-scores for Top1 models in the dawn region.
CC ±σ is the average and standard deviation of the Pearson correlation coefficients between predicted score and TM-score for every target in the dawn region.
Top1% is reported for targets whose best Z-score is less than the cutoff. On the SCOP1.75–500 set, the numbers of targets are 110 (Z-score <6.0), 63 (Z-score <5.0), and 46 (Z-score <4.0), respectively.
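As a rough illustration of how the Average, Sum, and CC ±σ columns could be computed, the sketch below takes, for each target, the predicted scores and TM-scores of its decoy models. Several details are assumptions rather than statements from the source: "rank" is taken to be the TM-score rank of the model with the highest predicted score, higher predicted scores are taken to be better, and the standard deviation is the population form. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def top1_index(pred):
    """Index of the model with the highest predicted score (assumed better)."""
    return max(range(len(pred)), key=pred.__getitem__)


def rank_of_top1(pred, tm):
    """1-based rank, by TM-score, of the selected Top1 model within its decoy set."""
    chosen = tm[top1_index(pred)]
    return 1 + sum(1 for t in tm if t > chosen)


def summarize(targets):
    """targets: list of (predicted_scores, tm_scores) pairs, one pair per target."""
    ranks = [rank_of_top1(p, t) for p, t in targets]
    top1_tm = [t[top1_index(p)] for p, t in targets]
    ccs = [pearson(p, t) for p, t in targets]
    return {
        "average_rank": mean(ranks),
        "sum_tm": sum(top1_tm),
        "cc_mean": mean(ccs),
        "cc_std": pstdev(ccs),  # population standard deviation (an assumption)
    }


# Toy decoy sets, not taken from the paper.
toy = [([1, 2, 3], [0.3, 0.5, 0.7]), ([3, 2, 1], [0.2, 0.4, 0.6])]
print(summarize(toy))
```

A lower average rank and a higher sum of TM-scores both indicate that the scoring function picks better models, which is the direction of the difference between the Z-score and M-score rows above.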
Figure 2. The TM-scores of Top1 models selected according to Z-score and M-score for all targets with optimal Z-score <6.0 on the SCOP1.75–500 set.
The X- and Y-coordinates of each point represent the TM-scores of the Top1 models selected according to Z-score and M-score, respectively. Low-homology proteins (marked by triangles) had high-quality Top1 models with FR-t5-M (M-score) but not with FR-t5 (Z-score).
Figure 3. Four representative targets with different Top1 models selected by FR-t5-M (M-score) and FR-t5 (Z-score).
For d1b33n_ (A), d2bl8c1 (B), d2rdeb1 (C), and d1sgka1 (D), the native structure (red), the Top1 model selected by FR-t5-M using M-score (green), and the Top1 model selected by FR-t5 using Z-score (cyan) are shown. The TM-scores of the Top1 models against the native structures are presented. 3D structure models were produced with PyMOL (http://www.pymol.org/).
Improvements of FR-t5-M over FR-t5 for low-homology targets on SCOP1.75–500 set.
| Metrics | Ave-Rank (Seq-40%) | Top1% (Seq-40%) | Ave-Rank (Seq-30%) | Top1% (Seq-30%) |
| Z-score | 17.72 | 42.4 | 17.80 | 32.0 |
| M-score | 13.49 | 52.5 | 16.32 | 40.0 |
Seq-40% and Seq-30% refer to the sequence identity.
Ave-Rank is the average rank according to TM-score in the absence of native structures.
The Top1% is the fraction of native-like Top1 models for all targets.
Performances of FR-t5-M and FR-t5 on CASP10 set.
| Metrics | Ave-Rank (All) | Ave-TM (All) | Ave-Rank (Dawn region) | Ave-TM (Dawn region) | Ave-Rank (High-confidence) | Ave-TM (High-confidence) |
| Z-score | 10.46 | 0.564 | 14.15 | 0.449 | 2.79 | 0.803 |
| M-score | 9.00 | 0.570 | 12.08 | 0.458 | 2.58 | 0.802 |
Dawn region: 57 proteins whose optimal Z-score is <6.0.
High-confidence: 46 proteins whose optimal Z-score is ≥6.0.
Ave-Rank is the average rank according to TM-score in the absence of native structures.
Ave-TM is the average of TM-scores for Top1 models.
Improvements of RaptorX-M over RaptorX in the dawn region on SCOP1.75–500 set.
| Method | Top1% | Sum | CC ±σ | Average |
| RaptorX | 76.0 | 69.95 | 0.572±0.324 | 13.88 |
| RaptorX-M | 78.8 | 71.4 | 0.581±0.320 | 11.81 |
The Top1% is the fraction of native-like Top1 models for the 104 targets in the dawn region whose optimal Z-score (FR-t5) is less than 6.0; 6 targets for which RaptorX could not produce complete models were removed.
Sum is the sum of TM-scores for Top1 models in the dawn region.
CC ±σ is the average and standard deviation of the Pearson correlation coefficients between predicted score and TM-score for every target in the dawn region.
Average is the average rank according to TM-score (over 104 decoy sets, after removing the 6 targets for which RaptorX could not produce complete models) in the absence of native structures.
Improvements of SPARKS-X-M over SPARKS-X in the dawn region on SCOP1.75–500 set.
| Method | Top1% | Sum | CC ±σ | Average |
| SPARKS-X | 70.9 | 68.64 | 0.518±0.351 | 13.05 |
| SPARKS-X-M | 73.7 | 69.93 | 0.587±0.301 | 9.77 |
The Top1% is the fraction of native-like Top1 models for the 110 targets in the dawn region whose optimal Z-score (FR-t5) is less than 6.0.
Sum is the sum of TM-scores for Top1 models in the dawn region.
CC ±σ is the average and standard deviation of the Pearson correlation coefficients between predicted score and TM-score for every target in the dawn region.
Average is the average rank according to TM-score (over 110 decoy sets) in the absence of native structures.
Improvements of RaptorX-M and SPARKS-X-M for low-homology targets on SCOP1.75–500 set.
| Method | Top1% (Seq-40%) | Ave-Rank (Seq-40%) | Top1% (Seq-30%) | Ave-Rank (Seq-30%) |
| RaptorX | 63.2 | 20.53 | 54.2 | 25.38 |
| RaptorX-M | 66.7 | 17.07 | 62.5 | 19.38 |
| SPARKS-X | 47.5 | 15.31 | 32.0 | 20.64 |
| SPARKS-X-M | 49.2 | 10.85 | 40.0 | 11.52 |
Seq-40% and Seq-30% refer to the sequence identity.
The Top1% is the fraction of native-like Top1 models for all targets.
Ave-Rank is the average rank according to TM-score in the absence of native structures.
Note: Ave-Rank is only compared between a pair of methods (RaptorX/RaptorX-M and SPARKS-X/SPARKS-X-M).
Figure 4. The overview of MEFTop.
(A) The cartoon representation of two contacts between two pairs of SSEs (beta strands). (B) The radius of gyration of the model structure, one of the topology features. (C) Hydrophobic core and local conformation potential based on residue fragments. A schematic representation of the backbone atoms (N, CA, C, O) and the side-chain center of mass is shown. (D) The SVM predictor. It takes four groups of input features: traditional sequence (1D) and contact map (2D) features, and two groups of newly introduced structural features, namely SSE contact features and topology features.
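The radius of gyration mentioned in panel (B) can be sketched in a few lines of Python. This is the standard unweighted definition (root-mean-square distance of the points from their centroid). Which atoms MEFTop uses and whether it applies mass weighting are not stated here, so the choice of one point per residue (for example, CA atoms) and the function name are assumptions for illustration only.

```python
from math import sqrt


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of 3D points.

    coords: list of (x, y, z) tuples, e.g. one CA atom per residue (an assumption)
    """
    n = len(coords)
    if n == 0:
        raise ValueError("at least one coordinate is required")
    # Centroid of the point set.
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    # Mean squared distance from the centroid.
    msd = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords) / n
    return sqrt(msd)


# Toy coordinates: four points at unit distance from the origin.
square = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
print(radius_of_gyration(square))
```

A compact, globular model gives a smaller value than an extended one of the same length, which is why this quantity is a plausible coarse descriptor of overall topology.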