Xiaoyang Jing1, Kai Wang2, Ruqian Lu1, Qiwen Dong3.
Abstract
Much progress has been made in protein structure prediction over the last few decades. Because predicted models span a broad range of accuracy, accurate quality estimation has become one of the key elements of successful protein structure prediction. In recent years, a number of methods have been developed to address this issue; they can be roughly divided into three categories: single-model methods, clustering-based methods and quasi single-model methods. In this study, we first develop a single-model method, MQAPRank, based on a learning-to-rank algorithm, and then implement a quasi single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. Five-fold cross-validation on the 3DRobot dataset shows that the proposed single-model method outperforms the other methods whose outputs are taken as its features, and that the quasi single-model method further improves performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in the corresponding categories. In particular, the Quasi-MQAPRank method achieves notably good performance on the CASP11 Best150 dataset.
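As a rough illustration of what "learning to rank" means for model quality assessment, the sketch below trains a linear scorer on pairs of models from the same target. The abstract does not say which learning-to-rank algorithm MQAPRank uses, so the pairwise perceptron here is a stand-in, and the names `pairwise_examples`, `train_ranker` and `score` are hypothetical. The feature vectors represent scores produced by other quality assessment methods, and the quality values represent a reference measure such as GDT_TS.

```python
import random

def pairwise_examples(features, quality):
    """Build difference vectors for one target (assumed pairwise formulation).

    Each returned vector is features[i] - features[j] for a pair in which
    model i has higher reference quality than model j.
    """
    pairs = []
    for i in range(len(features)):
        for j in range(len(features)):
            if quality[i] > quality[j]:
                pairs.append([a - b for a, b in zip(features[i], features[j])])
    return pairs

def train_ranker(targets, epochs=50, lr=0.1, seed=0):
    """Fit linear weights with perceptron updates on mis-ordered pairs.

    targets: list of (features, quality) tuples, one per protein target.
    Pairs are only formed within a target, never across targets.
    """
    rng = random.Random(seed)
    pairs = [p for feats, qual in targets for p in pairwise_examples(feats, qual)]
    w = [0.0] * len(pairs[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for diff in pairs:
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin <= 0:  # better model is not scored higher: adjust weights
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

def score(w, feature_vector):
    """Predicted quality score of a single model."""
    return sum(wi * xi for wi, xi in zip(w, feature_vector))

# Toy data: two targets, two features per model (values are made up).
toy_targets = [
    ([[0.9, 0.1], [0.5, 0.8], [0.2, 0.3]], [80.0, 60.0, 30.0]),
    ([[0.7, 0.9], [0.3, 0.2]], [70.0, 40.0]),
]
weights = train_ranker(toy_targets)
ranked = sorted(toy_targets[0][0], key=lambda f: score(weights, f), reverse=True)
```

Forming pairs only within a target reflects the fact that quality assessment ranks candidate models of the same protein; scores from different targets are not directly comparable in this formulation.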
Year: 2016 PMID: 27530967 PMCID: PMC4987638 DOI: 10.1038/srep31571
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The overall flowchart of the proposed methods.
Comparative results of the proposed methods and other methods on the 3DRobot dataset, based on GDT_TS score.
| Method | wmPMCC↑ | PMCC↑ | AUC↑ | Loss↓ | Top↑ |
|---|---|---|---|---|---|
| 140 | |||||
| 0.95 | 0.88 | 0.95 | 140 | ||
| ModFOLDclust2 | 0.95 | 0.90 | 7.51 | 13 | |
| DFIRE | 0.88 | 0.14 | 0.95 | 7.56 | 30 |
| DOPE | 0.89 | 0.66 | 0.95 | 4.45 | 72 |
| GOAP | 0.91 | 0.55 | 0.96 | 3.88 | 85 |
| RWplus | 0.87 | 0.13 | 0.95 | 7.20 | 32 |
| Frst | 0.86 | 0.78 | 0.94 | 3.11 | 109 |
| ProQ | 0.86 | 0.69 | 0.93 | 12.17 | 47 |
| RFMQA | 0.92 | 0.87 | 0.96 | 1.70 | |
| SIFT | 0.63 | 0.55 | 0.79 | 15.31 | 32 |
| SELECTpro | 0.79 | 0.60 | 0.92 | 17.69 | 8 |
| HRSC | 0.60 | 0.15 | 0.81 | 18.38 | 6 |
| Nonlinear-HRSC | 0.81 | 0.56 | 0.91 | 11.07 | 14 |
ModFOLDclust2 is a clustering method; the other compared methods are listed in the “feature extraction” section.
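The table columns are not defined in this excerpt. The sketch below shows how three of them are commonly computed in model quality assessment, under stated assumptions: PMCC is taken to be the Pearson correlation between predicted scores and GDT_TS, Loss the GDT_TS gap between the best available model and the model ranked first by the method, and Top the number of targets for which the method ranks the best model first. These readings are assumptions and are not confirmed by the text; wmPMCC and AUC are not covered. The function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def target_loss(pred, gdt):
    """GDT_TS of the best model minus GDT_TS of the top-scored model (assumed definition)."""
    top = max(range(len(pred)), key=lambda i: pred[i])
    return max(gdt) - gdt[top]

def evaluate(targets):
    """Average per-target metrics.

    targets: dict mapping a target name to (predicted_scores, gdt_ts_values).
    Returns mean PMCC, mean Loss and the count of targets with zero loss.
    """
    pmccs, losses = [], []
    for pred, gdt in targets.values():
        pmccs.append(pearson(pred, gdt))
        losses.append(target_loss(pred, gdt))
    return {
        "PMCC": sum(pmccs) / len(pmccs),
        "Loss": sum(losses) / len(losses),
        "Top": sum(1 for loss in losses if loss == 0),
    }

# Made-up scores for two hypothetical targets.
example = {
    "T_a": ([0.9, 0.5, 0.7], [80.0, 40.0, 60.0]),
    "T_b": ([0.2, 0.9, 0.5], [70.0, 50.0, 60.0]),
}
summary = evaluate(example)
```

Under these assumptions, a lower Loss and a higher Top both indicate that the method is good at selecting the best model, while PMCC measures how well the scores track quality across all models of a target.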
Figure 2. The ROC curves of the compared methods on the 3DRobot dataset, based on GDT_TS score.
ModFOLDclust2 is a clustering method; the other compared methods are listed in the “feature extraction” section.
Comparative results of the proposed methods and thirteen other methods from CASP11 on the CASP11 dataset, based on GDT_TS score.
| Category | Method | Best 150 | Select 20 | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| wmPMCC | PMCC | AUC | Loss | Top | wmPMCC | PMCC | AUC | Loss | Top | ||
| clustering-based | Pcons-net | 5.28 | 3 | 0.91 | 0.93 | ||||||
| Wallner | 0.70 | 0.86 | 0.94 | 5.32 | 53 | ||||||
| DAVIS-QAconsensus | 0.68 | 7.74 | 0 | 0.90 | 5.51 | 48 | |||||
| MULTICOM-REFINE | 0.68 | 7.62 | 0 | 0.90 | 0.92 | 5.20 | 50 | ||||
| ModFOLDclust2 | 0.66 | 7.28 | 0 | 0.86 | 5.36 | 47 | |||||
| MQAPmulti | 0.59 | 0.81 | 0.93 | 9.06 | 0 | 0.91 | 0.97 | 5.14 | 44 | ||
| quasi single-model | 0.77 | 0.91 | 0.97 | 8.29 | 37 | ||||||
| MQAPsingleA | 0.65 | 0.75 | 0.90 | 8.95 | 1 | 0.88 | 0.95 | 3.64 | 52 | ||
| MQAPsingle | 0.56 | 0.75 | 0.90 | 9.51 | 3 | 0.89 | 0.86 | 0.94 | 6.34 | 41 | |
| ModFOLD5_single | 0.53 | 0.92 | 0.96 | 10.31 | 0 | 0.91 | 3.65 | ||||
| nns | 0.54 | 0.89 | 0.95 | 7.75 | 0.83 | 0.91 | 0.97 | 52 | |||
| ConsMQAPsingle | 0.53 | 0.73 | 0.89 | 8.37 | 1 | 0.87 | 0.82 | 0.94 | 5.18 | 47 | |
| single-model | 0.75 | 0.90 | 5 | 0.64 | 0.65 | 0.77 | 8.29 | 37 | |||
| MULTICOM-CLUSTER | 0.43 | 7.06 | 7 | 0.92 | 9.47 | 34 | |||||
| VoroMQA | 0.43 | 0.55 | 0.80 | 7.31 | 7 | 0.60 | 0.61 | 0.83 | 10.76 | 31 | |
| MULTICOM-NOVEL | 0.41 | 0.69 | 0.89 | 6.89 | 0.69 | 0.73 | 0.91 | 9.08 | 37 | ||
| ProQ2 | 0.38 | 0.76 | 5 | 0.70 | 0.79 | ||||||
Figure 3. The ROC curves of the compared methods on the CASP11 dataset, based on GDT_TS score.
(a) The ROC curves for the Best150 dataset and (b) the corresponding AUCs for the Select20 dataset.