Md Hossain Shuvo, Sutanu Bhattacharya, Debswapna Bhattacharya.
Abstract
MOTIVATION: Protein model quality estimation, in many ways, informs protein structure prediction. Despite their tight coupling, existing model quality estimation methods do not leverage inter-residue distance information or the latest technological breakthrough in deep learning that has recently revolutionized protein structure prediction.
Year: 2020 PMID: 32657397 PMCID: PMC7355297 DOI: 10.1093/bioinformatics/btaa455
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Flowchart of QDeep. (A) Multiple sequence alignment generation. (B) Collection of distance-based, sequence versus structure consistency-based, and ROSETTA centroid energy term-based features. (C) Architecture of stacked deep ResNet classifiers at 1, 2, 4, and 8 Å error thresholds. (D) Residue-level ensemble error classifications and their combination for model quality estimation.
Fig. 2. Accuracy of the individual residue-level classifiers at 1, 2, 4, and 8 Å error thresholds on the validation set of 82 CASP11 targets.
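Panel (D) of Fig. 1 describes combining the residue-level classifications from the four thresholds into a model quality estimate. The sketch below illustrates one plausible way to do that, by analogy with how GDT-TS averages over several distance cutoffs. It is an assumption, not the rule stated in this record: it supposes each classifier emits, for every residue, a likelihood that the residue's error is below its threshold, and it takes a plain average. The names `residue_scores` and `global_score` are hypothetical.

```python
from typing import Dict, List

# Error thresholds in angstroms, as listed in Fig. 1 (C).
THRESHOLDS = (1.0, 2.0, 4.0, 8.0)


def residue_scores(likelihoods: Dict[float, List[float]]) -> List[float]:
    """Average the four per-threshold likelihoods for each residue.

    `likelihoods[t][i]` is assumed to be the predicted likelihood that
    residue i has an error below threshold t (hypothetical input layout).
    """
    n_residues = len(likelihoods[THRESHOLDS[0]])
    return [
        sum(likelihoods[t][i] for t in THRESHOLDS) / len(THRESHOLDS)
        for i in range(n_residues)
    ]


def global_score(likelihoods: Dict[float, List[float]]) -> float:
    """Average the residue-level scores into one score for the whole model."""
    per_residue = residue_scores(likelihoods)
    return sum(per_residue) / len(per_residue)


# Toy two-residue model with made-up likelihoods.
example = {
    1.0: [0.2, 0.4],
    2.0: [0.4, 0.6],
    4.0: [0.6, 0.8],
    8.0: [0.8, 1.0],
}
example_residue_scores = residue_scores(example)
example_global_score = global_score(example)
```

Under this averaging assumption, a residue that passes only the loosest thresholds still contributes partial credit, which is the same intuition behind multi-cutoff scores such as GDT-TS.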
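To make the notion of classifier accuracy at an error threshold concrete, the following minimal sketch labels residues by whether their error falls within a threshold and scores binary predictions against those labels. The inclusive comparison (`<=`), the 0.5 decision cutoff, and the function names are assumptions for illustration; the record does not specify them.

```python
from typing import List


def threshold_labels(errors: List[float], threshold: float) -> List[int]:
    """Label a residue 1 if its error (in angstroms) is within the threshold."""
    return [1 if e <= threshold else 0 for e in errors]


def accuracy(predicted: List[int], actual: List[int]) -> float:
    """Fraction of residues whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy data: true per-residue errors and made-up classifier likelihoods.
errors = [0.5, 1.5, 3.0, 9.0]
likelihoods_2a = [0.9, 0.4, 0.2, 0.1]

true_labels_2a = threshold_labels(errors, 2.0)
predicted_labels_2a = [1 if p >= 0.5 else 0 for p in likelihoods_2a]
accuracy_2a = accuracy(predicted_labels_2a, true_labels_2a)
```

Repeating the same computation at 1, 2, 4, and 8 Å would give one accuracy value per classifier, the kind of quantity Fig. 2 reports.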
Performance of single-model quality estimation methods on CASP12 and CASP13 stage 2 datasets, sorted in decreasing order of average per-target Pearson correlations
| Dataset | Method | Avg. Pearson | Avg. Spearman | Avg. Kendall’s Tau | Avg. loss | Global Pearson | Global Spearman | Global Kendall’s Tau |
|---|---|---|---|---|---|---|---|---|
| CASP12 (stage 2) | QDeep | **0.740** | **0.657** | **0.492** | **0.051** | | | |
| | ProQ3D | 0.688 | 0.631 | 0.467 | 0.086 | 0.851 | 0.847 | 0.660 |
| | 3DCNN | 0.661 | 0.585 | 0.427 | 0.081 | 0.834 | 0.818 | 0.620 |
| | ProQ2 | 0.624 | 0.556 | 0.404 | 0.091 | 0.784 | 0.770 | 0.577 |
| | ProQ3 | 0.604 | 0.536 | 0.390 | 0.071 | 0.806 | 0.793 | 0.600 |
| | VoroMQA | 0.560 | 0.502 | 0.362 | 0.105 | 0.604 | 0.603 | 0.444 |
| CASP13 (stage 2) | QDeep | **0.752** | **0.692** | **0.512** | 0.088 | | | |
| | ProQ4 | 0.733 | 0.667 | 0.507 | 0.089 | 0.667 | 0.642 | 0.491 |
| | MESHI | 0.713 | 0.663 | 0.492 | | 0.833 | 0.845 | 0.659 |
| | ProQ3D | 0.671 | 0.619 | 0.457 | 0.084 | 0.849 | 0.811 | 0.626 |
| | VoroMQA-A | 0.665 | 0.606 | 0.442 | 0.092 | 0.769 | 0.767 | 0.574 |
| | VoroMQA-B | 0.651 | 0.592 | 0.429 | 0.072 | 0.754 | 0.750 | 0.554 |
Note: Values in bold represent the best performance.
Per-target average Pearson, Spearman, and Kendall’s Tau correlation with respect to true GDT-TS score.
Per-target average loss with respect to true GDT-TS score.
Global Pearson, Spearman, and Kendall’s Tau correlation with respect to true GDT-TS score.
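The three kinds of columns defined in these notes differ in how models are grouped before scoring. The sketch below shows the distinction for Pearson correlation and loss, using pure Python and toy numbers. It rests on assumptions: "loss" is taken to be the true GDT-TS of the best available model minus the true GDT-TS of the model ranked first by the predicted score, which is the usual convention in CASP assessments but is not spelled out in this record, and all names and data are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple

# Each model is a (predicted score, true GDT-TS) pair; models are grouped by target.
Models = List[Tuple[float, float]]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def target_loss(models: Models) -> float:
    """Best true GDT-TS minus the true GDT-TS of the top-ranked model (assumed definition)."""
    best_true = max(true for _, true in models)
    top_ranked_true = max(models, key=lambda m: m[0])[1]
    return best_true - top_ranked_true


def evaluate(targets: Dict[str, Models]) -> Dict[str, float]:
    # Per-target metrics: compute within each target, then average over targets.
    per_target_r = [
        pearson([p for p, _ in m], [t for _, t in m]) for m in targets.values()
    ]
    losses = [target_loss(m) for m in targets.values()]
    # Global metric: pool every model from every target, then correlate once.
    pooled = [pair for m in targets.values() for pair in m]
    return {
        "avg_pearson": sum(per_target_r) / len(per_target_r),
        "avg_loss": sum(losses) / len(losses),
        "global_pearson": pearson([p for p, _ in pooled], [t for _, t in pooled]),
    }


# Toy data: two targets with three models each (made-up values).
toy_targets = {
    "T1": [(0.1, 0.2), (0.5, 0.6), (0.9, 0.8)],
    "T2": [(0.3, 0.5), (0.9, 0.4), (0.8, 0.7)],
}
toy_results = evaluate(toy_targets)
```

Per-target averages reward ranking models correctly within each target, whereas the global correlation also rewards placing scores for different targets on a common scale. Spearman and Kendall's Tau would follow the same grouping with a rank-based correlation in place of `pearson`.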
Fig. 3. The ability of single-model quality estimation methods to distinguish good and bad models in (A) CASP12 and (B) CASP13 stage 2 datasets. A cutoff of GDT-TS = 0.4 is used to separate good and bad models.
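One common way to quantify how well a scoring method separates good from bad models is the area under the ROC curve (AUC). The sketch below computes it with the GDT-TS = 0.4 cutoff from the caption. Treating this as the measure behind Fig. 3, counting models strictly above the cutoff as good, and the function name are all assumptions for illustration.

```python
from typing import List


def auc_good_vs_bad(
    predicted: List[float], true_gdt_ts: List[float], cutoff: float = 0.4
) -> float:
    """Probability that a randomly chosen good model outscores a randomly chosen bad one.

    A model is labeled good when its true GDT-TS exceeds `cutoff`
    (the strict inequality is an assumption). Ties count as half a win.
    """
    good = [p for p, t in zip(predicted, true_gdt_ts) if t > cutoff]
    bad = [p for p, t in zip(predicted, true_gdt_ts) if t <= cutoff]
    if not good or not bad:
        raise ValueError("need at least one good and one bad model")
    wins = sum(
        1.0 if g > b else 0.5 if g == b else 0.0 for g in good for b in bad
    )
    return wins / (len(good) * len(bad))


# Toy data: true GDT-TS values and two made-up sets of predicted scores.
true_values = [0.7, 0.5, 0.2, 0.3]
perfect_auc = auc_good_vs_bad([0.9, 0.8, 0.3, 0.5], true_values)
mixed_auc = auc_good_vs_bad([0.2, 0.8, 0.5, 0.5], true_values)
```

An AUC of 1.0 means every good model is scored above every bad one, while 0.5 is what random scoring would achieve.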
Performance comparison of deep ResNet models used in QDeep with other deep learning architectures on CASP12 and CASP13 stage 2 datasets
| Architecture | CASP12 stage 2 Avg. Pearson | CASP12 stage 2 Avg. Spearman | CASP12 stage 2 Avg. Kendall’s Tau | CASP12 stage 2 Avg. loss | CASP13 stage 2 Avg. Pearson | CASP13 stage 2 Avg. Spearman | CASP13 stage 2 Avg. Kendall’s Tau | CASP13 stage 2 Avg. loss |
|---|---|---|---|---|---|---|---|---|
| ResNet | | | | | | | | |
| LSTM | 0.716 | 0.596 | 0.452 | 0.059 | 0.735 | 0.668 | 0.500 | 0.116 |
| CNN | 0.657 | 0.581 | 0.433 | 0.097 | 0.735 | 0.660 | 0.487 | 0.116 |
Note: Values in bold represent the best performance.
Per-target average Pearson, Spearman, and Kendall’s Tau correlation with respect to true GDT-TS score.
Per-target average loss with respect to true GDT-TS score.
Performance comparison of variants of QDeep on CASP12 and CASP13 stage 2 datasets
| Variant | CASP12 stage 2 Avg. Pearson | CASP12 stage 2 Avg. Spearman | CASP12 stage 2 Avg. Kendall’s Tau | CASP12 stage 2 Avg. loss | CASP13 stage 2 Avg. Pearson | CASP13 stage 2 Avg. Spearman | CASP13 stage 2 Avg. Kendall’s Tau | CASP13 stage 2 Avg. loss |
|---|---|---|---|---|---|---|---|---|
| QDeep | 0.740 | 0.657 | 0.492 | 0.051 | 0.752 | 0.692 | 0.512 | 0.088 |
| QDeepDeepMSA | 0.741 | 0.667 | 0.505 | 0.062 | 0.777 | 0.720 | 0.538 | 0.084 |
| QDeepNoDistance | 0.677 | 0.601 | 0.442 | 0.065 | 0.668 | 0.613 | 0.445 | 0.091 |
Note: QDeepDeepMSA: QDeep classifiers retrained using features generated by integrating deep MSA. QDeepNoDistance: QDeep classifiers retrained without using any distance-based features, but utilizing deep MSA.
Per-target average Pearson, Spearman, and Kendall’s Tau correlation with respect to true GDT-TS score.
Per-target average loss with respect to true GDT-TS score.