Shiyang Long1, Pu Tian2.
Abstract
Rapid and accurate assessment of protein structural models is essential for protein structure prediction and design. Great progress has been made in this regard, especially through the recent application of "knowledge-based" potentials. Various machine-learning-based methods for protein structural model quality assessment have also been quite successful. However, traditional "physics-based" models have not performed as well. Based on our analysis of the fundamental computational limitation behind the unsatisfactory performance of "physics-based" models, we propose a generalized solvation free energy (GSFE) framework, which is intrinsically flexible for multi-scale treatments and amenable to machine learning implementation. Finally, we implemented a simple example of backbone-based, residue-level GSFE with a neural network, which showed competitive performance against the latest, highly complex "knowledge-based" atomic potentials in distinguishing native structures from decoys. This journal is © The Royal Society of Chemistry.
Year: 2019 PMID: 35540566 PMCID: PMC9074945 DOI: 10.1039/c9ra05168f
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 4.036
Fig. 1 Schematic illustration of the GSFE framework. Here a residue in a protein molecule is considered to be solvated by its neighboring residues.
Fig. 2 Schematic representation of the vector organization for neural network input features.
Fig. 3 Loss value and accuracy for the training and validation datasets.
Performance comparison in native structure recognition. Model1 through Model4 are trained with different input feature sets (1 through 4), as described in the text. The number of proteins whose native structure is given the lowest energy score by the potential (for our method, the largest score) is listed outside the parentheses. The average Z-scores of native structures are listed in parentheses. The Z-score is defined as (〈Edecoy〉 − Enative)/δ (for our method, (Enative − 〈Edecoy〉)/δ), where Enative is the energy score of the native structure, and 〈Edecoy〉 and δ are, respectively, the average and the standard deviation of the energy scores of all decoys in the set.
| Decoy sets | CASP5-8 | CASP10-13 | I-TASSER | 3DRobot |
|---|---|---|---|---|
| No. of targets | 143 (2759) | 175 (13 474) | 56 (24 707) | 200 (60 200) |
| Model1 | 99 (1.35) | 80 (1.13) | 12 (1.54) | 66 (1.85) |
| Model2 | 114 (1.52) | 96 (1.20) | 28 (2.32) | 120 (2.09) |
| Model3 | 140 (1.83) | 132 (1.75) | 43 (3.95) | 200 (3.33) |
| Model4 | 140 (1.76) | 135 (1.83) | 48 (4.21) | 200 (3.43) |
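
The Z-score and the native-recognition criterion defined in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `z_score` and `native_is_top`, the `higher_is_better` flag, and the toy scores are all assumptions, and the population standard deviation is used only because the caption does not say which estimator was applied.

```python
from statistics import mean, pstdev


def z_score(native, decoys, higher_is_better=False):
    """Z-score of the native structure relative to its decoy set.

    Energy-like potentials (lower is better): (<E_decoy> - E_native) / delta
    Score-like methods (higher is better):    (E_native - <E_decoy>) / delta
    """
    delta = pstdev(decoys)  # assumption: population standard deviation
    avg = mean(decoys)
    diff = (native - avg) if higher_is_better else (avg - native)
    return diff / delta


def native_is_top(native, decoys, higher_is_better=False):
    """True if the native structure is scored better than every decoy."""
    if higher_is_better:
        return native > max(decoys)
    return native < min(decoys)


# Toy data (hypothetical, not taken from the paper).
decoy_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
native_score = 6.0
z = z_score(native_score, decoy_scores, higher_is_better=True)
recognized = native_is_top(native_score, decoy_scores, higher_is_better=True)
```

Under these assumptions, the count outside the parentheses in each table cell would be the number of targets for which `native_is_top` holds, and the value in parentheses would be the mean of `z_score` over all targets in the decoy set.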
Performance comparison in native structure recognition. The number of proteins whose native structure is given the lowest energy score by the potential (for our method, the largest score) is listed outside the parentheses. The average Z-scores of native structures are listed in parentheses. The Z-score is defined as (〈Edecoy〉 − Enative)/δ (for our method, (Enative − 〈Edecoy〉)/δ), where Enative is the energy score of the native structure, and 〈Edecoy〉 and δ are, respectively, the average and the standard deviation of the energy scores of all decoys in the set.
| Decoy sets | CASP5-8 | CASP10-13 | I-TASSER | 3DRobot |
|---|---|---|---|---|
| No. of targets | 143 (2759) | 175 (13 474) | 56 (24 707) | 200 (60 200) |
| Dfire | 64 (0.61) | 56 (0.72) | 43 (2.80) | 1 (0.83) |
| RW | 65 (1.01) | 36 (0.86) | | 0 (−0.30) |
| GOAP | 106 (1.67) | 89 (1.62) | 45 (4.98) | 94 (1.85) |
| DOOP | 135 (1.96) | 121 (1.99) | 52 (6.18) | 197 (3.53) |
| ITDA | 71 (1.15) | 117 (1.67) | 52 (4.98) | 196 (3.83) |
| VoroMQA | 132 (2.00) | 111 (1.77) | 48 (5.11) | 114 (1.89) |
| SBROD | 88 (1.62) | 119 ( | 33 (3.25) | 49 (1.76) |
| AngularQA | 59 (1.26) | 24 (1.11) | 29 (1.82) | 9 (0.99) |
| ANDIS | 138 ( | 129 ( | 47 ( | |
| GSFE | | | 48 (4.21) | |
CASP13 results: rank of the native structure among the decoy structures
| Structure id | Rank (Z-score) | Structure id | Rank (Z-score) |
|---|---|---|---|
| T0950-D1 | 1/39 (2.65) | T0966-D1 | 1/87 (1.18) |
| T0953s1-D1 | 4/90 (1.92) | T0968s1-D1 | 1/94 (1.42) |
| T0954-D1 | 1/87 (1.33) | T0968s2-D1 | 2/95 (1.07) |
| T0955-D1 | 18/92 (0.68) | T1003-D1 | 8/89 (1.13) |
| T0957s1-D1 | 2/92 (1.01) | T1005-D1 | 1/83 (1.23) |
| T0957s2-D1 | 1/91 (1.60) | T1008-D1 | 25/91 (0.84) |
| T0958-D1 | 4/90 (1.22) | T1009-D1 | 1/85 (1.04) |
| T0960-D1 | 22/84 (0.71) | T1011-D1 | 1/82 (1.33) |
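
A rank entry such as "3/5" can be computed along the following lines. This is a hedged sketch rather than the authors' procedure: the function name `native_rank` and the toy scores are hypothetical, and it assumes that a larger score is better (as stated for the GSFE method), that the native structure is counted in the total, and that ties are resolved in favor of the native structure. The source does not state these conventions.

```python
def native_rank(native, decoys, higher_is_better=True):
    """Return (rank, total), where rank 1 is the best-scored structure.

    Assumptions: the total includes the native structure itself, and a
    decoy only outranks the native if its score is strictly better.
    """
    if higher_is_better:
        better = sum(1 for d in decoys if d > native)
    else:
        better = sum(1 for d in decoys if d < native)
    return better + 1, len(decoys) + 1


# Toy data (hypothetical, not taken from the paper).
rank, total = native_rank(0.7, [0.9, 0.5, 0.8, 0.2])
label = f"{rank}/{total}"
```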