Stewart E Moughon, Ram Samudrala.
Abstract
BACKGROUND: Successful protein structure prediction requires accurate low-resolution scoring functions so that protein main chain conformations that are close to the native structure can be identified. Once that is accomplished, a more detailed and time-consuming treatment to produce all-atom models can be undertaken. The earliest low-resolution scoring used simple distance-based "contact potentials," but more recently, the relative orientations of interacting amino acids have been taken into account to improve performance.
Year: 2011 PMID: 21920038 PMCID: PMC3184297 DOI: 10.1186/1471-2105-12-368
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The "LoCo" local coordinate system. Shown is a single valine residue and its local coordinate system, seen looking down the positive x-axis. The center of the Cα atom is defined to be the origin. The positive y-axis passes through the main chain N coordinates. The positive x-axis is placed such that the main chain C coordinates fall within the xy plane in the positive x direction. The coordinate system is right-handed. Both stick and sphere representations are presented for clarity.
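The frame construction described in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `loco_frame` and `to_local` are hypothetical, and the z-axis is taken as x × y so that the frame is right-handed.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(v):
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)

def loco_frame(n_atom, ca_atom, c_atom):
    """Return (x_axis, y_axis, z_axis) unit vectors for one residue.

    Hypothetical helper: origin at C-alpha, +y through N, +x chosen so
    that C lies in the xy plane with a positive x coordinate.
    Assumes N, C-alpha and C are not collinear.
    """
    y_axis = unit(sub(n_atom, ca_atom))
    c_vec = sub(c_atom, ca_atom)
    along_y = dot(c_vec, y_axis)
    # Remove the y component of the C-alpha -> C vector; what remains is +x.
    x_axis = unit(tuple(c - along_y * y for c, y in zip(c_vec, y_axis)))
    z_axis = cross(x_axis, y_axis)  # right-handed: z = x cross y
    return x_axis, y_axis, z_axis

def to_local(point, ca_atom, frame):
    """Express a Cartesian point in the residue's local frame."""
    rel = sub(point, ca_atom)
    return tuple(dot(rel, axis) for axis in frame)
```

With such a frame, the Cα atom of any neighboring residue can be passed to `to_local` to obtain its position relative to the central residue.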
Figure 2. A single LoCo interaction. An interaction within a LoCo coordinate system is illustrated. Bins shown are approximately 3× actual size for clarity. Bin boundaries are counted from the origin. A single valine is placed as shown in Figure 1. The Cα atom of an interacting residue is displayed separately in green within bin +3, +5, +4.
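One way to read "bin boundaries are counted from the origin" is that bins are signed and numbered outward from zero, so there is no bin 0. The sketch below follows that reading; it is an assumption, as are the function names and the 2.0 Å default bin width, which is a placeholder rather than a value taken from this section.

```python
import math

def signed_bin(coord, bin_width):
    """Map one local coordinate to a signed bin index (..., -2, -1, +1, +2, ...)."""
    index = math.floor(abs(coord) / bin_width) + 1
    return index if coord >= 0 else -index

def loco_bin(local_xyz, bin_width=2.0):
    """Return the (x, y, z) bin triple for a point in local coordinates.

    bin_width is a hypothetical parameter in angstroms.
    """
    return tuple(signed_bin(c, bin_width) for c in local_xyz)
```

Under these assumptions, a neighboring Cα at local coordinates (5.0, 9.1, 7.9) falls in bin (+3, +5, +4), the same bin triple shown in the figure.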
Training vs. testing groups: native recognition test
| | Total sets | # of Natives | # RMSD < 2Å | # RMSD < 5Å | Ranknat | Cα RMSDbest | Znat | CCnat | FEnat (%) |
|---|---|---|---|---|---|---|---|---|---|
| Training group | 154 | 57 | 112 | 136 | 47.1 | 2.10 | 1.587 | 0.478 | 32.9 |
| Testing group | 77 | 38 | 60 | 70 | 13.4 | 1.62 | 1.805 | 0.519 | 36.6 |
Comparison of LoCo performance for native structure recognition on both the training and testing groups of decoy sets is shown. Results are roughly comparable, though the testing group did somewhat better across the board. Natives, RMSD < 2Å and RMSD < 5Å refer to the number of times the best-scoring structure in a particular group is either the native structure (0Å Cα RMSD) or within 2Å or 5Å Cα RMSD from the native structure, respectively. All other measures are averages over every decoy set in the group. Definitions of measures used are provided in the Performance Measures subsection at the end of Methods. In summary, lower scores are better for Ranknat and RMSDbest; higher scores are better for all the other measures.
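The three count columns can be tallied as in the following sketch. It assumes each decoy set is a list of (score, Cα RMSD) pairs with the native at 0 Å, and that a lower score means a better structure; that sign convention and the function names are assumptions, not taken from this section.

```python
def top_pick_rmsd(decoy_set):
    """Return the C-alpha RMSD of the best-scoring structure in one decoy set.

    decoy_set: list of (score, ca_rmsd) pairs; the native has ca_rmsd == 0.0.
    Assumption: lower score means better.
    """
    return min(decoy_set, key=lambda pair: pair[0])[1]

def count_top_picks(decoy_sets):
    """Count sets whose top pick is the native, within 2 A, or within 5 A."""
    counts = {"natives": 0, "rmsd_lt_2": 0, "rmsd_lt_5": 0}
    for decoy_set in decoy_sets:
        rmsd = top_pick_rmsd(decoy_set)
        if rmsd == 0.0:
            counts["natives"] += 1
        if rmsd < 2.0:
            counts["rmsd_lt_2"] += 1
        if rmsd < 5.0:
            counts["rmsd_lt_5"] += 1
    return counts
```

The counts are cumulative: a set whose top pick is the native also counts toward both RMSD thresholds, which is consistent with the increasing values across the three columns.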
Training vs. testing groups: decoy discrimination test
| | Total sets | RB1 | RB10 | RMSDdecoy | Zdecoy | CCdecoy | FEdecoy (%) | log(PB1) | log(PB10) |
|---|---|---|---|---|---|---|---|---|---|
| Training group | 154 | 172.0 | 39.2 | 3.75 | 0.829 | 0.461 | 29.9 | -0.773 | -1.491 |
| Testing group | 77 | 154.8 | 5.6 | 3.51 | 0.938 | 0.505 | 31.4 | -0.864 | -1.640 |
Comparison of LoCo performance for decoy discrimination on both the training and testing groups of decoy sets is shown. Again, results are roughly comparable, though the testing group did somewhat better overall. All measures are averages over every decoy set in the group. Definitions of measures used are provided in the Performance Measures subsection at the end of Methods. In summary, lower scores are better for RB1, RB10, RMSDdecoy, log(PB1) and log(PB10); higher scores are better for Zdecoy, CCdecoy and FEdecoy.
Function comparison: native recognition
| Ranknat | RMSDbest | Znat | CCnat | FEnat (%) | |
|---|---|---|---|---|---|
| 13.4 | 1.62 | 1.805 | 0.519 | 36.6 | |
| 6.7 | 1.17 | 2.630 | 0.562 | 38.3 | |
| 19.3 | 2.68 | 1.508 | 0.464 | 31.3 | |
| 44.0 | 2.39 | 1.288 | 0.491 | 33.8 | |
| 81.8 | 4.87 | 0.621 | 0.334 | 20.4 | |
| 56.3 | 4.67 | 0.797 | 0.311 | 18.9 | |
| 87.5 | 4.87 | 0.353 | 0.257 | 13.0 | |
| 54.5 | 3.54 | 0.774 | 0.397 | 24.7 | |
| 45.8 | 3.85 | 0.744 | 0.390 | 23.2 | |
| 28.5 | 5.42 | 0.229 | 0.235 | 12.3 | |
| 31.4 | 3.37 | 0.602 | 0.383 | 24.8 | |
| 124.5 | 3.79 | 0.014 | 0.336 | 20.4 | |
| 101.3 | 3.20 | 0.324 | 0.377 | 23.0 | |
| 50.6 | 4.69 | 0.401 | 0.244 | 15.7 | |
| 52.1 | 3.63 | 0.733 | 0.410 | 26.3 | |
| 57.8 | 3.33 | 0.246 | 0.353 | 23.0 | |
| 54.0 | 4.94 | 0.419 | 0.234 | 13.3 | |
| 54.2 | 5.77 | 0.119 | 0.159 | 7.5 | |
| 37.4 | 4.72 | 0.749 | 0.296 | 20.4 | |
| 31.6 | 4.35 | 0.723 | 0.275 | 19.2 | |
| 28.8 | 3.12 | 0.513 | 0.365 | 24.3 | |
| 248.3 | 5.67 | 0.287 | 0.248 | 17.6 | |
| 34.1 | 4.16 | 0.756 | 0.369 | 21.4 | |
| 33.1 | 4.35 | 0.664 | 0.352 | 20.5 | |
| 30.3 | 4.11 | 0.652 | 0.363 | 21.8 | |
| 47.7 | 3.81 | 0.739 | 0.399 | 24.2 | |
| 80.0 | 4.03 | 0.740 | 0.370 | 23.5 | |
| 54.2 | 4.50 | 0.646 | 0.331 | 17.2 | |
| 66.1 | 3.13 | 0.234 | 0.355 | 24.3 | |
| 73.7 | 5.09 | 0.478 | 0.290 | 17.5 |
Native structure recognition performance comparison among scoring functions. All reported measures are averages over the 77 decoy sets in the final testing group. Lower scores are better for Ranknat and RMSDbest. Higher ones are better for Znat, CCnat and FEnat. LoCo outperforms all other functions except DFMAC in every measure. All metrics are defined in Performance measures at the end of Methods.
Function comparison: decoy discrimination
| RB1 | RB10 | RMSDdecoy | Zdecoy | CCdecoy | FEdecoy (%) | log(PB1) | log(PB10) | |
|---|---|---|---|---|---|---|---|---|
| 154.8 | 5.6 | 3.51 | 0.938 | 0.505 | 31.4 | -0.864 | -1.640 | |
| 108.9 | 13.8 | 3.64 | 1.024 | 0.533 | 31.6 | -0.825 | -1.586 | |
| 172.8 | 52.5 | 4.11 | 0.914 | 0.457 | 28.4 | -0.761 | -1.524 | |
| 118.2 | 24.8 | 3.82 | 0.931 | 0.493 | 32.3 | -0.755 | -1.650 | |
| 124.8 | 36.8 | 5.01 | 0.539 | 0.328 | 19.6 | -0.488 | -1.317 | |
| 186.3 | 52.2 | 4.97 | 0.436 | 0.312 | 17.1 | -0.482 | -1.241 | |
| 192.4 | 36.6 | 5.27 | 0.377 | 0.267 | 14.9 | -0.536 | -1.249 | |
| 139.4 | 32.3 | 3.98 | 0.673 | 0.398 | 25.3 | -0.671 | -1.412 | |
| 175.4 | 39.0 | 4.32 | 0.636 | 0.391 | 24.3 | -0.558 | -1.406 | |
| 164.6 | 71.3 | 5.57 | 0.187 | 0.231 | 13.0 | -0.400 | -1.166 | |
| 161.3 | 31.7 | 3.86 | 0.725 | 0.396 | 27.3 | -0.650 | -1.434 | |
| 183.1 | 34.9 | 4.54 | 0.596 | 0.336 | 22.5 | -0.594 | -1.302 | |
| 149.5 | 28.4 | 3.93 | 0.695 | 0.407 | 25.1 | -0.640 | -1.361 | |
| 172.4 | 25.3 | 4.92 | 0.270 | 0.241 | 15.8 | -0.476 | -1.241 | |
| 152.2 | 26.5 | 4.00 | 0.641 | 0.416 | 25.7 | -0.615 | -1.417 | |
| 129.5 | 29.0 | 4.15 | 0.693 | 0.375 | 25.6 | -0.650 | -1.363 | |
| 191.1 | 67.1 | 5.09 | 0.217 | 0.236 | 13.4 | -0.496 | -1.155 | |
| 140.1 | 65.4 | 5.77 | -0.001 | 0.165 | 7.6 | -0.359 | -1.043 | |
| 176.9 | 32.0 | 5.02 | 0.357 | 0.286 | 17.2 | -0.480 | -1.323 | |
| 189.6 | 53.8 | 5.02 | 0.356 | 0.268 | 16.9 | -0.479 | -1.224 | |
| 163.9 | 31.4 | 3.81 | 0.721 | 0.377 | 25.0 | -0.622 | -1.381 | |
| 231.3 | 44.3 | 5.83 | 0.197 | 0.240 | 15.5 | -0.396 | -1.194 | |
| 146.4 | 41.1 | 4.45 | 0.504 | 0.362 | 20.4 | -0.586 | -1.309 | |
| 90.7 | 37.1 | 4.46 | 0.469 | 0.352 | 20.8 | -0.593 | -1.328 | |
| 144.3 | 27.9 | 4.34 | 0.573 | 0.363 | 21.8 | -0.587 | -1.395 | |
| 156.4 | 30.6 | 4.17 | 0.724 | 0.416 | 26.3 | -0.604 | -1.418 | |
| 141.3 | 64.4 | 4.26 | 0.540 | 0.363 | 22.0 | -0.569 | -1.402 | |
| 163.3 | 62.1 | 4.74 | 0.523 | 0.338 | 18.5 | -0.522 | -1.237 | |
| 125.5 | 28.7 | 4.04 | 0.697 | 0.381 | 26.3 | -0.656 | -1.358 | |
| 149.8 | 48.4 | 5.22 | 0.504 | 0.292 | 17.3 | -0.435 | -1.284 |
Comparison of decoy discrimination performance among all tested functions is shown. All reported measures are averages over the 77 decoy sets in the final testing group. Lower scores are better for RB1, RB10, RMSDdecoy, log(PB1) and log(PB10). Higher scores are better for Zdecoy, CCdecoy and FEdecoy. LoCo outperforms all other functions in RB10, RMSDdecoy, and log(PB1). It is slightly higher than ProSa 2003 in log(PB10) and slightly lower than ProSa 2003 in FEdecoy. Zdecoy, CCdecoy and FEdecoy for LoCo are all slightly lower than for DFMAC. LoCo outperforms the remaining 27 functions in every measure except RB1. All metrics are defined in Performance measures at the end of Methods.
LoCo vs. all-atom potentials: native recognition
| Ranknat | RMSDbest | Znat | CCnat | FEnat (%) | |
|---|---|---|---|---|---|
| 13.4 | 1.62 | 1.805 | 0.519 | 36.6 | |
| 30.2 | 2.54 | 1.367 | 0.474 | 33.2 | |
| 21.2 | 1.89 | 2.019 | 0.556 | 37.3 | |
| 37.5 | 2.69 | 1.525 | 0.482 | 34.5 | |
| 18.6 | 1.59 | 2.055 | 0.526 | 39.7 |
LoCo native recognition performance is compared to that of four widely-used all-atom potentials. All reported measures are averages over the 77 decoy sets in the final testing group. LoCo performance is comparable to the others, placing 1st, 2nd, 3rd, 3rd and 3rd in Ranknat, RMSDbest, Znat, CCnat and FEnat, respectively. Taking the sum of all rankings among these five potentials, LoCo places 3rd overall. All metrics are defined in Performance measures at the end of Methods.
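The per-measure placings and the rank sum quoted in this caption can be reproduced from the table values. In the sketch below the first row is LoCo; the labels "all-atom 1" through "all-atom 4" are placeholders for the remaining rows, in table order, because their names are not given in this section.

```python
# Columns: Ranknat, RMSDbest, Znat, CCnat, FEnat (%), copied from the table.
ROWS = {
    "LoCo":       (13.4, 1.62, 1.805, 0.519, 36.6),
    "all-atom 1": (30.2, 2.54, 1.367, 0.474, 33.2),  # placeholder label
    "all-atom 2": (21.2, 1.89, 2.019, 0.556, 37.3),  # placeholder label
    "all-atom 3": (37.5, 2.69, 1.525, 0.482, 34.5),  # placeholder label
    "all-atom 4": (18.6, 1.59, 2.055, 0.526, 39.7),  # placeholder label
}
# Lower is better for Ranknat and RMSDbest; higher is better for the rest.
LOWER_IS_BETTER = (True, True, False, False, False)

def rank_sums(rows, lower_is_better):
    """Return per-measure placings and their sum for every function."""
    names = list(rows)
    placings = {name: [] for name in names}
    for col, lower in enumerate(lower_is_better):
        ordered = sorted(names, key=lambda n: rows[n][col], reverse=not lower)
        for place, name in enumerate(ordered, start=1):
            placings[name].append(place)
    totals = {name: sum(p) for name, p in placings.items()}
    return placings, totals

placings, totals = rank_sums(ROWS, LOWER_IS_BETTER)
overall = sorted(totals, key=totals.get)
```

Here `placings["LoCo"]` is `[1, 2, 3, 3, 3]`, and `overall.index("LoCo") + 1` gives LoCo's overall position when the functions are ordered by rank sum. The sketch does not handle ties, which do not occur in this table.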
LoCo vs. all-atom potentials: decoy discrimination
| RB1 | RB10 | RMSDdecoy | Zdecoy | CCdecoy | FEdecoy (%) | log(PB1) | log(PB10) | |
|---|---|---|---|---|---|---|---|---|
| 154.8 | 5.6 | 3.51 | 0.938 | 0.505 | 31.4 | -0.864 | -1.640 | |
| 152.2 | 27.7 | 4.02 | 0.878 | 0.479 | 30.8 | -0.818 | -1.604 | |
| 136.5 | 18.9 | 3.75 | 1.014 | 0.536 | 33.3 | -0.896 | -1.592 | |
| 97.4 | 52.0 | 4.21 | 0.764 | 0.466 | 25.9 | -0.717 | -1.409 | |
| 122.5 | 56.9 | 3.45 | 0.896 | 0.493 | 33.1 | -0.881 | -1.526 |
Decoy discrimination performance for LoCo is compared to that of four widely-used all-atom potentials. All reported measures are averages over the 77 decoy sets in the final testing group. LoCo outperforms all others at RB10 and log(PB10). LoCo is beaten by all four at RB1. LoCo places 2nd among all potentials at RMSDdecoy, Zdecoy and CCdecoy. It places 3rd in FEdecoy and log(PB1). When the sum of all rankings among these five potentials is considered, LoCo places 2nd overall. All metrics are defined in Performance measures at the end of Methods.
Figure 3. Statistical significance of differences in rank distributions. Cα RMSD rank distributions for the best-scoring non-native structures for all functions are compared. P-values show the likelihood that better rank distributions for the function on the left are the result of chance. P-values less than 0.05 have been colored in red, showing statistically significant differences in these distributions. These ranks are among decoy structures only. The null hypothesis of this one-tailed Wilcoxon test is that neither distribution is lower than the other. The alternative hypothesis is that functions on the left achieved lower ranks for their best-scoring decoys than functions along the top.
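A one-tailed comparison of this kind can be sketched as follows. The caption does not say which Wilcoxon variant was used; this sketch assumes the paired signed-rank form, since both functions are scored on the same decoy sets, and uses the normal approximation without tie or continuity corrections. The function name is hypothetical.

```python
import math

def wilcoxon_signed_rank_less(left, right):
    """One-sided Wilcoxon signed-rank test (normal approximation).

    Alternative hypothesis: paired values in `left` tend to be lower than
    those in `right`. Returns (W_plus, p_value). Zero differences are dropped.
    """
    diffs = [a - b for a, b in zip(left, right) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0  # no non-zero differences, so no evidence
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)
    return w_plus, p_value
```

A small p-value then supports the alternative that the function on the left achieved lower ranks for its best-scoring decoys. For small numbers of decoy sets an exact test would be preferable to the normal approximation.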
LoCo variation: native recognition
| Ranknat | RMSDbest | Znat | CCnat | FEnat (%) | |
|---|---|---|---|---|---|
| 12.0 | 1.51 | 1.870 | 0.529 | 38.4 | |
| 17.5 | 3.09 | 1.445 | 0.403 | 30.0 | |
| 13.9 | 2.36 | 1.659 | 0.496 | 34.5 | |
| 13.4 | 1.62 | 1.805 | 0.519 | 36.6 | |
| 6.7 | 1.17 | 2.630 | 0.562 | 38.3 | |
| 19.3 | 2.68 | 1.508 | 0.464 | 31.3 | |
| 44.0 | 2.39 | 1.288 | 0.491 | 33.8 | |
| 28.5 | 3.12 | 0.797 | 0.410 | 26.3 | |
| 248.3 | 5.77 | 0.014 | 0.159 | 7.5 | |
| 63.3 | 4.27 | 0.521 | 0.324 | 19.9 |
Best, worst and average performance for LoCo across all 84 parameter sets tested is compared with the chosen LoCo parameter set, the three best-performing of the other potentials, and the best, worst and average performance of all 26 remaining potentials from the Jernigan Lab server. All best, worst, and average values are for each individual performance measure; no single set contained all those values. All reported measures are averages over the 77 decoy sets in the final testing group. All metrics are defined in Performance measures at the end of Methods. In summary, lower scores are better for Ranknat and RMSDbest. Higher ones are better for Znat, CCnat and FEnat. The average performance across all 84 versions of LoCo surpassed that of every other function except DFMAC. Even at its worst, performance for LoCo exceeded that of all Jernigan server functions for every measure except CCnat.
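Because best and worst are taken separately for each measure, the "best" row need not correspond to any single parameter set. The sketch below illustrates this with made-up numbers for three hypothetical parameter sets; neither the values nor the function name come from the paper.

```python
from statistics import mean

def per_measure_summary(param_sets, lower_is_better):
    """Summarize each measure independently across parameter sets.

    param_sets: list of dicts mapping measure name -> value.
    lower_is_better: dict mapping measure name -> bool.
    """
    summary = {}
    for measure, lower in lower_is_better.items():
        values = [ps[measure] for ps in param_sets]
        summary[measure] = {
            "best": min(values) if lower else max(values),
            "worst": max(values) if lower else min(values),
            "average": mean(values),
        }
    return summary

# Illustrative values only.
toy_sets = [
    {"Ranknat": 12.0, "Znat": 1.5},
    {"Ranknat": 17.5, "Znat": 1.9},
    {"Ranknat": 14.0, "Znat": 1.7},
]
toy_summary = per_measure_summary(toy_sets, {"Ranknat": True, "Znat": False})
```

In this toy case the best Ranknat comes from the first set and the best Znat from the second, so the "best" row mixes values from different parameter sets.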
LoCo variation: decoy discrimination
| RB1 | RB10 | RMSDdecoy | Zdecoy | CCdecoy | FEdecoy (%) | log(PB1) | log(PB10) | |
|---|---|---|---|---|---|---|---|---|
| 25.5 | 5.6 | 3.01 | 1.005 | 0.517 | 33.0 | -0.982 | -1.654 | |
| 175.0 | 53.0 | 4.09 | 0.748 | 0.374 | 24.0 | -0.665 | -1.432 | |
| 112.5 | 26.5 | 3.68 | 0.909 | 0.481 | 29.9 | -0.778 | -1.562 | |
| 154.8 | 5.6 | 3.51 | 0.938 | 0.505 | 31.4 | -0.864 | -1.640 | |
| 108.9 | 13.8 | 3.64 | 1.024 | 0.533 | 31.6 | -0.825 | -1.586 | |
| 172.8 | 52.5 | 4.11 | 0.914 | 0.457 | 28.4 | -0.761 | -1.524 | |
| 118.2 | 24.8 | 3.82 | 0.931 | 0.493 | 32.3 | -0.755 | -1.650 | |
| 90.7 | 25.3 | 3.81 | 0.725 | 0.416 | 27.3 | -0.671 | -1.434 | |
| 231.3 | 71.3 | 5.83 | -0.001 | 0.165 | 7.6 | -0.359 | -1.043 | |
| 159.3 | 41.5 | 4.64 | 0.494 | 0.328 | 20.2 | -0.544 | -1.306 |
Best, worst and average performance for LoCo across all 84 parameter sets tested is compared with the chosen LoCo parameter set, the three best-performing of the other potentials, and the best, worst and average performance of all 26 remaining potentials from the Jernigan Lab server. All best, worst, and average values are for each individual performance measure; no single set contained all those values. All reported measures are averages over the 77 decoy sets in the final testing group. Lower scores are better for RB1, RB10, RMSDdecoy, log(PB1) and log(PB10). Higher scores are better for Zdecoy, CCdecoy and FEdecoy. The average performance for LoCo among all 84 parameter sets exceeds all other functions except DFMAC in RMSDdecoy and log(PB1). The LoCo average betters all other functions except DFMAC and ProSa 2003 in log(PB10). All metrics are defined in Performance measures at the end of Methods.
Omega angles and native recognition
| Ranknat | RMSDbest | Znat | CCnat | FEnat (%) | |
|---|---|---|---|---|---|
| 13.4 | 1.62 | 1.805 | 0.519 | 36.6 | |
| 12.1 | 2.40 | 5.640 | 0.198 | 18.6 | |
| 6.7 | 1.17 | 2.630 | 0.562 | 38.3 | |
| 11.9 | 1.04 | 2.582 | 0.558 | 39.0 |
Native recognition performance comparison among LoCo, our ω-only function and DFMAC both with and without its ω component is shown. All reported measures are averages over the 77 decoy sets in the final testing group. Lower scores are better for Ranknat and RMSDbest. Higher ones are better for Znat, CCnat and FEnat. The ω-only function is able to pick out native structures quite well, but when it fails, its choices are essentially random. In the two measures for which the ω-only function does poorly (CCnat and FEnat), DFMAC performance improves when its ω component is removed. All metrics are defined in Performance measures at the end of Methods.
Omega angles and decoy discrimination
| RB1 | RB10 | RMSDdecoy | Zdecoy | CCdecoy | FEdecoy (%) | log(PB1) | log(PB10) | |
|---|---|---|---|---|---|---|---|---|
| 154.8 | 5.6 | 3.51 | 0.938 | 0.505 | 31.4 | -0.864 | -1.640 | |
| 171.1 | 47.9 | 6.46 | 0.100 | 0.166 | 9.1 | -0.361 | -1.226 | |
| 108.9 | 13.8 | 3.64 | 1.024 | 0.533 | 31.6 | -0.825 | -1.586 | |
| 106.1 | 12.6 | 3.61 | 1.021 | 0.533 | 32.1 | -0.830 | -1.600 |
Comparison of decoy discrimination performance among LoCo, our ω-only function and DFMAC, both with and without its ω component, is shown. All reported measures are averages over the 77 decoy sets in the final testing group. Lower scores are better for RB1, RB10, RMSDdecoy, log(PB1) and log(PB10). Higher scores are better for Zdecoy, CCdecoy and FEdecoy. Performance for our ω-only function is approximately the same as if its choices had been made at random. With the exception of CCdecoy (which stays the same), DFMAC performance improves across the board with the ω component removed. All metrics are defined in Performance measures at the end of Methods.