Nasrin Akhter, Amarda Shehu.
Abstract
Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.
Keywords: Pareto optimality; basins; conformational space; decoy selection; energy landscape; template-free protein structure prediction
Year: 2018 PMID: 29351266 PMCID: PMC6017496 DOI: 10.3390/molecules23010216
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
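The abstract refers to basins in the energy landscape without spelling out how they are obtained. As a rough illustration only, the sketch below assumes one common construction: decoys are nodes of a neighbor graph, each decoy steps to its lowest-energy neighbor until it reaches a local minimum, and a basin is the set of decoys that reach the same minimum. The function name `assign_basins`, the `neighbors` input, and the toy energies are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def assign_basins(energies, neighbors):
    """Group decoys into basins by steepest descent on a neighbor graph.

    energies:  list of floats; energies[i] is the energy of decoy i.
    neighbors: dict mapping a decoy index to its neighboring decoy indices
               (hypothetical input, e.g. decoys within some distance cutoff).
    Returns a dict mapping each local-minimum index to the sorted list of
    decoys that descend to it.
    """
    def step(i):
        # Lowest-energy neighbor; stay put if no neighbor is strictly lower.
        best = min(neighbors.get(i, ()), key=lambda j: (energies[j], j), default=i)
        return best if energies[best] < energies[i] else i

    sink = {}  # decoy index -> index of the local minimum it reaches
    for start in range(len(energies)):
        path = []
        cur = start
        while cur not in sink:
            nxt = step(cur)
            path.append(cur)
            if nxt == cur:  # cur is a local minimum
                sink[cur] = cur
                break
            cur = nxt
        root = sink[cur]
        for node in path:  # every decoy on the path shares the same minimum
            sink[node] = root

    basins = defaultdict(list)
    for i, root in sink.items():
        basins[root].append(i)
    return {root: sorted(members) for root, members in basins.items()}


# Toy example: six decoys on a chain 0-1-2-3-4-5 with two energy minima.
energies = [3.0, 1.0, 2.0, 5.0, 0.5, 4.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
basins = assign_basins(energies, neighbors)
```

In this toy input, decoys 0 to 2 fall into the basin of decoy 1 and decoys 3 to 5 into the basin of decoy 4. Because every step strictly lowers the energy, the descent cannot cycle.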
Testing dataset (* denotes proteins with a predominant fold and a short helix).
| Difficulty | PDB ID | Fold | Length | Number of decoys | min_dist (Å) |
|---|---|---|---|---|---|
| Easy | 1ail | | 70 | 53,568 | |
| | 1dtdb | | 61 | 57,839 | |
| | 1wapa | | 68 | 51,841 | |
| | 1tig | | 88 | 52,099 | |
| | 1dtja | | 74 | 53,526 | |
| Medium | 1hz6a | | 64 | 57,474 | |
| | 1c8ca | | 64 | 53,322 | |
| | 2ci2 | | 65 | 52,220 | |
| | 1bq9 | | 53 | 53,663 | |
| | 1hhp | | 99 | 52,159 | |
| | 1fwp | | 69 | 53,133 | |
| | 1sap | | 66 | 51,209 | |
| Hard | 2h5nd | | 123 | 51,475 | |
| | 2ezk | | 93 | 50,192 | |
| | 1aoy | | 78 | 52,218 | |
| | 1cc5 | | 83 | 51,687 | |
| | 1isua | | 62 | 60,360 | |
| | 1aly | | 146 | 53,274 | |
Figure 1. Visualization of selected decoys for the target with PDB entry 1dtja. Decoys are plotted by their lRMSD from the conformation in the PDB entry and their Rosetta score12 all-atom energy.
Figure 2. Visualization of selected decoys for targets with PDB entries 1bq9 and 1hhp. Decoys are plotted by their lRMSD from the conformation in the PDB entry and their Rosetta score12 all-atom energy.
Figure 3. Visualization of selected decoys for the target with PDB entry 1aoy. Decoys are plotted by their lRMSD from the conformation in the PDB entry and their Rosetta score12 all-atom energy.
Comparison of all selection strategies on the easy cases.
| Strategy | | 1ail | 1dtdb | 1wapa | 1tig | 1dtja |
|---|---|---|---|---|---|---|
| Cluster-Random | C | n: 4% | n: 17.8% | n: 5.2% | n: 8.8% | n: 21.4% |
| | | p: 6.2% | p: 18.2% | p: 10.1% | p: 15.2% | p: 22.3% |
| | | s: 4.1% | s: 22.3% | s: 5.2% | s: 8.7% | s: 21.6% |
| | C | n: 6.6% | n: 18.6% | n: 9.8% | n: 11.3% | n: 22.2% |
| | | p: 6.3% | p: 18.2% | p: 10% | p: 15.2% | p: 22.2% |
| | | s: 6.7% | s: 23.3% | s: 10% | s: 11.2% | s: 22.4% |
| | C | n: 8.5% | n: 19.1% | n: 10% | n: 13.6% | n: 22.4% |
| | | p: 6.3% | p: 18.2% | p: 10% | p: 15.1% | p: 22.3% |
| | | s: 8.7% | s: 23.9% | s: 10.2% | s: 13.6% | s: 22.6% |
| Cluster-Size | C | n: 63.9% | n: 97.6% | n: 50.8% | n: 57.3% | n: 95.5% |
| | | p: 99.5% | p: 99.9% | p: 99.9% | p: 99.1% | p: 99.2% |
| | | s: 4.1% | s: 22.3% | s: 5.2% | s: 8.7% | s: 21.6% |
| | C | n: 64.4% | n: 97.6% | n: 97.6% | n: 73% | n: 97.8% |
| | | p: 61.1% | p: 95.7% | p: 99.5% | p: 98.2% | p: 98% |
| | | s: 6.7% | s: 23.3% | s: 10% | s: 11.2% | s: 22.4% |
| | C | n: 65.6% | n: 97.6% | n: 97.6% | n: 88.4% | n: 97.8% |
| | | p: 48.2% | p: 93.3% | p: 97.3% | p: 98.4% | p: 97.2% |
| | | s: 8.7% | s: 23.9% | s: 10.2% | s: 13.6% | s: 22.6% |
| Basin-Size | B | n: 47.2% | n: 85.3% | n: 76.8% | n: 28.8% | n: 36.9% |
| | | p: 100% | p: 99% | p: 98.9% | p: 100% | p: 98.9% |
| | | s: 3% | s: 19.7% | s: 7.9% | s: 4.4% | s: 8.4% |
| | B | n: 48.4% | n: 94.9% | n: 81.8% | n: 40.1% | n: 56.7% |
| | | p: 52.8% | p: 98.9% | p: 98.8% | p: 99.6% | p: 99.1% |
| | | s: 5.8% | s: 21.9% | s: 8.4% | s: 6.1% | s: 12.8% |
| | B | n: 48.4% | n: 94.9% | n: 86.3% | n: 50.2% | n: 70.7% |
| | | p: 44.8% | p: 94.8% | p: 98.7% | p: 99.7% | p: 99.2% |
| | | s: 6.9% | s: 22.9% | s: 8.9% | s: 7.6% | s: 16% |
| Basin-Size+Energy | B | n: 1.2% | n: 85.3% | n: 76.8% | n: 2.7% | n: 19.9% |
| | | p: 2.8% | p: 99% | p: 98.9% | p: 88.4% | p: 99.6% |
| | | s: 3% | s: 19.7% | s: 7.9% | s: 0.5% | s: 4.5% |
| | B | n: 48.4% | n: 94.9% | n: 79.1% | n: 31.5% | n: 33.8% |
| | | p: 52.8% | p: 98.9% | p: 98.9% | p: 98.9% | p: 99.6% |
| | | s: 5.8% | s: 21.9% | s: 8.2% | s: 4.8% | s: 7.6% |
| | B | n: 61.9% | n: 95.9% | n: 84.1% | n: 42.8% | n: 70.7% |
| | | p: 58.6% | p: 98.9% | p: 98.8% | p: 98.8% | p: 99.2% |
| | | s: 6.7% | s: 22.1% | s: 8.7% | s: 6.5% | s: 16% |
| Basin-PR | B | n: 47.2% | n: 85.3% | n: 76.8% | n: 28.8% | n: 36.9% |
| | | p: 100% | p: 99% | p: 98.9% | p: 100% | p: 98.9% |
| | | s: 3% | s: 19.7% | s: 7.9% | s: 4.4% | s: 8.4% |
| | B | n: 48.4% | n: 94.9% | n: 79.1% | n: 31.5% | n: 56.7% |
| | | p: 52.8% | p: 98.9% | p: 98.9% | p: 98.9% | p: 99.1% |
| | | s: 5.8% | s: 21.9% | s: 8.2% | s: 4.8% | s: 12.8% |
| | B | n: 61.9% | n: 94.9% | n: 84.1% | n: 42.8% | n: 70.7% |
| | | p: 58.6% | p: 98.9% | p: 98.8% | p: 98.8% | p: 99.2% |
| | | s: 6.7% | s: 21.9% | s: 8.7% | s: 6.6% | s: 16% |
| Basin-PR+PC | B | n: 47.2% | n: 85.3% | n: 76.8% | n: 28.8% | n: 19.9% |
| | | p: 100% | p: 99% | p: 98.9% | p: 100% | p: 99.6% |
| | | s: 3% | s: 19.7% | s: 7.9% | s: 4.4% | s: 4.5% |
| | B | n: 48.4% | n: 94.9% | n: 81.8% | n: 31.5% | n: 56.7% |
| | | p: 52.8% | p: 98.9% | p: 98.8% | p: 98.9% | p: 99.1% |
| | | s: 5.8% | s: 21.9% | s: 8.4% | s: 4.8% | s: 12.8% |
| | B | n: 61.9% | n: 95.4% | n: 84.1% | n: 42.8% | n: 70.7% |
| | | p: 58.6% | p: 98.8% | p: 98.8% | p: 98.8% | p: 99.2% |
| | | s: 6.7% | s: 22% | s: 8.7% | s: 6.6% | s: 16% |
Comparison of all selection strategies on the medium cases.
| Strategy | | 1hz6a | 1c8ca | 2ci2 | 1bq9 | 1hhp | 1fwp | 1sap |
|---|---|---|---|---|---|---|---|---|
| Cluster-Random | C | n: 4.5% | n: 3.5% | n: 0.4% | n: 0.8% | n: 0.2% | n: 1.9% | n: 9.5% |
| | | p: 11.4% | p: 11.4% | p: 22.5% | p: 1.9% | p: 2.8% | p: 6% | p: 2.3% |
| | | s: 4.4% | s: 3.4% | s: 0.4% | s: 0.6% | s: 0.2% | s: 1.8% | s: 9.3% |
| | C | n: 7.7% | n: 5.3% | n: 0.6% | n: 1.4% | n: 0.3% | n: 3.2% | n: 14.6% |
| | | p: 11.3% | p: 11.2% | p: 22.9% | p: 2.1% | p: 2.7% | p: 6.1% | p: 2.4% |
| | | s: 7.7% | s: 5.2% | s: 0.6% | s: 1% | s: 0.3% | s: 3.1% | s: 13.9% |
| | C | n: 10.9% | n: 6.3% | n: 0.8% | n: 1.9% | n: 0.3% | n: 4% | n: 18.3% |
| | | p: 11.4% | p: 11.2% | p: 22.2% | p: 2.1% | p: 2.3% | p: 5.8% | p: 7.4% |
| | | s: 10.8% | s: 6.2% | s: 0.8% | s: 1.4% | s: 0.3% | s: 4% | s: 17.4% |
| Cluster-Size | C | n: 0% | n: 10% | n: 1.3% | n: 0.6% | n: 1.5% | n: 29.1% | n: 0% |
| | | p: 0% | p: 32.1% | p: 82% | p: 1.5% | p: 19.8% | p: 92.8% | p: 0% |
| | | s: 4.4% | s: 3.4% | s: 0.4% | s: 0.64% | s: 0.19% | s: 1.8% | s: 9.3% |
| | C | n: 0% | n: 11.8% | n: 2.4% | n: 9.1% | n: 2.6% | n: 36.3% | n: 44.1% |
| | | p: 0% | p: 24.7% | p: 89.4% | p: 13.6% | p: 25.4% | p: 69.2% | p: 7.3% |
| | | s: 7.7% | s: 5.2% | s: 0.6% | s: 1.04% | s: 0.26% | s: 3.1% | s: 13.9% |
| | C | n: 26.4% | n: 20.5% | n: 3.2% | n: 21% | n: 3.7% | n: 44.1% | n: 55.9% |
| | | p: 27.7% | p: 36.3% | p: 92% | p: 24% | p: 28.7% | p: 63.7% | p: 7.4% |
| | | s: 10.8% | s: 6.2% | s: 0.8% | s: 1.4% | s: 0.32% | s: 4% | s: 17.4% |
| Basin-Size | B | n: 55.5% | n: 6.1% | n: 0.3% | n: 9.3% | n: 3.5% | n: 5.6% | n: 0% |
| | | p: 85.5% | p: 32.9% | p: 47.2% | p: 80.4% | p: 53.6% | p: 97.7% | p: 0% |
| | | s: 7.3% | s: 2% | s: 0.13% | s: 0.18% | s: 0.16% | s: 0.33% | s: 4.4% |
| | B | n: 55.5% | n: 20.2% | n: 0.3% | n: 11.1% | n: 3.5% | n: 9.1% | n: 32.4% |
| | | p: 50% | p: 60.8% | p: 23.6% | p: 49.2% | p: 27% | p: 97.2% | p: 9.3% |
| | | s: 12.6% | s: 3.6% | s: 0.3% | s: 0.4% | s: 0.32% | s: 0.54% | s: 8.1% |
| | B | n: 55.5% | n: 22.3% | n: 0.3% | n: 19.8% | n: 5.6% | n: 10.7% | n: 51.4% |
| | | p: 39.3% | p: 48.5% | p: 15.9% | p: 60.8% | p: 30.8% | p: 84.2% | p: 11.5% |
| | | s: 16% | s: 5% | s: 0.4% | s: 0.51% | s: 0.45% | s: 0.74% | s: 10.3% |
| Basin-Size+Energy | B | n: 55.5% | n: 3.3% | n: 0.42% | n: 9.3% | n: 3.5% | n: 3.5% | n: 32.4% |
| | | p: 85.5% | p: 47.8% | p: 100% | p: 80.4% | p: 53.6% | p: 96.4% | p: 20.2% |
| | | s: 7.3% | s: 0.8% | s: 0.1% | s: 0.18% | s: 0.16% | s: 0.21% | s: 3.7% |
| | B | n: 55.5% | n: 17.4% | n: 0.71% | n: 14.1% | n: 5.6% | n: 3.7% | n: 51.4% |
| | | p: 66.6% | p: 80.6% | p: 68.9% | p: 68.2% | p: 47.7% | p: 58.4% | p: 20% |
| | | s: 9.4% | s: 2.4% | s: 0.23% | s: 0.32% | s: 0.29% | s: 0.37% | s: 5.9% |
| | B | n: 55.7% | n: 20.1% | n: 1.13% | n: 20.5% | n: 8.5% | n: 9.3% | n: 51.4% |
| | | p: 55.7% | p: 80.4% | p: 76.9% | p: 69.6% | p: 51.4% | p: 77% | p: 18.2% |
| | | s: 11.3% | s: 2.7% | s: 0.33% | s: 0.46% | s: 0.41% | s: 0.7% | s: 6.5% |
| Basin-PR | B | n: 55.5% | n: 3.3% | n: 0.1% | n: 9.3% | n: 0.1% | n: 3.5% | n: 32.4% |
| | | p: 85.5% | p: 47.8% | p: 100% | p: 80.4% | p: 5% | p: 96.4% | p: 20.2% |
| | | s: 7.3% | s: 0.8% | s: 0.01% | s: 0.18% | s: 0.04% | s: 0.21% | s: 3.7% |
| | B | n: 55.5% | n: 17.4% | n: 0.1% | n: 11.1% | n: 3.6% | n: 9.1% | n: 32.4% |
| | | p: 58.3% | p: 80.6% | p: 7.7% | p: 49.2% | p: 44.2% | p: 97.2% | p: 9.3% |
| | | s: 10.8% | s: 2.4% | s: 0.15% | s: 0.35% | s: 0.2% | s: 0.54% | s: 8.1% |
| | B | n: 57.7% | n: 23.5% | n: 0.3% | n: 13.3% | n: 6.9% | n: 9.3% | n: 51.4% |
| | | p: 58.4% | p: 58.5% | p: 26.5% | p: 53.9% | p: 55.6% | p: 77% | p: 11.5% |
| | | s: 11.2% | s: 4.4% | s: 0.2% | s: 0.51% | s: 0.31% | s: 0.7% | s: 10.3% |
| Basin-PR+PC | B | n: 55.5% | n: 14% | n: 0.43% | n: 9.3% | n: 3.5% | n: 3.5% | n: 32.4% |
| | | p: 85.5% | p: 96.3% | p: 100% | p: 80.4% | p: 53.6% | p: 96.4% | p: 20.2% |
| | | s: 7.3% | s: 1.6% | s: 0.1% | s: 0.18% | s: 0.16% | s: 0.21% | s: 3.7% |
| | B | n: 55.5% | n: 17.4% | n: 0.72% | n: 14.1% | n: 3.6% | n: 9.1% | n: 32.4% |
| | | p: 50% | p: 80.6% | p: 68.9% | p: 68.2% | p: 44.2% | p: 97.2% | p: 9.3% |
| | | s: 12.6% | s: 2.4% | s: 0.23% | s: 0.32% | s: 0.2% | s: 0.54% | s: 8.1% |
| | B | n: 55.5% | n: 23.5% | n: 0.93% | n: 22.7% | n: 6.9% | n: 9.3% | n: 51.4% |
| | | p: 39.3% | p: 58.5% | p: 67.7% | p: 74.3% | p: 55.6% | p: 77% | p: 11.5% |
| | | s: 16% | s: 4.4% | s: 0.31% | s: 0.46% | s: 0.31% | s: 0.7% | s: 10.3% |
Comparison of all selection strategies on the hard cases.
| Strategy | | 2h5nd | 2ezk | 1aoy | 1cc5 | 1isua | 1aly |
|---|---|---|---|---|---|---|---|
| Cluster-Random | C | n: 0% | n: 0.01% | n: 0.02% | n: 0% | n: 0.02% | n: 0% |
| | | p: 0% | p: 5% | p: 8.0% | p: 0% | p: 5.5% | p: 0% |
| | | s: 0.004% | s: 0.02% | s: 0.03% | s: 0.01% | s: 0.02% | s: 0.01% |
| | C | n: 0% | n: 0.03% | n: 0.03% | n: 0% | n: 0.04% | n: 0% |
| | | p: 0% | p: 7.5% | p: 8.2% | p: 0% | p: 6% | p: 0% |
| | | s: 0.008% | s: 0.05% | s: 0.04% | s: 0.02% | s: 0.03% | s: 0.02% |
| | C | n: 0% | n: 0.05% | n: 0.04% | n: 0% | n: 0.04% | n: 0.01% |
| | | p: 0% | p: 10% | p: 6.9% | p: 0% | p: 5% | p: 1.4% |
| | | s: 0.01% | s: 0.07% | s: 0.06% | s: 0.03% | s: 0.05% | s: 0.03% |
| Cluster-Size | C | n: 0% | n: 0% | n: 0% | n: 0% | n: 0% | n: 0% |
| | | p: 0% | p: 0% | p: 0% | p: 0% | p: 0% | p: 0% |
| | | s: 0.004% | s: 0.02% | s: 0.03% | s: 0.01% | s: 0.02% | s: 0.01% |
| | C | n: 0% | n: 0% | n: 0% | n: 0% | n: 0% | n: 0.3% |
| | | p: 0% | p: 0% | p: 0% | p: 0% | p: 0% | p: 40% |
| | | s: 0.008% | s: 0.05% | s: 0.04% | s: 0.02% | s: 0.03% | s: 0.02% |
| | C | n: 0% | n: 0% | n: 0% | n: 0% | n: 0% | n: 0.4% |
| | | p: 0% | p: 0% | p: 0% | p: 0% | p: 0% | p: 42.9% |
| | | s: 0.01% | s: 0.07% | s: 0.06% | s: 0.03% | s: 0.05% | s: 0.03% |
| Basin-Size | B | n: 0% | n: 0.96% | n: 0% | n: 0.03% | n: 0.34% | n: 0% |
| | | p: 0% | p: 41.2% | p: 0% | p: 1.14% | p: 14.1% | p: 0% |
| | | s: 0.27% | s: 0.3% | s: 0.2% | s: 0.17% | s: 0.13% | s: 0.06% |
| | B | n: 0% | n: 2% | n: 0.2% | n: 0.03% | n: 0.34% | n: 0.07% |
| | | p: 0% | p: 43.5% | p: 4.9% | p: 0.6% | p: 7.1% | p: 1.6% |
| | | s: 0.38% | s: 0.6% | s: 0.39% | s: 0.32% | s: 0.26% | s: 0.12% |
| | B | n: 10% | n: 2% | n: 0.2% | n: 0.03% | n: 0.34% | n: 0.07% |
| | | p: 17.4% | p: 33% | p: 3.4% | p: 0.42% | p: 4.9% | p: 1.1% |
| | | s: 0.48% | s: 0.8% | s: 0.57% | s: 0.46% | s: 0.38% | s: 0.17% |
| Basin-Size+Energy | B | n: 0% | n: 1.02% | n: 0.05% | n: 0% | n: 0.34% | n: 0% |
| | | p: 0% | p: 45.9% | p: 3.5% | p: 0% | p: 14.1% | p: 0% |
| | | s: 0.09% | s: 0.29% | s: 0.16% | s: 0.14% | s: 0.13% | s: 0.05% |
| | B | n: 0% | n: 1.5% | n: 0.23% | n: 1.15% | n: 0.34% | n: 0% |
| | | p: 0% | p: 45.7% | p: 6.9% | p: 27.3% | p: 7.6% | p: 0% |
| | | s: 0.37% | s: 0.41% | s: 0.36% | s: 0.23% | s: 0.24% | s: 0.1% |
| | B | n: 10% | n: 2.4% | n: 0.28% | n: 1.2% | n: 0.44% | n: 0% |
| | | p: 17.8% | p: 43.8% | p: 6.1% | p: 18.9% | p: 6.6% | p: 0% |
| | | s: 0.47% | s: 0.72% | s: 0.51% | s: 0.35% | s: 0.35% | s: 0.16% |
| Basin-PR | B | n: 0% | n: 0% | n: 0.56% | n: 0.03% | n: 0% | n: 0.27% |
| | | p: 0% | p: 0% | p: 78.1% | p: 1.14% | p: 0% | p: 40% |
| | | s: 0.006% | s: 0.03% | s: 0.08% | s: 0.17% | s: 0.02% | s: 0.02% |
| | B | n: 0% | n: 1.02% | n: 0.56% | n: 0.03% | n: 0% | n: 0.27% |
| | | p: 0% | p: 41.9% | p: 33% | p: 1.12% | p: 0% | p: 19.1% |
| | | s: 0.28% | s: 0.32% | s: 0.19% | s: 0.17% | s: 0.12% | s: 0.04% |
| | B | n: 0% | n: 1.02% | n: 0.56% | n: 0.66% | n: 0.07% | n: 0.27% |
| | | p: 0% | p: 41.1% | p: 21.8% | p: 15.8% | p: 4.8% | p: 8% |
| | | s: 0.31% | s: 0.32% | s: 0.28% | s: 0.23% | s: 0.21% | s: 0.09% |
| Basin-PR+PC | B | n: 0% | n: 1.02% | n: 0.18% | n: 0% | n: 0% | n: 0% |
| | | p: 0% | p: 45.9% | p: 9.8% | p: 0% | p: 0% | p: 0% |
| | | s: 0.27% | s: 0.29% | s: 0.2% | s: 0.14% | s: 0.05% | s: 0.04% |
| | B | n: 0% | n: 2% | n: 0.23% | n: 0.63% | n: 0% | n: 0% |
| | | p: 0% | p: 43.5% | p: 6.9% | p: 17.5% | p: 0% | p: 0% |
| | | s: 0.37% | s: 0.6% | s: 0.36% | s: 0.2% | s: 0.11% | s: 0.08% |
| | B | n: 0% | n: 2.0% | n: 0.23% | n: 0.73% | n: 0.03% | n: 0% |
| | | p: 0% | p: 39.7% | p: 5.5% | p: 15.8% | p: 1.2% | p: 0% |
| | | s: 0.39% | s: 0.66% | s: 0.46% | s: 0.26% | s: 0.14% | s: 0.10% |
Figure 4. Visualization of basins extracted from the energy landscapes probed for an easy target (PDB entry 1wapa), a medium target (1bq9), and a hard target (2ezk). The color-coding scheme varies from blue (low purity) to red (high purity). The size of each disk reflects the size of the corresponding basin. The top three basins selected by Basin-PR (left panel) and Basin-PR+PC (right panel) are marked by rectangles around the corresponding disks.
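The strategy names Basin-PR and Basin-PR+PC are not expanded in this record. The sketch below is a guess at the idea behind them: it assumes PR and PC stand for Pareto rank and Pareto count (the keywords list Pareto optimality), and that basin size and basin energy are the two objectives. Here the Pareto rank of a basin is the number of basins that dominate it, and its Pareto count is the number of basins it dominates. The function names and the toy basins are hypothetical and are not the authors' implementation.

```python
def dominates(a, b):
    """Return True if basin a dominates basin b.

    Each basin is a (size, energy) pair. Assumption: larger size and
    lower energy are both treated as better.
    """
    size_a, energy_a = a
    size_b, energy_b = b
    no_worse = size_a >= size_b and energy_a <= energy_b
    strictly_better = size_a > size_b or energy_a < energy_b
    return no_worse and strictly_better


def pareto_rank_and_count(basins):
    """Pareto rank (number of dominators) and Pareto count (number dominated)."""
    ranks, counts = [], []
    for i, a in enumerate(basins):
        others = [b for j, b in enumerate(basins) if j != i]
        ranks.append(sum(dominates(b, a) for b in others))
        counts.append(sum(dominates(a, b) for b in others))
    return ranks, counts


def select_top(basins, k=3, use_count=False):
    """Indices of the k best basins by Pareto rank, lowest first.

    With use_count=True, ties in rank are broken by higher Pareto count.
    Remaining ties keep input order because sorted() is stable.
    """
    ranks, counts = pareto_rank_and_count(basins)
    if use_count:
        key = lambda i: (ranks[i], -counts[i])
    else:
        key = lambda i: ranks[i]
    return sorted(range(len(basins)), key=key)[:k]


# Toy basins as (size, energy) pairs; the values are made up for illustration.
toy_basins = [(100, -50.0), (80, -60.0), (40, -55.0), (10, -20.0), (90, -58.0)]
ranks, counts = pareto_rank_and_count(toy_basins)
top_by_rank = select_top(toy_basins, k=2)
top_by_rank_and_count = select_top(toy_basins, k=2, use_count=True)
```

In the toy input, three basins are non-dominated and share rank 0, so ranking alone cannot separate them and the first two in input order are returned. Adding the count as a tie-breaker prefers the two that dominate more of the other basins.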