Noah Ollikainen, René M de Jong, Tanja Kortemme.
Abstract
Interactions between small molecules and proteins play critical roles in regulating and facilitating diverse biological functions, yet our ability to accurately re-engineer the specificity of these interactions using computational approaches has been limited. One main difficulty, in addition to inaccuracies in energy functions, is the exquisite sensitivity of protein-ligand interactions to subtle conformational changes, coupled with the computational problem of sampling the large conformational search space of degrees of freedom of ligands, amino acid side chains, and the protein backbone. Here, we describe two benchmarks for evaluating the accuracy of computational approaches for re-engineering protein-ligand interactions: (i) prediction of enzyme specificity-altering mutations and (ii) prediction of sequence tolerance in ligand binding sites. After finding that current state-of-the-art "fixed backbone" design methods perform poorly on these tests, we develop a new "coupled moves" design method in the program Rosetta that couples changes to protein sequence with alterations in both protein side-chain and protein backbone conformations, and allows for changes in ligand rigid-body and torsion degrees of freedom. We show significantly increased accuracy in both predicting ligand specificity-altering mutations and binding site sequences. These methodological improvements should be useful for many applications of protein-ligand design. The approach also provides insights into the role of subtle conformational adjustments that enable functional changes not only in engineering applications but also in natural protein evolution.
Year: 2015 PMID: 26397464 PMCID: PMC4580623 DOI: 10.1371/journal.pcbi.1004335
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Comparison of fixed backbone and coupled moves methods on predicting specificity altering mutations.
| Mutant # | Wild-type PDB ID | Mutant PDB ID | Mutation | # of Designed Positions | Fixed Backbone Percentile | Fixed Backbone Rank | Coupled Moves Percentile | Coupled Moves Rank |
|---|---|---|---|---|---|---|---|---|
| 1 | 2FZN | 3E2Q | Y540S | 2 | – | – | 95.8 | 2 |
| 2 | 1FCB | 1SZE | L230A | 5 | – | – | 63.0 | 28 |
| 3 | 3KZO | 3L02 | E92A | 5 | 78.9 | 5 | 100 | 1 |
| 4 | 3KZO | 3L04 | E92S | 5 | – | – | 86.5 | 8 |
| 5 | 3KZO | 3L05 | E92P | 5 | – | – | – | – |
| 6 | 3KZO | 3L06 | E92V | 5 | – | – | 63.5 | 20 |
| 7 | 2O7B | 2O78 | H89F | 4 | – | – | 90.9 | 3 |
| 8 | 1ZK4 | 1ZK1 | G37D | 7 | 93.8 | 2 | 90.2 | 6 |
| 9 | 1A80 | 1M9H | K232G | 5 | – | – | 71.4 | 19 |
| 10 | 1A80 | 1M9H | R238H | 5 | – | – | – | – |
| 11 | 1PK7 | 1OUM | M64V | 3 | – | – | 69.2 | 13 |
| 12 | 1K70 | 1RA0 | D314S | 4 | – | – | 63.6 | 9 |
| 13 | 1K70 | 1RA5 | D314G | 4 | – | – | 90.9 | 3 |
| 14 | 1K70 | 1RAK | D314A | 4 | – | – | 72.7 | 7 |
| 15 | 2H6F | 2H6G | W602T | 9 | – | – | 61.3 | 37 |
| 16 | 3HG5 | 3LX9 | E203S | 7 | – | – | 93.5 | 7 |
| 17 | 3HG5 | 3LX9 | L206A | 7 | – | – | 87.1 | 13 |
Dashes denote cases where the known mutation was not enriched in the predicted sequences using non-native substrate/substrate analogs and was therefore not predicted to be a specificity-altering mutation. “# of Designed Positions” refers to the number of positions that were allowed to mutate in the simulation. “Percentile” refers to the percentile of the known mutation relative to all other predicted mutations when sorted in descending order of their percent enrichment. “Rank” refers to the index of the known mutation in this sorted list. The number of correctly predicted mutations is significantly greater with the coupled moves method than with fixed backbone design (p < 0.0001).
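The "Rank" and "Percentile" definitions above can be illustrated with a minimal Python sketch. The function name `rank_and_percentile`, the input dictionary of percent enrichments, and the rule that "enriched" means an enrichment greater than zero are all assumptions made for illustration. The percentile formula, 100 × (1 − (rank − 1) / N) with N the number of predicted mutations, is not stated in the source; it is inferred here because it appears consistent with several table rows (for example, rank 3 with percentile 90.9 corresponds to N = 22).

```python
def rank_and_percentile(enrichment, known_mutation):
    """Return (rank, percentile) for known_mutation, or None if it is not enriched.

    enrichment: dict mapping a mutation label to its percent enrichment
                (hypothetical input format).
    """
    # Assumption: a mutation counts as "predicted" only if its enrichment is positive.
    predicted = {m: e for m, e in enrichment.items() if e > 0}
    if known_mutation not in predicted:
        return None  # corresponds to a dash in the table

    # Sort in descending order of percent enrichment.
    ordered = sorted(predicted, key=predicted.get, reverse=True)

    # Rank is the 1-based index of the known mutation in the sorted list.
    rank = ordered.index(known_mutation) + 1

    # Assumed percentile formula (inferred from the table, not given in the source).
    percentile = 100.0 * (1 - (rank - 1) / len(ordered))
    return rank, round(percentile, 1)


# Toy data: 22 predicted mutations with distinct enrichments; "M2" is third best.
toy = {f"M{i}": 100 - i for i in range(22)}
print(rank_and_percentile(toy, "M2"))  # (3, 90.9)
```

Under this reading, a rank of 1 always gives a percentile of 100, which matches the E92A entry for coupled moves.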
Comparison of fixed backbone and coupled moves methods on predicting co-factor binding site sequences.
| Protein Domain | PFAM ID | Co-factor ligand | # of unique binding site sequences | PDB ID | Number of designed positions | Fixed Backbone Mean Profile Similarity | Coupled Moves Mean Profile Similarity |
|---|---|---|---|---|---|---|---|
| Cytochrome P450 | PF00067 | Heme | 8296 | 2IJ2 | 30 | 0.233 | 0.312 |
| Methyltransferase domain | PF08241 | S-adenosyl methionine | 7042 | 3DLC | 19 | 0.443 | 0.568 |
| Acetyltransferase (GNAT) family | PF00583 | Coenzyme A | 4084 | 3S6F | 14 | 0.541 | 0.639 |
| Glutathione S-transferase | PF13417 | Glutathione | 2948 | 3R2Q | 11 | 0.540 | 0.637 |
| Short chain dehydrogenase | PF00106 | NAD | 21085 | 1ZK4 | 21 | 0.541 | 0.659 |
| Aminotransferase class I and II | PF00155 | Pyridoxal 5'-phosphate | 3149 | 2XBN | 14 | 0.401 | 0.544 |
| FAD dependent oxidoreductase | PF01266 | FAD | 3053 | 3DK9 | 30 | 0.608 | 0.683 |
| Flavodoxin | PF00258 | FMN | 947 | 1F4P | 19 | 0.438 | 0.637 |
Sequence alignments of naturally occurring co-factor binding domains were taken from Pfam and filtered for redundancy. Positions were included in design if they had a side-chain heavy atom within 6 Å of the co-factor ligand and no gaps in the multiple sequence alignment.
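The position-selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `select_design_positions`, the plain-coordinate input format, the inclusive distance comparison, and the treatment of both `-` and `.` as gap characters are hypothetical choices, not details given in the source.

```python
import math


def select_design_positions(sidechain_atoms, ligand_atoms, alignment_columns, cutoff=6.0):
    """Return sorted positions that pass both the distance and the gap filter.

    sidechain_atoms:   {position: [(x, y, z), ...]} side-chain heavy-atom coordinates
    ligand_atoms:      [(x, y, z), ...] co-factor atom coordinates
    alignment_columns: {position: str} alignment column for each position
    cutoff:            distance threshold in angstroms (6 in the text above)
    """
    selected = []
    for position in sorted(sidechain_atoms):
        # Distance filter: any side-chain heavy atom within the cutoff of any ligand atom.
        near_ligand = any(
            math.dist(atom, ligand_atom) <= cutoff
            for atom in sidechain_atoms[position]
            for ligand_atom in ligand_atoms
        )
        # Gap filter: the alignment column must contain no gap characters.
        # Assumption: positions missing from the alignment are excluded.
        column = alignment_columns.get(position, "-")
        gap_free = "-" not in column and "." not in column
        if near_ligand and gap_free:
            selected.append(position)
    return selected


# Toy example with three residues and a single ligand atom.
sidechains = {10: [(0.0, 0.0, 0.0)], 11: [(10.0, 0.0, 0.0)], 12: [(1.0, 0.0, 0.0)]}
ligand = [(3.0, 0.0, 0.0)]
columns = {10: "AAG", 11: "LLL", 12: "S-S"}
print(select_design_positions(sidechains, ligand, columns))  # [10]
```

In the toy example, position 11 is excluded because it lies 7 Å from the ligand atom, and position 12 is excluded because its alignment column contains a gap.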