Bartłomiej Surpeta, Carlos Eduardo Sequeiros-Borja, Jan Brezovsky.
Abstract
Computational prediction has become an indispensable aid in engineering and designing proteins for various biotechnological applications. With tremendous progress in more powerful computer hardware and more efficient algorithms, some in silico tools and methods have started to apply a more realistic description of proteins as conformational ensembles, making protein dynamics an integral part of their prediction workflows. To help protein engineers harness the benefits of considering dynamics in their designs, we surveyed new tools developed for analyses of conformational ensembles in order to select engineering hotspots and design mutations. Next, we discussed the collective evolution towards more flexible protein design methods, including ensemble-based approaches, knowledge-assisted methods, and provable algorithms. Finally, we highlighted apparent challenges that current approaches are facing and provided our perspectives on their further development.
Keywords: computational design; de novo design; ensemble-based approach; flexible backbone; hotspot prediction; ligand transport; mutational analysis; protein dynamics; protein engineering; rational design
Year: 2020 PMID: 32295283 PMCID: PMC7215530 DOI: 10.3390/ijms21082713
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Hierarchy of principal motions in protein dynamics. From left to right: bond vibrations (fs–ps), side-chain rotations (ps–ns), backbone fluctuations (ns), loop motion/gating (ns–ms), ligand binding/unbinding events (>100 ns), and collective domain movement (>µs).
Computational tools to extract valuable information for protein engineering from molecular dynamics (MD) simulations.
| Tool | Target Property | Availability: Web Server | Availability: Standalone | Code | Core Method(s) | Input: Structure | Input: Trajectory | Link | Reference |
|---|---|---|---|---|---|---|---|---|---|
| Residue interaction network in protein molecular dynamics (RIP-MD) | Interaction network | + | + | Python | Residue interaction network | + | + | | [ |
| Java-based Essential Dynamics (JED) | Essential dynamics | - | + | Java | Principal component analysis (PCA) | - | + | | [ |
| DynaComm | Allostery | - | + | Python | Distance and correlation-based graphs, Dijkstra algorithm | + | + | | [ |
| Computation of allosteric mechanism by evaluating residue–residue associations (CAMERRA) | Allostery | - | + | Perl, Python, C | PCA, contact analysis | - | + | | [ |
| AQUA-DUCT | Ligand movement | - | + | Python | Geometry analysis | - | + | | [ |
| CaverDock | Ligand movement | + | + | Python | Molecular docking | + | + | | [ |
Figure 2. Predicting engineering hotspots for protein dynamics based on analyses of interaction networks and coordinated movements. (A) Functional protein dynamics can be represented by a conformational ensemble of a given protein. (B) This ensemble can be subjected to contact analysis to identify residue–residue interaction networks (left) or subjected to PCA to reveal coupled movements indicated by blue arrows (right). (C) Using either of these two approaches or their combination, hotspot residues (blue spheres) essential for the dynamics or allosteric communication can be selected for engineering.
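The two analyses in panel (B) can be illustrated with a minimal NumPy sketch. It is not the implementation of RIP-MD, JED, or any other tool listed above. The function names, the 8 Å contact cutoff, and the synthetic six-residue ensemble are assumptions made for illustration, and the frames are assumed to be already superimposed on a common reference.

```python
import numpy as np

def contact_frequency(coords, cutoff=8.0):
    """Fraction of frames in which each residue pair lies within `cutoff` (angstrom).

    coords: array of shape (n_frames, n_residues, 3), e.g. C-alpha positions.
    """
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)            # (n_frames, n_res, n_res)
    return (dist < cutoff).mean(axis=0)

def principal_motions(coords, n_modes=2):
    """Cartesian PCA; returns the top eigenvalues and per-residue mode amplitudes."""
    n_frames, n_res, _ = coords.shape
    flat = coords.reshape(n_frames, n_res * 3)
    centered = flat - flat.mean(axis=0)
    cov = centered.T @ centered / (n_frames - 1)
    evals, evecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_modes]       # keep the largest-variance modes
    modes = evecs[:, order].T.reshape(n_modes, n_res, 3)
    amplitudes = np.linalg.norm(modes, axis=-1)     # (n_modes, n_res)
    return evals[order], amplitudes

# Synthetic, hypothetical ensemble: a straight six-residue chain with small noise,
# in which the last residue swings along y with a large amplitude.
rng = np.random.default_rng(0)
n_frames, n_res = 200, 6
base = np.zeros((n_res, 3))
base[:, 0] = 3.8 * np.arange(n_res)
coords = base + 0.05 * rng.standard_normal((n_frames, n_res, 3))
coords[:, 5, 1] += 3.0 * np.sin(np.linspace(0, 4 * np.pi, n_frames))

evals, amps = principal_motions(coords, n_modes=1)
hotspot = int(np.argmax(amps[0]))                   # residue dominating the first mode
freq = contact_frequency(coords)                    # persistent contacts form the network
```

On real trajectories, rigid-body rotation and translation must be removed before the PCA step, and residue pairs with a high contact frequency would be passed on to a graph analysis rather than inspected directly.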
Figure 3. Hotspot detection based on ligand transport analyses. (A) The AQUA-DUCT tool traces the movement of ligands via void spaces (blue lines) inside the scope region (dotted orange shapes) of the protein moiety throughout an MD trajectory. Only the ligands that reach the functionally important object region (dotted violet ellipses) are considered. The significance of the interactions of transported ligands with residues (grey spheres) along the ligand trajectory (black arrows) can be evaluated to select relevant hotspots (blue spheres) for the modification of the transport kinetics. (B) By iteratively docking the ligand along a molecular tunnel, CaverDock estimates the energy profile of ligand transport, indicating residues that are most likely responsible for energy barriers in the path. These residues represent hotspots (blue spheres) for the design of new protein variants with altered ligand transport.
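The idea in panel (B), reading hotspots off an energy profile, can be sketched in a few lines of Python. This is a toy post-processing step under stated assumptions, not CaverDock's API or output format: the energy values, the residue labels, and the `highest_barrier` helper are all hypothetical.

```python
def highest_barrier(energies):
    """Largest uphill step along the path.

    Returns (barrier_height, index_of_top, index_of_preceding_minimum), where the
    height is the energy at the top minus the lowest energy seen before it.
    """
    best = (0.0, 0, 0)
    min_idx = 0
    for j, energy in enumerate(energies):
        if energy < energies[min_idx]:
            min_idx = j                      # new lowest point so far
        height = energy - energies[min_idx]
        if height > best[0]:
            best = (height, j, min_idx)
    return best

# Hypothetical binding energies (kcal/mol) at successive positions along a tunnel,
# ordered from the protein surface to the active site.
profile = [-2.0, -3.5, -3.0, -1.0, 1.5, -0.5, -4.0, -6.0]

# Hypothetical tunnel-lining residues at each position.
lining = [
    ["A145"], ["A145", "F149"], ["F149"], ["L177", "W141"],
    ["L177", "W141", "V245"], ["V245"], ["H272"], ["D106", "H272"],
]

height, top, start = highest_barrier(profile)
hotspots = lining[top]                       # residues lining the bottleneck
```

Here the barrier of 5.0 kcal/mol peaks at the fifth position, so the residues lining that position would be the first candidates for mutagenesis aimed at altering transport.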
Computational protocols implementing protein flexibility for protein design and redesign.
| Primary Package | Category | Method | Short Description | Input | Sampling of Side-Chain and Backbone Flexibility | Package | Add-Ons | Reference |
|---|---|---|---|---|---|---|---|---|
| Rosetta | Ensemble-based | Flex ddG | Estimating interface ∆∆G values upon mutation | Static structure | Backrub, torsion minimization, side-chain repacking | | | [ |
| | | Rosetta:MSF | Multistate framework using single-state protocols | Ensemble | Genetic algorithm based sequence optimizer and user-defined evaluator from Rosetta protocols | | - | [ |
| | | Meta-multistate design (meta-MSD) | Engineering protein dynamics by meta-multistate design | Set of ensembles | Fast and accurate side-chain topology and energy refinement algorithm for sequence optimization; backbone-dependent rotamer library optimization for side-chains | | PHOENIX scripts upon request | [ |
| | Knowledge-based | Flexible backbone learning by Gaussian processes (FlexiBaL-GP) | Learning global protein backbone movements from multiple structures | Ensemble | Markov Chain Monte Carlo sampling: 95% time spent on the side-chain selection and 5% time spent on the generation of the backbone movement | | - | [ |
| | | Structural homology algorithm for protein design (SHADES) | Protein design guided by local structural environments from known structures | Static structure | Sequence assembly from fragments followed by backbone optimization, side-chains repacking, and structure relaxation | | | [ |
| OSPREY 3.0 | Provable | Coordinates of atoms by Taylor series (CATS) | Enabling progressive backbone motions during protein design | Static structure | Continuous, strictly localized perturbations of the given segment of the backbone using a new internal coordinate system compatible with dead-end elimination workflows | | - | [ |
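
The dead-end elimination workflows mentioned for the provable category rest on a simple pruning rule, sketched below for discrete rotamers on a fixed backbone. This is the classic singles criterion in toy form, not OSPREY's code and not the CATS backbone treatment; the `dee_prune` function, its data layout, and the energy values are assumptions for illustration.

```python
import numpy as np

def dee_prune(self_e, pair_e):
    """Iteratively prune rotamers that cannot belong to the global minimum.

    self_e: list of 1-D arrays, self_e[i][r] = self energy of rotamer r at position i.
    pair_e: dict {(i, j): 2-D array} with i < j, pair_e[(i, j)][r, s] = pair energy.
    Returns one boolean mask of surviving rotamers per position.
    """
    self_e = [np.asarray(e, dtype=float) for e in self_e]
    pair_e = {key: np.asarray(val, dtype=float) for key, val in pair_e.items()}
    n = len(self_e)
    alive = [np.ones(len(e), dtype=bool) for e in self_e]

    def pair(i, j):
        return pair_e[(i, j)] if i < j else pair_e[(j, i)].T

    changed = True
    while changed:
        changed = False
        for i in range(n):
            lower = self_e[i].copy()         # best-case energy of each rotamer
            upper = self_e[i].copy()         # worst-case energy of each rotamer
            for j in range(n):
                if j == i:
                    continue
                block = pair(i, j)[:, alive[j]]
                lower += block.min(axis=1)
                upper += block.max(axis=1)
            # A rotamer is dead-ending if even its best case is worse than
            # the worst case of some surviving competitor at the same position.
            best_worst = upper[alive[i]].min()
            dead = alive[i] & (lower > best_worst)
            if dead.any():
                alive[i] &= ~dead
                changed = True
    return alive

# Hypothetical two-position problem with two rotamers each (arbitrary energy units).
self_e = [[0.0, 10.0], [0.0, 1.0]]
pair_e = {(0, 1): [[0.0, 1.0], [2.0, 0.0]]}
masks = dee_prune(self_e, pair_e)
surviving = [mask.tolist() for mask in masks]
```

Because a rotamer is only removed when a competitor is guaranteed to be better in every context, the pruning never discards the global minimum energy conformation, which is what makes this family of algorithms provable.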
Figure 4. Flexible-backbone approaches facilitating the successful design of more diverse protein variants. (A) By employing a structural ensemble of a given protein, a larger variety of residues can be introduced to additional positions (green ticks), including those buried in the protein core, which would otherwise cause steric clashes (orange explosion-like shapes). (B) Data on protein dynamics encoded in different experimental structures or predicted ensembles can be extracted in the form of tertiary motifs (grey dotted circle) of interacting residues (pink arrows). Analogously, machine learning methods can learn and generalize the data to inspire novel backbone movements (grey arrows). The derived knowledge then enables the efficient application of more pronounced, yet physically correct, backbone perturbations during the design procedure.
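The effect described in panel (A) can be illustrated with a toy comparison of single-structure and ensemble-based screening. The sketch assumes that each candidate substitution has already been scored on every backbone state by some energy function. The `tolerated` helper, the threshold, and the energy values are hypothetical, and taking the minimum over states is only one of several possible ways to combine ensemble scores.

```python
import numpy as np

def tolerated(energies, threshold=0.0, use_ensemble=True):
    """Flag candidate substitutions whose score does not exceed `threshold`.

    energies: array of shape (n_candidates, n_states); column 0 is the single
    reference structure, the remaining columns are other ensemble members.
    """
    e = np.asarray(energies, dtype=float)
    # Ensemble mode accepts a candidate if any backbone state accommodates it.
    score = e.min(axis=1) if use_ensemble else e[:, 0]
    return score <= threshold

# Hypothetical scores (arbitrary units) for three substitutions at a buried position.
candidates = ["Ala", "Phe", "Trp"]
energies = [
    [-1.0, -0.8, -0.9],   # small residue: fits every state
    [5.0, -0.5, 2.0],     # clashes in the reference, fits a relaxed state
    [9.0, 7.5, 8.0],      # clashes in every state
]

single = tolerated(energies, use_ensemble=False)
ensemble = tolerated(energies, use_ensemble=True)
rescued = [c for c, s, e in zip(candidates, single, ensemble) if e and not s]
```

In this toy case the ensemble rescues the substitution that clashes only with the reference backbone, while a residue that clashes with every state is still rejected.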