| Literature DB >> 34978000 |
Benjamin Ries1, Karl Normak1, R Gregor Weiß1, Salomé Rieder1, Emília P Barros1, Candide Champion1, Gerhard König1, Sereina Riniker2.
Abstract
The calculation of relative free-energy differences between different compounds plays an important role in drug design to identify potent binders for a given protein target. Most rigorous methods based on molecular dynamics simulations estimate the free-energy difference between pairs of ligands. Thus, the comparison of multiple ligands requires the construction of a "state graph", in which the compounds are connected by alchemical transformations. The computational cost can be optimized by reducing the state graph to a minimal set of transformations. However, this may require individual adaptation of the sampling strategy if a transformation process does not converge in a given simulation time. In contrast, path-free methods like replica-exchange enveloping distribution sampling (RE-EDS) allow the sampling of multiple states within a single simulation without the pre-definition of alchemical transition paths. To optimize sampling and convergence, a set of RE-EDS parameters needs to be estimated in a pre-processing step. Here, we present an automated procedure for this step that determines all required parameters, improving the robustness and ease of use of the methodology. To illustrate the performance, the relative binding free energies are calculated for a series of checkpoint kinase 1 inhibitors containing challenging transformations in ring size, opening/closing, and extension, which reflect changes observed in scaffold hopping. The simulation of such transformations with RE-EDS can be conducted with conventional force fields and, in particular, without soft bond-stretching terms.Entities:
Keywords: Enveloping distribution sampling; Free energy calculation; Molecular dynamics; Protein-ligand binding; Replica exchange
Mesh:
Substances:
Year: 2022 PMID: 34978000 PMCID: PMC8907147 DOI: 10.1007/s10822-021-00436-z
Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 3.686
Fig. 1State graph to calculate relative binding free energies, where the nodes represent specific compounds A–E in a particular environment (water/protein). The connecting (directed) edges describe the transformations from one end state to another. The dashed-dotted arrows denote the (absolute) binding free energy of compound i to the protein, , whereas solid arrows indicate alchemical transformations between compound i to compound j in a given environment. From the resulting , can be calculated (gray dashed arrows) and compared with the value obtained from the difference of the experimentally determined . In pathway-dependent methods, each edge between two end states is calculated separately. With (RE-)EDS, all end states in a given environment can be considered simultaneously in a single simulation of a reference state (green circles)
Fig. 2Schematic illustration of the effect of the two types of EDS reference-state parameters. a The smoothing parameter s decreases the barriers between the end states. If s is too small, an “undersampling” situation occurs with a global unphysical minimum. b The energy offsets provide equal weighting to all end states in the EDS reference state. The figure was generated with Ensembler [27]
Fig. 3Schematic illustration of RE-EDS with three harmonic oscillators as end states (A, B, and C). Each replica differs by the s-parameter, generating reference states with a different degree of smoothness. Sampling of each replica is denoted with orange dots. Exchanges between the replicas are indicated with green arrows. The replica graph shows three regions: a “physical” region where s is close to 1, a transition region, and the “undersampling” region when s approaches zero. The figure was generated with Ensembler [27]
Fig. 4The RE-EDS workflow can be split into four steps: (1) Input stage with energy offsets set to and a set of s-parameters logarithmically distributed between 1 and ; (2) Parameter exploration to determine the lower bound for s, to obtain equilibrated coordinates for each end state, and to estimate initial energy offsets with the PEOE scheme [19]; (3) Parameter optimization to improve the s-distribution with the N-LRTO algorithm [20] and the state sampling with energy offset rebalancing; (4) Production run and calculation of the free-energy differences
Fig. 5(Top): 3D depiction of the five CHK1 inhibitors L1, L17, L19, L20, and L21 (numbering according to Ref. [23]). The selected locations of the distance restraints are indicated by the silver spheres. (Bottom): CHK1 protein in complex with the ligand bundle (PDB ID:3U9N)
Energy offsets estimated from a short RE-EDS simulation using the PEOE [19] scheme
| Ligand | Water | Complex | ||
|---|---|---|---|---|
| RE-EDS 1SS [kJ mol | RE-EDS SSM [kJ mol | RE-EDS 1SS [kJ mol | RE-EDS SSM [kJ mol | |
| L1 | 0.0 | 0.0 | 0.0 | 0.0 |
| L17 | ||||
| L19 | ||||
| L20 | ||||
| L21 | ||||
The errors indicate the standard deviation over the different replicas in undersampling. All energy offsets were calculated relative to ligand L1. The starting coordinates were selected following the 1SS or the SSM approach (see “Theory” and “Methods” sections)
Fig. 6Optimization steps of the s-distribution with the N-LRTO [20] algorithm followed by the energy offset rebalancing scheme (start indicated by the red horizontal line). The measured quality criteria were the number of round trips (1. row), the average round-trip time (2. row), the placement of the replicas in s-space (3. row), and the sampling fractions of maximally contributing states (4. row). The light colored bars of indicate s-optimization iterations, whereas the fully colored bars indicate energy offset rebalancing steps
Fig. 7Comparison of the Boltzmann reweighted potential-energy distributions obtained from standard MD simulations of a given end state (black) and from the RE-EDS production runs with the 1SS (green) and SSM (turquoise, dashed) approaches
Fig. 8Free-energy differences estimated from the production run of 3.5 ns length. (Top): Comparison between the experimental and calculated using RE-EDS 1SS and RE-EDS SSM. The results were calculated with all possible pairwise transformations (forward and backward). (Bottom): Graphical representation of the results with structures, inspired by the one in Ref. [24]
Relative binding free energies from experiment and calculated with the RE-EDS 1SS and RE-EDS SSM approaches
| Ligands | Exp. [ | FEP+ [ | FEP+ CC [ | QligFEP [ | RE-EDS 1SS | RE-EDS SSM | |
|---|---|---|---|---|---|---|---|
| [kJ mol | [kJ mol | [kJ mol | [kJ mol | [kJ mol | [kJ mol | ||
| L17 | L1 | 0.1 | − 3.6 ± 0.4 | − 2.9 ± 1.0 | − 1.6 ± 1.7 | 5.1 ± 0.8 | 3.0 ± 2.0 |
| L19 | L1 | − 4.8 | − 3.9 ± 0.3 | − 4.0 ± 0.6 | − 1.7 ± 2.0 | 3.0 ± 1.0 | − 5.0 ± 0.1 |
| L20 | L1 | − 2.0 | − 2.5 ± 0.1 | − 3.1 ± 1.0 | − 1.3 ± 1.3 | 0.2 ± 0.9 | 0.5 ± 0.1 |
| L21 | L1 | − 2.3 | − 1.4 ± 0.8 | 3.2 ± 0.1 | |||
| L19 | L17 | − 4.9 | − 1.4 ± 0.3 | − 1.1 ± 1.0 | − 2.1 ± 0.6 | − 7.9 ± 1.9 | |
| L20 | L17 | − 2.1 | 0.3 ± 0.4 | − 0.1 ± 0.8 | − 1.3 ± 2.3 | − 4.9 ± 0.1 | − 2.5 ± 1.9 |
| L21 | L17 | − 2.4 | − 1.1 ± 0.4 | − 0.9 ± 0.9 | − 6.5 ± 0.1 | 0.2 ± 1.9 | |
| L20 | L19 | 2.8 | − 2.7 ± 0.6 | 5.4 ± 0.1 | |||
| L21 | L19 | 2.5 | − 0.1 ± 0.6 | 0.6 ± 0.1 | 0.6 ± 4.9 | − 4.4 ± 0.6 | 8.2 ± 0.1 |
| L21 | L20 | − 0.3 | − 0.3 ± 0.8 | − 0.6 ± 0.8 | 0.6 ± 1.1 | − 1.6 ± 0.1 | − 2.7 ± 0.1 |
| RMSE | 2.4 ± 0.3 | 2.1 ± 0.2 | 2.3 ± 0.38 | 4.8 ± 0.6 | 3.3 ± 0.3 | ||
| MAE | 1.8 ± 1.2 | 1.9 ± 1.0 | 2.0 ± 1.2 | 3.9 ± 2.8 | 2.8 ± 1.7 | ||
| 0.67 | 0.73 | 0.61 | − 0.01 | 0.69 | |||
| 640 | 640 | 51 | 171.5 | 157.5 | |||
For comparison, the results for FEP+ with and without cycle closure (CC) correction taken from Ref. [24] and the results for QligFEP taken from Ref. [13] are listed. The free-energy differences of directly simulated paths were used to infer not directly simulated free-energy differences (marked in bold). If multiple indirect paths were possible, their average was used. The errors for QligFEP were determined in Ref. [13] by calculating the standard deviation over ten replicas. For FEP+, the error of the results was taken from the used BAR [54] method and the FEP+ CC errors were obtained from the cycle closure analysis. For the RE-EDS approaches, the reported error is based on the statistical uncertainties of the values estimated using Gaussian error approximation [15]. The uncertainty estimate of the RMSE was obtained by a 100-fold bootstrapping approach
Absolute binding free energies and ranking of the ligands derived from the relative binding free energies
| Ligands | Exp. [ | FEP+ [ | FEP+ CC [ | QligFEP [ | RE-EDS 1SS | RE-EDS SSM |
|---|---|---|---|---|---|---|
| Molecule | [kJ mol | [kJ mol | [kJ mol | [kJ mol | [kJ mol | [kJ mol |
| L1 | − 40.7 | − 41.7 ± 1.7 | − 41.7 ± 0.9 | − 38.5 ± 1.5 | − 40.0 ± 3.4 | − 38.0 ± 2.0 |
| L17 | − 40.8 | − 38.0 ± 1.0 | − 38.2 ± 1.1 | − 38.6 ± 1.3 | − 33.7 ± 1.3 | − 41.7 ± 2.3 |
| L19 | − 35.9 | − 38.1 ± 0.9 | − 38.3 ± 1.8 | − 38.3 ± 1.0 | − 37.6 ± 3.3 | − 33.0 ± 2.0 |
| L20 | − 38.6 | − 38.6 ± 1.6 | − 38.3 ± 1.4 | − 39.2 ± 1.7 | − 40.4 ± 3.3 | − 39.1 ± 2.3 |
| L21 | − 38.4 | − 37.7 ± 1.4 | − 37.8 ± 1.3 | − 39.4 ± 1.9 | − 42.4 ± 2.9 | − 42.5 ± 1.4 |
| RMSE | 1.7 ± 0.4 | 1.7 ± 0.4 | 1.7 ± 0.4 | 3.8 ± 1.3 | 2.6 ± 0.6 | |
| MAE | 1.3 ± 1.0 | 1.4 ± 0.9 | 1.4 ± 0.9 | 3.0 ± 2.3 | 2.2 ± 1.6 | |
| 0.20 | 0.10 | − 0.21 | − 0.40 | 0.30 |
The values were calculated from the relative binding free energies using an experimental binding free energy as anchor point, and then averaged over the five possibilities. The errors are standard deviations over the possible outcomes. For comparison, the results for FEP+ with and without cycle closure (CC) correction taken from Ref. [24] and the results for QligFEP taken from Ref. [13] are shown (calculated with the same procedure). The uncertainty estimate of the RMSE was obtained by a 100-fold bootstrapping approach