C Ruben Vosmeer, Derk P Kooi, Luigi Capoferri, Margreet M Terpstra, Nico P E Vermeulen, Daan P Geerke.
Abstract
Recently, an iterative method was proposed to enhance the accuracy and efficiency of ligand-protein binding affinity prediction through linear interaction energy (LIE) theory. For ligand binding to flexible Cytochrome P450s (CYPs), this method was shown to decrease the root-mean-square error and the standard deviation in error prediction by combining interaction energies from simulations starting from different conformations. In this way, different parts of protein-ligand conformational space are sampled in parallel simulations. The iterative LIE framework relies on the assumption that separate simulations explore different local parts of phase space and do not show transitions to other parts of configurational space that are already covered in parallel simulations. In this work, a method is proposed to automatically detect such transitions during the simulations that are performed to construct LIE models and to predict binding affinities. Using noise-canceling techniques and splines to fit time series of the raw interaction energy data, transitions between different parts of phase space during a simulation are identified. Boolean selection criteria are then applied to determine which parts of the interaction energy trajectories are used as input for the LIE calculations. Here we show that this filtering approach benefits the predictive quality of our previous CYP 2D6-aryloxypropanolamine LIE model. In addition, we analyze the gain in computational efficiency that can be obtained by monitoring simulations with the proposed filtering method and terminating them prematurely accordingly.
Keywords: Binding free energy prediction; Cytochrome P450 2D6; Iterative Linear Interaction Energy approach; Molecular Dynamics simulations
Year: 2016 PMID: 26757914 PMCID: PMC4710667 DOI: 10.1007/s00894-015-2883-y
Source DB: PubMed Journal: J Mol Model ISSN: 0948-5023 Impact factor: 1.810
Fig. 1 The sequence of filtering steps applied to the raw data for the electrostatic (black) and van der Waals (red) protein–ligand interaction energies (thin continuous lines in the lower panel). The data are first filtered using a Fourier transform (dashed lines, lower panel), then splines are fitted to the filtered data (thick lines, lower panel). The gradients of the fitted splines are then calculated (straight lines, upper panel). A change in conformation is considered to have occurred when the absolute value of the gradient exceeds a predefined cut-off (set to 0.2 kJ mol⁻¹ ps⁻¹, dashed line, upper panel)
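The filtering sequence in the caption of Fig. 1 can be sketched roughly as follows. This is a minimal illustration with NumPy and SciPy, not the authors' implementation: the function names, the low-pass frequency `max_freq_per_ps`, the mirror padding, the spline `smoothing` value, and the synthetic energy trace are all assumptions made here. Only the gradient cut-off of 0.2 kJ mol⁻¹ ps⁻¹ comes from the caption.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def fourier_lowpass(energy, dt_ps, max_freq_per_ps):
    """Zero all Fourier components above max_freq_per_ps (assumed parameter)."""
    energy = np.asarray(energy, dtype=float)
    n = len(energy)
    # Assumption: mirror the series so its periodic extension has no jump
    # at the ends, which would otherwise cause ringing after the sharp cut.
    padded = np.concatenate([energy, energy[::-1]])
    spectrum = np.fft.rfft(padded)
    freqs = np.fft.rfftfreq(len(padded), d=dt_ps)
    spectrum[freqs > max_freq_per_ps] = 0.0
    return np.fft.irfft(spectrum, n=len(padded))[:n]


def flag_transitions(time_ps, energy, max_freq_per_ps=0.01,
                     smoothing=0.0, grad_cutoff=0.2):
    """Return (gradient, mask); mask is True where |dV/dt| > grad_cutoff."""
    dt_ps = time_ps[1] - time_ps[0]
    filtered = fourier_lowpass(energy, dt_ps, max_freq_per_ps)
    # Cubic spline through the filtered data; smoothing=0.0 interpolates.
    spline = UnivariateSpline(time_ps, filtered, k=3, s=smoothing)
    gradient = spline.derivative()(time_ps)  # kJ mol^-1 ps^-1
    return gradient, np.abs(gradient) > grad_cutoff


# Synthetic (made-up) interaction energy: a shift of 40 kJ/mol near 500 ps
# plus Gaussian noise, sampled every 1 ps.
rng = np.random.default_rng(0)
time_ps = np.arange(1000.0)
energy = (-100.0 + 40.0 / (1.0 + np.exp(-(time_ps - 500.0) / 10.0))
          + rng.normal(0.0, 1.0, time_ps.size))
gradient, mask = flag_transitions(time_ps, energy)
print("flagged window (ps):", time_ps[mask].min(), time_ps[mask].max())
```

In practice the same function would be applied separately to the electrostatic and van der Waals series, and the resulting masks would feed the selection criteria described in the Methods section.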
Root-mean-square error (RMSE) and standard deviation in error prediction (SDEP) values for LIE models with the ligand interaction energies in Eq. 3 averaged over various time spans L of simulations i
| | 200 ps | | 400 ps | | 600 ps | | 1000 ps | |
|---|---|---|---|---|---|---|---|---|
| | RMSE | SDEP | RMSE | SDEP | RMSE | SDEP | RMSE | SDEP |
| Unfilteredᵃ | 6.35 | 8.69 | 6.05 | 8.56 | 6.14 | 8.50 | 6.03 | 8.46 |
| Filteredᵇ | 5.76 | 8.41 | 5.63 | 8.02 | 5.69 | 8.05 | | |
| Filter+extᶜ | 5.89 | 8.52 | 5.68 | 8.08 | 5.65 | 8.05 | | |
a Ligand interaction energies in protein simulations averaged over the time span ranging from 0 ps to L
b Time spans selected according to the protocol in the Methods section
c Same as b, but step (2) of the protocol is omitted
Average time per simulation i in Eq. 3 (sim.) needed to calibrate the models reported in Tables 1 and 2
| Average sim. time | 200 ps | | 400 ps | | 600 ps | | 1000 ps | |
|---|---|---|---|---|---|---|---|---|
| | usedᵈ (ps) | corr.ᵉ (ps) | usedᵈ (ps) | corr.ᵉ (ps) | usedᵈ (ps) | corr.ᵉ (ps) | usedᵈ (ps) | corr.ᵉ (ps) |
| Unfilteredᵃ | 200 | 200 | 400 | 400 | 600 | 600 | 1000 | 1000 |
| Filteredᵇ | 194 | 283 | 342 | 656 | 435 | 851 | | |
| Filter+extᶜ | 526 | 614 | 539 | 853 | 540 | 957 | | |
a Ligand interaction energies in protein simulations averaged over the time span ranging from 0 ps to L
b Time spans selected according to the protocol discussed in the Methods section
c Same as b, but step (2) of the protocol is omitted
d Simulation times are counted from 0 ps until the end of the time span defined in b or c
e Same as d, but for simulations for which the selected time span is shorter than L, 1000 ps was used in the calculation of the average time
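The difference between the "used" and "corr." averages in footnotes d and e can be illustrated with a short sketch. It reflects one reading of the footnotes, not the authors' script: the function name `average_sim_times`, the representation of each selected time span as a `(start, end)` pair, and the example spans are hypothetical.

```python
def average_sim_times(spans_ps, L_ps, full_length_ps=1000.0):
    """Return (used, corrected) average simulation times in ps.

    spans_ps: list of (start, end) selected time spans, one per simulation.
    used: each simulation counts from 0 ps to the end of its span (footnote d).
    corrected: simulations whose span is shorter than L_ps count as
    full_length_ps instead (footnote e, as interpreted here).
    """
    used = [end for _, end in spans_ps]
    corrected = [
        full_length_ps if (end - start) < L_ps else end
        for start, end in spans_ps
    ]
    n = len(spans_ps)
    return sum(used) / n, sum(corrected) / n


# Hypothetical spans for three simulations with L = 200 ps.
spans = [(0.0, 200.0), (100.0, 300.0), (0.0, 150.0)]
used_avg, corr_avg = average_sim_times(spans, L_ps=200.0)
print(used_avg, corr_avg)
```

Under this reading, the corrected average is never smaller than the used average, which is consistent with the columns in the table above.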
α and β values for LIE models with the ligand interaction energies in Eq. 3 averaged over various time spans L of simulations i
| | 200 ps | | 400 ps | | 600 ps | | 1000 ps | |
|---|---|---|---|---|---|---|---|---|
| | α | β | α | β | α | β | α | β |
| Unfilteredᵃ | 0.442 | 0.078 | 0.446 | 0.080 | 0.447 | 0.084 | 0.448 | 0.090 |
| Filteredᵇ | 0.441 | 0.088 | 0.444 | 0.088 | 0.444 | 0.088 | | |
| Filter+extᶜ | 0.442 | 0.087 | 0.445 | 0.091 | 0.445 | 0.090 | | |
a Ligand interaction energies in protein simulations averaged over the time span ranging from 0 ps to L
b Time spans selected according to the protocol in the Methods section
c Same as b, but step (2) of the protocol is omitted
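For orientation, the standard single-simulation LIE expression in which α and β appear is given below. It is the general form of LIE theory, not a reproduction of Eq. 3, which is not included in this excerpt. The subscripts "bound" and "free" (ligand simulated in the protein and free in solvent) are notation chosen here.

```latex
\Delta G_{\mathrm{calc}}
  = \alpha \left( \langle V_{\mathrm{vdW}} \rangle_{\mathrm{bound}}
                - \langle V_{\mathrm{vdW}} \rangle_{\mathrm{free}} \right)
  + \beta  \left( \langle V_{\mathrm{el}} \rangle_{\mathrm{bound}}
                - \langle V_{\mathrm{el}} \rangle_{\mathrm{free}} \right)
```

Here α scales the van der Waals term and β the electrostatic term, so the values in the table above weight the differences in the averaged ligand interaction energies.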
Fig. 2 Comparison between the ‘ns’ model and the filtered model with L set to 200 ps. The base of each arrow is located on the result of the ‘ns’ model, while the arrow points at the result of the filtered model