Igor V Oferkin, Ekaterina V Katkova, Alexey V Sulimov, Danil C Kutov, Sergey I Sobolev, Vladimir V Voevodin, Vladimir B Sulimov.
Abstract
The choice of the docking target function affects both the accuracy of ligand positioning and the accuracy of the calculated protein-ligand binding energy. To evaluate a docking target function, we compared the positions of its minima with the experimentally known pose of the ligand in the protein active site. We evaluated five docking target functions based on either the MMFF94 force field or the PM7 quantum-chemical method, with or without implicit solvent models (PCM, COSMO, and SGB). Each function was tested on the same set of 16 protein-ligand complexes. For an exhaustive search of low-energy minima, the novel MPI-parallelized docking program FLM and large supercomputer resources were used. Protein-ligand binding energies calculated from the low-energy minima were compared with experimental values. The docking target function based on the MMFF94 force field in vacuo can be used to find native or near-native ligand positions by determining the spectrum of low-energy local minima of the target function. Solute-solvent interaction is shown to be important for correct ligand positioning, and docking accuracy can be improved by replacing the MMFF94 force field with the newer semiempirical quantum-chemical PM7 method.
Year: 2015 PMID: 26693223 PMCID: PMC4674582 DOI: 10.1155/2015/126858
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Parallelization efficiency of the FLM program: number of finished optimizations (w) in 3 hours as a function of the number of nodes (N), that is, of the number of working processes n = 8·N − 1.
| N (nodes) | n (working processes) | w (finished optimizations) | w/n |
|---|---|---|---|
| 1 | 7 | 329 | 47 |
| 32 | 255 | 11567 | 45 |
| 1024 | 8191 | 379885 | 46 |
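The scaling in this table can be checked with a few lines of arithmetic. The sketch below recomputes n = 8·N − 1 and the number of finished optimizations per working process from the N and w values in the table; the function names are hypothetical and are not part of the FLM program.

```python
# (nodes N, finished optimizations w in 3 hours), copied from the table above.
ROWS = [(1, 329), (32, 11567), (1024, 379885)]


def working_processes(nodes: int) -> int:
    """Number of working processes, n = 8*N - 1, as given in the caption."""
    return 8 * nodes - 1


def per_process_throughput(nodes: int, finished: int) -> float:
    """Finished optimizations per working process (w / n)."""
    return finished / working_processes(nodes)


if __name__ == "__main__":
    for nodes, finished in ROWS:
        n = working_processes(nodes)
        print(f"N={nodes:5d}  n={n:5d}  w={finished:7d}  w/n={finished / n:.1f}")
```

A nearly constant w/n from 1 to 1024 nodes is what close-to-linear scaling looks like: each working process completes roughly the same number of optimizations regardless of how many other processes are running.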
Figure 1. Number of updates of the local minima set as a function of the total number of performed “test optimizations” for the 1SQO (black, lower line) and 1VJA (red, upper line) protein-ligand complexes. Saturation means that the number of updates no longer changes as the number of test optimizations increases. The target function was the MMFF94 energy in vacuum.
IN/INN values for all 16 tested protein-ligand complexes and all 7 minima sets. “inf” for IN means that all 1024 calculated low-energy minima have energy below the energy of the optimized native ligand. “inf” for INN means that all the low-energy minima found by FLM have an RMSD from the native position above 2 Å.
| PDB ID | {1}MMFF94 | {1}MMFF94 + PCM | {1}PM7 | {1}PM7 + COSMO | {2}MMFF94 | {2}MMFF94 + PCM | {2}MMFF94 + SGB |
|---|---|---|---|---|---|---|---|
| 4FT0 | 36/20 | 8/7 | 37/12 | 1/1 | 180/99 | 164/159 | 8/6 |
| 4FT9 | 45/28 | 1/1 | 25/6 | 1/1 | 194/125 | 3/1 | 1/1 |
| 4FSW | 5/5 | 6/6 | 40/40 | 12/13 | 110/102 | 134/140 | 21/3 |
| 4FTA | inf/inf | 4/inf | 379/inf | 1/inf | inf/inf | 186/187 | 97/97 |
| 4FV5 | 204/131 | 3/3 | 253/194 | 2/1 | 186/134 | 6/3 | 5/5 |
| 4FV6 | inf/inf | 1/inf | 49/inf | 1/inf | 86/289 | 3/68 | 1/24 |
| 1DWC | inf/670 | 245/25 | 689/661 | 158/141 | 245/114 | 250/35 | 107/8 |
| 1TOM | inf/inf | 13/inf | inf/inf | 1/inf | inf/inf | 13/4 | 7/1 |
| 1C5Y | 1/1 | 2/1 | 7/1 | 2/1 | 1/1 | 2/1 | 1/1 |
| 1F5L | 1/1 | 1/1 | 43/16 | 69/30 | 1/1 | 10/1 | 1/1 |
| 1O3P | 20/18 | 21/1 | 5/1 | 3/1 | 69/62 | 274/1 | 130/2 |
| 1SQO | 1/1 | 2/1 | 1/1 | 1/1 | 1/1 | 54/1 | 5/1 |
| 1VJ9 | 46/1 | 86/51 | 32/1 | 26/8 | 6/1 | 11/18 | 10/14 |
| 1VJA | 42/3 | 7/1 | 7/4 | 6/4 | 4/49 | 1/2 | 1/1 |
| 2P94 | 36/2 | 19/1 | 23/6 | 7/1 | 22/1 | 35/1 | 21/1 |
| 3CEN | 96/1 | 18/1 | 13/1 | 3/1 | 90/1 | 35/1 | 13/1 |
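The caption above spells out only the “inf” cases, so the following is a minimal sketch under stated assumptions rather than the authors' procedure. It assumes that IN is the 1-based rank the optimized native pose would take in the energy-sorted list of minima, and that INN is the rank of the lowest-energy minimum whose RMSD from the native position is at most 2 Å. The function names and the toy data in the usage note are hypothetical.

```python
import math


def index_native(minima_energies, native_energy):
    """Assumed IN: 1-based rank of the optimized native energy among the minima.

    Returns math.inf when every minimum lies below the native energy,
    which mirrors the "inf" case described in the caption.
    """
    below = sum(1 for energy in minima_energies if energy < native_energy)
    return math.inf if below == len(minima_energies) else below + 1


def index_near_native(minima, rmsd_cutoff=2.0):
    """Assumed INN: rank of the lowest-energy minimum within rmsd_cutoff (in Å).

    `minima` is a list of (energy, rmsd_from_native) pairs.
    Returns math.inf when no minimum is within the cutoff.
    """
    ordered = sorted(minima, key=lambda pair: pair[0])
    for rank, (_, rmsd) in enumerate(ordered, start=1):
        if rmsd <= rmsd_cutoff:
            return rank
    return math.inf
```

For a toy set such as `[(-5.0, 3.1), (-4.0, 1.2), (-3.0, 0.5)]`, `index_near_native` returns 2, because the second-lowest minimum is the first one within 2 Å of the native pose.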
Figure 2. The distribution of the RMSD of the low-energy minima from the nonoptimized native ligand positions for 16 complexes for “{1}MMFF94” and “{1}MMFF94 + PCM” sets (a) and for “{2}MMFF94”, “{2}MMFF94 + PCM”, and “{2}MMFF94 + SGB” sets (b). The lowest RMSD values are presented for each complex in the insets.
Figure 3. The number of clusters for each protein-ligand complex.
The indices of the clusters where the native ligand conformation is located. The clusters are sorted here by lowest energies of their conformations in ascending order; that is, the cluster containing the minimum with the lowest protein-ligand energy has index equal to 1. “inf” means that the native position does not fall into any cluster.
| Complex ID | Cluster number with native pose, {1}MMFF94 | Cluster number with native pose, {2}MMFF94 + PCM |
|---|---|---|
| 4FT0 | 10 | 139 |
| 4FT9 | 29 | 1 |
| 4FSW | 4 | 12 |
| 4FTA | inf | 59 |
| 4FV5 | 10 | 3 |
| 4FV6 | inf | 23 |
| 1DWC | 1 | 14 |
| 1TOM | inf | 2 |
| 1C5Y | 1 | 1 |
| 1F5L | 1 | 1 |
| 1O3P | 2 | 1 |
| 1SQO | 1 | 1 |
| 1VJ9 | 1 | inf |
| 1VJA | 2 | 2 |
| 2P94 | 2 | 1 |
| 3CEN | 1 | inf |
The protein-ligand binding energy components: the potential energy and additive corrections to it. $\Delta E$ is the binding potential energy from the MMFF94 force field in vacuum, calculated from the global minima of the protein-ligand complex and the free ligand together with the energy of the free protein. $\Delta G_{\mathrm{vib}}$ is the correction due to vibrational degrees of freedom, calculated with the respective configuration integral (equation (2)). $\Delta G_{\mathrm{trans}}$ and $\Delta G_{\mathrm{rot}}$ are corrections due to translational and rotational degrees of freedom, calculated with the configuration integrals of equations (3) and (4), respectively. The corrections $\Delta G_{\mathrm{vib}}$, $\Delta G_{\mathrm{trans}}$, and $\Delta G_{\mathrm{rot}}$ include both enthalpy and entropy components. $\Delta G_{\mathrm{all}}$ is the correction for multiple minima accounting; it is calculated as the difference between the binding free energy $\Delta G_{\mathrm{bind}}$, calculated with multiple minima accounting, and the binding free energy calculated with only the global minima of the complex and the ligand. $\Delta G_{\mathrm{exp}}$ is the experimental binding energy calculated from the binding constant.
| Protein | PDB ID | $\Delta G_{\mathrm{exp}}$ | $\Delta G_{\mathrm{bind}}$ | $\Delta E$ | $\Delta G_{\mathrm{vib}}$ | $\Delta G_{\mathrm{trans}}$ | $\Delta G_{\mathrm{rot}}$ | $\Delta G_{\mathrm{all}}$ |
|---|---|---|---|---|---|---|---|---|
| CHK1 | 4FT0 | −10.1 | 82.2 | 63.8 | −3.9 | 10.6 | 10.3 | 1.4 |
| CHK1 | 4FT9 | −10.9 | −48.5 | −63.7 | −5.0 | 10.4 | 9.9 | −0.1 |
| CHK1 | 4FSW | −6.8 | −44.2 | −60.2 | −3.7 | 10.3 | 9.4 | 0.0 |
| CHK1 | 4FTA | −9.8 | −9.8 | −30.4 | −0.2 | 10.6 | 10.1 | 0.1 |
| ERK2 | 4FV5 | −10.9 | −79.3 | −102.1 | 0.5 | 10.7 | 10.6 | 1.0 |
| ERK2 | 4FV6 | −12.3 | −74.6 | −96.8 | −0.4 | 10.8 | 10.7 | 1.1 |
| Thrombin | 1DWC | −10.5 | −128.5 | −144.9 | −4.2 | 10.9 | 10.5 | 0.2 |
| Thrombin | 1TOM | −11.8 | −224.1 | −248.7 | 2.7 | 10.7 | 10.5 | 0.7 |
| Urokinase | 1C5Y | −6.0 | −16.3 | −34.8 | −0.7 | 10.0 | 8.7 | 0.5 |
| Urokinase | 1F5L | −7.5 | 34.7 | 17.4 | −2.3 | 10.2 | 9.3 | 0.1 |
| Urokinase | 1O3P | −9.4 | 23.8 | 3.9 | −2.0 | 10.6 | 10.2 | 1.1 |
| Urokinase | 1SQO | −10.6 | −4.6 | −24.8 | −0.4 | 10.3 | 9.6 | 0.7 |
| Urokinase | 1VJ9 | −10.7 | −25.9 | −50.7 | 2.4 | 11.0 | 10.8 | 0.6 |
| Urokinase | 1VJA | −10.9 | −31.0 | −51.1 | −2.1 | 10.9 | 10.5 | 0.8 |
| Factor Xa | 2P94 | −13.0 | −42.0 | −68.2 | −5.5 | 10.9 | 10.8 | 0.0 |
| Factor Xa | 3CEN | −11.7 | −49.2 | −69.5 | −1.9 | 10.9 | 10.6 | 0.7 |
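Because the caption describes the corrections as additive, the binding free energy with multiple minima accounting should equal the potential energy term plus the four corrections. The sketch below checks this for the four CHK1 rows. It assumes the numeric columns are, in order, the experimental energy, the calculated binding free energy, the potential energy, and the vibrational, translational, rotational, and multiple-minima corrections; the variable and function names are hypothetical.

```python
# CHK1 rows from the table above:
# (PDB ID, dG_bind, dE, dG_vib, dG_trans, dG_rot, dG_all)
CHK1_ROWS = [
    ("4FT0", 82.2, 63.8, -3.9, 10.6, 10.3, 1.4),
    ("4FT9", -48.5, -63.7, -5.0, 10.4, 9.9, -0.1),
    ("4FSW", -44.2, -60.2, -3.7, 10.3, 9.4, 0.0),
    ("4FTA", -9.8, -30.4, -0.2, 10.6, 10.1, 0.1),
]


def binding_free_energy(d_e, dg_vib, dg_trans, dg_rot, dg_all):
    """Sum of the potential energy term and the additive corrections."""
    return d_e + dg_vib + dg_trans + dg_rot + dg_all


def row_is_consistent(row, tolerance=0.06):
    """True if the tabulated dG_bind matches the sum within rounding error."""
    _, dg_bind, *components = row
    return abs(binding_free_energy(*components) - dg_bind) <= tolerance


if __name__ == "__main__":
    for row in CHK1_ROWS:
        print(row[0], round(binding_free_energy(*row[2:]), 1), row_is_consistent(row))
```

Most rows of the table satisfy this sum to within rounding. The rows for 1DWC and 2P94, as printed here, differ from the sum by 1.0 and 10.0, respectively.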
Binding energies (in kcal/mol) calculated as E 0 1(PL) − E(P) − E 0 1(L), where E 0 1(PL) and E 0 1(L) are the energies of the protein-ligand complex and the free ligand in their global minima, respectively, and E(P) is the energy of the protein in its configuration prepared as described in Section 2. The global energies of the complexes and ligands were taken from the respective minima sets (see Section 2.2). $\Delta G_{\mathrm{exp}}$ is the experimental binding energy calculated from the binding constant. “Energy range” is the difference between the highest and the lowest energies among all 16 protein-ligand complexes. “Energy correlation” is the Pearson correlation coefficient between the experimental and calculated binding energies. The AutoDock and SOL scores (in kcal/mol) are given for comparison.
| PDB ID | $\Delta G_{\mathrm{exp}}$ | {1}MMFF94 | {1}MMFF94 + PCM | {1}PM7 | {1}PM7 + COSMO | {2}MMFF94 | {2}MMFF94 + PCM | {2}MMFF94 + SGB | SOL score | AutoDock score |
|---|---|---|---|---|---|---|---|---|---|---|
| 4FT0 | −10.1 | 63.84 | 0.04 | −39.48 | −48.67 | 58.64 | −0.08 | 3.03 | −5.20 | −7.15 |
| 4FT9 | −10.9 | −63.72 | −9.98 | −111.64 | −48.77 | −64.83 | −10.05 | −16.34 | −4.29 | −4.9 |
| 4FSW | −6.8 | −60.20 | −5.89 | −108.48 | −46.41 | −60.19 | −6.53 | −7.72 | −4.78 | −6.08 |
| 4FTA | −9.8 | −30.36 | −5.44 | −126.54 | −58.61 | −30.35 | −15.04 | −14.05 | −4.35 | −4.7 |
| 4FV5 | −10.9 | −102.01 | −7.42 | −168.44 | −54.37 | −102.01 | −4.64 | −11.16 | −6.05 | −8.25 |
| 4FV6 | −12.3 | −96.92 | −7.75 | −164.91 | −72.68 | −89.27 | −13.42 | −13.87 | −5.26 | −5.6 |
| 1DWC | −10.5 | −146.16 | −33.43 | −194.69 | −70.12 | −144.88 | −32.93 | −36.77 | −2.86 | −4.24 |
| 1TOM | −11.8 | −248.29 | −49.66 | −258.10 | −73.67 | −248.28 | −49.89 | −51.59 | −8.11 | −7.88 |
| 1C5Y | −6.0 | −34.83 | −79.15 | −33.81 | −52.84 | −34.83 | −79.34 | −80.85 | −6.83 | −5.28 |
| 1F5L | −7.5 | 17.40 | −52.26 | −29.98 | −81.95 | 17.40 | −52.54 | −52.88 | −4.41 | −6.62 |
| 1O3P | −9.4 | 3.59 | −40.30 | −32.53 | −64.60 | 3.59 | −40.59 | −42.54 | −6.95 | −8.43 |
| 1SQO | −10.6 | −24.78 | −50.42 | −58.75 | −69.56 | −24.78 | −51.30 | −52.22 | −6.75 | −8.68 |
| 1VJ9 | −10.7 | −49.57 | −50.03 | −91.69 | −76.58 | −49.20 | −43.77 | −49.18 | −4.53 | −3.17 |
| 1VJA | −10.9 | −49.87 | −51.91 | −95.43 | −81.40 | −47.97 | −38.86 | −41.85 | −4.47 | −1.82 |
| 2P94 | −13.0 | −68.20 | −15.74 | −153.53 | −68.94 | −67.28 | −15.79 | −20.87 | −6.53 | −13.09 |
| 3CEN | −11.7 | −69.48 | −18.17 | −133.10 | −63.12 | −69.47 | −18.43 | −26.62 | −5.48 | −11.68 |
| Energy range | 7.0 | 312 | 79.2 | 228 | 35.5 | 306.93 | 79.26 | 83.89 | 5.25 | 11.27 |
| Energy correlation | | 0.41 | −0.36 | 0.60 | 0.33 | 0.40 | −0.39 | −0.35 | 0.09 | 0.13 |
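The two summary rows follow directly from the column data. As an illustration, the sketch below recomputes the energy range and the Pearson correlation coefficient for the experimental column and the {1}MMFF94 column, using the 16 values listed in the table. The helper names are hypothetical, and the plain-Python Pearson formula is an ordinary textbook implementation, not necessarily the tool the authors used.

```python
import math

# Experimental binding energies and {1}MMFF94 energies (kcal/mol), in table order.
DG_EXP = [-10.1, -10.9, -6.8, -9.8, -10.9, -12.3, -10.5, -11.8,
          -6.0, -7.5, -9.4, -10.6, -10.7, -10.9, -13.0, -11.7]
MMFF94_SET1 = [63.84, -63.72, -60.20, -30.36, -102.01, -96.92, -146.16, -248.29,
               -34.83, 17.40, 3.59, -24.78, -49.57, -49.87, -68.20, -69.48]


def energy_range(values):
    """Difference between the highest and the lowest energy in a column."""
    return max(values) - min(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    print("range (exp):", round(energy_range(DG_EXP), 2))
    print("range ({1}MMFF94):", round(energy_range(MMFF94_SET1), 2))
    print("Pearson r:", round(pearson(DG_EXP, MMFF94_SET1), 2))
```

Comparing the two ranges makes the scale mismatch explicit: the in vacuo force-field energies span hundreds of kcal/mol while the experimental values span only a few, so the correlation coefficient is the more informative of the two summary statistics.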