Kalina Atkovska, Sergey A Samsonov, Maciej Paszkowski-Rogacz, M Teresa Pisabarro.
Abstract
Molecular docking has been extensively applied in virtual screening of small molecule libraries for lead identification and optimization. A necessary prerequisite for successful differentiation between active and non-active ligands is the accurate prediction of their binding affinities in the complex by use of docking scoring functions. However, many studies have shown rather poor correlations between docking scores and experimental binding affinities. Our work aimed to improve this correlation by implementing a multipose binding concept in the docking scoring scheme. Multipose binding, i.e., the property of certain protein-ligand complexes to exhibit different ligand binding modes, has been shown to occur in nature for a variety of molecules. We conducted a high-throughput docking study and implemented multipose binding in the scoring procedure by considering multiple docking solutions in binding affinity prediction. In general, improvement of the agreement between docking scores and experimental data was observed, and this was most pronounced in complexes with large and flexible ligands and high binding affinities. Further developments of the selection criteria for docking solutions for each individual complex are still necessary for a general utilization of the multipose binding concept for accurate binding affinity prediction by molecular docking.
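The abstract describes combining multiple docking solutions into one predicted binding affinity but this record does not give the formula used. A minimal sketch of one plausible scheme follows, assuming the poses act as independent states whose Boltzmann factors add, E_mp = −RT ln Σ exp(−E_i/RT). The function name `multipose_energy`, the temperature default, and the combination rule itself are illustrative assumptions, not the authors' stated method.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def multipose_energy(pose_energies, temperature=298.15):
    """Combine per-pose binding energies (kcal/mol) into one effective energy.

    Assumption (not taken from the source): poses are independent states,
    so E_mp = -RT * ln(sum_i exp(-E_i / RT)).
    """
    if not pose_energies:
        raise ValueError("at least one pose energy is required")
    rt = R_KCAL * temperature
    # Shift by the lowest energy so the exponentials stay well-scaled.
    e_min = min(pose_energies)
    total = sum(math.exp(-(e - e_min) / rt) for e in pose_energies)
    return e_min - rt * math.log(total)


if __name__ == "__main__":
    single = multipose_energy([-8.0])
    double = multipose_energy([-8.0, -7.5, -6.0])
    print(f"single pose: {single:.2f} kcal/mol, three poses: {double:.2f} kcal/mol")
```

Under this assumption the combined energy is never less favorable than the best single pose, and additional poses matter most when their energies lie within a few RT of the best one.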
Year: 2014 PMID: 24534807 PMCID: PMC3958872 DOI: 10.3390/ijms15022622
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Complexes exhibiting multipose binding. (a,b,c) Trypsin-inhibitor complex. The protein receptor is depicted as a cartoon and an orange molecular surface. Crystal ligands in orientations I and II are shown in magenta and blue sticks, respectively. AutoDock docking solutions are shown in grey sticks; (d,e) HIV protease-inhibitor complex. The crystallographic ligand in orthorhombic orientation is shown in magenta sticks, and the AutoDock solutions analogous to orthorhombic (d) and to hexagonal (e) orientations are shown in grey sticks for comparison; (f,g,h) SH3 domain-polyproline peptide complex. The SH3 domain is depicted as a cartoon, and the crystal ligands in orientations I and II are shown in magenta and blue sticks, respectively. The crystal unit cell is shown in (f). In (g) and (h) a molecular surface is colored by lipophilicity (green: lipophilic, pink: hydrophilic), and the MOE docking solutions are shown in grey sticks (orientation I: (g), orientation II: (h)); and (i,j,k) Annexin A2-heparin complex. The protein is shown as a cartoon, and a molecular surface indicates the electrostatic potential (blue: positive, red: negative). The crystal ligand is shown in magenta sticks, and AutoDock docking solutions in black (poses I, II and III are shown in panels (i), (j) and (k), respectively). The sugar rings in the ligand are labeled for clarity (A, B, C and D).
Number of complexes (percentage of the respective dataset in parentheses) for each dataset/docking program combination for which results were successfully generated and the best pose was assigned a favorable binding energy.
| Dataset | eHiTS | AutoDock |
|---|---|---|
| Refined | 2428 (99%) | 2070 (84%) |
| Core | 214 (99%) | 197 (91%) |
| CSAR | 340 (99%) | 225 (65%) |
Pearson correlation between the predicted binding affinities of the “best pose” and the “top score” docking solutions and the experimental binding affinities for all dataset/docking program combinations.
| Dataset/Program | Best pose | Top score | Fraction with top score = best pose |
|---|---|---|---|
| Refined/eHiTS | 0.47 | 0.54 | 5.8 |
| Core/eHiTS | 0.46 | 0.51 | 8.4 |
| CSAR/eHiTS | 0.58 | 0.61 | 5.3 |
| Refined/AutoDock | 0.07 | 0.11 | 15.7 |
| Core/AutoDock | 0.14 | 0.18 | 15.2 |
| CSAR/AutoDock | 0.10 | 0.09 | 18.7 |
Fraction of complexes for which the top score corresponded to the best pose.
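The two quantities reported in the table above can be sketched as follows. This is a minimal illustration in plain Python; the function names and the idea of identifying poses by an integer id are assumptions made for the example, and the toy numbers are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def fraction_top_is_best(top_pose_ids, best_pose_ids):
    """Fraction of complexes whose top-scored pose is also the best pose.

    Both arguments hold one (hypothetical) pose id per complex.
    """
    if len(top_pose_ids) != len(best_pose_ids) or not top_pose_ids:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(1 for t, b in zip(top_pose_ids, best_pose_ids) if t == b)
    return matches / len(top_pose_ids)


if __name__ == "__main__":
    predicted = [-7.1, -8.4, -6.0, -9.2]      # toy docking scores, kcal/mol
    experimental = [-6.5, -9.0, -5.8, -10.1]  # toy measured affinities
    print(round(pearson_r(predicted, experimental), 3))
    print(fraction_top_is_best([0, 0, 0, 0], [0, 3, 1, 0]))
```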
Statistical analysis of the results obtained for the Refined-eHiTS combination.
Refined/eHiTS (2428 complexes)

| Number of poses | Pearson r | ρ | p-value | ∑(res_sp/mp)², (kcal/mol)² | Count |
|---|---|---|---|---|---|
| | 0.47 | 0.47 | NA | 16,651 | NA |
| | 0.57 | 0.56 | 2.76 × 10⁻¹⁶ | 11,810 | 2,117 |
| | 0.57 | 0.56 | 1.42 × 10⁻¹⁴ | 12,083 | 2,066 |
| | 0.56 | 0.55 | 2.66 × 10⁻¹³ | 12,287 | 2,023 |
| | 0.56 | 0.55 | 1.46 × 10⁻¹² | 12,417 | 1,989 |
| | 0.56 | 0.55 | 6.55 × 10⁻¹² | 12,535 | 1,966 |
| | 0.56 | 0.55 | 1.93 × 10⁻¹¹ | 12,623 | 1,933 |
| | 0.56 | 0.55 | 4.79 × 10⁻¹¹ | 12,698 | 1,910 |
| | 0.56 | 0.55 | 1.07 × 10⁻¹⁰ | 12,766 | 1,878 |
| | 0.55 | 0.54 | 2.22 × 10⁻¹⁰ | 12,829 | 1,858 |
| | 0.53 | 0.52 | 1.06 × 10⁻⁵ | 13,929 | 1,603 |
| | 0.53 | 0.52 | 3.31 × 10⁻⁷ | 13,533 | 1,048 |
| | 0.53 | 0.52 | 4.92 × 10⁻⁷ | 13,589 | 1,016 |
| | 0.53 | 0.53 | 3.57 × 10⁻⁷ | 13,543 | 1,046 |
| | 0.53 | 0.52 | 7.53 × 10⁻⁷ | 13,631 | 1,038 |
| | 0.53 | 0.52 | 2.72 × 10⁻⁷ | 13,531 | 1,090 |
| | 0.53 | 0.52 | 5.23 × 10⁻⁷ | 13,597 | 1,052 |
| | 0.53 | 0.52 | 6.30 × 10⁻⁷ | 13,616 | 1,055 |
| | 0.53 | 0.52 | 6.15 × 10⁻⁷ | 13,606 | 1,036 |
| | 0.53 | 0.52 | 5.70 × 10⁻⁷ | 13,604 | 1,034 |
| | 0.54 | 0.53 | 7.33 × 10⁻¹⁵ | 12,055 | 1,784 |
| | 0.54 | 0.53 | 4.17 × 10⁻¹³ | 12,349 | 1,625 |
| | 0.54 | 0.53 | 7.51 × 10⁻¹² | 12,563 | 1,527 |
| | 0.54 | 0.53 | 4.59 × 10⁻¹¹ | 12,711 | 1,441 |
| | 0.54 | 0.53 | 2.25 × 10⁻¹⁰ | 12,845 | 1,376 |
| | 0.54 | 0.53 | 7.80 × 10⁻¹⁰ | 12,955 | 1,312 |
| | 0.54 | 0.53 | 2.13 × 10⁻⁹ | 13,046 | 1,263 |
| | 0.54 | 0.53 | 5.47 × 10⁻⁹ | 13,134 | 1,214 |
| | 0.53 | 0.52 | 1.23 × 10⁻⁸ | 13,212 | 1,158 |
| | 0.52 | 0.51 | 9.0 × 10⁻⁴ | 14,581 | 720 |
| | 0.63 | 0.62 | NA | 11,115 | NA |
Pearson correlation coefficient between E_exp and E_sp/mp;
Spearman rank-correlation coefficient between E_exp and E_sp/mp;
p-value from the (E_exp − E_sp)² vs. (E_exp − E_mp)² t-test;
∑(E_exp − E_sp/mp)², in (kcal/mol)²;
number of complexes where |E_exp − E_sp| ≥ |E_exp − E_mp|;
a multipose case considering all poses with higher binding affinity than the single-pose, when the single-pose binding affinity is lower than the experimental, and all poses with lower binding affinity than the single-pose, when the single-pose binding affinity is higher than the experimental; and
single-pose case constructed by selecting the poses with a score closest to the experimental affinity.
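The residual-based columns defined in the notes above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, the t-test is taken to be a paired test on the per-complex squared residuals (the source does not say whether it is paired), and only the t statistic is computed, not the p-value.

```python
import math


def sum_squared_residuals(e_exp, e_pred):
    """Sum of (E_exp - E_pred)^2 over all complexes, in (kcal/mol)^2."""
    return sum((x - p) ** 2 for x, p in zip(e_exp, e_pred))


def count_not_worse(e_exp, e_sp, e_mp):
    """Number of complexes where |E_exp - E_sp| >= |E_exp - E_mp|."""
    return sum(
        1 for x, s, m in zip(e_exp, e_sp, e_mp) if abs(x - s) >= abs(x - m)
    )


def paired_t_statistic(e_exp, e_sp, e_mp):
    """t statistic for squared single-pose vs. multipose residuals.

    Assumption: a paired test on d_i = (E_exp - E_sp)^2 - (E_exp - E_mp)^2.
    """
    diffs = [(x - s) ** 2 - (x - m) ** 2 for x, s, m in zip(e_exp, e_sp, e_mp)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complexes")
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        raise ValueError("t statistic is undefined for constant differences")
    return mean_d / math.sqrt(var_d / n)


if __name__ == "__main__":
    e_exp = [-10.0, -8.0, -6.0]  # toy experimental affinities, kcal/mol
    e_sp = [-7.0, -7.0, -6.0]    # toy single-pose predictions
    e_mp = [-9.0, -8.0, -7.0]    # toy multipose predictions
    print(sum_squared_residuals(e_exp, e_sp), sum_squared_residuals(e_exp, e_mp))
    print(count_not_worse(e_exp, e_sp, e_mp))
    print(round(paired_t_statistic(e_exp, e_sp, e_mp), 3))
```

A positive t statistic here means the squared single-pose residuals are larger on average, i.e., the multipose prediction is closer to experiment.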
Figure 2. The difference of absolute single-pose and multipose residuals |E_exp − E_sp| − |E_exp − E_mp| is plotted against the solvent accessible surface area of the ligand. All values above zero denote an improvement of the binding affinity prediction. (a) Two-pose case considering the top score and the best pose from the Refined-eHiTS combination; and (b) Multipose case considering all poses from the Refined-eHiTS combination.
Figure 3. Analysis of the number of ligand atoms (a,b), ligand flexibility (c,d), and experimental binding affinity (e,f) in relation to the effect of multipose binding on binding affinity prediction for the two-pose case considering the top score and the best pose from the Refined-eHiTS combination. (a,c,e) Ratio of “improved” (pink) and “not-improved” (light blue) binding affinities; and (b,d,f) Average improvement of the squared residuals shown in dots and density of the ligand property shown as a line.
Kendall τ and Spearman ρ rank-correlation coefficients between the difference of the squared residuals and the properties of the complex.
| Property | Refined/eHiTS/TOP-BEST ρ | Refined/eHiTS/TOP-BEST τ | Refined/AutoDock/TOP-BEST ρ | Refined/AutoDock/TOP-BEST τ |
|---|---|---|---|---|
| Number of atoms | 0.21 | 0.14 | 0.36 | 0.25 |
| Molecular weight | 0.22 | 0.15 | 0.34 | 0.23 |
| Solvent accessible surface area | 0.23 | 0.15 | 0.37 | 0.25 |
| Molar refractivity | 0.21 | 0.14 | 0.36 | 0.24 |
| Flexible torsions | 0.22 | 0.15 | 0.33 | 0.23 |
| Binding affinity | 0.48 | 0.33 | 0.44 | 0.30 |
| Polar surface area | 0.14 | 0.09 | 0.12 | 0.08 |
| Log | 0.15 | 0.10 | 0.24 | 0.16 |
| Number of rings | 0.02 | 0.02 | 0.21 | 0.15 |
| Charge | 0.00 | 0.00 | 0.11 | 0.08 |
Values shown in italic were accompanied by p-value > 0.01.
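The rank correlations in the table above can be sketched in plain Python as follows. The function names are hypothetical, and the tie handling is an assumption: average ranks for Spearman ρ and the tau-b variant for Kendall τ, since the source does not state which variant was used.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def kendall_tau_b(xs, ys):
    """Kendall tau-b via a direct O(n^2) comparison of all pairs."""
    n = len(xs)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = xs[i] - xs[j], ys[i] - ys[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        raise ValueError("tau is undefined for constant input")
    return (concordant - discordant) / denom


if __name__ == "__main__":
    n_atoms = [12, 25, 31, 48]         # toy ligand property
    improvement = [0.1, 0.9, 0.4, 1.6]  # toy difference of squared residuals
    print(round(spearman_rho(n_atoms, improvement), 3))
    print(round(kendall_tau_b(n_atoms, improvement), 3))
```

The pairwise loop is quadratic in the number of complexes, which is acceptable for a sketch; for thousands of complexes a library routine would be the more practical choice.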