Mai Fayiz Al-Tawil, Safa Daoud, Ma'mon M Hatmal, Mutasem Omar Taha.
Abstract
Cdc2-like kinase 4 (CLK4) inhibitors are of potential therapeutic value in many diseases, particularly cancer. In this study, we combined extensive ligand-based pharmacophore exploration, ligand-receptor contact fingerprints generated by flexible docking, physicochemical descriptors and machine learning-quantitative structure-activity relationship (ML-QSAR) analysis to investigate the pharmacophoric/binding requirements for potent CLK4 antagonists. Several ML methods were attempted to relate these properties to anti-CLK4 bioactivities, including multiple linear regression (MLR), random forests (RF), extreme gradient boosting (XGBoost), probabilistic neural network (PNN), and support vector regression (SVR). A genetic function algorithm (GFA) was combined with each method for feature selection. Eventually, GFA-SVR was found to produce the best self-consistent and predictive model. The model selected three pharmacophores, three ligand-receptor contacts and two physicochemical descriptors. The GFA-SVR model and associated pharmacophore models were used to screen the National Cancer Institute (NCI) structural database for novel CLK4 antagonists. Three potent hits were identified, with the best one showing an anti-CLK4 IC50 value of 57 nM. This journal is © The Royal Society of Chemistry.
Year: 2022 PMID: 35424985 PMCID: PMC8982525 DOI: 10.1039/d2ra00136e
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 3.361
Fig. 1 Potent CLK4 inhibitors currently investigated as potential clinical candidates.
Fig. 2 The workflow implemented in the current project.
Table 1 Correlation coefficient values of the best ML-QSAR regression models
| ML method | Selected model descriptors | r² (resubstitution) | r² (leave-20%-out) | r² (leave-one-out) | r² (external test set) |
|---|---|---|---|---|---|
| GFA-SVR | Hypo(5-R2-08), Hypo(6-R2-07), Hypo(8-R3-08), LEU244HNLD, VAL324HBLD, ASP325HALD, CHI_2, Num_Rings6 | 0.91 | 0.65 | 0.66 | 0.76 |
| GFA-RF | Hypo(5-R2-08), Hypo(6-R2-03), Hypo(8-R3-08), Hypo(2-R5-05), VAL324HBLD, ASP325HALD, CHI_2 | 0.94 | 0.63 | 0.57 | 0.77 |
| GFA-PNN | Hypo(3-R6-08), Hypo(1-R6-02), LEU210HD12CD, Num_Rings5, Kappa_3 | 0.96 | 0.07 | 0.01 | 0.71 |
| GFA-XGBoost | Hypo(5-R2-07), Hypo(6-R2-08), Hypo(8-R2-04), Hypo(2-R5-05), Kappa_3, Dipole_Y | 0.96 | 0.47 | 0.46 | 0.75 |
| GFA-MLR | log(1/IC50) = 0.12 Hypo(5-R6-08) + 0.129 Hypo(1-R2-08) − 0.276 LYS191HZ2CD − 0.22 VAL324HBLD + 0.433 Num_Rings5 − 0.002 PMI_x − 2.65 Shadow_XYfrac − 1.667 | 0.65 | 0.46 | 0.50 | 0.53 |
Pharmacophore names follow the pattern Hypo(A-RB-C): the Cth pharmacophore model generated using training subset A (Table S2 under ESI) with the Bth HYPOGEN run settings (Table S3 under ESI). For example, Hypo(5-R2-08) is the 8th pharmacophore model generated using training subset 5 with the 2nd HYPOGEN run settings, and Hypo(2-R5-05) is the 5th pharmacophore model generated using training subset 2 with the 5th HYPOGEN run settings. The same reading applies to Hypo(6-R2-07), Hypo(8-R3-08), Hypo(6-R2-03), Hypo(3-R6-08), Hypo(1-R6-02), Hypo(5-R2-07), Hypo(6-R2-08), Hypo(8-R2-04), Hypo(5-R6-08) and Hypo(1-R2-08). Table 2 shows the X, Y, Z coordinates of pharmacophores Hypo(5-R2-08), Hypo(6-R2-07), and Hypo(8-R3-08).
LEU244HNLD is the hydrogen atom attached to the peptidic N of Leu244, selected by the LibDock scoring function; VAL324HBLD is the hydrogen atom attached to the beta carbon of Val324, selected by the LibDock scoring function; ASP325HALD is the hydrogen atom attached to the alpha carbon of Asp325, selected by the LibDock scoring function. Fig. 3 shows the position of these three atoms within the binding pocket. LEU210HD12CD is the hydrogen atom attached to the delta carbon of Leu210, selected by the CDocker interaction energy scoring function; LYS191HZ2CD is one of the hydrogen atoms of the terminal amine on the side chain of Lys191, selected by the CDocker interaction energy scoring function.
Num_Rings6: number of 6-membered rings. CHI_2: second-order connectivity index, positively correlated with molecular size. Num_Rings5: number of 5-membered rings. Kappa_3: third-order kappa shape index, related to molecular flexibility. Dipole_Y: the Y-vector component of the 3D molecular dipole moment in debyes, as estimated from the partial atomic charges (calculated by the Gasteiger method) and atomic coordinates. PMI_x: principal moment of inertia in the X-dimension. Shadow_XYfrac: fractional area of the molecular shadow in the XY plane.[54,55]
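As an illustration of how the GFA-MLR equation in the table above is applied, the following minimal sketch evaluates it for one compound. The coefficients are copied from the GFA-MLR row; the function name and the example descriptor values are hypothetical and are not taken from the paper, and the sketch assumes each descriptor has already been computed as a plain number.

```python
# Coefficients copied from the GFA-MLR row of the table above.
GFA_MLR_COEFFICIENTS = {
    "Hypo(5-R6-08)": 0.12,
    "Hypo(1-R2-08)": 0.129,
    "LYS191HZ2CD": -0.276,
    "VAL324HBLD": -0.22,
    "Num_Rings5": 0.433,
    "PMI_x": -0.002,
    "Shadow_XYfrac": -2.65,
}
GFA_MLR_INTERCEPT = -1.667


def predict_log_inverse_ic50(descriptors):
    """Evaluate the linear equation; a missing descriptor raises KeyError."""
    total = GFA_MLR_INTERCEPT
    for name, coefficient in GFA_MLR_COEFFICIENTS.items():
        total += coefficient * descriptors[name]
    return total


# Hypothetical descriptor values, for illustration only (not from the paper).
example_compound = {
    "Hypo(5-R6-08)": 5.0,
    "Hypo(1-R2-08)": 4.0,
    "LYS191HZ2CD": 1.0,
    "VAL324HBLD": 0.0,
    "Num_Rings5": 2.0,
    "PMI_x": 100.0,
    "Shadow_XYfrac": 0.6,
}

print(round(predict_log_inverse_ic50(example_compound), 3))
```

Because the model is linear, each coefficient can be read directly as the change in log(1/IC50) per unit change of its descriptor, which is not possible for the SVR, RF, PNN or XGBoost models.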
Resubstitution correlation coefficient: the model is trained on the training set and used to predict the bioactivities of the same training set.
Leave-20%-out correlation coefficient.
Leave-one-out correlation coefficient.
Predictive correlation coefficient on the external testing set.
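The difference between the resubstitution, leave-20%-out and leave-one-out coefficients defined above can be sketched with a toy one-descriptor linear model. This is a simplified illustration, not the authors' protocol: the data are hypothetical, the fold assignment is a simple interleaving, and r² is assumed here to be the squared Pearson correlation between observed and predicted values, which may differ from the definition used in the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def resubstitution_r2(xs, ys):
    """Train on all compounds and predict the same compounds."""
    slope, intercept = fit_line(xs, ys)
    return squared_correlation(ys, [slope * x + intercept for x in xs])


def cross_validated_r2(xs, ys, n_folds):
    """Predict every compound from a model that never saw it."""
    predicted = [0.0] * len(xs)
    for fold in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in range(len(xs)):
            if i % n_folds == fold:
                predicted[i] = slope * xs[i] + intercept
    return squared_correlation(ys, predicted)


# Hypothetical descriptor values and bioactivities (not from the paper).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
activity = [5.1, 5.4, 5.2, 5.9, 6.1, 6.0, 6.6, 6.5, 7.0, 7.3]

print("resubstitution:", resubstitution_r2(descriptor, activity))
print("leave-20%-out: ", cross_validated_r2(descriptor, activity, n_folds=5))
print("leave-one-out: ", cross_validated_r2(descriptor, activity, n_folds=len(descriptor)))
```

A large gap between the resubstitution value and the cross-validated values, as seen for GFA-PNN in the table (0.96 versus 0.07 and 0.01), is the usual sign of overfitting.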
Table 2 X, Y, Z coordinates, weights and tolerances of binding features of pharmacophore models selected by implemented ML methods
| Pharmacophore | Chemical feature | Weight | Tolerance | X | Y | Z |
|---|---|---|---|---|---|---|
| Hypo(5-R2-08) | HBA, point 1 | 2.26 | 1.60 | 5.60 | −0.30 | −0.002 |
| | HBA, point 2 | | 2.20 | 7.80 | 1.78 | −0.04 |
| | Hbic | 2.26 | 1.60 | 2.93 | −1.08 | 4.00 |
| | Hbic | 2.26 | 1.60 | −0.72 | −0.82 | 6.46 |
| | RingArom, point 1 | 2.26 | 1.60 | −3.62 | −0.86 | 0.39 |
| | RingArom, point 2 | | 1.60 | −3.80 | 1.75 | 1.86 |
| Hypo(6-R2-07) | HBA, point 1 | 1.97 | 1.60 | −1.37 | −1.58 | −1.25 |
| | HBA, point 2 | | 2.20 | −0.27 | −2.36 | −3.93 |
| | HBD, point 1 | 1.97 | 1.60 | −2.84 | −1.35 | −2.47 |
| | HBD, point 2 | | 2.20 | −4.35 | −3.68 | −3.62 |
| | Hbic | 1.97 | 1.60 | −0.56 | 0.70 | −4.49 |
| | RingArom, point 1 | 1.97 | 1.60 | 2.23 | −1.44 | 0.53 |
| | RingArom, point 2 | | 1.60 | 2.82 | 1.38 | −0.31 |
| Hypo(8-R3-08) | HBA, point 1 | 2.18 | 1.60 | 4.51 | −2.27 | 0.06 |
| | HBA, point 2 | | 2.20 | 3.65 | −5.16 | −0.07 |
| | HBD, point 1 | 2.18 | 1.60 | −2.13 | −1.90 | 2.84 |
| | HBD, point 2 | | 2.20 | 0.73 | −2.75 | 3.13 |
| | Hbic | 2.18 | 1.60 | −3.08 | −3.82 | 6.66 |
| | RingArom, point 1 | 2.18 | 1.60 | −1.42 | 0.01 | −0.01 |
| | RingArom, point 2 | | 1.60 | −1.42 | 2.60 | 1.50 |
This pharmacophore includes 3 exclusion spheres of 1.2 Å diameter at the following X, Y, Z coordinates: (−1.73, −0.05, 9.32), (4.58, 0.33, 2.75), and (3.06, 3.54, −2.21). Exclusion spheres represent regions forbidden for occupancy by ligand groups.
This pharmacophore includes 4 exclusion spheres of 1.2 Å diameter at the following X, Y, Z coordinates: (−1.81, 2.02, −7.37), (2.94, 1.12, 2.95), (−3.64, −0.13, 4.64), and (3.49, −5.94, 0.08).
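To make the roles of tolerances and exclusion spheres concrete, the sketch below tests whether a ligand point falls inside a feature sphere or an exclusion sphere. It is a geometric illustration only, not the mapping algorithm of the modelling software: it assumes that a tolerance is the sphere radius in Å, that all coordinates share one reference frame, and it uses hypothetical ligand points. The feature centre is the first coordinate column listed for Hypo(5-R2-08) (X = 5.60, Y = −0.30, Z = −0.002, tolerance 1.60), and the exclusion centre is the first one listed above.

```python
import math

# First point of Hypo(5-R2-08): centre and tolerance taken from the table.
FEATURE_CENTER = (5.60, -0.30, -0.002)
FEATURE_TOLERANCE = 1.60  # assumed to be a radius in angstroms

# First exclusion sphere listed above; 1.2 angstrom diameter -> 0.6 radius.
EXCLUSION_CENTER = (-1.73, -0.05, 9.32)
EXCLUSION_RADIUS = 1.2 / 2.0


def within_sphere(point, center, radius):
    """True when the point lies inside or on the sphere."""
    return math.dist(point, center) <= radius


def maps_feature(ligand_point):
    """True when the ligand group sits inside the feature's tolerance sphere."""
    return within_sphere(ligand_point, FEATURE_CENTER, FEATURE_TOLERANCE)


def violates_exclusion(ligand_atoms):
    """True when any ligand atom occupies the forbidden region."""
    return any(
        within_sphere(atom, EXCLUSION_CENTER, EXCLUSION_RADIUS)
        for atom in ligand_atoms
    )


# Hypothetical ligand coordinates, for illustration only.
print(maps_feature((5.0, 0.0, 0.0)))            # close to the feature centre
print(maps_feature((8.0, 0.0, 0.0)))            # outside the tolerance sphere
print(violates_exclusion([(-1.7, 0.0, 9.3)]))   # atom inside the excluded region
```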
Fig. 3 Comparison between pharmacophore models in the best GFA-SVR QSAR model and binding interactions observed within the CLK4 crystallographic complex 6FYV. (A)–(C) Pharmacophore models Hypo(5-R2-08), Hypo(8-R3-08) and Hypo(6-R2-07), respectively, and how they map the crystallographic bound ligand. Hydrogen bond donor features (HBD) are shown as pink vectored spheres, hydrogen bond acceptor features (HBA) are shown as green vectored spheres, aromatic ring features (RingArom) are shown as orange vectored spheres, hydrophobic features (Hbic) are shown as blue spheres, and exclusion areas (spheres) are shown as grey spheres.
Receiver operating characteristic (ROC) information of ligand-based pharmacophores
| Pharmacophore | AUC% | ACC% | TNR% | TPR% | GH score |
|---|---|---|---|---|---|
| Hypo(5-R2-08) | 56% | 54% | 47% | 66% | 0.43 |
| Hypo(6-R2-07) | 57% | 57% | 49% | 64% | 0.39 |
| Hypo(8-R3-08) | 53% | 50% | 44% | 62% | 0.40 |
Area under the curve.
Overall accuracy.
Overall true negative rate (also known as specificity).
Overall true positive rate (also known as sensitivity).
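The accuracy, specificity and sensitivity defined above, together with a GH score, can be computed from a confusion matrix as in the following sketch. The counts are hypothetical. The GH expression is the commonly cited Güner-Henry form, GH = [Ha(3A + Ht) / (4·Ht·A)] × [1 − (Ht − Ha)/(D − A)], and it is an assumption that the paper used exactly this variant. AUC is not computed here because it requires ranked scores rather than counts.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return ACC, TNR, TPR and GH score from confusion-matrix counts."""
    actives = tp + fn            # A: actives in the database
    decoys = tn + fp             # D - A: inactives in the database
    database = actives + decoys  # D: all compounds screened
    hits = tp + fp               # Ht: compounds retrieved by the pharmacophore
    true_hits = tp               # Ha: actives among the retrieved compounds

    tpr = tp / actives
    tnr = tn / decoys
    acc = (tp + tn) / database
    gh = (true_hits * (3 * actives + hits)) / (4 * hits * actives)
    gh *= 1 - (hits - true_hits) / (database - actives)
    return {"ACC": acc, "TNR": tnr, "TPR": tpr, "GH": gh}


# Hypothetical counts, for illustration only (not from the paper).
metrics = screening_metrics(tp=60, fp=100, tn=900, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```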
Fig. 4 Significant ligand–receptor contacts selected by the GFA-SVR model (Table 1). Significant contacts are shown as spheres. The image represents the crystallographic structure of silmitasertib bound to CLK4 (PDB code: 6FYV).
Fig. 5 Dose–response curves of hits (A) 96, (B) 107 and (C) 109. The figures also show the corresponding IC50 values, Hill slopes and correlation r² values.
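For readers unfamiliar with the quantities reported in such dose–response plots, the sketch below evaluates a standard four-parameter Hill (logistic) inhibition curve. The exact curve parameterization used for the figure is not stated here, so this form is an assumption, as are the Hill slope of 1 and the 0 to 100% plateaus. The 57 nM value is the best IC50 reported in the abstract.

```python
import math


def percent_activity(concentration_nm, ic50_nm, hill_slope=1.0, top=100.0, bottom=0.0):
    """Remaining enzyme activity (%) under a four-parameter Hill model."""
    ratio = (concentration_nm / ic50_nm) ** hill_slope
    return bottom + (top - bottom) / (1.0 + ratio)


def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


BEST_HIT_IC50_NM = 57.0  # from the abstract

# Activity falls to the midpoint at the IC50 and keeps dropping above it.
for concentration in (5.7, 57.0, 570.0):
    print(concentration, round(percent_activity(concentration, BEST_HIT_IC50_NM), 1))

print("pIC50:", round(pic50_from_nm(BEST_HIT_IC50_NM), 2))
```

A Hill slope above 1 steepens the transition around the IC50 and a slope below 1 flattens it, which is why the slope is reported alongside each IC50.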
Fig. 6 The chemical structures of captured hits (left column) and their mappings against pharmacophores Hypo(5-R2-08), Hypo(8-R3-08) and Hypo(6-R2-07) (respectively, left to right).
Fig. 7 Docked poses of active hits 96, 107 and 109 as generated by successful flexible docking settings.
Fig. 8 Principal component analysis showing the relative distribution of captured hits (92–115, Table S6;† red spheres) compared to modeled compounds (1–91, Table S1;† blue spheres). The top three principal components calculated for modeled compounds and captured hits are based on eight descriptors (i.e., log P, molecular weight, hydrogen bond donors and acceptors, rotatable bonds, number of rings, number of aromatic rings, fractional polar surface area). The active hits 96, 107 and 109 are indicated in the figure with arrows.
Fig. 9 Principal component analysis showing the relative distribution of captured hits (red spheres), including active hits (yellow spheres), among all 3643 reported CLK4 inhibitors extracted from the ChEMBL database (blue spheres). The top three principal components were calculated based on eight descriptors (i.e., log P, molecular weight, hydrogen bond donors and acceptors, rotatable bonds, number of rings, number of aromatic rings, fractional polar surface area).
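The kind of three-component projection described in these captions can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the software or settings used by the authors: descriptors are standardized to unit variance, components are obtained by power iteration with deflation on the correlation matrix, and the six compounds and their descriptor values are hypothetical.

```python
import math
import random

# Column order: log P, molecular weight, H-bond donors, H-bond acceptors,
# rotatable bonds, rings, aromatic rings, fractional polar surface area.
# Hypothetical values, for illustration only (not from the paper).
COMPOUNDS = [
    [2.1, 310.4, 2, 5, 4, 3, 2, 0.28],
    [3.4, 352.8, 1, 4, 3, 4, 3, 0.19],
    [1.2, 289.3, 3, 6, 5, 2, 1, 0.35],
    [4.0, 401.5, 1, 5, 6, 4, 3, 0.17],
    [2.8, 330.2, 2, 4, 2, 3, 3, 0.24],
    [3.1, 365.9, 0, 3, 4, 5, 4, 0.12],
]


def standardize(rows):
    """Centre each column and scale it to unit sample variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in rows) / (n - 1))
        for j in range(m)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in rows]


def correlation_matrix(z_rows):
    """Covariance of standardized data, i.e. the correlation matrix."""
    n, m = len(z_rows), len(z_rows[0])
    return [
        [sum(row[i] * row[j] for row in z_rows) / (n - 1) for j in range(m)]
        for i in range(m)
    ]


def top_components(matrix, k, iterations=500, seed=0):
    """Leading k (eigenvalue, unit eigenvector) pairs of a symmetric matrix."""
    rng = random.Random(seed)
    m = len(matrix)
    work = [row[:] for row in matrix]
    components = []
    for _component in range(k):
        vec = [rng.uniform(0.5, 1.5) for _ in range(m)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
        for _step in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(v * v for v in nxt))
            if norm < 1e-12:  # nothing left to extract
                break
            vec = [v / norm for v in nxt]
        value = sum(
            vec[i] * sum(work[i][j] * vec[j] for j in range(m)) for i in range(m)
        )
        components.append((value, vec))
        for i in range(m):  # deflation: remove the component just found
            for j in range(m):
                work[i][j] -= value * vec[i] * vec[j]
    return components


z_scores = standardize(COMPOUNDS)
components = top_components(correlation_matrix(z_scores), k=3)
# Each compound becomes one point in the three-component space.
projected = [
    [sum(z * c for z, c in zip(row, vec)) for _value, vec in components]
    for row in z_scores
]
for point in projected:
    print([round(coordinate, 2) for coordinate in point])
```

Plotting modeled compounds and hits in the same projected space, as in the figures, shows whether the hits fall inside or outside the chemical space covered by the known inhibitors.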