Madhu Sudhana Saddala, Anton Lennikov, Hu Huang.
Abstract
Glucose-6-phosphate dehydrogenase (G6PD) is a ubiquitous cytoplasmic enzyme that converts glucose-6-phosphate into 6-phosphogluconolactone in the pentose phosphate pathway (PPP). G6PD deficiency impairs the regeneration of glutathione because of a lack of reduced nicotinamide adenine dinucleotide phosphate (NADPH), producing stress conditions that can cause oxidative injury to photoreceptors, retinal cells, and blood barrier function. In this study, we constructed pharmacophore models based on the complex of G6PD with compound AG1 (a G6PD activator) and used them for virtual screening. Fifty-three hit molecules mapped onto the core pharmacophore features. We performed molecular descriptor calculation, clustering, and principal component analysis (PCA) on the pharmacophore hit molecules and then applied statistical machine learning methods. The combination of pharmacophore modeling and machine learning classified the 53 hits as drug-like (18) and nondrug-like (35) compounds. The drug-like compounds were further evaluated with our established cheminformatics pipeline (molecular docking and in silico ADMET (absorption, distribution, metabolism, excretion, and toxicity) analysis). Finally, five lead molecules with different scaffolds were selected on the basis of binding energies and in silico ADMET properties. This study proposes that combining machine learning methods with traditional structure-based virtual screening can effectively strengthen the ability to find potential G6PD activators for G6PD deficiency diseases. Moreover, these compounds can be considered safe agents for further validation studies at the cell level, in animal models, and even in clinical settings.
Keywords: ADMET; G6PD; docking; machine learning; pharmacophore modeling
Year: 2020 PMID: 32102234 PMCID: PMC7073180 DOI: 10.3390/ijms21041523
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Depiction of the target enzyme G6PD and its active sites. (A) The G6PD dimer represented as a cartoon model; (B) the G6PD dimer represented as a surface model. Monomer 1 (cyan) and monomer 2 (yellow) associate to form the dimer; the dimer interface active site is shown in pink, and dark red spheres designate the NADP+ binding sites. The dimer interface is designated with red.
Figure 2. Pharmacophore model of the G6PD-AG1 complex. (A) The pharmacophore model contains four features: two hydrogen bond donors (green), one positive ionizable group, and one aromatic ring. (B) In the G6PD-AG1 complex, AG1 interacts with the functional residues HIS513, ASP421, and ARG427. The pharmacophore features were applied to the PubChem database, and the 53 hit molecules fit these features.
Figure 3. Molecular descriptors and clustering of the 53 hit molecules. (A) The molecular descriptors represented as a heatmap, with positive values in red and negative values in blue. (B) Hierarchical clustering of the molecular descriptors, which yielded seven clusters. Cluster 1 has four compounds, Cluster 2 and Cluster 3 have ten compounds each, Cluster 4 has fourteen compounds, Cluster 5 has four compounds, Cluster 6 has five compounds, and Cluster 7 has seven compounds.
Figure 4. Principal component analysis (PCA) of the 53 pharmacophore hit molecules. (A) The plot of the first component (PCA1) against the second component (PCA2) showed various groups of compounds based on the Tanimoto coefficient (distance). (B) The logarithm of the calculated partition coefficient (logP) against the polar surface area (PSA) showed that the compounds have a maximum logP of 5.8 and a maximum PSA of 66. (C) The molecular weight (MW) against the PSA showed that the compounds have a maximum PSA of 66 and a maximum MW of 400. (D) The MW against the logP showed that the compounds have a maximum logP of 5.8 and a maximum MW of 400.
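The Tanimoto coefficient mentioned in the Figure 4 caption can be illustrated with a minimal sketch. It assumes each molecule is represented by the set of "on" bit positions of a binary fingerprint; the record does not state which fingerprint was used, and the two example fingerprints below are hypothetical.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not bits_a and not bits_b:
        return 1.0  # convention: two empty fingerprints are treated as identical
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def tanimoto_distance(bits_a: set, bits_b: set) -> float:
    """Distance form (1 - similarity), usable for clustering or projection."""
    return 1.0 - tanimoto(bits_a, bits_b)


# Hypothetical fingerprints, for illustration only (not taken from the study).
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9, 12}

similarity = tanimoto(fp_a, fp_b)        # 3 shared bits / 6 bits in the union
distance = tanimoto_distance(fp_a, fp_b)
print(similarity, distance)
```

A pairwise matrix of such distances over all hit molecules is the kind of input from which a two-component projection like the one in panel (A) could be computed.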
Figure 5. Statistical machine learning predictions classified the 53 pharmacophore hit candidate molecules as drug-like (18) and nondrug-like (35) compounds.
Figure 6. Molecular docking analysis. All 18 drug-like molecules were docked into the G6PD dimer interface active site (top). The top five compounds by binding energy were aligned and superimposed (bottom).
Molecular docking scores (kcal/mol) and functional residues of the active molecules in the binding site of the protein G6PD.
| PubChem IDs | SMILES Notation | Binding Energies (ΔG, kcal/mol) | Functional Amino Acids |
|---|---|---|---|
| CID6917760 | C1CN(CC=C1C2=CC=CC=C2)CCCCC3=CNC4=CC=CC=C43 | −8.9 | ILE220, PHE221, ASN229, ASN388, ILE224, PHE373, TYR401, VAL400, THR402, ASP421, LEU420, LEU422 |
| CID9820229 | C1CN(CCC1C2=CNC3=CC=CC=C32)CCCN4CCC5=CC=CC=C54 | −7.6 | LEU214, PHE221, LEU420, THR402, ILE220, PHE373, ASN388, ILE224, LEU422 |
| CID5221957 | C1CN(CCN1CCCC2=CNC3=CC=CC=C32)CCCC4=CNC5=CC=CC=C54 | −7.3 | ASP421, LEU422, TYR401, LEU420, VAL400, THR402, PHE373, ILE224, ASN388, HIS374, ASP375, ASN229, VAL376, PHE221, ILE220 |
| CID389556 | C1=CC2=C(C=CN2)C=C1CC3=CC4=C(C=C3)NC=C4 | −7.2 | LEU420, VAL400, ASP421, TYR401, THR402, LEU422, PHE373, ASN388, ILE224, PHE221, ILE220 |
| CID10900930 | C1=CC=C2C(=C1)C=C(N2)CC3=CNC4=CC=CC=C43 | −7.0 | PHE373, THR402, LEU422, TYR401, ASP421, LEU420, VAL400, PHE221, ILE220, ASN388 |
| AG1 (CID6615809) | C1=CC=C2C(=C1)C(=CN2)CCNCCSSCCNCCC3=CNC4=CC=CC=C43 | −6.1 | LEU420, THR423, ASN426, ASP421, ARG427, LEU422 |
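To give a feel for what the binding energy differences mean, the sketch below converts each docking score into a rough dissociation constant using the standard relation ΔG = RT ln Kd. The compound-to-score pairing follows the Figure 7 caption. The temperature (298.15 K) is an assumption, and docking scores are approximate estimates rather than measured free energies, so the resulting values are order-of-magnitude illustrations only and are not results reported by the study.

```python
import math

R_KCAL = 1.987e-3      # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15   # assumed temperature in kelvin (not stated in the source)


def estimated_kd(delta_g: float, temperature: float = TEMPERATURE) -> float:
    """Rough dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Docking scores (kcal/mol), paired with compounds as in the Figure 7 caption.
docking_scores = {
    "CID6917760": -8.9,
    "CID9820229": -7.6,
    "CID5221957": -7.3,
    "CID389556": -7.2,
    "CID10900930": -7.0,
    "AG1 (CID6615809)": -6.1,
}

reference_kd = estimated_kd(docking_scores["AG1 (CID6615809)"])
for name, delta_g in docking_scores.items():
    kd = estimated_kd(delta_g)
    fold = reference_kd / kd  # how much tighter than AG1 the score implies
    print(f"{name}: dG = {delta_g:.1f} kcal/mol, Kd ~ {kd:.2e} M, {fold:.1f}x vs AG1")
```

Because the relation is exponential, each additional 1.4 kcal/mol of binding energy corresponds to roughly a tenfold lower estimated Kd at room temperature.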
Figure 7. Protein–ligand interaction analysis of the best five compounds and AG1. Each panel shows a compound interacting with functional residues of the G6PD active site (dimer interface domain), with its binding energy in parentheses: (A) CID6917760 (−8.9 kcal/mol); (B) CID9820229 (−7.6 kcal/mol); (C) CID5221957 (−7.3 kcal/mol); (D) CID389556 (−7.2 kcal/mol); (E) CID10900930 (−7.0 kcal/mol); (F) AG1 (CID6615809) (−6.1 kcal/mol). The binding site functional residues are represented as a stick model in rainbow colors, the top five compounds as a stick model in magenta, and the G6PD protein as a cartoon model in white.
Physicochemical properties, lipophilicity, water-solubility, pharmacokinetics, drug-likeness, and medicinal chemistry properties of selected ligands determined by the SwissADME server.
| Descriptors | CID6917760 | CID9820229 | CID5221957 | CID389556 | CID10900930 | AG1 |
|---|---|---|---|---|---|---|
| Physicochemical properties | | | | | | |
| Formula | C23H26N2 | C24H29N3 | C26H32N4 | C17H14N2 | C17H14N2 | C24H30N4S2 |
| Molecular weight | 330.47 g/mol | 359.51 g/mol | 400.56 g/mol | 246.31 g/mol | 246.31 g/mol | 438.65 g/mol |
| Num. heavy atoms | 25 | 27 | 30 | 19 | 19 | 30 |
| Num. arom. heavy atoms | 15 | 15 | 18 | 18 | 18 | 18 |
| Fraction Csp3 | 0.03 | 0.42 | 0.38 | 0.06 | 0.06 | 0.33 |
| Num. rotatable bonds | 6 | 5 | 8 | 2 | 2 | 13 |
| Num. H-bond acceptors | 1 | 1 | 2 | 0 | 0 | 2 |
| Num. H-bond donors | 1 | 1 | 2 | 2 | 2 | 4 |
| Molar Refractivity | 111.22 | 121.13 | 134.38 | 79.61 | 79.61 | 134.04 |
| TPSA | 19.03 Å² | 22.27 Å² | 38.06 Å² | 31.58 Å² | 31.58 Å² | 106.24 Å² |
| Lipophilicity | | | | | | |
| Log Po/w (iLOGP) | 3.60 | 3.66 | 3.66 | 2.07 | 2.21 | 3.99 |
| Log Po/w (XLOGP3) | 4.93 | 4.84 | 4.95 | 4.10 | 4.13 | 4.22 |
| Log Po/w (WLOGP) | 4.90 | 4.04 | 4.07 | 4.24 | 4.24 | 5.00 |
| Log Po/w (MLOGP) | 4.16 | 3.83 | 3.24 | 3.00 | 3.00 | 2.83 |
| Log Po/w (SILICOS-IT) | 5.70 | 4.92 | 5.90 | 5.01 | 5.01 | 6.29 |
| Consensus Log Po/w | 4.66 | 4.26 | 4.36 | 3.68 | 3.72 | 4.47 |
| Water solubility | | | | | | |
| Log S (ESOL) | −5.04 | −5.20 | −5.36 | −4.52 | −4.54 | −4.80 |
| Solubility | 2.99 × 10⁻³ mg/mL; 9.06 × 10⁻⁶ mol/L | 2.27 × 10⁻³ mg/mL; 6.32 × 10⁻⁶ mol/L | 1.76 × 10⁻³ mg/mL; 4.39 × 10⁻⁶ mol/L | 7.45 × 10⁻³ mg/mL; 3.03 × 10⁻⁵ mol/L | 7.14 × 10⁻³ mg/mL; 2.90 × 10⁻⁵ mol/L | 6.88 × 10⁻³ mg/mL; 1.57 × 10⁻⁵ mol/L |
| Class | Moderately soluble | Moderately soluble | Moderately soluble | Moderately soluble | Moderately soluble | Moderately soluble |
| Log S (Ali) | −5.07 | −5.04 | −5.49 | −4.47 | −4.50 | −6.16 |
| Solubility | 2.83 × 10⁻³ mg/mL; 8.58 × 10⁻⁶ mol/L | 3.27 × 10⁻³ mg/mL; 9.09 × 10⁻⁶ mol/L | 1.30 × 10⁻³ mg/mL; 3.26 × 10⁻⁶ mol/L | 8.37 × 10⁻³ mg/mL; 3.40 × 10⁻⁵ mol/L | 7.79 × 10⁻³ mg/mL; 3.16 × 10⁻⁵ mol/L | 3.03 × 10⁻⁴ mg/mL; 6.90 × 10⁻⁷ mol/L |
| Class | Moderately soluble | Moderately soluble | Moderately soluble | Moderately soluble | Moderately soluble | Poorly soluble |
| Log S (SILICOS-IT) | −7.87 | −7.35 | −8.82 | −7.10 | −7.10 | −10.13 |
| Solubility | 4.49 × 10⁻⁶ mg/mL; 1.36 × 10⁻⁸ mol/L | 1.61 × 10⁻⁵ mg/mL; 4.48 × 10⁻⁸ mol/L | 6.09 × 10⁻⁷ mg/mL; 1.52 × 10⁻⁹ mol/L | 1.97 × 10⁻⁵ mg/mL; 8.00 × 10⁻⁸ mol/L | 1.97 × 10⁻⁵ mg/mL; 8.00 × 10⁻⁸ mol/L | 3.21 × 10⁻⁸ mg/mL; 7.33 × 10⁻¹¹ mol/L |
| Class | Poorly soluble | Poorly soluble | Poorly soluble | Poorly soluble | Poorly soluble | Insoluble |
| Pharmacokinetics | | | | | | |
| GI absorption | High | High | High | High | High | High |
| BBB permeant | Yes | Yes | Yes | Yes | Yes | No |
| P-gp substrate | Yes | Yes | Yes | Yes | Yes | Yes |
| CYP1A2 inhibitor | Yes | Yes | Yes | Yes | Yes | Yes |
| CYP2C19 inhibitor | Yes | No | No | Yes | Yes | Yes |
| CYP2C9 inhibitor | No | No | No | No | No | No |
| CYP2D6 inhibitor | Yes | Yes | Yes | Yes | Yes | Yes |
| CYP3A4 inhibitor | Yes | Yes | Yes | Yes | Yes | Yes |
| Log Kp (skin permeation) | −4.82 cm/s | −5.06 cm/s | −5.23 cm/s | −4.89 cm/s | −4.87 cm/s | −5.98 cm/s |
| Drug-likeness | | | | | | |
| Lipinski | Yes; 1 violation: MLOGP > 4.15 | Yes; 0 violation | Yes; 0 violation | Yes; 0 violation | Yes; 0 violation | Yes; 0 violation |
| Ghose | Yes | Yes | No; 1 violation: MR > 130 | Yes | Yes | No; 1 violation: MR > 130 |
| Veber | Yes | Yes | Yes | Yes | Yes | No; 1 violation: Rotors > 10 |
| Egan | Yes | Yes | Yes | Yes | Yes | Yes |
| Muegge | Yes | Yes | Yes | Yes | Yes | Yes |
| Bioavailability Score | 0.05 | 0.05 | 0.55 | 0.55 | 0.55 | 0.55 |
| Medicinal chemistry | | | | | | |
| PAINS | 0 alert | 0 alert | 0 alert | 0 alert | 0 alert | 0 alert |
| Brenk | 0 alert | 0 alert | 0 alert | 0 alert | 0 alert | 1 alert: disulphide |
| Lead likeness | No; 1 violation: XLOGP3 > 3.5 | No; 2 violations: MW > 350, XLOGP3 > 3.5 | No; 3 violations: MW > 350, Rotors > 7, XLOGP3 > 3.5 | No; 2 violations: MW < 250, XLOGP3 > 3.5 | No; 2 violations: MW < 250, XLOGP3 > 3.5 | No; 3 violations: MW > 350, Rotors > 7, XLOGP3 > 3.5 |
| Synthetic accessibility | 3.24 | 3.31 | 2.81 | 1.70 | 2.20 | 3.17 |
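The Lipinski and Veber rows above can be reproduced from the physicochemical rows with a few threshold checks. The sketch below is an illustration, not the SwissADME implementation: the MLOGP > 4.15 and Rotors > 10 cut-offs appear in the table's own violation labels, while the remaining thresholds (MW ≤ 500, at most 10 acceptors, at most 5 donors, TPSA ≤ 140 Å²) are the commonly cited forms of these rules and are assumed here. Using the table's H-bond acceptor and donor counts in place of N/O and NH/OH counts is also an assumption.

```python
# Values copied from the SwissADME table for three of the six compounds.
compounds = {
    "CID6917760": {"mw": 330.47, "mlogp": 4.16, "hba": 1, "hbd": 1, "rotors": 6, "tpsa": 19.03},
    "CID5221957": {"mw": 400.56, "mlogp": 3.24, "hba": 2, "hbd": 2, "rotors": 8, "tpsa": 38.06},
    "AG1": {"mw": 438.65, "mlogp": 2.83, "hba": 2, "hbd": 4, "rotors": 13, "tpsa": 106.24},
}


def lipinski_violations(p: dict) -> list:
    """Rule-of-five style checks (thresholds partly assumed, see lead-in)."""
    checks = [
        (p["mw"] > 500, "MW > 500"),
        (p["mlogp"] > 4.15, "MLOGP > 4.15"),
        (p["hba"] > 10, "H-bond acceptors > 10"),
        (p["hbd"] > 5, "H-bond donors > 5"),
    ]
    return [label for failed, label in checks if failed]


def veber_violations(p: dict) -> list:
    """Veber-style checks on flexibility and polarity."""
    checks = [
        (p["rotors"] > 10, "Rotors > 10"),
        (p["tpsa"] > 140, "TPSA > 140"),
    ]
    return [label for failed, label in checks if failed]


for name, props in compounds.items():
    print(name, lipinski_violations(props), veber_violations(props))
```

With these inputs the checks flag the same single violations that the table reports for CID6917760 (Lipinski) and AG1 (Veber).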
ADMET properties of the best five compounds and AG1, calculated with the OSIRIS Property Explorer.
| Properties | CID6917760 | CID9820229 | CID5221957 | CID389556 | CID10900930 | AG1 |
|---|---|---|---|---|---|---|
| Mutagenic | No | No | No | Partial | No | No |
| Tumorigenic | No | No | No | Partial | No | No |
| Irritant | No | No | No | No | No | No |
| Reproductive effect | No | No | No | No | No | No |
| cLogP | 4.77 | 4.76 | 4.63 | 3.55 | 3.61 | 4.22 |
| Solubility | −3.99 | −4.41 | −3.84 | −4.38 | −4.40 | −5.18 |
| MW | 330 | 359 | 400 | 246 | 246 | 438 |
| TPSA | 19.03 Å² | 22.27 Å² | 38.06 Å² | 31.58 Å² | 31.58 Å² | 106.2 Å² |
| Drug likeness | 2.81 | 7.9 | 7.97 | 0.11 | 2.77 | 1.76 |
| Drug score | 0.62 | 0.59 | 0.62 | 0.36 | 0.70 | 0.48 |
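The Solubility row above is easier to interpret once converted to concentrations. The sketch below assumes the OSIRIS value is log S, the base-10 logarithm of the aqueous solubility in mol/L; the table itself does not state the unit. It then uses the MW row to express the same quantity in mg/mL. The helper name is hypothetical.

```python
# (log S, molecular weight in g/mol) copied from the OSIRIS table.
osiris = {
    "CID6917760": (-3.99, 330),
    "CID9820229": (-4.41, 359),
    "CID5221957": (-3.84, 400),
    "CID389556": (-4.38, 246),
    "CID10900930": (-4.40, 246),
    "AG1": (-5.18, 438),
}


def log_s_to_solubility(log_s: float, mw: float) -> tuple:
    """Return (mol/L, mg/mL), assuming log_s is log10 of solubility in mol/L."""
    mol_per_l = 10 ** log_s
    mg_per_ml = mol_per_l * mw  # mol/L * g/mol = g/L, which equals mg/mL
    return mol_per_l, mg_per_ml


for name, (log_s, mw) in osiris.items():
    mol_per_l, mg_per_ml = log_s_to_solubility(log_s, mw)
    print(f"{name}: {mol_per_l:.2e} mol/L, {mg_per_ml:.2e} mg/mL")

# The compound with the lowest log S is the least soluble under this reading.
least_soluble = min(osiris, key=lambda name: osiris[name][0])
```

Under this assumption, one log unit corresponds to a tenfold change in molar solubility, so the reference activator AG1 is predicted to be more than ten times less soluble than CID6917760 or CID5221957.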
Figure 8. The ADMET properties of the five best G6PD small-molecule activators (CID6917760, CID9820229, CID5221957, CID389556, and CID10900930) and AG1 (CID6615809). The pink area represents the optimal range for each property (lipophilicity: XLOGP3 between −0.7 and +5.0; size: MW between 150 and 500 g/mol; polarity: TPSA between 20 and 130 Å²; solubility: log S not higher than 6; saturation: fraction of carbons in sp3 hybridization not less than 0.25; and flexibility: no more than 9 rotatable bonds).
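The optimal ranges listed in the Figure 8 caption can be checked against the SwissADME table values with a short sketch. The solubility criterion is read here as log S (ESOL) of at least −6, which is an interpretation of the caption's "not higher than 6" and an assumption; the choice of the ESOL estimate and the upper bound of 1.0 on the sp3 fraction are likewise assumed. Property values are copied from the SwissADME table for three of the six compounds.

```python
# Axis name -> (property key, lower bound, upper bound), from the Figure 8 caption.
OPTIMAL_RANGES = {
    "lipophilicity": ("xlogp3", -0.7, 5.0),
    "size": ("mw", 150.0, 500.0),
    "polarity": ("tpsa", 20.0, 130.0),
    "solubility": ("log_s", -6.0, float("inf")),  # assumed reading: log S >= -6
    "saturation": ("fraction_csp3", 0.25, 1.0),
    "flexibility": ("rotatable_bonds", 0, 9),
}

compounds = {
    "CID6917760": {"xlogp3": 4.93, "mw": 330.47, "tpsa": 19.03, "log_s": -5.04,
                   "fraction_csp3": 0.03, "rotatable_bonds": 6},
    "CID9820229": {"xlogp3": 4.84, "mw": 359.51, "tpsa": 22.27, "log_s": -5.20,
                   "fraction_csp3": 0.42, "rotatable_bonds": 5},
    "AG1": {"xlogp3": 4.22, "mw": 438.65, "tpsa": 106.24, "log_s": -4.80,
            "fraction_csp3": 0.33, "rotatable_bonds": 13},
}


def axes_out_of_range(props: dict) -> list:
    """Return the radar axes on which a compound falls outside the optimal range."""
    return [
        axis
        for axis, (key, low, high) in OPTIMAL_RANGES.items()
        if not low <= props[key] <= high
    ]


for name, props in compounds.items():
    print(name, axes_out_of_range(props) or "all axes within range")
```

With the tabulated values, AG1 falls outside the optimal area only on flexibility (13 rotatable bonds), while CID6917760 is flagged on polarity and saturation because its tabulated TPSA (19.03 Å²) and sp3 fraction (0.03) lie just below the lower bounds.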