Gregory Sliwoski, Edward W Lowe, Mariusz Butkiewicz, Jens Meiler.
Abstract
Stereochemistry is an important determinant of a molecule's biological activity. Stereoisomers can have different degrees of efficacy or even opposing effects when interacting with a target protein. Because it is an inherently three-dimensional phenomenon, stereochemistry is difficult to represent in 2D-QSAR. A major drawback of most proposed 3D-QSAR descriptors that encode stereochemistry is that they require a heuristic for defining all stereocenters and rank-ordering their substituents. Here we propose a novel 3D-QSAR descriptor termed Enantioselective Molecular ASymmetry (EMAS) that is capable of distinguishing between enantiomers in the absence of such heuristics. The descriptor aims to measure the deviation from an overall symmetric shape of the molecule. A radial-distribution function (RDF) encodes the signed volumes of the tetrahedrons formed by every triplet of atoms and the molecular center. The descriptor can be enriched with atom-centric properties such as partial charge. This descriptor showed good predictive performance when tested with a dataset of thirty-one steroids commonly used to benchmark stereochemistry descriptors (r² = 0.89, q² = 0.78). Additionally, including EMAS improved enrichment to 4.38, versus 3.94 without EMAS, in a simulated virtual high-throughput screening (vHTS) for inhibitors and substrates of cytochrome P450 (PubChem AID 891).
Year: 2012 PMID: 22907158 PMCID: PMC3805266 DOI: 10.3390/molecules17089971
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Calculating the Directional Asymmetry Score (DAS). (A) Scores reflect opposing enantiomorphs based on cross-product direction and geometric center. Enantiomers with two stereocenters, (2R,3R)-2-(chloromethyl)-3-propyloxirane and (2S,3S)-2-(chloromethyl)-3-propyloxirane, are shown. (B) Two triangles are visualized in both enantiomers. These triangles encompass the same triplets of atoms in the two molecules. The four tetrahedrons formed by the atom triplets and the molecular center are visualized. i, j, k and i', j', k' give the order of these atoms in either molecule. The importance of atom ordering is shown by the direction of the cross product (red arrow) and the location of the molecular center (black circle). (C) Volume and score calculations for the four tetrahedrons across both enantiomers. Note the opposite signs and scores between the two enantiomers' tetrahedrons.
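The sign flip described in this caption can be illustrated with a minimal sketch. It assumes the signed volume is the scalar triple product of the triangle edges and the vector to the molecular center, divided by six, and that the molecular center is the unweighted mean of the atom coordinates. The function name `signed_tetra_volume` and the coordinates are hypothetical, and the full DAS formula from the paper's methods is not reproduced here.

```python
import numpy as np

def signed_tetra_volume(a, b, c, center):
    """Signed volume of the tetrahedron spanned by atoms a, b, c and the center."""
    a, b, c, center = (np.asarray(p, dtype=float) for p in (a, b, c, center))
    # The cross product gives the triangle normal; its direction depends on atom order.
    normal = np.cross(b - a, c - a)
    # The sign of the dot product tells which side of the triangle the center lies on.
    return float(np.dot(normal, center - a)) / 6.0

# Hypothetical coordinates (angstroms) for four non-coplanar atoms.
coords = np.array([
    [1.2, 0.0, 0.3],
    [-0.5, 1.1, -0.4],
    [-0.6, -0.9, 0.8],
    [0.1, 0.2, -1.5],
])
center = coords.mean(axis=0)  # assumption: unweighted geometric center

# Mirror image: reflect every atom through the xy-plane.
mirror = coords * np.array([1.0, 1.0, -1.0])
mirror_center = mirror.mean(axis=0)

v = signed_tetra_volume(coords[0], coords[1], coords[2], center)
v_mirror = signed_tetra_volume(mirror[0], mirror[1], mirror[2], mirror_center)
v_swapped = signed_tetra_volume(coords[1], coords[0], coords[2], center)

print(v, v_mirror, v_swapped)
```

In this sketch the same atom triplet yields volumes of equal magnitude and opposite sign in the two mirror images, and swapping two atoms of the triplet also flips the sign, which is why a consistent atom ordering matters.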
Figure 2. Diazepam. (A) The five highest-scoring atom triplets in diazepam. The black circle in all panels represents the molecular center. (B) The five lowest-scoring atom triplets in diazepam. All triplets shown here score 0 and do not contribute to the RDF-like code. (C) The five highest-scoring positive and five highest-scoring negative triplets in diazepam. The high-scoring positive (yellow) and high-scoring negative (orange) triplets are distributed differently across the molecule.
Figure 3. EMAS curves for epothilone B. (A) Plotted EMAS curves for epothilone B (blue) compared with its mirror image (red). The x-axis represents the Directional Asymmetry Score in angstroms, while the y-axis indicates the frequency of these scores across the entire molecule. (B) Atom triplets with a directional asymmetry score of approximately 0.3 angstroms. Note that these triangles generally cover the center of the molecule and are fairly symmetric. (C) Atom triplets with a directional asymmetry score of approximately 1.3 angstroms. Note that these triangles are farther from the center of the molecule and have an asymmetric shape. (D) Atom triplets with a directional asymmetry score of approximately 1.7 angstroms. Note that these atom triplets lie farthest from the center of the molecule and are very asymmetric.
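A curve of this kind can be sketched as a Gaussian-smoothed, signed histogram over all atom triplets. Everything specific below is an assumption made for illustration: the stand-in score is the signed cube root of the tetrahedron volume (chosen only so that it has units of length), atom index order stands in for the paper's atom-ordering rule, and the `smoothing` parameter and coordinates are hypothetical. It is not the published EMAS definition.

```python
from itertools import combinations
import numpy as np

def signed_scores(coords):
    """Stand-in asymmetry score for every atom triplet (assumption, not the paper's DAS)."""
    center = coords.mean(axis=0)  # assumption: unweighted geometric center
    scores = []
    for i, j, k in combinations(range(len(coords)), 3):
        a, b, c = coords[i], coords[j], coords[k]
        volume = np.dot(np.cross(b - a, c - a), center - a) / 6.0
        scores.append(np.cbrt(volume))  # signed cube root keeps the sign of the volume
    return np.array(scores)

def rdf_like_curve(scores, grid, smoothing=100.0):
    """Sum one signed Gaussian per triplet, centered at the magnitude of its score."""
    diffs = grid[:, None] - np.abs(scores)[None, :]
    # np.sign is 0 for zero-scoring triplets, so they do not contribute.
    return (np.sign(scores)[None, :] * np.exp(-smoothing * diffs ** 2)).sum(axis=1)

# Hypothetical coordinates (angstroms) for a five-atom fragment.
coords = np.array([
    [1.2, 0.0, 0.3],
    [-0.5, 1.1, -0.4],
    [-0.6, -0.9, 0.8],
    [0.1, 0.2, -1.5],
    [1.9, 1.0, 1.1],
])
mirror = coords * np.array([1.0, 1.0, -1.0])  # reflection through the xy-plane

grid = np.linspace(0.0, 2.0, 81)
curve = rdf_like_curve(signed_scores(coords), grid)
curve_mirror = rdf_like_curve(signed_scores(mirror), grid)
```

Under these assumptions the mirror image produces the negated curve, which matches the qualitative picture of the blue and red curves in panel (A).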
Experimental and predicted binding affinities for the 31 Cramer's steroids, using the novel stereoselective descriptor to train ANN models. "Spatial" predictions use the novel descriptor without any atom property weighting. "Multiply properties" predictions use the novel descriptor weighted by the product of atom properties. "Sum properties" predictions use the novel descriptor weighted by the sum of atom properties.
| Molecule | Observed CBG affinity (pKa) | Predicted [spatial] | Predicted [multiply properties] | Predicted [sum properties] | Predicted [no stereochemistry] |
|---|---|---|---|---|---|
| aldosterone | −6.28 | −7.47 | −7.31 | −7.25 | −7.22 |
| androstanediol | −5.00 | −5.47 | −5.46 | −5.33 | −5.56 |
| 5-androstenediol | −5.00 | −5.47 | −5.43 | −5.36 | −5.75 |
| 4-androstenedione | −5.76 | −5.64 | −5.60 | −5.79 | −6.36 |
| androsterone | −5.61 | −5.78 | −5.81 | −5.55 | −5.42 |
| corticosterone | −7.88 | −7.30 | −7.37 | −7.32 | −7.34 |
| cortisol | −7.88 | −7.63 | −7.58 | −7.64 | −7.33 |
| cortisone | −6.89 | −7.22 | −6.83 | −7.39 | −7.07 |
| dehydroepiandrosterone | −5.00 | −5.39 | −5.13 | −5.46 | −5.80 |
| 11-deoxycorticosterone | −7.65 | −7.48 | −7.47 | −7.50 | −6.85 |
| 11-deoxycortisol | −7.88 | −7.66 | −7.53 | −7.59 | −7.52 |
| dihydrotestosterone | −5.92 | −5.38 | −5.70 | −5.43 | −5.96 |
| estradiol | −5.00 | −5.40 | −5.36 | −5.32 | −5.21 |
| estriol | −5.00 | −5.25 | −5.26 | −5.43 | −6.10 |
| estrone | −5.00 | −5.30 | −5.21 | −5.54 | −5.42 |
| etiocholanolone | −5.23 | −6.42 | −6.44 | −6.22 | −6.27 |
| pregnenolone | −5.23 | −5.30 | −5.25 | −5.37 | −6.37 |
| 17a-hydroxypregnenolone | −5.00 | −5.20 | −5.28 | −5.29 | −6.65 |
| progesterone | −7.38 | −7.17 | −7.27 | −7.13 | −6.46 |
| 17a-hydroxyprogesterone | −7.74 | −7.42 | −7.39 | −6.97 | −6.70 |
| testosterone | −6.72 | −6.08 | −6.36 | −6.19 | −5.94 |
| prednisolone | −7.51 | −7.61 | −7.36 | −7.65 | −7.03 |
| cortisol acetate | −7.55 | −6.74 | −6.90 | −7.63 | −6.00 |
| 4-pregnene-3,11,20-trione | −6.78 | −6.40 | −6.83 | −6.09 | −6.46 |
| epicorticosterone | −7.20 | −5.98 | −6.00 | −7.03 | −7.15 |
| 19-nortestosterone | −6.14 | −5.58 | −5.86 | −5.54 | −5.45 |
| 16a,17a-dihydroxy-progesterone | −6.25 | −7.25 | −7.04 | −7.46 | −7.36 |
| 16a-methylprogesterone | −7.12 | −6.69 | −6.39 | −6.78 | −6.60 |
| 19-norprogesterone | −6.82 | −6.01 | −6.30 | −7.25 | −6.19 |
| 2a-methylcortisol | −7.69 | −6.62 | −7.22 | −7.68 | −6.57 |
| 2a-methyl-9a-fluorocortisol | −5.80 | −7.56 | −6.97 | −6.22 | −6.74 |
| r² | | 0.78 | 0.86 | 0.89 | 0.65 |
| q² | | 0.60 | 0.74 | 0.78 | 0.42 |
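As a rough illustration of how agreement statistics such as the r² and q² reported for this dataset are commonly computed, the sketch below uses the conventional form 1 − SSE/SST. This is an assumption: the paper's exact definitions are given in its methods section and are not reproduced here. By convention q² applies the same formula to cross-validated predictions. The function name `r_squared` is hypothetical, and the sample values are the first five rows of the table (observed and "spatial" columns), so the result will not match the full-table statistics.

```python
import numpy as np

def r_squared(observed, predicted):
    """1 - SSE/SST; called q^2 when `predicted` comes from cross-validation."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)        # residual sum of squares
    sst = np.sum((observed - observed.mean()) ** 2)  # total sum of squares
    return 1.0 - sse / sst

# First five rows of the table: observed CBG affinity vs. "spatial" prediction.
observed = [-6.28, -5.00, -5.00, -5.76, -5.61]
predicted_spatial = [-7.47, -5.47, -5.47, -5.64, -5.78]

print(r_squared(observed, predicted_spatial))
```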
Comparison of the predictive performance of the novel stereoselective descriptor with other published QSAR methods on the Cramer's steroid set. The calculation of q² can be found in the methods section. The statistical model generation method and the QSAR method employed are indicated for each reference.
| QSAR Method | Model Creation | q² | Reference |
|---|---|---|---|
| | Artificial Neural Network | | |
| | Artificial Neural Network | | |
| | Artificial Neural Network | | |
| Stochastic 3D-chiral linear indices | Multiple Linear Regression | 0.87 | [ |
| Chiral Topological Indices | Stepwise Regression Analysis | 0.85 | [ |
| Chiral Graph Kernels | Support Vector Machine | 0.78 | [ |
| Chirality Correction and Topological Descriptors | K-nearest neighbor | 0.83 | [ |
| Molecular Quantum Similarity Measures | Multilinear Regression | 0.84 | [ |
| Shape and Electrostatic Similarity Matrixes | Non-linear Neural Network | 0.94 | [ |
| Comparative Molecular Moment Analysis | Partial Least Squares (PLS) | 0.83 | [ |
| Comparative Molecular Similarity Indices Analysis | PLS | 0.67 | [ |
| Comparative Molecular Field Analysis | PLS | 0.65 | [ |
| E-state Descriptors | PLS | 0.62 | [ |
| Molecular Electronegativity Distance Vector | Genetic Algorithm PLS | 0.78 | [ |
| Molecular Quantum Similarity Measures | Multilinear Regression and PLS | 0.80 | [ |
Figure 4. ROC and PPV results for the forward feature analysis, comparing the control set of features with the control set combined with EMAS features. (A) ROC curves for AID891 prediction. ANN models trained with the best descriptor set found by forward feature analysis starting from the control features combined with the novel EMAS features (red) show improved performance over ANN models trained with the best descriptor set found starting from the control features alone (blue). (B) PPV curves. Models trained with the best descriptor set of control features combined with EMAS features (red) show improved performance over models trained with the best descriptor set of control features only (blue). Dashed lines of corresponding colors show the average PPV over the FPP region for which the models were optimized (fraction positive predicted of 0.005 to 0.05).
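The averaged PPV shown by the dashed lines can be sketched as follows, assuming that compounds are ranked by model score, that the top-ranked fraction (FPP) is predicted positive, and that PPV is the fraction of true actives within that set. Expressing enrichment as average PPV divided by the overall fraction of actives is also an assumption, as are the function names and the synthetic data; none of this reproduces the AID891 results.

```python
import numpy as np

def ppv_curve(scores, labels):
    """PPV as a function of the fraction of compounds predicted positive (FPP)."""
    order = np.argsort(-np.asarray(scores, dtype=float))  # rank by descending score
    hits = np.asarray(labels, dtype=float)[order]
    n_selected = np.arange(1, len(hits) + 1)
    fpp = n_selected / len(hits)
    ppv = np.cumsum(hits) / n_selected  # true positives among the selected compounds
    return fpp, ppv

def mean_ppv(fpp, ppv, lo=0.005, hi=0.05):
    """Average PPV over the FPP window used for model optimization."""
    mask = (fpp >= lo) & (fpp <= hi)
    return float(ppv[mask].mean())

# Synthetic screening data: about 10% actives, scores loosely correlated with activity.
rng = np.random.default_rng(0)
labels = (rng.random(2000) < 0.1).astype(int)
scores = labels + rng.normal(0.0, 0.8, size=2000)

fpp, ppv = ppv_curve(scores, labels)
avg_ppv = mean_ppv(fpp, ppv)
enrichment = avg_ppv / labels.mean()  # assumption: enrichment relative to the base rate
print(avg_ppv, enrichment)
```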