Naveen Khatri, Viney Lather, A K Madan.
Abstract
BACKGROUND: Purine nucleoside analogs (PNAs) constitute an important group of cytotoxic drugs for the treatment of neoplastic and autoimmune diseases. In the present study, classification models have been developed for the prediction of the anti-HIV activity of purine nucleoside analogs.
Keywords: Anti-HIV activity; Balaban-type index from Z-weighted distance matrix; Moving average analysis; Purine nucleoside analogs; Superaugmented pendentic topochemical index; Support vector machine
Year: 2015 PMID: 26019722 PMCID: PMC4446118 DOI: 10.1186/s13065-015-0109-0
Source DB: PubMed Journal: Chem Cent J ISSN: 1752-153X Impact factor: 4.215
Fig. 1 Basic structures of purine nucleoside analogs from serial number 1 to 36 [25]
Relationship between molecular descriptors and anti-HIV activity in human PBM cells
| Serial number | Basic structure of compound | Substituent (R) | A2 | A4 | A23 | A37 | Predicted anti-HIV activity (A2 model) | Predicted anti-HIV activity (A4 model) | Predicted anti-HIV activity (A23 model) | Predicted anti-HIV activity (A37 model) | Reported anti-HIV activity (EC50) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | I | 3-[4-(Hydroxymethyl)-2-cyclopent-1-yl] | 0.919 | 3.287 | 9.61 | 2.16 | + | + | ± | − | + |
| 2 | I | 3-(β-D-1,3-Dioxolanyl) | 0.922 | 3.1 | 9.982 | 2.254 | + | + | ± | − | + |
| 3 | I | 3-(3-Azido-2,3-dideoxy-β-D-erythro-pentofuranosyl) | 0.903 | 4.599 | 50.599 | 2.197 | + | + | + | − | + |
| 4 | I | 3-(β-D-2-C-Methyl-ribofuranosyl) | 0.848 | 0.001 | 615.521 | 2.332 | − | − | − | − | − |
| 5 | II | 4-MeO-Ph | 0.973 | 11.666 | 4.866 | 1.596 | − | − | − | − | − |
| 6 | II | 4-Me-Ph | 0.972 | 11.15 | 5.338 | 1.614 | − | − | − | − | − |
| 7 | II | 4-Br-Ph | 0.97 | 11.545 | 3.883 | 1.631 | − | − | − | − | − |
| 8 | II | 4-NEt2-Ph | 0.976 | 14.051 | 43.583 | 1.531 | − | − | + | + | + |
| 9 | II | 4-NMe2-Ph | 0.976 | 12.597 | 43.44 | 1.573 | − | − | + | + | + |
| 10 | II | 2-Thiophenyl | 0.968 | 9.999 | 5.937 | 1.655 | − | − | − | − | − |
| 11 | II | 3-Thiophenyl | 0.969 | 10.637 | 5.223 | 1.641 | − | − | − | − | − |
| 12 | III | Et | 0.945 | 6.157 | 7.307 | 1.938 | − | − | − | − | − |
| 13 | III | Ph | 0.971 | 9.869 | 2.313 | 1.674 | − | − | − | − | − |
| 14 | III | 4-MeO-Ph | 0.974 | 11.602 | 4.849 | 1.627 | − | − | − | − | − |
| 15 | III | 3-MeO-Ph | 0.974 | 10.545 | 5.38 | 1.652 | − | − | − | − | − |
| 16 | III | 2-MeO-Ph | 0.97 | 9.834 | 6.121 | 1.678 | − | − | − | − | − |
| 17 | III | 4-Me-Ph | 0.973 | 11.081 | 5.436 | 1.648 | − | − | − | − | − |
| 18 | III | 4-Cl-Ph | 0.971 | 11.307 | 4.712 | 1.661 | − | − | − | − | − |
| 19 | III | 4-F-Ph | 0.971 | 10.929 | 5.132 | 1.655 | − | − | − | − | − |
| 20 | III | 2,4-F-Ph | 0.971 | 10.969 | 35.94 | 1.672 | − | − | ± | − | − |
| 21 | III | 4-NEt2-Ph | 0.974 | 13.458 | 44.566 | 1.555 | − | − | + | + | + |
| 22 | III | 4-NMe2-Ph | 0.976 | 12.544 | 44.599 | 1.601 | − | − | + | − | − |
| 23 | III | 2-Thiophenyl | 0.969 | 9.92 | 6.049 | 1.69 | − | − | − | − | − |
| 24 | III | 3-Thiophenyl | 0.97 | 9.951 | 5.261 | 1.675 | − | − | − | − | − |
| 25 | III | 4-N3-Ph | 0.961 | 12.091 | 4.532 | 1.611 | − | − | − | − | − |
| 26 | III | 4-CN-Ph | 0.975 | 12.57 | 4.893 | 1.626 | − | − | − | − | − |
| 27 | IV | Ph | 0.939 | 0.566 | 6.494 | 1.645 | − | − | − | − | − |
| 28 | IV | 4-MeO-Ph | 0.889 | 1.923 | 39.143 | 1.609 | − | − | ± | − | − |
| 29 | IV | 4-NEt2-Ph | 0.907 | 4.048 | 357.424 | 1.551 | + | + | + | + | + |
| 30 | IV | 4-NMe2-Ph | 0.901 | 2.76 | 344.712 | 1.589 | + | + | + | + | + |
| 31 | V | Et | 0.878 | 0.018 | 993.843 | 1.986 | − | − | − | − | − |
| 32 | V | 4-MeO-Ph | 0.937 | 0.982 | 1972.793 | 1.663 | − | − | − | − | − |
| 33 | V | 4-NEt2-Ph | 2.345 | 0.94 | 16083.229 | 1.594 | − | − | − | − | − |
| 34 | V | 4-NMe2-Ph | 0.983 | 0.939 | 12785.121 | 1.639 | − | − | − | − | − |
| 35 | V | 2-Thiophenyl | 0.123 | 0.923 | 2081.689 | 1.72 | − | − | − | − | − |
| 36 | V | 3-Thiophenyl | 0.122 | 0.923 | 2059.054 | 1.706 | − | − | − | − | − |
+, active; −, inactive; ±, transitional
List of molecular descriptors
| Code | Name of descriptor |
|---|---|
| A1 | Eccentricity index, ECC |
| A2 | Spherocity index, SPH |
| A3 | Molecular connectivity topochemical index |
| A4 | Shape profile no. 20, SP20 |
| A5 | Shape profile no. 07, SP07 |
| A6 | Shape profile no. 08, SP08 |
| A7 | Eccentric adjacency topochemical index |
| A8 | Radial distribution function - 10.5/weighted by atomic masses, RDF105m |
| A9 | Second Zagreb index M2, ZM2 |
| A10 | Augmented eccentric connectivity topochemical index |
| A11 | Mean information content on the distance magnitude, IDM |
| A12 | Molecular profile no. 10, DP10 |
| A13 | Molecular profile no. 11, DP11 |
| A14 | Molecular profile no. 12, DP12 |
| A15 | Molecular profile no. 13, DP13 |
| A16 | Molecular profile no. 14, DP14 |
| A17 | Radius of gyration (mass weighted), RGyr |
| A18 | Eccentric connectivity topochemical index |
| A19 | Connective eccentricity topochemical index |
| A20 | Average vertex distance degree, VDA |
| A21 | Mean square distance index (Balaban), MSD |
| A22 | Schultz molecular topological index, SMTI |
| A23 | Superaugmented pendentic topochemical index-4 |
| A24 | Gutman MTI by valence vertex degrees, GMTIV |
| A25 | Xu index, Xu |
| A26 | Mean Wiener index, WA |
| A27 | Superadjacency topochemical index |
| A28 | Harary |
| A29 | Quasi-Wiener index (Kirchhoff number), QW |
| A30 | First Mohar index, TI1 |
| A31 | Wiener’s topochemical index |
| A32 | Reciprocal hyper-detour index, Rww |
| A33 | Distance/detour index, D/D |
| A34 | All-path Wiener index, Wap |
| A35 | Superaugmented eccentric connectivity topochemical index-3 |
| A36 | Wiener-type index from |
| A37 | Balaban-type index from |
| A38 | Maximal electrotopological negative variation, MAXDN |
| A39 | Molecular electrotopological variation, DELS |
| A40 | Superaugmented eccentric connectivity topochemical index-4 |
| A41 | Three-path Kier alpha-modified shape index, S3K |
| A42 | Centralization, CENT |
| A43 | Distance/detour ring index of order 9, D/Dr09 |
| A44 | Molecular connectivity index |
| A45 | Eigenvalue 11 from edge adjacency matrix weighted by resonance integrals, EEig11r |
| A46 | Average geometric distance degree, AGDD |
| A47 | Absolute eigenvalue sum on geometry matrix, SEig |
| A48 | Eccentric adjacency index |
| A49 | 3D-MoRSE - signal 26/unweighted, Mor26u |
| A50 | 3D-MoRSE - signal 25/weighted by atomic Sanderson electronegativities, Mor25e |
| A51 | Augmented eccentric connectivity index |
| A52 | First component size directional WHIM index/unweighted, L1u |
| A53 | K global shape index/weighted by atomic Sanderson electronegativities, Ke |
| A54 | Superpendentic index, ∫ |
| A55 | Mean information content on the leverage magnitude, HIC |
| A56 | |
| A57 | |
| A58 | |
| A59 | Superaugmented eccentric connectivity index-1 |
| A60 | Wiener’s index |
Most of the Dragon descriptors are defined in ref. [26]
Fig. 2 A decision tree for distinguishing active purine nucleoside analogs (A) from inactive analogs (B)
Confusion matrix for anti-HIV activity of purine nucleoside analogs in human PBM cells
| Model | Description | Ranges (actual class) | Number of compounds predicted active | Number of compounds predicted inactive | Sensitivity (%) | Specificity (%) | Non-error rate (%) | Overall accuracy of prediction (%) | MCC |
|---|---|---|---|---|---|---|---|---|---|
| Decision tree | Training set | Active | 08 | 00 | 100 | 100 | 100 | >99.9 | 1.00 |
| | | Inactive | 00 | 28 | | | | | |
| Decision tree | Tenfold cross-validated set | Active | 06 | 02 | 75 | 93 | 84 | 89 | 0.68 |
| | | Inactive | 02 | 26 | | | | | |
| Random forest | | Active | 05 | 03 | 62.5 | 89 | 75.7 | 83 | 0.52 |
| | | Inactive | 03 | 25 | | | | | |
| Support vector machine | Training set | Active | 04 | 02 | 66.7 | 100 | 83.3 | 93 | 0.78 |
| | | Inactive | 00 | 21 | | | | | |
| Support vector machine | Test set | Active | 01 | 01 | 50 | 86 | 68 | 78 | 0.36 |
| | | Inactive | 01 | 06 | | | | | |
The recognition rate of decision tree-, random forest-, and support vector machine-based models is also shown
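The statistics reported in the confusion matrix can be recomputed from the four cell counts. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and the non-error rate is assumed to be the mean of sensitivity and specificity, which is consistent with the tabulated values but is not stated explicitly here.

```python
import math


def classification_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, non-error rate, accuracy (in %) and MCC.

    tp/fn: active compounds predicted active/inactive.
    fp/tn: inactive compounds predicted active/inactive.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    # Assumption: non-error rate = mean of sensitivity and specificity.
    non_error_rate = (sensitivity + specificity) / 2.0
    accuracy = 100.0 * (tp + tn) / (tp + fn + fp + tn)
    # Matthews correlation coefficient; defined as 0 when a marginal total is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "non_error_rate": non_error_rate,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Counts taken from the tenfold cross-validated decision tree rows.
cv_tree = classification_metrics(tp=6, fn=2, fp=2, tn=26)
# Counts taken from the support vector machine test set rows.
svm_test = classification_metrics(tp=1, fn=1, fp=1, tn=6)

for name, result in (("decision tree (CV)", cv_tree), ("SVM (test)", svm_test)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

Rounded to the precision used in the table, these counts give the tabulated sensitivity, specificity, accuracy and MCC for both rows.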
Proposed MAA models for the prediction of anti-HIV activity of PNAs in human PBM cells
| Descriptor | Nature of range in the proposed model | Descriptor value | Total number of compounds in each range | Number of compounds correctly predicted in each range | Sensitivity (%) | Specificity (%) | Non-error rate (%) | Overall accuracy of prediction (%) | MCC | Average EC50 (μM) of correctly predicted compounds in each range | Average SI of correctly predicted compounds in each range |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A2 | Lower inactive | <0.901 | 3 | 3 | 63 | 100 | 81.5 | 91.7 | 0.75 | 67.1 | 15.205 |
| | Active | 0.901 to 0.922 | 5 | 5 | | | | | | 0.182 | 825.741 |
| | Upper inactive | >0.922 | 28 | 25 | | | | | | 33.613 | 66.665 |
| A4 | Lower inactive | <2.76 | 9 | 9 | 63 | 100 | 81.5 | 91.7 | 0.75 | 78.855 | 6.544 |
| | Active | 2.76 to 4.599 | 5 | 5 | | | | | | 0.182 | 825.741 |
| | Upper inactive | >4.599 | 22 | 19 | | | | | | 17.47 | 43.186 |
| A23 | Lower inactive | <9.61 | 18 | 18 | 100 | 96 | 98 | 96.9 | 0.91 | 18.056 | 36.560 |
| | Transitional | 9.61 to 43.43 | 4 | NA | | | | | | 4.138 | 102.32 |
| | Active | 43.44 to 357.424 | 7 | 6 | | | | | | 0.141 | 859.542 |
| | Upper inactive | >357.424 | 7 | 7 | | | | | | 100 | 100 |
| A37 | Active | 1.531 to 1.589 | 5 | 5 | 63 | 100 | 81.5 | 91.7 | 0.75 | 0.134 | 920.339 |
| | Inactive | >1.589 | 24 | 21 | | | | | | 37.201 | 31.408 |
NA not applicable
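Each MAA model assigns a compound to a class according to the range in which its descriptor value falls. The following is a minimal sketch of that lookup for descriptor A23, using the range boundaries listed in the table above. The function name `classify_a23` is hypothetical, and values between 43.43 and 43.44, which the tabulated ranges leave unassigned, are assumed here to be transitional.

```python
# Range boundaries for descriptor A23, copied from the MAA table.
A23_LOWER_INACTIVE_LIMIT = 9.61   # values below this fall in the lower inactive range
A23_ACTIVE_MIN = 43.44            # start of the active range
A23_ACTIVE_MAX = 357.424          # end of the active range


def classify_a23(value):
    """Return the predicted class for an A23 value: '+', '±' or '−'."""
    if value < A23_LOWER_INACTIVE_LIMIT:
        return "−"  # lower inactive range
    if value < A23_ACTIVE_MIN:
        # Assumption: the gap between 43.43 and 43.44 is treated as transitional.
        return "±"  # transitional range
    if value <= A23_ACTIVE_MAX:
        return "+"  # active range
    return "−"      # upper inactive range


# A23 values of compounds 1, 3, 4, 13 and 29 from the descriptor table.
for serial, a23 in [(1, 9.61), (3, 50.599), (4, 615.521), (13, 2.313), (29, 357.424)]:
    print(serial, a23, classify_a23(a23))
```

The same pattern applies to A2, A4 and A37 with their own boundaries. The predictions for the five compounds shown agree with the A23 column of predicted activity in the descriptor table.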
Intercorrelation matrix
| | A2 | A4 | A23 | A37 |
|---|---|---|---|---|
| A2 | 1.00 | 0.85 | −0.13 | −0.62 |
| A4 | | 1.00 | −0.39 | −0.47 |
| A23 | | | 1.00 | −0.11 |
| A37 | | | | 1.00 |
Fig. 3 Average EC50 of anti-HIV activity of correctly predicted PNAs in various ranges of MAA-based models
Fig. 4 Average SI against PBM cells of correctly predicted PNAs in various ranges of MAA-based models