Ismail Hdoufane, Imane Bjij, Mahmoud Soliman, Alia Tadjer, Didier Villemin, Jane Bogdanov, Driss Cherqaoui.
Abstract
Quantitative structure-activity relationships (QSAR or SAR) help scientists establish mathematical relationships between molecular structures and their biological activities. In the present article, SAR studies were carried out on 89 tetrahydroimidazo[4,5,1-jk][1,4]benzodiazepine (TIBO) derivatives using four classifiers: support vector machines, artificial neural networks, random forests, and decision trees. The goal is to propose models that classify TIBO compounds into two groups: high and low inhibitors of HIV-1 reverse transcriptase. Each molecular structure was encoded by 10 descriptors. To check their validity, all models were subjected to internal validation, Y-randomization, and external validation. The correct classification rates reached 100% and 90% in the training and test sets, respectively. Finally, molecular docking analysis was carried out to understand the interactions between the reverse transcriptase enzyme and the TIBO compounds studied. Hydrophobic and hydrogen bond interactions led to the identification of active binding sites. The established models could help scientists predict the inhibition activity of untested compounds or of novel molecules prior to their synthesis, and could therefore reduce the trial-and-error process in the design of human immunodeficiency virus (HIV) inhibitors.
Keywords: HIV inhibitors; TIBO; decision trees; random forests and artificial neural networks; structure activity relationship; support vector machines
Year: 2018; PMID: 30011783; PMCID: PMC6160994; DOI: 10.3390/ph11030069
Source DB: PubMed; Journal: Pharmaceuticals (Basel); ISSN: 1424-8247
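The Y-randomization test named in the abstract can be illustrated with a minimal sketch: a model is evaluated once with the true class labels and then repeatedly with shuffled labels, and a valid model should perform clearly worse on the shuffled ones. Everything below is an assumption made for illustration only. The data are synthetic toy points, and the leave-one-out 1-nearest-neighbour classifier is a stand-in, not one of the four classifiers used in the study.

```python
import random

def loo_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (stand-in model)."""
    hits = 0
    for i, p in enumerate(points):
        # Index of the closest other point by squared Euclidean distance.
        j = min(
            (k for k in range(len(points)) if k != i),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(p, points[k])),
        )
        hits += labels[j] == labels[i]
    return hits / len(points)

def y_randomization(points, labels, n_rounds=20, seed=0):
    """Return (accuracy with true labels, mean accuracy over shuffled labels)."""
    rng = random.Random(seed)
    true_acc = loo_accuracy(points, labels)
    shuffled = []
    for _ in range(n_rounds):
        permuted = labels[:]
        rng.shuffle(permuted)  # break the structure-activity link
        shuffled.append(loo_accuracy(points, permuted))
    return true_acc, sum(shuffled) / n_rounds

# Synthetic toy data (NOT from the study): two well-separated clusters.
data_rng = random.Random(1)
points = [(data_rng.random(), data_rng.random()) for _ in range(10)]
points += [(5 + data_rng.random(), 5 + data_rng.random()) for _ in range(10)]
labels = ["H"] * 10 + ["L"] * 10

true_acc, mean_shuffled = y_randomization(points, labels)
print(f"true labels: {true_acc:.2f}, shuffled labels (mean): {mean_shuffled:.2f}")
```

If the shuffled-label accuracy stayed close to the true-label accuracy, the original model would be suspected of fitting chance correlations.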
Figure 1. General structure of TIBO derivatives studied.
Chemical structures of the compounds studied and their anti-HIV activity
| N | X | Z | R | X’ | Exp (a) | SVM (b) | ANN (c) | DT (d) | RF (e) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | H | S | DMA | 5-Me( | H | H | H | H | H |
| 2 | 9-Cl | S | DMA | 5-Me( | H | H | H | H | H |
| t 3 | 8-Cl | S | DMA | 5-Me( | H | H | H | H | H |
| 4 | 8-F | S | DMA | 5-Me( | H | H | H | H | H |
| 5 | 8-SMe | S | DMA | 5-Me( | H | H | H | H | H |
| t 6 | 8-OMe | S | DMA | 5-Me( | H | H | H | H | H |
| 7 | 8-OC2H5 | S | DMA | 5-Me( | H | H | H | H | H |
| 8 | 8-CN | O | DMA | 5-Me( | H | H | H | H | H |
| t 9 | 8-CN | S | DMA | 5-Me( | H | H | H | H | H |
| 10 | 8-CHO | S | DMA | 5-Me( | H | H | H | H | H |
| 11 | 8-CONH2 | O | DMA | 5-Me( | L | L | L | L | L |
| 12 | 8-Br | O | DMA | 5-Me( | H | H | H | H | H |
| t 13 | 8-Br | S | DMA | 5-Me( | H | H | H | H | H |
| 14 | 8-I | O | DMA | 5-Me( | H | H | H | H | H |
| t 15 | 8-I | S | DMA | 5-Me( | H | H | H | H | H |
| 16 | 8-C=CH | O | DMA | 5-Me( | H | H | H | H | H |
| t 17 | 8-C=CH | S | DMA | 5-Me( | H | H | H | H | H |
| 18 | 8-Me | O | DMA | 5-Me( | H | H | H | H | H |
| 19 | 8-Me | S | DMA | 5-Me( | H | H | H | H | H |
| 20 | 9-NO2 | O | CPM | 5-Me( | L | L | L | L | L |
| t 21 | 8-NH2 | O | CPM | 5-Me( | L | L | L | L | L |
| 22 | 8-NMe2 | O | CPM | 5-Me( | L | L | L | L | L |
| 23 | 9-NH2 | O | CPM | 5-Me( | L | L | L | L | L |
| t 24 | 9-NMe2 | O | CPM | 5-Me( | L | L | L | L | L |
| 25 | 9-NHCOMe | O | CPM | 5-Me( | L | L | L | L | L |
| t 26 | 9-NO2 | S | CPM | 5-Me( | L | H | L | H | H |
| 27 | 9-F | S | DMA | 5-Me( | H | H | H | H | H |
| 28 | 9-CF3 | O | DMA | 5-Me( | L | L | L | L | L |
| t 29 | 9-CF3 | S | DMA | 5-Me( | H | H | H | H | H |
| t 30 | 9-Me | O | DEA | 5-Me( | H | L | L | L | L |
| 31 | 10-OMe | O | DMA | 5-Me( | L | L | L | L | L |
| t 32 | 10-OMe | S | DMA | 5-Me( | L | H | L | H | H |
| 33 | 9,10-di-Cl | S | DMA | 5-Me( | H | H | H | H | H |
| 34 | 10-Br | S | DMA | 5-Me( | H | H | H | H | H |
| 35 | H | O | CH2CH=CH2 | 5-Me( | L | L | L | L | L |
| 36 | H | O | 2-MA | 5-Me( | L | L | L | L | L |
| 37 | H | O | CH2CO2Me | 5-Me( | L | L | L | L | L |
| t 38 | H | O | CH2C≡CH | 5-Me( | L | L | L | L | L |
| 39 | H | O | CH2-2-furanyl | 5-Me( | L | L | L | L | L |
| 40 | H | O | CH2CH=CH2[S(+)] | 5-Me( | L | L | L | L | L |
| 41 | H | O | CH2CH2CH=CH2 | 5-Me( | L | L | L | L | L |
| 42 | H | O | CH2CH2CH3 | 5-Me( | L | L | L | L | L |
| 43 | H | O | 2-MA[S(+)] | 5-Me( | L | L | L | L | L |
| 44 | H | O | CPM | 5-Me( | L | L | L | L | L |
| t 45 | H | O | CH2CH=CHMe( | 5-Me( | L | L | L | L | L |
| 46 | H | O | CH2CH=CHMe( | 5-Me( | L | L | L | L | L |
| 47 | H | O | CH2CH2CH2Me | 5-Me( | L | L | L | L | L |
| 48 | H | O | DMA | 5-Me( | L | L | L | L | L |
| 49 | H | O | CH2C(Br)=CH2 | 5-Me( | L | L | L | L | L |
| 50 | H | O | CH2C(Me)=CHMe( | 5-Me( | L | L | L | L | L |
| 51 | H | O | DMA[ | 5-Me( | L | L | L | L | L |
| 52 | H | O | DMA[ | 5-Me( | L | L | L | L | L |
| t 53 | H | O | CH2C(C2H5)=CH2 | 5-Me( | L | L | L | L | L |
| 54 | H | O | CH2CH=CHC6H5( | 5-Me( | L | L | L | L | L |
| 55 | H | O | CH2C(CH=CH2)=CH2 | 5-Me( | L | L | L | L | L |
| 56 | 8-Cl | S | DMA | H | H | H | H | H | H |
| 57 | 9-Cl | S | DMA | H | H | H | H | H | H |
| 58 | H | O | 2-MA | 5,5-di-Me | L | L | L | L | L |
| 59 | H | O | 2-MA | 4-Me | L | L | L | L | L |
| 60 | 9-Cl | S | 2-MA | 4-Me( | H | H | H | L | H |
| 61 | 9-Cl | S | CPM | 4-Me( | L | L | L | L | L |
| 62 | H | O | C3H7 | 4-CHMe2 | L | L | L | L | L |
| 63 | H | O | 2-MA | 4-CHMe2 | L | L | L | L | L |
| 64 | H | O | 2-MA | 4-C3H7 | L | L | L | L | L |
| 65 | H | O | DMA | 7-Me | L | L | L | H | L |
| t 66 | 8-Cl | O | DMA | 7-Me | H | H | H | L | H |
| t 67 | 9-Cl | O | DMA | 7-Me | H | H | H | L | H |
| 68 | H | S | C3H7 | 7-Me | L | L | L | L | L |
| 69 | H | S | DMA | 7-Me | H | H | H | H | H |
| 70 | 8-Cl | S | DMA | 7-Me | H | H | H | H | H |
| 71 | 9-Cl | S | DMA | 7-Me | H | H | H | H | H |
| 72 | H | O | DMA | 4,5-di-Me( | L | L | L | L | L |
| 73 | H | S | DMA | 4,5-di-Me( | L | L | L | L | L |
| t 74 | H | S | CPM | 4,5-di-Me( | L | L | L | L | L |
| 75 | H | S | DMA | 4,5-di-Me( | L | L | L | L | L |
| 76 | H | S | DMA | 5,7-di-Me( | H | H | H | H | H |
| 77 | H | S | DMA | 5,7-di-Me( | H | H | H | H | H |
| 78 | 9-Cl | O | DMA | 5,7-di-Me( | H | H | H | H | H |
| 79 | 9-Cl | S | DMA | 5,7-di-Me( | H | H | H | H | H |
| 80 | H | S | DMA | 4,7-di-Me( | L | L | L | L | L |
| t 81 | 9-Cl | O | DMA | 5-Me( | H | H | H | L | L |
| 82 | 9-Cl | S | CPM | 5-Me( | H | H | H | H | H |
| t 83 | H | S | CPM | 5-Me( | H | H | L | H | L |
| 84 | H | O | C3H7 | 5-Me | L | L | L | L | L |
| 85 | H | S | C3H7 | 5-Me | L | L | L | L | L |
| 86 | H | O | 2-MA | 5-Me | L | L | L | L | L |
| 87 | H | S | DMA | 5-Me | H | H | H | H | H |
| 88 | H | O | DMA | 5-Me( | L | L | L | L | L |
| 89 | H | S | 2-MA | 5-Me( | H | H | L | H | H |
a: Experimental activity class (H: high, L: low). b–e: Classes predicted by SVM, ANN, DT, and RF, respectively. t: Test-set compound. DMA: 3,3-Dimethylallyl. MA: Methylallyl. CPM: Cyclopropylmethyl. DEA: Diethylallyl.
List of the selected molecular descriptors and their physical and chemical meanings.
| Descriptors | Chemical Meaning |
|---|---|
| MD1 | logP: Octanol/water partition coefficient for the compound studied |
| MD2 | Average nucleophilic reaction index for a N atom |
| MD3 | Minimum total interaction for a H-N bond |
| MD4 | Minimum (>0.1) bond order of a N atom |
| MD5 | ESP-HBSA H-bonding surface area |
| MD6 | Maximum atomic state energy for a N atom |
| MD7 | ³χ: Third-order molecular connectivity index |
Figure 2. Decision tree for TIBO derivatives.
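Classification results for the training and test sets obtained by the four methods. Sn(H) (sensitivity) is the percentage of high-activity compounds classified correctly; Sn(L) (specificity) is the percentage of low-activity compounds classified correctly.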
| Methods | Training Set: Total Accuracy (%) | Training Set: Sn(H) (%) | Training Set: Sn(L) (%) | Test Set: Total Accuracy (%) | Test Set: Sn(H) (%) | Test Set: Sn(L) (%) |
|---|---|---|---|---|---|---|
| SVM | 100.00 | 100.00 | 100.00 | 85.00 | 91.67 | 75.00 |
| ANN | 98.55 | 96.43 | 100.00 | 90.00 | 83.33 | 100.00 |
| DT | 97.10 | 96.43 | 97.56 | 70.00 | 66.67 | 75.00 |
| RF | 100.00 | 100.00 | 100.00 | 75.00 | 75.00 | 75.00 |
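The test-set percentages in this table can be reproduced from simple counts. The sketch below assumes the class sizes obtained by counting the t-marked rows of the compound table (12 high and 8 low test compounds) and the per-method error counts read off the list of misclassified samples. The function name `rates` is a hypothetical helper, not something from the article.

```python
def rates(n_high, n_low, missed_high, missed_low):
    """Return (total accuracy, Sn(H), Sn(L)) in percent, rounded to 2 decimals."""
    n_total = n_high + n_low
    total = 100 * (n_total - missed_high - missed_low) / n_total
    sn_h = 100 * (n_high - missed_high) / n_high  # high compounds recovered
    sn_l = 100 * (n_low - missed_low) / n_low     # low compounds recovered
    return tuple(round(v, 2) for v in (total, sn_h, sn_l))

# (misclassified high, misclassified low) per method in the test set,
# counted by looking up each misclassified compound's experimental class.
test_missed = {"SVM": (1, 2), "ANN": (2, 0), "DT": (4, 2), "RF": (3, 2)}

# Test set assumed to hold 12 high and 8 low compounds (t-marked rows).
test_rates = {m: rates(12, 8, *missed) for m, missed in test_missed.items()}

for method, (total, sn_h, sn_l) in test_rates.items():
    print(f"{method}: total={total:.2f} Sn(H)={sn_h:.2f} Sn(L)={sn_l:.2f}")
```

The same function applied to the training set (69 compounds, which the reported percentages imply split into 28 high and 41 low) gives the training columns.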
Figure 3. Correct Classification Rate (CCR) of SVM, ANN, DT, and RF.
Misclassified samples by SVM, ANN, DT, and RF.
| Method | Set | Misclassified Compounds |
|---|---|---|
| SVM | Training set | None |
| SVM | Test set | 26, 30, 32 |
| ANN | Training set | 89 |
| ANN | Test set | 30, 83 |
| DT | Training set | 60, 65 |
| DT | Test set | 26, 30, 32, 66, 67, 81 |
| RF | Training set | None |
| RF | Test set | 26, 30, 32, 81, 83 |
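The test-set entries above can be cross-checked against the compound table by comparing each predicted class with the experimental one. The sketch below transcribes the class columns of the 20 t-marked rows. The five-letter encoding and the names `TEST_CLASSES` and `misclassified` are illustrative choices, not part of the article.

```python
METHODS = ("SVM", "ANN", "DT", "RF")

# Compound number -> classes in the order Exp, SVM, ANN, DT, RF,
# transcribed from the t-marked (test-set) rows of the compound table.
TEST_CLASSES = {
    3: "HHHHH", 6: "HHHHH", 9: "HHHHH", 13: "HHHHH", 15: "HHHHH",
    17: "HHHHH", 21: "LLLLL", 24: "LLLLL", 26: "LHLHH", 29: "HHHHH",
    30: "HLLLL", 32: "LHLHH", 38: "LLLLL", 45: "LLLLL", 53: "LLLLL",
    66: "HHHLH", 67: "HHHLH", 74: "LLLLL", 81: "HHHLL", 83: "HHLHL",
}

def misclassified(classes):
    """Return {method: [compounds whose predicted class differs from Exp]}."""
    errors = {method: [] for method in METHODS}
    for compound, codes in classes.items():
        experimental, predictions = codes[0], codes[1:]
        for method, predicted in zip(METHODS, predictions):
            if predicted != experimental:
                errors[method].append(compound)
    return errors

test_errors = misclassified(TEST_CLASSES)
for method, compounds in test_errors.items():
    print(method, compounds)
```

Compound 30 is the only one that all four methods misclassify: it is experimentally high but predicted low by every model.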
Contributions (%) of the molecular descriptors to the anti-HIV SAR study. Average values are given at the bottom of the table.
| MD1 | MD2 | MD3 | MD4 | MD5 | MD6 | MD7 | MD8 | MD9 | MD10 |
|---|---|---|---|---|---|---|---|---|---|
| 12.96 | 0.00 | 13.88 | 6.70 | 10.20 | 22.18 | 8.80 | 8.66 | 8.80 | 7.82 |
| 11.44 | 0.00 | 10.55 | 6.21 | 11.06 | 18.91 | 9.78 | 14.09 | 9.60 | 8.35 |
| 12.02 | 0.00 | 13.63 | 6.34 | 10.70 | 20.52 | 9.35 | 10.01 | 9.27 | 8.15 |
Figure 4. (a) Alignment view of the pre-existing (blue) and docked (yellow) ligands. (b) Alignment view of high- and low-activity TIBO derivatives.
Figure 5. Hydrogen bond between the imidazolone ring of the ligand and the LYS101 residue of the RT. Hydrophobic, electrostatic, and steric contour maps are represented by green, blue, and red contours, respectively.
Figure 6. Hydrogen bond, hydrophobic, and electrostatic interactions as exhibited by compounds 8 (left) and 37 (right).