Selvaraman Nagamani, G Narahari Sastry.
Abstract
Drug-resistant strains of Mycobacterium tuberculosis (M.tb) are evolving at an alarming rate, underscoring the urgent need for novel antitubercular drugs. However, genetic mutations, the complex cell wall of M.tb, and influx-efflux transporter systems are the major permeability barriers that significantly affect the activity of M.tb drugs. Thus, most small molecules fail to arrest M.tb cell growth, even though they are effective at the cellular level. To address the permeability issue, different machine learning models that effectively distinguish permeable from impermeable compounds were developed. Enzyme-based (IC50) and cell-based (minimal inhibitory concentration) data were used to classify compounds as M.tb permeable or impermeable. It was assumed that compounds with high activity in both enzyme-based and cell-based assays possess the required M.tb cell wall permeability. The XGBoost model outperformed the models generated with other algorithms such as random forest, support vector machine, and naïve Bayes. The XGBoost model was further validated using the validation data set (21 permeable and 19 impermeable compounds). The machine learning models suggested that descriptors such as molecular weight, atom type, electrotopological state, hydrogen bond donor/acceptor counts, and extended topochemical atoms are the major determining factors for both M.tb cell permeability and inhibitory activity. Furthermore, potential antimycobacterial drugs were identified using computational drug repurposing. All approved drugs from DrugBank were collected and screened using the developed permeability model. The screened compounds were submitted to the PASS server to identify possible antimycobacterial compounds. The drugs retained after these two filters were docked to the active sites of 10 different potential antimycobacterial drug targets. The results obtained from this study may improve the understanding of M.tb permeability and activity, which may aid in the development of novel antimycobacterial drugs.
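The labeling assumption described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the cutoff values, the `Compound` type, and the rule that an enzyme-active but cell-inactive compound counts as impermeable are all assumptions made for the example, since the abstract does not state them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical activity cutoffs (micromolar); the abstract gives no thresholds.
IC50_ACTIVE_UM = 10.0
MIC_ACTIVE_UM = 10.0


@dataclass
class Compound:
    name: str
    ic50_um: float  # enzyme-based assay result
    mic_um: float   # cell-based assay result (minimal inhibitory concentration)


def label_permeability(c: Compound,
                       ic50_cut: float = IC50_ACTIVE_UM,
                       mic_cut: float = MIC_ACTIVE_UM) -> Optional[str]:
    """Assign a permeability class from paired assay data.

    Active in both assays -> "permeable" (the assumption stated in the abstract).
    Active on the enzyme but not in cells -> "impermeable" (assumed here).
    Inactive on the enzyme -> None, because the pair says nothing about permeability.
    """
    enzyme_active = c.ic50_um <= ic50_cut
    cell_active = c.mic_um <= mic_cut
    if enzyme_active and cell_active:
        return "permeable"
    if enzyme_active:
        return "impermeable"
    return None


if __name__ == "__main__":
    for cmpd in (Compound("cmpd_A", 0.5, 1.0), Compound("cmpd_B", 0.5, 100.0)):
        print(cmpd.name, label_permeability(cmpd))
```

Returning `None` for enzyme-inactive compounds keeps them out of both classes, which avoids training the classifier on pairs that carry no permeability signal.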
Year: 2021 PMID: 34278133 PMCID: PMC8280707 DOI: 10.1021/acsomega.1c01865
Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343
Figure 1. Performance of different machine learning models for the classification of M.tb permeable and impermeable compounds (RF, random forest; GBM, gradient boosting model; CART, classification and regression tree; Glmnet, Lasso and elastic-net regularized generalized linear model; SVM, support vector machine; KNN, k-nearest neighbors; NB, naïve Bayes; and logistic, logistic regression).
Figure 2. Optimization of RF models at different mtry and ntree values using the top (A) 20 descriptors, (B) 40 descriptors, (C) 60 descriptors, (D) 80 descriptors, and (E) 100 descriptors as input features. mtry is the number of variables randomly sampled as candidates at each split, and ntree is the number of trees to grow. Performance was measured as the percentage out-of-bag (OOB) error.
Performance of RF Models at Different mtry Values Using Variable Number of Important Descriptors
| descriptors | mtry | sensitivity | specificity | precision | accuracy | MCC |
|---|---|---|---|---|---|---|
| Top 20 | 3 | 0.9814 | 0.8289 | 0.9420 | 0.9403 | 0.8455 |
| Top 20 | 4 | 0.9907 | 0.8289 | 0.9425 | 0.9692 | 0.8645 |
| Top 20 | 5 | 0.9814 | 0.8205 | 0.9378 | 0.9412 | 0.8395 |
| Top 20 | 6 | 0.9721 | 0.8289 | 0.9414 | 0.9130 | 0.8273 |
| Top 20 | 7 | 0.9814 | 0.8289 | 0.9420 | 0.9403 | 0.8455 |
| Top 40 | 2 | 0.9767 | 0.8158 | 0.9375 | 0.9254 | 0.8270 |
| Top 40 | 3 | 0.9814 | 0.8553 | 0.9505 | 0.9420 | 0.8641 |
| Top 40 | 4 | 0.9907 | 0.8684 | 0.9552 | 0.9706 | 0.8918 |
| Top 40 | 5 | 0.9814 | 0.8421 | 0.9462 | 0.9412 | 0.8548 |
| Top 40 | 6 | 0.9860 | 0.8421 | 0.9464 | 0.9552 | 0.8641 |
| Top 60 | 3 | 0.9953 | 0.8421 | 0.9469 | 0.9846 | 0.8832 |
| Top 60 | 5 | 0.9814 | 0.8289 | 0.9420 | 0.9403 | 0.8455 |
| Top 60 | 7 | 0.9814 | 0.8158 | 0.9378 | 0.9394 | 0.8362 |
| Top 60 | 9 | 0.9907 | 0.8205 | 0.9383 | 0.9697 | 0.8583 |
| Top 60 | 11 | 0.9814 | 0.8289 | 0.9420 | 0.9403 | 0.8455 |
| Top 80 | 36 | 0.9721 | 0.8421 | 0.9457 | 0.9143 | 0.8368 |
| Top 80 | 37 | 0.9721 | 0.8421 | 0.9457 | 0.9143 | 0.8368 |
| Top 80 | 38 | 0.9953 | 0.8421 | 0.9469 | 0.9846 | 0.8832 |
| Top 80 | 39 | 0.9587 | 0.8421 | 0.9457 | 0.8767 | 0.8115 |
| Top 80 | 40 | 0.9814 | 0.8421 | 0.9462 | 0.9412 | 0.8548 |
| Top 100 | 2 | 0.9814 | 0.8289 | 0.9420 | 0.9403 | 0.8455 |
| Top 100 | 4 | 0.9907 | 0.8421 | 0.9467 | 0.9697 | 0.8736 |
| Top 100 | 6 | 0.9767 | 0.8421 | 0.9459 | 0.9275 | 0.8457 |
| Top 100 | 8 | 0.9721 | 0.8421 | 0.9457 | 0.9143 | 0.8368 |
| Top 100 | 10 | 0.9814 | 0.8421 | 0.9462 | 0.9412 | 0.8548 |
The model built with the top 40 descriptors at mtry = 4 was selected as the best-performing model. These descriptors were then further optimized using the boosting method in XGBoost.
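For readers who want to reproduce metrics of the kind reported in the table, the sketch below computes sensitivity, specificity, precision, accuracy, and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The function name and the counts in the usage line are illustrative assumptions; the source does not report confusion-matrix counts.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every count is non-negative and that both classes have at least
    one true member and one predicted member (no zero denominators).
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, value in classification_metrics(tp=211, fp=13, tn=63, fn=4).items():
        print(f"{name}: {value:.4f}")
```

MCC is a useful companion to accuracy when the permeable and impermeable classes are imbalanced, because it uses all four cells of the confusion matrix.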
Figure 3. ROC performance of the (A) gradient boosting model and (B) QSAR model on external validation data set (40 compounds).
Figure 4. Schematic workflow for the development of M.tb permeability machine learning model.
Figure 5. Significantly discriminating descriptors: (A) ETA_Epsilon_5, (B) nHBint5, (C) A log P, (D) MaxsOm, (E) Mpe, (F) GATS1e, (G) GATS3v, (H) MeanI, and (I) BUCTp.1l on the basis of Wilcoxon test (P < 0.05) among permeable and impermeable compounds.
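The descriptor comparison in Figure 5 relies on a Wilcoxon test between the two compound classes. A minimal standard-library sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney U) test is given below, assuming a normal approximation with tie correction and no continuity correction; the source does not say which variant or software was used, and the descriptor values in the usage example are made up.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1               # size of this tie group
        tie_term += t ** 3 - t
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


if __name__ == "__main__":
    # Hypothetical descriptor values for the two classes.
    permeable = [2.1, 2.4, 2.8, 3.0, 3.3]
    impermeable = [3.9, 4.2, 4.4, 4.8, 5.1]
    u, p = rank_sum_test(permeable, impermeable)
    print(f"U = {u}, p = {p:.4f}")
```

For small samples an exact test is preferable to the normal approximation; statistical packages such as SciPy provide one.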
The Top 10 Drug Molecules with Probable Tuberculosis Activity in PASS Prediction along with Docking Scores (kcal/mol) against 10 M.tb Targets
| S. no. | drug name | phase | MOA | Pa | mtcA2 | folA | inhA | Cyp51 | folP1 | tmk | ligA | pknB | kasA | dprE1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Nomegestrol | approved | progesterone receptor agonist | 0.9793 | –7.5 | –10.9 | –11.9 | –11.7 | –9.1 | –8.4 | –9.4 | –10.2 | –8.6 | –10.6 |
| 2 | NGX267 | investigational | muscarinic acetylcholine receptor M1 | 0.956 | –5.0 | –6.6 | –7.0 | –7.2 | –6.0 | –5.7 | –6.1 | –6.4 | –6.7 | –7.6 |
| 3 | Gamolenic acid | approved and investigational | NA | 0.8674 | –5.7 | –7.6 | –7.8 | –8.0 | –6.1 | –7.5 | –6.4 | –6.9 | –7.6 | –7.8 |
| 4 | Tetrazepam | experimental | NA | 0.8559 | –6.3 | –9.2 | –10.7 | –9.6 | –8.9 | –7.2 | –9.0 | –9.5 | –8.3 | –9.1 |
| 5 | Nitrofural | approved and investigational | anti-infective agent | 0.7415 | –4.9 | –6.4 | –6.2 | –6.5 | –5.6 | –7.3 | –6.1 | –5.7 | –6.4 | –7.3 |
| 6 | Quinine | approved | hemozoin biocrystallization inhibitor | 0.6352 | –6.7 | –9.6 | –9.7 | –10.4 | –8.4 | –8.1 | –8.1 | –8.2 | –9.9 | –10.6 |
| 7 | Quinidine | approved and investigational | sodium channel blocker | 0.6352 | –6.8 | –9.6 | –9.7 | –10.4 | –8.4 | –8.6 | –8.1 | –8.2 | –8.8 | –9.7 |
| 8 | But-3-enyl-[5-(4-chloro-phenyl)-3,6-dihydro-[1,3,4]thiadiazin-2-ylidene]-amine | experimental | NA | 0.6175 | –5.6 | –7.7 | –7.5 | –8.2 | –6.6 | –7.6 | –7.0 | –6.6 | –6.8 | –8.0 |
| 9 | Lefamulin | approved and investigational | 50S ribosomal protein L22 | 0.5986 | –7.5 | –11.9 | –13.3 | –12.0 | –10.2 | –9.9 | –10.6 | –11.3 | –9.1 | –12.2 |
| 10 | Stavudine | approved and investigational | nucleoside reverse transcriptase inhibitor | 0.5958 | –6.1 | –7.2 | –7.3 | –7.7 | –6.3 | –8.6 | –6.6 | –6.3 | –7.3 | –8.0 |
Pa is the predicted probability that a compound is active (PASS prediction).