Vinita Periwal, Shireesha Kishtapuram, Vinod Scaria.
Abstract
BACKGROUND: The emergence of multidrug-resistant tuberculosis in pandemic proportions throughout the world and the paucity of novel therapeutics for tuberculosis have reiterated the need to accelerate the discovery of novel molecules with anti-tubercular activity. Though high-throughput screens for anti-tubercular activity are available, they are too expensive, tedious and time-consuming to be performed on large scales. Thus, there remains an unmet need to prioritize the molecules that are taken up for biological screens to save on cost and time. Computational methods, including machine learning, have been widely employed to build classifiers for high-throughput virtual screens that prioritize molecules for further analysis. The availability of datasets from high-throughput biological screens or assays in the public domain makes computational methods a plausible proposition for building predictive models. In addition, this approach would save significantly on the cost, effort and time required to run high-throughput screens.
Year: 2012 PMID: 22463123 PMCID: PMC3342097 DOI: 10.1186/1471-2210-12-1
Source DB: PubMed Journal: BMC Pharmacol ISSN: 1471-2210
Table 1. Misclassification cost used for false negatives with each classifier
| Classifier | Cost |
|---|---|
|  | 110 |
|  | 14000 |
|  | 35 |
|  | 350 |
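To make the role of a false-negative cost concrete, here is a minimal sketch of a minimum-expected-cost decision rule. It is a generic illustration, not a description of how the classifiers in this study were configured: the function names are hypothetical, the false-positive cost of 1 is an assumption, and the predicted probabilities are invented.

```python
def decision_threshold(fn_cost, fp_cost=1.0):
    """Probability of 'active' above which calling a compound active is cheaper.

    Derived from p * fn_cost > (1 - p) * fp_cost.
    fp_cost=1.0 is an assumption; the table above lists only false-negative costs.
    """
    return fp_cost / (fp_cost + fn_cost)


def predict_with_fn_cost(p_active, fn_cost, fp_cost=1.0):
    """Pick the label with the lower expected misclassification cost."""
    cost_if_called_inactive = p_active * fn_cost        # risk of missing a true active
    cost_if_called_active = (1.0 - p_active) * fp_cost  # risk of flagging a true inactive
    return "active" if cost_if_called_inactive > cost_if_called_active else "inactive"


if __name__ == "__main__":
    # The four false-negative costs listed in the table above.
    for fn_cost in (110, 14000, 35, 350):
        print(fn_cost, decision_threshold(fn_cost))
    # Invented probability: a compound scored at 5% is still called active
    # once a missed active costs 35 times as much as a false alarm.
    print(predict_with_fn_cost(0.05, fn_cost=35))
```

The higher the false-negative cost, the lower the probability at which a compound is called active, which is why such costs are useful when actives are rare.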
Table 2. Statistics of best predictive models for AID449762
| Classifier | TP rate (%) | FP rate (%) | TN rate (%) | FN rate (%) | Accuracy | AUC | BCR (%) |
|---|---|---|---|---|---|---|---|
|  | 47.30 | 19.50 | 80.50 | 52.70 | 80.28% | 0.70 | 63.90 |
|  | 47.00 | 19.20 | 80.80 | 53.00 | 80.58% | 0.712 | 63.90 |
|  | 51.90 | 19.30 | 80.70 | 48.10 | 80.52% | 0.748 | 66.30 |
|  | 40.60 | 19.10 | 80.90 | 59.40 | 80.62% | 0.61 | 60.75 |
*CSC denotes CostSensitiveClassifier, # BCR denotes Balanced Classification Rate
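The Balanced Classification Rate is conventionally the mean of sensitivity and specificity. The sketch below checks that definition against the table above, assuming that the first and third numeric values in each row are sensitivity (true positive rate) and specificity (true negative rate) and that the last value is the reported BCR. That column reading is an inference from the arithmetic, and the function name is hypothetical.

```python
def balanced_classification_rate(sensitivity, specificity):
    """Mean of sensitivity and specificity, both given in percent."""
    return (sensitivity + specificity) / 2.0


# (sensitivity, specificity, reported BCR) copied from the four table rows;
# which columns hold which quantity is an assumption, as noted above.
ROWS = [
    (47.30, 80.50, 63.90),
    (47.00, 80.80, 63.90),
    (51.90, 80.70, 66.30),
    (40.60, 80.90, 60.75),
]

if __name__ == "__main__":
    for sens, spec, reported in ROWS:
        computed = balanced_classification_rate(sens, spec)
        print(f"computed={computed:.2f} reported={reported:.2f}")
```

Unlike raw accuracy, this average is not dominated by the majority class, which matters when inactive compounds vastly outnumber actives.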
Figure 1. Receiver operating characteristic (ROC) curve plot of all the models. Among all classifiers, SMO achieved the maximum value for area under the curve (AUC), closely followed by Random Forest and Naïve Bayes. J48 had the lowest AUC. The corresponding scalar AUC values can be viewed in Table 2.
Figure 2. Sensitivity and specificity plot of all models. The sensitivity and specificity plot of the classifiers revealed an optimal prediction by all models. All the classifiers performed uniformly, with high and nearly equal specificity values, SMO being slightly more sensitive than the others.
Figure 3. A schematic illustration of the methodology used. HTS data were downloaded from the PubChem [13] database. Molecular descriptors were calculated with the software PowerMV [25]. The resulting data were processed to create train/test files, which were then used to generate classification models on the Weka [27] workbench.
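The train/test step in the workflow above can be sketched as a stratified split, which keeps the proportion of active compounds the same in both files. This is only an illustration under stated assumptions: the text does not say how the split was made, so the 80/20 ratio, the stratification, the function name and the toy compound records are all hypothetical.

```python
import random


def stratified_split(records, label_of, test_fraction=0.2, seed=0):
    """Split records into train/test lists while preserving class proportions.

    test_fraction=0.2 is an assumed ratio, not one taken from the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(label_of(rec), []).append(rec)

    train, test = [], []
    for group in by_class.values():
        shuffled = group[:]          # copy so the caller's list is not reordered
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


# Invented toy data: 10 actives among 1000 compounds.
records = [(f"cmpd{i}", "active" if i < 10 else "inactive") for i in range(1000)]
train, test = stratified_split(records, label_of=lambda r: r[1])

if __name__ == "__main__":
    print(len(train), len(test))
    print(sum(1 for _, label in test if label == "active"))
```

Splitting per class avoids the case where a purely random split leaves the test file with no actives at all, a real risk when actives are a small fraction of the screen.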