Kailin Tang, Ruixin Zhu, Yixue Li, Zhiwei Cao.
Abstract
BACKGROUND: Assessing whether a compound is druglike as early as possible is critical in the drug discovery process. Many efforts have been made to create sets of 'rules' or 'filters' intended to help chemists distinguish 'drug-like' molecules from 'non-drug' molecules. However, within the chemical space of druglike molecules, only a minority will become approved drugs. Distinguishing approved drugs from experimental drugs may therefore be more helpful for identifying future approved drugs. This paper discriminates approved drugs from experimental ones by analyzing the compounds in terms of the features of existing drugs and by applying machine learning methods.
Year: 2011 PMID: 21569562 PMCID: PMC3120701 DOI: 10.1186/1471-2105-12-157
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Commonly used datasets
| Dataset | Number of compounds |
|---|---|
| Comprehensive Medicinal Chemistry (CMC) | > 8,000 compounds used or studied as medicinal agents |
| World Drug Index (WDI) | > 80,000 marketed and development drugs worldwide |
| MACCS-II Drug Data Report (MDDR) | > 100,000 drugs launched or under development |
| Available Chemicals Directory (ACD) | > 1,160,000 unique chemicals |
The number of compounds per dataset
| Dataset | Size | Pass Lipinski | Pass Oprea |
|---|---|---|---|
| Approved drugs | 1348 | 1158 | 1041 |
| Experimental drugs | 3206 | 2621 | 2271 |
| Herbal ingredients | 10370 | 7599 | 6058 |
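The counts in the table are easier to compare as pass rates. The following is a minimal sketch that only divides the numbers already listed above; the names `datasets` and `pass_rates` are illustrative and do not come from the paper.

```python
# Counts copied from the table above: (size, pass Lipinski, pass Oprea).
datasets = {
    "Approved drugs": (1348, 1158, 1041),
    "Experimental drugs": (3206, 2621, 2271),
    "Herbal ingredients": (10370, 7599, 6058),
}


def pass_rates(size, lipinski, oprea):
    """Return the fraction of compounds passing each filter."""
    return lipinski / size, oprea / size


# Map each dataset name to (Lipinski pass rate, Oprea pass rate).
rates = {name: pass_rates(*counts) for name, counts in datasets.items()}

for name, (lip, opr) in rates.items():
    print(f"{name}: Lipinski {lip:.1%}, Oprea {opr:.1%}")
```

Under both filters the pass rate falls from approved drugs to experimental drugs to herbal ingredients.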
Figure 1. Results of classification. Boxplot of performance for the four classification methods.
Figure 2. Results of cross-validation. Histograms illustrating the Sn, Sp, and Ac of 10 times 5-fold cross-validation.
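The "10 times 5-fold" scheme named in the Figure 2 caption can be sketched as follows. This is an illustration in plain Python, not the authors' procedure: how the folds were actually drawn (for example, whether they were stratified by class) is not stated here, and the function name `repeated_kfold`, the seed, and the sample count are all assumptions.

```python
import random


def repeated_kfold(n_samples, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # draw a new random partition for every repeat
        # Deal indices round-robin so fold sizes differ by at most one.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for k, test_idx in enumerate(folds):
            # Every fold except the k-th one forms the training set.
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield repeat, k, train_idx, test_idx


# Hypothetical sample count, chosen only to show uneven fold sizes.
splits = list(repeated_kfold(n_samples=23))
print(len(splits))  # 10 repeats x 5 folds
```

Each of the 50 train/test splits would produce one value of Sn, Sp, and Ac, which is the kind of data a histogram of cross-validation results summarizes.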
Classification results
| Model | Sn | Sp | Ac | CC |
|---|---|---|---|---|
| Best SVM | 0.5929 | 0.8743 | 0.7911 | 0.5077 |
| Voting model | 0.5523 | 0.9320 | 0.8197 | 0.5415 |
| Consensus model | 0.7242 | 0.9352 | 0.8517 | 0.6449 |
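The table does not expand its column abbreviations. Assuming Sn, Sp, Ac, and CC stand for sensitivity, specificity, accuracy, and the Matthews correlation coefficient, the four values can be computed from a binary confusion matrix as in the sketch below. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return (Sn, Sp, Ac, CC) from binary confusion-matrix counts.

    Assumed definitions (not spelled out in the table above):
      Sn = sensitivity, Sp = specificity, Ac = accuracy,
      CC = Matthews correlation coefficient.
    """
    sn = tp / (tp + fn)                   # fraction of positives recovered
    sp = tn / (tn + fp)                   # fraction of negatives recovered
    ac = (tp + tn) / (tp + fn + tn + fp)  # fraction of all correct calls
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, ac, cc


# Hypothetical counts, used only to exercise the function.
sn, sp, ac, cc = classification_metrics(tp=70, fn=30, tn=180, fp=20)
print(f"Sn={sn:.4f} Sp={sp:.4f} Ac={ac:.4f} CC={cc:.4f}")
```

With unbalanced classes, Ac can stay high while Sn is low, which is one reason a correlation coefficient is usually reported alongside it.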
Predicted results
| Compound type | Number of compounds | Pass Lipinski | Pass Oprea | Pass our model |
|---|---|---|---|---|
| Type I | 59 | 47 | 39 | 45 |
| Type II | 68 | 54 | 45 | 15 |
| Type III | 10243 | 7498 | 5974 | 3666 |