Mingyun Shen, Sheng Tian, Youyong Li, Qian Li, Xiaojie Xu, Junmei Wang, Tingjun Hou.
Abstract
BACKGROUND: In this work, we analyzed and compared the distribution profiles of a wide variety of molecular properties for three compound classes: drug-like compounds in MDL Drug Data Report (MDDR), non-drug-like compounds in Available Chemical Directory (ACD), and natural compounds in Traditional Chinese Medicine Compound Database (TCMCD).

RESULTS: The comparison of the property distributions suggests that, when all compounds in MDDR, ACD and TCMCD with molecular weight lower than 600 were used, MDDR and ACD are substantially different, while TCMCD is much more similar to MDDR than to ACD. However, when three subsets of ACD, MDDR and TCMCD with similar molecular weight distributions were examined, the distribution profiles of the representative physicochemical properties for MDDR and ACD no longer differ significantly. This suggests that, once the dependence on molecular weight is removed, drug-like and non-drug-like molecules cannot be effectively distinguished by simple property-based filters. The distribution profiles of several physicochemical properties for TCMCD, however, are clearly different from those for MDDR and ACD. Then, the performance of each molecular property in predicting drug-likeness was evaluated. No single molecular property discriminates well between drug-like and non-drug-like molecules. Compared with the other descriptors, fractional negative accessible surface area (FASA-) performs the best. Finally, a PCA-based scheme was used to visually characterize the spatial distributions of the three classes of compounds with similar molecular weight distributions.

CONCLUSION: If FASA- was used as a drug-likeness filter, more than 80% of the molecules in TCMCD were predicted to be drug-like. Moreover, the principal component plots show that natural compounds in TCMCD have different and even more diverse distributions than either drug-like compounds in MDDR or non-drug-like compounds in ACD.
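The abstract does not say how the performance of a single property was scored. As a minimal sketch, one plausible way is to scan every cut-off on one descriptor and keep the one that best separates the two classes. The function name `best_threshold`, the use of plain accuracy as the score, and the toy values below are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import List, Optional, Tuple


def best_threshold(values: List[float], labels: List[int]) -> Tuple[float, Optional[float], str]:
    """Scan cut-offs on one descriptor and return (accuracy, threshold, direction).

    labels: 1 = drug-like, 0 = non-drug-like.
    direction ">=" means "predict drug-like when value >= threshold";
    direction "<" means "predict drug-like when value < threshold".
    """
    n = len(values)
    best: Tuple[float, Optional[float], str] = (0.0, None, "")
    for t in sorted(set(values)):
        # Number of molecules classified correctly by the ">=" rule.
        correct_ge = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        # The "<" rule makes exactly the opposite predictions, so it is
        # correct on the remaining n - correct_ge molecules.
        for correct, direction in ((correct_ge, ">="), (n - correct_ge, "<")):
            acc = correct / n
            if acc > best[0]:
                best = (acc, t, direction)
    return best


# Toy descriptor values and labels, invented purely for illustration.
toy_values = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
toy_labels = [0, 0, 1, 0, 1, 1]

accuracy, threshold, direction = best_threshold(toy_values, toy_labels)
print(f"best rule: value {direction} {threshold} (accuracy {accuracy:.2f})")
```

Repeating this scan for each descriptor and comparing the best accuracies gives a simple ranking of single-property filters. A real evaluation would more likely use a held-out set or a threshold-independent score.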
Year: 2012 PMID: 23181938 PMCID: PMC3538521 DOI: 10.1186/1758-2946-4-31
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Figure 1. The distributions of eight important molecular property descriptors for ACD1, MDDR1 and TCMCD1.
The mean values of different properties for ACD, MDDR and TCMCD
| | | <600 | <800 | <600 | <600 | <800 | <600 | <800 | <600 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Alog | 2.29 | 2.30 | 3.84 | 3.24 | 3.30 | 2.84 | 2.73 | 3.02 |
| 2 | log | 1.78 | 1.79 | 3.45 | 2.71 | 2.77 | 2.53 | 2.41 | 2.71 |
| 3 | log | −3.76 | −3.79 | −6.00 | −5.51 | −5.76 | −4.34 | −4.41 | −4.71 |
| 4 | MW | 270 | 272 | 398 | 400 | 425 | 366 | 403 | 400 |
| 5 | | 3.22 | 3.24 | 4.41 | 4.83 | 5.22 | 5.27 | 6.32 | 5.82 |
| 6 | | 1.25 | 1.26 | 1.24 | 1.64 | 1.82 | 2.12 | 2.72 | 2.33 |
| 7 | | 4.28 | 4.32 | 6.08 | 6.45 | 7.02 | 4.44 | 5.08 | 4.88 |
| 8 | PSA | 61.4 | 61.8 | 81.3 | 88.2 | 95.4 | 84.5 | 99.7 | 92.4 |
| 9 | | 3.88 | 3.90 | 5.33 | 6.02 | 6.51 | 5.50 | 6.48 | 6.04 |
| 10 | | 1.54 | 1.55 | 1.39 | 1.83 | 2.01 | 2.14 | 2.74 | 2.35 |
| 11 | MSA | 262 | 264 | 365 | 382 | 406 | 358 | 394 | 390 |
| 12 | | 13.5 | 13.6 | 19.6 | 21.2 | 22.4 | 20.3 | 22.4 | 22.2 |
| 13 | | 1.94 | 1.94 | 2.30 | 2.89 | 3.03 | 0.477 | 0.300 | 0.459 |
| 14 | | 1.94 | 1.96 | 3.03 | 3.12 | 3.48 | 5.03 | 6.18 | 5.59 |
| 15 | | 0.655 | 0.658 | 1.12 | 0.647 | 0.664 | 0.139 | 0.0130 | 0.154 |
| 16 | | 18.3 | 18.4 | 26.5 | 28.2 | 29.9 | 26.1 | 28.9 | 28.5 |
| 17 | | 19.2 | 19.3 | 28.5 | 30.6 | 32.5 | 28.3 | 31.5 | 31.0 |
| 18 | | 0.0478 | 0.048 | 0.0706 | 0.0519 | 0.056 | 0.0348 | 0.017 | 0.0347 |
| 19 | | 0.0508 | 0.052 | 0.0797 | 0.0696 | 0.074 | 0.0341 | 0.012 | 0.0341 |
| 20 | | 0.00428 | 0.0040 | 0.0101 | 0.0200 | 0.021 | 0.0388 | 0.042 | 0.0459 |
| 21 | | 0.0229 | 0.0240 | 0.0355 | 0.106 | 0.121 | 0.456 | 0.542 | 0.503 |
| 22 | | 10.6 | 10.7 | 16.8 | 18.6 | 19.3 | 16.9 | 18.7 | 18.5 |
| 23 | | 8.16 | 8.19 | 13.0 | 12.4 | 12.7 | 5.79 | 6.01 | 6.07 |
| 24 | | 0.0880 | 0.0930 | 0.139 | 0.456 | 0.555 | 1.75 | 2.13 | 1.92 |
| 25 | | 1.92 | 1.93 | 3.00 | 3.41 | 3.52 | 3.27 | 3.59 | 3.58 |
| 26 | | 1.43 | 1.44 | 2.27 | 2.19 | 2.25 | 0.999 | 1.03 | 1.05 |
| 27 | | 1.65 | 1.66 | 2.43 | 2.52 | 2.59 | 1.51 | 1.67 | 1.61 |
| 28 | | 0.0407 | 0.041 | 0.0198 | 0.0401 | 0.043 | 0.0770 | 0.0721 | 0.0823 |
| 29 | | 0.00864 | 0.0090 | 0.00635 | 0.0427 | 0.044 | 0.0111 | 0.0120 | 0.0103 |
| 30 | | 0.474 | 0.475 | 0.644 | 0.815 | 0.825 | 0.705 | 0.696 | 0.762 |
| 31 | | 1.38 | 1.38 | 2.31 | 2.44 | 2.53 | 2.35 | 2.68 | 2.59 |
| 32 | | 0.0154 | 0.015 | 0.0193 | 0.0625 | 0.062 | 0.0693 | 0.064 | 0.0735 |
| 33 | | 0.00202 | 0.0020 | 0.00148 | 0.00356 | 0.0040 | 0.0128 | 0.020 | 0.0152 |
| 34 | | 6.21E-4 | 8.27E-4 | 0.00285 | 0.00620 | 0.018 | 0.0425 | 0.043 | 0.0446 |
| 35 | | 21.0 | 21.1 | 26.7 | 30.6 | 32.7 | 33.3 | 37.3 | 36.5 |
| 36 | | 8.51 | 8.55 | 12.7 | 13.5 | 14.0 | 12.0 | 13.3 | 13.1 |
| 37 | | 0.550 | 0.561 | 0.720 | 1.25 | 1.49 | 4.14 | 5.12 | 4.76 |
| 38 | | 0.163 | 0.166 | 0.442 | 0.493 | 0.524 | 1.06 | 1.09 | 1.13 |
| 39 | SC0 | 18.3 | 18.4 | 26.5 | 28.2 | 29.9 | 26.0 | 28.9 | 28.4 |
| 40 | SC1 | 19.2 | 19.3 | 28.5 | 30.6 | 32.5 | 28.3 | 31.5 | 31.0 |
| 41 | SC2 | 26.1 | 26.3 | 40.0 | 43.2 | 45.9 | 42.9 | 48.1 | 47.4 |
| 42 | SC3P | 32.0 | 32.3 | 51.5 | 56.5 | 60.0 | 60.0 | 67.8 | 66.9 |
| 43 | SC3C | 6.20 | 6.25 | 9.88 | 10.7 | 11.5 | 13.6 | 15.4 | 15.4 |
| 44 | SC3CH | 0.0407 | 0.041 | 0.0198 | 0.0401 | 0.043 | 0.0770 | 0.072 | 0.0823 |
Figure 2. The distributions of eight important molecular property descriptors for ACD3, MDDR1 and TCMCD3.
Figure 3. The plot of the first two principal components for (a) ACD4, (b) MDDR3 and (c) TCMCD3.
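For readers unfamiliar with principal component plots, the following is a minimal sketch of how a descriptor matrix can be projected onto its first two principal components. It is not the authors' procedure: the standardization step, the function name `pca_scores`, and the random toy matrix are assumptions made for illustration only.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project rows of X (molecules x descriptors) onto the leading principal components.

    Returns (scores, explained_variance_ratio) for the first n_components.
    Assumes no descriptor column is constant, since a constant column has
    zero standard deviation and cannot be standardized.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each descriptor so that differences in units do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix; the rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Toy descriptor matrix (50 molecules x 5 descriptors), random and illustrative only.
rng = np.random.default_rng(0)
toy = rng.normal(size=(50, 5))
# Make the second descriptor strongly correlated with the first.
toy[:, 1] = 2.0 * toy[:, 0] + rng.normal(scale=0.1, size=50)

scores, ratio = pca_scores(toy)
print(scores.shape, ratio)
```

Plotting `scores[:, 0]` against `scores[:, 1]` for each compound class gives a scatter of the kind described in the caption. To compare classes on common axes, the components would have to be fitted on the combined data, or on one reference set that is then applied to the others.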