Bucong Han, Xiaohua Ma, Ruiying Zhao, Jingxian Zhang, Xiaona Wei, Xianghui Liu, Xin Liu, Cunlong Zhang, Chunyan Tan, Yuyang Jiang, Yuzong Chen.
Abstract
BACKGROUND: Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the targets of multi-target kinase inhibitors in clinical use and in trials for the treatment of leukemia and other cancers. These successes, together with the appearance of drug resistance in some patients, have raised significant interest in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates.
Year: 2012 PMID: 23173901 PMCID: PMC3538513 DOI: 10.1186/1752-153X-6-139
Source DB: PubMed Journal: Chem Cent J ISSN: 1752-153X Impact factor: 4.215
Figure 1. The structures of representative c-Src inhibitors.
Molecular descriptors used in this work
| Descriptor class | Number of descriptors | Descriptors |
|---|---|---|
| Simple molecular properties | 18 | Number of C, N, O, P, S atoms, Number of total atoms, Number of rings, Number of bonds, Number of non-H bonds, Molecular weight, Number of rotatable bonds, Number of H-bond donors, Number of H-bond acceptors, Number of 5-member aromatic rings, Number of 6-member aromatic rings, Number of N heterocyclic rings, Number of O heterocyclic rings, Number of S heterocyclic rings |
| Chemical properties | 3 | Sanderson electronegativity, Molecular polarizability, aLogp |
| Molecular connectivity and shape | 35 | Schultz molecular topological index, Gutman molecular topological index, Wiener index, Harary index, Gravitational topological index, Molecular path count of length 1–6, Total path count, Balaban Index J, 0th to 2nd order valence connectivity index, 0th to 2nd order delta chi index, Pogliani index, 0th to 2nd order Solvation connectivity index, 1st to 3rd order Kier shape index, 1st to 3rd order Kappa alpha shape index, Kier Molecular Flexibility Index, Topological radius, Graph-theoretical shape coefficient, Eccentricity, Centralization, Logp from connectivity |
| Electro-topological state | 42 | Sum of E-state of atom type sCH3, dCH2, ssCH2, dsCH, aaCH, sssCH, dssC, aasC, aaaC, sssC, sNH3, sNH2, ssNH2, dNH, ssNH, aaNH, dsN, aaN, sssN, ddsN, aOH, sOH, ssO, sSH; Sum of E-state of all heavy atoms, all C atoms, all hetero atoms; Sum of E-state of H-bond acceptors; Sum of H E-state of atom type HsOH, HdNH, HsSH, HsNH2, HssNH, HaaNH, HtCH, HdCH2, HdsCH, HaaCH, HCsats, HCsatu, Havin; Sum of H E-state of H-bond donors |
Figure 2. The process of training and using an SVM VS model for screening compounds. The schematic diagram illustrates the process of training a prediction model and using it to predict active compounds of a compound class from their structurally derived properties (molecular descriptors) by using support vector machines. A, B, E, F and (hj, pj, vj, …) represent structural and physicochemical properties such as hydrophobicity, volume and polarizability.
Performance of SVM for identifying Src inhibitors and non-inhibitors evaluated by 5-fold cross validation study
| Fold | Inhibitors (training/testing) | TP | FN | Sensitivity | Non-inhibitors (training/testing) | TN | FP | Specificity | Overall accuracy (Q) | Matthews correlation coefficient (C) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1362/341 | 320 | 21 | 93.84% | 50654/12664 | 12651 | 13 | 99.90% | 99.74% | 0.948 |
| 2 | 1362/341 | 324 | 17 | 95.01% | 50654/12664 | 12650 | 14 | 99.89% | 99.76% | 0.953 |
| 3 | 1362/341 | 324 | 17 | 95.01% | 50654/12664 | 12640 | 24 | 99.81% | 99.68% | 0.939 |
| 4 | 1363/340 | 318 | 22 | 93.53% | 50655/12663 | 12642 | 21 | 99.83% | 99.67% | 0.935 |
| 5 | 1363/340 | 322 | 18 | 94.71% | 50655/12663 | 12643 | 20 | 99.84% | 99.71% | 0.943 |
| Average | | | | 94.42% | | | | 99.85% | 99.71% | 0.944 |
| SD | | | | 0.0069 | | | | 0.0004 | 0.0004 | 0.0072 |
| Standard error | | | | 0.0031 | | | | 0.0002 | 0.0002 | 0.0032 |
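The percentage and correlation columns of the cross-validation tables can be recomputed from the four count columns, read here as true positives, false negatives, true negatives and false positives. The following is a minimal sketch, assuming the last column is the Matthews correlation coefficient (the table does not name it, but the fold 1 counts reproduce 0.948 under that assumption). The function name `fold_metrics` is illustrative and not taken from the paper.

```python
import math

def fold_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, overall accuracy and MCC for one fold."""
    sensitivity = tp / (tp + fn)                 # fraction of inhibitors recovered
    specificity = tn / (tn + fp)                 # fraction of non-inhibitors rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy Q
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: the correlation coefficient is the Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Fold 1 of the SVM table: 320 TP, 21 FN, 12651 TN, 13 FP
se, sp, q, c = fold_metrics(320, 21, 12651, 13)
print(f"SE={se:.2%} SP={sp:.2%} Q={q:.2%} C={c:.3f}")
# prints: SE=93.84% SP=99.90% Q=99.74% C=0.948
```

Because non-inhibitors outnumber inhibitors by roughly 37 to 1 in each testing fold, overall accuracy is dominated by specificity, which is why the correlation coefficient is the more informative single number when comparing methods.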
Performance of kNN for identifying Src inhibitors and non-inhibitors evaluated by 5-fold cross validation study
| Fold | Inhibitors (training/testing) | TP | FN | Sensitivity | Non-inhibitors (training/testing) | TN | FP | Specificity | Overall accuracy (Q) | Matthews correlation coefficient (C) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1362/341 | 302 | 39 | 88.56% | 50654/12664 | 12635 | 29 | 99.77% | 99.48% | 0.896 |
| 2 | 1362/341 | 313 | 28 | 91.79% | 50654/12664 | 12620 | 44 | 99.65% | 99.45% | 0.894 |
| 3 | 1362/341 | 311 | 30 | 91.20% | 50654/12664 | 12610 | 54 | 99.57% | 99.35% | 0.878 |
| 4 | 1363/340 | 316 | 24 | 92.94% | 50655/12663 | 12619 | 44 | 99.65% | 99.48% | 0.901 |
| 5 | 1363/340 | 302 | 38 | 88.82% | 50655/12663 | 12632 | 31 | 99.76% | 99.47% | 0.895 |
| Average | | | | 90.66% | | | | 99.68% | 99.44% | 0.893 |
| SD | | | | 0.0191 | | | | 0.0008 | 0.0005 | 0.0085 |
| Standard error | | | | 0.0085 | | | | 0.0004 | 0.0002 | 0.0038 |
Performance of PNN for identifying Src inhibitors and non-inhibitors evaluated by 5-fold cross validation study
| Fold | Inhibitors (training/testing) | TP | FN | Sensitivity | Non-inhibitors (training/testing) | TN | FP | Specificity | Overall accuracy (Q) | Matthews correlation coefficient (C) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1362/341 | 319 | 22 | 93.55% | 50654/12664 | 12413 | 251 | 98.02% | 97.90% | 0.715 |
| 2 | 1362/341 | 324 | 17 | 95.01% | 50654/12664 | 12380 | 284 | 97.76% | 97.69% | 0.702 |
| 3 | 1362/341 | 330 | 11 | 96.77% | 50654/12664 | 12395 | 269 | 97.88% | 97.85% | 0.722 |
| 4 | 1363/340 | 330 | 10 | 97.06% | 50655/12663 | 12389 | 274 | 97.84% | 97.82% | 0.720 |
| 5 | 1363/340 | 318 | 22 | 93.53% | 50655/12663 | 12413 | 250 | 98.03% | 97.91% | 0.715 |
| Average | | | | 95.19% | | | | 97.90% | 97.83% | 0.715 |
| SD | | | | 0.0169 | | | | 0.0012 | 0.0009 | 0.0075 |
| Standard error | | | | 0.0076 | | | | 0.0005 | 0.0004 | 0.0034 |
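The Average, SD and SE rows summarise the five folds. The sketch below recomputes them for the PNN sensitivity column, assuming SD is the sample standard deviation (n − 1 denominator) and SE is the standard error of the mean, SD/√5. The table does not state these conventions, but the values it reports are reproduced under them.

```python
import math
import statistics

# Per-fold (true positives, testing inhibitors) taken from the PNN table;
# testing inhibitors = true positives + false negatives.
folds = [(319, 341), (324, 341), (330, 341), (330, 340), (318, 340)]
sensitivities = [tp / n for tp, n in folds]

mean = statistics.mean(sensitivities)
sd = statistics.stdev(sensitivities)        # sample SD (assumed convention)
sem = sd / math.sqrt(len(sensitivities))    # standard error of the mean

print(f"mean={mean:.2%} SD={sd:.4f} SE={sem:.4f}")
# prints: mean=95.19% SD=0.0169 SE=0.0076
```

Note that the SD and SE rows are expressed as fractions (0.0169 corresponds to 1.69 percentage points), whereas the fold rows are given as percentages.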
Figure 3. Performance for identifying Src inhibitors evaluated by 5-fold cross validation across methods. The figure shows the average sensitivity of each method in the 5-fold cross-validation studies, together with the respective error bars.
Virtual screening performance of support vector machines for identifying Src inhibitors from large compound libraries
| Category | Measure | Value |
|---|---|---|
| Inhibitors in Testing Set | Number of Inhibitors | 44 |
| | Number of Chemical Families Covered by Inhibitors | 35 |
| | Percent of Inhibitors in Chemical Families Covered by Inhibitors in Training Set | 51.43% |
| Virtual Screening Performance | Yield | 70.45% |
| | Number and Percent of Identified True Inhibitors Outside Training Chemical Families | 15 (34.1%) |
| | Number and Percent of 13.56M PubChem Compounds Identified as Inhibitors | 44,843 (0.33%) |
| | Number and Percent of the 168K MDDR Compounds Identified as Inhibitors | 1,496 (0.89%) |
| | Number and Percent of the 9,305 MDDR Compounds Similar to the Known Inhibitors Identified as Inhibitors | 719 (7.73%) |
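The percentages in this table are simple ratios of hits to library size, and they can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the library sizes 13,560,000 and 168,000 are the rounded figures "13.56M" and "168K" from the table rather than exact counts, and the 31 recovered inhibitors is inferred from the reported yield of 70.45% of 44 rather than stated in the table.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

testing_inhibitors = 44
recovered = 31                 # assumption: inferred from 70.45% of 44
pubchem_size = 13_560_000      # assumption: rounded "13.56M"
mddr_size = 168_000            # assumption: rounded "168K"
mddr_similar = 9_305           # stated exactly in the table

print(f"yield:             {percent(recovered, testing_inhibitors):.2f}%")  # 70.45%
print(f"novel-family hits: {percent(15, testing_inhibitors):.1f}%")         # 34.1%
print(f"PubChem hit rate:  {percent(44_843, pubchem_size):.2f}%")           # 0.33%
print(f"MDDR hit rate:     {percent(1_496, mddr_size):.2f}%")               # 0.89%
print(f"similar-MDDR rate: {percent(719, mddr_similar):.2f}%")              # 7.73%
```

The yield measures recovery of known inhibitors, while the three library rates serve as upper bounds on the false-hit rate, since most library compounds are presumed not to be Src inhibitors.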
MDDR classes that contain a higher percentage (≥3%) of SVM virtual hits, and the percentage values
| MDDR class | Number of virtual hits in class | Percentage of class members selected as virtual hits |
|---|---|---|
| Antineoplastic | 623 | 2.9% |
| Tyrosine-Specific Protein Kinase Inhibitor | 231 | 19.6% |
| Signal Transduction Inhibitor | 194 | 9.5% |
| Antiarthritic | 176 | 1.5% |
| Antiallergic/Antiasthmatic | 83 | 0.8% |
| Antihypertensive | 76 | 0.7% |
| Antiangiogenic | 75 | 4.6% |
| Treatment for Osteoporosis | 55 | 2.2% |
| Antidepressant | 49 | 0.8% |
Virtual hits were identified by SVM in screening 168K MDDR compounds for Src inhibitors. The total number of SVM-identified virtual hits is 1,496.
Figure 4. Virtual hit inhibiting Src at a moderate rate of 4.85% at 20 μM.
Comparison of virtual screening performance of SVM with those of other methods
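| Method | Training inhibitors | Training chemical families | Testing inhibitors | Testing chemical families | Testing inhibitors in families covered by training set | Yield | True inhibitors identified outside training families | 168K MDDR compounds identified as inhibitors | 9,305 similar MDDR compounds identified as inhibitors |
|---|---|---|---|---|---|---|---|---|---|
| Support Vector Machines | 1703 | 493 | 44 | 35 | 51.43% | 70.45% | 15 (34.1%) | 1,496 (0.89%) | 719 (7.73%) |
| Tanimoto Similarity | | | | | | 36.84% | 9 (20.5%) | 9,305 (5.54%) | 9,305 (100%) |
| K Nearest Neighbour | | | | | | 38.64% | 10 (22.7%) | 4,182 (2.49%) | 1,169 (12.57%) |
| Probabilistic Neural Network | | | | | | 50.0% | 13 (29.5%) | 4,386 (2.60%) | 1,184 (12.72%) |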
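The Tanimoto similarity baseline selects library compounds that resemble known inhibitors, which is consistent with its flagging all 9,305 "similar" MDDR compounds. A minimal sketch of such a search is shown below, assuming binary fingerprints represented as sets of "on" bit indices. The fingerprints, the compound names and the 0.5 threshold are hypothetical illustrations, since the table does not specify the fingerprint type or the cutoff that was used.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of 'on' bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def similarity_hits(library, known_inhibitors, threshold):
    """Return names of library compounds whose best similarity to any
    known inhibitor is at least `threshold`."""
    return [
        name
        for name, fp in library.items()
        if max((tanimoto(fp, ref) for ref in known_inhibitors), default=0.0) >= threshold
    ]

# Toy data: hypothetical bit indices, not real compounds.
known = [{1, 2, 3, 4}, {2, 5, 8}]
library = {"cmpd_a": {1, 2, 3, 9}, "cmpd_b": {6, 7}, "cmpd_c": {2, 5, 8}}

hits = similarity_hits(library, known, threshold=0.5)
print(hits)  # ['cmpd_a', 'cmpd_c']
```

A similarity search of this kind can only retrieve compounds close to the known inhibitors, which fits the lower share of true inhibitors it identifies outside the training chemical families compared with SVM.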