Harinder Singh, Sandeep Singh, Deepak Singla, Subhash M Agarwal, Gajendra P S Raghava.
Abstract
BACKGROUND: Epidermal Growth Factor Receptor (EGFR) is a well-characterized cancer drug target. Several QSAR models have previously been developed for predicting the inhibitory activity of molecules against EGFR. These models are applicable only to a limited set of molecules from a particular class, such as quinazoline derivatives. In this study, an attempt has been made to develop prediction models on a large set of molecules (~3500 molecules) that includes diverse scaffolds such as quinazoline, pyrimidine, quinoline and indole.
Year: 2015 PMID: 25880749 PMCID: PMC4372225 DOI: 10.1186/s13062-015-0046-9
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Figure 1. Average frequency (with standard deviation) of various functional groups in inhibitors and non-inhibitors of the EGFR10 and EGFR1000 datasets.
Figure 2. The EGFR inhibitor gefitinib, marked with two frequently occurring functional groups (R2NH and rings).
Figure 3. Maximum common substructures (MCS) and their counts found in the actives (inhibitors) of the EGFR10 dataset.
List of the best 10 positive and negative fingerprints, which occur more frequently in inhibitors and non-inhibitors of the EGFR10 dataset, respectively.
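| Positive fingerprint | Inhibitors (%) | Non-inhibitors (%) | Difference | Negative fingerprint | Inhibitors (%) | Non-inhibitors (%) | Difference |
|---|---|---|---|---|---|---|---|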
| 380 | 71.85 | 43.64 | 28.21 | 698 | 21.26 | 45.15 | −23.89 |
| 579 | 75.79 | 52.82 | 22.97 | 673 | 8.27 | 31.80 | −23.53 |
| 189 | 38.78 | 17.35 | 21.43 | 690 | 57.48 | 76.51 | −19.03 |
| 388 | 67.52 | 46.41 | 21.11 | 700 | 19.29 | 38.00 | −18.71 |
| 816 | 24.21 | 6.24 | 17.97 | 714 | 3.54 | 20.42 | −16.88 |
| 815 | 32.68 | 16.68 | 15.99 | 145 | 30.31 | 45.28 | −14.96 |
| 374 | 39.96 | 27.06 | 12.90 | 701 | 14.37 | 28.50 | −14.13 |
| 613 | 32.87 | 20.95 | 11.92 | 669 | 2.17 | 15.92 | −13.75 |
| 661 | 31.50 | 19.82 | 11.68 | 195 | 6.50 | 18.02 | −11.52 |
| 348 | 40.16 | 29.50 | 10.66 | 382 | 2.56 | 11.61 | −9.05 |
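
In each row above, the fourth value equals the second minus the third (for example, 71.85 − 43.64 = 28.21), so the ranking appears to be driven by a frequency difference between the two classes. A minimal sketch of that kind of ranking follows. It assumes each fingerprint is represented as a set of "on" bit indices and that frequencies are percentages of molecules; the function names and the toy representation are illustrative, not taken from the paper.

```python
def bit_frequency(fingerprints, bit):
    """Percentage of molecules whose fingerprint has `bit` switched on."""
    if not fingerprints:
        return 0.0
    hits = sum(1 for fp in fingerprints if bit in fp)
    return 100.0 * hits / len(fingerprints)


def rank_bits(inhibitors, non_inhibitors):
    """Return (bit, % in inhibitors, % in non-inhibitors, difference) rows,
    sorted from the most inhibitor-enriched bit to the most depleted one."""
    bits = set().union(*inhibitors, *non_inhibitors)
    rows = []
    for bit in bits:
        f_inh = bit_frequency(inhibitors, bit)
        f_non = bit_frequency(non_inhibitors, bit)
        rows.append((bit, f_inh, f_non, f_inh - f_non))
    # Sort by descending difference; break ties by bit index for determinism.
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows
```

Taking the first ten rows of such a ranking would give candidate "positive" fingerprints and the last ten candidate "negative" ones.
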
Figure 4. Structural representation of PubChem fingerprints found more frequently in inhibitors than in non-inhibitors.
Figure 5. Performance of a simple method that predicts inhibitors based on the occurrence of the best 20 fingerprints found in inhibitors and non-inhibitors. The secondary Y-axis shows the range of MCC, and the X-axis shows the summed values of the best 20 fingerprints.
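
The caption does not say how the 20 fingerprint values are summed, so the following is only one plausible reading: a sketch that adds 1 for each positive fingerprint present in a molecule, subtracts 1 for each negative fingerprint present, and calls the molecule an inhibitor when the sum exceeds a threshold. The bit IDs are copied from the table of best fingerprints; the ±1 scoring, the default threshold and the function names are assumptions.

```python
# Bit IDs copied from the table of best positive and negative fingerprints.
POSITIVE_BITS = {380, 579, 189, 388, 816, 815, 374, 613, 661, 348}
NEGATIVE_BITS = {698, 673, 690, 700, 714, 145, 701, 669, 195, 382}


def fingerprint_score(on_bits):
    """Assumed scoring: +1 per positive bit present, -1 per negative bit present."""
    on_bits = set(on_bits)
    return len(on_bits & POSITIVE_BITS) - len(on_bits & NEGATIVE_BITS)


def predict_inhibitor(on_bits, threshold=0):
    """Predict 'inhibitor' when the summed score exceeds the (assumed) threshold."""
    return fingerprint_score(on_bits) > threshold
```

Sweeping `threshold` over the range of possible scores and computing MCC at each value would produce a curve of the kind the figure describes.
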
The performance of models based on various classifiers, developed and evaluated on the EGFR10 dataset
| Classifier | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUC |
|---|---|---|---|---|---|
| IBK | 68.69 | 84.98 | 82.63 | 0.45 | 0.87 |
| Bayes | 68.73 | 70.57 | 70.31 | 0.29 | 0.72 |
| Naive Bayes | 69.87 | 67.96 | 68.23 | 0.27 | 0.70 |
| SVM | 67.11 | 86.24 | 83.48 | 0.46 | 0.87 |
| Random Forest | 68.74 | 87.67 | 84.95 | 0.49 | 0.89 |
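
The three percentage values in each row behave like sensitivity, specificity and accuracy (the third always lies between the first two), and MCC is named in the Figure 5 caption. Assuming those standard definitions, a minimal sketch of how such measures are computed from a binary confusion matrix is shown below; the function name and argument order are illustrative.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification measures from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    Sensitivity, specificity and accuracy are returned as percentages.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

Because accuracy weights each class by its size, a dataset dominated by non-inhibitors pulls accuracy toward specificity, which matches the pattern in the rows above.
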
The performance of models developed on the EGFR10 dataset, on class-specific molecules, and on EGFR10 excluding a single class, evaluated using cross-validation techniques for testing on the same class of molecules
| Training set | Test set | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUC |
|---|---|---|---|---|---|---|
| EGFR10 train | EGFR10 train | 68.74 | 87.67 | 84.95 | 0.49 | 0.89 |
| EGFR10 train | EGFR10 Validation | 69.89 | 86.03 | 83.66 | 0.49 | 0.89 |
| Pyrimidine | Pyrimidine | 69.25 | 92.13 | 86.92 | 0.62 | 0.92 |
| Pyrimidine | Quinazoline | 68.62 | 54.88 | 58.88 | 0.21 | 0.67 |
| Quinazoline | Quinazoline | 68.15 | 79.63 | 76.31 | 0.45 | 0.81 |
| Quinazoline | Pyrimidine | 67.86 | 64.04 | 64.91 | 0.27 | 0.74 |
| EGFR10-Pyrimidine | EGFR10-Pyrimidine | 68.7 | 94.08 | 91.34 | 0.59 | 0.92 |
| EGFR10-Quinazoline | EGFR10-Quinazoline | 69.66 | 96.4 | 94.04 | 0.64 | 0.95 |
| EGFR10-Pyrimidine | Pyrimidine | 68.06 | 76.74 | 74.77 | 0.4 | 0.77 |
| EGFR10-Quinazoline | Quinazoline | 60.31 | 76.25 | 71.66 | 0.35 | 0.72 |
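
The last rows pair a training set that excludes one scaffold class with a test set made up of that class alone. A minimal sketch of such a leave-one-class-out split is given below. It assumes each molecule is a dictionary with a `scaffold` key; that record layout and the function name are hypothetical and used only for illustration.

```python
def leave_class_out(molecules, held_out_class):
    """Split molecules into (train, test) by scaffold class.

    `train` holds every molecule NOT in `held_out_class`;
    `test` holds only molecules of `held_out_class`.
    Each molecule is assumed to be a dict with a 'scaffold' key.
    """
    train = [m for m in molecules if m["scaffold"] != held_out_class]
    test = [m for m in molecules if m["scaffold"] == held_out_class]
    return train, test
```

A model fitted on `train` and scored on `test` then measures how well it transfers to a scaffold it has never seen.
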
The performance of models developed on the EGFR100 and EGFR1000 training sets using different PubChem fingerprints and evaluated on validation sets
| Training set | Test set | Fingerprints | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUC |
|---|---|---|---|---|---|---|---|
| EGFR100 train set | EGFR100 train set | PubChem 881 | 88.01 | 73.34 | 78.2 | 0.58 | 0.90 |
| EGFR100 train set | EGFR100 validation set | PubChem 881 | 91.1 | 68.5 | 76.8 | 0.58 | 0.90 |
| EGFR1000 train set | EGFR1000 train set | PubChem 881 | 86.97 | 78.36 | 82.92 | 0.66 | 0.89 |
| EGFR1000 train set | EGFR1000 validation set | PubChem 881 | 85.7 | 85.5 | 85.6 | 0.71 | 0.90 |