Derek Jones1, Jeevith Bopaiah1, Fatemah Alghamedy1, Nathan Jacobs1, Heidi L Weiss1, W A de Jong2, Sally R Ellingson1.
Abstract
Protein kinases generate nearly a thousand different protein products and regulate the majority of cellular pathways and signal transduction. It is therefore not surprising that the deregulation of kinases has been implicated in many disease states. In fact, kinase inhibitors are the largest class of new cancer therapies. Understanding polypharmacology within the full kinome, that is, how drugs interact with many different kinases, would allow for the development of safer and more efficacious cancer therapies. Characterizing all of these interactions experimentally is not feasible, which makes highly accurate computational predictions especially valuable. This work aims to build a machine learning model that is useful for investigating the full kinome. We evaluate many feature sets for our model, and all of them outperform molecular docking. We demonstrate that our model achieves a nearly 60% increase in success rate at identifying binding compounds compared with molecular docking scores.
Year: 2018 PMID: 29888050 PMCID: PMC5961802
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Representation of each kinase in the test set
| Kinase | Whole dataset: total | Whole dataset: ratio 0:1 | Test set: class 0 | Test set: class 1 | Test set: fraction 0 | Test set: fraction 1 | Test set: total fraction | Test set: ratio 0:1 |
|---|---|---|---|---|---|---|---|---|
| | 11,180 | 37 | 2,105 | 60 | 0.188 | 0.005 | 0.194 | 35 |
| | 16,999 | 39 | 3,298 | 73 | 0.194 | 0.004 | 0.198 | 45 |
| | 7,142 | 37 | 1,462 | 34 | 0.205 | 0.005 | 0.209 | 43 |
| | 10,349 | 40 | 2,036 | 46 | 0.197 | 0.004 | 0.201 | 44 |
| | 29,126 | 35 | 5,604 | 158 | 0.192 | 0.005 | 0.198 | 35 |
| | 12,720 | 43 | 2,563 | 54 | 0.201 | 0.004 | 0.206 | 47 |
| | 36,274 | 43 | 6,936 | 158 | 0.191 | 0.004 | 0.196 | 44 |
| | 5,516 | 47 | 1,085 | 14 | 0.197 | 0.003 | 0.199 | 78 |
| | 736 | 2 | 183 | 116 | 0.249 | 0.158 | 0.406 | 2 |
| | 9,633 | 42 | 1,958 | 42 | 0.203 | 0.004 | 0.208 | 47 |
| | 6,743 | 43 | 1,343 | 32 | 0.199 | 0.005 | 0.204 | 42 |
| | 10,861 | 42 | 2,144 | 52 | 0.197 | 0.005 | 0.202 | 41 |
| | 9,092 | 36 | 1,732 | 42 | 0.190 | 0.005 | 0.195 | 41 |
| | 28,539 | 41 | 5,569 | 136 | 0.195 | 0.005 | 0.200 | 41 |
| | 6,450 | 30 | 1,240 | 40 | 0.192 | 0.006 | 0.198 | 31 |
| | 11,677 | 47 | 2,278 | 47 | 0.195 | 0.004 | 0.199 | 48 |
| | 4,767 | 33 | 889 | 31 | 0.186 | 0.007 | 0.193 | 29 |
| | 6,900 | 36 | 1,358 | 50 | 0.197 | 0.007 | 0.204 | 27 |
| | 37,347 | 40 | 7,265 | 184 | 0.195 | 0.005 | 0.199 | 39 |
| | 8,483 | 34 | 1,693 | 41 | 0.200 | 0.005 | 0.204 | 41 |
| | 7,034 | 44 | 1,402 | 32 | 0.199 | 0.005 | 0.204 | 44 |
| | 6,580 | 31 | 1,306 | 41 | 0.198 | 0.006 | 0.205 | 32 |
| | 35,790 | 42 | 6,980 | 167 | 0.195 | 0.005 | 0.200 | 42 |
| | 8,958 | 31 | 1,704 | 63 | 0.190 | 0.007 | 0.197 | 27 |
| | 25,900 | 41 | 5,119 | 123 | 0.198 | 0.005 | 0.202 | 42 |
| | 6,371 | 46 | 1,245 | 25 | 0.195 | 0.004 | 0.199 | 50 |
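The test-set columns in the table above appear to be derived from three counts per kinase: the size of the whole dataset and the number of class 0 and class 1 samples placed in the test set. The following is a minimal sketch of that arithmetic, assuming the proportion columns are fractions of the kinase's whole dataset and the ratio is class 0 divided by class 1; the function name `split_summary` is hypothetical and not from the paper.

```python
def split_summary(total, test_neg, test_pos):
    """Summarize how one kinase is represented in the test set.

    total    -- number of samples for this kinase in the whole dataset
    test_neg -- class 0 (non-binder) samples in the test set
    test_pos -- class 1 (binder) samples in the test set
    """
    return {
        # Fractions are taken relative to the whole dataset (assumption).
        "frac_neg": round(test_neg / total, 3),
        "frac_pos": round(test_pos / total, 3),
        "frac_test": round((test_neg + test_pos) / total, 3),
        # Class imbalance in the test set, rounded to a whole number.
        "ratio_neg_to_pos": round(test_neg / test_pos),
    }


# Counts taken from the first row of the table.
row = split_summary(total=11180, test_neg=2105, test_pos=60)
print(row)
```

With the first row's counts this gives roughly 0.188, 0.005, 0.194 and a 35:1 ratio, which agrees with the tabulated values and illustrates how strongly the test set is skewed toward non-binders.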
Evaluation Models
| Model | Feature Set | Model | Feature Set | Model | Feature Set |
|---|---|---|---|---|---|
| 1 | FS1 | 3 | FS1 + FS3 | 5 | FS1 + FS2 + FS3 + FS4 |
| 2 | FS4 | 4 | FS1 + FS3 + FS4 | 6 | all features |
Metrics used in this study
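| Name | Definition | Formula |
|---|---|---|
| Youden’s index | Performance of a dichotomous test, ranging from -1 to 1. A value of 1 indicates a perfect test and 0 indicates a test with no discriminative value. | $J = \text{sensitivity} + \text{specificity} - 1$ |
| F1 | Harmonic mean of precision and recall | $F_1 = \dfrac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$ |
| Precision | Positive predictive value | $\dfrac{TP}{TP + FP}$ |
| Recall | True positive rate | $\dfrac{TP}{TP + FN}$ |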
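The four metrics listed above can all be computed from the counts of a binary confusion matrix. The sketch below assumes the standard textbook definitions (Youden's index as sensitivity plus specificity minus one); the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, F1 and Youden's index from confusion-matrix counts."""
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    youden = recall + specificity - 1
    return {"precision": precision, "recall": recall, "f1": f1, "youden": youden}


# Hypothetical counts for an imbalanced screen: 50 binders, 950 non-binders.
m = classification_metrics(tp=40, fp=10, tn=940, fn=10)
print(m)
```

Because non-binders vastly outnumber binders in this kind of data, precision and F1 for the positive class are more informative than overall accuracy, which is why they are reported per class.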
Youden’s Index.
| Kinase | Youden’s index | Best docking score | Kinase | Youden’s index | Best docking score |
|---|---|---|---|---|---|
| | 0.35 | -9.1 | | 0.29 | -8.9 |
| akt1 | 0.02 | -8.0 | | 0.45 | -8.0 |
| | 0.22 | -8.5 | | 0.51 | -8.9 |
| | 0.53 | -9.6 | | 0.60 | -9.1 |
| | 0.33 | -8.2 | | 0.40 | -8.7 |
| csf1r | 0.23 | -8.9 | | 0.28 | -8.5 |
| | 0.15 | -8.7 | | 0.13 | -7.8 |
| | 0.52 | -8.4 | | 0.20 | -8.6 |
| | 0.01 | -8.0 | | 0.40 | -7.7 |
| igf1r | 0.46 | -8.4 | | 0.17 | -8.1 |
| | 0.37 | -9.3 | | 0.56 | -9.6 |
| | 0.24 | -8.5 | | 0.36 | -9.0 |
| | 0.33 | -8.5 | | 0.76 | -10.0 |
| | 0.23 | -8.6 | | | |
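One common way to arrive at a per-kinase docking cutoff like those tabulated above is to scan candidate cutoffs and keep the one with the highest Youden's index. The sketch below illustrates that idea under stated assumptions: more negative docking scores indicate stronger predicted binding, a compound is called a binder when its score is at or below the cutoff, and the scores and labels are made up for illustration. It is not necessarily the exact procedure used in the study.

```python
def best_cutoff_by_youden(scores, labels):
    """Return (cutoff, youden_index) for the cutoff that maximizes Youden's index.

    scores -- docking scores, where lower (more negative) means better predicted binding
    labels -- 1 for a known binder, 0 for a non-binder
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one binder and one non-binder")

    best = None
    for cutoff in sorted(set(scores)):
        # Predict "binder" for every compound scoring at or below the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
        # sensitivity + specificity - 1 equals TPR - FPR.
        j = tp / pos - fp / neg
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Made-up docking scores and labels for a single kinase.
scores = [-10.2, -9.5, -9.1, -8.7, -8.0, -7.6, -7.1, -6.5]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
cutoff, j = best_cutoff_by_youden(scores, labels)
print(cutoff, round(j, 2))
```

A low maximum Youden's index for a kinase, as in several rows of the table, would mean that no single docking cutoff separates binders from non-binders well for that target.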
Figure 1: PCA of FS1 and FS4
Figure 2: PCA of FS2 and FS3
Comparison of performance for both classes on the testing set.
| Model | Class | Precision | Recall | F1-Score | Class | Precision | Recall | F1-Score |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1.00 | 1.00 | 1.00 | 1 | 0.83 | 0.92 | 0.87 |
| 2 | 0 | 0.98 | 0.98 | 0.98 | 1 | 0.26 | 0.30 | 0.28 |
| 3 | 0 | 1.00 | 1.00 | 1.00 | 1 | 0.83 | 0.92 | 0.87 |
| 4 | 0 | 1.00 | 1.00 | 1.00 | 1 | 0.83 | 0.91 | 0.87 |
| 5 | 0 | 1.00 | 1.00 | 1.00 | 1 | 0.84 | 0.92 | 0.88 |
| 6 | 0 | 1.00 | 1.00 | 1.00 | 1 | 0.85 | 0.89 | 0.87 |
| Docking | 0 | 0.99 | 0.58 | 0.73 | 1 | 0.04 | 0.67 | 0.07 |
Evaluation metrics per kinase for the positive class
| Kinase | Per kinase: Prec. | Per kinase: Recall | Per kinase: F1 | Leave-one-out: Prec. | Leave-one-out: Recall | Leave-one-out: F1 | Docking: Prec. | Docking: Recall | Docking: F1 |
|---|---|---|---|---|---|---|---|---|---|
| | 0.92 | 0.95 | 0.93 | 0.86 | 0.95 | 0.90 | 0.06 | 0.72 | 0.11 |
| | 0.86 | 0.96 | 0.91 | 0.84 | 0.81 | 0.82 | 0.03 | 0.29 | 0.05 |
| | 0.94 | 0.94 | 0.94 | 0.83 | 0.80 | 0.81 | 0.04 | 0.59 | 0.07 |
| | 0.85 | 0.98 | 0.91 | 0.76 | 0.85 | 0.80 | 0.06 | 0.80 | 0.11 |
| | 0.85 | 0.77 | 0.81 | 0.60 | 0.31 | 0.41 | 0.04 | 0.78 | 0.08 |
| | 0.82 | 0.83 | 0.83 | 0.62 | 0.59 | 0.60 | 0.03 | 0.63 | 0.06 |
| | 0.85 | 0.91 | 0.88 | 0.76 | 0.75 | 0.75 | 0.03 | 0.81 | 0.05 |
| | 0.75 | 0.86 | 0.80 | 0.80 | 0.77 | 0.78 | 0.03 | 0.86 | 0.06 |
| | 0.93 | 1.00 | 0.97 | 0.90 | 0.92 | 0.91 | 0.05 | 0.86 | 0.09 |
| | 0.97 | 0.94 | 0.95 | 0.87 | 0.77 | 0.82 | 0.06 | 0.50 | 0.10 |
| | 0.83 | 0.92 | 0.87 | 0.74 | 0.88 | 0.81 | 0.05 | 0.58 | 0.08 |
| | 0.70 | 0.93 | 0.80 | 0.45 | 0.32 | 0.37 | 0.04 | 0.86 | 0.08 |
| | 0.94 | 0.93 | 0.94 | 0.80 | 0.83 | 0.82 | 0.05 | 0.49 | 0.09 |
| | 0.93 | 0.97 | 0.95 | 0.72 | 0.37 | 0.49 | 0.06 | 0.65 | 0.10 |
| | 0.87 | 0.96 | 0.91 | 0.75 | 0.77 | 0.76 | 0.05 | 0.81 | 0.09 |
| | 0.91 | 0.97 | 0.94 | 0.78 | 0.69 | 0.73 | 0.16 | 0.71 | 0.26 |
| | 0.88 | 0.86 | 0.87 | 0.67 | 0.51 | 0.58 | 0.06 | 0.82 | 0.11 |
| | 0.90 | 0.85 | 0.88 | 0.76 | 0.50 | 0.60 | 0.05 | 0.56 | 0.09 |
| | 0.80 | 0.95 | 0.87 | 0.59 | 0.48 | 0.53 | 0.03 | 0.93 | 0.05 |
| | 0.82 | 0.84 | 0.83 | 0.73 | 0.50 | 0.59 | 0.03 | 0.91 | 0.06 |
| | 0.88 | 0.85 | 0.86 | 0.63 | 0.32 | 0.42 | 0.05 | 0.88 | 0.10 |
| | 0.93 | 0.97 | 0.95 | 0.86 | 0.87 | 0.86 | 0.03 | 0.88 | 0.06 |
| | 0.94 | 0.98 | 0.96 | 0.89 | 0.83 | 0.86 | 0.09 | 0.84 | 0.17 |
| | 0.93 | 0.90 | 0.92 | 0.80 | 0.79 | 0.80 | 0.05 | 0.65 | 0.10 |
| | 0.93 | 1.00 | 0.96 | 0.75 | 0.58 | 0.65 | 0.14 | 0.80 | 0.24 |