You-Wei Fan (1), Wan-Hsin Liu (2,3,4), Yun-Ti Chen (2), Yen-Chao Hsu (2), Nikhil Pathak (5), Yu-Wei Huang (6), Jinn-Moon Yang (7,8).
1. Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 30050, Taiwan.
2. Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050, Taiwan.
3. Institute of Information Science, Academia Sinica, Taipei, 11564, Taiwan.
4. Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11564, Taiwan.
5. Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, 30044, Taiwan.
6. Institute of Biomedical Engineering, National Chiao Tung University, Hsinchu, 30050, Taiwan.
7. Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 30050, Taiwan. moon@faculty.nctu.edu.tw.
8. Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 30050, Taiwan. moon@faculty.nctu.edu.tw.
Abstract
BACKGROUND: Human protein kinases are known to mediate most signal transduction in cells, and their dysfunction can result in inflammatory diseases and cancers, yet finding effective kinase inhibitors as drugs for these diseases remains a challenge. One major obstacle is the compensatory upregulation of related kinases after a critical kinase is inhibited. To circumvent this compensatory effect, it is desirable to have inhibitors that inhibit all the kinases belonging to the same family, instead of targeting only a few kinases. However, finding inhibitors that target a whole kinase family is laborious and time-consuming in the wet lab.
RESULTS: In this paper, we present a computational approach that uses interpretable deep learning models to address this challenge. Specifically, we first collected 9,037 inhibitor bioassay results (3,991 active and 5,046 inactive pairs) for eight kinase families (EGFR, Jak, GSK, CLK, PIM, PKD, Akt and PKG) from the ChEMBL25 Database and the Metz Kinase Profiling Data. We generated 238 binary moiety features for each inhibitor and used these features as input to train eight deep neural network (DNN) models, one per kinase family, each predicting whether an inhibitor is active against that family. We then employed SHapley Additive exPlanations (SHAP) to analyze the importance of each moiety feature in each classification model, identifying moieties that are in the common kinase hinge sites across the eight kinase families, as well as moieties that are specific to some kinase families. Finally, we validated the identified moieties using experimental crystal structures to reveal their functional importance in kinase inhibition.
CONCLUSION: With the SHAP methodology, we identified two moieties common to the eight kinase families, nine EGFR-specific moieties, and six Akt-specific moieties that bear functional importance in kinase inhibition. Our results suggest that SHAP has the potential to help find effective pan-kinase family inhibitors.
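To make the attribution step concrete, the following is a minimal sketch of how a Shapley value assigns importance to binary moiety features. It is not the authors' pipeline: the three moiety names, the `toy_model` scoring function, and the all-zeros baseline are hypothetical stand-ins for the 238 features and trained DNNs described above. It also computes exact Shapley values by enumerating feature subsets, which is only feasible for a handful of features, whereas the SHAP library relies on approximations.

```python
from itertools import combinations
from math import factorial

# Hypothetical moiety names, for illustration only (the paper uses 238 features).
MOIETIES = ["hinge_binder", "family_specific", "inert_group"]


def toy_model(x):
    """Hypothetical stand-in for a trained classifier: returns an activity score."""
    hinge, specific, _inert = x
    # The interaction term mimics two moieties that reinforce each other.
    return 0.1 + 0.5 * hinge + 0.2 * specific + 0.2 * hinge * specific


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x, relative to a baseline input."""
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take their value from x,
                # all remaining features take the baseline value.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                values[i] += weight * (model(with_i) - model(without_i))
    return values


if __name__ == "__main__":
    inhibitor = [1, 1, 1]   # all three hypothetical moieties present
    baseline = [0, 0, 0]    # assumption: "no moiety present" as the reference
    for name, phi in zip(MOIETIES, shapley_values(toy_model, inhibitor, baseline)):
        print(f"{name}: {phi:.3f}")
```

In this toy setting the attributions sum to the difference between the model output for the inhibitor and for the baseline, and a moiety the model ignores receives zero. Ranking moieties by such attributions across per-family models is the kind of comparison the abstract describes for separating common from family-specific moieties.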