Kathryn E Kirchoff (1), Shawn M Gomez (2,3)
1. Department of Computer Science, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
2. Department of Pharmacology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
3. Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill and North Carolina State University, Chapel Hill, NC, 27599, USA.
Abstract
MOTIVATION: Kinase-catalyzed phosphorylation of proteins forms the backbone of signal transduction within the cell, enabling the coordination of numerous processes such as the cell cycle, apoptosis, and differentiation. While on the order of 10^5 phosphorylation events have been described, the specific kinase responsible is known for less than 5% of them. The ability to predict which kinases initiate individual phosphorylation events has the potential to greatly enhance the design of downstream experimental studies, while simultaneously creating a preliminary map of the broader phosphorylation network that controls cellular signaling.
RESULTS: We describe EMBER, a deep learning method that integrates kinase-phylogeny information and motif-dissimilarity information into a multi-label classification model for the prediction of kinase-motif phosphorylation events. Unlike previous deep learning methods that perform single-label classification, we restate the task of kinase-motif phosphorylation prediction as a multi-label problem, allowing us to train a single unified model rather than a separate model for each of the 134 kinase families. We use a Siamese network to generate novel vector representations (an embedding) of motif sequences, and we compare this embedding to a previously proposed peptide embedding. The motif vector representations are used, along with one-hot encoded motif sequences, as input to a classification network, and kinase phylogenetic relationships are incorporated into the model via a kinase phylogeny-weighted loss function. Results suggest that this approach holds significant promise for improving our map of the phosphorylation relations that underlie kinome signaling.
AVAILABILITY: https://github.com/gomezlab/EMBER.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
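To make the multi-label framing concrete, the following is a minimal sketch, not the EMBER implementation. It shows how a motif might be one-hot encoded, how a single target vector can mark several kinase families at once, and how per-family weights can scale a binary cross-entropy loss. The abstract does not specify how the phylogeny-derived weights are computed, so the `weights` argument, the function names, the example motif, and the three-family label set are all hypothetical and chosen only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_motif(motif):
    """Encode a motif as a flat list of 0/1 values, 20 entries per position."""
    encoding = []
    for residue in motif:
        row = [0.0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:  # unknown symbols (e.g. a padding character) stay all-zero
            row[idx] = 1.0
        encoding.extend(row)
    return encoding


def multi_label_target(positive_families, all_families):
    """One target vector per motif; several families may be positive at once."""
    return [1.0 if fam in positive_families else 0.0 for fam in all_families]


def weighted_bce(probs, targets, weights, eps=1e-7):
    """Mean per-label binary cross-entropy, each label scaled by its weight.

    ASSUMPTION: `weights` stands in for phylogeny-derived weights; how EMBER
    actually derives or applies them is not described in the abstract.
    """
    total = 0.0
    for p, t, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical usage: three illustrative families instead of the full 134.
families = ["AKT", "CDK", "MAPK"]
x = one_hot_motif("PLSPTRL")                      # made-up 7-residue motif
y = multi_label_target({"CDK", "MAPK"}, families)  # two positive labels
loss = weighted_bce([0.1, 0.8, 0.6], y, weights=[1.0, 0.5, 0.5])
```

Because every family is a separate output of the same network, a single forward pass scores all families for a motif, which is what allows one unified model in place of one model per family.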