Liang-Chin Huang1, Karen E Ross2, Timothy R Baffi3, Harold Drabkin4, Krzysztof J Kochut5, Zheng Ruan1, Peter D'Eustachio6, Daniel McSkimming7, Cecilia Arighi8, Chuming Chen8, Darren A Natale2, Cynthia Smith4, Pascale Gaudet9, Alexandra C Newton3, Cathy Wu2,8, Natarajan Kannan10.
Abstract
Many bioinformatics resources with unique perspectives on the protein landscape are currently available. However, generating new knowledge from these resources requires interoperable workflows that support cross-resource queries. In this study, we employ federated queries linking information from the Protein Kinase Ontology, iPTMnet, Protein Ontology, neXtProt, and Mouse Genome Informatics to identify key knowledge gaps in the functional coverage of the human kinome and prioritize understudied kinases, cancer variants and post-translational modifications (PTMs) for functional studies. We identify 32 functional domains enriched in cancer variants and PTMs and generate mechanistic hypotheses on overlapping variant and PTM sites by aggregating information at the residue, protein, pathway and species level from these resources. We experimentally test the hypothesis that S768 phosphorylation in the C-helix of EGFR is inhibitory by showing that oncogenic variants altering S768 phosphorylation increase basal EGFR activity. In contrast, oncogenic variants altering conserved phosphorylation sites in the 'hydrophobic motif' of PKCβII (S660F and S660C) are loss-of-function in that they reduce kinase activity and enhance membrane translocation. Our studies provide a framework for integrative, consistent, and reproducible annotation of the cancer kinomes.
Year: 2018 PMID: 29695735 PMCID: PMC5916945 DOI: 10.1038/s41598-018-24457-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. A framework for aggregate queries and integrative annotation. Integrative annotation across five resources (ProKinO, PRO, iPTMnet, neXtProt, and MGI) is built using SPARQL federated queries against four SPARQL endpoints; iPTMnet data is available by querying through PRO. The types of information held in each RDF data resource are shown. PTM: post-translational modification; PPI: protein-protein interaction; GO: Gene Ontology.
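As a rough illustration of what a federated query of this kind looks like, the sketch below assembles a single SPARQL 1.1 query that contains one `SERVICE` block per remote endpoint, joined on a shared variable. It only builds the query text and does not contact any server. The endpoint URLs, the `ex:` vocabulary, and the helper name `build_federated_query` are placeholders invented for this example; they are not the actual ProKinO, PRO, neXtProt, or MGI endpoints or schemas.

```python
from textwrap import indent

# Placeholder vocabulary (assumption): not a real ontology namespace.
PREFIX = "PREFIX ex: <https://example.org/vocab#>"


def build_federated_query(select_vars, service_patterns):
    """Compose one SPARQL 1.1 query with a SERVICE block per endpoint.

    service_patterns maps an endpoint URL to the graph pattern that
    should be evaluated at that endpoint.
    """
    blocks = []
    for endpoint, pattern in service_patterns.items():
        body = indent(pattern.strip(), "    ")
        blocks.append(f"  SERVICE <{endpoint}> {{\n{body}\n  }}")
    head = "SELECT " + " ".join(f"?{v}" for v in select_vars)
    return PREFIX + "\n" + head + "\nWHERE {\n" + "\n".join(blocks) + "\n}"


# Hypothetical endpoints and predicates; the shared ?acc variable
# (a UniProt accession) is what joins the two remote result sets.
query = build_federated_query(
    ["kinase", "ptm_site"],
    {
        "https://example.org/prokino/sparql":
            "?kinase ex:hasUniProtId ?acc .",
        "https://example.org/pro/sparql":
            "?proteoform ex:hasUniProtId ?acc ;\n"
            "            ex:hasPtmSite ?ptm_site .",
    },
)
print(query)
```

The join across resources happens because both `SERVICE` patterns bind the same variable; in a real deployment the patterns would have to follow each resource's published RDF schema.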
Figure 2. NIH metric versus annotation score. Scatter plots of annotation score ω versus the three measurements (in log scale) used in the NIH metric: (a) Jensen score, (b) R01 count, and (c) PubTator score, respectively. Spearman’s rank correlation coefficient ρ is shown in each scatter plot. (d) Visualization of the three annotation categories in the human kinome tree. BS/Low: better-studied protein kinases with low ω (red nodes); US/High: under-studied protein kinases with high ω (green nodes); US/Low: under-studied protein kinases with low ω (blue nodes); all the unmarked protein kinases are better-studied protein kinases with high ω, such as EGFR. Node size scales with the ω score. The five proteins with the highest ω in the US/High category are labelled; the five proteins with the lowest ω in the BS/Low and US/Low categories, respectively, are also labelled. The human kinome tree was generated using KinMap[43]. Illustration is reproduced courtesy of Cell Signaling Technology, Inc. (http://www.cellsignal.com).
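For readers unfamiliar with the statistic reported in panels (a) to (c), the following minimal sketch computes Spearman's rank correlation as the Pearson correlation of rank-transformed values, with tied values sharing their mean rank. The numbers are made-up toy values, not data from the study, and the function names are chosen for this example only.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy values (assumption): five kinases with an annotation score and a count.
omega = [0.2, 0.9, 1.4, 2.1, 3.0]
r01_count = [1, 3, 2, 10, 40]
rho = spearman_rho(omega, r01_count)
print(round(rho, 3))
```

Because the coefficient depends only on ranks, plotting the measurements on a log scale changes the appearance of the scatter plot but not ρ.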
Domains enriched in both disease variants and PTMs. Adj-PV: adjusted p-value; Average score: an average of VEA score and MEA score; bold text: protein kinase domains (either Pkinase or Pkinase_Tyr).
| Gene Symbol | UniProtKB ID | Domain | VEA Adj-PV | MEA Adj-PV | Average Score |
|---|---|---|---|---|---|
| EPHA7 | | | | | |
| ZAP70 | | | | | |
| AKT1 | P31749 | PH | <1.00E-14 | 2.12E-04 | 8.837 |
| EGFR | P00533 | GF_recep_IV | <1.00E-14 | 1.68E-03 | 8.387 |
| PAK2 | Q13177 | PBD | 2.15E-11 | 1.31E-03 | 6.775 |
| PRKCB | P05771 | Pkinase_C | 3.68E-08 | 6.94E-05 | 5.797 |
| EGFR | P00533 | Recep_L_domain | 6.07E-07 | 3.74E-04 | 4.822 |
| EPHB2 | P29323 | EphA2_TM | 4.08E-03 | 6.38E-07 | 4.293 |
| EPHA3 | P29320 | EphA2_TM | 5.75E-06 | 9.32E-04 | 4.136 |
| TTN | Q8WZ42 | PPAK | 1.93E-04 | 4.29E-05 | 4.041 |
| EPHB1 | P54762 | EphA2_TM | 3.75E-04 | 1.60E-04 | 3.612 |
| BRDT | Q58F21 | Bromodomain | 3.72E-04 | 1.66E-03 | 3.104 |
| TTN | Q8WZ42 | fn3 | 7.85E-04 | 9.89E-04 | 3.055 |
| EPHA7 | Q15375 | EphA2_TM | 2.05E-03 | 5.65E-04 | 2.968 |
| PRKCQ | Q04759 | Pkinase_C | 3.83E-03 | 1.08E-03 | 2.691 |
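The table legend defines the average score as the mean of the VEA and MEA scores but does not say how those scores are derived. The sketch below checks one reading that happens to reproduce the tabulated values: each score taken as -log10 of the adjusted p-value, with entries reported as `<1.00E-14` capped at 1e-14. Both the -log10 transform and the cap are assumptions inferred from the numbers in the table, not something stated in the source.

```python
import math

# Assumption: p-values reported as "<1.00E-14" are treated as exactly 1e-14.
P_FLOOR = 1e-14


def enrichment_score(adj_p):
    """-log10 of an adjusted p-value, capped at the reporting floor."""
    return -math.log10(max(adj_p, P_FLOOR))


def average_score(vea_adj_p, mea_adj_p):
    """Mean of the VEA and MEA scores under the assumed transform."""
    return (enrichment_score(vea_adj_p) + enrichment_score(mea_adj_p)) / 2


# (VEA Adj-PV, MEA Adj-PV, reported Average Score), copied from the table.
rows = {
    "AKT1 PH": (P_FLOOR, 2.12e-04, 8.837),
    "EGFR GF_recep_IV": (P_FLOOR, 1.68e-03, 8.387),
    "PAK2 PBD": (2.15e-11, 1.31e-03, 6.775),
    "PRKCB Pkinase_C": (3.68e-08, 6.94e-05, 5.797),
    "PRKCQ Pkinase_C": (3.83e-03, 1.08e-03, 2.691),
}

for name, (vea, mea, reported) in rows.items():
    computed = average_score(vea, mea)
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

Small differences in the last decimal place are expected, since the tabulated p-values are themselves rounded to three significant figures.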
Figure 3. Annotation scores of protein kinases with enriched domains and visualization of the enriched domains in the human kinome tree. (a) Data volume, variety, and annotation score ω for each protein kinase with enriched VEA and MEA. The name of the protein kinase is shown in the first column in red if it is better-studied or green if under-studied; gradient colour (red for the maximum and blue for the minimum in the whole human kinome) indicates the data volume for each variable (columns 2–16) and ω (last column). A threshold of high ω (>1.00) is shown by a red line. (b) Protein kinases with functional domains with adjusted p-values in VEA or MEA less than 0.005 are represented by red and blue nodes, respectively; node size represents the score in VEA or MEA. Protein names are labelled only if any of their functional domains shows significant adjusted p-values in both VEA and MEA; black labels represent proteins with enriched protein kinase domains (either Pkinase or Pkinase_Tyr), while gray labels show proteins that only have other enriched domains; enriched domains are shown in parentheses. The human kinome tree was generated using KinMap[43]. Illustration is reproduced courtesy of Cell Signaling Technology, Inc. (http://www.cellsignal.com).
Figure 4. Variants and PTMs in the six most-enriched protein kinase domains. X-axis: sequence position and corresponding PKA position/subdomain/motif; y-axis: mutation frequency (logarithmic scale) across all cancer samples in COSMIC (exact frequency is labelled if it is greater than 100); red dot: phosphorylation; green dot: ubiquitination; purple dot: acetylation; dot on top of the bar: mutation-PTM overlapping site; Sub I~XI: subdomain I to XI; β1~5: beta 1 to beta 5 strands; C~I: C-helix to I-helix; GLY: glycine-rich loop; αC: alphaC-beta4 loop; Link: linker; CAT: catalytic loop; Activation: activation loop.
Integrative annotation for case studies.
| Gene | Position | Annotation | Variable | Value | Service |
|---|---|---|---|---|---|
| EGFR | | Gene level | Cellular Component | Endosome membrane | neXtProt/MGI |
| | | | | Plasma membrane | neXtProt/MGI |
| | | | | basolateral plasma membrane | neXtProt/MGI |
| | | | Reaction | EGFR dimerization | ProKinO |
| | | | Complex | EGF:EGFR dimer [plasma membrane] | ProKinO |
| | | | | L1-EGFR trans-heterodimer | PRO |
| | | | Pathway | Constitutive PI3K/AKT Signaling in Cancer | ProKinO |
| | | | Molecular Function | protein localization to nucleus | PRO |
| | | | | Protein binding | neXtProt/MGI |
| | | | | Protein phosphatase binding | neXtProt/MGI |
| | | | Biological Process | Positive regulation of cell proliferation | neXtProt/MGI |
| | | | | epidermal growth factor receptor signaling pathway | neXtProt/MGI |
| | | | Phenotype | dilated respiratory conducting tubes | MGI |
| | | | | respiratory distress | MGI |
| | | | | abnormal lung interstitium morphology | MGI |
| | | | | abnormal lung development | MGI |
| | S768 | Residue level (General) | PKA Position | 98 | ProKinO |
| | | | Motif | N-lobe | ProKinO |
| | | | | C_helix | ProKinO |
| | | | | subdomainIII | ProKinO |
| | | Residue level (Mutation) | Mutation Count | 252 | ProKinO |
| | | | Mutant Type (MT) | I | ProKinO/neXtProt |
| | | | | G | ProKinO/neXtProt |
| | | | SNP | (MT = 'I') rs397517108 | neXtProt |
| | | | | (MT = 'I') rs121913465 | neXtProt |
| | | | | (MT = 'G') rs756614898 | neXtProt |
| | | | Mutation Description | (MT = 'I') higher levels of basal autophosphorylation | neXtProt |
| | | Residue level (PTM) | Proteoform | PR:000049851; S768-phosphorylated form; inhibitory effect on EGFR kinase activity | PRO |
| | | | Modification | Phosphorylation | PRO/iPTMnet |
| | | | Enzyme | CAMK2A | iPTMnet |
| | | | Equivalent Mouse Site | Q01279 S770: no evidence for modification | iPTMnet |
| PRKCB (PKCβ) | | Gene level | Cellular Component | Cytoplasm | neXtProt/MGI |
| | | | | Plasma membrane | neXtProt/MGI |
| | | | Reaction | PRKCB binds diacylglycerol and phosphatidylserine | ProKinO |
| | | | Complex | Activated PKC beta [plasma membrane] | ProKinO |
| | | | Pathway | Glioma | neXtProt |
| | | | | Pathways in cancer | neXtProt |
| | | | | VEGFR2 mediated cell proliferation | neXtProt |
| | | | Biological Process | Positive regulation of B cell receptor signaling pathway | neXtProt/MGI |
| | | | | Positive regulation of I-kappaB kinase/NF-kappaB signaling | neXtProt/MGI |
| | | | | Positive regulation of NF-kappaB transcription factor activity | neXtProt/MGI |
| | | | | Positive regulation of vascular endothelial growth factor receptor signaling pathway | neXtProt/MGI |
| | | | | Positive regulation of angiogenesis | neXtProt/MGI |
| | S661 (S660 in PKCβII) | Residue level (General) | Functional Domain | AGC-kinase C-terminal | neXtProt |
| | | Residue level (Mutation) | Mutation Count | 6 | ProKinO |
| | | | Mutant Type (MT) | F | ProKinO |
| | | | | C | ProKinO/neXtProt |
| | | Residue level (PTM) | Proteoform | PR:000049877; S654 & S660-phosphorylated isoform-betaII; mouse | PRO |
| | | | | PR:000049878; S660 & S664-phosphorylated isoform-betaII | PRO |
| | | | | PR:000049879; S660 & S673-phosphorylated isoform-betaII; mouse | PRO |
| | | | Modification | Phosphorylation | PRO/iPTMnet |
| | | | Enzyme | PRKCB | iPTMnet |
| | | | Equivalent Mouse Site | P68404 Prkcb S660; phosphorylated | PRO/iPTMnet |
| | T642 & S661 (T641 & S660 in PKCβII) | Residue level (PTM) | Proteoform | PR:000049855; T500, T641, S660-phosphorylated PRKCB-2; increased protein phosphorylation; increased localization to cytoplasm | PRO |
| | | | Modification | Phosphorylation | PRO |
| PRKCQ (PKCθ) | | Gene level | Cellular Component | Cytoplasm | MGI |
| | | | | Plasma membrane | MGI |
| | | | Reaction | Autophosphorylation of PKC-theta | ProKinO |
| | | | Complex | Active PKC theta bound to DAG [plasma membrane] | ProKinO |
| | | | Biological Process | Negative regulation of T cell apoptotic process | neXtProt/MGI |
| | | | | Positive regulation of T cell activation | neXtProt/MGI |
| | | | | Positive regulation of T cell proliferation | neXtProt/MGI |
| | | | | Positive regulation of telomerase activity | neXtProt |
| | | | | Positive regulation of telomere maintenance via telomerase | neXtProt |
| | S695 | Residue level (General) | Functional Domain | AGC-kinase C-terminal | neXtProt |
| | | Residue level (Mutation) | Mutation Count | 1 | ProKinO |
| | | | Mutant Type (MT) | F | ProKinO/neXtProt |
| | | Residue level (PTM) | Proteoform | PR:000049857; S695-phosphorylated form; increased protein serine/threonine kinase activity | PRO |
| | | | Modification | Phosphorylation | PRO/iPTMnet |
| | | | Equivalent Mouse Site | Q02111 S695; phosphorylated | iPTMnet |
Figure 5. Biochemical impact of S768 phosphorylation in EGFR and S660 mutations in PKCβII. (a) Phospho-mimic mutations (S768D, S768E) and the oncogenic mutation (S768I) alter phosphorylation of the EGFR C-terminal tail (Y1197) and activation loop (Y845). Phosphorylation of the EGFR downstream signalling protein STAT3 is also perturbed by the mutations. Each blot was obtained independently using the designated antibody from the same cell lysate samples. An exposure time of 5 min was used for all blots. (b) Structural models comparing the unphosphorylated and S768-phosphorylated EGFR dimer. The 3D structures were rendered with PyMOL[44]. (c) Normalized FRET-ratio changes (mean ± SEM) showing agonist-induced activity of mCherry-tagged PKCβII wild-type (WT) or PKCβII hydrophobic (HF) motif mutants S660C and S660F in COS-7 cells co-expressing CKAR and treated with UTP (100 μM), followed by PDBu (200 nM). mCherry intensity (inset) reflects expression levels of overexpressed PKC protein. Data are from three independent experiments. (d) Representative mCherry images of PKC translocation in COS-7 cells before agonist addition (Basal, t = 0 min), after UTP addition (UTP, t = 12 min), and after PDBu addition (PDBu, t = 17 min). mCherry vector was used as a negative control.
Figure 6. Mutation and PTM sites in the C-terminal tail domain of AGC kinases. (a) Mutation and PTM sites in enriched protein kinase C-terminal tail domains. X-axis: sequence position and corresponding PKA position/subdomain/motif; y-axis: mutation frequency across all cancer samples in COSMIC; red dot: phosphorylation; yellow dot: methylation; dot on top of the bar: mutation-PTM overlapping site; AST: active-site tether subdomain; NLT: N-lobe tether subdomain; NFD: NFD motif; T: turn motif; HF: hydrophobic motif. (b) Mutation and phosphorylation sites in all protein kinase C-terminal tail domains. Sequences of all protein kinase C-terminal tail domains defined by Pfam are aligned to the PKA sequence using MAFFT; black rectangles: subdomains; red rectangles: motifs; residues marked in blue: mutation sites (frequency is ignored); residues marked in red: phosphorylation sites; residues marked in purple: mutation-phosphorylation overlapping sites.