Motonori Ota1, Hideki Gonja1, Ryotaro Koike1, Satoshi Fukuchi2.
Abstract
Protein-protein interactions are fundamental for all biological phenomena, and protein-protein interaction networks provide a global view of the interactions. The hub proteins, with many interaction partners, play vital roles in the networks. We investigated the subcellular localizations of proteins in the human network, and found that the ones localized in multiple subcellular compartments, especially the nucleus/cytoplasm proteins (NCP), the cytoplasm/cell membrane proteins (CMP), and the nucleus/cytoplasm/cell membrane proteins (NCMP), tend to be hubs. Examinations of keywords suggested that among NCP, those related to post-translational modifications and transcription functions are the major contributors to the large number of interactions. These types of proteins are characterized by a multi-domain architecture and intrinsic disorder. A survey of the typical hub proteins of this type with prominent numbers of interaction partners revealed that most are either transcription factors or co-regulators involved in signaling pathways. They translocate from the cytoplasm to the nucleus, triggered by the phosphorylation and/or ubiquitination of intrinsically disordered regions. Among CMP and NCMP, the contributors to the numerous interactions are related to either kinase or ubiquitin ligase activity. Many of them reside on the cytoplasmic side of the cell membrane, and act as the upstream regulators of signaling pathways. Overall, these hub proteins function to transfer external signals to the nucleus, through the cell membrane and the cytoplasm. Our analysis suggests that multiple-localization is a crucial concept to characterize groups of hub proteins and their biological functions in cellular information processing.
Year: 2016 PMID: 27285823 PMCID: PMC4902230 DOI: 10.1371/journal.pone.0156455
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The statistics of the subcellular localizations and the numbers of interactions.
A. The number of proteins (observations) in each subcellular localization and the average number of interactions are indicated, in descending order of the observations. The data for NP, CP, MP, NCP, CMP, and NCMP are colored red, blue, orange, magenta, green and brown, respectively. None (localization) means that no subcellular localization was denoted in Uniprot, and None (Uniprot) means that no Uniprot entry was assigned to the HPRD entry. The observations represented by the short bars are shown as numbers, and the number by the bottom bar is the total number of HPRD entries. B. The distribution of the number of interactions in the log-log plot. When the number of interactions was zero, the data are shown at 0.1 interactions. The distribution was approximated by a power function, using the data from 1 to 100 interactions. The scaling exponents are -2.01 and -1.37 for the distributions of all proteins and the NCP, respectively. C. The statistics of the numbers of subcellular compartments for each HPRD entry, and the average number of interactions against the numbers of subcellular compartments. The full size view of the right panel is shown in S1 Fig.
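The power-function approximation described for panel B can be sketched as follows. The caption states only the fitting range (1 to 100 interactions); the ordinary least-squares fit on log-transformed counts and the function name `fit_scaling_exponent` are assumptions made for illustration, not the authors' stated procedure.

```python
import math
from collections import Counter


def fit_scaling_exponent(degrees, k_min=1, k_max=100):
    """Estimate the exponent of a power function fitted to a degree distribution.

    `degrees` holds the number of interactions of each protein. Only values in
    [k_min, k_max] are used, mirroring the 1-to-100 range given in the caption.
    The exponent is the least-squares slope of log10(count) against log10(k)
    (an assumed fitting method).
    """
    counts = Counter(d for d in degrees if k_min <= d <= k_max)
    ks = sorted(counts)
    if len(ks) < 2:
        raise ValueError("need at least two distinct degrees inside the range")

    xs = [math.log10(k) for k in ks]
    ys = [math.log10(counts[k]) for k in ks]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Proteins with zero interactions fall outside the fitting range and do not affect the slope, which is consistent with the caption plotting them at 0.1 only for display.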
Fig 2. A. The average number of interactions decomposed by the interaction partners in 7 categories: NP, CP, MP, NCP, CMP, NCMP, and others. The full size view is shown in S2 Fig. B. The Venn diagram representing the interaction of NP with NCP. In total, 2,498 NP were analyzed. The NP with intra-interactions and interacting with NCP are shown by the two circles in the box. The intersection of two circles represents the shared interaction partners. Among these proteins, 1,250 NP neither intra-interact nor interact with NCP. C. The summary of interactions in terms of interaction partners. S symbols indicate that the interaction is rich (at least 0.6 interaction partners) between the protein (the left column) and the interaction partners (the top row), and the interaction partners are shared (at least 0.5) with the intra-interactions of partner proteins (S6 Table). For other symbols, see the bottom of the panel.
Fig 3. Scatter plot of the decrease in the average number of interactions after eliminating the proteins with the corresponding keyword, and the number of eliminated entries.
The horizontal and the vertical data are the rates normalized by the original average number of interactions (for NCP: 9.75) and the total number of entries (for NCP: 1,120). A. NCP. The keywords related to post-translational modifications and transcription are shown by blue and red dots, respectively. PTM* is the union of the “phosphoprotein”, “acetylation” and “Ubl conjugation” keywords. Transcription* is the union of the “transcription”, “DNA-binding”, “activator” and “repressor” keywords. PTM* + Transcription* and PTM* × Transcription* are the union and the intersection of PTM* and Transcription*, respectively. These groups of unified keywords are shown in orange. Keywords that appeared more than 50 times were examined. B. CMP. The keywords related to the kinase activity are next to the green dots. PTM’ is the union of the “Ubl conjugation” and “acetylation” keywords. PTM’ + Kinase and PTM’ + Nucleotide-binding are the unions of PTM’ and respective keywords, shown with orange dots. C. NCMP. In B and C, keywords that appeared more than 10 times were examined.
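The two normalized rates plotted for each keyword can be illustrated with a short sketch. The data layout (a list of `(number_of_interactions, keyword_set)` pairs) and the function name `keyword_effect` are hypothetical. A protein is treated as eliminated when it carries any of the given keywords, which corresponds to the union groups such as PTM*; the intersection groups would need an "all keywords" test instead.

```python
def keyword_effect(proteins, keywords):
    """Return (eliminated_rate, decrease_rate) for a keyword or keyword union.

    proteins : list of (n_interactions, set_of_keywords) pairs (assumed layout)
    keywords : iterable of keywords; a protein is eliminated if it has any of them

    eliminated_rate = eliminated entries / total entries
    decrease_rate   = (original average - average after elimination)
                      / original average
    """
    targets = set(keywords)
    total = len(proteins)
    if total == 0:
        raise ValueError("no proteins given")
    original_avg = sum(n for n, _ in proteins) / total
    if original_avg == 0:
        raise ValueError("original average number of interactions is zero")

    # Keep only proteins annotated with none of the target keywords.
    kept = [n for n, kws in proteins if not (targets & set(kws))]
    if not kept:
        raise ValueError("every protein was eliminated")

    eliminated_rate = (total - len(kept)) / total
    decrease_rate = (original_avg - sum(kept) / len(kept)) / original_avg
    return eliminated_rate, decrease_rate
```

A keyword that removes few entries yet lowers the average substantially marks a small group of highly connected proteins, which is the pattern the scatter plot is designed to expose.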
Fig 4. The protein structures characterized according to the intrinsic disorder and the domain architecture.
From the left, the distributions of protein length, percentage of IDR, and longest IDR length are shown (see full size view in S3 Fig). The average percentage of multi-domain proteins is presented in the right panel, where the proteins were divided into those composed of only distinctive multi-domains (D), distinctive and repetitive multi-domains (B), and only repetitive multi-domains (R). The compositions are represented by the brightness of the colors. TC* and Nb are the abbreviations for transcription* and nucleotide-binding, respectively.
Hub proteins localized in both the nucleus and cytoplasm, annotated by PTM* and transcription* (more than 100 PPIs)
| protein | Uniprot | HPRD | PPI | keyword | length | %ID | LID | domains | function | process | TL | IDEAL | ProS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Histone acetyltransferase p300 | Q09472 | 4078 | 209 | P/Ac/U/T | 2414 | 57 | 319 | 8 | TcoA | TC | | 70 | |
| CREB-binding protein (CBP) | Q92793 | 2534 | 198 | P/Ac/U/T/Av | 2442 | 61 | 340 | 8 | TcoA | TC | | 92 | x |
| Mothers against decapentaplegic homolog 3 (Smad3) | P84022 | 4380 | 182 | P/Ac/U/T/D | 425 | 16 | 57 | 2 | TF | Smad | P | 113 | x |
| Mothers against decapentaplegic homolog 2 (Smad2) | Q15796 | 3221 | 165 | P/Ac/U/T/D | 467 | 14 | 60 | 2 | TF | Smad | P | 127 | x |
| Mitogen-activated protein kinase 1 (MAPK1) | P28482 | 1496 | 160 | P/Ac/U/T/D/R | 360 | 4 | 11 | 1 | PK | MAPK | P | | |
| Ataxin-1 | P54253 | 3333 | 159 | P/U/T/D/R | 815 | 42 | 118 | 3 | CB | Notch | | | |
| Mothers against decapentaplegic homolog 4 (Smad4) | Q13485 | 2995 | 150 | P/Ac/U/T/D | 552 | 31 | 164 | 2 | TF | Smad | P | 132 | |
| Androgen receptor (AR) | P10275 | 2437 | 150 | P/U/T/D/Av | 919 | 60 | 554 | 4 | NR/TF | NR | L | 20 | |
| Coiled-coil domain-containing protein 85B (Ccdc85B) | Q15834 | 16101 | 129 | Ac/T/R | 202 | 40 | 51 | 1 | TcoR | | | | |
| Transcription factor p65 | Q04206 | 1241 | 113 | P/Ac/U/T/D/Av | 551 | 45 | 231 | 1 | TF | NF-κB | P | 207 | x |
| Mothers against decapentaplegic homolog 9 (Smad9) | O15198 | 4484 | 110 | P/T/D | 467 | 24 | 102 | 2 | TF | Smad | P | | |
| Mothers against decapentaplegic homolog 1 (Smad1) | Q15797 | 3356 | 109 | P/Ac/U/T/D | 465 | 23 | 103 | 2 | TF | Smad | P | 174 | x |
| Signal transducer and activator of transcription 3 (STAT3) | P40763 | 26 | 101 | P/Ac/T/D/Av | 770 | 9 | 55 | 4 | TF | JAK/STAT | P | | |
Abbreviations: Uniprot, Uniprot accession; HPRD, HPRD ID; PPI, number of PPIs; in the keyword column, P, phosphoprotein; Ac, acetylation; U, Ubl conjugation; T, transcription; D, DNA-binding; Av, activator; R, repressor; %ID, percentage of IDR; LID, length of the longest IDR; domains, number of domains; in the function column, TcoA, transcription co-activator; TF, transcription factor; PK, protein kinase; CB, chromatin binding; NR, nuclear receptor; TcoR, transcription co-repressor; in the process column, TC, transcription; Smad, Smad signaling pathway; MAPK, MAPK cascade; Notch, Notch signaling pathway; NR, nuclear receptor signaling pathway; NF-κB, NF-κB signaling pathway; JAK/STAT, JAK/STAT signaling pathway; in the TL column (trigger of translocation into the nucleus), P, phosphorylation; L, ligand binding; IDEAL, IDEAL identifier; in the ProS column, x, existence of protean segments.
Hub proteins localized in both the cytoplasm and cell membrane, annotated by PTM’ or nucleotide-binding.
| protein | Uniprot | HPRD | PPI | keyword | length | %ID | LID | TM | domains | function | process | step | IDEAL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tyrosine-protein kinase Lck | P06239 | 1080 | 105 | Ac/Nb/U | 509 | 13 | 64 | 0 | 3 | PK | T-Cell | 2 | |
| E3 ubiquitin-protein ligase CBL | P22681 | 1320 | 85 | U | 906 | 53 | 426 | 0 | 5 | UbL | Tyr-K | 2 | IID00300 |
| Tyrosine-protein kinase SYK | P43405 | 2514 | 74 | Nb/U | 635 | 17 | 100 | 0 | 2 | PK | B-cell | 2 | |
| E3 ubiquitin-protein ligase SMURF1 | Q9HCE7 | 6902 | 74 | U | 757 | 21 | 85 | 0 | 3 | UbL | BMP | 1 | IID00328 |
| Adapter molecule crk | P46108 | 1267 | 65 | Ac | 304 | 4 | 13 | 0 | 3 | AP | Reeling | 2 | |
| Mast/stem cell growth factor receptor Kit | P10721 | 1287 | 54 | Nb/U | 976 | 11 | 41 | 46 | 5 | RC | VA | 0 | |
| Tyrosine-protein kinase ZAP-70 | P43403 | 1495 | 48 | Ac/Nb | 619 | 2 | 13 | 0 | 2 | PK | T-Cell | 1 | |
| Guanine nucleotide-binding protein G(i) subunit α-2 | P04899 | 764 | 48 | Nb | 355 | 2 | 5 | 0 | 1 | MD | VA | 1 | |
Abbreviations: Uniprot, Uniprot accession; HPRD, HPRD ID; PPI, number of PPIs; in the keyword column, Ac, acetylation; Nb, nucleotide-binding; U, Ubl conjugation; %ID, percentage of IDR; LID, length of the longest IDR; TM, length of the predicted trans-membrane regions; domains, number of domains; in the function column, PK, protein kinase; UbL, ubiquitin ligase; AP, adaptor in signaling pathways; RC, receptor; MD, modulator in signaling pathways; in the process column, T-Cell, T-cell signaling pathway; Tyr-K, tyrosine kinase signaling pathway; B-cell, B-cell signaling pathway; BMP, BMP signaling pathway; Reeling, Reelin signaling pathway; VA, various signaling processes; step, number of steps from the cell membrane in KEGG pathway maps; IDEAL, IDEAL identifier.
Hub proteins localized in the nucleus, cytoplasm and cell membrane.
| protein | Uniprot | HPRD | PPI | keyword | length | %ID | LID | TM | domains | function | process | step | IDEAL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tyrosine-protein kinase Fyn | P06241 | 655 | 154 | P/Nb | 537 | 16 | 86 | 0 | 3 | PK | sphingolipid | 1 | |
| RAC-α serine/threonine-protein kinase (Akt1) | P31749 | 1261 | 117 | P/Ac/U/Nb | 480 | 0 | 3 | 0 | 3 | PK | HIF-1 | 2 | IID00412 |
| Glycogen synthase kinase-3 β | P49841 | 5418 | 73 | P/Nb | 420 | 16 | 35 | 0 | 1 | PK | Wnt | 2 | IID00052 |
| Protein NDRG1 | Q92597 | 5586 | 61 | P/Ac | 394 | 21 | 83 | 0 | 1 | RP | stress | | |
| Tyrosine-protein kinase BTK | Q06187 | 2248 | 56 | P/Ac/T/Nb | 659 | 6 | 40 | 0 | 5 | PK | NF-κB | 2 | |
| Guanine nucleotide-binding protein G(i) subunit α-1 | P63096 | 756 | 50 | | 354 | 3 | 7 | 0 | 1 | MD | cGMP | 1 | |
| Receptor tyrosine-protein kinase erbB-2 | P04626 | 1281 | 46 | P/T/Av | 1255 | 28 | 262 | 46 | 3 | R-PK | ErbB | 0 | IID00293 |
| Protein kinase C ε type | Q02156 | 1500 | 45 | P/Nb | 737 | 4 | 31 | 0 | 4 | PK | sphingolipid | 3 | IID00066 |
| Peripheral plasma membrane protein CASK | O14936 | 2164 | 37 | P/Nb | 926 | 14 | 43 | 0 | 5 | SC | adhesion | | |
| Catenin δ-1 | O60716 | 3026 | 22 | P/Ac/T | 968 | 48 | 357 | 0 | 1 | RB | Wnt | 1 | |
Abbreviations: Uniprot, Uniprot accession; HPRD, HPRD ID; PPI, number of PPIs; in the keyword column, P, phosphoprotein; Ac, acetylation; U, Ubl conjugation; T, transcription; Av, activator; Nb, nucleotide-binding; %ID, percentage of IDR; LID, length of the longest IDR; TM, length of the predicted trans-membrane regions; domains, number of domains; in the function column, PK, protein kinase; RP, response protein; MD, modulator in signaling pathways; R-PK, receptor type protein kinase; SC, scaffold protein; RB, receptor binding; in the process column, sphingolipid, sphingolipid signaling pathway; HIF-1, HIF-1 signaling pathway; Wnt, Wnt signaling pathway; stress, stress response; NF-κB, NF-κB signaling pathway; cGMP, cGMP-PKG signaling pathway; ErbB, ErbB signaling pathway; adhesion, cell adhesion; step, number of steps from the cell membrane in KEGG pathway maps; IDEAL, IDEAL identifier.
Fig 5. Venn diagrams showing the overlap of the proteins that undergo post-translational modifications multiple times (mPTM) [27], and the multiple-localized hub proteins.
The mPTM proteins were obtained from Supplementary File 2 of [27]. We regarded mPTM proteins as proteins that undergo PTM more than once (strictly speaking, those whose classification group symbol in the File is at least 2), and did not subdivide them further. Proteins with PTM that were not annotated in the File were disregarded. A. Detailed classification. The small circles in NCP and CMP are the proteins with PTM* × Transcription* (TC*) and those with PTM’ + Nucleotide-binding (Nb), respectively. B. Schematic classification.