Karen E Ross, Cecilia N Arighi, Jia Ren, Hongzhan Huang, Cathy H Wu.
Abstract
Knowledge representation of the role of phosphorylation is essential for the meaningful understanding of many biological processes. However, such a representation is challenging because proteins can exist in numerous phosphorylated forms with each one having its own characteristic protein-protein interactions (PPIs), functions and subcellular localization. In this article, we evaluate the current state of phosphorylation event curation and then present a bioinformatics framework for the annotation and representation of phosphorylated proteins and construction of phosphorylation networks that addresses some of the gaps in current curation efforts. The integrated approach involves (i) text mining guided by RLIMS-P, a tool that identifies phosphorylation-related information in scientific literature; (ii) data mining from curated PPI databases; (iii) protein form and complex representation using the Protein Ontology (PRO); (iv) functional annotation using the Gene Ontology (GO); and (v) network visualization and analysis with Cytoscape. We use this framework to study the spindle checkpoint, the process that monitors the assembly of the mitotic spindle and blocks cell cycle progression at metaphase until all chromosomes have made bipolar spindle attachments. The phosphorylation networks we construct, centered on the human checkpoint kinase BUB1B (BubR1) and its yeast counterpart MAD3, offer a unique view of the spindle checkpoint that emphasizes biologically relevant phosphorylated forms, phosphorylation-state-specific PPIs and kinase-substrate relationships. Our approach for constructing protein phosphorylation networks can be applied to any biological process that is affected by phosphorylation.
Database URL: http://www.yeastgenome.org/
Year: 2013 PMID: 23749465 PMCID: PMC3675891 DOI: 10.1093/database/bat038
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Overview of the workflow for the construction of phosphorylation-focused PPI networks.
Total phosphoproteins, phosphorylation sites, kinases and scientific publications (PMIDs) curated by seven databases that capture protein phosphorylation information (UniProtKB, Phospho.ELM, PhosphoSitePlus, HPRD, PhosphoGrid, PhosPhAt and P3DB)
| | Total |
|---|---|
| Phosphoproteins* | 28 158 |
| Sites | 125 896 |
| Kinases* | 689 |
| Sites w/ kinase information | 12 702 |
| PMIDs | 10 213 |
*The numbers of phosphoproteins and kinases are the numbers of distinct UniProtKB identifiers that were obtained by mapping the data to UniProtKB entries.
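Two ratios follow directly from these totals and help gauge how much kinase-substrate information the seven databases capture. The short Python sketch below only performs that arithmetic; the dictionary keys and variable names are chosen here for illustration, and the per-protein mean is approximate because phosphoproteins are counted as distinct UniProtKB identifiers.

```python
# Totals copied from the summary table above.
totals = {
    "phosphoproteins": 28158,
    "sites": 125896,
    "kinases": 689,
    "sites_with_kinase": 12702,
    "pmids": 10213,
}

# Share of curated phosphorylation sites that have an assigned kinase.
kinase_coverage = totals["sites_with_kinase"] / totals["sites"]

# Mean number of curated sites per phosphoprotein (distinct UniProtKB IDs).
sites_per_protein = totals["sites"] / totals["phosphoproteins"]

print(f"Sites with kinase information: {kinase_coverage:.1%}")
print(f"Mean sites per phosphoprotein: {sites_per_protein:.2f}")
```

On these numbers, roughly one curated site in ten has kinase information attached.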
Number of phosphoproteins (distinct UniProtKB identifiers) for the top 15 annotated organisms
| Organism | No. of phosphoproteins |
|---|---|
| | 8738 |
| | 6423 |
| | 3533 |
| | 2649 |
| | 2525 |
| | 969 |
| | 750 |
| | 639 |
| | 583 |
| | 555 |
| | 111 |
| | 87 |
| | 86 |
| | 86 |
| | 66 |
Figure 2. The human BUB1B network. The BUB1B node is shown in red; other core spindle checkpoint proteins are shown in purple; nodes representing phosphorylation-state–specific forms are shown in blue. Triangles indicate protein kinases. Green and yellow edges are PPIs identified by text mining and data mining, respectively; blue edges connect kinases to their phosphorylated products; black edges indicate the has_part relation connecting protein complexes to their components.
Phosphorylation-state–specific PPIs in the human BUB1B network
| Protein Form #1 | Protein Form #2 |
|---|---|
| BUB1B/Phos:1 | PLK1 |
| BUB1B/Phos:2 | BUB1 |
| BUB1B/Phos:2 | CDC20 |
| BUB1B/Phos:2 | PPP2R5A |
| CDC27/Phos:1 | BUB1B |
| CDC27/PhosRes- | BUB1B |
| CDC20/Phos:1 | BUB1B |
| CDC20/Phos:1 | MAD2L1 |
| CDC20/PhosRes- | BUB1B |
| CDC20/PhosRes- | MAD2L1 |
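To illustrate what form-level representation adds over a gene-centric view, the Python sketch below loads the pairs from this table, groups partners by protein form and then collapses the forms to gene level. It is a minimal illustration, not the authors' pipeline: the `parent_gene` helper is hypothetical and simply assumes that the text before the `/` in a label names the gene-level protein, as the labels in the table suggest.

```python
from collections import defaultdict

# Phosphorylation-state-specific PPIs, copied from the table above.
ppis = [
    ("BUB1B/Phos:1", "PLK1"),
    ("BUB1B/Phos:2", "BUB1"),
    ("BUB1B/Phos:2", "CDC20"),
    ("BUB1B/Phos:2", "PPP2R5A"),
    ("CDC27/Phos:1", "BUB1B"),
    ("CDC27/PhosRes-", "BUB1B"),
    ("CDC20/Phos:1", "BUB1B"),
    ("CDC20/Phos:1", "MAD2L1"),
    ("CDC20/PhosRes-", "BUB1B"),
    ("CDC20/PhosRes-", "MAD2L1"),
]


def parent_gene(form: str) -> str:
    """Hypothetical helper: return the label text before '/' (assumed gene name)."""
    return form.split("/", 1)[0]


# Group interaction partners by the specific protein form.
partners = defaultdict(set)
for form, partner in ppis:
    partners[form].add(partner)

# Collapse each form to its gene; unordered pairs remove duplicates.
gene_level = {frozenset((parent_gene(form), partner)) for form, partner in ppis}

for form in sorted(partners):
    print(form, "->", sorted(partners[form]))
print(len(ppis), "form-specific PPIs collapse to", len(gene_level), "gene-level pairs")
```

The ten form-specific interactions reduce to six gene-level pairs, which is the information a network without protein forms would retain.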
Figure 3. Comparison of the human BUB1B phosphorylation network with the human BUB1B STRING network. The portion of the BUB1B phosphorylation network comprising proteins that directly interact with BUB1B, their isoforms and their PTM forms is shown. Gene-level terms that appear in the STRING BUB1B network are shown in blue; gene-level terms that do not appear in the STRING network are shown in red; isoforms or PTM forms are shown in gray. Triangles indicate protein kinases. Green, yellow and blue edges are as described in Figure 2; black edges indicate the is_a relationship connecting isoforms and PTM forms to their parent gene-level forms.
BUB1B phosphorylation sites in Phospho.ELM, PhosphoSitePlus, HPRD and PRO identified in low-throughput experiments
| Site | Phospho.ELM | PhosphoSitePlus | HPRD | PRO |
|---|---|---|---|---|
| Ser-435 | LTP, HTP | LTP, HTP | HTP | BUB1B/Phos:4 |
| Ser-543 | LTP, HTP | LTP, HTP | HTP | BUB1B/Phos:4 |
| Ser-574 | HTP | LTP, HTP | | BUB1B/Phos:4 |
| Thr-608 | HTP | | | BUB1B/Phos:3 |
| Thr-620 | LTP | LTP, HTP | | BUB1B/Phos:1, BUB1B/Phos:2 |
| Ser-670 | LTP, HTP | LTP, HTP | HTP | BUB1B/Phos:4 |
| Ser-676 | LTP | LTP, HTP | | BUB1B/Phos:2 |
| Thr-680 | LTP | | | BUB1B/Phos:2 |
| Ser-720 | HTP | LTP, HTP | | BUB1B/Phos:4 |
| Thr-792 | LTP | | | BUB1B/Phos:2 |
| Ser-884 | LTP, HTP | | | |
| Thr-1008 | LTP | | | BUB1B/Phos:2 |
| Ser-1043 | LTP, HTP | LTP, HTP | HTP | BUB1B/Phos:4 |
LTP, low throughput; HTP, high throughput.
In addition to the sites listed above, the following phosphorylation sites were identified in high-throughput experiments only:
PhosphoSitePlus: Ser-39, Thr-40, Thr-54, Ser-83, Thr-315, Thr-368, Ser-384, Tyr-404, Thr-471, Thr-600, Ser-633, Tyr-660, Tyr-766, Ser-797
PhosphoSitePlus and Phospho.ELM: Ser-367, Ser-537, Ser-733
PhosphoSitePlus and HPRD: Thr-434, Thr-1042.
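The three lists above can be tallied per database to show how unevenly the high-throughput-only BUB1B sites are covered. The Python sketch below is an illustrative count over the sites exactly as listed; the variable names are chosen here and are not part of any database's interface.

```python
from collections import Counter

# High-throughput-only BUB1B sites, copied from the three lists above.
psp_only = [
    "Ser-39", "Thr-40", "Thr-54", "Ser-83", "Thr-315", "Thr-368", "Ser-384",
    "Tyr-404", "Thr-471", "Thr-600", "Ser-633", "Tyr-660", "Tyr-766", "Ser-797",
]
psp_and_elm = ["Ser-367", "Ser-537", "Ser-733"]
psp_and_hprd = ["Thr-434", "Thr-1042"]

# Sites reported by each database (high-throughput evidence only).
htp_only = {
    "PhosphoSitePlus": set(psp_only) | set(psp_and_elm) | set(psp_and_hprd),
    "Phospho.ELM": set(psp_and_elm),
    "HPRD": set(psp_and_hprd),
}

for db, sites in htp_only.items():
    print(f"{db}: {len(sites)} HTP-only sites")

# Break the combined set down by residue type (Ser, Thr, Tyr).
all_sites = set().union(*htp_only.values())
by_residue = Counter(site.split("-")[0] for site in all_sites)
print(dict(by_residue))
```

All 19 of these sites appear in PhosphoSitePlus, while Phospho.ELM and HPRD each share only a handful of them.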
Figure 4. Comparison of the yeast MAD3 and human BUB1B networks. (A) The MAD3 network. (B) The portion of the BUB1B network containing the human homologs of the yeast network proteins. In (A) and (B), nodes representing homologous gene-level protein forms in yeast and humans are colored blue; nodes representing homologous phosphorylated protein forms are colored red. Edges are color-coded as in Figure 3.