Shireen Al-Momani, Da Qi, Zhe Ren, Andrew R Jones.
Abstract
Phosphorylation is one of the most prevalent post-translational modifications and plays a key role in regulating cellular processes. We carried out a bioinformatics analysis of pre-existing phosphoproteomics data to profile two model species representing the largest subclasses of flowering plants, the dicot Arabidopsis thaliana and the monocot Oryza sativa, in order to understand the extent to which phosphorylation signaling and function are conserved across evolutionarily divergent plants. We identified 6537 phosphopeptides from 3189 phosphoproteins in Arabidopsis and 2307 phosphopeptides from 1613 phosphoproteins in rice. We identified phosphorylation motifs, finding nineteen pS motifs and two pT motifs shared between rice and Arabidopsis. The majority of shared motif-containing proteins mapped to the same biological processes with similar patterns of fold enrichment, indicating high functional conservation. We also identified shared patterns of crosstalk between phosphoserines, with enrichment for the motifs pSXpS, pSXXpS and pSXXXpS, where X is any amino acid. Lastly, we identified several pairs of motifs that co-occur significantly more often than expected in Arabidopsis proteins, indicating crosstalk between different sites; this was not observed in rice.

SIGNIFICANCE: Our results demonstrate that there are evolutionarily conserved mechanisms of phosphorylation-mediated signaling in plants, based on an analysis of high-throughput phosphoproteomics data from key monocot and dicot species: rice and Arabidopsis thaliana. The results also suggest that there is increased crosstalk between phosphorylation sites in A. thaliana compared with rice. The results are important for our general understanding of cell signaling in plants, and for the ability to use A. thaliana as a general model for plant biology.
Keywords: Bioinformatics; Evolutionary conservation; Motif identification; Pathway analysis; Phosphoproteomics
Year: 2018 PMID: 29654922 PMCID: PMC5971217 DOI: 10.1016/j.jprot.2018.04.011
Source DB: PubMed Journal: J Proteomics ISSN: 1874-3919 Impact factor: 4.044
The data sets from ProteomeXchange that were re-analysed, including brief details of the experimental protocols, and the search parameters and cutoff filter used in our PEAKS database searching.
| Species | Data set identifier | Phosphopeptide enrichment | Precursor mass error tol. | Fragment mass error tol. | Missed cleavages | Fixed modifications | Variable modifications | −10lgP threshold | Instrument/fragmentation |
|---|---|---|---|---|---|---|---|---|---|
| | PXD000033 | IMAC and TiO2 | ±5.0 ppm | ±0.5 Da | Three | Beta-methylthiolation (+45.99 Da), iTRAQ 4plex (K, N-term) (+144.10 Da) | Oxidation (M) (+15.99 Da), Phosphorylation (STY) (+79.97 Da), Acetylation (N-term) (+42.01 Da), Deamidation (NQ) (+0.98 Da), iTRAQ 4plex (Y) (+144.10 Da) | 30 | Orbitrap |
| | PXD000421 | MOAC | ±7.0 ppm | ±0.8 Da | Two | Carbamidomethylation (+57.02 Da) | Oxidation (M) (+15.99 Da), Phosphorylation (STY) (+79.97 Da), Acetylation (N-term) (+42.01 Da), Deamidation (NQ) (+0.98 Da) | 24.2 | Orbitrap |
| | PXD002222 | TiO2 | ±20.0 ppm | ±0.05 Da | One | Carbamidomethylation (+57.02 Da) | Oxidation (M) (+15.99 Da), Phosphorylation (STY) (+79.97 Da), Acetylation (N-term) (+42.01 Da) | 19 | Orbitrap |
| | PXD000923 | IMAC | ±15 ppm | ±0.05 Da | One | Carbamidomethylation (+57.02 Da) | Oxidation (M) (+15.99 Da), Phosphorylation (STY) (+79.97 Da) | 21 | Triple TOF |
MOAC = Metal Oxide Affinity Chromatography; IMAC = Immobilized Metal Affinity Chromatography.
The PEAKS DB −10lgP score threshold was set as a cutoff filter to achieve a false discovery rate (FDR) of 1.0% at the peptide sequence level.
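As a rough illustration of the score scale, the sketch below converts a −10lgP score back to the underlying P-value, assuming the name is read literally as −10 × log10(P). The function names are hypothetical, and the thresholds are copied from the table above. Note that the P-value bound implied by a threshold is not the same quantity as the 1% FDR the thresholds were chosen to achieve.

```python
import math


def score_to_p(score: float) -> float:
    """Invert score = -10 * log10(P) to recover P."""
    return 10 ** (-score / 10)


def p_to_score(p: float) -> float:
    """Compute -10 * log10(P) for a probability P in (0, 1]."""
    return -10 * math.log10(p)


# -10lgP thresholds per data set, as listed in the table above.
thresholds = {
    "PXD000033": 30,
    "PXD000421": 24.2,
    "PXD002222": 19,
    "PXD000923": 21,
}

for dataset, score in thresholds.items():
    print(f"{dataset}: -10lgP >= {score} corresponds to P <= {score_to_p(score):.2e}")
```

The same log scale underlies the Ascore cutoff used for site localization later in this record, where a score of 20 corresponds to a p-value of 0.01.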
A 2×2 contingency table was set up to determine whether a pair of motifs occurred together more often than would be expected by chance, where: q is the count of proteins with both motifs a and b; m is the count of proteins with motif a; n is the count of proteins with motif b; N is the count of all identified phosphorylated proteins. The enrichment factor (EF) was calculated as EF(a, b) = q / (mn/N).
| | Observed count | Expected count |
|---|---|---|
| Proteins with motif a and b | q | mn/N |
| Proteins without motif a and b | N – m – n + q | ((N − m)(N − n))/N |
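The observed and expected counts and the enrichment factor defined above can be computed directly from q, m, n and N. The following is a minimal sketch, not the authors' code: the function name is hypothetical, and the values of m, n and q in the example are made up for illustration (only N = 3189, the number of Arabidopsis phosphoproteins, comes from this record).

```python
def motif_pair_enrichment(q: int, m: int, n: int, N: int) -> dict:
    """Observed/expected counts and enrichment factor for a motif pair.

    q: proteins containing both motif a and motif b
    m: proteins containing motif a
    n: proteins containing motif b
    N: all identified phosphoproteins
    """
    expected_both = m * n / N                  # expected count under independence
    observed_neither = N - m - n + q           # proteins with neither motif
    expected_neither = (N - m) * (N - n) / N
    return {
        "observed_both": q,
        "expected_both": expected_both,
        "observed_neither": observed_neither,
        "expected_neither": expected_neither,
        "enrichment_factor": q / expected_both,  # EF(a, b) = q / (mn/N)
    }


# Hypothetical counts for one motif pair; only N is taken from the study.
result = motif_pair_enrichment(q=30, m=200, n=150, N=3189)
print(f"EF = {result['enrichment_factor']:.3f}")
```

An EF above 1 means the pair is seen together more often than independence would predict; whether that excess is significant still requires a statistical test on the contingency table, which this sketch does not perform.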
Summary statistics for protein and peptide identifications in our study.
| | Arabidopsis | Rice |
|---|---|---|
| Identified proteins | 6332 | 2878 |
| Identified peptides | 52,430 | 10,117 |
| Identified phosphoproteins | 3189 | 1613 |
| Identified phosphopeptides | 6537 | 2307 |
| Total phospho-sites | 9249 | 2580 |
| Singly phosphorylated proteins | 1342 (42%) | 1025 (63.5%) |
| Multi-phosphorylated proteins | 1847 (58%) | 588 (36.5%) |
| Shared motif containing proteins | 791 (24.80%) | 1012 (62.74%) |
| pS%:pT%:pY% | 88.3: 11.4: 0.4 | 86.7: 12.8: 0.5 |
| 1Pi%:2Pi%:3Pi% | 78.9: 18.7: 2.3 | 89.7: 9.2: 1.1 |
| (1Pi%:2Pi%:3Pi%) in serine containing phosphopeptides | 78.7: 18.9: 2.4 | 90.9: 8.0: 1.1 |
| (1Pi%:2Pi%:3Pi%) in threonine containing phosphopeptides | 71.7: 23.8: 4.5 | 87.1: 12.9: 0 |
| (1Pi%:2Pi%:3Pi%) in tyrosine containing phosphopeptides | 58.3: 41.7: 0 | 91.7: 8.3: 0 |
Phosphorylation sites were assigned with 99% certainty, based on a p-value of 0.01 (Ascore ≥ 20).
Relative abundance of serine, threonine, and tyrosine phosphorylation sites, based on analyzing 9249 phosphorylation sites in Arabidopsis and 2580 phosphorylation sites in rice.
Relative frequency of singly, doubly, and triply phosphorylated peptides, based on analyzing a total of 6537 phosphopeptides in Arabidopsis and 2307 phosphopeptides in rice. No phosphopeptides with four or more phosphates were found in this study.
Relative frequency of singly, doubly, and triply phosphorylated peptides in serine-containing phosphopeptides, based on analyzing a total of 7489 serine-containing phosphopeptides in Arabidopsis and 2148 serine-containing phosphopeptides in rice.
Relative frequency of singly, doubly, and triply phosphorylated peptides in threonine-containing phosphopeptides, based on analyzing a total of 1022 threonine-containing phosphopeptides in Arabidopsis and 325 threonine-containing phosphopeptides in rice.
Relative frequency of singly, doubly, and triply phosphorylated peptides in tyrosine-containing phosphopeptides, based on analyzing a total of 36 tyrosine-containing phosphopeptides in Arabidopsis and 12 tyrosine-containing phosphopeptides in rice.
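The protein-level percentages in the summary table can be re-derived from the reported counts. The sketch below does this for the singly phosphorylated, multi-phosphorylated, and shared-motif rows; the helper name `pct` and the dictionary layout are illustrative assumptions, while the counts are copied from the table.

```python
def pct(part: int, total: int, digits: int = 1) -> float:
    """Percentage of `part` in `total`, rounded to `digits` decimals."""
    return round(100 * part / total, digits)


# Counts copied from the summary table (phosphoproteins per species).
counts = {
    "Arabidopsis": {"phosphoproteins": 3189, "single": 1342, "multi": 1847, "shared_motif": 791},
    "Rice": {"phosphoproteins": 1613, "single": 1025, "multi": 588, "shared_motif": 1012},
}

for species, c in counts.items():
    total = c["phosphoproteins"]
    print(
        f"{species}: singly {pct(c['single'], total)}%, "
        f"multi {pct(c['multi'], total)}%, "
        f"shared motif {pct(c['shared_motif'], total, 2)}%"
    )
```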
Fig. 1. Relative frequency distribution of distances between two phospho-sites within a 30 amino acid window in rice (left) and Arabidopsis (right). In each subfigure, the gray line represents the background distribution, which we calculated as the distance between all serine residues in the theoretical proteome.
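A distance distribution of this kind could be computed along the following lines. This is only a sketch under stated assumptions: the input layout (a mapping from protein identifier to phospho-site positions) and the function names are hypothetical, and the record does not say whether all site pairs or only neighbouring sites were counted, so counting all pairs is an assumption. The example positions are made up.

```python
from collections import Counter
from itertools import combinations


def site_distance_counts(sites_by_protein: dict, window: int = 30) -> Counter:
    """Count distances between every pair of phospho-sites in the same
    protein, keeping only pairs at most `window` residues apart."""
    counts = Counter()
    for positions in sites_by_protein.values():
        for first, second in combinations(sorted(set(positions)), 2):
            distance = second - first
            if distance <= window:
                counts[distance] += 1
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Normalise distance counts so that the frequencies sum to 1."""
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())} if total else {}


# Hypothetical site positions for two proteins.
example = {"protein_1": [10, 12, 15], "protein_2": [100, 104, 200]}
distances = site_distance_counts(example)
print(relative_frequencies(distances))
```

Under this convention, distances of 2, 3 and 4 correspond to the pSXpS, pSXXpS and pSXXXpS patterns mentioned in the abstract. The background curve would be obtained the same way from the positions of all serine residues in the theoretical proteome.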
Fig. 2. Motif logos of the ten most abundant motifs predicted in rice (left) and Arabidopsis (right). The logos show residues that are statistically significant at the p < 0.05 level, corresponding to the 0.00018 significance threshold specified in the motif-x analysis, together with the frequency of other residues that failed to exceed the significance threshold yet may be statistically overrepresented. The four motifs shared between the two species (SM1-SM4) are framed in yellow. Note that GS and RSP are shared motifs but are not among the ten most abundant motifs in Arabidopsis and rice, respectively.
Fig. 3. Log2(fold enrichment) for PANTHER GO-slim biological process terms mapped from rice (left bar of each pair) and Arabidopsis (right bar of each pair) phosphoproteins containing shared phosphorylation motifs. *p < 0.05; **p < 0.001; ***p < 0.0001.
Fig. 4. Co-occurrence of shared motifs in Arabidopsis and rice. The ratio of the observed to the expected count of proteins that contain a motif pair from the twenty-one shared motifs identified in this study was calculated across confidently identified phosphoproteins. n is the number of observed proteins for each shared motif pair.