Yanhui Hu1,2, Richelle Sopko1, Verena Chung1,2, Marianna Foos1, Romain A Studer3, Sean D Landry4, Daniel Liu2, Leonard Rabinow1, Florian Gnad4, Pedro Beltrao3, Norbert Perrimon5,2,6.
Abstract
Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing protein stability, interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, most commonly serine, threonine and tyrosine in metazoans. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that any given phosphorylation site might be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from Drosophila embryos collected from six closely-related species. We built iProteinDB (https://www.flyrnai.org/tools/iproteindb/), a resource integrating these data with other high-throughput PTM datasets, including datasets from vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any Drosophila protein and identify predicted functional phosphosites based on a comparative analysis of data from closely-related Drosophila species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, rat, Xenopus tropicalis, Danio rerio, and Caenorhabditis elegans.
Keywords: Drosophila; phosphoproteomics; post-translational modification
Year: 2019 PMID: 30397019 PMCID: PMC6325894 DOI: 10.1534/g3.118.200637
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Database content and statistics. Distribution of 168,997 observed PTMs in the proteomics dataset (A). Representation of different types of PTMs (B). Overlap of phosphorylation data for Drosophila melanogaster from five different sources (C). Distribution of phosphorylation sites observed at three phospho-acceptor residues (serine (S), threonine (T) and tyrosine (Y)) within protein domains and their conservation based on at least 50% similarity to human sequence, considering a sliding window of five amino acids (D).
Integrated phosphoproteomics data for Drosophila melanogaster
| Source | Sample | Number of sites | Site overlap with at least one other source | Site overlap with at least two other sources |
|---|---|---|---|---|
| This study | fly embryo | 21,750 | 13,200 (61%) | 7,887 (36%) |
| Publication (PMID:18327897) | fly embryo | 23,347 | 14,758 (63%) | 8,466 (36%) |
| PHOSIDA | SL2 cells | 25,197 | 16,121 (64%) | 8,709 (35%) |
| PhosphoPep | Kc167 cells | 26,679 | 14,277 (54%) | 7,724 (29%) |
| UniProt | varies | 3,095 | 2,714 (88%) | 1,738 (56%) |
| All | | 62,298 | 23,300 (37%) | 10,027 (16%) |
Numbers are based on non-redundant protein reference.
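The overlap columns in the table above can be reproduced in principle by counting, for each site, how many sources report it. The following is a minimal sketch, not the authors' pipeline: it assumes each site is keyed by a hypothetical `(protein_id, position)` tuple on a shared non-redundant protein reference, and the function name `overlap_summary` is illustrative.

```python
from collections import Counter


def overlap_summary(sites_by_source):
    """Summarize per-source site counts and their overlap with other sources.

    sites_by_source maps a source name to an iterable of site keys.
    The (protein_id, position) key format is an assumption of this sketch.
    """
    # Number of distinct sources that report each site.
    support = Counter(
        site for sites in sites_by_source.values() for site in set(sites)
    )
    summary = {}
    for source, sites in sites_by_source.items():
        unique = set(sites)
        summary[source] = {
            "sites": len(unique),
            # Reported by this source and at least one other source.
            "in_1_plus_other": sum(1 for s in unique if support[s] >= 2),
            # Reported by this source and at least two other sources.
            "in_2_plus_other": sum(1 for s in unique if support[s] >= 3),
        }
    return summary


# Toy data only; these are not sites from the paper.
toy = {
    "A": [("P1", 10), ("P1", 20), ("P2", 5)],
    "B": [("P1", 10), ("P2", 5)],
    "C": [("P1", 10), ("P3", 7)],
}
result = overlap_summary(toy)
```

Dividing each overlap count by the source's site count gives percentages in the style of the table's parenthesized values.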
Figure 2. Features of the iProteinDB user interface. Observed PTM sites are marked in red on the Drosophila melanogaster protein sequence. Predicted phosphosites based on phospho-proteomic data from five other Drosophila species are marked in blue. Sites observed in more than one Drosophila species are underlined. The protein domains are highlighted in green. The data sources of PTMs are summarized. At the “Predicted Orthologs” page, the multiple sequence alignment of orthologous genes from major model organisms and human is displayed, with observed sites color-coded (red arrows), conserved sites bolded (brown arrow) and human disease variant mutations underlined (navy arrows).
Figure 3. Evolutionary relationships among Drosophila melanogaster tyrosine kinases. The core of the plot illustrates the phylogenetic relationships among Drosophila melanogaster tyrosine kinases estimated by total sequence similarity. The outer circle reflects the presence of orthologs in other species. Relationships among Drosophila melanogaster Serine/Threonine kinases are presented in Supplementary Figure 4.
Figure 4. Conservation of phosphorylated proteins and sites. The line plot illustrates the proportions of Drosophila melanogaster phosphoproteins (blue) and non-phosphoproteins (orange) showing orthologs in other species (A). The line plot shows the proportions of conserved Drosophila melanogaster phosphosites (blue) and non-phosphorylated serines, threonines, and tyrosines (orange) across species (B).
Figure 5. Analysis of the conservation of PTM sites of Drosophila melanogaster. Correlation of sequence conservation with observed phosphorylation in Drosophila melanogaster: 11,619 phosphosites identified in Drosophila melanogaster proteins can be aligned to phospho-acceptor amino acids of the human orthologs, while 2,601 acetylation sites identified in Drosophila melanogaster can be aligned to human orthologs. Considering a sliding window of five amino acids surrounding the identified phosphosite, the probability of the corresponding phospho-acceptor site having been observed as phosphorylated in human data correlates with the degree of sequence similarity. This correlation has also been observed with acetylation sites (A). Correlation of phosphorylation with disease-related protein variants: the probability that a human site aligned to a phosphosite identified in Drosophila lies within 10 amino acids of a disease variant correlates with the sequence similarity between the human and Drosophila sequences. The correlation is prevalent for phospho-acceptor sites in human (B).
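The five-residue sliding window mentioned in the captions for Figures 1 and 5 can be illustrated with a short sketch. This is an assumption-laden simplification: it scores the window by fraction of identical residues, whereas the source speaks of "similarity" without specifying the scoring scheme, and the function name `window_identity` and the example sequences are hypothetical.

```python
def window_identity(aligned_fly, aligned_human, col, window=5):
    """Fraction of identical residues in a window centered on alignment column col.

    Both inputs are rows of the same pairwise alignment ('-' marks a gap).
    Identity is used here as a stand-in for the similarity measure in the
    source, which is not specified; that substitution is an assumption.
    """
    if len(aligned_fly) != len(aligned_human):
        raise ValueError("aligned sequences must have equal length")
    if not 0 <= col < len(aligned_fly):
        raise IndexError("col is outside the alignment")
    half = window // 2
    # Truncate the window at the alignment boundaries.
    start = max(0, col - half)
    end = min(len(aligned_fly), col + half + 1)
    pairs = list(zip(aligned_fly[start:end], aligned_human[start:end]))
    # A gap aligned to a gap is not counted as a match.
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return matches / len(pairs)


# Invented sequences for illustration; the site of interest is column 3 ('S').
score = window_identity("AKRSPTL", "AKRSPSL", col=3)
passes_threshold = score >= 0.5  # 50% cutoff as described for Figure 1D
```

Replacing the identity count with a substitution-matrix score would bring the sketch closer to a true similarity measure, at the cost of choosing a matrix the source does not name.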
Figure 6. Examples of phosphosite conservation between human and Drosophila melanogaster. Examples of phosphosites identified in Drosophila melanogaster (red), also identified as phosphorylated in human (red), that share 100% identity with human (arrow) and indicated model organisms (A). Phosphosites where the observed phospho-acceptor residue has changed (B) and phosphosites where the phospho-acceptors have been lost but the surrounding sequences are 100% identical (C). Species abbreviations: hs, Homo sapiens; mm, Mus musculus; rn, Rattus norvegicus; xt, Xenopus tropicalis; dr, Danio rerio; dm, Drosophila melanogaster; ce, Caenorhabditis elegans.
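The three situations shown in Figure 6 can be expressed as a simple rule on the aligned residue pair. The sketch below is one possible reading of the caption, not the authors' method; the category labels and the function name `classify_site` are hypothetical.

```python
PHOSPHO_ACCEPTORS = frozenset("STY")  # serine, threonine, tyrosine


def classify_site(fly_residue, human_residue):
    """Classify an aligned human residue relative to a fly phosphosite.

    Labels loosely mirror panels A to C of Figure 6; the wording of the
    labels is an assumption of this sketch.
    """
    if fly_residue not in PHOSPHO_ACCEPTORS:
        raise ValueError("fly residue must be S, T or Y")
    if human_residue == fly_residue:
        return "identical acceptor"   # panel A style
    if human_residue in PHOSPHO_ACCEPTORS:
        return "acceptor changed"     # panel B style, e.g. S aligned to T
    return "acceptor lost"            # panel C style, includes gaps


labels = [classify_site(f, h) for f, h in [("S", "S"), ("S", "T"), ("Y", "F")]]
```

Note that panel C additionally requires the surrounding sequence to be 100% identical, which this residue-level check does not test.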