Bernd Bodenmiller, Johan Malmstrom, Bertran Gerrits, David Campbell, Henry Lam, Alexander Schmidt, Oliver Rinner, Lukas N Mueller, Paul T Shannon, Patrick G Pedrioli, Christian Panse, Hoo-Keun Lee, Ralph Schlapbach, Ruedi Aebersold.
Abstract
Analyzing and understanding the mechanisms by which cells process information is a key goal of systems biology research. Such mechanisms critically depend on reversible phosphorylation of cellular proteins, a process that is catalyzed by protein kinases and phosphatases. Here, we present PhosphoPep, a database containing more than 10 000 unique high-confidence phosphorylation sites mapping to nearly 3500 gene models and 4600 distinct phosphoproteins of the Drosophila melanogaster Kc167 cell line. This constitutes the most comprehensive phosphorylation map of any single source to date. To enhance the utility of PhosphoPep, we also provide an array of software tools that allow users to browse through phosphorylation sites on single proteins or pathways, to easily integrate the data with other, external data types such as protein-protein interactions, and to search the database via spectral matching. Finally, all data can be readily exported, for example, for targeted proteomics approaches, and the data thus generated can in turn be validated using PhosphoPep, supporting the iterative cycles of experimentation and analysis that are typical of systems biology research.
Year: 2007 PMID: 17940529 PMCID: PMC2063582 DOI: 10.1038/msb4100182
Source DB: PubMed Journal: Mol Syst Biol ISSN: 1744-4292 Impact factor: 11.429
Figure 1. Phosphoprotein properties. (A) Depletion/enrichment of molecular functions derived from the ‘panther’ ontology (Mi ) for the corresponding phosphoproteins (red) and for the proteins identified from the separated peptides before enrichment (yellow), relative to the FlyBase database (0%). (B) Depletion/enrichment of biological functions derived from the ‘panther’ ontology (Mi ) for the corresponding phosphoproteins (red) and for the proteins identified from the separated peptides before enrichment (yellow), relative to the FlyBase database (0%). (C) Comparison of the predicted phosphoprotein abundance (blue) with the predicted abundance (Duret and Mouchiroud, 1999) of all proteins in the FlyBase database used (pink). The scale ranges from 0 (low abundance) to 1 (high abundance). Proteins for which no molecular function or biological process could be assigned were omitted from (A) and (B). χ² test results for (A) and (B) are shown in Supplementary Table II.
Figure 2. (A) Design of the PhosphoPep database. Using the ‘Search interface’ (α), PhosphoPep can be interrogated for single proteins, a set of proteins or pathways. For each protein, several types of information, including the observed phosphopeptides, are shown on the ‘Protein information’ page (see panel B and β). Single proteins or a set of proteins can be placed into their pathways (χ). From this ‘Pathway view’, all phosphoproteins can be exported to Cytoscape (Shannon ) (δ). This software tool allows data from PhosphoPep to be integrated with external data such as protein–protein interaction networks (ɛ). For most phosphopeptides, consensus MS2 spectra (φ) are given, which can be exported for targeted proteomics experiments such as multiple reaction monitoring (Domon and Aebersold, 2006) (γ). Because we supply an online spectral matching search tool, results generated by such experiments can be validated using PhosphoPep. (B) Representative output of the PhosphoPep database. The PhosphoPep database (www.phosphopep.org) contains more than 10 000 phosphorylation sites from nearly 3500 gene models and nearly 5800 phosphoproteins derived from the FlyBase (Grumbling and Strelets, 2006) nonredundant database (r4.3). For each phosphoprotein, the phosphopeptide sequence, the protein annotation and the predicted subcellular location are shown. Additional information is given for each phosphopeptide: the probability, the number of tryptic ends, the dCn value, the mass, how often it was observed, and how many gene models and transcripts it maps to. The phosphopeptides are displayed both in the protein sequence and in a graphical representation, the protein map. Finally, links to the ‘Pathway view’, to the ‘Cytoscape export’ function and to http://scansite.mit.edu/ (Obenauer ) are given, represented by the three symbols beside the FlyBase gene entry.
Figure 3. Proteins involved in target of rapamycin (TOR) and insulin signaling. To demonstrate the usefulness of our database, we compared the previously known phosphoproteins (left) with the phosphoproteins we identified (right). Whereas the literature reports only 6 of the 15 proteins as phosphorylated, we extended the phosphorylation map to all proteins of the pathway (Hay and Sonenberg, 2004; Oldham and Hafen, 2003). Phosphorylation sites are depicted by a P in a red circle; the accompanying number indicates the number of distinct phosphorylation sites. The number of identified phosphorylation sites ranged from 1 to 20 (CHICO). Only peptides with P>0.8 and a defined phosphorylation site (dCn>0.1) were considered.
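The selection criterion stated in the Figure 3 legend (P>0.8 and dCn>0.1) can be illustrated with a minimal Python sketch. This is not PhosphoPep's own code: the `PeptideHit` record, its field names, the `passes_filter` helper and the example peptides are all hypothetical and invented for illustration. Only the two thresholds come from the legend.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    """Hypothetical record for one phosphopeptide identification."""
    sequence: str       # peptide sequence (invented examples below)
    probability: float  # identification probability P
    dcn: float          # dCn score; a higher value means a better-defined site


def passes_filter(hit: PeptideHit,
                  min_probability: float = 0.8,
                  min_dcn: float = 0.1) -> bool:
    """Return True if the hit satisfies P > 0.8 and dCn > 0.1 (strict, as in the legend)."""
    return hit.probability > min_probability and hit.dcn > min_dcn


# Invented example data, not taken from PhosphoPep.
hits = [
    PeptideHit("EXAMPLEPEPTIDEA", probability=0.95, dcn=0.25),  # passes both
    PeptideHit("EXAMPLEPEPTIDEB", probability=0.95, dcn=0.05),  # site ambiguous
    PeptideHit("EXAMPLEPEPTIDEC", probability=0.60, dcn=0.30),  # low probability
]

selected = [hit for hit in hits if passes_filter(hit)]
print([hit.sequence for hit in selected])
```

Both comparisons are strict, so a peptide with exactly P=0.8 or dCn=0.1 is excluded. This matches the ">" signs in the legend.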