P3DB: a plant protein phosphorylation database.

Jianjiong Gao, Ganesh Kumar Agrawal, Jay J Thelen, Dong Xu.

Abstract

P3DB (http://www.p3db.org/) provides a resource of protein phosphorylation data from multiple plants. The database was initially constructed with a dataset from oilseed rape, including 14,670 nonredundant phosphorylation sites from 6382 substrate proteins, representing the largest collection of plant phosphorylation data to date. Additional protein phosphorylation data are being deposited into this database from large-scale studies of Arabidopsis thaliana and soybean. Phosphorylation data from current literature are also being integrated into P3DB. With a web-based user interface, the database is browsable, downloadable and searchable by protein accession number, description and sequence. A BLAST utility was integrated and a phosphopeptide BLAST browser was implemented to allow users to query the database for phosphopeptides similar to protein sequences of interest. With the large-scale phosphorylation data and associated web-based tools, P3DB will be a valuable resource for both plant and nonplant biologists in the field of protein phosphorylation.

Year:  2008        PMID: 18931372      PMCID: PMC2686431          DOI: 10.1093/nar/gkn733

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Protein phosphorylation is the most studied posttranslational modification that controls the dynamic behaviors and decision processes in cells of various organisms. In recent years, large-scale studies on protein phosphorylation based on mass spectrometry have been conducted on different organisms. Most of these studies were undertaken in mammals and bacteria (1–5). Some of them were carried out in plants (6–8). As a result, a number of phosphorylation databases emerged, most of which focus on mammalian and prokaryotic systems. Phospho.ELM (9) contains verified eukaryotic phosphorylation sites, but most are from mammals. PHOSIDA (10) contains large-scale phosphorylation data in Homo sapiens, Bacillus subtilis and Escherichia coli. PhosphoSitePlus (http://www.phosphosite.org/) contains curated phosphorylation sites mainly in vertebrates. Some of the phosphorylation databases focus on plants. PlantsP (11) contains phosphorylation data on a few different plants, but it focuses on the annotation of plant protein kinases and protein phosphatases. PhosPhAt (12) provides a database of phosphorylation sites collected from current literature solely for the model organism Arabidopsis thaliana. P3DB is unique in that it provides a resource of protein phosphorylation sites from various plant sources and contains multiple embedded search capacities for querying the database. By collecting and annotating plant phosphorylation data from different plant sources in a single database as a ‘one-stop’ shop, we anticipate that P3DB will serve as a useful resource not only for molecular biologists to study protein phosphorylation in plants and nonplant systems by comparison, but also for bioinformaticians to develop computational prediction tools on protein phosphorylation.

DATA COLLECTION

The database was constructed with a dataset from oilseed rape (Brassica napus var. Reston) developing seed obtained using a combination of data-dependent neutral loss and multistage activation on an LTQ linear ion trap liquid chromatography tandem mass spectrometry system. Details on the experimental design, which are available on the website (P3DB V1.0 release note), and the associated results and data analysis will be published elsewhere (Agrawal et al., unpublished results). The dataset includes 14 670 nonredundant phosphorylation sites (8350 phosphoserine sites, 4750 phosphothreonine sites and 1567 phosphotyrosine sites) from 6382 substrate proteins, representing the largest collection of plant phosphorylation data to date. Experimental details about each phosphopeptide, such as charge state, cross-correlation score, peptide probability, spectrum count, spectrum plot, etc., are available in the database. More protein phosphorylation data are being deposited into this database from recently completed large-scale studies of A. thaliana (Columbia) and soybean (Glycine max var. Maverick). Phosphorylation data from other, previous investigations are also being integrated into the P3DB. For example, we have integrated a dataset published in Ref. (8) into the P3DB. Users are also encouraged to submit their own plant phosphorylation data to P3DB. Submitted data will be displayed according to the current database format with full credit given to the submitting investigators.
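As a rough illustration of how peptide-level identifications might collapse into nonredundant sites, the sketch below groups phosphopeptide records by substrate protein and residue position and then tallies the sites by residue type. The record fields, accession strings and values are hypothetical and do not reflect the actual P3DB schema or data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideObservation:
    """One phosphopeptide identification (hypothetical fields, not the P3DB schema)."""
    protein_id: str      # accession of the substrate protein
    site_position: int   # 1-based position of the phosphorylated residue in the protein
    residue: str         # 'S', 'T' or 'Y'
    spectrum_count: int  # number of spectra supporting this peptide


def nonredundant_sites(observations):
    """Collapse peptide-level records into unique (protein, position, residue) sites."""
    return {(o.protein_id, o.site_position, o.residue) for o in observations}


def count_by_residue(sites):
    """Tally nonredundant sites by phosphorylated residue type."""
    return Counter(residue for _, _, residue in sites)


# Made-up records: two peptides report the same serine on PROT_A.
observations = [
    PeptideObservation("PROT_A", 15, "S", 3),
    PeptideObservation("PROT_A", 15, "S", 1),
    PeptideObservation("PROT_A", 42, "T", 2),
    PeptideObservation("PROT_B", 7, "Y", 1),
]
sites = nonredundant_sites(observations)
residue_counts = count_by_residue(sites)
```

Four peptide records yield three sites here because the two observations of the same serine are counted once.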

ACCESS TO THE DATA

Protein phosphorylation data are stored in a MySQL relational database. With a PHP-based web graphical interface, the phosphorylation data in the database are downloadable, browsable and searchable. The entire dataset can be downloaded in a tab-delimited format. A user can browse the annotated phosphoproteins by organisms or by gene ontology categories (13). A user can search for phosphoproteins by protein identifiers (NCBI GI numbers, UniProt accession numbers or RefSeq accession numbers) or protein descriptions, and search for phosphopeptides by peptide sequences. The main page of the search result lists all phosphoproteins/peptides meeting the search criteria and gives some brief information, such as protein accession, protein description, source organism, consensus score, spectrum count, etc. The user can sort the result table according to different criteria, e.g. sort the phosphoproteins according to spectrum count from high to low. From the search result page, the user can navigate among pages of phosphoproteins, phosphopeptides and phosphorylation sites. The phosphoprotein page gives the details on the substrate protein, including the protein sequence with phosphorylation sites linked. Clicking on a phosphorylation site will display its detailed information, such as its surrounding amino acids (±10) and a list of phosphopeptides that contain this phosphorylation site. The information on each phosphopeptide is hidden by default to simplify the appearance of the entry page. Clicking on ‘Show details’ presents the information about the peptide and clicking on ‘More’ takes the user to the phosphopeptide page, which contains additional information about the peptide. Another useful feature on the website is the phosphopeptide BLAST utility as shown in Figure 1.
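Two of the operations described above can be approximated offline with a downloaded file: sorting phosphoproteins by spectrum count and extracting the ±10 residues around a site. The following is a minimal sketch only. The column names (`accession`, `description`, `spectrum_count`), the sample rows and the protein sequence are assumptions made for illustration; the actual layout of the P3DB download is not specified in the text.

```python
import csv
import io


def load_rows(tsv_text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def sort_by_spectrum_count(rows):
    """Order rows from highest to lowest spectrum count."""
    return sorted(rows, key=lambda r: int(r["spectrum_count"]), reverse=True)


def site_window(sequence, position, flank=10):
    """Return the residues within `flank` positions of a 1-based site, clipped at the termini."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the protein sequence")
    start = max(0, position - 1 - flank)
    end = min(len(sequence), position + flank)
    return sequence[start:end]


# Hypothetical download content (column names are assumptions).
tsv_text = "\n".join([
    "accession\tdescription\tspectrum_count",
    "ACC_1\thypothetical kinase\t4",
    "ACC_2\thypothetical transporter\t11",
    "ACC_3\thypothetical phosphatase\t7",
])
rows = sort_by_spectrum_count(load_rows(tsv_text))

# Made-up 35-residue sequence with a serine at position 14.
protein = "MSTAPLKRSSGEESDDEMGFGLFDKQWERTYIPAS"
window = site_window(protein, 14)
```

For a site far enough from both termini the window holds 21 residues with the site in the middle; near a terminus it is simply shorter.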
By uploading a protein sequence (Figure 1A) and querying it against the database using BLAST, a user can identify all the peptides in the query sequence that match one or more phosphopeptides in the database (according to a user-defined E-value cutoff). In the BLAST result page (Figure 1B), the BLAST alignments are displayed with links to the phosphopeptides and phosphorylation sites. In addition, we developed a browser tool that represents phosphopeptide BLAST results graphically. Figure 1C shows one example after submitting a phosphopeptide BLAST result to this tool by clicking on ‘Send to Phosphopeptide BLAST browser’ in Figure 1B. All the BLAST alignments are displayed with an E-value color scheme so that the user can see how similar the peptides in the query sequence are to the phosphopeptides. In addition, each residue in the query sequence that is aligned to one or more phosphorylation sites in the matching phosphopeptides is explicitly colored and hyperlinked, if it is serine, threonine or tyrosine. A user can also submit a protein query sequence directly to this tool under the ‘Tools’ menu. The BLAST utility and BLAST result browser do not aim to explicitly predict phosphorylation sites or phosphorylation motifs in the query protein sequences, but they do help the user gain related biological insight into the query sequences. For example, a user who has a human phosphoprotein in hand and wants to know whether similar phosphopeptides exist in plants may find this tool useful. Alternatively, if a user wants to know whether a plant protein contains phosphorylation sites, this tool may help him/her to gain some knowledge of the empirical evidence for phosphorylation based on the related sequences from the database in a conservative or semi-conservative manner.
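The rule for marking query residues, namely that a residue is highlighted only if it aligns to a known phosphorylation site and is itself serine, threonine or tyrosine, can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code behind the P3DB browser: the function name, its arguments and the example alignment are hypothetical, and the aligned strings are assumed to use `-` for gaps as in standard BLAST pairwise output.

```python
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def aligned_candidate_sites(query_aln, subject_aln, query_start, subject_start, subject_sites):
    """Return 1-based query positions aligned to subject phosphosites, kept only if S/T/Y.

    query_aln, subject_aln: equal-length aligned strings with '-' marking gaps.
    query_start, subject_start: 1-based coordinate of the first aligned residue.
    subject_sites: set of 1-based phosphosite positions in the subject phosphopeptide.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned strings must have equal length")
    hits = []
    q_pos, s_pos = query_start - 1, subject_start - 1
    for q_res, s_res in zip(query_aln, subject_aln):
        # Advance each coordinate only when that sequence contributes a residue.
        if q_res != "-":
            q_pos += 1
        if s_res != "-":
            s_pos += 1
        if q_res == "-" or s_res == "-":
            continue  # a gap column cannot pair a query residue with a site
        if s_pos in subject_sites and q_res in PHOSPHO_RESIDUES:
            hits.append(q_pos)
    return hits


# Made-up alignment: the subject peptide carries phosphosites at positions 2 and 4.
hits_gapped = aligned_candidate_sites("RSAS-EDLK", "RSPSGEDLK", 101, 1, {2, 4})
# Here the query has alanine opposite the first site, so only the second is reported.
hits_mismatch = aligned_candidate_sites("RAAS", "RSPS", 1, 1, {2, 4})
```

Tracking the two coordinates separately keeps the reported positions correct when the alignment contains gaps.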
Figure 1.

Phosphopeptide BLAST example (Query sequence: Arabidopsis ATP binding protein; Dataset: Oilseed rape phosphopeptides). (A) User interface for inputting a query protein sequence and selecting an E-value threshold. (B) BLAST result page (partial). The BLAST alignments with their E-values are displayed. The matching phosphopeptides are hyperlinked to the corresponding phosphopeptide pages, which show their detailed information. The phosphorylation sites in the matching peptides are also hyperlinked and colored red. (C) BLAST result browser. Sequences following ‘Query: #’ are from the query protein sequence. Sequences between angle brackets (<>) are the matching phosphopeptides. They are hyperlinked to the phosphopeptide pages. The peptide sequences are rendered in different colors to show the different E-values of hits, as indicated in the BLAST E-value color scheme legend. Each phosphorylation site in each phosphopeptide is also hyperlinked and colored blue, and so is the corresponding residue in the query sequence if it is ‘S’ (serine), ‘T’ (threonine) or ‘Y’ (tyrosine).

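The E-value cutoff and the E-value color scheme mentioned for the BLAST result browser amount to a filter followed by a binning step. The sketch below shows one way such a scheme could work. The thresholds, color names and hit records are assumptions chosen for illustration; the actual legend used by the browser is not given in the text.

```python
import bisect

# Hypothetical bin edges and colors (not the legend used by the P3DB browser).
THRESHOLDS = [1e-10, 1e-5, 1e-2, 1.0]
COLORS = ["darkgreen", "green", "olive", "orange", "gray"]


def filter_hits(hits, cutoff):
    """Keep only hits whose E-value does not exceed the user-defined cutoff."""
    return [h for h in hits if h["evalue"] <= cutoff]


def evalue_color(evalue):
    """Map an E-value to a display color; smaller E-values fall into earlier bins."""
    if evalue < 0:
        raise ValueError("E-values cannot be negative")
    return COLORS[bisect.bisect_left(THRESHOLDS, evalue)]


# Made-up hits with hypothetical peptide identifiers.
hits = [
    {"peptide": "PEP_1", "evalue": 1e-20},
    {"peptide": "PEP_2", "evalue": 1e-7},
    {"peptide": "PEP_3", "evalue": 0.5},
    {"peptide": "PEP_4", "evalue": 5.0},
]
kept = filter_hits(hits, cutoff=1.0)
colored = [(h["peptide"], evalue_color(h["evalue"])) for h in kept]
```

Keeping the bin edges in one sorted list means the legend and the coloring logic cannot drift apart.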

FUTURE DIRECTION

Deposit phosphorylation datasets from large-scale studies of A. thaliana and soybean, which are in the process of being annotated. Integrate more plant phosphorylation datasets from other investigators into P3DB and continue updating the database with new advances in mining and prediction analysis of plant phosphorylation. Integrate information on phosphorylation motifs and protein kinase specificity. Integrate additional information on protein phosphorylation data, such as Pfam domain, cross-species conservation data, pathway information, etc. Improve the current utilities and implement more tools, such as an advanced search tool for querying by a user-defined combination of different criteria. Predict protein structures of phosphoproteins and highlight phosphorylation sites in a web-based protein structure viewer.

FUNDING

National Science Foundation (grant number DBI-0604439 to J.T.). Funding for open access charges: National Science Foundation.

Conflict of interest statement. None declared.
  13 in total

1.  The PlantsP and PlantsT Functional Genomics Databases.

Authors:  Jason H Tchieu; Fariba Fana; J Lynn Fink; Jeffrey Harper; T Murlidharan Nair; R Hannes Niedner; Douglas W Smith; Kenneth Steube; Tobey M Tam; Stella Veretnik; Degeng Wang; Michael Gribskov
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

2.  Large-scale phosphorylation analysis of mouse liver.

Authors:  Judit Villén; Sean A Beausoleil; Scott A Gerber; Steven P Gygi
Journal:  Proc Natl Acad Sci U S A       Date:  2007-01-22       Impact factor: 11.205

3.  Quantitative phosphoproteomics of early elicitor signaling in Arabidopsis.

Authors:  Joris J Benschop; Shabaz Mohammed; Martina O'Flaherty; Albert J R Heck; Monique Slijper; Frank L H Menke
Journal:  Mol Cell Proteomics       Date:  2007-02-21       Impact factor: 5.911

4.  Global proteomic profiling of phosphopeptides using electron transfer dissociation tandem mass spectrometry.

Authors:  Henrik Molina; David M Horn; Ning Tang; Suresh Mathivanan; Akhilesh Pandey
Journal:  Proc Natl Acad Sci U S A       Date:  2007-02-07       Impact factor: 11.205

5.  Large scale identification and quantitative profiling of phosphoproteins expressed during seed filling in oilseed rape.

Authors:  Ganesh Kumar Agrawal; Jay J Thelen
Journal:  Mol Cell Proteomics       Date:  2006-07-06       Impact factor: 5.911

6.  Global, in vivo, and site-specific phosphorylation dynamics in signaling networks.

Authors:  Jesper V Olsen; Blagoy Blagoev; Florian Gnad; Boris Macek; Chanchal Kumar; Peter Mortensen; Matthias Mann
Journal:  Cell       Date:  2006-11-03       Impact factor: 41.582

7.  The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology.

Authors:  Evelyn Camon; Michele Magrane; Daniel Barrell; Vivian Lee; Emily Dimmer; John Maslen; David Binns; Nicola Harte; Rodrigo Lopez; Rolf Apweiler
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

8.  Analysis of phosphorylation sites on proteins from Saccharomyces cerevisiae by electron transfer dissociation (ETD) mass spectrometry.

Authors:  An Chi; Curtis Huttenhower; Lewis Y Geer; Joshua J Coon; John E P Syka; Dina L Bai; Jeffrey Shabanowitz; Daniel J Burke; Olga G Troyanskaya; Donald F Hunt
Journal:  Proc Natl Acad Sci U S A       Date:  2007-02-07       Impact factor: 11.205

9.  Phospho.ELM: a database of phosphorylation sites--update 2008.

Authors:  Francesca Diella; Cathryn M Gould; Claudia Chica; Allegra Via; Toby J Gibson
Journal:  Nucleic Acids Res       Date:  2007-10-25       Impact factor: 16.971

10.  PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor.

Authors:  Joshua L Heazlewood; Pawel Durek; Jan Hummel; Joachim Selbig; Wolfram Weckwerth; Dirk Walther; Waltraud X Schulze
Journal:  Nucleic Acids Res       Date:  2007-11-04       Impact factor: 16.971

