
GPS: a comprehensive www server for phosphorylation sites prediction.

Yu Xue, Fengfeng Zhou, Minjie Zhu, Kashif Ahmed, Guoliang Chen, Xuebiao Yao.

Abstract

Protein phosphorylation plays a fundamental role in most cellular regulatory pathways. Experimental identification of the substrates of protein kinases (PKs), together with their phosphorylation sites, is labor-intensive and often limited by the availability and optimization of enzymatic reactions. Recently, large-scale analysis of the phosphoproteome by mass spectrometry (MS) has become a popular approach. Experimentally, however, it is still difficult to distinguish the kinase-specific sites on the substrates. In this regard, the in silico prediction of phosphorylation sites with their specific kinases from protein primary sequences may provide guidelines for further experimental consideration and for the interpretation of MS phosphoproteomic data. A variety of such tools exist on the Internet and provide predictions for at most 30 PK subfamilies. We downloaded the verified phosphorylation sites from the public databases and curated the literature extensively for recently found phosphorylation sites. With the hypothesis that PKs in the same subfamily share similar consensus sequences/motifs/functional patterns on substrates, we clustered the 216 unique PKs into 71 PK groups, according to the BLAST results and protein annotations. We then applied the group-based phosphorylation scoring (GPS) method to the data set. Here, we present a comprehensive PK-specific prediction server, GPS, which can predict kinase-specific phosphorylation sites from protein primary sequences for 71 different PK groups. GPS has been implemented in PHP and is available on a www server at http://973-proteinweb.ustc.edu.cn/gps/gps_web/.

Year: 2005; PMID: 15980451; PMCID: PMC1160154; DOI: 10.1093/nar/gki393

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


INTRODUCTION

Protein phosphorylation is an important and dynamic type of protein modification, orchestrating a variety of cellular signaling processes. Protein kinases (PKs) account for about 2% of the human and mouse proteomes, with 518 and 540 distinct PKs identified in human (1) and mouse (2), respectively. In vivo or in vitro identification of phosphorylation sites is labor-intensive, time-consuming and often limited by the availability and optimization of enzymatic reactions. Recently, several large-scale phosphoproteomic data sets obtained by mass spectrometry (MS) have been published for yeast (3), mouse (4) and human (5). Even in these cases, however, it is still difficult to distinguish the kinase-specific sites on the substrates. The in silico prediction of phosphorylation sites with their specific PKs therefore plays an important role in this field. Most of the existing systems adopt the putative rule that protein substrates are phosphorylated at specific sites with flanking consensus sequences/motifs/functional patterns (6). Application of 3D structure conservation/similarity can significantly improve the prediction specificity (7,8), but the 3D structure information of proteins is very limited compared with the huge number of protein primary sequences available in the public databases. Thus, it would be more feasible and convenient to predict the phosphorylation sites with their specific PKs solely from the protein primary sequences. Several such systems have been implemented over the Internet. For example, DISPHOS distinguishes potential phosphorylation sites using position-specific amino acid frequencies and disorder information (9). Another system, NetPhos, outperforms the consensus sequence-based methods by training artificial neural networks on the known data set (10). However, these two systems provide little information about the corresponding PKs for the predicted phosphorylation sites. 
The enhanced version of NetPhos, NetPhosK, adds PK information for ∼17 PKs (11). Scansite (12) constructs profiles of the known phosphorylation sites of ∼20 eukaryotic PKs for prediction. In this work, we present a comprehensive PK-specific prediction server, GPS (group-based phosphorylation scoring), which can predict kinase-specific phosphorylation sites in the substrate sequence for 71 PK groups, including many newly considered PKs, such as Aurora-A, Aurora-B and NIMA (NimA-like protein kinases). The detailed algorithm of this system was described previously (13). We evaluate the sensitivity and specificity of different cut-off scores for each PK group by ‘leave-one-out’ validation. The default cut-off scores are chosen to give a balanced pair of sensitivity and specificity. We also use rat Spinophilin (O35274) as an example to illustrate the usage of the GPS server. Compared with two separate in vivo or in vitro experiments (14,15) and with two in silico phosphorylation site prediction tools, ScanSite 2.0 and NetPhosK 1.0, the GPS server provides satisfactory prediction performance. Thus, we propose that the GPS server will be useful for further research in the field of protein phosphorylation.
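
The consensus-motif rule mentioned above can be illustrated with a minimal sketch. It scans a sequence for the pattern R-R-x-S/T, a commonly cited textbook consensus for PKA substrates; this pattern, the function name and the toy peptide are assumptions made for illustration and are not taken from this paper. The sketch is not the GPS algorithm, which is described in reference (13).

```python
import re

# Assumption: R-R-x-[S/T] is used here only as a textbook-style PKA consensus.
# A lookahead is used so that overlapping motif occurrences are all reported.
PKA_LIKE_MOTIF = re.compile(r"(?=RR.([ST]))")


def motif_hits(sequence: str) -> list[str]:
    """Return sites such as 'S6' (residue letter + 1-based position)
    whose three preceding residues match R-R-x."""
    hits = []
    for match in PKA_LIKE_MOTIF.finditer(sequence.upper()):
        position = match.start(1) + 1  # convert 0-based index to 1-based
        hits.append(f"{match.group(1)}{position}")
    return hits


if __name__ == "__main__":
    # Made-up peptide, not a real protein sequence.
    print(motif_hits("MKRRASLTRRGTA"))
```

A pure motif match of this kind returns only hit or no hit for each site, which is one reason scoring-based tools report a numeric score together with a tunable cut-off.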

IMPLEMENTATION

First, we obtained the data set of phosphorylation sites with their PKs from Phospho.ELM (16), which also includes the data of PhosphoBase (17). After removing the phosphorylation sites with ambiguous PK information, we were left with 1404 instances. We also browsed recent publications and obtained ∼660 further instances. We retrieved the sequences of the PKs in the data sets from Swiss-Prot and performed a homology search in the human proteome for each PK. Next, we checked the BLAST results and protein annotations manually to validate the PK subfamily information. After clustering homologous PKs with too few known phosphorylation sites into groups, we obtained 71 PK groups with 216 unique PKs. The GPS www server is implemented in PHP + MySQL and the prediction page is shown in Figure 1. The detailed information for each PK group can be viewed by clicking on the PK group's name. Several pairs of sensitivity and specificity at different cut-off values, obtained by ‘leave-one-out’ validation, are also listed for each PK group. Users can choose their required cut-off or select zero to obtain the full list of scores for the S/T or Y sites.
Figure 1

The prediction page of GPS www server. The detailed information for each PK can be viewed by clicking on the PK names.
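
The relationship between a cut-off value and the sensitivity/specificity pair listed for each PK group can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: the scores are hypothetical and are assumed to have been produced already by a leave-one-out run, a site is called positive when its score is at or above the cut-off, and 'balanced' is interpreted here as the cut-off with the smallest gap between sensitivity and specificity.

```python
from typing import Iterable, List, Sequence, Tuple


def sens_spec(pos_scores: Sequence[float],
              neg_scores: Sequence[float],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when sites scoring >= cutoff are called positive."""
    true_pos = sum(score >= cutoff for score in pos_scores)
    true_neg = sum(score < cutoff for score in neg_scores)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


def sweep(pos_scores: Sequence[float],
          neg_scores: Sequence[float],
          cutoffs: Iterable[float]) -> List[Tuple[float, float, float]]:
    """One (cutoff, sensitivity, specificity) row per candidate cut-off."""
    return [(c, *sens_spec(pos_scores, neg_scores, c)) for c in cutoffs]


def most_balanced(rows: List[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Assumed reading of 'balanced': smallest |sensitivity - specificity|."""
    return min(rows, key=lambda row: abs(row[1] - row[2]))


if __name__ == "__main__":
    # Hypothetical scores for known sites (positives) and other S/T sites (negatives).
    positives = [3.1, 2.8, 2.6, 2.2, 1.9]
    negatives = [0.5, 1.0, 1.4, 2.0, 2.5, 0.8, 1.2, 0.3, 1.7, 2.7]
    for cutoff, sens, spec in sweep(positives, negatives, [1.5, 2.0, 2.5]):
        print(f"cut-off {cutoff}: sensitivity {sens:.1%}, specificity {spec:.1%}")
    print("most balanced:", most_balanced(sweep(positives, negatives, [1.5, 2.0, 2.5])))
```

Raising the cut-off trades sensitivity for specificity, which is why several pairs are listed per PK group rather than a single figure.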

USAGE

Here, we present the rat Spinophilin protein (Swiss-Prot accession no: O35274) as an example to demonstrate the simplicity and precision of the GPS web server. Rat Spinophilin (also called neural tissue-specific F-actin binding protein II or Neurabin-II) is an 817 amino acid, actin- and protein phosphatase-1 (PP1)-binding protein that is ubiquitously expressed but enriched in dendritic spines of rat brain and adherens junctions of rat liver (14,15,18–22). Spinophilin binds to and bundles F-actin in vivo and is involved in the regulation of dendritic spine morphogenesis (14,19). Spinophilin also modulates excitatory synaptic transmission by binding PP1, redirecting it to postsynaptic densities, and regulating its dephosphorylation activity toward glutamate receptors (18,20,21). Moreover, Spinophilin knock-out mice show impaired synaptic transmission and long-term depression. In addition, young Spinophilin-deficient mice exhibit a significant increase in spine density and enhanced filopodial formation (21). Phosphorylated Spinophilin plays important roles in spine morphology. The N-terminal 221 residues of Spinophilin (the actin-binding domain) can be phosphorylated by at least four PKs, including PKA (14), Ca2+/calmodulin-dependent PK II (CaMKII/CaM-II) (22), cyclin-dependent PK5 (Cdk5) and ERK2 (MAPK1) (15). Phosphorylation of Spinophilin by PKA and CaM-II disrupts its association with F-actin (14,22). It is proposed that ERK2 phosphorylates the actin-binding domain of Spinophilin to reduce its interaction with actin filaments (15). The phosphorylation sites on Spinophilin were experimentally identified by tryptic phosphopeptide mapping, site-directed mutagenesis, microsequencing analysis and phosphospecific antibodies (14,15,22). We scanned our training data set and found that the CaM-II sites are already used by the current GPS server. The sites for the other three PKs are not included in our training data set. 
Therefore, we chose to predict the phosphorylation sites on Spinophilin for the kinases PKA, ERK2 and Cdk5. The GPS server clusters all CDK PKs except p34Cdc2 (Cdk1) as the PK group ‘CDKs’, and all MAP kinases as the PK group ‘MAPK’. We obtained the primary sequence of rat Spinophilin (O35274) from the Swiss-Prot database and pasted it into the ‘Prediction’ section of the GPS server (Figure 1). The default parameters were chosen. The prediction result for PKA (default cut-off value: 2.4; sensitivity: 88.9%; and specificity: 90.6%) is shown in Figure 2a, and the prediction results for CDKs (default cut-off value: 2.5; sensitivity: 94.4%; and specificity: 91.7%) and MAPK (default cut-off value: 2.5; sensitivity: 83.0%; and specificity: 91.9%) are shown in Figure 2b. For comparison, the prediction results from ScanSite 2.0 and NetPhosK 1.0 are also listed in Table 1. Neither ScanSite 2.0 nor NetPhosK 1.0 provides predictions for MAPK1, so we chose p38MAPK (MAPK5) instead, since substrate specificity is very similar within the MAPK family. We used the low-stringency setting for ScanSite 2.0 and the default threshold of 0.5 for NetPhosK 1.0. The complete set of experimentally verified and predicted phosphorylation sites of Spinophilin is listed in Table 1.
Figure 2

Prediction results of GPS server for the rat Spinophilin (O35274). (a) The prediction results of kinase PKA for Spinophilin. There are 13 predicted hits (S17, S87, S94, S99, S100, S122, S126, S177, S356, S694, S756, T777 and S814). (b) The prediction results of kinases CDKs and MAPK for Spinophilin. There are seven predicted hits (S17, S131, S205, T337, S339, S635 and S658) for CDKs and five predicted hits (S17, S205, S339, S635 and S658) for MAPK.

Table 1

The experimentally verified and predicted phosphorylation sites of rat Spinophilin (O35274)

| Spinophilin (O35274) | PMID | PKA sites | Cdk5/CDKs sites | p38MAPK/MAPK sites |
|---|---|---|---|---|
| See Hsieh-Wilson et al. (14) | 12417592 | S94, S100 and S177 | - | - |
| See Futter et al. (15) | 15728359 | - | S17 | S15 and S205 |
| ScanSite 2.0 | 12824383 | S17, S59, S94, S100, S177, S694 and S756 | S17 and S339 | - |
| NetPhosK 1.0 | 15174133 | S59, S87, S99, S126, S694, S756 and S814 | S17 and S635 | S17, S635 |
| GPS server 1.10 | - | S17, S87, S94, S99, S100, S122, S126, S177, S356, S694, S756, T777 and S814 | S17, S131, S205, T337, S339, S635 and S658 | S17, S205, S339, S635 and S658 |

PKA phosphorylates Spinophilin at three major sites, S94, S100 and S177 (14). Both ScanSite and GPS predicted all three sites, whereas NetPhosK predicted none of them. Since only the N-terminal sequence (amino acids 1–221) was used in the experimental identification, PKA may have additional phosphorylation sites, of unknown function, in the rest of the Spinophilin sequence. For the N-terminal sequence of Spinophilin, ScanSite predicts five sites as positive hits (S17, S59, S94, S100 and S177) while GPS reports eight potential sites (S17, S87, S94, S99, S100, S122, S126 and S177). In the remaining region, ScanSite and NetPhosK predict two (S694 and S756) and three sites (S694, S756 and S814), respectively, while GPS reports five potential sites (S356, S694, S756, T777 and S814). Since GPS has no entries for Cdk5 and ERK2 (MAPK1), we chose their PK groups, CDKs and MAPK, instead. Both ScanSite and NetPhosK cover Cdk5 but not ERK2. Hence, we chose p38MAPK (MAPK5) as a homologous PK to predict the ERK2 sites on Spinophilin, based on the hypothesis that PKs in one subfamily exhibit very similar substrate specificity. Cdk5 phosphorylates Spinophilin at S17, and ERK2 phosphorylates it at S15 and S205. ScanSite and NetPhosK both predicted S17 for Cdk5, while GPS predicted three sites in the N-terminal region (S17, S131 and S205). For p38MAPK/MAPK, only GPS correctly predicted one of the ERK2 sites, S205. Since we used the PK groups CDKs and MAPK for the prediction, the results suggest that Spinophilin may also be phosphorylated by other CDKs and MAP kinases. The example used here shows that GPS could be a good complementary tool for experimental work and for other in silico prediction tools, e.g. ScanSite and NetPhosK.
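
The PKA comparison above can be reproduced directly from the site lists in Table 1. The following sketch uses only those lists; the variable and function names are illustrative assumptions, and the N-terminal boundary of 221 residues is the one stated in the text.

```python
# Experimentally verified PKA sites on rat Spinophilin (Table 1, reference 14).
EXPERIMENTAL_PKA = {"S94", "S100", "S177"}

# PKA predictions from each tool, copied from Table 1.
PREDICTED_PKA = {
    "ScanSite 2.0": {"S17", "S59", "S94", "S100", "S177", "S694", "S756"},
    "NetPhosK 1.0": {"S59", "S87", "S99", "S126", "S694", "S756", "S814"},
    "GPS": {"S17", "S87", "S94", "S99", "S100", "S122", "S126", "S177",
            "S356", "S694", "S756", "T777", "S814"},
}

N_TERMINAL_END = 221  # only residues 1-221 were examined experimentally


def position(site: str) -> int:
    """Numeric position of a site label such as 'S94'."""
    return int(site[1:])


def recovered(predicted: set, verified: set) -> list:
    """Verified sites that also appear among the predictions, in sequence order."""
    return sorted(predicted & verified, key=position)


def n_terminal(sites: set) -> list:
    """Predicted sites that fall inside the experimentally examined region."""
    return sorted((s for s in sites if position(s) <= N_TERMINAL_END), key=position)


if __name__ == "__main__":
    for tool, sites in PREDICTED_PKA.items():
        print(tool,
              "| recovered:", recovered(sites, EXPERIMENTAL_PKA),
              "| N-terminal hits:", len(n_terminal(sites)))
```

Predictions outside residues 1-221 cannot be scored as right or wrong from these experiments, so only the N-terminal hits are comparable across tools.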
  22 in total

1.  Sequence and structure-based prediction of eukaryotic protein phosphorylation sites.

Authors:  N Blom; S Gammeltoft; S Brunak
Journal:  J Mol Biol       Date:  1999-12-17       Impact factor: 5.469

2.  Amino acids determining enzyme-substrate specificity in prokaryotic and eukaryotic protein kinases.

Authors:  Lewyn Li; Eugene I Shakhnovich; Leonid A Mirny
Journal:  Proc Natl Acad Sci U S A       Date:  2003-04-04       Impact factor: 11.205

Review 3.  The protein kinase complement of the human genome.

Authors:  G Manning; D B Whyte; R Martinez; T Hunter; S Sudarsanam
Journal:  Science       Date:  2002-12-06       Impact factor: 47.728

4.  Structural basis and prediction of substrate specificity in protein serine/threonine kinases.

Authors:  Ross I Brinkworth; Robert A Breinl; Bostjan Kobe
Journal:  Proc Natl Acad Sci U S A       Date:  2002-12-26       Impact factor: 11.205

5.  Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs.

Authors:  John C Obenauer; Lewis C Cantley; Michael B Yaffe
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

Review 6.  Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence.

Authors:  Nikolaj Blom; Thomas Sicheritz-Pontén; Ramneek Gupta; Steen Gammeltoft; Søren Brunak
Journal:  Proteomics       Date:  2004-06       Impact factor: 3.984

7.  Phosphorylation of spinophilin modulates its interaction with actin filaments.

Authors:  Linda C Hsieh-Wilson; Fabio Benfenati; Gretchen L Snyder; Patrick B Allen; Angus C Nairn; Paul Greengard
Journal:  J Biol Chem       Date:  2002-11-01       Impact factor: 5.157

8.  Phosphorylation of spinophilin by ERK and cyclin-dependent PK 5 (Cdk5).

Authors:  Marie Futter; Ken Uematsu; Stewart A Bullock; Yong Kim; Hugh C Hemmings; Akinori Nishi; Paul Greengard; Angus C Nairn
Journal:  Proc Natl Acad Sci U S A       Date:  2005-02-22       Impact factor: 11.205

9.  Spinophilin regulates the formation and function of dendritic spines.

Authors:  J Feng; Z Yan; A Ferreira; K Tomizawa; J A Liauw; M Zhuo; P B Allen; C C Ouimet; P Greengard
Journal:  Proc Natl Acad Sci U S A       Date:  2000-08-01       Impact factor: 11.205

10.  The importance of intrinsic disorder for protein phosphorylation.

Authors:  Lilia M Iakoucheva; Predrag Radivojac; Celeste J Brown; Timothy R O'Connor; Jason G Sikes; Zoran Obradovic; A Keith Dunker
Journal:  Nucleic Acids Res       Date:  2004-02-11       Impact factor: 16.971
