Xiaofeng Wang, Renxiang Yan, Jiangning Song.
Abstract
Protein dephosphorylation, the inverse process of phosphorylation, plays a crucial role in a myriad of cellular processes, including the mitotic cycle, proliferation, differentiation, and cell growth. Compared with tyrosine kinase substrate and phosphorylation site prediction, there is a paucity of studies on computational methods for predicting protein tyrosine phosphatase substrates and dephosphorylation sites. In this work, we developed two elegant models for predicting the substrate dephosphorylation sites of three specific phosphatases, namely PTP1B, SHP-1, and SHP-2. The first predictor, MGPS-DEPHOS, is modified from the GPS (Group-based Prediction System) algorithm and is interpretable. The second predictor, CKSAAP-DEPHOS, combines a support vector machine (SVM) with the composition of k-spaced amino acid pairs (CKSAAP) encoding scheme. Benchmarking experiments using jackknife cross validation and 30 repeats of 5-fold cross validation tests show that MGPS-DEPHOS and CKSAAP-DEPHOS achieved AUC values of 0.921, 0.914 and 0.912 for predicting dephosphorylation sites of the three phosphatases PTP1B, SHP-1, and SHP-2, respectively. Both methods outperformed the previously developed kNN-DEPHOS algorithm. In addition, a web server implementing our algorithms is publicly available at http://genomics.fzu.edu.cn/dephossite/ for the research community.
Year: 2016 PMID: 27002216 PMCID: PMC4802303 DOI: 10.1038/srep23510
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
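The CKSAAP encoding named in the abstract can be illustrated with a minimal sketch. For each gap size k, it counts every residue pair separated by k positions in a peptide and divides by the number of such pairs. The alphabet (20 amino acids plus the gap character), the default `k_max`, and the function name `cksaap_encode` are assumptions made for illustration; the settings actually used for CKSAAP-DEPHOS are not stated in this record.

```python
from itertools import product

# Assumption: 20 standard amino acids plus '-' for padded positions.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
PAIRS = ["".join(p) for p in product(ALPHABET, repeat=2)]  # 21 * 21 = 441 pairs


def cksaap_encode(peptide, k_max=5):
    """Return the composition of k-spaced residue pairs for k = 0..k_max.

    The result has (k_max + 1) * 441 entries. k_max=5 is an illustrative
    default, not a value taken from the paper.
    """
    if len(peptide) < k_max + 2:
        raise ValueError("peptide is too short for the requested k_max")
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(peptide) - k - 1  # pairs whose residues are k apart
        for i in range(n_pairs):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # pairs with unknown symbols are ignored
                counts[pair] += 1
        features.extend(counts[p] / n_pairs for p in PAIRS)
    return features
```

The resulting fixed-length vector is the kind of input an SVM can consume directly, which is why a composition-style encoding pairs naturally with that classifier.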
Selected parameters for three algorithms.
| Method | kNN-DEPHOS | | MGPS-DEPHOS | | CKSAAP-DEPHOS | | |
|---|---|---|---|---|---|---|---|
| | PL | | PL | Positional weight | PL | C | |
| PTP1B | 23 | 2.9 | 23 | 0,1,3,0,1,0,2,1,2,2,1,-3,3,1,1,1,1,0,2,1,1,1,1 | 45 | 2^-7 | 2^-3 |
| SHP-1 | 25 | 5 | 25 | 3,0,1,1,1,0,0,2,1,1,2,1,6,2,2,3,2,2,0,1,0,1,1,0,1 | 47 | 2^-9 | 2^-2 |
| SHP-2 | 39 | 6 | 25 | 2,0,0,3,1,0,0,1,1,2,2,2,7,3,3,2,2,0,0,1,1,1,1,1,1 | 57 | 2^-10 | 2^-2 |
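To show how a per-position weight vector like those in the table could enter a GPS-style score, the sketch below averages the position-weighted similarity between a candidate peptide and a set of known positive peptides. This is an illustration under stated assumptions, not the MGPS-DEPHOS formula, which this record does not give. The function names are hypothetical, the clipping of negative similarities to zero is an assumption, and `identity_score` is a toy stand-in for a real substitution matrix such as BLOSUM62.

```python
def weighted_similarity(a, b, weights, sub_score):
    """Position-weighted similarity between two equal-length peptides."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("peptides and weights must have the same length")
    total = sum(w * sub_score(x, y) for x, y, w in zip(a, b, weights))
    return max(total, 0.0)  # assumption: negative similarities count as zero


def gps_style_score(candidate, positives, weights, sub_score):
    """Average weighted similarity of a candidate to the known positive peptides."""
    if not positives:
        raise ValueError("at least one positive peptide is required")
    sims = [weighted_similarity(candidate, p, weights, sub_score) for p in positives]
    return sum(sims) / len(sims)


def identity_score(x, y):
    # Toy substitution score (hypothetical): 1 for identical residues, 0 otherwise.
    return 1.0 if x == y else 0.0
```

Because each weight multiplies the contribution of a single window position, the weight vector can be read directly as a statement of which positions matter, which is one way a model of this kind stays interpretable.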
Figure 1. Sequence logo visualization for dephosphorylation sites of PTP1B, SHP-1 and SHP-2.
The gap character ‘-’ cannot be recognized by pLogo, so it is represented by X in the sequence logo.
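The gap character mentioned in the caption arises when a fixed-length peptide window extends past a protein terminus. A minimal sketch of such window extraction follows. It assumes a 0-based site index, an odd peptide length, and symmetric padding with ‘-’; the function name `extract_window` is hypothetical.

```python
def extract_window(sequence, site_index, peptide_length, gap="-"):
    """Return the peptide of odd length centred on sequence[site_index].

    Positions that fall outside the protein are filled with the gap character.
    """
    if peptide_length % 2 == 0:
        raise ValueError("peptide_length must be odd")
    if not 0 <= site_index < len(sequence):
        raise IndexError("site_index is outside the sequence")
    half = peptide_length // 2
    padded = gap * half + sequence + gap * half
    # After padding, the site sits at site_index + half, so the window
    # starts at site_index.
    return padded[site_index:site_index + peptide_length]
```

For a logo tool that rejects ‘-’, the same call with `gap="X"` would produce windows in the substituted form described in the caption.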
Figure 2. ROC curves of kNN-DEPHOS, CKSAAP-DEPHOS and MGPS-DEPHOS for the prediction of dephosphorylation sites of the three enzymes, PTP1B, SHP-1, and SHP-2.
Performance comparison at different specificity levels for PTP1B-specific dephosphorylation site prediction.
| Specificity level | High | | | Middle | | | Low | | |
|---|---|---|---|---|---|---|---|---|---|
| | SPE | SEN | MCC | SPE | SEN | MCC | SPE | SEN | MCC |
| kNN-DEPHOS | 0.955 | 0.222 | 0.191 | 0.860 | 0.571 | 0.285 | – | – | – |
| MGPS-DEPHOS | 0.900 | 0.556 | 0.335 | 0.850 | 0.683 | 0.339 | 0.800 | 0.794 | 0.344 |
| CKSAAP-DEPHOS | 0.900 | 0.619 | 0.377 | 0.850 | 0.667 | 0.330 | 0.800 | 0.746 | 0.318 |
“–” denotes that the performance value was not available at the corresponding specificity level.
Performance comparison at different specificity levels for SHP-1-specific dephosphorylation site prediction.
| Specificity level | High | | | Middle | | | Low | | |
|---|---|---|---|---|---|---|---|---|---|
| | SPE | SEN | MCC | SPE | SEN | MCC | SPE | SEN | MCC |
| kNN-DEPHOS | 0.921 | 0.44 | 0.277 | – | – | – | 0.817 | 0.68 | 0.279 |
| MGPS-DEPHOS | 0.900 | 0.600 | 0.343 | 0.850 | 0.700 | 0.327 | 0.800 | 0.780 | 0.315 |
| CKSAAP-DEPHOS | 0.900 | 0.520 | 0.293 | 0.850 | 0.680 | 0.316 | 0.800 | 0.760 | 0.304 |
“–” denotes that the performance value was not available at the corresponding specificity level.
Performance comparison at different specificity levels for SHP-2-specific dephosphorylation site prediction.
| Specificity level | High | | | Middle | | | Low | | |
|---|---|---|---|---|---|---|---|---|---|
| | SPE | SEN | MCC | SPE | SEN | MCC | SPE | SEN | MCC |
| kNN-DEPHOS | 0.933 | 0.431 | 0.315 | 0.830 | 0.706 | 0.330 | – | – | – |
| MGPS-DEPHOS | 0.900 | 0.745 | 0.458 | 0.850 | 0.804 | 0.411 | 0.800 | 0.882 | 0.394 |
| CKSAAP-DEPHOS | 0.900 | 0.569 | 0.345 | 0.850 | 0.647 | 0.319 | 0.801 | 0.745 | 0.320 |
“–” denotes that the performance value was not available at the corresponding specificity level.
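The three measures reported in the tables above (SPE, SEN, and MCC) follow from the counts of a confusion matrix using their standard definitions. The sketch below uses those definitions; the function name and the choice to return 0.0 when the MCC denominator is zero are assumptions.

```python
import math


def sen_spe_mcc(tp, fn, tn, fp):
    """Sensitivity, specificity and Matthews correlation coefficient.

    tp/fn are counts over the positive sites, tn/fp over the negative sites.
    """
    sen = tp / (tp + fn)  # fraction of true sites recovered
    spe = tn / (tn + fp)  # fraction of non-sites correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report MCC as 0.0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sen, spe, mcc
```

Fixing SPE at a common level, as the tables do, makes the SEN and MCC columns comparable across methods, because every method is then evaluated at the same false-positive rate.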
Average AUC on 5-fold cross validation.
| Method | PTP1B | SHP-1 | SHP-2 |
|---|---|---|---|
| kNN-DEPHOS | 0.785 | 0.795 | 0.825 |
| MGPS-DEPHOS | 0.853 | 0.863 | 0.877 |
| CKSAAP-DEPHOS | 0.873 | 0.841 | 0.866 |
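An average AUC over repeated 5-fold cross validation can be computed along the following lines. This sketch is not the authors' evaluation code. It computes AUC as the probability that a positive site outscores a negative one, and generates plain (unstratified) shuffled folds; the function names, the seed, and the lack of stratification are assumptions.

```python
import random


def auc(pos_scores, neg_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def repeated_kfold_indices(n_samples, k=5, repeats=30, seed=0):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(k):
            test = order[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test
```

With positive sets as small as those in this study, a stratified split that keeps the positive-to-negative ratio constant in every fold would usually be preferable; the plain split is kept here only for brevity.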
Figure 3. Influence of the peptide length and position weight tuning on the performance of the MGPS-DEPHOS algorithm.
Green triangles and blue squares represent AUC values before and after the position weight tuning, respectively.
The predictive performance between different methods for the prediction of dephosphorylation sites of the three phosphatases, PTP1B, SHP-1, and SHP-2.
| Method | PTP1B (42, 291) | SHP-1 (41, 301) | SHP-2 (37, 298) |
|---|---|---|---|
| kNN-DEPHOS | 0.725 | 0.765 | 0.806 |
| GPS-DEPHOS | 0.84 | 0.881 | 0.881 |
| CKSAAP-DEPHOS | 0.894 | 0.848 | 0.873 |
The two numbers in the parentheses indicate the size of the positive and negative data set.