Shufu Que1,2, Kuan Li2, Min Chen2, Yongfei Wang2, Qiaobin Yang2, Wenfeng Zhang1,2, Baoqian Zhang1,2, Bangshu Xiong3, Huaqin He1,2.
Abstract
BACKGROUND: As a result of the growing body of protein phosphorylation sites data, the number of phosphoprotein databases is constantly increasing, and dozens of tools are available for predicting protein phosphorylation sites to achieve fast automatic results. However, none of the existing tools has been developed to predict protein phosphorylation sites in rice.
Year: 2012 PMID: 22305189 PMCID: PMC3395875 DOI: 10.1186/1746-4811-8-5
Source DB: PubMed Journal: Plant Methods ISSN: 1746-4811 Impact factor: 4.993
Prediction performance of the element predictors on the test dataset
| Element predictor | Sn (%) | Sp (%) | ACC (%) | MCC |
|---|---|---|---|---|
| KinasePhos2.0_80 | 81.6 | 51.2 | 65.5 | 0.341 |
| KinasePhos_default | 80.2 | 57.4 | 68.1 | 0.383 |
| KinasePhos_90 | 77.0 | 62.3 | 69.2 | 0.395 |
| KinasePhos_95 | 65.8 | 73.7 | 70.0 | 0.396 |
| KinasePhos_100 | 37.6 | 89.6 | 65.1 | 0.321 |
| Scansite_low | 75.9 | 54.8 | 64.7 | 0.313 |
| Scansite_middle | 38.1 | 86.6 | 63.8 | 0.285 |
| Scansite_high | 12.8 | 96.5 | 57.1 | 0.173 |
| PredPhospho | 95.5 | 13.7 | 52.2 | 0.158 |
| DISPHOS_default | 80.6 | 59.1 | 69.2 | 0.403 |
| DISPHOS_Arabidopsis | 43.9 | 86.6 | 66.5 | 0.341 |
| DISPHOS_Eukaryotes | 41.7 | 87.5 | 66.0 | 0.331 |
| NetPhosK_0.5 | 75.9 | 46.6 | 60.4 | 0.235 |
| NetPhosK_0.7 | 17.0 | 87.9 | 54.5 | 0.070 |
| NetPhos2.0 | 70.7 | 59.9 | 65.0 | 0.307 |
Prediction performance was assessed on the dataset of rice phosphorylation sites.
The prediction performance of meta-predictors constructed by unweighted voting, unreduced weighted voting and reduced weighted voting strategies
| Predictor | ACC (%) | MCC |
|---|---|---|
| Best element predictor (DISPHOS_default) | 69.2 | 0.403 |
| Unweighted voting | 72.4 | 0.449 (1.58E-03)* |
| Best unreduced weighted voting (with weights set by ACC) | 72.5 | 0.450 (1.18E-03)* |
| Best unreduced weighted voting (with weights set by MCC) | 72.8 | 0.453 (5.4E-04)* |
| Best reduced weighted voting (with weights set by ACC) | 72.8 | 0.453 (6.0E-04)* |
| Best reduced weighted voting (with weights set by MCC) | 72.9 | 0.454 (3.4E-04)* |
* P-values in Fisher's Z-transformation test (compared with the MCC of the best element predictor) are shown in parentheses.
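The footnote refers to a Fisher Z-transformation test on MCC values. A minimal sketch of such a test follows, assuming the textbook form for comparing two independent correlation coefficients. The sample sizes, the two-sided alternative and the function name are assumptions, because this excerpt does not state them, so the sketch is not expected to reproduce the P-values in the table.

```python
from math import atanh, erfc, sqrt

def fisher_z_pvalue(r1, r2, n1, n2):
    """Two-sided P-value for H0: the two correlation coefficients are equal.

    r1, r2: correlation coefficients (here, MCC values).
    n1, n2: sample sizes behind each coefficient (assumed inputs; the
            excerpt does not say which sizes the authors used).
    """
    z1, z2 = atanh(r1), atanh(r2)                  # Fisher Z-transformation
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))     # standard error of z1 - z2
    z = (z1 - z2) / se
    return erfc(abs(z) / sqrt(2.0))                # two-sided normal tail area

# Hypothetical sample size of 1000 per predictor, for illustration only.
print(fisher_z_pvalue(0.449, 0.403, 1000, 1000))
```

The P-value shrinks as the sample size grows, so a small MCC difference can still be significant on a large test set.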
The parameters in the weighted voting meta-predictors selected by a restricted grid search and a conditional random search
| Element Predictor | Parameter selected by Restricted Grid search | Random number* | Parameter selected by conditional random search |
|---|---|---|---|
| PredPhospho | 0 | Random (1) | 0 |
| NetPhos2.0 | 1 | Random (3) | 1.23 |
| NetPhosK_0.5 | 0 | Random (1) | 0 |
| NetPhosK_0.7 | 0 | Random (1) | 0 |
| KinasePhos_default | 3 | 1+Random (4) | 2.75 |
| KinasePhos_90 | 1 | Random (3) | 2.76 |
| KinasePhos_95 | 0 | Random (1) | 0.79 |
| KinasePhos_100 | 0 | Random (1) | 0 |
| DISPHOS_default | 3 | 1+Random (4) | 4.25 |
| DISPHOS_Eukaryotes | 1 | Random (3) | 1.65 |
| DISPHOS_Arabidopsis | 1 | Random (3) | 2.22 |
| KinasePhos2.0_80 | 0 | Random (1) | 0.71 |
| Scansite_middle | 1 | Random (3) | 1.6 |
| Scansite_low | 3 | 1+Random (4) | 3.9 |
| Scansite_high | 1 | Random (3) | 2.57 |
| T value | 8 | | 13.3 |
| ACC (%) | 73.5 | | 73.8 |
| MCC | 0.469 (2.60E-06)** | | 0.474 (6.00E-07)** |
* Random (3) means the weight could vary from 0 to 3. For instance, in the restricted grid search the weight of NetPhos2.0 was 1, and the previous and next grid values were 0 and 3, respectively; its weight was therefore set as Random (3) in the conditional random search. The weight of KinasePhos_default was 3, and the previous and next grid values were 1 and 5, respectively; its weight was therefore set as '1+Random (4)' in the conditional random search.
** P-values in Fisher's Z-transformation test (compared with the MCC of the best element predictor) are shown in parentheses.
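The weights and T value in the table above can be read as a thresholded weighted vote. The following is a hedged sketch. It assumes that each element predictor casts a 0/1 vote, that a site is called positive when the weighted sum reaches T (whether the comparison is `>=` or `>` is not stated in this excerpt), and that Random (k) is a uniform draw on [0, k]. All function and variable names are hypothetical.

```python
import random

# Nonzero weights selected by the restricted grid search (from the table above).
GRID_WEIGHTS = {
    "NetPhos2.0": 1, "KinasePhos_default": 3, "KinasePhos_90": 1,
    "DISPHOS_default": 3, "DISPHOS_Eukaryotes": 1, "DISPHOS_Arabidopsis": 1,
    "Scansite_middle": 1, "Scansite_low": 3, "Scansite_high": 1,
}
GRID_T = 8  # T value reported for the restricted grid search

def weighted_vote(votes, weights, threshold):
    """Return True if the weighted sum of 0/1 votes reaches the threshold.

    Predictors missing from `votes` count as a negative (0) vote.
    The `>=` comparison is an assumption.
    """
    score = sum(w * votes.get(name, 0) for name, w in weights.items())
    return score >= threshold

# Search ranges from the "Random number" column: grid weight -> (offset, width).
RANDOM_RANGE = {0: (0.0, 1.0), 1: (0.0, 3.0), 3: (1.0, 4.0)}

def draw_conditional_weight(grid_weight, rng):
    """Draw one candidate weight around a grid value (uniform draw assumed)."""
    offset, width = RANDOM_RANGE[grid_weight]
    return offset + rng.uniform(0.0, width)

# Three predictors with weight 3 vote positive: score 9 >= 8.
votes = {"DISPHOS_default": 1, "KinasePhos_default": 1, "Scansite_low": 1}
print(weighted_vote(votes, GRID_WEIGHTS, GRID_T))  # True

rng = random.Random(0)
candidate = {n: draw_conditional_weight(w, rng) for n, w in GRID_WEIGHTS.items()}
```

A full conditional random search would repeat the draw many times and keep the candidate weights and threshold with the best ACC or MCC; that outer loop is omitted here.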
Figure 1. Receiver operating characteristic (ROC) curves of the meta-predictors in comparison to that of the best element predictor (DISPHOS_default). A larger area under the ROC curve indicates better classification performance. The areas under the ROC curves are shown in Table 4. A: ROC curve of the unweighted-voting predictor in comparison to DISPHOS_default. B: ROC curve of the restricted-grid predictor in comparison to DISPHOS_default. C: ROC curve of the random-voting predictor in comparison to DISPHOS_default. D: ROC curve of the unreduced-weighted-voting predictor in comparison to DISPHOS_default (by ACC). E: ROC curve of the unreduced-weighted-voting predictor in comparison to DISPHOS_default (by MCC). F: ROC curve of the reduced-weighted-voting predictor in comparison to DISPHOS_default (by ACC). G: ROC curve of the reduced-weighted-voting predictor in comparison to DISPHOS_default (by MCC). By ACC: the weights of the meta-predictor were selected to give the optimal ACC. By MCC: the weights of the meta-predictor were selected to give the optimal MCC.
Areas under the ROC curves for the best element predictor, meta-predictors constructed by unweighted voting, unreduced weighted voting, reduced weighted voting and weighted voting strategies.
| Predictor | Area |
|---|---|
| Best element predictor (DISPHOS_default) | 0.758 |
| Unweighted voting | 0.788 |
| Best unreduced weighted voting (with weights set by ACC) | 0.791 |
| Best unreduced weighted voting (with weights set by MCC) | 0.792 |
| Best reduced weighted voting (with weights set by ACC) | 0.791 |
| Best reduced weighted voting (with weights set by MCC) | 0.791 |
| Weighted voting (by restricted grid search) | 0.794 |
| A combination of weight voting and random | 0.796 |
The prediction performance of PhosphoRice in comparison to that of Musite
| Predictor | ACC (%) | MCC | Area under ROC curve |
|---|---|---|---|
| PhosphoRice | 72.4 | 0.474 (0.044) * | 0.796 |
| Musite | 73.8 | 0.446 | 0.793 |
* The P-value in Fisher's Z-transformation test (compared with the MCC of Musite) is shown in parentheses.
Number of phosphoserine, phosphothreonine and phosphotyrosine sites in positive and negative dataset
| Dataset | Serine | Threonine | Tyrosine | Total |
|---|---|---|---|---|
| Positive dataset | 4220 | 605 | 141 | 4966 |
| Negative dataset | 2954 | 1798 | 834 | 5586 |
Weight combinations, permutations and possible weights sum values in the restricted grid search scheme
| Weight combination* | Number of corresponding weight** |
|---|---|
| 15 × (1) | |
| 1 × (2)+13 × (1) | |
| 1 × (1)+3 × (1)+11 × (1) | |
| 1 × (4)+11 × (1) | |
| 1 × (1)+5 × (1)+9 × (1) | |
| 3 × (2)+9 × (1) | |
| 1 × (3)+3 × (1)+9 × (1) | |
| 1 × (6)+9 × (1) | |
| 1 × (1)+7 × (2) | |
| 3 × (1)+5 × (1)+7 × (1) | |
| 1 × (3)+5 × (1)+7 × (1) | |
| 1 × (2)+3 × (2)+7 × (1) | |
| 1 × (5)+3 × (1)+7 × (1) | |
| 1 × (8)+7 × (1) | |
| 5 × (3) | |
| 1 × (2)+3 × (1)+5 × (2) | |
| 1 × (5)+5 × (2) | |
| 1 × (1)+3 × (3)+5 × (1) | |
| 1 × (4)+3 × (2)+5 × (1) | |
| 1 × (7)+3 × (1)+5 × (1) | |
| 1 × (10)+5 × (1) | |
| 3 × (5) | |
| 1 × (3)+3 × (4) | |
| 1 × (6)+3 × (3) | |
| 1 × (9)+3 × (2) | |
| 1 × (12)+3 × (1) | |
| 1 × (15) | |
| Possible weight sum values | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 |
* Weight combinations are denoted as the sum of each weight value multiplied by the number of weights taking the weight value, with the weight value = 0 omitted.
** For instance, "15 × (1)" represents that 1 of the 15 weights takes the value 15, and the other 14 weights take the value 0; and "1 × (1)+3 × (1)+11 × (1)" represents that 1 of the 15 weights takes the value 1, 1 weight takes the value 3, 1 weight takes the value 11 and the remaining 12 weights take the value 0. Each weight combination corresponds to one or more weight permutations. For instance, for weight combination "15 × (1)", the weight value 15 can be taken by each of the 15 weights; thus, it corresponds to 15 weight permutations.
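The rows of the table above can be enumerated programmatically. The sketch below assumes, as an inference from the listed rows rather than a statement in the text, that each combination is a way of writing 15 as a sum of odd weight values, with all remaining weights set to 0. It also counts the permutations of each combination with a multinomial coefficient, which agrees with the footnote's example of 15 permutations for "15 × (1)". Function names are hypothetical.

```python
from collections import Counter
from math import factorial

def odd_partitions(total, max_part=None):
    """Yield the partitions of `total` into odd parts as non-increasing tuples."""
    if max_part is None:
        max_part = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, max_part), 0, -1):
        if part % 2 == 1:
            for rest in odd_partitions(total - part, part):
                yield (part,) + rest

def permutation_count(parts, n_weights=15):
    """Number of ways to assign the nonzero values in `parts` to n_weights slots."""
    zeros = n_weights - len(parts)          # weights left at 0
    denom = factorial(zeros)
    for count in Counter(parts).values():   # repeated values are interchangeable
        denom *= factorial(count)
    return factorial(n_weights) // denom

combos = list(odd_partitions(15))
print(len(combos))                      # number of weight combinations
print(permutation_count((15,)))         # the "15 × (1)" row
print(permutation_count((13, 1, 1)))    # the "1 × (2)+13 × (1)" row
```

Summing `permutation_count` over all combinations gives the total number of weight vectors a full grid search of this form would have to evaluate.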