Ji Wan, Shuli Kang, Chuanning Tang, Jianhua Yan, Yongliang Ren, Jie Liu, Xiaolian Gao, Arindam Banerjee, Lynda B M Ellis, Tongbin Li.
Abstract
Meta-predictors make predictions by organizing and processing the predictions produced by several other predictors in a defined problem domain. A proficient meta-predictor not only offers better predicting performance than the individual predictors from which it is constructed, but also relieves experimental researchers from making difficult judgments when faced with conflicting results from multiple prediction programs. As increasing numbers of prediction programs are being developed in many fields of the life sciences, there is an urgent need to investigate effective meta-prediction strategies. We compiled four unbiased phosphorylation site datasets, one for each of the four major serine/threonine (S/T) protein kinase families: CDK, CK2, PKA and PKC. Using these datasets, we examined several meta-predicting strategies with 15 phosphorylation site predictors from six predicting programs: GPS, KinasePhos, NetPhosK, PPSP, PredPhospho and Scansite. Meta-predictors constructed with a generalized weighted voting meta-predicting strategy, with parameters determined by restricted grid search, performed best, exceeding all individual predictors in predicting phosphorylation sites of all four kinase families. Our results demonstrate a useful decision-making tool for analysing the predictions of the various S/T phosphorylation site predictors. An implementation of these meta-predictors is available on the web at: http://MetaPred.umn.edu/MetaPredPS/.
Year: 2008 PMID: 18234718 PMCID: PMC2275094 DOI: 10.1093/nar/gkm848
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Procedure of compiling the MetaPS06 dataset.
Summary of the 15 element predictors
| Element predictor | Refs. | URLs |
|---|---|---|
| GPS | | |
| KinasePhos_90 | | |
| KinasePhos_95 | | |
| KinasePhos_100 | | |
| KinasePhos_bitscore | | |
| NetPhosK_0.3 | | |
| NetPhosK_0.5 | | |
| NetPhosK_0.7 | | |
| PPSP_highsens | | |
| PPSP_balanced | | |
| PPSP_highspec | | |
| PredPhospho | | |
| Scansite_low | | |
| Scansite_medium | | |
| Scansite_high | | |
For data features and classification methods, see text.
Weight combinations, permutations and possible weighted sum values in the restricted grid search parameter selection scheme
| Weight combinations | Number of corresponding weight permutations |
|---|---|
| 1 × 1 | 15 |
| | 1365 |
| | 2730 |
| | 15 015 |
| | 2730 |
| | 1365 |
| | 60 060 |
| | 45 045 |
| | 1365 |
| | 2730 |
| | 60 060 |
| | 90 090 |
| | 270 270 |
| | 45 045 |
| | 455 |
| | 90 090 |
| | 135 135 |
| | 60 060 |
| | 675 675 |
| | 360 360 |
| | 15 015 |
| | 3003 |
| | 225 225 |
| | 420 420 |
| | 75 075 |
| | 1365 |
| | 1 |
| Possible weighted sum values | |
a. Weight combinations are denoted as the sum of each weight value multiplied by the number of weights taking that value, with weight value 0 omitted. For instance, ‘1 × 1’ represents the case where one weight takes the value 1 and the other 14 weights take the value 0; a combination with two distinct non-zero values can represent the case where 2 of the 15 weights take one value, 1 weight takes the other value, and the remaining 12 weights take the value 0. Each weight combination corresponds to one or more weight permutations. For instance, for weight combination ‘1 × 1’, the weight value 1 can be taken by any of the 15 weights, so it corresponds to 15 weight permutations. Similarly, for the combination with 2 weights at one value and 1 weight at another, there are C(15, 2) × 13 = 1365 corresponding weight permutations.
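The permutation counts in this table follow a multinomial pattern: choose which of the 15 weights take each non-zero value and leave the rest at 0. The following is a minimal sketch of that count, not the authors' code. The function name `count_weight_permutations` and its `counts` argument are hypothetical; `counts` lists only how many weights share each distinct non-zero value, since the count does not depend on the values themselves.

```python
from math import factorial

N_PREDICTORS = 15  # number of element predictors stated in the source


def count_weight_permutations(counts, n=N_PREDICTORS):
    """Count the ways to assign one weight combination to n predictors.

    counts: how many weights take each distinct non-zero value
            (hypothetical encoding); the other weights are 0.
    """
    n_zero = n - sum(counts)
    if n_zero < 0 or any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative and sum to at most n")
    # Multinomial coefficient n! / (c1! * c2! * ... * n_zero!).
    result = factorial(n)
    for c in list(counts) + [n_zero]:
        result //= factorial(c)  # each partial quotient stays an integer
    return result


# '1 x 1': one weight is non-zero, the other 14 are 0.
one_weight = count_weight_permutations([1])
# Two weights share one value, one weight takes another value.
two_plus_one = count_weight_permutations([2, 1])
```

Under this encoding, `[1]` and `[2, 1]` give the two counts worked out in the footnote, and `[15]` (all weights equal) gives the single permutation in the last numeric row.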
Predicting performance of element predictors
| Element predictor | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| CDK | ||||
| GPS | 0.908 | 0.800 | 0.844 | 0.695 |
| KinasePhos_90 | 0.884 | 0.717 | 0.784 | 0.589 |
| KinasePhos_95 | 0.799 | 0.837 | 0.822 | 0.632 |
| KinasePhos_100 | 0.571 | 0.923 | 0.782 | 0.542 |
| KinasePhos_bitscore | 0.912 | 0.685 | 0.776 | 0.588 |
| NetPhosK_0.3 | 1 | 0 | 0.400 | N/A |
| NetPhosK_0.5 | 0.639 | 0.748 | 0.705 | 0.387 |
| NetPhosK_0.7 | 0.065 | 0.998 | 0.624 | 0.188 |
| PPSP_highsens | 0.983 | 0.075 | 0.438 | 0.128 |
| PPSP_balanced | 0.905 | 0.796 | 0.839 | 0.687 |
| PPSP_highspec | 0.054 | 0.982 | 0.611 | 0.100 |
| Scansite_low | 0.667 | 0.884 | 0.797 | 0.571 |
| Scansite_medium | 0.405 | 0.971 | 0.744 | 0.479 |
| Scansite_high | 0.153 | 0.993 | 0.657 | 0.290 |
| CK2 | ||||
| GPS | 0.699 | 0.895 | 0.816 | 0.613 |
| KinasePhos_90 | 0.581 | 0.904 | 0.774 | 0.523 |
| KinasePhos_95 | 0.476 | 0.950 | 0.760 | 0.504 |
| KinasePhos_100 | 0.266 | 0.985 | 0.698 | 0.386 |
| KinasePhos_bitscore | 0.594 | 0.901 | 0.778 | 0.530 |
| NetPhosK_0.3 | 0.961 | 0.525 | 0.699 | 0.506 |
| NetPhosK_0.7 | 0.245 | 1.000 | 0.698 | 0.403 |
| PPSP_highsens | 0.930 | 0.227 | 0.509 | 0.208 |
| PPSP_balanced | 0.742 | 0.933 | 0.857 | 0.700 |
| PPSP_highspec | 0.048 | 1.000 | 0.619 | 0.171 |
| PredPhospho | 0.594 | 0.959 | 0.813 | 0.616 |
| Scansite_low | 0.576 | 0.983 | 0.820 | 0.640 |
| Scansite_medium | 0.380 | 0.997 | 0.750 | 0.512 |
| Scansite_high | 0.135 | 1.000 | 0.654 | 0.293 |
| PKA | ||||
| GPS | 0.817 | 0.809 | 0.812 | 0.618 |
| KinasePhos_90 | 0.722 | 0.843 | 0.794 | 0.569 |
| KinasePhos_95 | 0.650 | 0.887 | 0.792 | 0.560 |
| KinasePhos_100 | 0.361 | 0.952 | 0.716 | 0.405 |
| KinasePhos_bitscore | 0.775 | 0.804 | 0.792 | 0.573 |
| NetPhosK_0.3 | 0.878 | 0.724 | 0.786 | 0.590 |
| NetPhosK_0.5 | 0.694 | 0.874 | 0.802 | 0.583 |
| NetPhosK_0.7 | 0.483 | 0.959 | 0.769 | 0.525 |
| PPSP_highsens | 0.967 | 0.231 | 0.526 | 0.270 |
| PPSP_balanced | 0.850 | 0.806 | 0.823 | 0.645 |
| PPSP_highspec | 0.008 | 0.998 | 0.602 | 0.048 |
| Scansite_low | 0.644 | 0.917 | 0.808 | 0.596 |
| Scansite_medium | 0.422 | 0.981 | 0.758 | 0.515 |
| Scansite_high | 0.158 | 0.991 | 0.658 | 0.288 |
| PKC | ||||
| GPS | 0.718 | 0.753 | 0.739 | 0.466 |
| KinasePhos_90 | 0.649 | 0.789 | 0.733 | 0.441 |
| KinasePhos_95 | 0.480 | 0.864 | 0.710 | 0.378 |
| KinasePhos_100 | 0.129 | 0.977 | 0.638 | 0.211 |
| KinasePhos_bitscore | 0.687 | 0.722 | 0.708 | 0.404 |
| NetPhosK_0.3 | 0.716 | 0.695 | 0.703 | 0.403 |
| NetPhosK_0.5 | 0.491 | 0.841 | 0.701 | 0.358 |
| NetPhosK_0.7 | 0.333 | 0.935 | 0.694 | 0.348 |
| PPSP_highsens | 0.954 | 0.274 | 0.546 | 0.289 |
| PPSP_highspec | 0.006 | 1.000 | 0.602 | 0.059 |
| PredPhospho | 0.598 | 0.805 | 0.722 | 0.412 |
| Scansite_low | 0.411 | 0.866 | 0.684 | 0.315 |
| Scansite_medium | 0.170 | 0.946 | 0.636 | 0.189 |
| Scansite_high | 0.069 | 0.994 | 0.624 | 0.179 |
Predicting performance assessed on MetaPS06 datasets. Element predictors having the best predicting performance are shown in italic.
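a. MCC is undefined (shown as N/A): with sensitivity 1 and specificity 0 the predictor labels every site as positive, so the denominator of the MCC is zero.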
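The four measures in this table can all be derived from a confusion matrix. The sketch below uses the standard definitions of sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC); the function name and the example counts are hypothetical and are not taken from the MetaPS06 datasets. It also shows why MCC has no value when a predictor calls every site positive.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy and MCC from raw counts.

    Assumes at least one true positive site and one true negative site
    exist, so sensitivity and specificity are always defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    # MCC is undefined when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / sqrt(denom) if denom > 0 else None
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 40 positive and 60 negative sites, all called positive.
all_positive = classification_metrics(tp=40, fn=0, tn=0, fp=60)
# Hypothetical counts for a predictor that makes both kinds of call.
mixed = classification_metrics(tp=9, fn=1, tn=8, fp=2)
```

With the all-positive counts, sensitivity is 1, specificity is 0 and `mcc` is `None`, which is the same pattern as the N/A entry in the table.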
Predicting performance of combinatorial meta-predictors
| Number of element predictors in combination | Element predictors included in best combinatorial meta-predictor | Accuracy | MCC |
|---|---|---|---|
| CDK | |||
| 6 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens, PredPhospho, Scansite_low | 0.799 | 0.575 |
| Best element predictor (PredPhospho) | | 0.853 | 0.708 |
| CK2 | |||
| 2 | NetPhosK_0.3, PPSP_balanced | 0.857 | 0.700 |
| 3 | GPS, NetPhosK_0.3, PPSP_balanced | 0.832 | 0.652 |
| 4 | GPS, NetPhosK_0.3, PPSP_highsens, Scansite_low | 0.808 | 0.621 |
| 5 | GPS, KinasePhos_90, NetPhosK_0.3, PPSP_highsens, Scansite_low | 0.778 | 0.565 |
| 6 | GPS, KinasePhos_90, NetPhosK_0.3, PPSP_highsens, PredPhospho, Scansite_low | 0.748 | 0.508 |
| Best element predictor (NetPhosK_0.5) | | 0.871 | 0.730 |
| PKA | |||
| 2 | NetPhosK_0.3, PredPhospho | 0.827 | 0.638 |
| 3 | GPS, NetPhosK_0.3, PPSP_highsens | 0.820 | 0.625 |
| 4 | GPS, NetPhosK_0.3, PPSP_highsens, PredPhospho | 0.819 | 0.619 |
| 5 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens, PredPhospho | 0.810 | 0.599 |
| 6 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens, PredPhospho, Scansite_low | 0.784 | 0.555 |
| Best element predictor (PredPhospho) | | 0.827 | 0.642 |
| PKC | |||
| 4 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens | 0.739 | 0.448 |
| 5 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens, PredPhospho | 0.717 | 0.405 |
| 6 | GPS, KinasePhos_bitscore, NetPhosK_0.3, PPSP_highsens, PredPhospho, Scansite_low | 0.676 | 0.319 |
| Best element predictor (PPSP_balanced) | | 0.741 | 0.477 |
Predicting performance assessed on MetaPS06 datasets. For each l (2 ≤ l ≤ 6), the composition and predicting performance of the best meta-predictors composed of l element predictors are shown, together with the predicting performance of the best element predictor. Meta-predictors having predicting performance exceeding that of the best element predictor are shown in italic.
*P-values in Fisher's Z-transformation test (compared with the MCC of the best element predictor) are shown in parentheses.
Predicting performance of unweighted voting, best unreduced weighted voting and best reduced weighted voting meta-predictors
| Predictor | Accuracy | MCC |
|---|---|---|
| CDK | ||
| Best element predictor (PredPhospho) | 0.853 | 0.708 |
| Unweighted voting Meta-predictor | 0.853 | 0.699 |
| CK2 | ||
| Best element predictor (NetPhosK_0.5) | 0.871 | 0.730 |
| Unweighted voting Meta-predictor | 0.809 | 0.617 |
| Best unreduced weighted voting Meta-predictor | 0.844 | 0.675 |
| Best reduced weighted voting Meta-predictor | 0.867 | 0.722 |
| PKA | ||
| Best element predictor (PredPhospho) | 0.827 | 0.642 |
| Unweighted voting Meta-predictor | 0.820 | 0.620 |
| PKC | ||
| Best element predictor (PPSP_balanced) | 0.741 | 0.477 |
| Unweighted voting Meta-predictor | 0.733 | 0.433 |
| Best unreduced weighted voting Meta-predictor | 0.744 | 0.500 |
Predicting performance assessed on MetaPS06 datasets. Meta-predictors having predicting performance exceeding that of the best element predictor are shown in italic.
*P-values in Fisher's Z-transformation test (compared with the MCC of the best element predictor) are shown in parentheses.
Predicting performance of weighted voting meta-predictors with restricted grid search of parameters
| | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| CDK | 0.912 | 0.832 | 0.864 | 0.730 (0.19) |
| CK2 | 0.878 | 0.904 | 0.893 | 0.779 (0.027) |
| PKA | 0.883 | 0.828 | 0.850 | 0.699 (0.014) |
| PKC | 0.773 | 0.791 | 0.784 | 0.558 (0.011) |
Predicting performance assessed on MetaPS06 datasets.
*P-values in Fisher's Z-transformation test (compared with the MCC of the best element predictor) are shown in parentheses.
Areas under the ROC curves for the six element predicting programs and the weighted voting meta-predictor with restricted grid search
| | CDK | CK2 | PKA | PKC |
|---|---|---|---|---|
| GPS | 0.8761 | 0.8130 | 0.8446 | 0.7574 |
| KinasePhos | 0.8713 | 0.7508 | 0.8234 | 0.7440 |
| NetPhosK | 0.7767 | 0.9307 | 0.8749 | 0.7581 |
| PPSP | 0.8721 | 0.8767 | 0.8860 | 0.7994 |
| PredPhospho | 0.8670 | 0.7791 | 0.8537 | 0.7149 |
| Scansite | 0.7584 | 0.7734 | 0.7656 | 0.6397 |
| Maximum of the six programs | GPS 0.8761 | NetPhosK 0.9307 | PPSP 0.8860 | PPSP 0.7994 |
| Meta-predictor (weighted voting with restricted grid search) | 0.8956 | 0.9313 | 0.8946 | 0.8247 |
Minimal and maximal improvements in accuracy and MCC achieved by the weighted voting meta-predictor with restricted grid search over the best predictor of each element predicting program
| | Minimal improvement in accuracy | Maximal improvement in accuracy | Minimal improvement in MCC | Maximal improvement in MCC |
|---|---|---|---|---|
| GPS | 0.020 | 0.077 | 0.035 | 0.166 |
| KinasePhos | 0.042 | 0.115 | 0.098 | 0.249 |
| NetPhosK | 0.022 | 0.159 | 0.049 | 0.343 |
| PPSP | 0.025 | 0.041 | 0.043 | 0.081 |
| PredPhospho | 0.011 | 0.080 | 0.022 | 0.163 |
| Scansite | 0.042 | 0.100 | 0.103 | 0.243 |
Minimal and maximal improvements in accuracy and MCC were calculated across the four datasets for CDK, CK2, PKA and PKC kinase families.
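As a check on how these ranges are formed, the GPS row can be recomputed from the accuracy and MCC values reported for GPS and for the weighted voting meta-predictor with restricted grid search. This is a sketch of the arithmetic only; the function name `improvement_range` is hypothetical. GPS has a single setting, so it is its own best predictor; for a program with several settings, the best setting per kinase family would be selected first.

```python
FAMILIES = ("CDK", "CK2", "PKA", "PKC")

# Values copied from the performance tables in this record.
meta_accuracy = {"CDK": 0.864, "CK2": 0.893, "PKA": 0.850, "PKC": 0.784}
meta_mcc = {"CDK": 0.730, "CK2": 0.779, "PKA": 0.699, "PKC": 0.558}
gps_accuracy = {"CDK": 0.844, "CK2": 0.816, "PKA": 0.812, "PKC": 0.739}
gps_mcc = {"CDK": 0.695, "CK2": 0.613, "PKA": 0.618, "PKC": 0.466}


def improvement_range(meta, element):
    """Return (minimal, maximal) improvement across the four families."""
    # Round to three decimals to match the precision of the tables.
    diffs = [round(meta[f] - element[f], 3) for f in FAMILIES]
    return min(diffs), max(diffs)


gps_accuracy_range = improvement_range(meta_accuracy, gps_accuracy)
gps_mcc_range = improvement_range(meta_mcc, gps_mcc)
```

The two tuples reproduce the GPS row of the table: 0.020 to 0.077 for accuracy and 0.035 to 0.166 for MCC.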
Parameters selected by the restricted grid search in the weighted voting meta-predictors
| | GPS | KinasePhos_90 | KinasePhos_95 | KinasePhos_100 | KinasePhos_bitscore | NetPhosK_0.3 | NetPhosK_0.5 | NetPhosK_0.7 | PPSP_highsens | PPSP_balanced | PPSP_highspec | PredPhospho | Scansite_low | Scansite_medium | Scansite_high | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CDK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ||||||||
| CK2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ||||||||
| PKA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ||||||||
| PKC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
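The parameters in this table are one weight per element predictor plus a threshold T. A weighted voting decision of this kind can be sketched as follows, assuming that each element predictor emits a binary call (1 for a predicted phosphorylation site, 0 otherwise) and that a site is called positive when the weighted sum reaches T. Both the decision rule and the numbers in the example are assumptions for illustration; the non-zero weights and thresholds of the actual meta-predictors did not survive in this record.

```python
def weighted_vote(predictions, weights, threshold):
    """Combine binary element predictions into one meta-prediction.

    predictions: 0/1 calls, one per element predictor
    weights:     one non-negative weight per element predictor
    threshold:   T; the site is called positive if the weighted sum >= T
                 (the >= comparison is an assumption of this sketch)
    Returns (is_positive, weighted_sum).
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    score = sum(w * p for w, p in zip(weights, predictions))
    return score >= threshold, score


# Hypothetical three-predictor example; not the parameters from the table.
example_weights = [0.5, 0.25, 0.25]
called_positive, score = weighted_vote([1, 0, 1], example_weights, threshold=0.5)
called_negative, low_score = weighted_vote([0, 1, 0], example_weights, threshold=0.5)
```

Setting a weight to 0 removes that element predictor from the vote, which is how a grid search over weights can also act as predictor selection.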