Bandana Kumari, Ravindra Kumar, Manish Kumar.
Abstract
Protein palmitoylation is the covalent attachment of the 16-carbon fatty acid palmitate to a cysteine residue. It is the most common acylation of proteins and occurs only in eukaryotes. Palmitoylation plays an important role in regulating protein subcellular localization, stability, translocation to lipid rafts and many other protein functions. Hence, accurate prediction of palmitoylation sites can help in understanding the molecular mechanism of palmitoylation and in designing related experiments. Here we present a novel in silico predictor called 'PalmPred' that identifies palmitoylation sites from protein sequence information using a support vector machine model. The best performance of PalmPred was obtained by incorporating sequence conservation features of peptides of window size 11 using a leave-one-out approach. This achieved an accuracy of 91.98%, sensitivity of 79.23%, specificity of 94.30%, and Matthews Correlation Coefficient of 0.71. PalmPred outperformed the existing palmitoylation site prediction methods IFS-Palm and WAP-Palm on an independent dataset. Based on these measures, it can be anticipated that PalmPred will be helpful in identifying candidate palmitoylation sites. All the source datasets, the standalone version and the web server are available at http://14.139.227.92/mkumar/palmpred/.
Year: 2014 PMID: 24586628 PMCID: PMC3929663 DOI: 10.1371/journal.pone.0089246
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Performance of SVM on different window sizes.
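As a rough illustration of what a window size means for a cysteine-centred predictor, the sketch below cuts a fixed-length peptide around every cysteine in a sequence. The function name `cysteine_windows`, the example sequence and the use of `X` to pad windows that run past the termini are assumptions made for illustration; the source does not describe how terminal cysteines were handled.

```python
def cysteine_windows(sequence, window=11, pad="X"):
    """Return (1-based position, peptide) for every cysteine in `sequence`.

    Each peptide has `window` residues with the cysteine in the centre.
    Padding with `pad` at the termini is an assumption, not the source's rule.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the cysteine can sit in the centre")
    half = window // 2
    padded = pad * half + sequence + pad * half
    peptides = []
    for index, residue in enumerate(sequence):
        if residue == "C":
            # In the padded string the cysteine sits at index + half,
            # so the window starts at index and spans `window` residues.
            peptides.append((index + 1, padded[index:index + window]))
    return peptides


# Hypothetical sequence used only to exercise the function.
example = cysteine_windows("MGCTLSAEDKAAVERSKC")
```

Each peptide would then be encoded, in PalmPred's case with sequence conservation features, before being passed to the classifier.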
Performance of the PSSM-based SVM model.
| Threshold | Sensitivity | Specificity | Accuracy | MCC | False Positive Rate (%) (100-specificity) |
| −1 | 94.20 | 36.93 | 45.73 | 0.24 | 63.07 |
| −0.9 | 92.75 | 60.18 | 65.18 | 0.38 | 39.82 |
| −0.8 | 89.37 | 73.77 | 76.17 | 0.47 | 26.23 |
| −0.7 | 88.89 | 81.05 | 82.26 | 0.55 | 18.95 |
| −0.6 | 85.51 | 86.49 | 86.34 | 0.60 | 13.51 |
| −0.5 | 81.64 | 90.88 | 89.46 | 0.65 | 9.12 |
| **−0.4** | **79.23** | **94.30** | **91.98** | **0.71** | **5.70** |
| −0.3 | 72.95 | 95.96 | 92.43 | 0.70 | 4.04 |
| −0.2 | 67.63 | 96.75 | 92.28 | 0.69 | 3.25 |
| −0.1 | 58.94 | 97.63 | 91.69 | 0.65 | 2.37 |
| 0 | 53.62 | 98.25 | 91.39 | 0.63 | 1.75 |
| 0.1 | 49.28 | 98.60 | 91.02 | 0.61 | 1.40 |
| 0.2 | 45.89 | 98.86 | 90.72 | 0.59 | 1.14 |
| 0.3 | 39.61 | 98.95 | 89.83 | 0.55 | 1.05 |
| 0.4 | 38.16 | 99.12 | 89.76 | 0.54 | 0.88 |
| 0.5 | 33.82 | 99.21 | 89.16 | 0.51 | 0.79 |
| 0.6 | 27.54 | 99.47 | 88.42 | 0.46 | 0.53 |
| 0.7 | 21.26 | 99.47 | 87.45 | 0.40 | 0.53 |
| 0.8 | 17.87 | 99.65 | 87.08 | 0.37 | 0.35 |
| 0.9 | 13.04 | 99.82 | 86.49 | 0.32 | 0.18 |
| 1 | 8.70 | 99.91 | 85.89 | 0.26 | 0.09 |
The selected performance of the SVM model is shown in bold.
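The columns of this table all follow from a confusion matrix, and a minimal sketch of that arithmetic is given below. The source does not report raw counts; the counts used here (207 palmitoylated and 1140 non-palmitoylated cysteines) are hypothetical values chosen because they reproduce the PalmPred figures quoted in the abstract (sensitivity 79.23%, specificity 94.30%, accuracy 91.98%, MCC 0.71). The function name `confusion_metrics` is likewise an assumption.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Compute the table's measures from confusion-matrix counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
        # The last column of the table is simply 100 - specificity.
        "false_positive_rate": 100.0 - specificity,
    }


# Hypothetical counts (not stated in the source), picked to match the
# percentages reported for the selected PalmPred model.
selected = confusion_metrics(tp=164, fn=43, tn=1075, fp=65)
```

Raising the decision threshold moves predictions from the positive to the negative class, which is why sensitivity falls and specificity rises down the table.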
Performance of IFS-Palm and PalmPred on the training dataset (Dtrain) using the leave-one-out cross-validation (LOOCV) approach.
| Predictor | Sensitivity | Specificity | Accuracy | MCC |
| IFS-Palm | 68.60 | 94.65 | 90.65 | 0.64 |
| PalmPred | 79.23 | 94.30 | 91.98 | 0.71 |
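For readers unfamiliar with LOOCV, the sketch below shows the general procedure: each sample is held out once, a model is trained on the rest, and the held-out sample is predicted. To stay dependency-free it uses a 1-nearest-neighbour classifier as a stand-in; this is not the SVM used by PalmPred, and the names `leave_one_out`, `fit_1nn`, `predict_1nn` and the toy data are all assumptions for illustration.

```python
def leave_one_out(samples, labels, fit, predict):
    """Return one held-out prediction per sample."""
    predictions = []
    for i in range(len(samples)):
        # Train on everything except sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions


def fit_1nn(train_x, train_y):
    """Stand-in learner: a 1-nearest-neighbour model just stores its data."""
    return list(zip(train_x, train_y))


def predict_1nn(model, x):
    """Predict the label of the closest stored sample (squared Euclidean)."""
    nearest = min(model, key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)))
    return nearest[1]


# Toy one-dimensional data, purely illustrative.
toy_x = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
toy_y = [0, 0, 0, 1, 1, 1]
held_out_predictions = leave_one_out(toy_x, toy_y, fit_1nn, predict_1nn)
```

The held-out predictions can then be tallied into a confusion matrix to obtain sensitivity, specificity, accuracy and MCC.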
Performance of CKSAAP-Palm, IFS-Palm and PalmPred on the independent dataset (D1ind) of 19 proteins.
| Predictors | Sensitivity | Specificity | Accuracy | MCC |
| CKSAAP-Palm | 62.96 | 86.50 | 83.16 | 0.43 |
| IFS-Palm | 92.59 | 98.77 | 97.89 | 0.91 |
| PalmPred | 96.30 | 98.77 | 98.42 | 0.94 |
*The values for all measurement categories were taken from Hu et al. 2011.
Comparative study of cysteine palmitoylation sites in yeast proteins. This dataset is referred to as D2ind in the text.
| Protein | Uniprot ID | Uniprot annotation | Experimentally identified sites | IFS-Palm | WAP-Palm | PalmPred |
| TVP18 | A6ZMD0 | – | – | – | – | 78 |
| HIP1 | P06775 | – | 603 | 339, 463 | 339 | – |
| RHO2 | P06781 | 188* | 188 | 188 | – | 188 |
| NUC1 | P08466 | – | – | – | – | – |
| TUB1 | P09733 | – | – | – | – | 14 |
| GPA2 | P10823 | 4 | – | 4 | – | 4 |
| GAP1 | P19145 | – | – | 286 | – | – |
| YCK1 | P23291 | 537#, 538# | – | 537, 538 | – | 537, 538 |
| YCP4 | P25349 | 243* | – | 243 | – | 243 |
| AGP1 | P25376 | 633# | – | 469 | 172, 266 | – |
| SYN8 | P31377 | 238* | 238 | – | – | 238 |
| MLF3 | P32047 | – | – | – | – | 2 |
| SSO1 | P32867 | – | 266 | – | – | 266 |
| SNC2 | P33328 | 94* | 94 | 94 | 94 | 94 |
| YKT6 | P36015 | 196# | – | 196 | – | 196 |
| YKL047W | P36090 | – | – | 516 | – | 516 |
| BAP2 | P38084 | – | 609 | – | – | – |
| VAP1 | P38085 | – | 619 | 318, 412 | – | – |
| YBR016W | P38216 | – | – | 110, 119, 122 | – | 119 |
| TAT2 | P38967 | – | – | 489 | – | – |
| AKR1 | P39010 | – | – | 663 | 533, 667 | 533, 663, 667 |
| MNN1 | P39106 | – | 17 | – | – | – |
| SSO2 | P39926 | – | 270, 274 | – | – | 270 |
| YCK3 | P39962 | 517*, 518*, 519*, 520*, 522*, 523*, 524* | – | 84, 517, 518, 519, 522, 524 | – | 517, 518, 519, 520, 522, 523 |
| VAC8 | P39968 | 4*, 5*, 7* | – | 4, 5, 7, 106, 144 | 106 | 4, 5, 7 |
| HEM14 | P40012 | – | – | 104, 435 | – | – |
| LBS6 | P42951 | – | – | 217, 223, 531 | – | 217, 223 |
| MNN11 | P46985 | – | 35 | – | – | – |
| MSE1 | P48525 | – | – | 413 | 502 | 12 |
| GNP1 | P48813 | – | 663 | 193, 312 | 201 | – |
| MNN10 | P50108 | – | 44 | 263, 362 | – | – |
| YGL108C | P53139 | 4* | – | 4 | – | 4 |
| RHO3 | Q00245 | – | 5 | – | 130 | 5 |
| MEH1 | Q02205 | 7*, 8* | – | 7, 8 | – | 7, 8 |
| TLG1 | Q03322 | 205*, 206* | 205, 206 | – | – | 205 |
| YLR326W | Q06170 | – | – | 79, 80, 81 | 80 | 79, 80, 81 |
| SNA4 | Q07549 | 2*, 3*, 5*, 7*, 8* | – | – | – | 2, 3, 5, 7, 34 |
| PSR1 | Q07800 | 9 | – | 10 | 10 | 9, 10 |
| YLR001C | Q07895 | – | 780 | 780 | 504 | 780 |
| PSR2 | Q07949 | 9 | – | 9, 10 | 10 | 9, 10 |
| TLG2 | Q08144 | | 317, 325 | – | – | 316 |
| YPL199C | Q08954 | – | – | 235 | – | 233, 235 |
| SAM3 | Q08986 | – | – | 268, 321 | 321 | – |
| YPL236C | Q12003 | 13*, 14*, 15* | – | 14, 15 | 13, 14, 159 | 13, 14, 15 |
| PIN2 | Q12057 | – | 35, 41, 53 | 66, 79, 81, 82, 84 | 66, 81, 82 | 53, 66, 79, 81, 82, 84 |
| VAM3 | Q12241 | – | 262, 274 | – | – | 262 |
, * and # denote palmitoylated cysteines annotated as ‘probable’, ‘By similarity’ and ‘potential’ in Uniprot, respectively.
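One way to read a row of this table is to compare each predictor's sites with the annotated sites as sets. The sketch below does this for the YCK3 row, using the positions listed in the table; the helper name `compare_sites` and the output keys are assumptions for illustration.

```python
def compare_sites(reference, predicted):
    """Split predicted positions into recovered, missed and extra sites."""
    reference, predicted = set(reference), set(predicted)
    return {
        "recovered": sorted(reference & predicted),
        "missed": sorted(reference - predicted),
        # "extra" sites are unannotated, which does not by itself make them wrong.
        "extra": sorted(predicted - reference),
    }


# YCK3 (P39962): Uniprot-annotated sites and predictions copied from the table.
yck3_annotated = [517, 518, 519, 520, 522, 523, 524]
yck3_predictions = {
    "IFS-Palm": [84, 517, 518, 519, 522, 524],
    "PalmPred": [517, 518, 519, 520, 522, 523],
}
yck3_report = {
    name: compare_sites(yck3_annotated, sites)
    for name, sites in yck3_predictions.items()
}
```

For this row, PalmPred recovers six of the seven annotated cysteines with no additional sites, while IFS-Palm recovers five and adds position 84.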
Performance of different machine learning classifiers.
| Classifiers | Sn (LOOCV) | Sp (LOOCV) | Acc (LOOCV) | MCC (LOOCV) | Sn (D1ind) | Sp (D1ind) | Acc (D1ind) | MCC (D1ind) |
| Naïve Bayes | 79.60 | 74.50 | 79.58 | 0.44 | 82.80 | 81.70 | 82.63 | 0.51 |
| RBF Network | 85.00 | 49.00 | 85.00 | 0.37 | 82.10 | 60.00 | 82.11 | 0.37 |
| Random Forest | 85.20 | 21.40 | 85.23 | 0.19 | 89.50 | 36.50 | 89.47 | 0.48 |
| Support Vector Machine | 79.23 | 94.30 | 91.98 | 0.71 | 96.30 | 98.77 | 98.42 | 0.94 |
Sn, Sp, Acc and MCC represent Sensitivity, Specificity, Accuracy and Matthews Correlation Coefficient respectively.
Prediction performance of PalmPred on the dataset taken from Nishimura and Linder 2013 (referred to as D3ind).
| Protein | Uniprot ID | Total no. of cysteines in protein | Experimentally identified sites | PalmPred |
| bCdc42 | P60953 | 7 | 188 | – |
| Wrch-1 | Q7L0Q8 | 12 | 256 | 256 |
| RalA | P11233 | 3 | 203 | – |
| RalB | P11234 | 2 | 203 | – |
| PRL-1 | Q93096 | 6 | – | 104, 171 |
| PRL-2 | Q12974 | 7 | – | 101 |
| PRL-3 | O75365 | 6 | 170 | 171 |
| PDE6α | P16499 | 15 | – | – |
| PDE6β | P23440 | 21 | – | – |
| PLA2γ | Q9UP65 | 7 | – | 539 |
Prediction of PalmPred on dataset D4ind taken from Oku et al. 2013.
| Protein | Uniprot ID | Total no. of cysteines in protein | Putative palmitoylation sites | Experimental confirmation | PalmPred |
| TARPγ-2 | O88602 | 6 | 121 | + | 68, 121 |
| TARPγ-8 | Q8VHW2 | 7 | 144 | + | 90, 91, 144 |
| Cornichon-2 | O35089 | 8 | 9 | + | 84 |
| CaMKIIα | P11798 | 10 | 6 | + | – |
| Kalirin7 | A2CG49 | 55 | 1404 | – | 417, 989, 1334, 2508 |
| Homer1C | Q9Z2Y3 | 2 | 365 | – | – |
| Neurochondrin | Q9Z0E0 | 25 | 3, 4 | + | 3, 4, 292, 647, 348 |
| Rab3A | P63011 | 4 | 220 | – | 218, 220 |
| Syd-1 | Q9DBZ9 | 13 | 736 | + | 346, 360 |
| Liprin-α2 | Q8BSS9 | 9 | 3 | – | – |
| KIF5C | P28738 | 10 | 7 | – | 303, 304 |
| TRPM8 | Q8R4D5 | 26 | 1032 | + | 780, 1028, 1031, 1032, 1033 |
| TRPC1 | Q61056 | 19 | 736 | + | 198, 367, 692, 703 |
| Orexin 2 receptor | P58308 | 14 | 381 | + | 381, 382 |
| Paxillin | Q8VI36 | 25 | 591 | – | – |
| Zyxin | Q62523 | 23 | 404 | + | – |
| Par3 | Q99NH2 | 12 | 6 | – | – |
Figure 2. The basic architecture of PalmPred.