Sonu Kumar, Bram J van Raam, Guy S Salvesen, Piotr Cieplak.
Abstract
Caspases are enzymes belonging to a conserved family of cysteine-dependent aspartate-specific proteases that are involved in vital cellular processes and play a prominent role in apoptosis and inflammation. Determining all relevant protein substrates of caspases remains a challenging task. Over 1500 caspase substrates have been discovered in the human proteome according to published data, and new substrates are discovered on a daily basis. To aid the discovery process, we developed a caspase cleavage prediction method using the recently published curated MerCASBA database of experimentally determined caspase substrates and a Random Forest classification method. On both internal and external test sets, the ranking of predicted cleavage positions is superior to all previously developed prediction methods. The in silico predicted caspase cleavage positions in human proteins are available from a relational database: CaspDB. Our database provides information about potential cleavage sites in a verified set of all human proteins collected in UniProt and their orthologs, allowing for tracing of cleavage motif conservation. It also provides information about the positions of disease-annotated single nucleotide polymorphisms and posttranslational modifications that may modulate the caspase cleavage efficiency.
Year: 2014 PMID: 25330111 PMCID: PMC4201543 DOI: 10.1371/journal.pone.0110539
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Quality measures of the trained classifiers and comparison with publicly available prediction models [44].
| Method | TP | FN | FP | TN | Kappa | AUC | Cost | ACC | PRC | SPC | MCC |
| RF | 191 | 6 | 5 | 788 | 0.97 | 0.998 | 8 | 98.89 | 97.0 | 99.0 | 0.97 |
| NB | 197 | 0 | 13 | 780 | 0.96 | 0.999 | 10 | 98.69 | 94.0 | 98.0 | 0.96 |
| J48 | 189 | 8 | 4 | 789 | 0.96 | 0.958 | 12 | 98.79 | 98.0 | 99.0 | 0.97 |
| SMO | 191 | 6 | 7 | 786 | 0.96 | 0.998 | 10 | 98.69 | 97.0 | 99.0 | 0.96 |
| | 644 | 16 | 24 | 2614 | 0.96 | 0.999 | 36 | 98.79 | 96.0 | 99.0 | 0.96 |
| PeptideCutter | | | | | | | | 50.8 | 63.0 | 97.7 | 0.05 |
| GraBCas | | | | | | | | 67.7 | 67.6 | 67.5 | 0.35 |
| CASVM P4-P1 | | | | | | | | 62.3 | 83.0 | 93.7 | 0.32 |
| CASVM P4-P2’ | | | | | | | | 72.7 | 73.1 | 73.6 | 0.45 |
| CASVM P14-P10’ | | | | | | | | 83.1 | 81.6 | 80.1 | 0.66 |
Abbreviations: TP, number of true positives; FN, false negatives; FP, false positives; TN, true negatives; ACC, accuracy; PRC, precision; SPC, specificity; MCC, Matthews correlation coefficient; Kappa, Kappa statistic; AUC, area under the ROC curve; RF, Random Forest; NB, Naïve Bayes; J48, decision tree algorithm; SMO, Sequential Minimal Optimization.
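The count-based measures in the table can be recomputed from the four confusion-matrix entries. The following is a minimal sketch using the standard textbook definitions; the function name `confusion_metrics` is hypothetical, and treating "Kappa" as Cohen's kappa is an assumption, since the exact definition used is not stated here.

```python
from math import sqrt


def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification measures from confusion-matrix counts.

    Assumption: "Kappa" is taken to be Cohen's kappa; the source does not
    spell out the formula it used.
    """
    total = tp + fn + fp + tn
    acc = 100.0 * (tp + tn) / total   # accuracy, in percent
    prc = 100.0 * tp / (tp + fp)      # precision, in percent
    spc = 100.0 * tn / (tn + fp)      # specificity, in percent
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement
    observed = (tp + tn) / total
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return {"ACC": acc, "PRC": prc, "SPC": spc, "MCC": mcc, "Kappa": kappa}


# Counts taken from the RF row of the table above.
rf = confusion_metrics(tp=191, fn=6, fp=5, tn=788)
```

For the RF row, this gives an accuracy of 98.89%, in agreement with the tabulated ACC value, and MCC and Kappa values that both round to 0.97.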
List of experimentally confirmed caspase substrates not included in the RF training set.
| UniProt name | Caspase | P1 | P5-P5’ | Cascleave 2.0 score | CaspDB score | PubMed ID |
| ACTB_HUMAN | Casp-1 | 244 | | 0.512 | 0.939 | 9070648 |
| G3P_HUMAN | Casp-1 | 189 | | 0.328 | 0.943 | 17959595 |
| CING_HUMAN | Casp-3 | 173 | | 0.334 | 0.978 | 20058249 |
| ASM_HUMAN | Casp-7 | 251 | | 0.224 | 0.645 | 21157428 |
| AT2B2_HUMAN | Casp-7 | 1117 | | 0.282 | 0.912 | 12107825 |
| COF1_HUMAN | Casp-6 | 17 | | 0.338 | 0.721 | 18487604 |
| BAG3_HUMAN | Casp-3 | 215 | | 0.473 | 0.984 | 20232307 |
| RUNX1_HUMAN | Casp-2 | 99 | | n.d. | 0.961 | 24527765 |
| MYD88_HUMAN | Casp-3 | 135 | | 0.494 | 0.978 | 24363429 |
| KDM4C_HUMAN | Casp-3 | 396 | | 0.746 | 0.999 | 24952432 |
| BMR1B_HUMAN | Casp-3 | 50 | | 0.248 | 0.937 | 21368862 |
| | | 120 | | 0.448 | 0.992 | |
| KKCC1_HUMAN | Casp-3 | 32 | | 0.671 | 0.992 | 21368862 |
| CSK_HUMAN | Casp-3 | 409 | | 0.660 | 0.907 | 21368862 |
| AKT2_HUMAN | Casp-3 | 121 | | 0.714 | 0.943 | 21368862 |
| KC1G1_HUMAN | Casp-3 | 343 | | 0.218 | 0.987 | 21368862 |
| EF2K_HUMAN | Casp-3 | 14 | | 0.574 | 0.976 | 21368862 |
| | | 430 | | 0.518 | 0.953 | |
| MK12_HUMAN | Casp-3 | 46 | | 0.519 | 0.834 | 21368862 |
| MKNK2_HUMAN | Casp-3 | 32 | | 0.429 | 0.931 | 21368862 |
| | | 58 | | 0.386 | 0.973 | |
| PIM2_HUMAN | Casp-3 | 198 | | 0.481 | 0.871 | 21368862 |
| KPCI_HUMAN | Casp-3 | 6 | | 0.474 | 0.987 | 21368862 |
| TRIB3_HUMAN | Casp-3 | 338 | | 0.467 | 0.985 | 21368862 |
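The P1 and P5-P5’ columns refer to the position of the residue immediately before the scissile bond and to the ten-residue window around it. As an illustration only, the sketch below extracts such a window from a sequence given a 1-based P1 position and lists candidate sites, assuming the canonical requirement for Asp (D) at P1. The function names, the padding character, and the toy sequence are all hypothetical; this is not the feature extraction used for CaspDB.

```python
def p5_window(sequence, p1, flank=5, pad="-"):
    """Return the P5-P5' window around a 1-based P1 position.

    The window spans P5..P1 (five residues ending at P1) followed by
    P1'..P5' (the next five residues). Positions that fall outside the
    sequence are filled with `pad`.
    """
    if not 1 <= p1 <= len(sequence):
        raise ValueError("P1 position lies outside the sequence")
    start = p1 - flank  # 0-based index of P5
    end = p1 + flank    # 0-based index just past P5'
    left = pad * max(0, -start)
    right = pad * max(0, end - len(sequence))
    return left + sequence[max(0, start):min(len(sequence), end)] + right


def candidate_sites(sequence):
    """1-based positions of Asp (D), assumed here as the only allowed P1."""
    return [i + 1 for i, aa in enumerate(sequence) if aa == "D"]


# Toy sequence for illustration; it is not a real protein.
toy = "MKTAYDEVDGSLKQD"
windows = {p1: p5_window(toy, p1) for p1 in candidate_sites(toy)}
```

In this layout the fifth character of every window is the P1 residue, so windows from different proteins and orthologs can be aligned column by column when checking motif conservation.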
Figure 1. Comparison of CaspDB and Cascleave 2.0 scores.
(A) Probability score comparison for caspase-1 cleavage sites. (B) Probability score comparison for caspase-8 cleavage sites. CaspDB and Cascleave 2.0 scores are marked in red and black, respectively.
Optimized parameter values for the trained classifiers.
| Classifier | Parameter | Value |
| Random Forest (RF) | Maximum depth | Unlimited |
| | Number of trees | 1500 |
| Naïve Bayes (NB) | Default | |
| J48 | Confidence factor | 0.25 |
| | Number of folds | 3 |
| | Subtree raising | True |
| | Unpruned | False |
| | Use Laplace | False |
| SMO | Complexity | 1 |
| | Build logistic models | True |
| | Epsilon | 1.0E-12 |
| | Filter type | Normalize training data |
| | Kernel | RBF kernel |
| | Gamma | 0.01 |
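To make the SMO kernel setting concrete, the sketch below evaluates a radial basis function (RBF) kernel in its usual form, K(x, y) = exp(-gamma * ||x - y||^2), with the tabulated gamma of 0.01. The function name and the example vectors are hypothetical; the actual feature encoding of cleavage sites is not described in this section.

```python
from math import exp


def rbf_kernel(x, y, gamma=0.01):
    """RBF kernel value for two equal-length numeric feature vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


# Hypothetical, already-normalized feature vectors.
same = rbf_kernel([0.2, 0.8, 0.5], [0.2, 0.8, 0.5])
different = rbf_kernel([0.2, 0.8, 0.5], [0.9, 0.1, 0.5])
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart. With a small gamma such as 0.01 and normalized inputs the decay is slow, which corresponds to a smooth decision boundary.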