Jiangan Xie, Zhiling Xu, Shangbo Zhou, Xianchao Pan, Shaoxi Cai, Li Yang, Hu Mei.
Abstract
Prediction of proteasomal cleavage sites has been a focus of computational biology. To date, most predictive methods rely on nonlinear classifiers and on variables with little physicochemical meaning. In this paper, the physicochemical properties of the 14 residues both upstream and downstream of a cleavage site are characterized by VHSE (principal component score vector of hydrophobic, steric, and electronic properties) descriptors. The resulting VHSE descriptors are then used to construct prediction models with a support vector machine (SVM). For both in vivo and in vitro datasets, the VHSE-based method performs comparatively better than the well-known PAProC, MAPPP, and NetChop methods. The results reveal that the hydrophobic property of the 10 residues both upstream and downstream of the cleavage site is the dominant factor affecting in vivo and in vitro cleavage specificities, followed by the residues' electronic and steric properties. Furthermore, a difference in hydrophobic potential between the residues flanking the cleavage site is proposed to favor substrate cleavage. Overall, the interpretable VHSE-based method provides a preferable way to predict proteasomal cleavage sites.
Year: 2013 PMID: 24040264 PMCID: PMC3767653 DOI: 10.1371/journal.pone.0074506
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The pretreatments of the MHC-I ligands.
VHSE descriptors for 20 natural amino acids.
| AA | VHSE1 | VHSE2 | VHSE3 | VHSE4 | VHSE5 | VHSE6 | VHSE7 | VHSE8 |
| Ala A | 0.15 | −1.11 | −1.35 | −0.92 | 0.02 | −0.91 | 0.36 | −0.48 |
| Arg R | −1.47 | 1.45 | 1.24 | 1.27 | 1.55 | 1.47 | 1.30 | 0.83 |
| Asn N | −0.99 | 0.00 | −0.37 | 0.69 | −0.55 | 0.85 | 0.73 | −0.80 |
| Asp D | −1.15 | 0.67 | −0.41 | −0.01 | −2.68 | 1.31 | 0.03 | 0.56 |
| Cys C | 0.18 | −1.67 | −0.46 | −0.21 | 0.00 | 1.20 | −1.61 | −0.19 |
| Gln Q | −0.96 | 0.12 | 0.18 | 0.16 | 0.09 | 0.42 | −0.20 | −0.41 |
| Glu E | −1.18 | 0.40 | 0.10 | 0.36 | −2.16 | −0.17 | 0.91 | 0.02 |
| Gly G | −0.20 | −1.53 | −2.63 | 2.28 | −0.53 | −1.18 | 2.01 | −1.34 |
| His H | −0.43 | −0.25 | 0.37 | 0.19 | 0.51 | 1.28 | 0.93 | 0.65 |
| Ile I | 1.27 | −0.14 | 0.30 | −1.80 | 0.30 | −1.61 | −0.16 | −0.13 |
| Leu L | 1.36 | 0.07 | 0.26 | −0.80 | 0.22 | −1.37 | 0.08 | −0.62 |
| Lys K | −1.17 | 0.70 | 0.70 | 0.80 | 1.64 | 0.67 | 1.63 | 0.13 |
| Met M | 1.01 | −0.53 | 0.43 | 0.00 | 0.23 | 0.10 | −0.86 | −0.68 |
| Phe F | 1.52 | 0.61 | 0.96 | −0.16 | 0.25 | 0.28 | −1.33 | −0.20 |
| Pro P | 0.22 | −0.17 | −0.50 | 0.05 | −0.01 | −1.34 | −0.19 | 3.56 |
| Ser S | −0.67 | −0.86 | −1.07 | −0.41 | −0.32 | 0.27 | −0.64 | 0.11 |
| Thr T | −0.34 | −0.51 | −0.55 | −1.06 | 0.01 | −0.01 | −0.79 | 0.39 |
| Trp W | 1.50 | 2.06 | 1.79 | 0.75 | 0.75 | −0.13 | −1.06 | −0.85 |
| Tyr Y | 0.61 | 1.60 | 1.17 | 0.73 | 0.53 | 0.25 | −0.96 | −0.52 |
| Val V | 0.76 | −0.92 | 0.17 | −1.91 | 0.22 | −1.40 | −0.24 | −0.03 |
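The abstract describes characterizing the residues around a cleavage site with VHSE descriptors before passing them to an SVM. A minimal sketch of that encoding step, using the eight scores per residue from the table above, might look like the following. The function names `cleavage_window` and `encode_window`, the 0-based `site` convention, and the left-to-right concatenation order are illustrative assumptions, not details taken from the paper.

```python
# Eight VHSE scores per residue, copied from the table above.
VHSE = {
    "A": (0.15, -1.11, -1.35, -0.92, 0.02, -0.91, 0.36, -0.48),
    "R": (-1.47, 1.45, 1.24, 1.27, 1.55, 1.47, 1.30, 0.83),
    "N": (-0.99, 0.00, -0.37, 0.69, -0.55, 0.85, 0.73, -0.80),
    "D": (-1.15, 0.67, -0.41, -0.01, -2.68, 1.31, 0.03, 0.56),
    "C": (0.18, -1.67, -0.46, -0.21, 0.00, 1.20, -1.61, -0.19),
    "Q": (-0.96, 0.12, 0.18, 0.16, 0.09, 0.42, -0.20, -0.41),
    "E": (-1.18, 0.40, 0.10, 0.36, -2.16, -0.17, 0.91, 0.02),
    "G": (-0.20, -1.53, -2.63, 2.28, -0.53, -1.18, 2.01, -1.34),
    "H": (-0.43, -0.25, 0.37, 0.19, 0.51, 1.28, 0.93, 0.65),
    "I": (1.27, -0.14, 0.30, -1.80, 0.30, -1.61, -0.16, -0.13),
    "L": (1.36, 0.07, 0.26, -0.80, 0.22, -1.37, 0.08, -0.62),
    "K": (-1.17, 0.70, 0.70, 0.80, 1.64, 0.67, 1.63, 0.13),
    "M": (1.01, -0.53, 0.43, 0.00, 0.23, 0.10, -0.86, -0.68),
    "F": (1.52, 0.61, 0.96, -0.16, 0.25, 0.28, -1.33, -0.20),
    "P": (0.22, -0.17, -0.50, 0.05, -0.01, -1.34, -0.19, 3.56),
    "S": (-0.67, -0.86, -1.07, -0.41, -0.32, 0.27, -0.64, 0.11),
    "T": (-0.34, -0.51, -0.55, -1.06, 0.01, -0.01, -0.79, 0.39),
    "W": (1.50, 2.06, 1.79, 0.75, 0.75, -0.13, -1.06, -0.85),
    "Y": (0.61, 1.60, 1.17, 0.73, 0.53, 0.25, -0.96, -0.52),
    "V": (0.76, -0.92, 0.17, -1.91, 0.22, -1.40, -0.24, -0.03),
}


def cleavage_window(sequence, site, flank=10):
    """Return `flank` residues on each side of the bond between
    sequence[site - 1] and sequence[site] (hypothetical 0-based convention)."""
    start, end = site - flank, site + flank
    if start < 0 or end > len(sequence):
        raise ValueError("window extends past the sequence ends")
    return sequence[start:end]


def encode_window(window):
    """Concatenate the eight VHSE scores of each residue, left to right."""
    features = []
    for residue in window.upper():
        if residue not in VHSE:
            raise ValueError(f"unknown residue: {residue!r}")
        features.extend(VHSE[residue])
    return features
```

With `flank=10`, each candidate site yields 20 residues and therefore 160 numeric features, which could then be passed to any SVM implementation.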
Performance of SVM models.
| Dataset 1: MHC-I ligands | | | | | | |
| Sequence length | Kernel | MCC | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
| ±6 (12) | Linear | 0.5419 | 0.8457 | 77.11 | 78.86 | 75.28 |
| ±6 (12) | RBF | 0.5411 | 0.8459 | 77.07 | 78.85 | 75.20 |
| ±8 (16) | Linear | 0.5677 | 0.8586 | 78.35 | 82.66 | 73.83 |
| ±8 (16) | RBF | 0.5676 | 0.8591 | 78.35 | 82.54 | 73.95 |
| ±10 (20) | Linear^a | 0.5905 | 0.8673 | 79.52 | 82.74 | 76.13 |
| ±10 (20) | RBF | 0.5902 | 0.8691 | 79.52 | 82.20 | 76.69 |
| ±12 (24) | Linear | 0.5842 | 0.8701 | 79.22 | 81.78 | 76.53 |
| ±12 (24) | RBF | 0.6082 | 0.8809 | 80.42 | 82.93 | 77.78 |
| ±14 (28) | Linear | 0.5803 | 0.8705 | 79.02 | 81.85 | 76.04 |
| ±14 (28) | RBF | 0.5896 | 0.8746 | 79.49 | 81.74 | 77.13 |
| Dataset 2: in vitro | | | | | | |
| Sequence length | Kernel | MCC | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
| ±6 (12) | Linear | 0.5099 | 0.8345 | 75.45 | 78.12 | 72.78 |
| ±6 (12) | RBF | 0.5162 | 0.8357 | 75.76 | 78.74 | 72.78 |
| ±8 (16) | Linear | 0.5265 | 0.8380 | 76.27 | 79.34 | 73.20 |
| ±8 (16) | RBF | 0.5092 | 0.8364 | 75.44 | 75.86 | 75.03 |
| ±10 (20) | Linear^b | 0.5481 | 0.8310 | 77.39 | 76.68 | 78.09 |
| ±10 (20) | RBF | 0.5399 | 0.8318 | 76.98 | 76.88 | 77.08 |
| ±12 (24) | Linear | 0.5174 | 0.8377 | 75.85 | 76.26 | 75.45 |
| ±12 (24) | RBF | 0.5338 | 0.8368 | 76.67 | 75.65 | 77.69 |
| ±14 (28) | Linear | 0.5318 | 0.8354 | 76.57 | 75.86 | 77.29 |
| ±14 (28) | RBF | 0.5358 | 0.8392 | 76.79 | 77.10 | 76.48 |
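Binary classifiers such as these are commonly summarized by sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC). The sketch below shows how those four quantities follow from a confusion matrix. The function name `binary_metrics` and the counts in the usage note are hypothetical; the counts assume equal numbers of cleavage and non-cleavage sites, which is an assumption rather than something stated here.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)   # fraction of true sites recovered
    specificity = tn / (tn + fp)   # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical balanced counts (10,000 sites per class).
metrics = binary_metrics(tp=7886, fn=2114, tn=7528, fp=2472)
print(metrics)
```

For these hypothetical counts, sensitivity is 78.86%, specificity is 75.28%, accuracy is 77.07%, and the MCC comes out at roughly 0.54. On a balanced set, accuracy is simply the mean of sensitivity and specificity.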
The predictive power of SVMMHC-I and SVMVITRO in comparison with the other 4 models.
| | Test set 1: MHC-I ligands | | | | Test set 2: in vitro | | | |
| Model | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC |
| PAProC | NA | 45.6 | 30.0 | −0.25 | NA | 46.4 | 64.7 | 0.10 |
| FragPredict | NA | 83.5 | 16.5 | 0.00 | NA | 72.1 | 41.4 | 0.12 |
| NetChop 1.0 | NA | 39.8 | 46.3 | −0.14 | NA | 34.4 | 91.4 | 0.31 |
| NetChop 2.0 | NA | 73.6 | 42.4 | 0.16 | NA | 57.4 | 76.4 | 0.32 |
| SVMMHC-I | 73.5 | 82.3 | 64.8 | 0.48 | | | | |
| SVMVITRO | | | | | 70.5 | 62.5 | 78.7 | 0.42 |
The predictive performance of PAProC, FragPredict, NetChop 1.0, and NetChop 2.0 is cited from Saxova et al. [17].
NA: not available.
Figure 2. The weight coefficients of the VHSE variables included in the SVMMHC-I model.
A: VHSE 1 (Hydrophobic property); B: VHSE 3 (Steric property); C: VHSE 5 (Electronic property).
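If the model is a linear-kernel SVM, its decision value is a weighted sum of the inputs, so each weight coefficient can be read as the contribution of one descriptor at one position. The sketch below illustrates that reading. It assumes the feature vector is laid out position by position with eight VHSE scores each; this ordering and the names `decision_value` and `weights_for_descriptor` are illustrative assumptions, not details from the paper.

```python
N_DESCRIPTORS = 8  # assumed: eight VHSE scores per residue position


def decision_value(weights, bias, features):
    """Linear SVM score w.x + b; its sign gives the predicted class."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have equal length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def weights_for_descriptor(weights, descriptor):
    """Return the weight of one descriptor (1-based) at every position."""
    if not 1 <= descriptor <= N_DESCRIPTORS:
        raise ValueError("descriptor must be between 1 and 8")
    # Position-major layout: every 8th weight belongs to the same descriptor.
    return weights[descriptor - 1::N_DESCRIPTORS]
```

Plotting `weights_for_descriptor(weights, 1)` against position would give a profile comparable in spirit to panel A, under the layout assumption above.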
The profiles of in vivo cleavages.
| Position | Favored | Unfavored |
| P9 | F, W, L, M | E, D, N, S |
| P8 | F, W, L, I | R, E, K, D |
| P7 | F, W, L, I | R, E, K, D |
| P4 | G, A, S | W, R, Y |
| P1 | F, W, K, R, I | E, D, N, T |
| P3' | R, E, K, D | F, W, L, I |
| P4' | R, E, K, D | F, W, L, I |
| P5' | E, D, N, T | F, W, K, R, I |
Favored: residues at the corresponding positions are favorable to substrate cleavage.
Unfavored: residues at the corresponding positions are unfavorable to substrate cleavage.
Figure 3. The weight coefficients of the VHSE variables included in the SVMVITRO model.
A: VHSE 1 (Hydrophobic property); B: VHSE 3 (Steric property); C: VHSE 5 (Electronic property).