Bo Yao, Lin Zhang, Shide Liang, Chi Zhang.
Abstract
Identifying protein surface regions preferentially recognizable by antibodies (antigenic epitopes) is at the heart of new immuno-diagnostic reagent discovery and vaccine design, and computational methods for antigenic epitope prediction provide a crucial means to serve this purpose. Many linear B-cell epitope prediction methods have been developed towards this goal, such as BepiPred, ABCPred, AAP, BCPred, BayesB, BEOracle/BROracle, and BEST. However, effective immunological research demands more robust prediction performance than the current algorithms can provide. In this work, a new method to predict linear antigenic epitopes is developed: a Support Vector Machine is used to combine Tri-peptide similarity and Propensity scores (SVMTriP). Applied to non-redundant B-cell linear epitopes extracted from IEDB, SVMTriP achieves a sensitivity of 80.1% and a precision of 55.2% in five-fold cross-validation, with an AUC value of 0.702. Combining the similarity and propensity of tri-peptide subsequences can improve the prediction performance for linear B-cell epitopes. Moreover, SVMTriP is capable of recognizing viral peptides against a human protein sequence background. A web server based on our method has been constructed for public use. The server and all datasets used in the current study are available at http://sysbio.unl.edu/SVMTriP.
Year: 2012 PMID: 22984622 PMCID: PMC3440317 DOI: 10.1371/journal.pone.0045152
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Performance of SVMTriP models with different epitope lengths.
| Length (AA) | Sn (%) | P (%) | F-measure | AUC |
| | 68.5±2.5 | 55.5±1.5 | 0.615±0.020 | 0.674 |
| | 67.5±3.5 | 57.0±2.0 | 0.620±0.030 | 0.681 |
| | 64.8±4.9 | 56.5±2.5 | 0.605±0.030 | 0.689 |
| | 63.5±5.5 | 57.1±3.0 | 0.601±0.045 | 0.685 |
| | 79.0±1.9 | 54.1±1.1 | 0.641±0.015 | 0.666 |
| | 80.1±2.1 | 55.2±1.0 | 0.693±0.060 | 0.702 |
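The F-measure column can be related to the Sn and P columns with a short calculation. The sketch below assumes the standard definition of the F-measure as the harmonic mean of sensitivity and precision; the function name `f_measure` is illustrative and not taken from the paper. Because the tabulated values are reported as means with a spread, recomputing F from the mean Sn and P is only expected to approximate the tabulated F-measure.

```python
def f_measure(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity (recall) and precision, both given as fractions in [0, 1]."""
    if sensitivity + precision == 0:
        return 0.0
    return 2 * sensitivity * precision / (sensitivity + precision)


# Mean Sn and P from the first data row of the table, converted from percent to fractions.
f_first_row = f_measure(0.685, 0.555)
print(round(f_first_row, 3))  # compare with the tabulated 0.615±0.020
```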
Performance of different linear B-cell epitope prediction methods.
| Methods | Sn (%) | P (%) | F-measure | AUC |
| | 59.8±0.9 | 58.5±6.5 | 0.590±0.040 | 0.667 |
| | 54.0±7.1 | 60.5±2.5 | 0.572±0.055 | 0.667 |
| SVMTriP | 80.1±2.1 | 55.2±1.0 | 0.693±0.060 | 0.702 |
The results for AAP [14] and BCPred [15] were obtained with locally installed versions of the software.
Figure 1. ROC curves for AAP, BCPred, and SVMTriP.
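For readers unfamiliar with the AUC values quoted alongside the ROC curves, the following is a minimal sketch of one common way to compute the area under a ROC curve: the fraction of (positive, negative) pairs in which the positive example receives the higher score. The function `roc_auc` and the toy scores and labels are hypothetical and are not the paper's data or code.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores; label 1 = epitope, label 0 = non-epitope.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(toy_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values should be read.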
Weights of tri-peptides in the optimal SVM model.
| Tri-Peptide | Rank | Weight Score | Tri-Peptide | Rank | Weight Score |
| QQP | 1 | 503251.79 | GQQ | 11 | 121677.62 |
| PQQ | 2 | 488627.71 | QPY | 12 | 116598.60 |
| QPQ | 3 | 367386.40 | YPQ | 13 | 113237.37 |
| QPF | 4 | 246462.39 | QQF | 14 | 81709.59 |
| FPQ | 5 | 234868.65 | PYP | 15 | 79191.37 |
| PQP | 6 | 231353.73 | FQQ | 16 | 77357.97 |
| QGQ | 7 | 153161.76 | PPP | 17 | 76320.05 |
| PFP | 8 | 151840.02 | QPP | 18 | 64756.05 |
| QQQ | 9 | 128930.20 | QFP | 19 | 63814.16 |
| QQG | 10 | 122291.90 | PPQ | 20 | 63173.33 |
Weight scores are calculated by the formula $w = \sum_{i} \alpha_i x_i$. Here $\alpha_i$ is the dual representation of the decision boundary, and $x_i$ ($i = 0, 1, 2, \ldots, n$) is the vector described in the SVM model. Both $\alpha_i$ and $x_i$ are available in the model file.
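The weight formula can be illustrated with a small sketch. It assumes that the dual coefficients and the model vectors have already been read from the model file into plain Python lists, and that each coefficient already carries the sign of its class label; the function `primal_weights`, the three feature names, and all numbers are hypothetical.

```python
def primal_weights(alphas, vectors):
    """Return w = sum_i alpha_i * x_i for dense vectors of equal length."""
    if len(alphas) != len(vectors):
        raise ValueError("one coefficient is required per vector")
    dim = len(vectors[0])
    w = [0.0] * dim
    for alpha, x in zip(alphas, vectors):
        if len(x) != dim:
            raise ValueError("all vectors must have the same length")
        for j, xj in enumerate(x):
            w[j] += alpha * xj
    return w


# Hypothetical model contents: three features (tri-peptides) and three vectors.
features = ["QQP", "PQQ", "QPQ"]
alphas = [0.5, -0.25, 1.0]
vectors = [
    [1.0, 0.0, 2.0],
    [0.0, 4.0, 1.0],
    [1.0, 1.0, 0.0],
]

weights = primal_weights(alphas, vectors)
# Rank features by weight, largest first, as in the table of weight scores.
ranked = sorted(zip(features, weights), key=lambda item: item[1], reverse=True)
print(ranked)
```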
Figure 2. Tendency test for BCPred, AAP, and SVMTriP.
The three bars at each point on the x-axis are the results for AAP (blue), BCPred (green), and SVMTriP (red), respectively. Within each bar, the light part gives the number of returned human peptides and the dark part the number of returned viral peptides. For example, at 400 returned peptides, the dark part of the red bar is 362, which means that 362 of the 400 peptides returned by SVMTriP are viral, and the light red part represents the remaining 38 human peptides.
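The counting behind each bar can be sketched as follows: rank all candidate peptides by predicted score, keep the top N, and tally how many come from each source. This is an illustrative reading of the figure description, not the authors' code; the function `top_n_composition` and the toy scores are hypothetical.

```python
def top_n_composition(scored_peptides, n):
    """Count source labels among the n highest-scoring (score, source) pairs."""
    ranked = sorted(scored_peptides, key=lambda item: item[0], reverse=True)
    counts = {}
    for _score, source in ranked[:n]:
        counts[source] = counts.get(source, 0) + 1
    return counts


# Hypothetical predictions: (score, source) for a mixed viral/human peptide pool.
toy_pool = [
    (0.95, "virus"),
    (0.90, "virus"),
    (0.85, "human"),
    (0.80, "virus"),
    (0.40, "human"),
    (0.10, "human"),
]
top4 = top_n_composition(toy_pool, 4)
print(top4)
```

In the figure's terms, the example of 400 returned peptides corresponds to a tally of 362 viral and 38 human peptides for SVMTriP.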
Comparison among the tri-peptide subsequence models with or without propensity.
| Kernel | Propensity | Substitution matrix | Sn (%) | P (%) | F-measure |
| Tri-peptide | Propensity only | N.A. | 56.5±12.5 | 61.0±6.3 | 0.584±0.085 |
| Tri-peptide | w/o propensity | Blosum62 | 54.5±6.5 | 60.5±1.5 | 0.573±0.035 |
| Tri-peptide | w/o propensity | PAM160 | 55.0±7.2 | 61.1±1.8 | 0.578±0.040 |
| Tri-peptide | w/ propensity | Blosum62 | 80.1±2.1 | 55.2±1.0 | 0.693±0.060 |
| Tri-peptide | w/ propensity | PAM160 | 69.3±10.0 | 58.5±3.5 | 0.633±0.050 |
| AA_A pattern | w/o propensity | Blosum62 | 54.8±6.8 | 60.5±1.5 | 0.579±0.040 |
| AA_A pattern | w/o propensity | PAM160 | 55.2±7.1 | 61.3±2.0 | 0.577±0.045 |
| AA_A pattern | w/ propensity | Blosum62 | 60.5±5.5 | 57.5±2.5 | 0.589±0.040 |
| AA_A pattern | w/ propensity | PAM160 | 59.5±5.5 | 57.5±1.5 | 0.585±0.035 |
| A_AA pattern | w/o propensity | Blosum62 | 55.5±8.5 | 60.6±2.2 | 0.581±0.050 |
| A_AA pattern | w/o propensity | PAM160 | 55.2±8.1 | 60.5±1.5 | 0.577±0.055 |
| A_AA pattern | w/ propensity | Blosum62 | 60.5±6.5 | 57.5±1.5 | 0.590±0.040 |
| A_AA pattern | w/ propensity | PAM160 | 59.5±5.5 | 57.5±1.5 | 0.585±0.025 |
The tri-peptide model with propensity and the Blosum62 matrix is defined as SVMTriP.
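The three subsequence types compared in the table can be illustrated with a short extraction sketch. It assumes that "Tri-peptide" means three contiguous residues, and that "AA_A" and "A_AA" denote three residues with one skipped position marked by the underscore; this reading of the pattern names is an assumption, and the function `subsequences` and the example peptide are hypothetical.

```python
def subsequences(peptide, pattern="AAA"):
    """Extract overlapping 3-residue subsequences; '_' in the pattern marks a skipped position."""
    keep = [i for i, symbol in enumerate(pattern) if symbol != "_"]
    span = len(pattern)
    return [
        "".join(peptide[start + i] for i in keep)
        for start in range(len(peptide) - span + 1)
    ]


example = "PQQPFPQ"  # hypothetical peptide fragment
contiguous = subsequences(example, "AAA")
gapped_aa_a = subsequences(example, "AA_A")
gapped_a_aa = subsequences(example, "A_AA")
print(contiguous, gapped_aa_a, gapped_a_aa)
```

Each peptide is thereby reduced to a list of three-residue units, which is the representation on which a similarity or propensity score per unit can then be defined.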