Weike Shen1, Yuan Cao2, Lei Cha3, Xufei Zhang1,4, Xiaomin Ying3, Wei Zhang1, Kun Ge5, Wuju Li3, Li Zhong1,4.
Abstract
BACKGROUND: Accurate identification of linear B-cell epitopes plays an important role in peptide vaccine design, immunodiagnosis, and antibody production. Although several prediction methods have been reported, their unsatisfactory accuracy has limited their broad use in linear B-cell epitope prediction. A reliable model with significantly improved prediction accuracy is therefore highly desirable.
Keywords: Amino acid anchoring pair composition; Epitopes prediction; Linear B-cell epitopes
Year: 2015 PMID: 26029265 PMCID: PMC4449562 DOI: 10.1186/s13040-015-0047-3
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Six datasets for model construction and evaluation
| Dataset | PositiveNum* | NegativeNum* | Reference |
|---|---|---|---|
| BCI727 | 727 | 727 | Saha et al. [ |
| Chen872 | 872 | 872 | Chen et al. [ |
| ABC16 | 700 | 700 | Saha et al. [ |
| Blind387 | 187 | 200 | Saha et al. [ |
| Lbtope_Confirm | 1042 | 1795 | Singh et al. [ |
| FBC934 | 934 | 934 | EL-Manzalawy et al. [ |
*PositiveNum and NegativeNum represent the number of positive samples and negative samples, respectively.
Figure 1. A schematic flowchart of the feature selection and fivefold cross-validation test on the BCI727 dataset.
Figure 2. Amino acid anchoring pair composition extraction. For each sequence in a dataset, the amino acid anchoring pair composition (APC) is extracted by decomposing the protein or peptide sequence into 2-mer to n-mer subsequences. The two terminal amino acids of each subsequence form an anchoring pair that may anchor each other to form a relatively stable structure, and the composition of these pairs is used as the feature set of the peptide sequence.
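The extraction described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_len` parameter, the 20-letter alphabet, and the normalisation of counts to frequencies are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids are used as feature symbols.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def anchoring_pairs(sequence, max_len=3):
    """Yield (first, last, gap) for every subsequence of length 2..max_len.

    `gap` is the number of residues between the two terminal amino acids,
    so a 2-mer has gap 0 and a 3-mer has gap 1.
    """
    seq = sequence.upper()
    for n in range(2, max_len + 1):
        for start in range(len(seq) - n + 1):
            sub = seq[start:start + n]
            yield sub[0], sub[-1], n - 2


def apc_features(sequence, max_len=3):
    """Return a dict mapping (first, last, gap) to its relative frequency.

    Hypothetical normalisation: each count is divided by the total number
    of extracted pairs. Pairs containing non-standard residues are counted
    in the total but have no feature slot.
    """
    counts = Counter(anchoring_pairs(sequence, max_len))
    total = sum(counts.values())
    features = {}
    for gap in range(max_len - 1):
        for a, b in product(AMINO_ACIDS, repeat=2):
            key = (a, b, gap)
            features[key] = counts.get(key, 0) / total if total else 0.0
    return features
```

With `max_len=3`, each peptide maps to a fixed-length vector of 2 × 400 pair frequencies, which is the kind of input a standard classifier can consume.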
AUC values for each combination of (=2, 3, 4) and (<0.2, 0.4, 0.5, 0.6, 1) on BCI727 dataset at window size of 20
| Parameters | <0.2 | 0.4 | 0.5 | 0.6 | 1 |
|---|---|---|---|---|---|
| 2 | 0.703 | 0.723 | 0.747 | 0.734 | 0.745 |
| 3 | 0.715 | 0.742 | **0.748** | 0.738 | 0.746 |
| 4 | 0.704 | 0.726 | 0.725 | 0.723 | 0.730 |
* The bold denotes the largest AUC value of the prediction.
Performances of APCpred on BCI727 dataset at different window sizes using fivefold cross-validation
| Window Sizes | Acc(%) | Sen(%) | Spe(%) | MCC | AUC |
|---|---|---|---|---|---|
| 12 | 65.68 | 65.48 | 65.89 | 0.314 | 0.705 |
| 14 | 66.30 | 66.58 | 66.02 | 0.326 | 0.727 |
| 16 | 67.13 | 67.68 | 66.58 | 0.343 | 0.735 |
| 18 | 68.23 | 69.05 | 67.40 | 0.365 | 0.732 |
| 20 | | 69.74 | 67.13 | 0.369 | 0.748 |
* The bold denotes the largest accuracy (ACC) value of the prediction.
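The Acc, Sen, Spe, and MCC columns in these tables are linked through the confusion matrix, so accuracy and MCC can be recomputed from sensitivity, specificity, and the class sizes of a dataset. The sketch below uses the standard textbook definitions; the function names are illustrative and do not come from the paper.

```python
import math


def confusion_from_rates(sen_pct, spe_pct, n_pos, n_neg):
    """Rebuild (possibly fractional) TP, FN, TN, FP from percentage rates."""
    tp = sen_pct / 100 * n_pos
    tn = spe_pct / 100 * n_neg
    return tp, n_pos - tp, tn, n_neg - tn


def accuracy_and_mcc(sen_pct, spe_pct, n_pos, n_neg):
    """Return (accuracy in percent, Matthews correlation coefficient)."""
    tp, fn, tn, fp = confusion_from_rates(sen_pct, spe_pct, n_pos, n_neg)
    acc = 100 * (tp + tn) / (n_pos + n_neg)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Window size 12 on BCI727 (727 positive and 727 negative samples).
acc, mcc = accuracy_and_mcc(65.48, 65.89, 727, 727)
```

On a balanced dataset such as BCI727, accuracy reduces to the mean of sensitivity and specificity. For the window-size-12 row this recomputation agrees with the tabulated accuracy to within 0.01 percentage points and with the tabulated MCC of 0.314 after rounding to three decimals.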
Figure 3. ROC curves of APCpred on the BCI727 dataset for different window sizes using fivefold cross-validation. The largest AUC is 0.748 at a window size of 20.
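For readers unfamiliar with the AUC summarised by such ROC curves, it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. A minimal pairwise sketch follows; it assumes labels coded as 1 (epitope) and 0 (non-epitope), and the function name is illustrative rather than taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of samples, which is acceptable for datasets of the size listed here; a rank-based formulation would scale better for larger sets.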
Performance of APCpred, Bayesb, AAP, and the combined AAP and AP method on the Chen872 dataset using fivefold cross-validation
| Methods | ACC(%) | Sen(%) | Spe(%) | MCC | AUC |
|---|---|---|---|---|---|
| APCpred | | 69.95 | 75.92 | 0.460 | 0.809 |
| Bayesb | 68.50 | 70.00 | 67.00 | - | 0.74 |
| AAP | 71.09 | 60.87 | 75.36 | 0.366 | - |
| AAP + AP | 72.54 | 63.56 | 76.48 | 0.404 | - |
“-” denotes unknown information.
“AAP + AP” is Chen’s combination method of AAP and five APs.
The bold denotes the largest ACC value of the prediction.
Performance of APCpred, ABCpred, BCpred, and AAP on the ABC16 dataset using fivefold cross-validation
| Methods | ACC(%) | Sen(%) | Spe(%) | MCC | AUC |
|---|---|---|---|---|---|
| APCpred | 73.00 | 65.14 | 80.86 | 0.466 | 0.794 |
| ABCpred | 65.93 | 67.14 | 64.71 | 0.319 | - |
| BCPred | | 70.14 | 79.0 | 0.493 | 0.801 |
| AAPBCPred | 73.14 | 50.17 | 95.57 | 0.518 | 0.782 |
“-” denotes unknown information.
The bold denotes the largest Acc value of the prediction.
Comparison of Performances among APCpred, ABCpred, BCpred, and AAP models
| Methods | ACC(%) | Sen(%) | Spe(%) | MCC | AUC |
|---|---|---|---|---|---|
| APCpred | | 56.15 | 79.00 | 0.362 | 0.748 |
| ABCpred | 66.41 | 71.66 | 61.50 | *0.333 | *0.736 |
| BCpred | 65.89 | 66.31 | 65.50 | 0.318 | 0.699 |
| AAPBCPred | 64.60 | 64.17 | 65.00 | 0.292 | 0.689 |
The four classifiers were trained on the ABC16 dataset and evaluated on the third dataset, Blind387.
“*” denotes values obtained by running the ABCpred online prediction server on the third dataset through an automated script.
The bold denotes the largest ACC value of the prediction.
Comparison of performances between APCpred and LBtope on FBC934 dataset
| Methods | Acc(%) | Sen(%) | Spe(%) | MCC |
|---|---|---|---|---|
| LBtope on FBC934 dataset (trained on Lbtope_Confirm dataset) | 52.66 | 78.09 | 27.23 | 0.06 |
| APCpred on FBC934 dataset (trained on Lbtope_Confirm dataset) | | 59.31 | 50.86 | 0.10 |
Both models were trained on Lbtope_Confirm dataset.
The bold denotes the largest Acc value of the prediction.