Shide Liang, Dandan Zheng, Chi Zhang, Martin Zacharias.
Abstract
BACKGROUND: Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods work from protein sequences and predict continuous epitopes, which are linear in sequence. Only a few structure-based epitope prediction algorithms are available, and they have not yet shown satisfactory performance.
Year: 2009 PMID: 19772615 PMCID: PMC2761409 DOI: 10.1186/1471-2105-10-302
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
The values of for twenty amino acid residues.
| Residue | Value | Residue | Value |
| --- | --- | --- | --- |
| Ala | -0.392 | Leu | -1.31 |
| Arg | 0.316 | Lys | 0.021 |
| Asn | 0.446 | Met | 1.06 |
| Asp | -0.307 | Phe | 0.979 |
| Cys | -7.36 | Pro | 0.017 |
| Gln | -0.006 | Trp | -0.07 |
| Glu | -0.492 | Val | -0.826 |
| Gly | 0.463 | Ser | -0.004 |
| His | 0.207 | Thr | -0.062 |
| Ile | 0.334 | Tyr | 0.979 |
is the contribution of residue type r to the area of antibody binding site; is the contribution of residue type r to the protein surface area; is the average relative accessible surface area of surface residues of type r.
AUC values for the training and testing datasets predicted by each single term
| Term | Training set (a) | Testing set (b) |
| --- | --- | --- |
| Binding site propensity | 0.637 | 0.577 |
| Conservation score | 0.593 | 0.564 |
| Side chain energy score | 0.555 | 0.569 |
| Contact number | 0.59 | 0.556 |
| Planarity score | 0.53 | 0.554 |
| Fraction of turns & loops | 0.489 | 0.587 |
(a) Antigen-antibody complexes from protein docking benchmark 2.0. (b) 17 recently released antigen-antibody complex structures in the PDB. Unbound structures from both datasets were used for prediction, and bound structures were used to identify interface residues. The AUC values were calculated for each protein and averaged over each of the two datasets.
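The per-protein AUC averaging described in the note above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `auc` and `mean_auc`, the toy scores, and the convention that a higher score marks a more likely interface residue are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: the probability that a randomly chosen interface
    residue (label 1) outscores a randomly chosen non-interface residue
    (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one residue of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(proteins: List[Tuple[Sequence[float], Sequence[int]]]) -> float:
    """Compute the AUC for each protein separately, then average."""
    return sum(auc(s, y) for s, y in proteins) / len(proteins)


# Hypothetical toy data: (per-residue scores, interface labels) per protein.
toy = [
    ([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]),
    ([0.7, 0.2, 0.1], [1, 0, 0]),
]
print(mean_auc(toy))
```

Averaging per protein, rather than pooling all residues, keeps a large antigen from dominating the dataset-level value.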
Figure 1. Correlation between precision and the number of predicted residues. (a) Training set; (b) testing set. The prediction results of all the proteins in the datasets were calculated and averaged. The precisions of random prediction are 15% and 12.6% for the training and testing sets, respectively.
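The random-prediction precision quoted in the caption appears consistent with the mean, over proteins, of the fraction of surface residues that are interface residues. The sketch below checks this for the testing set, using the surface and interface residue counts listed in the testing-set results table. The function name `random_precision` and this reading of the baseline are assumptions, not something the source states explicitly.

```python
# (surface residues, interface residues) for each antigen,
# copied from the testing-set results table.
TESTING_COUNTS = [
    (99, 18), (63, 13), (225, 14), (148, 27), (108, 18), (336, 41),
    (233, 19), (321, 24), (104, 13), (228, 25), (118, 11), (317, 20),
    (131, 22), (132, 21), (181, 32), (336, 12), (141, 19),
]


def random_precision(counts):
    """Expected precision of a random predictor, averaged per protein:
    the chance that a randomly picked surface residue is an interface residue."""
    fractions = [interface / surface for surface, interface in counts]
    return sum(fractions) / len(fractions)


print(f"{random_precision(TESTING_COUNTS):.3f}")  # 0.126
```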
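Prediction results for the training set with 6 combined terms
| Complex | Unbound antigen | Surface residues | Interface residues | Sensitivity | Precision | Specificity | AUC |
| --- | --- | --- | --- | --- | --- | --- | --- |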
| 1AHW_AB:C | 1TFH_A | 173 | 25 | 0.360 | 0.153 | 0.662 | 0.481 |
| 1BVK_DE:F | 3LZT_ | 98 | 17 | 0.765 | 0.361 | 0.716 | 0.835 |
| 1DQJ_AB:C | 3LZT_ | 98 | 20 | 0.300 | 0.182 | 0.654 | 0.534 |
| 1E6J_HL:P | 1A43_ | 63 | 13 | 0.462 | 0.353 | 0.780 | 0.585 |
| 1JPS_HL:T | 1TFH_B | 155 | 25 | 0.360 | 0.170 | 0.662 | 0.517 |
| 1MLC_AB:E | 3LZT_ | 98 | 16 | 0.562 | 0.250 | 0.671 | 0.636 |
| 1VFB_AB:C | 8LYZ_ | 107 | 18 | 0.833 | 0.385 | 0.730 | 0.833 |
| 1WEJ_HL:F | 1HRC_ | 95 | 13 | 0.462 | 0.188 | 0.683 | 0.649 |
| 2VIS_AB:C | 2VIU_A | 247 | 20 | 0.900 | 0.281 | 0.797 | 0.901 |
| 1BJ1_HLJK:VW (b) | 2VPF_GH | 160 | 35 | 0.600 | 0.412 | 0.760 | 0.705 |
| 1FSK_BC:A | 1BV1_ | 145 | 19 | 0.526 | 0.233 | 0.738 | 0.587 |
| 1I9R_HL:ABC (b) | 1ALY_ABC | 320 | 65 | 0.508 | 0.292 | 0.686 | 0.687 |
| 1IQD_AB:C | 1D7P_M | 127 | 17 | 0.765 | 0.361 | 0.791 | 0.848 |
| 1K4C_AB:C | 1JVM_A | 88 | 16 | 0.500 | 0.258 | 0.681 | 0.647 |
| 1KXQ_H:A | 1PPI_ | 341 | 30 | 0.600 | 0.148 | 0.666 | 0.637 |
| 1NCA_HL:N | 7NN9_ | 263 | 27 | 0.556 | 0.163 | 0.674 | 0.684 |
| 1NSN_HL:S | 1KDC_ | 106 | 23 | 0.174 | 0.114 | 0.627 | 0.454 |
| 1QFW_HL:AB | 1HRP_AB | 170 | 17 | 0.235 | 0.071 | 0.660 | 0.484 |
| 1QFW_IM:AB | 1HRP_AB | 170 | 17 | 0.706 | 0.214 | 0.712 | 0.738 |
| 2JEL_HL:P | 1POH_ | 68 | 18 | 0.167 | 0.158 | 0.680 | 0.498 |
| 1BGX_HL:T | 1CMW_A | 646 | 66 | 0.394 | 0.124 | 0.683 | 0.521 |
| 2HMI_CD:AB | 1S6P_AB | 810 | 14 | 0.429 | 0.024 | 0.697 | 0.518 |
| Mean | | 207 | 24.1 | 0.507 | 0.222 | 0.7 | 0.635 |
(a) Sensitivity, precision, and specificity were recorded when 55% of surface residues were predicted as interface residues by each single term. We chose this parameter (55%) so that the sensitivity of the consensus prediction was about 50%. (b) Multiple binding sites.
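As a worked example of the three measures, the sketch below reproduces the first row of the training-set table (1AHW_AB:C, 173 surface residues, 25 interface residues). The source does not report the raw counts: the 9 true positives and 50 false positives used here are back-calculated from the reported sensitivity and precision, so they should be read as an assumption. The `Counts` class is a hypothetical helper introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # interface residues predicted as interface
    fp: int  # non-interface residues predicted as interface
    fn: int  # interface residues missed
    tn: int  # non-interface residues correctly left out

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


surface, interface = 173, 25      # from the table
tp, fp = 9, 50                    # back-calculated, not given in the source
row = Counts(tp=tp, fp=fp, fn=interface - tp, tn=(surface - interface) - fp)

print(f"{row.sensitivity():.3f} {row.precision():.3f} {row.specificity():.3f}")
# 0.360 0.153 0.662
```

These counts match the reported 0.360, 0.153, and 0.662 for that row to three decimal places.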
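Prediction results for the testing set
| Complex | Unbound antigen | Surface residues | Interface residues | Sensitivity | Precision | Specificity | AUC |
| --- | --- | --- | --- | --- | --- | --- | --- |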
| 2ARJ_HL:Q | 1NEZ_G | 99 | 18 | 0.278 | 0.227 | 0.790 | 0.604 |
| 2BDN_HL:A | 1DOK_A | 63 | 13 | 0.154 | 0.095 | 0.620 | 0.281 |
| 2FD6_HL:U | 1YWH_A | 225 | 14 | 0.500 | 0.089 | 0.659 | 0.617 |
| 2GHW_B:A | 2GHV_E | 148 | 27 | 0.519 | 0.311 | 0.744 | 0.727 |
| 2H9G_AB:R | 1D4V_A | 108 | 18 | 0.556 | 0.286 | 0.722 | 0.724 |
| 2J6E_IMHL:AB (b) | 2DTQ_AB | 336 | 41 | 0.512 | 0.202 | 0.719 | 0.614 |
| 2NR6_CD:A | 1YG9_A | 233 | 19 | 0.947 | 0.234 | 0.724 | 0.870 |
| 2NYY_CD:A | 2VUA_A | 321 | 24 | 0.750 | 0.164 | 0.690 | 0.810 |
| 2P45_B:A | 1KF2_A | 104 | 13 | 0.154 | 0.057 | 0.637 | 0.553 |
| 2Q8B_HL:A | 1Z40_A | 228 | 25 | 0.440 | 0.147 | 0.685 | 0.645 |
| 2QQN_HL:A | 1KEX_A | 118 | 11 | 0.636 | 0.200 | 0.738 | 0.737 |
| 2R29_HL:A | 1OK8_A | 317 | 20 | 0.300 | 0.054 | 0.646 | 0.567 |
| 2R56_HL:A | 1GX9_A | 131 | 22 | 0.091 | 0.053 | 0.670 | 0.409 |
| 2UZI_HL:R | 2EVW_X | 132 | 21 | 0.286 | 0.171 | 0.739 | 0.505 |
| 3BN9_CD:B | 1EAX_A | 181 | 32 | 0.531 | 0.279 | 0.705 | 0.581 |
| 3BQU_CD:AB | 2F5A_HL | 336 | 12 | 1.000 | 0.098 | 0.660 | 0.914 |
| 3D85_AB:C | 3D87_A | 141 | 19 | 0.474 | 0.180 | 0.664 | 0.591 |
| Mean | | 189 | 20.5 | 0.478 | 0.167 | 0.695 | 0.632 |
(a) Sensitivity, precision, and specificity were recorded when 55% of surface residues were predicted as interface residues by each single term. (b) Multiple binding sites.
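The note above implies a two-step scheme: each single term marks its top 55% of surface residues, and the consensus is built from those per-term predictions. One plausible reading is sketched below. The voting rule is not stated in this text, so the `min_votes` parameter (defaulting to agreement of all terms), the function names, and the assumption that a higher score means a more likely interface residue are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def top_fraction(scores: Dict[str, float], fraction: float) -> Set[str]:
    """Residues ranked in the top `fraction` by one scoring term."""
    n_keep = round(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def consensus(term_scores: List[Dict[str, float]],
              fraction: float = 0.55,
              min_votes: Optional[int] = None) -> Set[str]:
    """Residues selected by at least `min_votes` terms (assumed rule;
    by default every term must agree)."""
    if min_votes is None:
        min_votes = len(term_scores)
    votes: Dict[str, int] = {}
    for scores in term_scores:
        for residue in top_fraction(scores, fraction):
            votes[residue] = votes.get(residue, 0) + 1
    return {residue for residue, v in votes.items() if v >= min_votes}


# Hypothetical toy scores for four surface residues under two terms.
term_a = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}
term_b = {"A": 1.0, "B": 4.0, "C": 3.0, "D": 2.0}
print(sorted(consensus([term_a, term_b], fraction=0.5)))  # ['B']
```

Requiring agreement between terms shrinks the predicted set well below the per-term 55%, which is why a permissive single-term cutoff can still give a consensus sensitivity near 50%.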
Figure 2. Two successful examples of antibody binding site prediction. (a) SARS spike protein receptor binding domain (2ghv); (b) cockroach allergen Bla g 2 (1yg9). The antibodies are colored grey. The surface residues of the antigens are colored according to their predicted likelihood of being an epitope residue (from red to blue in decreasing order), and the core residues are colored blue.
Comparison with other algorithms
| DiscoTope1.2 | 0.63 | 0.628 | 0.6 | 0.589 |
| BEpro | 0.645 | 0.639 | 0.617 | 0.598 |
| Our algorithm | 0.628 | 0.635 | 0.603 | 0.632 |