Lawrence J K Wee, Diane Simarmata, Yiu-Wing Kam, Lisa F P Ng, Joo Chuan Tong.
Abstract
BACKGROUND: The identification of B-cell epitopes on antigens has been a subject of intense research, as knowledge of these markers has great implications for the development of peptide-based diagnostics, therapeutics and vaccines. As experimental approaches are often laborious and time-consuming, in silico methods for predicting these immunogenic regions are critical. Such efforts, however, have been significantly hindered by high variability in the length and composition of the epitope sequences, making naïve modeling methods difficult to apply.
Year: 2010 PMID: 21143805 PMCID: PMC3005920 DOI: 10.1186/1471-2164-11-S4-S21
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Heat maps of relative position-specific amino acid propensities (Px). Px values of amino acids were computed for the EL-Manzalawy analysis and control subsets (left, top and bottom respectively) and the Chen analysis and control subsets (right, top and bottom respectively). Each Px value is the ratio of the frequency of occurrence of an amino acid in the epitopes pool to the frequency of occurrence of the same amino acid in the non-epitopes pool at a specific position. Px values were calculated using the epitopes and non-epitopes pools in the analysis subsets, and using the two pools of non-epitopes in the control subsets. Increasing color intensities in the red spectrum indicate enrichment in the epitopes pool (high Px), while increasing color intensities in the blue spectrum indicate enrichment in the non-epitopes pool (low Px).
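The Px ratio described in the Figure 1 caption can be illustrated with a minimal Python sketch. The function names, the toy peptides, and the choice to return `None` when an amino acid never occurs in the non-epitopes pool at a position are all assumptions made for illustration; the record does not state how the authors handled zero frequencies or how peptides were aligned.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def position_frequency(peptides, position):
    """Fraction of peptides carrying each amino acid at a 0-based position."""
    counts = Counter(p[position] for p in peptides)
    return {aa: counts[aa] / len(peptides) for aa in AMINO_ACIDS}

def px_at_position(epitopes, non_epitopes, position):
    """Px = epitope frequency / non-epitope frequency, per amino acid.

    Assumption: ratios with a zero non-epitope frequency are left
    undefined and returned as None.
    """
    f_epi = position_frequency(epitopes, position)
    f_non = position_frequency(non_epitopes, position)
    return {
        aa: (f_epi[aa] / f_non[aa]) if f_non[aa] > 0 else None
        for aa in AMINO_ACIDS
    }

# Toy equal-length peptides, invented for illustration only.
epitopes = ["ACD", "ACE", "GCD", "ACD"]
non_epitopes = ["GCD", "GKE", "ACE", "GKD"]

px = px_at_position(epitopes, non_epitopes, 0)
print(px["A"], px["G"])
```

In this toy data, alanine at the first position is more frequent among epitopes than non-epitopes (Px above 1, red in the heat map), while glycine is less frequent (Px below 1, blue).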
Figure 2. Quantitative measurement of the spread between the relative position-specific amino acid propensities (Px). At each residue position, the standard deviation of the Px values of all amino acids was calculated. Standard deviation scores were plotted against analysis and control subsets for the EL-Manzalawy (top) and Chen (bottom) datasets.
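The per-position spread in the Figure 2 caption can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `px_spread` and the input layout (one dictionary of Px values per position) are hypothetical, and the population standard deviation is used as an assumption because the caption does not say whether the population or sample form was applied.

```python
from statistics import pstdev

def px_spread(px_by_position):
    """Standard deviation of the defined Px values at each position.

    Assumption: undefined ratios (None) are skipped, and the population
    standard deviation is used.
    """
    return [
        pstdev([v for v in column.values() if v is not None])
        for column in px_by_position
    ]

# Toy Px values for two positions and three amino acids (illustrative only).
px_by_position = [
    {"A": 1.0, "C": 1.0, "D": 1.0},  # no preference: every ratio equals 1
    {"A": 2.0, "C": 1.0, "D": 0.0},  # strong position-specific preference
]

spread = px_spread(px_by_position)
print(spread)
```

A position where every amino acid has the same Px gives a spread of zero, whereas a position with clear enrichment and depletion gives a larger value.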
Results of SVM prediction on independent test sets
| SVM Classifier* | Sensitivity (%) | Specificity (%) | Accuracy (%) | AROC |
|---|---|---|---|---|
| SVM12 | 59.00 | 54.00 | 56.50 | 0.60 |
| SVM14 | 60.00 | 51.00 | 55.50 | 0.62 |
| SVM16 | 62.00 | 47.00 | 54.50 | 0.59 |
| SVM18 | 62.00 | 57.00 | 59.50 | 0.67 |
| SVM20 | 54.00 | 62.00 | 58.00 | 0.64 |
| BFE-SVM12 | 70.00 | 64.00 | 67.00 | 0.71 |
| BFE-SVM14 | 63.00 | 69.00 | 66.00 | 0.73 |
| BFE-SVM16 | 73.00 | 62.00 | 67.50 | 0.73 |
| BFE-SVM18 | 70.00 | 69.00 | 69.50 | 0.80 |
| BFE-SVM20 | 81.00 | 68.00 | 74.50 | 0.84 |
| BFE-Chen | 70.00 | 67.00 | 68.50 | 0.74 |
*SVM classifiers are generated for different peptide lengths as indicated by the numerical value assigned to each classifier.
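For readers relating the columns of the table above, the standard definitions of sensitivity, specificity and accuracy can be sketched in Python. The confusion-matrix counts below are hypothetical: they assume a balanced test set of 100 epitopes and 100 non-epitopes, which the record does not state, although every reported accuracy equals the mean of the corresponding sensitivity and specificity, as would be expected for balanced sets.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts chosen to reproduce the SVM12 row under the
# balanced-test-set assumption (100 positives, 100 negatives).
sens, spec, acc = classification_metrics(tp=59, fn=41, tn=54, fp=46)
print(sens, spec, acc)
```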
SVM prediction of epitopes on antigens in Pellequer dataset
| UniProt ID | Antigen | Accuracy (%) |
|---|---|---|
| P01556 | Enterotoxin beta chain precursor (Cholera) | 76.74 |
| P00001 | Cytochrome c (Human) | 53.03 |
| P03138 | Major surface antigen precursor (Hepatitis B virus) | 37.32 |
| P01233 | Choriogonadotropin beta chain precursor (Human) | 44.09 |
| P01574 | Interferon beta precursor (Human) | 72.48 |
| P02238 | Leghemoglobin A (Soybean) | 75.24 |
| P00698 | Lysozyme C precursor (Chicken) | 66.06 |
| P02247 | Myohemerythrin (Themiste zostericola) | 61.25 |
| P02185 | Myoglobin (Physeter catodon) | 66.96 |
| P04127 | PAP fimbrial major pilin protein precursor (E.coli) | 58.50 |
| P01112 | Transforming protein p21/H-RAS-1 (Human) | 83.44 |
| P00797 | Renin precursor (Human) | 36.68 |
| P01484 | Neurotoxin II (Androctonus australis hector) | 51.06 |
| P03570 | Coat protein (Tobacco mosaic virus) | 52.50 |