Surajit Ray, Thomas B Kepler.
Abstract
BACKGROUND: A key step in the development of an adaptive immune response to pathogens or vaccines is the binding of short peptides to molecules of the Major Histocompatibility Complex (MHC) for presentation to T lymphocytes, which are thereby activated and differentiate into effector and memory cells. The rational design of vaccines consists in part of identifying appropriate peptides to effect this process. Several algorithms are currently in use for making such predictions, but they are limited to a small number of MHC molecules and have good but imperfect predictive power.
Year: 2007 PMID: 17967170 PMCID: PMC2186325 DOI: 10.1186/1745-7580-3-9
Source DB: PubMed Journal: Immunome Res ISSN: 1745-7580
Figure 1. Sequence logo plot of position-specific conservation of (a) binders and (b) non-binders to HLA-A0201.
Selected Variables for each of the three classifiers using available binding data for MHC Class I allele A*0201
| Classifier | Step | Variable Selected | Misclassification error | Gain Achieved |
| SVM | 1 | hydrophobicity | 0.171839 | |
| | 2 | Volume | 0.142022 | 0.0298169 |
| | 3 | isoelec | 0.125781 | 0.0162415 |
| | 4 | branch | 0.118570 | 0.0142109 |
| | 5 | aromatic | 0.118395 | 0.0001743 |
| Random Forest | 1 | isoelec | 0.136791 | |
| | 2 | Volume | 0.130078 | 0.0067131 |
| | 3 | hydrophobicity | 0.129642 | 0.0004359 |
| Bagging | 1 | hydrophobicity | 0.146033 | |
| | 2 | Area | 0.140279 | 0.0057541 |
| | 3 | isoelec | 0.137227 | 0.0030514 |
| | 4 | aromatic | 0.134786 | 0.0024411 |
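The relationship between the last two columns of the table above can be sketched in a few lines of Python. This is a minimal illustration, assuming that "Gain Achieved" is the drop in misclassification error relative to the previous selection step, which the Bagging rows are consistent with. The function name `stepwise_gains` is hypothetical and not taken from the paper.

```python
def stepwise_gains(errors):
    """Return the drop in error at each step after the first.

    Assumption: gain at step k = error at step k-1 minus error at step k.
    """
    return [prev - curr for prev, curr in zip(errors, errors[1:])]


# Misclassification errors copied from the Bagging rows of the table.
bagging_errors = [0.146033, 0.140279, 0.137227, 0.134786]
bagging_gains = stepwise_gains(bagging_errors)

# Steps are numbered from 2 because the first step has no predecessor.
for step, gain in enumerate(bagging_gains, start=2):
    print(f"step {step}: gain = {gain:.7f}")
```

Under this reading, a selection procedure would typically stop once the gain from adding another variable becomes negligible; the stopping rule actually used is not stated in this record.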
Values of the three most important indexes (properties) of amino acids determining peptide-MHC binding. Volume reproduced from [43], hydrophobicity from [11], and isoelectric point from [44].
| One-letter code | Name | Volume [43] | Hydrophobicity [11] | Isoelectric point [44] |
| A | alanine | 88.6 | 1.8 | 6.00 |
| C | cysteine | 108.5 | 2.5 | 5.05 |
| D | aspartate | 111.1 | -3.5 | 2.77 |
| E | glutamate | 138.4 | -3.5 | 3.22 |
| F | phenylalanine | 189.9 | 2.8 | 5.48 |
| G | glycine | 60.1 | -0.4 | 5.97 |
| H | histidine | 153.2 | -3.2 | 7.47 |
| I | isoleucine | 166.7 | 3.8 | 5.94 |
| K | lysine | 168.6 | -3.9 | 9.59 |
| L | leucine | 166.7 | 3.8 | 5.98 |
| M | methionine | 162.9 | 1.9 | 5.74 |
| N | asparagine | 114.1 | -3.5 | 5.41 |
| P | proline | 112.7 | -1.6 | 6.30 |
| Q | glutamine | 143.8 | -3.5 | 5.65 |
| R | arginine | 173.4 | -4.5 | 11.15 |
| S | serine | 89.0 | -0.8 | 5.68 |
| T | threonine | 116.1 | -0.7 | 5.64 |
| V | valine | 140.0 | 4.2 | 5.96 |
| W | tryptophan | 227.8 | -0.9 | 5.89 |
| Y | tyrosine | 193.6 | -1.3 | 5.66 |
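One way such a table could feed a classifier is to replace each residue of a peptide with its property values. The sketch below is only an illustration under stated assumptions: the concatenated per-position encoding, the function name `encode_peptide`, and the example 9-mer are all hypothetical choices, not the encoding confirmed by the paper. The numeric values are copied from the table above.

```python
# (volume, hydrophobicity, isoelectric point) per residue, from the table.
AA_PROPERTIES = {
    "A": (88.6, 1.8, 6.00),   "C": (108.5, 2.5, 5.05),
    "D": (111.1, -3.5, 2.77), "E": (138.4, -3.5, 3.22),
    "F": (189.9, 2.8, 5.48),  "G": (60.1, -0.4, 5.97),
    "H": (153.2, -3.2, 7.47), "I": (166.7, 3.8, 5.94),
    "K": (168.6, -3.9, 9.59), "L": (166.7, 3.8, 5.98),
    "M": (162.9, 1.9, 5.74),  "N": (114.1, -3.5, 5.41),
    "P": (112.7, -1.6, 6.30), "Q": (143.8, -3.5, 5.65),
    "R": (173.4, -4.5, 11.15), "S": (89.0, -0.8, 5.68),
    "T": (116.1, -0.7, 5.64), "V": (140.0, 4.2, 5.96),
    "W": (227.8, -0.9, 5.89), "Y": (193.6, -1.3, 5.66),
}


def encode_peptide(peptide):
    """Flatten a peptide into [vol_1, hyd_1, iso_1, vol_2, ...].

    Assumption: features are concatenated position by position.
    Raises KeyError for residues outside the 20 standard amino acids.
    """
    features = []
    for residue in peptide.upper():
        features.extend(AA_PROPERTIES[residue])
    return features


# Illustrative 9-mer; any peptide over the 20 standard residues works.
example_features = encode_peptide("LLFGYPVYV")
print(len(example_features))  # 9 positions x 3 properties
```

A 9-mer therefore yields 27 numeric features, which can be passed to an SVM, random forest, or bagging classifier in place of the raw sequence.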
Figure 2. Misclassification error using different variables and classification methods applied to the MHC binding data for Class I allele A*0201.
Figure 3. AROC values using different variables and classification methods applied to the MHC binding data for Class I allele A*0201, categorized by (a) variables used and (b) classifier.