Andrew J Bordner, Hans D Mittelmann.
Abstract
BACKGROUND: The binding of peptide fragments of antigens to class II MHC is a crucial step in initiating a helper T cell immune response. The identification of such peptide epitopes has potential applications in vaccine design and in better understanding autoimmune diseases and allergies. However, comprehensive experimental determination of peptide-MHC binding affinities is infeasible due to MHC diversity and the large number of possible peptide sequences. Computational methods trained on the limited experimental binding data can address this challenge. We present the MultiRTA method, an extension of our previous single-type RTA prediction method, which allows the prediction of peptide binding affinities for multiple MHC allotypes not used to train the model. Thus predictions can be made for many MHC allotypes for which experimental binding data are unavailable.
Year: 2010 PMID: 20868497 PMCID: PMC2957400 DOI: 10.1186/1471-2105-11-482
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
HLA-DP leave-one-allele-out cross-validation results for MultiRTA
| MHC β chain allele | AUC | RMS error | Correlation coefficient | Number of data |
|---|---|---|---|---|
| DPB1*0101 | 0.892 | 2.74 | 0.706 | 481 |
| DPB1*0201 | 0.890 | 1.99 | 0.718 | 474 |
| DPB1*0401 | 0.903 | 0.960 | 0.730 | 552 |
| DPB1*0402 | 0.901 | 0.981 | 0.698 | 537 |
| DPB1*0501 | 0.871 | 1.01 | 0.628 | 475 |
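The AUC, RMS error, and correlation columns reported in these tables can be illustrated with a minimal Python sketch. It assumes each peptide has a predicted and a measured binding score on the same scale, with larger values meaning stronger binding, and that binders are defined by a threshold on the measured score. The function names, the threshold, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def auc(scores, is_binder):
    """Fraction of (binder, non-binder) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, b in zip(scores, is_binder) if b]
    neg = [s for s, b in zip(scores, is_binder) if not b]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rms_error(pred, obs):
    """Root-mean-square difference between predicted and observed values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson(pred, obs):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    vp = sum((p - mp) ** 2 for p in pred)
    vo = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(vp * vo)

# Hypothetical toy values, for illustration only.
observed = [7.2, 6.1, 5.0, 4.3]
predicted = [6.8, 6.4, 4.1, 4.9]
threshold = 6.0  # assumed binder cutoff on the observed scale
binders = [o >= threshold for o in observed]

print(auc(predicted, binders), rms_error(predicted, observed), pearson(predicted, observed))
```

AUC depends only on how the predictions rank binders against non-binders, whereas RMS error and correlation compare the predicted values with the measured ones directly, which is why the tables report all three.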
Comparison of HLA-DR prediction results for MultiRTA, NetMHCIIpan, and TEPITOPE using the same data sets
| MHC β chain allele | MultiRTA AUC | MultiRTA RMS error | MultiRTA correlation coefficient | NetMHCIIpan AUC | NetMHCIIpan correlation coefficient | TEPITOPE AUC | Number of data |
|---|---|---|---|---|---|---|---|
| DRB1*0101 | 1.33 | 0.778 | 0.570 | 0.720 | 5166 | ||
| DRB1*0301 | 1.36 | 0.438 | 0.746 | 0.664 | 1020 | ||
| DRB1*0401 | 0.763 | 1.56 | 0.534 | 0.716 | 1024 | ||
| DRB1*0404 | 0.835 | 1.33 | 0.623 | 0.770 | 663 | ||
| DRB1*0405 | 1.28 | 0.566 | 0.759 | 630 | |||
| DRB1*0701 | 0.817 | 1.51 | 0.620 | 0.761 | 853 | ||
| DRB1*0802 | 0.786 | 1.45 | 0.523 | 0.766 | 420 | ||
| DRB1*0901 | 2.01 | 0.380 | 0.653 | NA | 530 | ||
| DRB1*1101 | 1.46 | 0.799 | 0.588 | 0.721 | 950 | ||
| DRB1*1302 | 1.68 | 0.658 | 0.351 | 0.652 | 498 | ||
| DRB1*1501 | 0.729 | 1.57 | 0.513 | 0.686 | 934 | ||
| DRB3*0101 | 1.10 | 0.716 | 0.444 | NA | 549 | ||
| DRB4*0101 | 1.61 | 0.724 | 0.469 | NA | 446 | ||
| DRB5*0101 | 0.788 | 1.60 | 0.543 | 0.680 | 924 | ||
The results for MultiRTA and NetMHCIIpan were obtained using leave-one-allele-out cross-validation, in which predictions are made for one allotype using a model trained on the data for the remaining allotypes. The largest AUC and correlation coefficient values for each MHC allotype are highlighted in bold.
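The leave-one-allele-out scheme described above can be sketched as a generator of train/test splits. This is only an illustration of the splitting logic under assumed data structures: the record layout `(allele, peptide, affinity)`, the function name, and the toy records are hypothetical, and no model is trained here.

```python
from collections import defaultdict

def leave_one_allele_out(records):
    """Yield (held_out_allele, train, test), one split per allele.

    records: iterable of (allele, peptide, affinity) tuples.
    """
    by_allele = defaultdict(list)
    for rec in records:
        by_allele[rec[0]].append(rec)
    for held_out in sorted(by_allele):
        test = by_allele[held_out]
        # Training data: every record whose allele differs from the held-out one.
        train = [r for a, recs in by_allele.items() if a != held_out for r in recs]
        yield held_out, train, test

# Placeholder peptides and affinities, not measured values.
records = [
    ("DRB1*0101", "AAAKAAAAA", 6.5),
    ("DRB1*0101", "AAAEAAAAA", 4.0),
    ("DRB1*0401", "AAAKAAAAA", 6.0),
    ("DRB1*1501", "AAALAAAAA", 5.5),
]

splits = list(leave_one_allele_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Because the held-out allotype contributes no training data, each split measures how well a model generalizes to an allotype it has never seen, which is the setting the comparison table evaluates.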
Predictions for each HLA-DR allotype using a single-type RTA model trained on data for the closest MHC allotype
| Test MHC allotype | Closest training MHC allotype | AUC | RMS error | Correlation coefficient |
|---|---|---|---|---|
| DRB1*0101 | DRB1*1501 | 0.646 | 1.86 | 0.306 |
| DRB1*0301 | DRB1*1302 | 0.628 | 1.89 | 0.170 |
| DRB1*0401 | DRB1*0405 | 0.656 | 1.93 | 0.348 |
| DRB1*0404 | DRB1*0401 | 0.745 | 1.46 | 0.483 |
| DRB1*0405 | DRB1*0401 | 0.732 | 1.43 | 0.403 |
| DRB1*0701 | DRB1*0901 | 0.681 | 1.72 | 0.382 |
| DRB1*0802 | DRB1*1101 | 0.793 | 1.58 | 0.505 |
| DRB1*0901 | DRB1*0701 | 0.671 | 1.81 | 0.388 |
| DRB1*1101 | DRB1*1302 | 0.628 | 1.89 | 0.264 |
| DRB1*1302 | DRB1*1101 | 0.646 | 1.85 | 0.302 |
| DRB1*1501 | DRB1*0101 | 0.670 | 2.11 | 0.363 |
| DRB3*0101 | DRB1*0301 | 0.654 | 1.59 | 0.322 |
| DRB4*0101 | DRB1*0101 | 0.665 | 2.05 | 0.295 |
| DRB5*0101 | DRB1*0101 | 0.736 | 2.05 | 0.433 |
Predictions for each HLA-DP allotype using a single-type RTA model trained on data for the closest MHC allotype
| Test MHC allotype | Closest training MHC allotype | AUC | RMS error | Correlation coefficient |
|---|---|---|---|---|
| DPB1*0101 | DPB1*0501 | 0.883 | 1.13 | 0.660 |
| DPB1*0201 | DPB1*0402 | 0.863 | 1.13 | 0.669 |
| DPB1*0401 | DPB1*0402 | 0.872 | 1.12 | 0.633 |
| DPB1*0402 | DPB1*0201 | 0.824 | 1.20 | 0.623 |
| DPB1*0501 | DPB1*0101 | 0.876 | 1.10 | 0.661 |
MultiRTA leave-one-allele-out prediction results for the DFRMLI compilation of experimental binding data for overlapping peptides from four different antigens
| MHC allotype | AUC (± 95% CI) | Number of binders |
|---|---|---|
| DRB1*0101 | 0.798 (± 0.096) | 23 |
| DRB1*0301 | 0.659 (± 0.177) | 11 |
| DRB1*0401 | 0.787 (± 0.104) | 22 |
| DRB1*0701 | 0.797 (± 0.092) | 17 |
| DRB1*1101 | 0.772 (± 0.100) | 29 |
| DRB1*1301 | 0.722 (± 0.361) | 2 |
| DRB1*1501 | 0.639 (± 0.143) | 15 |
The data set for each allotype contains 103 peptides with measured binding affinities. The 95% confidence intervals for AUC were calculated from the standard deviation of the linearly related Somers' Dxy statistic using the Hmisc package in R. Data sets with fewer binders have correspondingly higher uncertainty in their AUC values.
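The linear relation between AUC and Somers' Dxy mentioned above is AUC = (Dxy + 1) / 2, so a standard deviation for Dxy maps to half that value for AUC. The sketch below illustrates this in Python rather than R. It assumes a normal approximation with z = 1.96 for the 95% interval, which the source does not state explicitly; the function names and the numbers are hypothetical.

```python
def somers_dxy(scores, is_binder):
    """(concordant - discordant) / total over all binder/non-binder pairs."""
    pos = [s for s, b in zip(scores, is_binder) if b]
    neg = [s for s, b in zip(scores, is_binder) if not b]
    conc = sum(p > n for p in pos for n in neg)
    disc = sum(p < n for p in pos for n in neg)
    return (conc - disc) / (len(pos) * len(neg))

def auc_from_dxy(dxy):
    """AUC is a linear function of Dxy."""
    return 0.5 * (dxy + 1.0)

def auc_ci_halfwidth(sd_dxy, z=1.96):
    """Half-width of the AUC interval; SD(AUC) = SD(Dxy) / 2. z is an assumed normal quantile."""
    return z * sd_dxy / 2.0

# Hypothetical values for illustration only.
dxy, sd_dxy = 0.6, 0.1
print(auc_from_dxy(dxy), auc_ci_halfwidth(sd_dxy))
```

With few binders there are few binder/non-binder pairs, so the estimated standard deviation of Dxy is larger and the interval widens, consistent with the widest interval in the table belonging to the allotype with only 2 binders.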
Figure 1. Variation in the peptide core residue binding specificity as assessed by the standard deviation in MultiRTA predicted binding affinity over all 20 residue types. Each bar represents the variation for one MHC allotype in the training data set.
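One way to picture the specificity measure in this caption is to substitute each of the 20 residue types at one core position and take the standard deviation of the predicted affinities. The sketch below is an assumption about how such a value could be computed, not the authors' procedure: `predict` is a hypothetical stand-in for a trained model, the toy predictor is invented, and the choice of population rather than sample standard deviation is arbitrary.

```python
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types

def residue_specificity(predict, allele, core, position):
    """SD of predicted affinity when `position` of `core` is set to each residue type."""
    values = []
    for aa in AMINO_ACIDS:
        variant = core[:position] + aa + core[position + 1:]
        values.append(predict(allele, variant))
    return statistics.pstdev(values)

def toy_predict(allele, core):
    """Hypothetical predictor that responds only to the first core residue."""
    return 1.0 if core[0] in "FILVWY" else 0.0

print(residue_specificity(toy_predict, "DRB1*0101", "AAAAAAAAA", 0))
print(residue_specificity(toy_predict, "DRB1*0101", "AAAAAAAAA", 4))
```

A position where the prediction barely changes across residue types gives a value near zero, while a strongly selective position gives a large value.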