Lianming Zhang, Yiqing Chen, Hau-San Wong, Shuigeng Zhou, Hiroshi Mamitsuka, Shanfeng Zhu.
Abstract
MOTIVATION: Accurate identification of peptides binding to specific Major Histocompatibility Complex Class II (MHC-II) molecules is of great importance for elucidating the underlying mechanism of immune recognition, as well as for developing effective epitope-based vaccines and promising immunotherapies for many severe diseases. Due to the extreme polymorphism of MHC-II alleles and the high cost of biochemical experiments, the development of computational methods for accurate prediction of binding peptides of MHC-II molecules, particularly for those with few or no experimental data, has become a topic of increasing interest. TEPITOPE is a widely used computational approach because of its good interpretability and relatively high performance. However, TEPITOPE can be applied to only 51 out of over 700 known HLA-DR molecules.
Year: 2012 PMID: 22383964 PMCID: PMC3285624 DOI: 10.1371/journal.pone.0030483
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Available X-ray structures of MHC class II HLA-peptide complexes.
| PDB ID | HLA Allele | Peptide Sequence |
| 1AQD | DRB1*01:01 | VGSD |
| 1PYW | DRB1*01:01 | X |
| 1KLG | DRB1*01:01 | GEL |
| 2FSE | DRB1*01:01 | AG |
| 1KLU | DRB1*01:01 | GEL |
| 1SJH | DRB1*01:01 | PE |
| 1SJE | DRB1*01:01 | PE |
| 1T5W | DRB1*01:01 | AA |
| 1T5X | DRB1*01:01 | AA |
| 2IAN | DRB1*01:01 | GEL |
| 2IAM | DRB1*01:01 | GEL |
| 2IPK | DRB1*01:01 | XPK |
| 1FYT | DRB1*01:01 | PK |
| 1R5I | DRB1*01:01 | PK |
| 1HXY | DRB1*01:01 | PK |
| 1JWM | DRB1*01:01 | PK |
| 1JWS | DRB1*01:01 | PK |
| 1JWU | DRB1*01:01 | PK |
| 1LO5 | DRB1*01:01 | PK |
| 2ICW | DRB1*01:01 | PK |
| 2OJE | DRB1*01:01 | PK |
| 2G9H | DRB1*01:01 | PK |
| 1A6A | DRB1*03:01 | PVSK |
| 1J8H | DRB1*04:01 | PK |
| 2SEB | DRB1*04:01 | AY |
| 1BX2 | DRB1*15:01 | ENPV |
| 1YMM | DRB1*15:01 | ENPV |
| 2Q6W | DRB3*01:01 | A |
| 3C5J | DRB3*02:01 | QVI |
| 1FV1 | DRB5*01:01 | NPVVHF |
| 1H15 | DRB5*01:01 | GGV |
| 1ZGL | DRB5*01:01 | VHF |
The table shows complex structures retrieved from PDB. The columns in the table give PDB ID, HLA-DR restriction and bound peptide (binding core highlighted in bold).
The HLA-DR amino acid residue positions of each pocket in the TEPITOPEpan profile.
| Pocket | Residue positions |
| P1 | 82 85 86 89 |
| P2 | 77 78 81 82 |
| P3 | 78 |
| P4 | 11 13 26 28 70 71 74 78 |
| P5 | 11 13 28 70 71 74 |
| P6 | 11 13 28 30 61 71 |
| P7 | 11 28 30 47 61 67 70 71 |
| P8 | 60 61 |
| P9 | 9 30 37 57 60 61 |
The first column gives nine pockets (P1 to P9). The second column shows corresponding residue positions in contact with each pocket.
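The pocket table above can be read as a lookup from pocket to residue positions. The following is a minimal sketch, assuming the positions are 1-based indices into an HLA-DR beta-chain sequence; the function name `pocket_pseudo_sequences` and the placeholder sequence `toy_chain` are illustrative and not taken from the paper.

```python
# Pocket -> residue positions, copied from the table above.
POCKET_POSITIONS = {
    "P1": [82, 85, 86, 89],
    "P2": [77, 78, 81, 82],
    "P3": [78],
    "P4": [11, 13, 26, 28, 70, 71, 74, 78],
    "P5": [11, 13, 28, 70, 71, 74],
    "P6": [11, 13, 28, 30, 61, 71],
    "P7": [11, 28, 30, 47, 61, 67, 70, 71],
    "P8": [60, 61],
    "P9": [9, 30, 37, 57, 60, 61],
}


def pocket_pseudo_sequences(beta_chain: str) -> dict[str, str]:
    """Collect, for each pocket, the residues found at its listed positions.

    Assumption: position 1 is the first character of `beta_chain`.
    """
    needed = max(max(positions) for positions in POCKET_POSITIONS.values())
    if len(beta_chain) < needed:
        raise ValueError(f"sequence must have at least {needed} residues")
    return {
        pocket: "".join(beta_chain[pos - 1] for pos in positions)
        for pocket, positions in POCKET_POSITIONS.items()
    }


# Placeholder sequence (hypothetical, NOT a real HLA-DR allele).
toy_chain = "ACDEFGHIKLMNPQRSTVWY" * 5
pseudo = pocket_pseudo_sequences(toy_chain)
print(pseudo["P1"], pseudo["P3"], pseudo["P8"])
```

Two alleles that share the same residues at a pocket's positions get the same string for that pocket, which is what makes a per-pocket comparison between alleles possible.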
Performance of TEPITOPEpan with different alphas in terms of AUC.
| Allele | Count | α | α | α | α | α | α | α | α | 1-KNN | TEPITOPE |
| DRB1*01:01 | 1203 | 0.622 | 0.635 | 0.642 | 0.648 | | 0.650 | 0.650 | 0.650 | 0.650 | 0.650 |
| DRB1*03:01 | 474 | 0.591 | 0.639 | 0.689 | 0.724 | | 0.729 | 0.728 | 0.728 | 0.729 | 0.727 |
| DRB1*04:01 | 457 | 0.745 | 0.757 | 0.764 | 0.771 | | 0.767 | 0.764 | 0.759 | 0.756 | 0.756 |
| DRB1*04:04 | 168 | 0.819 | 0.832 | 0.836 | 0.842 | | 0.843 | 0.841 | 0.839 | 0.838 | 0.837 |
| DRB1*04:05 | 171 | 0.748 | 0.762 | 0.770 | 0.783 | 0.792 | 0.794 | | 0.798 | 0.795 | 0.795 |
| DRB1*07:01 | 310 | 0.753 | 0.772 | | 0.770 | 0.767 | 0.767 | 0.767 | 0.767 | 0.766 | 0.766 |
| DRB1*08:02 | 174 | 0.771 | 0.781 | 0.793 | | 0.794 | 0.792 | 0.791 | 0.790 | 0.790 | 0.788 |
| DRB1*09:01 | 117 | | 0.724 | 0.715 | 0.702 | 0.696 | 0.689 | 0.688 | 0.689 | 0.686 | 0.644 |
| DRB1*11:01 | 359 | 0.691 | 0.699 | 0.704 | 0.707 | 0.714 | 0.721 | | | 0.723 | 0.723 |
| DRB1*13:02 | 179 | 0.725 | 0.735 | 0.744 | | 0.743 | 0.736 | 0.735 | 0.735 | 0.736 | 0.737 |
| DRB1*15:01 | 365 | 0.717 | 0.727 | 0.734 | | 0.731 | 0.731 | 0.730 | 0.730 | 0.730 | 0.730 |
| DRB3*01:01 | 102 | 0.640 | 0.700 | 0.734 | | 0.731 | 0.707 | 0.700 | 0.663 | 0.606 | 0.673 |
| DRB4*01:01 | 181 | 0.698 | 0.705 | 0.714 | 0.725 | 0.731 | 0.741 | 0.742 | | 0.744 | 0.718 |
| DRB5*01:01 | 343 | 0.626 | 0.638 | 0.644 | 0.645 | 0.651 | | 0.651 | 0.651 | 0.651 | 0.652 |
| Average | 4603 | 0.706 | 0.722 | 0.732 | | | 0.737 | 0.737 | 0.733 | 0.729 | 0.728 |
The highest value in each row among the α columns is highlighted in bold. 1-KNN denotes the result of deriving the PSSM from only the specificity vector(s) in the library with the highest similarity.
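The 1-KNN baseline described in the caption can be sketched as follows. This is only an illustration under stated assumptions: the similarity measure (fraction of identical residues), the pocket strings, and the toy specificity vectors are hypothetical, and the paper's actual similarity function and α-based weighting are not reproduced here.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical residues; an assumed stand-in for pocket similarity."""
    if len(a) != len(b):
        raise ValueError("pocket strings must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def one_knn_vector(query: str, library: list[tuple[str, dict[str, float]]]) -> dict[str, float]:
    """Return the specificity vector of the most similar library pocket.

    If several library pockets tie for the highest similarity,
    their vectors are averaged.
    """
    sims = [identity(query, pocket) for pocket, _ in library]
    best = max(sims)
    nearest = [vec for (_, vec), s in zip(library, sims) if s == best]
    return {aa: sum(v[aa] for v in nearest) / len(nearest) for aa in nearest[0]}


# Hypothetical library: (pocket residues, amino acid -> score).
library = [
    ("DEAV", {"A": 1.0, "L": -0.5}),
    ("DEAG", {"A": 0.0, "L": 0.5}),
    ("QKRY", {"A": -2.0, "L": 2.0}),
]
result = one_knn_vector("DEAY", library)
print(result)
```

In this toy case the first two library pockets tie at 3 of 4 identical residues, so their vectors are averaged; the third pocket matches at only one position and is ignored.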
AUC on Nielsen-Set2.
| Allele | Count | Binder | NetMHCIIpan-2.0 | NetMHCIIpan-1.0 | MultiRTA | TEPITOPE | TEPITOPEpan |
| DRB1*03:02 | 148 | 44 | | 0.688 | 0.549 | 0.602 | |
| DRB1*08:06 | 118 | 91 | | 0.703 | 0.652 | 0.870 | 0.886 |
| DRB1*08:13 | 1370 | 455 | 0.666 | 0.763 | 0.712 | 0.746 | |
| DRB1*08:19 | 116 | 54 | | 0.677 | 0.630 | 0.714 | |
| DRB1*12:01 | 117 | 81 | 0.798 | 0.587 | 0.620 | | |
| DRB1*12:02 | 117 | 79 | | 0.660 | 0.663 | 0.842 | |
| DRB1*14:02 | 118 | 78 | | 0.713 | 0.672 | 0.725 | |
| DRB1*14:04 | 30 | 16 | 0.679 | 0.571 | 0.563 | | |
| DRB1*14:12 | 116 | 63 | | 0.797 | 0.688 | 0.804 | |
| DRB3*03:01 | 160 | 70 | 0.765 | 0.739 | 0.729 | | |
| Average | | | 0.800 | 0.690 | 0.683 | 0.763 | |
The highest values for each allele are highlighted in bold. Results of NetMHCIIpan-2.0 are obtained by a leave-one-(allele)-out (LOO) experiment over the original 24 alleles in [27].
AUC on Lin-Set3.
| Allele | Count | Binder | NetMHCIIpan-2.0 | NetMHCIIpan-1.0 | MultiRTA | TEPITOPE | TEPITOPEpan |
| DRB1*01:01 | 103 | 15 | 0.883 | 0.846 | 0.817 | | |
| DRB1*03:01 | 103 | 18 | 0.716 | 0.668 | | 0.695 | 0.680 |
| DRB1*04:01 | 103 | 8 | | 0.814 | 0.696 | 0.754 | 0.782 |
| DRB1*07:01 | 103 | 10 | | 0.852 | 0.781 | 0.740 | 0.741 |
| DRB1*11:01 | 103 | 39 | | 0.820 | 0.819 | 0.824 | 0.826 |
| DRB1*13:01 | 103 | 11 | | 0.715 | 0.686 | 0.715 | 0.716 |
| DRB1*15:01 | 103 | 11 | | 0.790 | 0.689 | 0.659 | 0.661 |
| Average | | | 0.824 | 0.786 | 0.749 | 0.754 | 0.757 |
The highest value for each allele is highlighted in bold. According to Nielsen et al. [27], for DRB1*01:01, 04:01, 07:01 and 15:01, binding threshold is set to 100 nM, and threshold is set to 1000 nM for the rest when calculating the AUC.
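The allele-specific thresholds in the note above determine which peptides count as binders before the AUC is computed. Below is a minimal sketch, assuming measured affinities are given in nM with lower values meaning stronger binding, and that a peptide at or below the threshold is a binder; the function names and the toy data are illustrative only.

```python
# Alleles for which the note above gives a 100 nM threshold.
STRICT_ALLELES = {"DRB1*01:01", "DRB1*04:01", "DRB1*07:01", "DRB1*15:01"}


def binding_threshold_nm(allele: str) -> float:
    """100 nM for the four listed alleles, 1000 nM for the rest."""
    return 100.0 if allele in STRICT_ALLELES else 1000.0


def auc(scores: list[float], labels: list[bool]) -> float:
    """Probability that a random binder outscores a random non-binder.

    Ties between a binder and a non-binder count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def allele_auc(allele: str, affinity_nm: list[float], scores: list[float]) -> float:
    """Label peptides with the allele's threshold, then compute the AUC."""
    threshold = binding_threshold_nm(allele)
    # Assumption: affinity at or below the threshold marks a binder.
    labels = [a <= threshold for a in affinity_nm]
    return auc(scores, labels)


# Toy data (hypothetical): measured affinities and predicted scores.
affinity = [20.0, 80.0, 500.0, 5000.0]
predicted = [0.9, 0.4, 0.6, 0.1]
print(allele_auc("DRB1*01:01", affinity, predicted))  # 100 nM threshold
print(allele_auc("DRB1*03:01", affinity, predicted))  # 1000 nM threshold
```

The same predictions give different AUC values for the two alleles because the 500 nM peptide is a non-binder under the 100 nM threshold but a binder under the 1000 nM threshold.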
AUC on Epan-Set4.
| Allele | Count | Binder | NetMHCIIpan-2.0 | NetMHCIIpan-1.0 | MultiRTA | TEPITOPE | TEPITOPEpan |
| DRB1*01:02 | 92 | 62 | 0.746 | | 0.749 | 0.762 | 0.758 |
| DRB1*01:03 | 52 | 41 | 0.772 | 0.756 | 0.772 | | |
| DRB1*03:02 | 88 | 44 | | 0.775 | 0.733 | 0.823 | |
| DRB1*04:03 | 63 | 14 | 0.678 | 0.659 | 0.611 | | |
| DRB1*04:06 | 92 | 37 | 0.486 | | 0.519 | 0.501 | |
| DRB1*11:02 | 65 | 30 | | 0.738 | 0.591 | 0.723 | 0.738 |
| DRB1*11:03 | 64 | 27 | | 0.623 | 0.585 | 0.726 | |
| DRB1*11:04 | 73 | 34 | | 0.639 | 0.618 | 0.664 | 0.654 |
| DRB1*12:01 | 719 | 446 | | 0.721 | 0.673 | 0.659 | |
| DRB1*13:01 | 302 | 132 | 0.494 | 0.516 | 0.567 | | 0.623 |
| DRB1*14:01 | 43 | 33 | 0.676 | 0.761 | | 0.785 | |
| DRB1*15:02 | 47 | 21 | | 0.762 | 0.777 | 0.740 | 0.742 |
| DRB1*16:01 | 56 | 17 | | 0.793 | 0.789 | 0.644 | |
| DRB3*02:02 | 656 | 318 | | 0.732 | 0.680 | 0.686 | |
| Average | | | | 0.701 | 0.677 | 0.712 | |
| Average (Tepitope alleles) | | | | 0.688 | 0.661 | 0.705 | 0.703 |
| Average (Others) | | | | 0.708 | 0.686 | 0.717 | |
Highest values for each allele are highlighted in bold.
Figure 1. Comparison of different pan-specific methods using the sequence logos of peptides restricted to HLA-DRB1*04:02, DRB1*11:01, DRB1*12:01 and DRB1*13:01.
Evaluation on SYF-Set6 and IEDB-Set7.
| SYF-Set6 | Ligand | NetMHCIIpan-2.0 | NetMHCIIpan-1.0 | MultiRTA | TEPITOPE | TEPITOPEpan |
| Avg per ligand | 1164 | | 0.799 | 0.760 | 0.800 | |
| Avg per allele | 28 | | 0.787 | 0.756 | 0.769 | |
| Avg per allele (TEPITOPE alleles) | 17 | 0.785 | 0.767 | 0.733 | | 0.807 |
| Avg per allele (Other alleles) | 11 | 0.814 | | 0.791 | 0.711 | |
Identifying HLA-DR ligands and T-cell epitopes, respectively. Ligand and Epitopes show the number of HLA-DR ligands and HLA-DR epitopes, respectively. Avg per ligand shows the average AUC over all ligands, and Avg per allele gives the average of Avg per ligand over all alleles.
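The two averaging schemes in the note above differ in how much weight alleles with many ligands receive. A minimal sketch, assuming each record is an (allele, AUC) pair for one ligand; the function name `summarize` and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize(records: list[tuple[str, float]]) -> tuple[float, float]:
    """Return (avg_per_ligand, avg_per_allele) for per-ligand AUC values."""
    per_allele = defaultdict(list)
    for allele, value in records:
        per_allele[allele].append(value)
    # Every ligand counts equally.
    avg_per_ligand = mean(value for _, value in records)
    # Every allele counts equally, regardless of its number of ligands.
    avg_per_allele = mean(mean(values) for values in per_allele.values())
    return avg_per_ligand, avg_per_allele


# Toy records (hypothetical values): three ligands for one allele, one for another.
records = [
    ("DRB1*01:01", 1.0),
    ("DRB1*01:01", 0.5),
    ("DRB1*01:01", 0.75),
    ("DRB1*03:01", 0.25),
]
avg_ligand, avg_allele = summarize(records)
print(avg_ligand, avg_allele)
```

With these toy values the per-ligand average is dominated by the allele with three ligands, while the per-allele average treats both alleles equally, so the two numbers differ.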
The number of errors on predicting binding cores of 20 complexes in EpanCore-Set8.
| PDB | #complexes | #alleles | NetMHCIIpan-2.0 | NetMHCIIpan-1.0 | MultiRTA | TEPITOPEpan | TEPITOPE |
| Count | 20 | 7 | 5 errors | 3 errors | 3 errors | 2 errors | 0 errors (2 missing) |
The binding cores of 2 complexes cannot be predicted by TEPITOPE, since it doesn't cover DRB3*01:01 and DRB3*02:01.
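For PSSM-based predictors, a common way to obtain a binding core is to score every 9-residue window of the peptide and keep the best one. The sketch below illustrates that idea and the error/missing tally used in the table above; the PSSM layout (one dictionary per core position), the function names, and all toy data are assumptions, not the paper's implementation.

```python
from typing import Optional


def predict_core(peptide: str, pssm: list[dict[str, float]], core_len: int = 9) -> str:
    """Return the highest-scoring window of length `core_len`.

    `pssm[i][aa]` is the score of residue `aa` at core position i + 1;
    residues absent from a position's dictionary score 0.
    """
    if len(peptide) < core_len:
        raise ValueError("peptide shorter than the core length")

    def window_score(start: int) -> float:
        return sum(pssm[i].get(peptide[start + i], 0.0) for i in range(core_len))

    best_start = max(range(len(peptide) - core_len + 1), key=window_score)
    return peptide[best_start:best_start + core_len]


def count_core_errors(
    predicted: dict[str, Optional[str]], observed: dict[str, str]
) -> tuple[int, int]:
    """Count wrong cores and complexes with no prediction (None or absent)."""
    errors = 0
    missing = 0
    for complex_id, true_core in observed.items():
        core = predicted.get(complex_id)
        if core is None:
            missing += 1
        elif core != true_core:
            errors += 1
    return errors, missing


# Toy PSSM (hypothetical): only rewards phenylalanine at core position 1.
toy_pssm = [{"F": 2.0}] + [{} for _ in range(8)]
core = predict_core("AAFKLMNPQRSTA", toy_pssm)
print(core)

# Hypothetical complex identifiers, not real PDB entries.
observed = {"c1": "FKLMNPQRS", "c2": "AAAAAAAAA", "c3": "CCCCCCCCC"}
predicted = {"c1": core, "c2": "AAAAAAAAC", "c3": None}
print(count_core_errors(predicted, observed))
```

Keeping "missing" separate from "errors" mirrors how the table reports a method that cannot make a prediction for an allele it does not cover.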