Wen-Jun Shen, Shaohong Zhang, Hau-San Wong.
Abstract
BACKGROUND: The immune system must detect a wide variety of microbial pathogens, such as viruses, bacteria, fungi and parasitic worms, to protect the host against disease. The display of antigenic peptides by MHC II (class II Major Histocompatibility Complex) molecules is a pivotal step in activating CD4+ TH cells (helper T cells). Activated TH cells can differentiate into effector cells, which assist various other cells in responding to pathogen invasion. Each MHC locus encodes a great number of allele variants, yet this limited set of MHC molecules is required to display an enormous number of antigenic peptides. Because measuring peptide binding to MHC molecules in biochemical experiments is expensive, only a few MHC molecules have sufficient measured peptides. To predict binding accurately for MHC alleles without sufficient measured peptides, a number of computational algorithms have been proposed in the last decades.
Year: 2013 PMID: 24565049 PMCID: PMC3908610 DOI: 10.1186/1477-5956-11-S1-S15
Source DB: PubMed Journal: Proteome Sci ISSN: 1477-5956 Impact factor: 2.480
X-ray crystallographic structures of pMHC II binding complexes.
| PDB ID | DRB Allele | Peptide Sequence | Core |
|---|---|---|---|
| | DRB1*0101 | GELIGILNAAKVPAD | IGILNAAKV |
| | DRB1*0301 | PVSKMRMATPLLMQA | MRMATPLLM |
| | DRB1*0101 | VGSDWRFLRGYHQYA | WRFLRGYHQ |
| | DRB1*1501 | ENPVVHFFKNIVTPR | VHFFKNIVT |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB5*0101 | NPVVHFFKNIVTPRTPPPSQ | FKNIVTPRT |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB5*0101 | GGVYHFVKKHVHES | YHFVKKHVH |
| | DRB5*0101 | VHFFKNIVTPRTP | FKNIVTPRT |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0401 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | GELIGILNAAKVPAD | IGILNAAKV |
| | DRB1*0101 | GELIGTLNAAKVPAD | IGTLNAAKV |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | XFVKQNAAAL | FVKQNAAAL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PEVIPMFSALSEGATP | VIPMFSALS |
| | DRB1*0101 | PEVIPMFSALSEG | VIPMFSALS |
| | DRB1*0101 | AAYSDQATPLLLSPR | YSDQATPLL |
| | DRB1*0101 | AAYSDQATPLLLSPR | YSDQATPLL |
| | DRB1*1501 | ENPVVHFFKNIVTPRGGSGGGGG | VHFFKNIVT |
| | DRB5*0101 | VHFFKNIVTPRTPGG | FKNIVTPRT |
| | DRB1*0101 | AGFKGEQGPKGEPG | FKGEQGPKG |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | GELIGILNAAKVPAD | IGILNAAKV |
| | DRB1*0101 | GELIGTLNAAKVPAD | IGTLNAAKV |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | XPKWVKQNTLKLAT | WVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB3*0101 | AWRSDEALPLGS | WRSDEALPL |
| | DRB1*0401 | AYMRADAAAGGA | MRADAAAGG |
| | DRB3*0301 | QVIILNHPGQISA | IILNHPGQI |
| | DRB1*0101 | APPAYEKLSAEQSPP | YEKLSAEQS |
| | DRB1*0101 | KPVSKMRMATPLLMQALPM | MRMATPLLM |
| | DRB1*0101 | KMRMATPLLMQALPM | MRMATPLLM |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
| | DRB1*0101 | PKYVKQNTLKLAT | YVKQNTLKL |
DRB Pocket Residue Indices.
| Pocket | Residue indices |
|---|---|
| P1 | 86 |
| P2 | - |
| P3 | - |
| P4 | 13, 70, 71, 74, 78 |
| P5 | - |
| P6 | 11 |
| P7 | 28, 30, 47, 61, 67, 71 |
| P8 | - |
| P9 | 9, 37, 57, 60, 61 |
Figure 1. Schematic illustration of profile generation for pockets 4, 6, 7 and 9.
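The pocket table above can be read as a lookup from each binding pocket to the DRB residues that line it. The following is only a minimal sketch of that lookup, not the authors' profile-generation procedure: the function name `pocket_signature`, the 1-based position numbering, and the placeholder sequence are all assumptions made for illustration.

```python
# Pocket residue indices copied from the table above.
# Pockets P2, P3, P5 and P8 have no listed residues and are omitted.
POCKET_RESIDUES = {
    "P1": [86],
    "P4": [13, 70, 71, 74, 78],
    "P6": [11],
    "P7": [28, 30, 47, 61, 67, 71],
    "P9": [9, 37, 57, 60, 61],
}


def pocket_signature(drb_sequence, pocket):
    """Return the residues lining `pocket` as a short string.

    Assumes the indices are 1-based positions in the DRB chain sequence;
    the source does not state the numbering convention.
    """
    positions = POCKET_RESIDUES[pocket]
    if max(positions) > len(drb_sequence):
        raise ValueError("sequence too short for pocket " + pocket)
    return "".join(drb_sequence[i - 1] for i in positions)


# Hypothetical placeholder sequence (not a real allele):
# 'A' everywhere except three marked positions.
toy = ["A"] * 90
toy[13 - 1] = "H"
toy[71 - 1] = "R"
toy[86 - 1] = "G"
toy_sequence = "".join(toy)

print(pocket_signature(toy_sequence, "P4"))  # residues at 13, 70, 71, 74, 78
print(pocket_signature(toy_sequence, "P1"))  # residue at 86
```

Position 71 appears in both the P4 and P7 lists, so two alleles that differ only at that residue would differ in both pocket signatures.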
The performance of OWA-PSSM, TEPITOPE and TEPITOPEpan on the SMM-align dataset in terms of AUC.
| Allele | # peptides | TEPITOPE | TEPITOPEpan | OWA-PSSM |
|---|---|---|---|---|
| DRB1*0101 | 1203 | 0.647 | 0.648 | 0.645 |
| DRB1*0301 | 474 | 0.733 | 0.739 | 0.731 |
| DRB1*0401 | 457 | 0.754 | 0.770 | 0.756 |
| DRB1*0404 | 168 | 0.829 | 0.832 | 0.830 |
| DRB1*0405 | 171 | 0.789 | 0.785 | 0.789 |
| DRB1*0701 | 310 | 0.768 | 0.768 | 0.771 |
| DRB1*0802 | 174 | 0.769 | 0.774 | 0.784 |
| DRB1*0901 | 117 | - | 0.686 | 0.731 |
| DRB1*1101 | 359 | 0.709 | 0.700 | 0.715 |
| DRB1*1302 | 179 | 0.721 | 0.728 | 0.727 |
| DRB1*1501 | 365 | 0.725 | 0.727 | 0.731 |
| DRB3*0101 | 102 | - | 0.724 | 0.869 |
| DRB4*0101 | 181 | - | 0.722 | 0.729 |
| DRB5*0101 | 343 | 0.654 | 0.652 | 0.646 |
| Average | | - | 0.732 | 0.747 |
| Average I | | 0.736 | 0.738 | 0.739 |
| Average II | | - | 0.711 | 0.776 |
"Average" is the average over 14 alleles. "Average I" is the average over the 11 alleles predictable by TEPITOPE. "Average II" is the average over the 3 alleles not predictable by TEPITOPE. We obtained the PSSMs of TEPITOPE and TEPITOPEpan from their public servers, ProPred (http://www.imtech.res.in/raghava/propred/page4.html) and TEPITOPEpan (http://www.biokdd.fudan.edu.cn/Service/TEPITOPEpan/TEPITOPEpan.html), respectively.
The performance on the MHCBench dataset in terms of AUC.
| Dataset | # peptides | OWA-PSSM | TEPITOPE | TEPITOPEpan | MultiRTA | NetMHCIIpan-2.0 |
|---|---|---|---|---|---|---|
| Set 1 | 1017 | 0.768 | 0.766 | 0.764 | 0.713 | 0.765 |
| Set 2 | 673 | 0.735 | 0.734 | 0.727 | 0.685 | 0.739 |
| Set 3a | 590 | 0.739 | 0.736 | 0.734 | 0.701 | 0.710 |
| Set 3b | 495 | 0.756 | 0.753 | 0.748 | 0.715 | 0.753 |
| Set 4a | 646 | 0.753 | 0.750 | 0.748 | 0.699 | 0.759 |
| Set 4b | 584 | 0.746 | 0.745 | 0.738 | 0.706 | 0.751 |
| Set 5a | 117 | 0.655 | 0.668 | 0.640 | 0.599 | 0.649 |
| Set 5b | 85 | 0.664 | 0.689 | 0.646 | 0.597 | 0.640 |
| Average | | 0.727 | 0.730 | 0.718 | 0.677 | 0.721 |
The PSSMs of TEPITOPE and TEPITOPEpan and the results of MultiRTA were obtained from their respective web servers. The predictions of NetMHCIIpan-2.0 were computed with its stand-alone software package.
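The tables report performance as AUC, but this record does not say how it was computed. As a hedged illustration, the sketch below uses the standard rank-based definition: the probability that a randomly chosen binder scores higher than a randomly chosen non-binder, with ties counted as one half. The function name `auc` and the toy scores are assumptions. The last lines check that the "Average" row of the MHCBench table is consistent with an unweighted mean over the eight sets, using the OWA-PSSM column.

```python
from statistics import mean


def auc(scores, labels):
    """Rank-based AUC: P(binder score > non-binder score), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four peptides (1 = binder, 0 = non-binder).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))  # → 0.75

# OWA-PSSM column of the MHCBench table (Set 1 to Set 5b).
owa_pssm = [0.768, 0.735, 0.739, 0.756, 0.753, 0.746, 0.655, 0.664]
print(f"{mean(owa_pssm):.3f}")  # → 0.727
```

The pairwise loop is quadratic in the number of peptides, which is acceptable for sets of the sizes listed here; a sort-based implementation would be preferable for much larger data.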
Prediction performance on the HLA-DR ligand dataset.
| Allele | # ligands | OWA-PSSM | TEPITOPE | TEPITOPEpan | MultiRTA | NetMHCIIpan-2.0 |
|---|---|---|---|---|---|---|
| DRB1*0101 | 53 | 0.827 | 0.833 | 0.834 | 0.833 | 0.835 |
| DRB1*0102 | 5 | 0.889 | 0.895 | 0.892 | 0.935 | 0.927 |
| DRB1*0301 | 88 | 0.667 | 0.673 | 0.671 | 0.652 | 0.789 |
| DRB1*0401 | 468 | 0.831 | 0.833 | 0.826 | 0.771 | 0.875 |
| DRB1*0402 | 36 | 0.882 | 0.885 | 0.880 | 0.768 | 0.667 |
| DRB1*0403 | 1 | 0.954 | - | 0.954 | 1.000 | 0.845 |
| DRB1*0404 | 42 | 0.779 | 0.775 | 0.797 | 0.711 | 0.765 |
| DRB1*0405 | 36 | 0.804 | 0.809 | 0.778 | 0.729 | 0.856 |
| DRB1*0701 | 47 | 0.698 | 0.697 | 0.696 | 0.720 | 0.744 |
| DRB1*0801 | 39 | 0.694 | 0.697 | 0.656 | 0.541 | 0.643 |
| DRB1*0802 | 1 | 0.930 | 0.916 | 0.923 | 0.532 | 0.978 |
| DRB1*0803 | 1 | 0.229 | - | 0.149 | 0.383 | 0.292 |
| DRB1*0901 | 6 | 0.750 | - | 0.659 | 0.842 | 0.957 |
| DRB1*1001 | 183 | 0.783 | - | 0.770 | 0.827 | 0.866 |
| DRB1*1101 | 35 | 0.828 | 0.835 | 0.831 | 0.838 | 0.896 |
| DRB1*1104 | 8 | 0.868 | 0.870 | 0.856 | 0.811 | 0.911 |
| DRB1*1201 | 11 | 0.801 | - | 0.828 | 0.847 | 0.863 |
| DRB1*1301 | 16 | 0.819 | 0.824 | 0.813 | 0.745 | 0.724 |
| DRB1*1302 | 19 | 0.743 | 0.742 | 0.735 | 0.720 | 0.561 |
| DRB1*1401 | 9 | 0.712 | - | 0.730 | 0.704 | 0.810 |
| DRB1*1501 | 22 | 0.720 | 0.718 | 0.717 | 0.663 | 0.671 |
| DRB1*1502 | 3 | 0.773 | 0.767 | 0.774 | 0.706 | 0.665 |
| DRB1*1601 | 2 | 0.621 | - | 0.630 | 0.918 | 0.849 |
| DRB3*0101 | 2 | 0.888 | - | 0.918 | 0.953 | 0.971 |
| DRB3*0301 | 5 | 0.907 | - | 0.783 | 0.939 | 0.948 |
| DRB4*0101 | 6 | 0.500 | - | 0.491 | 0.515 | 0.726 |
| DRB4*0103 | 2 | 0.941 | - | 0.753 | 0.745 | 0.827 |
| DRB5*0101 | 18 | 0.823 | 0.842 | 0.835 | 0.777 | 0.847 |
| Average | 1164 | 0.774 | - | 0.756 | 0.754 | 0.797 |
| Average I | | 0.799 | 0.801 | 0.795 | 0.732 | 0.786 |
| Average II | | 0.735 | - | 0.697 | 0.789 | 0.814 |
"Average" is the average over 28 alleles. "Average I" is the average over the 17 alleles predictable by TEPITOPE.
"Average II" is the average over the 11 alleles not predictable by TEPITOPE. The PSSMs of TEPITOPE and TEPITOPEpan and the results of MultiRTA were obtained from their respective web servers. The predictions of NetMHCIIpan-2.0 were obtained directly from its publication.
Prediction performance on the HLA-DR T-cell epitope dataset.
| Allele | # epitopes | OWA-PSSM | TEPITOPE | TEPITOPEpan | MultiRTA | NetMHCIIpan-2.0 |
|---|---|---|---|---|---|---|
| DRB1*0101 | 125 | 0.807 | 0.795 | 0.808 | 0.786 | 0.810 |
| DRB1*0102 | 4 | 0.790 | 0.761 | 0.792 | 0.822 | 0.879 |
| DRB1*0103 | 5 | 0.837 | - | 0.719 | 0.528 | 0.667 |
| DRB1*0301 | 173 | 0.632 | 0.637 | 0.640 | 0.655 | 0.683 |
| DRB1*0401 | 342 | 0.747 | 0.741 | 0.743 | 0.707 | 0.775 |
| DRB1*0402 | 33 | 0.575 | 0.574 | 0.571 | 0.521 | 0.570 |
| DRB1*0403 | 14 | 0.904 | - | 0.905 | 0.848 | 0.896 |
| DRB1*0404 | 46 | 0.732 | 0.732 | 0.738 | 0.715 | 0.744 |
| DRB1*0405 | 21 | 0.745 | 0.746 | 0.722 | 0.579 | 0.626 |
| DRB1*0406 | 6 | 0.766 | - | 0.753 | 0.869 | 0.741 |
| DRB1*0407 | 4 | 0.808 | - | 0.824 | 0.749 | 0.668 |
| DRB1*0408 | 2 | 0.930 | 0.930 | 0.930 | 0.999 | 0.986 |
| DRB1*0701 | 56 | 0.737 | 0.720 | 0.736 | 0.736 | 0.742 |
| DRB1*0703 | 1 | 0.905 | 0.911 | 0.915 | 0.707 | 0.896 |
| DRB1*0801 | 4 | 0.586 | 0.554 | 0.640 | 0.716 | 0.663 |
| DRB1*0802 | 2 | 0.848 | 0.866 | 0.850 | 0.685 | 0.754 |
| DRB1*0803 | 2 | 0.548 | - | 0.516 | 0.707 | 0.852 |
| DRB1*0901 | 13 | 0.729 | - | 0.697 | 0.636 | 0.738 |
| DRB1*1001 | 4 | 0.870 | - | 0.835 | 0.789 | 0.875 |
| DRB1*1101 | 88 | 0.752 | 0.745 | 0.751 | 0.703 | 0.815 |
| DRB1*1102 | 1 | 0.843 | 0.828 | 0.822 | 0.503 | 0.493 |
| DRB1*1103 | 3 | 0.333 | - | 0.328 | 0.480 | 0.510 |
| DRB1*1104 | 6 | 0.793 | 0.810 | 0.805 | 0.666 | 0.807 |
| DRB1*1201 | 3 | 0.876 | - | 0.887 | 0.862 | 0.970 |
| DRB1*1301 | 15 | 0.767 | 0.783 | 0.756 | 0.642 | 0.632 |
| DRB1*1302 | 10 | 0.832 | 0.809 | 0.813 | 0.781 | 0.860 |
| DRB1*1303 | 3 | 0.482 | - | 0.562 | 0.515 | 0.604 |
| DRB1*1401 | 16 | 0.718 | - | 0.781 | 0.697 | 0.789 |
| DRB1*1404 | 1 | 0.930 | - | 0.949 | 0.938 | 0.956 |
| DRB1*1405 | 2 | 0.861 | - | 0.807 | 0.848 | 0.839 |
| DRB1*1501 | 193 | 0.688 | 0.681 | 0.690 | 0.665 | 0.722 |
| DRB1*1502 | 20 | 0.611 | 0.605 | 0.608 | 0.570 | 0.681 |
| DRB1*1503 | 2 | 0.802 | - | 0.829 | 0.531 | 0.874 |
| DRB1*1601 | 5 | 0.684 | - | 0.699 | 0.721 | 0.724 |
| DRB1*1602 | 3 | 0.885 | - | 0.912 | 0.886 | 0.984 |
| DRB3*0101 | 12 | 0.875 | - | 0.833 | 0.883 | 0.895 |
| DRB3*0202 | 10 | 0.588 | - | 0.613 | 0.466 | 0.539 |
| DRB3*0301 | 1 | 0.988 | - | 0.885 | 0.906 | 0.966 |
| DRB4*0101 | 17 | 0.663 | - | 0.560 | 0.583 | 0.789 |
| DRB4*0103 | 1 | 0.990 | - | 0.990 | 0.992 | 0.991 |
| DRB5*0101 | 55 | 0.746 | 0.738 | 0.747 | 0.752 | 0.802 |
| DRB5*0102 | 1 | 0.909 | - | 0.870 | 0.752 | 0.987 |
| Average | | 0.764 | 0.748 | 0.758 | 0.717 | 0.781 |
| Average I | | 0.753 | 0.748 | 0.754 | 0.696 | 0.747 |
| Average II | | 0.775 | - | 0.761 | 0.736 | 0.812 |
"Average" is the average over 42 alleles. "Average I" is the average over the 20 alleles predictable by TEPITOPE.
"Average II" is the average over the 22 alleles not predictable by TEPITOPE. The PSSMs of TEPITOPE and TEPITOPEpan and the results of MultiRTA were obtained from their respective web servers. The predictions of NetMHCIIpan-2.0 were obtained directly from its publication.
Comparison of OWA-PSSM with four pan-specific methods in identifying MHC II-peptide binding cores.
| PDB ID | OWA-PSSM | TEPITOPE | TEPITOPEpan | MultiRTA | NetMHCIIpan-2.0 |
|---|---|---|---|---|---|
| IGILNAAKV | IGILNAAKV | IGILNAAKV | IGILNAAKV | ||
| MRMATPLLM | MRMATPLLM | MRMATPLLM | MRMATPLLM | MRMATPLLM | |
| WRFLRGYHQ | WRFLRGYHQ | WRFLRGYHQ | WRFLRGYHQ | WRFLRGYHQ | |
| VHFFKNIVT | VHFFKNIVT | VHFFKNIVT | VHFFKNIVT | ||
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| FKNIVTPRT | FKNIVTPRT | FKNIVTPRT | |||
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YHFVKKHVH | YHFVKKHVH | YHFVKKHVH | YHFVKKHVH | YHFVKKHVH | |
| FKNIVTPRT | FKNIVTPRT | FKNIVTPRT | |||
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| IGILNAAKV | IGILNAAKV | IGILNAAKV | IGILNAAKV | ||
| IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| FVKQNAAAL | FVKQNAAAL | FVKQNAAAL | FVKQNAAAL | FVKQNAAAL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| VIPMFSALS | VIPMFSALS | VIPMFSALS | VIPMFSALS | VIPMFSALS | |
| VIPMFSALS | VIPMFSALS | VIPMFSALS | VIPMFSALS | VIPMFSALS | |
| YSDQATPLL | YSDQATPLL | YSDQATPLL | YSDQATPLL | ||
| YSDQATPLL | YSDQATPLL | YSDQATPLL | YSDQATPLL | ||
| VHFFKNIVT | VHFFKNIVT | VHFFKNIVT | VHFFKNIVT | VHFFKNIVT | |
| FKNIVTPRT | FKNIVTPRT | FKNIVTPRT | |||
| FKGEQGPKG | FKGEQGPKG | FKGEQGPKG | FKGEQGPKG | FKGEQGPKG | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| IGILNAAKV | IGILNAAKV | IGILNAAKV | IGILNAAKV | ||
| IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | IGTLNAAKV | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| WVKQNTLKL | WVKQNTLKL | WVKQNTLKL | WVKQNTLKL | WVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| WRSDEALPL | - | WRSDEALPL | WRSDEALPL | WRSDEALPL | |
| MRADAAAGG | MRADAAAGG | MRADAAAGG | |||
| IILNHPGQI | - | IILNHPGQI | IILNHPGQI | ||
| YEKLSAEQS | YEKLSAEQS | YEKLSAEQS | YEKLSAEQS | YEKLSAEQS | |
| MRMATPLLM | MRMATPLLM | MRMATPLLM | MRMATPLLM | MRMATPLLM | |
| MRMATPLLM | MRMATPLLM | MRMATPLLM | MRMATPLLM | ||
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | YVKQNTLKL | |
| Correct | 41/41 | - | 39/41 | 36/41 | 32/41 |
| Correct I | 39/39 | 39/39 | 38/39 | 34/39 | 30/39 |
| Correct II | 2/2 | - | 1/2 | 2/2 | 2/2 |
Incorrectly predicted binding cores are highlighted in bold. "Correct" gives the number of correctly identified cores over the 41 X-ray structures. "Correct I" gives the number of correctly identified cores over the 39 X-ray structures whose DRB alleles are predictable by TEPITOPE. "Correct II" gives the number of correctly identified cores over the 2 X-ray structures whose DRB alleles are not predictable by TEPITOPE. The PSSMs of TEPITOPE and TEPITOPEpan and the results of MultiRTA and NetMHCIIpan-2.0 were obtained from their respective web servers.
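For PSSM-based methods, a common way to identify a binding core is to slide a 9-residue window along the peptide, score each window against the matrix, and keep the best-scoring one. The sketch below illustrates that general technique under stated assumptions; it is not the OWA-PSSM or TEPITOPE implementation. The names `score_core` and `predict_core` are hypothetical, and the weights in `TOY_PSSM` are invented purely so the example runs.

```python
CORE_LENGTH = 9

# Toy PSSM: one dict per core position (P1..P9) mapping residue -> score.
# These weights are invented for illustration only; they are NOT values
# from TEPITOPE, TEPITOPEpan or OWA-PSSM.
TOY_PSSM = [dict() for _ in range(CORE_LENGTH)]
TOY_PSSM[0] = {"Y": 1.0, "F": 1.0, "W": 1.0}  # P1 rewards aromatic residues
TOY_PSSM[8] = {"L": 1.0}                      # P9 rewards leucine


def score_core(core, pssm):
    """Sum the per-position scores of a 9-mer; unlisted residues score 0."""
    return sum(pssm[i].get(res, 0.0) for i, res in enumerate(core))


def predict_core(peptide, pssm):
    """Return the highest-scoring 9-mer window (first one wins on ties)."""
    if len(peptide) < CORE_LENGTH:
        raise ValueError("peptide shorter than the core length")
    windows = [
        peptide[i:i + CORE_LENGTH]
        for i in range(len(peptide) - CORE_LENGTH + 1)
    ]
    return max(windows, key=lambda w: score_core(w, pssm))


# Two peptides from the X-ray structure table.
print(predict_core("PKYVKQNTLKLAT", TOY_PSSM))   # → YVKQNTLKL
print(predict_core("XPKWVKQNTLKLAT", TOY_PSSM))  # → WVKQNTLKL
```

With a real allele-specific matrix in place of `TOY_PSSM`, a predicted core counts as correct when it equals the core observed in the corresponding X-ray structure.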