Ronghui You1, Wei Qu1,2, Hiroshi Mamitsuka3,4, Shanfeng Zhu1,2,5,6,7,8.
Abstract
MOTIVATION: Computationally predicting major histocompatibility complex (MHC)-peptide binding affinity is an important problem in immunological bioinformatics. Recent cutting-edge deep learning-based methods for this problem are unable to achieve satisfactory performance for MHC class II molecules. This is because such methods generate the input by simply concatenating the two given sequences, (the estimated binding core of) a peptide and (the pseudo sequence of) an MHC class II molecule, ignoring the biological knowledge behind the interactions of the two molecules. We thus propose a binding core-aware deep learning-based model, DeepMHCII, with a binding interaction convolution layer, which integrates all potential binding cores (in a given peptide) with the MHC pseudo (binding) sequence by modeling the interaction with multiple convolutional kernels.
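The idea of considering all potential binding cores in a peptide can be illustrated with a small sliding-window sketch. This is not the DeepMHCII implementation: the function names, the toy scoring function, and the choice of keeping the single highest-scoring window are assumptions made for illustration. Only the nine-position core length comes from the text (pockets P1 to P9).

```python
from typing import Callable, List, Tuple

CORE_LEN = 9  # the binding core spans nine pockets (P1-P9)


def candidate_cores(peptide: str, core_len: int = CORE_LEN) -> List[Tuple[int, str]]:
    """Return every (offset, core) window of length core_len in the peptide."""
    return [
        (start, peptide[start:start + core_len])
        for start in range(len(peptide) - core_len + 1)
    ]


def best_core(peptide: str, score: Callable[[str], float]) -> Tuple[int, str, float]:
    """Score all candidate cores and keep the highest-scoring one.

    Taking the maximum over windows is an assumption of this sketch.
    """
    cores = candidate_cores(peptide)
    if not cores:
        raise ValueError("peptide is shorter than the binding core")
    scored = [(offset, core, score(core)) for offset, core in cores]
    return max(scored, key=lambda item: item[2])


def toy_score(core: str) -> float:
    # Hypothetical stand-in for a learned interaction score:
    # it only rewards a hydrophobic residue at position P1.
    return 1.0 if core[0] in "FILVWY" else 0.0


if __name__ == "__main__":
    print(best_core("AAF" + "A" * 10, toy_score))
```

In a real model the scoring function would depend on both the peptide window and the MHC pseudo sequence; here it is deliberately trivial so that the enumeration step is easy to follow.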
Year: 2022 PMID: 35758790 PMCID: PMC9235502 DOI: 10.1093/bioinformatics/btac225
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. The architecture of DeepMHCII. The red arrows are processes of the binding interaction convolutional layer (BICL) and the blue arrows are processes of binding core prediction.
Summary statistics of BD2016
| Allele | No. of peptides | No. of binders | No. of MHCs |
|---|---|---|---|
| HLA-DR | 87 363 | 40 756 | 36 |
| HLA-DP | 15 564 | 5135 | 9 |
| HLA-DQ | 28 081 | 9098 | 27 |
| H-2 | 3273 | 894 | 8 |
| Total | 134 281 | 55 883 | 80 |
Data redundancy of BD2016 and BD2020
| Dataset | Outer-p | Inner-p | Ratio |
|---|---|---|---|
| BD2016 | 4.39e−4 | 2.41e−2 | 54.9 |
| BD2020 | 2.43e−3 | 2.43e−3 | 1 |
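The tabulated values are consistent with the Ratio column being Inner-p divided by Outer-p. That relation is inferred from the numbers rather than stated here, and the function name below is hypothetical. A quick check:

```python
def redundancy_ratio(outer_p: float, inner_p: float) -> float:
    """Ratio of Inner-p to Outer-p (assumed definition of the Ratio column)."""
    return inner_p / outer_p


# Values copied from the table above.
bd2016 = redundancy_ratio(outer_p=4.39e-4, inner_p=2.41e-2)
bd2020 = redundancy_ratio(outer_p=2.43e-3, inner_p=2.43e-3)

print(round(bd2016, 1), round(bd2020, 1))
```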
Performance of DeepMHCII and competing methods
| Method | 5-CV AUC | 5-CV PCC | LOMO AUC | LOMO PCC | IEBD test AUC | Binding core (No. of correct/No. of total) |
|---|---|---|---|---|---|---|
| NetMHCIIpan-3.2 | 0.847 | 0.679 | 0.775 | 0.544 | 0.719 | 45/51 |
| PUFFIN | 0.846 | 0.676 | 0.768 | 0.525 | 0.700 | – |
| DeepSeqPanII | 0.759 | 0.524 | 0.732 | 0.473 | 0.700 | 10/51 |
| DeepMHCII | | | | | | |
Fig. 2. Performance comparison between DeepMHCII and (a) NetMHCIIpan-3.2, (b) PUFFIN and (c) DeepSeqPanII under LOMO. Each dot represents an MHC-II molecule.
Performance (AUC) of DeepMHCII and competing methods on the independent testing set
| Allele | No. of peptides | No. of binders | NetMHCIIpan-3.2 | PUFFIN | DeepSeqPanII | DeepMHCII |
|---|---|---|---|---|---|---|
| DRB1*01:01 | 100 | 81 | 0.880 | 0.834 | 0.733 | |
| DRB1*03:01 | 99 | 61 | 0.588 | 0.620 | 0.536 | |
| DRB1*04:01 | 142 | 91 | 0.787 | 0.762 | 0.748 | |
| DRB1*07:01 | 94 | 72 | 0.800 | 0.728 | 0.732 | |
| DRB1*09:01 | 62 | 45 | 0.793 | 0.796 | 0.654 | |
| DRB1*11:01 | 94 | 72 | 0.614 | 0.629 | | 0.657 |
| DRB1*12:02 | 59 | 48 | 0.661 | 0.742 | | 0.788 |
| DRB1*13:01 | 57 | 47 | 0.538 | 0.472 | 0.591 | |
| DRB1*15:01 | 96 | 81 | 0.767 | 0.683 | 0.751 | |
| DRB1*15:02 | 54 | 41 | 0.760 | 0.735 | 0.730 | |
| Average | | | 0.719 | 0.700 | 0.700 | |
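The Average row for NetMHCIIpan-3.2 and PUFFIN matches a plain, unweighted mean of the ten per-allele AUCs. The sketch below checks this using only values present in the table. That the average is unweighted is an inference from the numbers, and the variable names are illustrative.

```python
from statistics import mean

# Per-allele AUCs copied from the table, in row order.
netmhciipan = [0.880, 0.588, 0.787, 0.800, 0.793, 0.614, 0.661, 0.538, 0.767, 0.760]
puffin = [0.834, 0.620, 0.762, 0.728, 0.796, 0.629, 0.742, 0.472, 0.683, 0.735]

# Unweighted mean over alleles, rounded to three decimals as in the table.
netmhciipan_avg = round(mean(netmhciipan), 3)
puffin_avg = round(mean(puffin), 3)

print(netmhciipan_avg, puffin_avg)
```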
Fig. 3. ROC curves by ID2017.
Performance of DeepMHCII and MHCAttnNet on BD2020
| Method | Accuracy | AUC |
|---|---|---|
| MHCAttnNet | 0.755 | 0.758 |
| DeepMHCII | | |
Fig. 4. Sequence logos by DeepMHCII, NetMHCIIpan-3.2 and DeepSeqPanII.
Weights of BICL for nine pockets of the binding core
| Pocket | Weight |
|---|---|
| P1 | |
| P2 | 230.51 ± 10.22 |
| P3 | 232.19 ± 8.48 |
| P4 | |
| P5 | 232.00 ± 10.86 |
| P6 | |
| P7 | 239.53 ± 9.44 |
| P8 | 227.72 ± 9.80 |
| P9 | |
Performance of DeepMHCII with different kernel sizes under 5-fold CV
| Method | AUC | PCC |
|---|---|---|
| DeepMHCII 5 | 0.828 | 0.637 |
| DeepMHCII 7 | 0.834 | 0.651 |
| DeepMHCII 9 | 0.849 | 0.676 |
| DeepMHCII 11 | 0.853 | 0.685 |
| DeepMHCII 13 | 0.854 | 0.686 |
| DeepMHCII 15 | 0.853 | 0.686 |
| DeepMHCII | | |
Performance of DeepMHCII with CNN instead of BICL
| Method | AUC | PCC |
|---|---|---|
| CNN | 0.806 | 0.608 |
| DeepMHCII | | |
Computation time and model size of DeepMHCII and competing methods
| Method | Training (s) | Test (ms/sample) | Model size (MB) |
|---|---|---|---|
| PUFFIN | 651 | 2.252 | 24.1 |
| DeepSeqPanII | 7967 | 0.828 | 15.3 |
| DeepMHCII | 693 | 0.583 | 1.4 |