Xiaoyun Yang, Liyuan Zhao, Fang Wei, Jing Li.
Abstract
BACKGROUND: Epitope prediction is a useful approach in cancer immunology and immunotherapy. Many computational methods, including machine learning and network analysis, have been developed rapidly for this purpose. For clinical applications, however, the existing tools are insufficient because few of the predicted binding molecules are immunogenic. To develop more potent and effective vaccines, it is therefore important to understand both binding and immunogenic potential. Here, we observed that the interactive association constituted by human leukocyte antigen (HLA)-peptide pairs can be regarded as a network in which each HLA and each peptide is taken as a node. We asked whether this network could capture the essential interactive propensities embedded in HLA-peptide pairs. We therefore developed a network-based deep learning method called DeepNetBim, which harnesses binding and immunogenic information to predict HLA-peptide interactions.
Keywords: Deep learning; Network analysis; T cell epitope prediction
Year: 2021 PMID: 33952199 PMCID: PMC8097772 DOI: 10.1186/s12859-021-04155-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Workflow diagram of the DeepNetBim framework. First, the binding and immunogenic data were retrieved from the IEDB database. Then, the weighted HLA-peptide binding network (coloured in blue) and immunogenic network (coloured in purple) were constructed separately to acquire quantified network features. Next, the encoded peptides and the obtained network features were fed into an attention-based deep learning process. Finally, the predicted binding affinity and binary immunogenic category of the above two independent models were combined to make the final prediction. BA binding affinity, IC immunogenic category
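The network-construction step in this workflow can be illustrated with a minimal sketch. It assumes the data arrive as (HLA allele, peptide, weight) records and uses node degree and strength (sum of incident edge weights) as stand-in features; the source does not name the network metrics it uses, so these two, the function name `node_features`, and the example records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (HLA allele, peptide, weight) records; values are illustrative only.
edges = [
    ("HLA-A*02:01", "GILGFVFTL", 0.9),
    ("HLA-A*02:01", "NLVPMVATV", 0.8),
    ("HLA-B*07:02", "TPRVTGGGA", 0.7),
    ("HLA-B*07:02", "GILGFVFTL", 0.1),
]

def node_features(edges):
    """Return {node: (degree, strength)} for a weighted bipartite edge list.

    Degree counts the edges touching a node; strength sums their weights.
    Both metrics are assumptions chosen for illustration.
    """
    degree = defaultdict(int)
    strength = defaultdict(float)
    for hla, peptide, weight in edges:
        # Each edge contributes to both of its endpoints.
        for node in (hla, peptide):
            degree[node] += 1
            strength[node] += weight
    return {node: (degree[node], strength[node]) for node in degree}

features = node_features(edges)
```

In this toy network the peptide `GILGFVFTL` is linked to both alleles, so it has degree 2, while each of the other peptides has degree 1.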
Coefficients of network metrics in two multiple regression models
| Binding (linear regression) | HLA attributes | | | | |
|---|---|---|---|---|---|
| Variable | | | | | |
| Coefficient | 0.281 | −0.048 | 0.095 | 0.02 | 0.003 |
| Peptide attributes | | | | | |
| Variable | | | | | |
| Coefficient | −0.044 | −0.173 | −0.003 | 0.05 | |
| Immunogenic (logistic regression) | HLA attributes | | | | |
| Variable | | | | | |
| Coefficient | 1.914 | 5.241 | −1.878 | −4.083 | −2.242 |
| Peptide attributes | | | | | |
| Variable | | | | | |
| Coefficient | −0.429 | 3.791 | 0.092 | −1.022 | |
Fig. 2 Improved performance from integrating network metrics in DeepNetBim. a Absolute error for binding affinity prediction and b accuracy rate for immunogenic prediction of the original model (coloured in blue) and the PepOnly model (coloured in white) for the 6 most frequent alleles in the test dataset. c Absolute error for binding affinity prediction and d ROC curve for immunogenic prediction of the original network, the shuffled network (network edge weights shuffled) and the random network (network weights reassigned from a uniform distribution) in the test dataset
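The two control networks in panels c and d can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: edges are (HLA, peptide, weight) tuples, the uniform range is taken to be [0, 1], and the function names `shuffle_weights` and `randomize_weights` are hypothetical.

```python
import random

def shuffle_weights(edges, rng):
    """Keep the topology but permute the existing weights among the edges."""
    weights = [w for _, _, w in edges]
    rng.shuffle(weights)
    return [(h, p, w) for (h, p, _), w in zip(edges, weights)]

def randomize_weights(edges, rng, low=0.0, high=1.0):
    """Keep the topology but redraw every weight from Uniform(low, high).

    The [0, 1] default range is an assumption; the source only says 'uniform'.
    """
    return [(h, p, rng.uniform(low, high)) for h, p, _ in edges]

# Hypothetical edge list; values are illustrative only.
original = [
    ("HLA-A*02:01", "GILGFVFTL", 0.9),
    ("HLA-A*02:01", "NLVPMVATV", 0.8),
    ("HLA-B*07:02", "TPRVTGGGA", 0.7),
    ("HLA-B*07:02", "GILGFVFTL", 0.1),
]

rng = random.Random(0)  # fixed seed so the controls are reproducible
shuffled = shuffle_weights(original, rng)
randomized = randomize_weights(original, rng)
```

Shuffling preserves the overall weight distribution while breaking the link between each weight and its HLA-peptide pair; redrawing from a uniform distribution discards the weight distribution as well.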
Fig. 3 Performance comparison on the benchmark dataset and the external dataset. a Performance of the binding model compared with PepOnly and other tools on the latest IEDB benchmark dataset in terms of AUC and SRCC. b Performance compared with PepOnly, NetMHC4.0 and NetMHCpan4.0 on external data in terms of absolute error value
Fig. 4 Performance of the separate models and the combined model. a Boxplot of positive and random peptides in the binding and the immunogenic model with p value < 0.001 by one-sided Mann–Whitney U test. b The proportion of recognized neoantigens in the top-ranked 10, 20, 30, 40 and 50 neoantigen candidates
Performance on combined model
| | Binding model | Immunogenic model | Combined model |
|---|---|---|---|
| Accuracy | 0.695 | 0.627 | 0.676 |
| Sensitivity | 0.780 | 0.585 | 0.708 |
| Specificity | 0.694 | 0.628 | 0.676 |
| PPV | 0.061 | 0.038 | 0.081 |
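For readers unfamiliar with the four measures in this table, the sketch below computes them from confusion-matrix counts using their standard definitions. The counts are hypothetical and chosen only to show how a rare positive class can yield a low PPV alongside moderate sensitivity and specificity; they are not the counts behind the table.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
    }

# Hypothetical counts: 10 true positives among 1000 candidates (1% prevalence).
metrics = confusion_metrics(tp=7, fp=297, tn=693, fn=3)
```

With these made-up counts, sensitivity and specificity are both 0.70, yet PPV is only about 0.02, because false positives far outnumber true positives when positives make up 1% of the data.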
Fig. 5 The architecture of the deep neural network in DeepNetBim. For an HLA-peptide pair, the encoded peptide (encoded to a 9 × 21 matrix, coloured in blue) and network metrics (encoded to a 1 × 8 vector, coloured in green) were fed into both convolutional and attention modules. The outputs of the two modules were then merged by concatenating them into a single tensor
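The 9 × 21 peptide input described in this caption can be illustrated with a minimal sketch. It assumes a one-hot encoding over the 20 standard amino acids plus a 21st symbol `X` for anything else; the caption gives only the matrix shape, so the encoding scheme, the alphabet order and the function name `one_hot_peptide` are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
ALPHABET = AMINO_ACIDS + "X"          # assumed 21st symbol for unknown residues
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}

def one_hot_peptide(peptide):
    """Encode a 9-mer peptide as a 9 x 21 one-hot matrix (list of rows)."""
    if len(peptide) != 9:
        raise ValueError("expected a 9-residue peptide")
    matrix = []
    for residue in peptide:
        row = [0] * len(ALPHABET)
        # Residues outside the 20 standard letters map to the 'X' column.
        row[INDEX.get(residue, INDEX["X"])] = 1
        matrix.append(row)
    return matrix

matrix = one_hot_peptide("GILGFVFTL")  # example 9-mer, illustrative only
shape = (len(matrix), len(matrix[0]))
```

Each row contains a single 1 marking the residue at that position, so the matrix has the 9 × 21 shape stated in the caption.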