Heng Luo, Hao Ye, Hui Wen Ng, Leming Shi, Weida Tong, Donna L Mendrick, Huixiao Hong.
Abstract
As the major histocompatibility complexes in humans, the human leukocyte antigens (HLAs) present antigen peptides to T-cell receptors for immunological recognition and response. Interpreting and predicting HLA-peptide binding is important for studying T-cell epitopes, immune reactions, and the mechanisms of adverse drug reactions. We review the different types of machine learning methods and tools that have been used for HLA-peptide binding prediction. We also summarize the descriptors on which HLA-peptide binding prediction models have been built and discuss the limitations and challenges of current methods. Lastly, we give a future perspective on HLA-peptide binding prediction based on network analysis.
Keywords: HLA; MHC; binding; machine learning; peptide; prediction
Year: 2015 PMID: 26512199 PMCID: PMC4603527 DOI: 10.4137/BBI.S29466
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Figure 1. The typical pathways by which HLAs present antigen peptides to T-cells. In the HLA Class I pathway, endogenous antigen proteins are degraded by proteasomes into peptides that are transported via transporters associated with antigen processing (TAPs) into the ER. The peptides are loaded onto Class I HLAs and the complexes are sent to the Golgi apparatus for modification. Finally, the complexes are fused into the cell membrane, where they can be recognized by TCRs on CD8+ T-cells. In the HLA Class II pathway, exogenous protein antigens are ingested by the cell into endocytic vesicular compartments, loaded onto Class II HLAs in the ER, and processed by the Golgi apparatus. The complexes are presented on the cell surface and recognized by TCRs on CD4+ T-cells.
An overview of major machine learning tools for predicting HLA–peptide binding, sorted by category and method. The tools are divided into two categories, qualitative or quantitative, depending on their outputs. The underlying method, descriptors, performance, and URL were taken from the original papers. The supported number and class of HLAs and the corresponding peptide lengths were taken from either the original papers or the tools' websites. Some tools use an extra process to handle peptides of various lengths; it is listed in the “extra process” column.
| CATEGORY | NAME | METHOD | DESCRIPTOR | PERFORMANCE | HLA (CLASS) | PEPTIDE LENGTH (HLA CLASS) | EXTRA PROCESS | URL |
|---|---|---|---|---|---|---|---|---|
| Qualitative | ANNPred | ANN | Sparse encoding | Accuracy: 87.3%±5.9% | 30(I) | 9-mers(I) | N/A | |
| Qualitative | MULTIPRED | ANN/HMM/SVM | Sparse encoding | AUC >0.80 | 23(I), 6(II) | 9-mers(I), 9-mer cores(II) | N/A | |
| Qualitative | nHLAPred | ANN/PSSM | Sparse encoding | Accuracy: 93.6%±2.92% | 30(I) | 9-mers(I) | N/A | |
| Qualitative | Zhu et al. | Decision Tree | N/A | Accuracy: ~0.8 | 16(I) | 9-mers(I) | N/A | N/A |
| Qualitative | S-HMM | HMM | N/A | AUC: 0.85~0.89 | 1(II) | 9~25-mers(II) | N/A | N/A |
| Qualitative | ocHMM | HMM | Physicochemical property grouping | Accuracy: 0.35~0.99 | 2(I) | Various(I) | N/A | N/A |
| Qualitative | Salomon et al. | Kernel | BLOSUM62 | AUC: 0.82~0.96 | 25(II) | 9~33-mers(II) | N/A | N/A |
| Qualitative | KISS | SVM | Heckerman et al81# | AUC: 0.86~0.90 | 35(I) | 9-mers(I) | N/A | |
| Qualitative | MHC2PRED | SVM | Sparse encoding | Accuracy: ~80% | 42(II) | 9-mers or longer(II) | Matrix optimization techniques (MOTs) | |
| Qualitative | POPI | SVM | Physicochemical properties | Accuracy: ~60% | 23(I), 21(II) | 9-mers(I), 9-mer cores(II) | N/A | |
| Qualitative | SVMHC | SVM/PSSM | Sparse encoding | MCC: 0.85 | 32(I), 51(II) | 9-mers(I), 9-mer cores(II) | N/A | |
| Quantitative | NetMHC/NetMHCII | ANN | Sparse encoding/BLOSUM50 | AUC: 0.914(I), 0.787(II) | 78(I), 14(II) | 8~11-mers(I), various(II) | NN-align | |
| Quantitative | NetMHCpan/NetMHCIIpan | ANN | Sparse encoding/BLOSUM50 | Pearson: 0.77(I), AUC: 0.847(II) | 150(I), 35(II) | 8~14-mers(I), 9~19-mers(II) | Similar to NN-align | |
| Quantitative | IEDB | ANN/Consensus | N/A | AUC: 0.96(I), 0.76(II) | 50(I), 54(II) | Various(I/II) | N/A | |
| Quantitative | NetMHCcons | Consensus | N/A | Better than single methods | 101(I) | 8~15-mers(I) | N/A | |
| Quantitative | MHCMIR/MHC2MIR | MIL/MIR | BLOSUM62 | AUC: 0.73~0.89 | 26(II) | 9~25-mers(II) | N/A | |
| Quantitative | MHCPRED | QSAR regression | N/A | q | 11(I), 3(II) | 9-mers(I), 9-mer cores(II) | N/A | |
| Quantitative | SVRMHC | SVR | Sparse encoding/11 physicochemical properties | q | 36(I), 6(II) | 9-mer cores(I/II) | Iterative self-consistent (ISC) | |
Notes:
NetMHCpan/NetMHCIIpan/NetMHCcons can predict any HLA allele with a known sequence, thus the HLA number is unlimited.
The descriptors contain both the peptides and HLAs.
The BLOSUM62 matrix was used for distance calculation.
Abbreviation: N/A, not available/applicable.
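
Several of the tools listed above use sparse encoding as the peptide descriptor. The following is a minimal sketch of that idea, assuming the common convention in which each residue is represented by a 20-element binary vector with a single 1 at the position of its amino acid, so that a 9-mer becomes a 180-element vector. The function name `sparse_encode`, the alphabet ordering, and the example peptide are illustrative assumptions and are not taken from any of the listed tools.

```python
# Illustrative sketch only: the alphabet order is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def sparse_encode(peptide):
    """Return a flat one-hot (sparse) encoding of a peptide string.

    Each residue contributes 20 binary values, exactly one of which is 1.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    vector = []
    for residue in peptide.upper():
        if residue not in index:
            raise ValueError(f"Unsupported residue: {residue!r}")
        row = [0] * len(AMINO_ACIDS)
        row[index[residue]] = 1
        vector.extend(row)
    return vector


# Example peptide chosen for illustration (a 9-mer).
encoded = sparse_encode("GILGFVFTL")
print(len(encoded))  # 9 residues x 20 positions = 180
print(sum(encoded))  # one 1 per residue = 9
```

A vector of this form could then be passed to a classifier such as an ANN or SVM. Descriptors based on BLOSUM matrices would replace each one-hot row with the corresponding row of substitution scores.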