Linus Backert, Oliver Kohlbacher.
Abstract
Immunoinformatics involves the application of computational methods to immunological problems. Prediction of B- and T-cell epitopes has long been the focus of immunoinformatics, given the potential translational implications, and many tools have been developed. With the advent of next-generation sequencing (NGS) methods, an unprecedented wealth of information has become available that requires more advanced immunoinformatics tools. Based on information from whole-genome sequencing, exome sequencing and RNA sequencing, it is possible to characterize with high accuracy an individual's human leukocyte antigen (HLA) allotype (i.e., the individual set of HLA alleles of the patient), as well as changes arising in the HLA ligandome (the collection of peptides presented by the HLA) owing to genomic variation. This has allowed new opportunities for translational applications of epitope prediction, such as epitope-based design of prophylactic and therapeutic vaccines, and personalized cancer immunotherapies. Here, we review a wide range of immunoinformatics tools, with a focus on B- and T-cell epitope prediction. We also highlight fundamental differences in the underlying algorithms and discuss the various metrics employed to assess prediction quality, comparing their strengths and weaknesses. Finally, we discuss the new challenges and opportunities presented by high-throughput data-sets for the field of epitope prediction.
Year: 2015 PMID: 26589500 PMCID: PMC4654883 DOI: 10.1186/s13073-015-0245-0
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Fig. 1 Generating predictions from data. a Evaluation of the predictor using cross-validation: first, the data-set is split into k folds (here k = 5). Next, five predictors are each trained on four folds and validated on the remaining fold. Evaluation can use, for example, receiver operating characteristic (ROC) curve analysis. Finally, an average ROC curve is generated. b Training of the final predictor: after evaluation, the final predictor is trained on the complete data-set.
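The two panels of Fig. 1 can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the one-dimensional synthetic data, the toy `train_threshold_model` predictor, and the use of stratified folds. It also summarizes each fold's ROC curve by its area under the curve (AUC) and averages those values, which is a simplification of the averaged ROC curve described in the caption.

```python
import random


def stratified_folds(ys, k, seed=0):
    """Split sample indices into k folds, keeping both classes in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, y in enumerate(ys) if y == label]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def train_threshold_model(xs, ys):
    """Toy predictor (hypothetical): signed offset from the midpoint of the class means."""
    mean_pos = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    mean_neg = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    midpoint = (mean_pos + mean_neg) / 2
    sign = 1.0 if mean_pos >= mean_neg else -1.0
    return lambda x: sign * (x - midpoint)


def cross_validate(xs, ys, k=5, seed=0):
    """Panel a: train k predictors on k-1 folds each and return the k held-out AUCs."""
    aucs = []
    for held_out in stratified_folds(ys, k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = train_threshold_model([xs[i] for i in train], [ys[i] for i in train])
        aucs.append(roc_auc([ys[i] for i in held_out],
                            [model(xs[i]) for i in held_out]))
    return aucs


if __name__ == "__main__":
    # Synthetic data: 50 "binders" and 50 "non-binders" with overlapping feature values.
    rng = random.Random(42)
    xs = [rng.gauss(1.0, 1.0) for _ in range(50)] + [rng.gauss(-1.0, 1.0) for _ in range(50)]
    ys = [1] * 50 + [0] * 50

    fold_aucs = cross_validate(xs, ys, k=5)
    print("mean cross-validated AUC:", sum(fold_aucs) / len(fold_aucs))

    # Panel b: after evaluation, the final predictor is trained on the complete data-set.
    final_model = train_threshold_model(xs, ys)
```

The cross-validated AUC is an estimate of how the final predictor might perform on unseen data; the five fold-specific predictors themselves are discarded once the evaluation is done.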
Examples of databases offering immunological data
| Database | Content | Reference |
|---|---|---|
| SYFPEITHI | MHC ligands, T-cell epitopes | [ |
| IEDB | Epitopes, epitope–MHC/BCR complexes | [ |
| IMGT | Antibodies, T-cell receptors | [ |
| IMGT/HLA | HLA alleles | [ |
| MHCBN 4.0 | MHC peptides, TAP-interacting peptides | [ |
| AntiJen | MHC ligands, TCR–MHC complexes, T-cell epitopes, TAP, B-cell epitopes, protein–protein interactions | [ |
| Dana-Farber Repository | MHC ligands for machine learning | [ |
Abbreviations: BCR B-cell receptor, HLA human leukocyte antigen, IEDB Immune Epitope Database, IMGT International ImMunoGeneTics information system, MHC major histocompatibility complex, MHCBN MHC binding and non-binding, TAP transporter associated with antigen processing, TCR T-cell receptor
Methods for analyzing steps in the antigen-processing pathway and for HLA typing
| Predictor/tool | Key method | Reference |
|---|---|---|
| HLA class I binding | ||
| Allele-specific | ||
| SYFPEITHI | PSSM | [ |
| RANKPEP | PSSM | [ |
| BIMAS | PSSM | [ |
| SVMHC | SVM | [ |
| netMHC | ANN | [ |
| Pan-specific | ||
| MULTIPRED | HMM/ANN | [ |
| netMHCpan | ANN | [ |
| PickPocket | PSSM | [ |
| TEPITOPEpan | PSSM | [ |
| ADT | Threading | [ |
| UniTope | SVM | [ |
| KISS | SVM | [ |
| HLA class II binding | ||
| Allele-specific | ||
| SYFPEITHI | PSSM | [ |
| netMHCII/SM-align | PSSM/ANN | [ |
| ProPred | PSSM | [ |
| RANKPEP | PSSM | [ |
| TEPITOPE | PSSM | [ |
| SVRMHC | SVM | [ |
| MHC2MIL | Multi-instance learning | [ |
| MHC2pred | SVM | – |
| Pan-specific | ||
| MULTIPRED | HMM/ANN | [ |
| MHCIIMulti | Multi-instance learning | [ |
| TEPITOPEpan | PSSM | [ |
| netMHCIIpan | ANN | [ |
| Consensus methods | ||
| CONSENSUS | – | [ |
| netMHCcon | – | [ |
| Binding stability | ||
| netMHCstab | ANN | [ |
| Proteasomal cleavage | ||
| | ||
| netChop 20S | ANN | [ |
| PCM | PSSM | [ |
| FragPredict | PSSM | [ |
| Pcleavage | SVM | [ |
| PAProC | ANN | [ |
| | ||
| netChop Cterm | ANN | [ |
| ProteaSMM | PSSM | [ |
| TAP transport | ||
| PredTAP | HMM/ANN | [ |
| SVMTAP | SVM | [ |
| Integrated processing | ||
| EpiJen | – | [ |
| WAPP | – | [ |
| NetCTL | – | [ |
| NetCTLpan | – | [ |
| T-cell reactivity | ||
| POPI | SVM | [ |
| POPISK | SVM | [ |
| B-cell epitope prediction | ||
| Continuous | ||
| COBEpro | SVM | [ |
| BCPRed | SVM | [ |
| FBCPred | SVM | [ |
| Discontinuous | ||
| EPMeta | SVM | [ |
| Discotope 2.0 | Linear regression | [ |
| NGS-based HLA typing | ||
| ATHLATES | Contig assembly | [ |
| seq2HLA | Greedy algorithm | [ |
| OptiType | Integer linear programming | [ |
| Polysolver | Bayesian classification | [ |
Abbreviations: ANN artificial neural network, HLA human leukocyte antigen, HMM hidden Markov model, NGS next-generation sequencing, PSSM position-specific scoring matrix, SVM support vector machine, TAP transporter associated with antigen processing
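Several tools in the table above list a position-specific scoring matrix (PSSM) as their key method. As a rough illustration of the general idea, the sketch below builds a log-odds PSSM from aligned peptides of equal length and scores a candidate by summing one matrix entry per position. It does not reproduce the scoring scheme of SYFPEITHI, BIMAS, RANKPEP or any other listed tool; the training peptides are made up, and the pseudocount and the uniform amino-acid background are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(binders, pseudocount=1.0):
    """Build a log2-odds matrix (one dict per position) against a uniform background."""
    length = len(binders[0])
    if any(len(peptide) != length for peptide in binders):
        raise ValueError("all training peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(binders) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for position in range(length):
        counts = Counter(peptide[position] for peptide in binders)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_peptide(pssm, peptide):
    """Sum the position-specific log-odds scores of a peptide of matching length."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the matrix")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


if __name__ == "__main__":
    # Hypothetical 9-mers sharing L at position 2 and mostly V at position 9.
    toy_binders = ["ALAAAAAAV", "GLCDEFGHV", "KLMNPQRSV", "TLWYACDEL"]
    pssm = build_pssm(toy_binders)
    for candidate in ("QLQQQQQQV", "QDQQQQQQP"):
        print(candidate, round(score_peptide(pssm, candidate), 2))
```

A matrix of this kind treats positions as independent, which is one reason the table also lists methods such as ANNs and SVMs that can model dependencies between positions.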
Fig. 2 Antigen processing pathways. Top, HLA class I pathway: the endogenous antigen is cleaved by the proteasome into peptides. These peptides are transported into the endoplasmic reticulum (ER) by TAP and become bound to HLA class I. The HLA–ligand complex is transported in a vesicle to the cell surface and can be recognized by the TCR on CD8+ T cells. Bottom, HLA class II pathway: the exogenous antigen is taken up into the cell, digested into peptides and bound to HLA class II in an endosome. The HLA–ligand complex is transported in a vesicle to the cell surface and can be recognized by the TCR on CD4+ T cells. HLA human leukocyte antigen, TAP transporter associated with antigen processing, TCR T-cell receptor