Elizabeth S Borden, Kenneth H Buetow, Melissa A Wilson, Karen Taraszka Hastings.
Abstract
Prioritization of immunogenic neoantigens is key to enhancing cancer immunotherapy through the development of personalized vaccines, adoptive T cell therapy, and the prediction of response to immune checkpoint inhibition. Neoantigens are tumor-specific proteins that allow the immune system to recognize and destroy a tumor. Cancer immunotherapies, such as personalized cancer vaccines, adoptive T cell therapy, and immune checkpoint inhibition, rely on an understanding of the patient-specific neoantigen profile in order to guide personalized therapeutic strategies. Genomic approaches to predicting and prioritizing immunogenic neoantigens are rapidly expanding, raising new opportunities to advance these tools and enhance their clinical relevance. Predicting neoantigens requires acquisition of high-quality samples and sequencing data, followed by variant calling and variant annotation. Subsequently, prioritizing which of these neoantigens may elicit a tumor-specific immune response requires application and integration of tools to predict the expression, processing, binding, and recognition potentials of the neoantigen. Finally, improvement of the computational tools is held in constant tension with the availability of datasets with validated immunogenic neoantigens. The goal of this review article is to summarize the current knowledge and limitations in neoantigen prediction, prioritization, and validation and propose future directions that will improve personalized cancer treatment.
Keywords: MHC class I; MHC class II; neoantigen prediction; neoantigen prioritization; neoantigens (neoAgs)
Year: 2022 PMID: 35311072 PMCID: PMC8929516 DOI: 10.3389/fonc.2022.836821
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Overview of neoantigen prediction, prioritization, and validation. Neoantigen prediction relies on sample acquisition, high-quality sequencing data, variant calling, and variant annotation. Neoantigen prioritization requires predicting some combination of the potential for the neoantigen to be expressed, processed, bound by MHC, and recognized by the T cell receptor (TCR). The development of neoantigen prioritization models relies on the availability of validated datasets of neoantigen immunogenicity. Figure created with BioRender.com.
Figure 2. Sample collection and sequencing considerations. Here we describe considerations for obtaining sequencing data for neoantigen prediction, including tissues needed, tissue collection method, and sequencing types. Figure created with BioRender.com.
Figure 3. Types of mutations that can lead to neoantigens. Single nucleotide variants (SNVs) are caused by a point mutation in a single nucleotide. Insertions and deletions (indels) are caused by the addition or loss of nucleotides. Indels with a frameshift occur when the number of inserted or deleted nucleotides is not a multiple of three, changing the reading frame. Gene fusions can be caused by either translocations at the DNA level or RNA splicing of independent transcripts. Figure created with BioRender.com.
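The frameshift rule in the Figure 3 caption can be illustrated with a minimal Python sketch. The function name `classify_variant` and the convention of passing reference and alternate allele strings are assumptions made for illustration; they are not taken from any of the tools reviewed here, and real annotators also account for transcript context.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a simple variant from its reference and alternate alleles.

    Hypothetical helper for illustration only: it looks at allele
    lengths and ignores transcript context, splice sites, and stop codons.
    """
    if len(ref) == len(alt):
        if ref == alt:
            return "no change"
        # Equal-length alleles are substitutions; one base is an SNV.
        return "SNV" if len(ref) == 1 else "multi-nucleotide substitution"

    # Net number of nucleotides gained (positive) or lost (negative).
    net_change = len(alt) - len(ref)
    kind = "insertion" if net_change > 0 else "deletion"

    # The reading frame is preserved only when the net change
    # is a multiple of three nucleotides.
    frame = "in-frame" if net_change % 3 == 0 else "frameshift"
    return f"{frame} {kind}"
```

For example, under these assumptions `classify_variant("A", "ATT")` is reported as a frameshift insertion because two nucleotides are added, whereas `classify_variant("ACGT", "A")` removes three nucleotides and is reported as an in-frame deletion.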
Comparison of the ranking of single nucleotide variant (SNV) callers across six benchmarking studies that have been released since 2017.
| Software | Supernat et al., 2018 | Bian et al., 2018 | Pei et al., 2020 | Kumaran et al., 2019 | Wang et al., 2020 | Wang et al., 2020 | Hofmann et al., 2017 |
|---|---|---|---|---|---|---|---|
|
| #1 | #1 | |||||
|
| #1 | #1 | #2² | #2 | #3⁷ | ||
|
| #1 | ||||||
|
| #1² | ||||||
|
| #2 | #1 | #2 | ||||
|
| #1 | #1 | |||||
|
| #1 | #2 | |||||
|
| #1 | ||||||
|
| #2 | #4 | |||||
|
| #3 | #2 | #2 | ||||
|
| #2 | ||||||
|
| #3² | ||||||
|
| #3 | #3 | |||||
|
| #3 | ||||||
|
| #3 | #4 | |||||
|
| #4 | #3 | #4⁸ | ||||
|
| #3 | ||||||
|
| #4 | ||||||
|
| #4³ | #4 | |||||
|
| #4³ | ||||||
|
| #4 |
Benchmarking papers from before 2017 were excluded as they typically compared outdated software versions or software that is no longer maintained. Numbers and colors indicate the relative ranking within each individual paper, with one (green) being the highest, followed by two (yellow) and three (orange), and four (red) being the lowest.
¹ These rankings are based on 30x data. In 15x data, the improved performance of DeepVariant was enhanced.
² At 20% purity.
³ Good performance at high purity, but poor performance for low purity samples.
⁴ Results based on DREAM WGS datasets as ground truth.
⁵ Results based on WES and deep sequencing spike-in studies.
⁶ Software not free.
⁷ High performance at low VAF, low performance at high VAF.
⁸ High sensitivity, but with very high false positive rate.
Comparison of the ranking of insertion and deletion (indel) callers across four benchmarking studies that have been released since 2017.
| Software | Supernat et al., 2018 | Pei et al., 2020¹ | Kumaran et al., 2019 | Wang et al., 2020 |
|---|---|---|---|---|
|
| #1 | #1 | ||
|
| #2 | #1 | #1 | |
|
| #1 | |||
|
| #1 | |||
|
| #2 | |||
|
| #2 | |||
|
| #2 | |||
|
| #2 | |||
|
| #3 | |||
|
| #3 | |||
|
| #3 | |||
|
| #3 | |||
|
| #3 | |||
|
| #3 |
Benchmarking papers from before 2017 were excluded as they typically compared outdated software versions or software that is no longer maintained. Numbers and colors indicate the relative ranking within each individual paper, with one (green) being the highest, followed by two (yellow), and three (orange) being the lowest.
¹ 40% purity.
Figure 4. Steps of MHC class I-restricted neoantigen prioritization and summary of characteristics considered for each step. Mutations in the DNA of a tumor cell are transcribed into RNA and translated into a protein. At the end of the life cycle of the protein, the protein is broken down into peptides by the proteasome and transported into the endoplasmic reticulum by the transporter associated with antigen processing (TAP). Once inside the endoplasmic reticulum, the peptide has the opportunity to be loaded on MHC class I. If the peptide is successfully bound to MHC class I, the peptide:MHC complex is transported to the cell surface, where it has the opportunity to be recognized by the T cell receptor (TCR). Characteristics of the neoantigen encompassing expression, processing, MHC class I binding, and TCR recognition potential have been assessed to enhance prioritization of MHC class I-restricted neoantigens and are summarized in each of the boxes in the figure.
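Because only short peptides are loaded on MHC class I, prioritization pipelines typically begin by enumerating every candidate peptide that contains the mutated residue. The sketch below illustrates that enumeration step. The function name `mutant_peptides`, the 0-based index convention, and the default peptide lengths of 8 to 11 residues are assumptions for illustration, not parameters taken from any specific tool summarized here.

```python
def mutant_peptides(protein: str, mut_index: int, lengths=(8, 9, 10, 11)):
    """Return every peptide of the requested lengths that contains
    the residue at mut_index (0-based) in the mutated protein sequence.

    Hypothetical helper for illustration; the default lengths are an
    assumed range commonly used for MHC class I candidates.
    """
    if not 0 <= mut_index < len(protein):
        raise ValueError("mut_index must fall inside the protein sequence")

    peptides = []
    for k in lengths:
        # Leftmost window that still covers the mutated residue.
        first_start = max(0, mut_index - k + 1)
        # Rightmost window: starts at the mutation, or earlier if the
        # protein ends before a full-length window would fit.
        last_start = min(mut_index, len(protein) - k)
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + k])
    return peptides
```

Each returned peptide would then be scored by the binding, processing, and recognition predictors described in the figure; that scoring is outside the scope of this sketch.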
Comparison of available neoantigen:MHC class I binding prediction tools.
| Software | Model Type | Data Type | Published Comparisons | Performance metrics |
|---|---|---|---|---|
| | Artificial neural network | Mass spectrometry eluted peptides and binding affinity measurements | Outperformed MHCflurry1.2 and MixMHCpred, outperformed NetMHC4.0 for HLA-B and -C | Immunogenicity predictions |
| | Artificial neural network | Mass spectrometry eluted peptides and binding affinity measurements | Outperformed NetMHCpan4.0 and MixMHCpred | Binding vs. non-binding predictions |
| | Artificial neural network | Binding affinity measurements | Comparable performance to MHCflurry1.2 and NetMHCpan3.0 | Binding vs. non-binding predictions |
| | Artificial neural network | Mass spectrometry eluted peptides and binding affinity measurements | Comparable performance to NetMHCpan4.0, outperformed MHCflurry1.2 and MixMHCpred | Binding vs. non-binding predictions |
| | Artificial neural network | Mass spectrometry eluted peptides | Outperformed original NetMHC, original NetMHCpan, and MixMHCpred | Binding vs. non-binding predictions |
| | Artificial neural network | Binding affinity measurements | Outperformed NetMHCpan4.0 | Correlation with validated binding affinity |
| | Artificial neural network | Binding affinity measurements | None provided | None provided |
| | Matrix approach | Mass spectrometry eluted peptides | Outperformed NetMHC3.0 and NetMHCpan3.0 | Binding vs. non-binding predictions |
| | Artificial neural network | Mass spectrometry eluted peptides | Outperformed NetMHC4.0 and NetMHCpan2.8 | Binding vs. non-binding predictions |
| | Artificial neural network | Binding affinity measurements | Outperformed Pickpocket, IEDB SMM, and original NetMHCpan model | Binding vs. non-binding predictions |
| | Binding models | Database of known binding peptides | None provided | None provided |
| | Matrix approach | Binding affinity measurements | Outperformed IEDB SMM, underperformed original NetMHC model | Binding vs. non-binding predictions |
| | Matrix approach | Binding affinity measurements | Underperformed original NetMHCpan model | Binding vs. non-binding predictions and correlation with validated binding affinity |
| | Positional scanning peptide libraries | Binding affinity measurements | Only compared to older models not included in this summary, performed better than 10/16 available methods | Binding vs. non-binding predictions |
| | Matrix approach | Binding affinity measurements | None provided | None provided |
| | Matrix approach | Binding affinity measurements | None provided | None provided |
| | Binding motifs | Binding affinity measurements | None provided | None provided |
Models were included if they met the following criteria: 1) released since 2012 or included in a benchmarking study since 2012, 2) published in a peer-reviewed journal, 3) available for web-based or command-line application, and 4) the most recent version of a given software. Published comparisons are based on the comparisons reported in the publication of the new model. Performance metrics summarize whether the published comparisons were based on the ability of the model to predict immunogenicity, to categorize each neoantigen as a binder vs. non-binder, or on the correlation between the predicted and experimentally validated binding affinity.
Summary of MHC class I-restricted neoantigen prioritization models.
| MuPeXI | Neoepitope novelty | Neopepsee | pTuneos | TESLA | NeoScore | |
|---|---|---|---|---|---|---|
|
| RNA | – | – | RNA | RNA | RNA |
|
| VAF¹ | – | – | PyClone | – | – |
|
| – | – | – | NetCTLpan | – | – |
|
| – | – | – | NetCTLpan | – | – |
|
| NetMHCpan | NetMHCpan | NetMHCpan | NetMHCpan | NetMHCpan | NetMHCpan |
|
| – | – | – | – | NetMHCstabpan | NetMHCstabpan |
|
| – | – | Chowell et al. | Trained neural network | – | – |
|
| – | – | Chowell et al. | – | – | – |
|
| – | – | Sequence similarity to known epitopes | Łuksza et al. | – | – |
|
| Number of mismatches | BLOSUM62 matrix | – | BLOSUM62 matrix | – | – |
|
| X | – | X | X | – | – |
|
| – | BLOSUM62 matrix | – | – | – | – |
|
| – | – | Saethang et al. | – | – | – |
|
| AUC = 0.635 in test set | AUC = 0.66 in training set | Not reported | AUC = 0.833 with 10-fold cross-validation | Cannot be calculated | AUC = 0.845 in test set |
Where applicable, the tool used for each characteristic is specified. Dashes indicate that the characteristic was not included in the model and “X” indicates that the characteristic was included in a model, but that the characteristic is a fixed quantity with no specific tools to report.
¹ VAF, variant allele frequency.
² TAP, transporter associated with antigen processing.
³ Kd, dissociation constant.
⁴ Amplitude, ratio of the dissociation constant of the wild-type peptide to that of the neoantigen.
⁵ AUC, area under the receiver operating characteristic curve.
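Two of the quantities defined in these footnotes are simple ratios, sketched below in Python. The function names and the choice of nanomolar units are assumptions for illustration. VAF is computed here as the fraction of sequencing reads supporting the variant allele, and the amplitude follows the footnote definition (wild-type Kd divided by neoantigen Kd).

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def amplitude(kd_wild_type_nm: float, kd_neoantigen_nm: float) -> float:
    """Ratio of the wild-type peptide Kd to the neoantigen Kd.

    A lower Kd means tighter binding, so a value above 1 indicates that
    the neoantigen is predicted to bind MHC more tightly than its
    wild-type counterpart. Units (nM here, by assumption) cancel out.
    """
    if kd_wild_type_nm <= 0 or kd_neoantigen_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_wild_type_nm / kd_neoantigen_nm
```

As a usage example with made-up numbers, 12 variant-supporting reads out of 48 give a VAF of 0.25, and a wild-type Kd of 500 nM against a neoantigen Kd of 50 nM gives an amplitude of 10.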
Figure 5. Steps of MHC class II-restricted neoantigen prioritization and summary of characteristics considered for each step. Mutations in the DNA of a tumor cell are transcribed into RNA and translated into a protein. The protein can either be taken up into the endocytic compartment of an antigen-presenting cell or processed and presented by the tumor cell if the tumor cell expresses MHC class II (not pictured). In the late endosomes, protein cleavage and MHC class II loading occur. The protein is cleaved by cathepsins at the N- and C-termini before and after binding to the MHC class II molecule. If the peptide is successfully bound to MHC class II, the peptide:MHC complex is transported to the cell surface, where it has the opportunity to be recognized by the T cell receptor (TCR). Characteristics of the neoantigen encompassing expression, processing, MHC class II binding, and TCR recognition potential that may enhance prioritization of MHC class II-restricted neoantigens are summarized in each of the boxes in the figure. * indicates characteristics that, to our knowledge, have not been assessed for the prioritization of MHC class II-restricted neoantigens.
Comparison of available neoantigen:MHC class II binding prediction tools.
| Software | Model Type | Data Type | Published Comparisons | Performance Metrics |
|---|---|---|---|---|
| | Artificial neural network | Binding affinity measurements | Comparable performance to NetMHCIIpan3.2, outperformed models from before 2012 | Binding vs. non-binding predictions and correlation with validated binding affinity |
| | Artificial neural network | Mass spectrometry eluted peptides and binding affinity measurements | Outperformed NetMHCIIpan3.2, MixMHC2Pred, MHCnuggets, and DeepSeqPanII | Immunogenicity predictions |
| | Artificial neural network | Binding affinity measurements | Comparable performance to NetMHCIIpan3.2 | Binding vs. non-binding predictions |
| | Matrix approach | Mass spectrometry eluted peptides | Outperformed NetMHCIIpan3.2 | Binding vs. non-binding predictions |
| | Artificial neural network | Mass spectrometry eluted peptides | Outperformed NetMHCpan3.1 | Binding vs. non-binding predictions |
Only five of the newest methods are summarized here because recent benchmarking demonstrated that these methods substantially outperformed earlier models. Only the most recent version of each software is included. Published comparisons are based on the comparisons reported in the publication of the new model. Performance metrics summarize whether the published comparisons were based on the ability of the model to predict immunogenicity, to categorize each neoantigen as a binder vs. non-binder, or on the correlation between the predicted and experimentally validated binding affinity.
Summary of MHC class II-restricted neoantigen prioritization models.
| MARIA | Abelin | Alspach | |
|---|---|---|---|
|
| RNA | RNA | – |
|
| Neural network for N-/C-terminal motifs | Neural network for N-/C-terminal motifs | – |
|
| Neural network for MHC class II binding scores | NeonMHC | Hidden Markov model |
| Overlap with known HLA-DQ peptides | |||
|
| – | Weighted for genes over-represented on MHC class II | – |
|
| AUC = 0.89 | AUC = 0.98 | AUC = 0.90 |
Where applicable, the tool used for each characteristic is specified. Dashes indicate that the characteristic was not included in the model.
¹ AUC stands for area under the receiver operating characteristic curve.
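The AUC values reported for these models can be read as the probability that a randomly chosen immunogenic neoantigen receives a higher score than a randomly chosen non-immunogenic one. The minimal Python sketch below computes AUC directly from that pairwise definition. The function name `roc_auc` and the 0/1 label encoding are assumptions for illustration, and the reviewed publications may have used other implementations.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 1 (immunogenic) or 0 (non-immunogenic).
    scores: model scores, where higher means more likely immunogenic.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The requirement for at least one positive and one negative example is also one reason an AUC may not be computable for some datasets.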
Available sets of MHC class I-restricted neoantigens validated to elicit a CD8+ T cell response.
| Author and Year | Tumor Type | Tested Neoantigens: Immunogenic Neoantigens | Available Sequencing Data | Mutations Tested | Prioritization Method | Validation Method |
|---|---|---|---|---|---|---|
| | Melanoma | 227:10 | WES | SNVs and small indels | NetMHCpan2.4 | ELISPOT |
| | Ovarian cancer | 114:1 | WES | SNVs | NetMHCpan2.4 | ELISPOT |
| | Chronic lymphocytic leukemia | 48:3 | WES | SNVs | NetMHCpan2.4 | ELISPOT |
| | Melanoma | 357:9 | WES, RNAseq | SNVs | Expression >1 FPKM and MHC binding by IEDB | ELISA |
| | Melanoma | 21:11 | WES, RNAseq | SNVs | NetMHC2.4 | ELISA |
| | Lung cancer | 355:2 | WES | SNVs | NetMHCpan2.8 | Multimers |
| | Melanoma | 57:11 | WES, RNAseq | SNVs | Expression >0 FPKM and NetMHC3.2, NetMHCpan2.0 | Multimers |
| | Lung cancer | 702:9 | WES | SNVs | NetMHCpan2.8 | Multimers |
| | Melanoma | 27:6 | WES, RNAseq | SNVs | VAF>10%, mutation in DNA and RNA | ELISPOT |
| | Melanoma | 165:18 | WES, RNAseq | SNVs and small indels | NetMHCpan2.4 and oncogene mutations | ELISPOT |
| | Melanoma and Lung cancer | 347:27 (available) | WES, RNAseq | SNVs and small indels | Consensus from 25 groups | Multimers |
Datasets were only included if they included a minimum of ten neoantigens.
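The "tested:immunogenic" entries in the table above can be converted into the fraction of tested candidates that were validated as immunogenic, which makes datasets of different sizes easier to compare. The helper below is a minimal sketch; the function name `immunogenic_fraction` is hypothetical, and it expects a plain `tested:immunogenic` string without annotations such as "(available)".

```python
def immunogenic_fraction(ratio: str) -> float:
    """Convert a 'tested:immunogenic' string into a validation rate.

    Hypothetical helper for illustration; annotations such as
    '(available)' must be removed before calling it.
    """
    tested_text, immunogenic_text = ratio.split(":")
    tested = int(tested_text)
    immunogenic = int(immunogenic_text)
    if tested <= 0 or not 0 <= immunogenic <= tested:
        raise ValueError(f"implausible counts in {ratio!r}")
    return immunogenic / tested


if __name__ == "__main__":
    # Ratios copied from three rows of the table above.
    for ratio in ("227:10", "114:1", "21:11"):
        print(f"{ratio}: {immunogenic_fraction(ratio):.1%} immunogenic")
```

For the first melanoma dataset, 10 of 227 tested candidates (about 4.4%) were immunogenic, which illustrates how sparse positive examples are in several of these datasets.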
Available sets of MHC class II-restricted neoantigens validated to elicit a CD4+ T cell response.
| Author and Year | Tumor Type | Neoantigens Tested: Immunogenic Neoantigens | Available Sequencing Data | Mutations Tested | Prioritization Method | Validation Method |
|---|---|---|---|---|---|---|
| | Melanoma | 165:80 | WES, RNAseq | SNVs and small indels | NetMHCpan2.4 and oncogene mutations | ELISPOT |
| | Melanoma | 125:60 | WES, RNAseq | SNVs | RNA expression > 10 RPKM and IEDB binding predictions | ELISPOT |
Datasets were only included if they included a minimum of ten neoantigens.
Figure 6. Summary of three commonly applied validation techniques for the immunogenicity of MHC class I or II-restricted neoantigens. Mass spectrometry is performed by eluting peptides directly from tumor cells and validates the in vivo presentation of the neoantigen on the cell surface. MHC multimers (most commonly a tetramer) bind T cell receptors (TCR) specific for the particular neoantigen:MHC, validating TCR recognition of the neoantigen and expansion of neoantigen-specific T cells. ELISA, ELISpot, and intracellular cytokine staining detect the production of cytokines, typically interferon-gamma (IFNγ), interleukin-2 (IL-2), or tumor necrosis factor alpha (TNFα), to validate T cell activation. Figure created with BioRender.com.