Jenna Cleyle1, Marie-Pierre Hardy2, Robin Minati1, Mathieu Courcelles2, Chantal Durette2, Joel Lanoix2, Jean-Philippe Laverdure2, Krystel Vincent2, Claude Perreault3, Pierre Thibault4.
Abstract
Colorectal cancer is the second leading cause of cancer death worldwide, and the incidence of this disease is expected to increase as global socioeconomic changes occur. Immune checkpoint inhibition therapy is effective in treating a minority of colorectal cancer tumors; however, microsatellite stable tumors do not respond well to this treatment. Emerging cancer immunotherapeutic strategies aim to activate a cytotoxic T cell response against tumor-specific antigens, presented exclusively at the cell surface of cancer cells. These antigens are rare and are most effectively identified with a mass spectrometry-based approach, which allows the direct sampling and sequencing of these peptides. Although the few tumor-specific antigens identified to date are derived from coding regions of the genome, recent findings indicate that a large proportion of tumor-specific antigens originate from allegedly noncoding regions. Here, we employed a novel proteogenomic approach to identify tumor antigens in a collection of colorectal cancer-derived cell lines and biopsy samples consisting of matched tumor and normal adjacent tissue. The generation of personalized cancer databases paired with mass spectrometry analyses permitted the identification of more than 30,000 unique MHC I-associated peptides. We identified 19 tumor-specific antigens in both microsatellite stable and unstable tumors, over two-thirds of which were derived from noncoding regions. Many of these peptides were derived from source genes known to be involved in colorectal cancer progression, suggesting that antigens from these genes could have therapeutic potential in a wide range of tumors. These findings could benefit the development of T cell-based vaccines, in which T cells are primed against these antigens to target and eradicate tumors. 
Such a vaccine could be used in tandem with existing immune checkpoint inhibition therapies to bridge the gap in treatment efficacy across subtypes of colorectal cancer with varying prognoses. Data are available via ProteomeXchange with identifier PXD028309.
Keywords: cancer immunotherapy; colorectal cancer; immunopeptidomics; mass spectrometry; tumor-specific antigen
Year: 2022 PMID: 35367648 PMCID: PMC9134101 DOI: 10.1016/j.mcpro.2022.100228
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 7.381
Description of CRC-derived cell lines
| Cell line | Tissue | Morphology | Disease | Biomarkers | MHC I molecules/cell | HLA genotyping | Mutations of interest |
|---|---|---|---|---|---|---|---|
| Colo205 | Colon; derived from metastatic site: ascites | Epithelial | Dukes' type D, colorectal adenocarcinoma | MSS, CIMP | 1.44 × 10⁵ ± 0.00282 × 10⁵ | HLA-A∗01:01, HLA-A∗02:01; HLA-B∗07:02, HLA-B∗08:01; HLA-C∗07:01, HLA-C∗07:02 | BRAF (V600E), SMAD4, TP53 |
| HCT116 | Colon | Epithelial | Colorectal carcinoma | MSI, CIMP | 5.07 × 10⁵ ± 0.30 × 10⁵ | HLA-A∗01:01, HLA-A∗02:01; HLA-B∗18:01, HLA-B∗45:01; HLA-C∗05:01, HLA-C∗07:01 | RAS (G13D), PI3CA, CDKN2A, CTNNB1 (β-catenin) |
| RKO | Colon | Epithelial | Carcinoma | MSI, CIMP | 2.82 × 10⁵ ± 0.11 × 10⁵ | HLA-A∗03:01; HLA-B∗18:01; HLA-C∗07:01 | BRAF (V600E), PI3CA |
| SW620 | Colon; derived from metastatic site: lymph node | Epithelial | Dukes' type C, colorectal adenocarcinoma | MSS, CIN | 1.69 × 10⁵ ± 0.0017 × 10⁵ | HLA-A∗02:01, HLA-A∗24:02; HLA-B∗07:02, HLA-B∗15:18; HLA-C∗07:02, HLA-C∗07:04 | APC, RAS (G12V), SMAD4, TP53 |
Description of primary tumor and matched NAT
| Sample ID | Sex | Age | Ethnic background | Matrix | Diagnosis | Histological diagnosis | Stage | Tumor content % | Mutations of interest | HLA |
|---|---|---|---|---|---|---|---|---|---|---|
| S1_N | F | 73 | Caucasian | Colon | NAT | | | | | HLA-A∗24:02; HLA-B∗07:02, HLA-B∗35:01; HLA-C∗04:01, HLA-C∗07:02 |
| S1_T | F | 73 | Caucasian | Cecum | Cancer | Adenocarcinoma | IIC | 100 | KRAS G12D | HLA-A∗24:02; HLA-B∗07:02, HLA-B∗35:01; HLA-C∗04:01, HLA-C∗07:02 |
| S2_N | M | 60 | Caucasian | Colon | NAT | | | | | HLA-A∗02:01, HLA-A∗03:02; HLA-B∗27:05, HLA-B∗58:01; HLA-C∗02:02, HLA-C∗07:01 |
| S2_T | M | 60 | Caucasian | Sigmoid | Cancer | Adenocarcinoma | IIA | 95 | | HLA-A∗02:01, HLA-A∗03:02; HLA-B∗27:05, HLA-B∗58:01; HLA-C∗02:02, HLA-C∗07:01 |
| S3_N | F | 63 | Caucasian | Colon | NAT | | | | | HLA-A∗01:01, HLA-A∗32:01; HLA-B∗38:01, HLA-B∗50:01; HLA-C∗06:02, HLA-C∗12:03 |
| S3_T | F | 63 | Caucasian | Sigmoid | Cancer | Adenocarcinoma | IIA | 100 | KRAS Q61H | HLA-A∗01:01, HLA-A∗32:01; HLA-B∗38:01, HLA-B∗50:01; HLA-C∗06:02, HLA-C∗12:03 |
| S4_N | F | 85 | Caucasian | Colon | NAT | | | | | HLA-A∗01:01, HLA-A∗11:01; HLA-B∗15:01, HLA-B∗57:01; HLA-C∗03:03, HLA-C∗06:02 |
| S4_T | F | 85 | Caucasian | Sigmoid | Cancer | Adenocarcinoma | IIA | 100 | KRAS G12D | HLA-A∗01:01, HLA-A∗11:01; HLA-B∗15:01, HLA-B∗57:01; HLA-C∗03:03, HLA-C∗06:02 |
| S5_N | F | 43 | Caucasian | Colon | NAT | | | | | HLA-A∗03:01, HLA-A∗30:01; HLA-B∗13:02, HLA-B∗52:01; HLA-C∗06:02, HLA-C∗12:02 |
| S5_T | F | 43 | Caucasian | Ascending colon | Cancer | Adenocarcinoma | IIA | 95 | | HLA-A∗03:01, HLA-A∗30:01; HLA-B∗13:02, HLA-B∗52:01; HLA-C∗06:02, HLA-C∗12:02 |
| S6_N | F | 48 | Caucasian | Colon | NAT | | | | | HLA-A∗03:01, HLA-A∗23:01; HLA-B∗07:02, HLA-B∗18:01; HLA-C∗07:01, HLA-C∗07:02 |
| S6_T | F | 48 | Caucasian | Sigmoid | Cancer | Adenocarcinoma | IIA | 95 | | HLA-A∗03:01, HLA-A∗23:01; HLA-B∗07:02, HLA-B∗18:01; HLA-C∗07:01, HLA-C∗07:02 |
Fig. 1. Proteogenomic workflow for the discovery of tumor-specific antigens (TSAs) in both colorectal cancer–derived cell lines and primary tumor samples. Samples generated from colorectal cancer– and normal intestine-derived cell lines and matching primary tumor/normal adjacent tissue biopsies obtained from six individuals were all processed for both RNA sequencing and major histocompatibility complex class I (MHC-I) immunoprecipitation. RNA sequencing data were used for both the transcriptomic characterization of the samples and the generation of customized global cancer proteome databases. For each sample, the MHC-I associated peptides (MAPs) isolated via immunoprecipitation were identified via LC-MS/MS using the respective database. After validating both the identification and the tumor specificity of our TSA candidates, their therapeutic potentials were evaluated through the prediction of both their immunogenicity and intertumoral distribution. Created with BioRender.com.
Fig. 2. Transcriptomic profile of primary tumor/normal adjacent tissue CRC biopsies. A, principal component analysis (PCA) of the top 500 varying genes of each tumor/NAT sample following paired-end RNA-Seq and gene read count normalization with DESeq2. MSI tissues (as determined by MSISensor) are encircled. B, GO term analysis of genes up/downregulated in CRC tissues compared with their adjacent NAT. Genes submitted to GO term analysis were those with |log2FC| > 1 that were found to be differentially regulated in all samples, using TPM-normalized values. C, bar graph showing the mean ESTIMATE immune score of MSS NAT, MSI NAT, MSS CRC, and MSI CRC, with standard deviation shown. D, stacked bar graph showing the mean proportion of the transcriptome attributable to five distinct transcript biotypes in NAT versus CRC samples, with the differences in the proportion of noncoding transcripts being statistically significant between NAT and CRC (noncoding: p = 0.016; coding: p = 0.078; SINE: p = 0.15; LTR: p = 0.056; LINE: p = 0.95). E, scatterplots displaying the SNV counts and INDEL counts of MSS and MSI CRC tissues determined by SNPEff genomic annotation, with mean and standard error bars. CRC, colorectal cancer; MSI, microsatellite instability; MSS, microsatellite stable; NAT, normal adjacent tissue; TPM, transcripts per million.
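The gene selection rule described for Fig. 2B (|log2FC| > 1 in every tumor/NAT pair, on TPM values) can be illustrated with a minimal sketch. The pseudocount of 1, the function names, and the example gene labels are assumptions made for illustration; the caption does not specify how zero TPM values were handled or how the authors implemented the filter.

```python
import math

def log2_fold_change(tumor_tpm, nat_tpm, pseudocount=1.0):
    """log2 ratio of tumor to NAT expression.

    The pseudocount is an assumption to avoid division by zero;
    it is not stated in the source.
    """
    return math.log2((tumor_tpm + pseudocount) / (nat_tpm + pseudocount))

def consistently_regulated(tpm_by_gene, threshold=1.0):
    """Split genes into those up- or downregulated in ALL tumor/NAT pairs.

    tpm_by_gene maps a gene label to a list of (tumor_tpm, nat_tpm)
    tuples, one tuple per patient.
    """
    up, down = [], []
    for gene, pairs in tpm_by_gene.items():
        lfcs = [log2_fold_change(t, n) for t, n in pairs]
        if all(lfc > threshold for lfc in lfcs):
            up.append(gene)
        elif all(lfc < -threshold for lfc in lfcs):
            down.append(gene)
    return up, down

# Hypothetical TPM values for two patients (not data from the study).
example = {
    "GENE_A": [(30.0, 5.0), (50.0, 10.0)],   # higher in tumor in both pairs
    "GENE_B": [(2.0, 20.0), (1.0, 15.0)],    # lower in tumor in both pairs
    "GENE_C": [(10.0, 9.0), (40.0, 5.0)],    # passes in only one pair
}
up_genes, down_genes = consistently_regulated(example)
```

Requiring the threshold in every pair, rather than on the cohort mean, keeps a single strongly divergent patient from driving the gene list.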
Fig. 3. Immunopeptidomics of CRC-derived cell lines and tissues. A, top panel: Stacked bar chart displaying the number of unique peptides identified in CRC cell lines, and a horizontal line indicating the average number of MAPs per cell line. Bottom panel: Scatterplot indicating the correlation between the number of unique MAPs identified in each cell line and the presentation of MHC I at the cell surface (Pearson’s r = 0.96). B, stacked bar chart displaying the number of unique peptides identified in primary tissue samples, and a horizontal line indicating the average number of MAPs per tissue sample. “All peptides” in (A and B) indicates the number of peptides identified with a 5% FDR, whereas “MHC I peptides” indicates the number of peptides identified with the corresponding peptide score, 8–11 amino acids in length, and a rank eluted ligand threshold ≤2% using NetMHCpan-4.1b predictions. C, bar chart indicating the proportion of unique MAPs predicted to bind to a given HLA allele in each sample, using NetMHCpan-4.1b predicted affinity. D, GO term analysis of MAP source genes for CRC-derived cell lines and primary tissues. For tissues, only source genes shared by four or more tissues were included in this analysis. E, left panel: Stacked bar chart displaying the proportion of MAPs in each tissue sample derived from protein-coding, hypervariable gene (immunoglobulin or T cell receptor), or noncoding transcripts, or those from unannotated transcripts. Right panel: stacked bar chart displaying the proportion of noncoding MAPs derived from processed transcripts, retained introns, nonstop decay products, nonsense-mediated decay products, lncRNA, or those that have no annotated transcript. CRC, colorectal cancer; GO, gene ontology; MAP, MHC I–associated peptide.
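The "MHC I peptides" criteria in the Fig. 3 caption (length of 8–11 amino acids and an eluted ligand rank ≤2%) amount to a simple filter over peptide calls. The sketch below is illustrative only: the `PeptideCall` record, its field names, and the example peptides are hypothetical, and the peptide score cutoff mentioned in the caption is omitted because its value is not given here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PeptideCall:
    """One peptide identification with its best predicted HLA binder.

    Field names are hypothetical; el_rank_percent stands in for the
    eluted ligand %rank reported by a binding predictor.
    """
    sequence: str
    best_allele: str
    el_rank_percent: float

def is_map(call, min_len=8, max_len=11, max_rank=2.0):
    """Apply the length and %rank criteria stated in the caption."""
    return (min_len <= len(call.sequence) <= max_len
            and call.el_rank_percent <= max_rank)

def unique_maps(calls):
    """Return unique peptide sequences passing the filter, in input order."""
    seen = set()
    kept = []
    for call in calls:
        if is_map(call) and call.sequence not in seen:
            seen.add(call.sequence)
            kept.append(call.sequence)
    return kept

# Placeholder peptides and ranks, not data from the study.
calls = [
    PeptideCall("AAAAAAAAK", "HLA-A*03:01", 0.5),    # 9-mer, passes
    PeptideCall("AAAAAAAAK", "HLA-A*11:01", 1.0),    # duplicate sequence
    PeptideCall("AAAAAAK", "HLA-A*03:01", 0.1),      # 7-mer, too short
    PeptideCall("AAAAAAAAAAL", "HLA-A*02:01", 3.5),  # rank above 2%
    PeptideCall("AAAAAAAAAL", "HLA-A*02:01", 2.0),   # 10-mer at the cutoff
]
maps = unique_maps(calls)
```

Deduplicating on sequence mirrors the caption's counting of unique peptides; a peptide predicted to bind several alleles is counted once.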
Fig. 4. Novel TSAs identified in colorectal cancer derive primarily from noncoding regions, whereas the majority of TAAs derive from exons. A, bar chart displaying the number of TSAs identified per sample. B, stacked pie chart identifying the genomic origin of TSAs in the inner pie, as well as what proportion of TSAs are mutated in the middle pie. The outer pie demonstrates what proportion of TSAs are from coding or noncoding sequences. C, bar chart displaying the number of TAAs identified per sample. D, stacked pie chart identifying the genomic origin of TAAs in the inner pie, and what proportion of TAAs are canonical or noncanonical in the middle pie. The outer pie displays what proportion of TAAs are from coding or noncoding sequences. E, heatmap displaying the presence or absence of putative TSAs and TAAs in two previous publications on CRC immunopeptidomics (Löffler et al. 2018 and Newey et al. 2019), as well as IEDB and HLA Ligand Atlas (all tissues, and only colon tissue). MSI, microsatellite instability; MSS, microsatellite stable; TAA, tumor-associated antigen; TSA, tumor-specific antigen.
Biological relevance of TSA source genes in CRC
| Source gene | Reference | Biological relevance in CRC |
|---|---|---|
| COL11A1—Collagen type XI alpha 1 | PMID: | Upregulated in CRC (mRNA), marker of poor prognosis, role in CRC development |
| CYP39A1—cytochrome P450, family 39, subfamily A, polypeptide 1 | PMID: | Expression is increased in CRC with poor prognosis |
| DPH6—Diphthamine biosynthesis 6 | | No known association |
| GRIN2B—Glutamate ionotropic receptor NMDA type subunit 2B | PMID: | Identified as nondriver hub gene involved in progression to stage II CRC |
| HKDC1—Hexokinase domain-containing protein 1 | PMID: | HKDC1 contributes to increased metabolism, proliferation, and metastasis of CRC cells |
| HSPD1—Heat shock protein family D (Hsp60) member 1 | PMID: | Differentially expressed in CRC, potential biomarker for diagnosis; exosomal HSPD1 identified as putative diagnostic and prognostic biomarker in CRC |
| IPP (KLHL27)—Intracisternal A particle-promoted polypeptide | Human Protein Atlas (PMID: | Favorable prognostic marker in colorectal cancer, unfavorable in renal and liver cancers |
| LY6G6F-LY6G6D readthrough—Lymphocyte antigen 6 family member G6F and G6D | PMID: | LY6G6D/F overexpressed in CRC, potential cell surface marker |
| NKD1—Naked cuticle homolog 1 | PMID: | Negative feedback regulator of Wnt pathway, intestinal tumor marker in mice; mutations in NKD1 alter Wnt signaling |
| PATJ—PALS1-associated tight junction protein | | No known association |
| PLK1—Serine/threonine-protein kinase PLK1/polo-like kinase 1 | PMID: | Overexpressed in CRC, associated with metastasis and invasion |
| SUCNR1—Succinate receptor 1 | PMID: | SUCNR1 activation induces Wnt ligand expression and activates WNT signaling and EMT in a CRC-derived cell line |
| TRPC6—Transient receptor potential cation channel subfamily C member 6 | PMID: | mRNA expression of TRPC6 lower in CRC than in normal tissue, may contribute to tumorigenesis |
Biological relevance of TAA source genes in CRC
| Source gene | Reference | Biological relevance in CRC |
|---|---|---|
| BUB1—Mitotic spindle checkpoint kinase | PMID: | Mutations in BUB1 linked to early onset CRC; inactivation may drive metastasis and progression in CRC |
| CDCA8—Cell division cycle associated 8 | PMID: | Overexpressed in CRC, associated with cancer progression |
| CENPE—Centromere-associated protein E | | No known association |
| DIAPH3—Diaphanous related formin 3 | Human Protein Atlas (PMID: | DIAPH3 is prognostic, high expression is favorable in colorectal cancer |
| H1-5—H1.5 linker histone, cluster member | PMID: | Frequently mutated in CRC |
| IDO2—Indoleamine 2,3-dioxygenase 2 | PMID: | Upregulated expression in CRC |
| MACC1—Metastasis-associated in colon cancer 1 | PMID: | Promotes growth and metastasis of colorectal cancer; associated with carcinogenesis through β-catenin signaling and EMT |
| MCM10—Minichromosome maintenance 10 replication initiation factor | PMID: | Decreased mRNA expression in colon and rectal adenocarcinoma samples compared with normal tissues |
| MGAM2—Maltase glucoamylase 2 | PMID: | Expressed in GI cancers (TCGA data) |
| NOS2—Nitric oxide synthase 2 | Human Protein Atlas (PMID: | Cancer enhanced (colorectal cancer); RNA data |
| ZNF215—Zinc finger protein 215 | Human Protein Atlas (PMID: | Cytoplasmic expression in subsets of immune cells, most abundant in gastrointestinal tract and lymphoid tissues (protein data) |
Bold, validated.
Fig. 5. RNA expression profiles of putative TSAs and TAAs. A, scatter plots displaying the log2FC of transcripts, in TPM, in CRC compared with the matched NAT on the y-axis and the mean expression in a given tissue sample (mean of CRC and NAT) on the x-axis. Highlighted points indicate the source transcripts of putative TAAs and TSAs. Both S4 and S5 plots have a canonical TAA point that is not visible, as it overlaps with another canonical TAA source transcript. B, heatmap of mean RNA expression in log(rphm+1) of aeTSA coding sequences and TAA coding sequences (divided as canonical TAAs [canTAA] and noncanonical TAAs [non-canTAA]) in normal tissues from the Genotype Tissue Expression (GTEx) Portal and in pooled thymic epithelial cell samples. MHClow tissues include those from brain, nerve, and testis, which have been shown to express low levels of MHC I. A black outline indicates a mean RNA expression >8.55 rphm. aeTSA, aberrantly expressed tumor-specific antigen; CRC, colorectal cancer; NAT, normal adjacent tissue; TAA, tumor-associated antigen; TSA, tumor-specific antigen.
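The two quantities in the Fig. 5B caption, the log(rphm+1) heatmap value and the >8.55 rphm outline, can be computed per tissue as in the sketch below. Two points are assumptions rather than statements from the source: the logarithm is taken as base 10, and the transform is applied to the mean of the samples rather than averaged after transformation. The function name and input values are hypothetical.

```python
import math
from statistics import mean

# Outline threshold quoted in the caption, on the untransformed rphm scale.
RPHM_OUTLINE_THRESHOLD = 8.55

def summarize_tissue_expression(rphm_values):
    """Summarize one antigen-coding sequence in one normal tissue.

    Assumptions (not confirmed by the source): log base 10, and the
    transform is applied to the mean rphm across samples.
    """
    mean_rphm = mean(rphm_values)
    return {
        "mean_rphm": mean_rphm,
        "log_rphm_plus_1": math.log10(mean_rphm + 1),
        "outlined": mean_rphm > RPHM_OUTLINE_THRESHOLD,
    }

# Hypothetical per-sample rphm values for two tissues.
high = summarize_tissue_expression([0.0, 9.0, 18.0])
low = summarize_tissue_expression([1.0, 2.0])
```

Comparing against the threshold on the raw rphm scale sidesteps the question of which logarithm base the heatmap uses.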
Relative quantification ratios of validated tumor antigens in CRC
| Sequence | Nature of antigen | Sample | Endogenous sample ratio | Mean intensity | SPS-MS3 ratio (127N/126) | Synthetic calibration curve R² |
|---|---|---|---|---|---|---|
| RMLLSHTGK | aeTSA | RKO | N.D. | N.D. | N.D. | N.D. |
| LPHRALSGI | aeTSA | S1 | −0.364 | N.D. | N.D. | N.D. |
| GTNPTAAVK | aeTSA | S2 | 2.095 | 7238.425242 | 12.174 | 1.000 |
| LRHKLVLNR | aeTSA | S2 | 0.307 | N.D. | N.D. | N.D. |
| RIGGVGVEK | aeTSA | S2 | 1.965 | 29256.45 | 6.740 | 1.000 |
| SIIETVNSL | aeTSA | S2 | 0.288 | N.D. | N.D. | N.D. |
| TVNTQQYNTK | aeTSA | S2 | −0.021 | N.D. | N.D. | N.D. |
| SVSHLHIFF | aeTSA | S3 | −1.100 | N.D. | N.D. | N.D. |
| TTLENLPQK | aeTSA | S4 | 0.134 | 3140.8875 | 3.783 | 0.999 |
| AQKLQVRI | aeTSA | S5 | 0.793 | N.D. | N.D. | N.D. |
| GQIELSIYR | aeTSA | S5 | 0.328 | N.D. | N.D. | N.D. |
| HGALSIRSI | aeTSA | S5 | 0.777 | N.D. | N.D. | N.D. |
| RLMKFLPV | aeTSA | S5 | 0.171 | N.D. | N.D. | N.D. |
| SLYISEERK | aeTSA | S5 | 0.046 | N.D. | N.D. | N.D. |
| VQTAVLNV | aeTSA | S5 | 1.089 | N.D. | N.D. | N.D. |
| VEAPHLPSF | aeTSA | S6 | 1.059 | 43782.84192 | 41.318 | 1.000 |
| RNRQVATAL | aeTSA | S6 | 1.090 | 12174.6625 | 5.722 | 1.000 |
| RNRQVATAL | Not assigned | S1 | 0.890 | 15514.2375 | 3.507 | 1.000 |
| KIGEVIVTK | mTSA | S2 | 2.506 | 70659.6 | 13.637 | 1.000 |
| TRSTIILHL | mTSA | S3 | 1.381 | 34365.32187 | 48.807 | 0.997 |
| VLYRSVLLLK | Noncanonical TAA | S6 | 0.997 | N.D. | N.D. | N.D. |
| TYKYVDINTF | Canonical TAA | S1 | 1.969 | 29834.36875 | 8.226 | 0.998 |
| RYLEKFYGL | Canonical TAA | S1 | 2.840 | 27614.24286 | 7.661 | 0.997 |
| RYLEKFYGL | Canonical TAA | S6 | 2.970 | 106928.2875 | 16.090 | 0.999 |
| KSINEFWNK | Canonical TAA | S2 | 2.212 | 56110.11667 | 5.238 | 0.999 |
| RIQLPVVSK | Canonical TAA | S4 | 1.083 | 7612.378571 | 2.073 | 0.999 |
| QMAGLRDTY | Canonical TAA | S3 | 1.140 | 36090.60294 | 2.884 | 0.999 |
| AQYDQASTKY | Canonical TAA | S4 | 1.452 | N.D. | N.D. | N.D. |
| FVDNQYWRY | Canonical TAA | S4 | 0.721 | 5853.986533 | 10.954 | 1.000 |
| SANVSKVSF | Canonical TAA | S5 | 1.114 | 12780.925 | 2.321 | 0.999 |
N.D.: not detected.
Endogenous sample ratio: 127N/126 ratio in endogenous samples.
Fig. 6. Validation of TSAs and TAAs. A, heatmap displaying mean RNA expression in log(rphm+1) of TSAs and TAAs in 151 TCGA COAD samples. The proportion of TCGA COAD samples expressing the TSA and TAA sequences at least 10-fold higher than the log-transformed (log(rphm+1)) mean expression of pooled GTEx and mTEC samples is displayed on the left. B, rEpitope immunogenicity scores of various groupings of validated TSAs and TAAs compared with presumably nonimmunogenic thymic peptides reported in Adamopoulou et al. 2013. The rEpitope-suggested threshold of immunogenicity for MHC I peptides (0.36) is indicated by the dashed line. C, predicted prevalence of tumor antigen-binding MHC class I alleles in the US population (IEDB). COAD, colon adenocarcinoma; GTEx, Genotype Tissue Expression project; mTEC, medullary thymic epithelial cell; TAA, tumor-associated antigen; TSA, tumor-specific antigen.