Ana Marcu1,2, Leon Bichmann3,4, Leon Kuchenbecker4, Daniel Johannes Kowalewski3, Lena Katharina Freudenmann3,2,5, Linus Backert3,4, Lena Mühlenbruch3,2,5, András Szolek4, Maren Lübke3,2, Philipp Wagner3,6, Tobias Engler6, Sabine Matovina6, Jian Wang7, Mathias Hauri-Hohl8, Roland Martin7, Konstantina Kapolou9, Juliane Sarah Walz3,2,10,11, Julia Velz9, Holger Moch12, Luca Regli9, Manuela Silginer13, Michael Weller13, Markus W Löffler3,2,5,14,15, Florian Erhard16, Andreas Schlosser17, Oliver Kohlbacher4,5,18,19,20,21,22, Stefan Stevanović3,2,5, Hans-Georg Rammensee3,2,5, Marian Christoph Neidert9,23,24.
Abstract
BACKGROUND: The human leucocyte antigen (HLA) complex controls adaptive immunity by presenting defined fractions of the intracellular and extracellular protein content to immune cells. Understanding the benign HLA ligand repertoire is a prerequisite to define safe T-cell-based immunotherapies against cancer. Because benign tissues are rarely available, normal tissue adjacent to the tumor has been used, where available, as a benign surrogate when defining tumor-associated antigens. However, this comparison has proven to be insufficient and has even resulted in lethal outcomes. In order to match the tumor immunopeptidome with an equivalent counterpart, we created the HLA Ligand Atlas, the first extensive collection of paired HLA-I and HLA-II immunopeptidomes from 227 benign human tissue samples. This dataset facilitates a balanced comparison between tumor and benign tissues at the HLA ligand level.
Keywords: adaptive immunity; antigen presentation; antigens; carbohydrate; immunotherapy; translational medical research; tumor-associated
Year: 2021 PMID: 33858848 PMCID: PMC8054196 DOI: 10.1136/jitc-2020-002071
Source DB: PubMed Journal: J Immunother Cancer ISSN: 2051-1426 Impact factor: 13.751
Figure 1. The HLA Ligand Atlas: content and scope of the data resource. (A) The high-throughput experimental and computational workflow steps used to analyze thousands of HLA-I and HLA-II peptides isolated from benign tissues. The resulting HLA-I and HLA-II immunopeptidomes are compiled in the searchable web resource: https://hla-ligand-atlas.org. See online supplemental figure S1 for details of the quality control workflow. See online supplemental figure S2 for proof of principle using autopsy tissues. (B) HLA-I and HLA-II peptides expand the known immunopeptidome as curated in the public repositories SysteMHC and IEDB. (C) Sample matrix: HLA-I (blue triangles) and HLA-II (orange triangles) samples included in the HLA Ligand Atlas cover 29 different tissues obtained from 21 human subjects. See online supplemental table S1 for patient characteristics. (D) Position-wise coverage (%) of identified source proteins by HLA ligands binned into four groups: (1) exclusively covered by HLA-I peptides, (2) exclusively covered by HLA-II peptides and (3–4) covered by both and separated into higher position-wise coverage by either HLA-I or HLA-II peptides. HLA, human leucocyte antigen; IEDB, immune epitope database; LC-MS/MS, liquid chromatography mass spectrometry.
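The position-wise coverage in panel (D) can be illustrated with a minimal sketch. It assumes coverage means the percentage of residues in a source protein that lie inside at least one identified peptide; the function name and the toy sequences are hypothetical and are not taken from the Atlas code.

```python
def positionwise_coverage(protein: str, peptides: list[str]) -> float:
    """Percentage of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical toy protein and peptide lists (not real Atlas data).
toy_protein = "ABCDEFGHIJ"
hla1_cov = positionwise_coverage(toy_protein, ["BCD", "CDE"])  # residues B..E
hla2_cov = positionwise_coverage(toy_protein, ["FGHIJ"])       # residues F..J
```

Comparing the two percentages per protein would then assign it to one of the four groups named in the caption.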
Figure 2. Source proteins and HLA allotype coverage characteristics of HLA ligands. (A) Length distribution of identified HLA-I and HLA-II peptides from all samples was analyzed. HLA-II peptide lengths are mirrored on the negative side of the x-axis. (B, C) Global overview of HLA-I predicted binders distributed across HLA molecules. HLA binding prediction was performed with NetMHCpan-4.1 (% binding rank ≤2) and SYFPEITHI (score >50%), while multiple HLA allotypes per peptide were allowed as long as their scores met the aforementioned thresholds. HLA binding prediction for HLA-II ligands was performed with NetMHCIIpan-4.0 and MixMHC2pred (% binding rank ≤5 for both). (D) Pairwise hierarchical clustering of samples based on the Jaccard similarity between HLA-I (blue) and HLA-II (orange) source proteins. The dendrogram illustrates the nearest neighbor based on the similarity between tissues and subjects. (E) Violin plots illustrate the distribution of the Jaccard similarity index for each pairwise comparison: same subject, different tissues; different subjects, same tissue; and different subjects, different tissues. (F) Gene ontology (GO) term enrichment of cellular components was performed for HLA-I and HLA-II source proteins. Top 10 enriched genes with respect to their log10 p value (Fisher’s exact test) differentiate between intracellular and extracellular antigen processing pathways. HLA, human leucocyte antigen.
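The Jaccard similarity used in panels (D) and (E) is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the sample names and protein identifiers are hypothetical placeholders, and returning 0.0 for two empty sets is a convention chosen here, not one stated in the source.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical source-protein sets per sample (subject, tissue).
samples = {
    ("subject1", "lung"): {"P1", "P2", "P3"},
    ("subject1", "liver"): {"P2", "P3", "P4"},
    ("subject2", "lung"): {"P1", "P5"},
}

# All pairwise similarities, as would feed a clustering or violin plot.
pairwise = {
    (s, t): jaccard(samples[s], samples[t])
    for s, t in combinations(samples, 2)
}
```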
Figure 3. Tissues exhibit a gradual separation based on the immunopeptidome yield. (A) The number of identified HLA-I and HLA-II peptides per sample (subject and tissue combinations) was sorted and plotted by median immunopeptidome yield per tissue. Boxes span the inner two quartiles of the distribution and whiskers extend by the same length outside the box. Remaining outlier samples are indicated as black diamonds. The number of subjects contributing to each tissue is illustrated on the y-axis in parentheses. (B) A linear model was used to correlate the log-transformed HLA-I and HLA-II median peptide yields with log-transformed median gene expression counts (RPKM) of the immunoproteasome and HLA-DRB1 per tissue (reference 32). Corresponding R², p value (F-statistic) and Spearman's rho are indicated in the bottom right box. HLA, human leucocyte antigen.
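The linear model in panel (B) can be sketched as an ordinary least-squares fit on log-transformed values. The caption does not state the log base or the fitting software, so base 10 and a hand-written fit are assumptions here, and the input numbers are invented for illustration. The p value and Spearman's rho reported in the figure are not computed in this sketch.

```python
import math


def loglog_fit(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Least-squares fit of log10(y) on log10(x); returns (slope, intercept, R^2)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation of the logs
    return slope, intercept, r2


# Hypothetical per-tissue medians: expression (RPKM) vs peptide yield.
expression = [1.0, 10.0, 100.0, 1000.0]
peptide_yield = [2.0, 20.0, 200.0, 2000.0]
slope, intercept, r2 = loglog_fit(expression, peptide_yield)
```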
Figure 4. Small subsets of source proteins are tissue exclusive. (A, B) Gene set enrichment (left) was tested for each tissue by correlating unique HLA-I and HLA-II source proteins per tissue with upregulated genes as annotated in GTEx. Heatmaps depict log10 p values (Fisher’s exact test) for each pairwise comparison. The number of tissue-specific HLA-I and HLA-II source proteins is depicted through the bar plot for each tissue on the right-hand side of the heatmaps. In addition, GO term enrichment (right) of biological processes was performed using the PANTHER DB web service for selected tissues with the same set of HLA-I and HLA-II tissue-specific source proteins. Top five enriched terms with respect to their log10 p value (Fisher’s exact test) were selected. DB, database; GO, gene ontology; HLA, human leucocyte antigen.
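Fisher's exact test on a 2×2 contingency table can be sketched with the hypergeometric distribution. The caption does not say whether a one-sided or two-sided test was used, so the one-sided (enrichment) version below is an assumption, and the table layout and counts are hypothetical.

```python
from math import comb, log10


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p value for over-representation.

    Table layout (hypothetical):
        a = tissue-specific source proteins in the upregulated gene set
        b = tissue-specific source proteins not in the set
        c = other source proteins in the set
        d = other source proteins not in the set
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    hits = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return hits / total


p_value = fisher_enrichment_p(3, 1, 1, 3)
log10_p = log10(p_value)  # the heatmaps in the figure show log10 p values
```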
Figure 5. Cryptic peptides are part of the benign immunopeptidomes. (A) Spectra were searched with Peptide-PRISM to identify peptides of cryptic origin. Briefly, de novo sequencing was performed, and the top 10 sequences per spectrum were queried against a database consisting of the three-frame translated transcriptome (Ensembl 90) and the six-frame translated human genome (hg38). Target-decoy search was performed per database stratum, separately for canonical and cryptic peptides. (B) The HLA-allotype distribution of cryptic peptides was plotted in relation to cryptic and canonical peptides predicted to bind to the respective HLA allotype across all subjects and tissues. (C) Distribution of identified cryptic peptides categorized into multiple non-coding genomic regions. (D) Linear model correlating measured retention times (RT) of cryptic peptides with their predicted RTs trained on canonical peptide RTs. Corresponding R², pi (width of the prediction interval, red dashed lines) and frac (the number of peptides falling into the prediction interval) are indicated in the bottom right. (E) 36 cryptic peptides were selected for spectral validation with synthetic peptides. The similarity between the synthetic and experimental spectrum was computed by correlation scores. (F) Exemplary spectral comparison of the cryptic peptide SVASPVTLGK and its synthesized heavy isotope-labeled counterpart (P+6). Matching b (red) and y ions (blue) are indicated as well as the isotope mass shifted ions (orange stars) of the synthesized peptide. FDR, false discovery rate; HLA, human leucocyte antigen.
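The per-stratum target-decoy search in panel (A) can be illustrated with the common estimate FDR ≈ decoy hits / target hits above a score threshold, computed separately for each stratum. This is a generic sketch of that idea, not Peptide-PRISM's actual procedure; the stratum names, scores and threshold are hypothetical.

```python
def estimate_fdr(target_scores: list[float], decoy_scores: list[float],
                 threshold: float) -> float:
    """Estimated FDR = decoys / targets among hits scoring >= threshold."""
    targets = sum(s >= threshold for s in target_scores)
    decoys = sum(s >= threshold for s in decoy_scores)
    return decoys / targets if targets else 0.0


# Hypothetical scores, kept separate per database stratum.
strata = {
    "canonical": {"targets": [10, 9, 8, 7, 3], "decoys": [8, 2, 1]},
    "cryptic": {"targets": [9, 6, 5], "decoys": [7, 6, 1]},
}

fdr_per_stratum = {
    name: estimate_fdr(s["targets"], s["decoys"], threshold=7)
    for name, s in strata.items()
}
```

Keeping the strata separate means a large, decoy-rich search space such as the six-frame genome translation does not share an error estimate with the much smaller canonical proteome.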
Figure 6. HLA Ligand Atlas data enables prioritization of tumor-associated antigens (TAAs) and HLA ligands form hotspots in source proteins. (A) The size-proportional Venn diagram illustrates the overlap between the pooled glioblastoma (GBM) and benign HLA-I and -II immunopeptidomes, respectively. The waterfall plots show the number of glioblastoma-associated HLA-I ligands and their frequency among the three GBM patients analyzed. (B) Published CTAs are presented as HLA-I or HLA-II ligands on benign tissues, including testis, but also in glioblastoma tumors. The number of identified samples either from the HLA Ligand Atlas or the glioblastoma dataset is depicted on the x-axis, provided that each CTA has been identified with at least two different HLA ligands. The CTA KIA1210 was identified exclusively on HLA-I source proteins in testis and is marked with an asterisk. (C) The position-wise HLA ligand coverage profiles as available in the HLA Ligand Atlas web interface for two exemplary proteins (left), the fibrinogen alpha chain (UniProt ID P02671, length 866 aa, top) and the basement membrane-specific heparan sulfate proteoglycan core protein (UniProt ID P98160, length 4391 aa, bottom) are shown, illustrating the spatial clustering of HLA ligands into hotspots. For P02671 a close-up of such a cluster is shown in the form of a multiple sequence alignment of the identified peptides (right). CTAs, cancer testis antigens; GBM, glioblastoma; HLA, human leucocyte antigen.
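The comparison behind panel (A) amounts to a set difference followed by a per-patient frequency count. The sketch below assumes a ligand counts as glioblastoma-associated when it is absent from the pooled benign set; patient labels and peptide names are hypothetical placeholders.

```python
from collections import Counter


def tumor_exclusive_frequencies(tumor_samples: dict[str, set[str]],
                                benign: set[str]) -> Counter:
    """Count, for each ligand never seen in benign tissue, how many patients present it."""
    counts: Counter = Counter()
    for ligands in tumor_samples.values():
        counts.update(ligands - benign)  # drop anything seen in benign tissue
    return counts


# Hypothetical ligand sets for three tumor samples and the pooled benign set.
tumor = {
    "GBM1": {"PEP_A", "PEP_B"},
    "GBM2": {"PEP_A", "PEP_C"},
    "GBM3": {"PEP_A"},
}
benign_pool = {"PEP_C"}

frequencies = tumor_exclusive_frequencies(tumor, benign_pool)
ranked = frequencies.most_common()  # order used for a waterfall-style plot
```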