Ying Xu, Guan-Hua Su, Ding Ma, Yi Xiao, Zhi-Ming Shao, Yi-Zhou Jiang.
Abstract
Immunotherapies play critical roles in cancer treatment. However, given that only a few patients respond to immune checkpoint blockades and other immunotherapeutic strategies, more novel technologies are needed to decipher the complicated interplay between tumor cells and the components of the tumor immune microenvironment (TIME). Tumor immunomics refers to the integrated study of the TIME using immunogenomics, immunoproteomics, immune-bioinformatics, and other multi-omics data reflecting the immune states of tumors, which has relied on the rapid development of next-generation sequencing. High-throughput genomic and transcriptomic data may be utilized for calculating the abundance of immune cells and predicting tumor antigens, an approach referred to as immunogenomics. However, as bulk sequencing represents the average characteristics of a heterogeneous cell population, it fails to distinguish distinct cell subtypes. Single-cell-based technologies enable better dissection of the TIME through precise immune cell subpopulation and spatial architecture investigations. In addition, radiomics and digital pathology-based deep learning models largely contribute to research on cancer immunity. These artificial intelligence technologies have performed well in predicting response to immunotherapy, with profound significance in cancer therapy. In this review, we briefly summarize conventional and state-of-the-art technologies in the fields of immunogenomics, single-cell technologies, and artificial intelligence, and present prospects for future research.
Year: 2021 PMID: 34417437 PMCID: PMC8377461 DOI: 10.1038/s41392-021-00729-7
Source DB: PubMed Journal: Signal Transduct Target Ther ISSN: 2059-3635
Fig. 1 Components and interactions of the tumor immune microenvironment. a Cellular compositions of the tumor immune microenvironment. b Brief illustration of cell–cell interaction in anti-tumor immunity. cDC conventional dendritic cells, CTL cytotoxic T lymphocyte, Gzm granzyme, IFN interferon, MHC major histocompatibility complex, NK cells natural killer cells, pDC plasmacytoid dendritic cells, PFN perforin, TCR T cell receptor, Th T helper cell, TNF tumor necrosis factor
Computational tools used in tumor immunogenomics with high-throughput next-generation sequencing data
| Tool | Characteristics | URL | Year | Ref. |
|---|---|---|---|---|
| CIBERSORT | Based on linear support vector regression, deconvolution from microarray data, and known gene expression profiling set. | | 2015 | [ |
| CIBERSORTx | Expanding data source to single-cell RNA-seq. | | 2019 | [ |
| DeconRNASeq | Constrained least square regression model validated with RNA-seq data from five human tissues | | 2013 | [ |
| EPIC | Based on constrained least square, incorporates a non-negative condition into deconvolution. | | 2017 | [ |
| ESTIMATE | Generating a stromal score and immune score to reflect the tumor purity based on ssGSEA | | 2013 | [ |
| FARDEEP | Based on adaptive least trimmed square, FARDEEP removes outliers and outputs the absolute quantification of cell types | | 2019 | [ |
| MCP-counter | The score is the geometric mean of the expression level of cell-specific genes, implying the absolute abundance of immune cell types among samples | | 2016 | [ |
| MuSiC | Deconvolution of bulk sequencing data based on cell type-specific gene expression reference from single-cell RNA-seq | | 2019 | [ |
| NITUMD | A semi-supervised nonnegative matrix factorization framework with a trichotomous signature matrix | | 2020 | [ |
| PERT | A non-negative maximum likelihood-based method applied to fresh human umbilical cord blood samples | – | 2012 | [ |
| quanTIseq | Quantification of ten different immune cell types and other uncharacterized cells based on RNA-seq data | | 2019 | [ |
| TIMER | Immune cells are estimated via transcriptomic data and the correlations among the immunological, genomic, and clinical features were established. | | 2016 | [ |
| xCell | Spillover compensation is used to separate cell types with high correlation | | 2017 | [ |
| CN-Learn | Machine-learning framework integrating calls from multiple CNV detection algorithms and learning to accurately identify true CNVs | | 2019 | [ |
| deepSNV | Detecting and quantifying sub-clonal SNVs in mixed populations even for low-frequency variants | | 2012 | [ |
| DeepVariant | SNP and small-indel variant caller using deep neural networks in aligned NGS read data | | 2018 | [ |
| EBCall | Discriminating somatic mutations from sequencing errors with both moderate and low allele frequencies | | 2013 | [ |
| GATK | Industry standard for identifying SNPs and indels via analyzing WES, WGS, and RNA-seq data | | 2010 | [ |
| LoFreq | Modeling sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population | | 2012 | [ |
| MuTect2 | Sensitive detection of somatic point mutations especially in low-allelic-fraction events | | 2013 | [ |
| Platypus | Using local de novo assembly to generate candidate variants, including SNPs, indels, and complex polymorphisms | | 2014 | [ |
| PyroHMMsnp | Realigning read sequences around homopolymers and inferring the underlying genotype by using a Bayesian approach | | 2013 | [ |
| SAMtools | Variant caller utilizing the post-processing alignments in the SAM/BAM format | | 2009 | [ |
| SCcaller | Firm foundation for standardized somatic-mutation analysis in single-cell genomics based on single-cell multiple displacement amplification (SCMDA) | | 2017 | [ |
| SomaticSeq | Somatic mutation detection pipeline used to produce highly accurate somatic mutation calls for both SNVs and small INDELs | | 2015 | [ |
| SomaticSniper | Calling of somatic SNPs and indels from matched tumor–normal NGS data | | 2011 | [ |
| Strelka2 | Fast and accurate caller of germline and somatic variants based on Strelka | | 2018 | [ |
| VarDict | Variant caller of SNV, MNV, INDELs, and SVs, enabling ultra-deep sequencing | | 2016 | [ |
| VarScan2 | Detection of somatic mutations and CNVs in exome data from tumor–normal pairs | | 2012 | [ |
| HISAT2 | Graph-based genome alignment and genotyping, also applied for DNA fingerprinting | | 2019 | [ |
| HLA-HD | Extraction of six-digit resolution HLA-I and HLA-II from NGS data | | 2017 | [ |
| HLA-miner | HLA-I and HLA-II typing directly from non-targeted RNA-seq, WGS and WES data | | 2012 | [ |
| HLAProfiler | K-mer profile-based method for HLA calling in RNA-seq data for both rare and common HLA alleles at two-field precision | | 2017 | [ |
| HLAreporter | Extraction of HLA-I and HLA-II from NGS data at four-digit resolution | | 2015 | [ |
| HLAscan | Determination of HLA type across the whole-genome, exome, and target sequences | | 2017 | [ |
| HLAssign | First highly automated open-source HLA-typing method for NGS data to three-field resolution | | 2015 | [ |
| HLA-VBseq | Genotyping of HLA alleles at an 8-digit resolution from WGS data without the need of prior knowledge regarding the HLA loci | | 2015 | [ |
| Kourami | Graph-guided assembly technique used to provide highly classical HLA typing | | 2018 | [ |
| Optitype | Genotyping of major and minor HLA-I alleles from RNA-seq, WGS, and WES data not specifically enriched for the HLA cluster | | 2014 | [ |
| PHLAT | High-accuracy genotyping of HLA-I and HLA-II alleles from RNA-seq, WGS, WES, and targeted sequencing at a four-digit resolution | | 2014 | [ |
| Polysolver | High-precision HLA typing of WES data even relatively low-coverage WES data, and subsequent mutation detection | | 2015 | [ |
| seq2HLA | Using standard RNA-Seq reads as input to determine the HLA-I and HLA-II types and expression at a four-digit resolution | | 2012 | [ |
| SNP2HLA | Imputing four-digit classical alleles and amino acid polymorphisms at class I and class II loci | | 2013 | [ |
| ACME | Pan-specific peptide–MHC class I binding prediction through attention-based deep neural networks | | 2019 | [ |
| MHCAttnNet | MHC-peptide binding prediction of MHC alleles classes I and II using an attention-based deep neural model | | 2020 | [ |
| MHCflurry | Open-source class I MHC binding affinity prediction, using mass spectrometry datasets for model selection and showing competitive accuracy | | 2018 | [ |
| MHCSeqNet | Open-source deep neural network model for universal MHC binding prediction, accepting peptides of any length | | 2019 | [ |
| NetMHC | High accuracy prediction of pMHC binding affinity to human and non-human MHC-I molecules based on ANN and PSSMs | | 2008 | [ |
| NetMHCII | High accuracy prediction of pMHC binding affinity to human and non-human MHC-II molecules based on ANN and PSSMs | | 2018 | [ |
| NetMHCIIpan | Pan-specific version of netMHCII | | 2020 | [ |
| NetMHCpan | Pan-specific version of netMHC | | 2020 | [ |
| PSSMHCpan | PSSM based software for predicting class I peptide-HLA binding affinity | | 2017 | [ |
| PUFFIN | Deep residual network-based computational approach that quantifies uncertainty in pMHC affinity prediction | | 2019 | [ |
ANN artificial neural network, CNVs copy number variations, HLA human leukocyte antigen, INDELs insertions and deletions, NGS next-generation sequencing, MNVs multiple-nucleotide variants, PSSM position-specific scoring matrix, RNA-seq RNA sequencing, SNP single-nucleotide polymorphism, SNVs single nucleotide variations, SVs structural variants, TIME tumor immune microenvironment, WES whole-exome sequencing, WGS whole-genome sequencing
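Several of the deconvolution tools in the table above (for example DeconRNASeq and EPIC) are described as constrained least-squares methods. The sketch below is a toy illustration of that general idea, not the implementation of any listed tool: it treats a bulk expression vector as a non-negative mixture of reference cell-type profiles and recovers the fractions by projected gradient descent. The signature matrix, the true fractions, and the function names are all hypothetical.

```python
def mix(signature, fractions):
    """Forward model: bulk expression = signature matrix x cell fractions."""
    return [sum(row[k] * fractions[k] for k in range(len(fractions)))
            for row in signature]


def deconvolve(signature, bulk, steps=2000, lr=0.002):
    """Estimate non-negative cell fractions by projected gradient descent.

    Minimizes ||signature @ f - bulk||^2 subject to f >= 0, then rescales
    f to sum to 1 so the result reads as relative proportions.
    The step size is tuned to this toy matrix (an assumption of the sketch).
    """
    n_types = len(signature[0])
    f = [1.0 / n_types] * n_types
    for _ in range(steps):
        residual = [p - b for p, b in zip(mix(signature, f), bulk)]
        grad = [2 * sum(signature[g][k] * residual[g] for g in range(len(bulk)))
                for k in range(n_types)]
        # Gradient step followed by projection onto the non-negative orthant
        f = [max(0.0, fk - lr * gk) for fk, gk in zip(f, grad)]
    total = sum(f)
    return [fk / total for fk in f] if total > 0 else f


# Hypothetical signature: rows are genes, columns are cell types
SIGNATURE = [
    [9.0, 1.0, 0.5],
    [1.0, 8.0, 0.5],
    [0.5, 1.0, 7.0],
    [2.0, 2.0, 2.0],
]
TRUE_FRACTIONS = [0.6, 0.3, 0.1]

# Simulate a noise-free bulk sample, then try to recover the fractions
bulk = mix(SIGNATURE, TRUE_FRACTIONS)
estimated = deconvolve(SIGNATURE, bulk)
print([round(x, 3) for x in estimated])
```

The forward model `mix` also illustrates why bulk data alone cannot separate cell subtypes: the measured vector is a weighted average, and the fractions are only recoverable when a reference signature is supplied.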
Strengths and weaknesses of immune cell quantification algorithms
| Algorithm | Category | Strengths | Weaknesses |
|---|---|---|---|
| ESTIMATE | G | Available for tumor purity and global immune status | Only a stromal score and an immune score are output, so the information is limited [ |
| xCell | G | Available for inference of 64 immune and stromal cell types | The definitions of the cell subtypes are sometimes unclear; accuracy of prediction for some cell types is uncertain [ |
| MCP-counter | G | Available for inference of fibroblasts and endothelial cells; available for absolute quantification of a specific cell population across samples; available for between-sample comparison | Relatively few cell types included in the inference (8 types) |
| CIBERSORT | D | Available for inference of 22 immune cell subtypes; available for between-cell-type comparison | Outputs only the relative proportions of distinct cell types within a single sample; trained on microarray rather than RNA-seq data [ |
| EPIC | D | Available for inference of fibroblasts, endothelial cells, and uncharacterized cells; enables inference of tumor purity from the uncharacterized cell proportion; available for both between-sample and between-cell-type comparison | Only 6 immune cell types available; cannot discriminate cell types with transcriptional similarity |
| quanTIseq | D | Available for inference of 10 immune cell subtypes; available for both between-sample and between-cell-type comparison | Not available for quantification of stromal cells (e.g., cancer-associated fibroblasts) |
| TIMER | D | A user-friendly analytic web tool for cancer immunology research | Only 6 immune cell types and no stromal cells available; outputs only the relative proportions of distinct cell types within a single sample |
| CIBERSORTx | D | Adopts a more convincing gene expression reference from single-cell sequencing | Suitability for some tumor types needs further validation |
| MuSiC | D | Adopts a more convincing gene expression reference from single-cell sequencing; available for tissues with closely correlated cell types | Suitability for some tumor types needs further validation; does not accept TPM data as input [ |
| FARDEEP | D | A robust machine learning tool that eliminates outliers in the dataset; suitable for deconvolution of noisy datasets | A different signature matrix should be adopted according to the type of gene expression data [ |
G GSEA-based method, D deconvolution method
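The tool table earlier in this record describes the MCP-counter score as the geometric mean of the expression of cell-specific genes. A minimal sketch of such a marker-based score follows. The marker sets and expression values are invented for illustration, and this is not the MCP-counter package's code. Because the score is computed per sample from a fixed marker set, it lends itself to comparing one cell population between samples, which matches the between-sample strength listed above.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def marker_score(expression, markers):
    """Score one cell population in one sample from its marker genes."""
    return geometric_mean([expression[g] for g in markers])


# Hypothetical marker sets and expression values (e.g., TPM), for illustration only
MARKERS = {
    "T cells": ["CD3D", "CD3E", "CD2"],
    "B lineage": ["CD19", "MS4A1", "CD79A"],
}
SAMPLES = {
    "tumor_A": {"CD3D": 40.0, "CD3E": 10.0, "CD2": 25.0,
                "CD19": 2.0, "MS4A1": 4.0, "CD79A": 1.0},
    "tumor_B": {"CD3D": 4.0, "CD3E": 1.0, "CD2": 2.5,
                "CD19": 20.0, "MS4A1": 40.0, "CD79A": 10.0},
}

# One score per (sample, cell population) pair
scores = {
    sample: {cell: marker_score(expr, genes) for cell, genes in MARKERS.items()}
    for sample, expr in SAMPLES.items()
}
for sample, by_cell in scores.items():
    print(sample, {cell: round(s, 2) for cell, s in by_cell.items()})
```

In this toy data the T cell score of `tumor_A` is ten times that of `tumor_B`, a between-sample contrast; the scores are in expression units rather than cell fractions.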
Comparison of immunomics technologies at the single-cell level
| Technology | Spatial | Strengths | Weaknesses |
|---|---|---|---|
| H&E | √ | Simple intelligible protocol; lower cost and less time; impressive preservation of tissue morphology | Lack of specific markers; only morphological features and basophilic or eosinophilic information available |
| mIHC&IF | √ | Highly specific marker; detailed information regarding the abundance, distribution and localization of certain substances | Spectral overlap; limited simultaneously detectable markers; time-consuming and labor intensive |
| Flow cytometry | | Affordable and fast; machinery available in most institutes; more tools available for analysis; could perform cell sorting | Spectral overlap; fluorescent spill-over; targets need to be selected carefully (biased) |
| CyTOF | | More simultaneously detectable markers; higher accuracy without spectral overlap | Costly (both the machine and antibodies); slower processing speed and lower sensitivity; targets need to be carefully selected (biased) |
| Spectral flow cytometry | | Compatible with flow cytometry (both the machine and antibodies); greatly eliminates confounding factors | Targets need to be carefully selected (biased) |
| Single-cell seq | | Unbiased; parallel multi-omics analysis; generation of new hypotheses | Limited to nearly 10,000 cells; limited sequencing depth/coverage; costly, time-consuming and labor intensive |
| CODEX | √ | Higher accuracy and specificity; detection of over 50 markers in a single slide | Affected by the tissue quality; accumulative structural changes; costly, time-consuming and labor intensive |
| IMC | √ | At near-optical resolution; could be applied to biobanked tissues; more simultaneously detectable markers | Lack of suitable commercial antibodies for use; comparatively lower rate of image acquisition; limited extent to which slides can be scanned; costly and only available in high-end facilities |
| MIBI-TOF | √ | High accuracy at near-optical resolution; could be applied to biobanked tissue; indefinitely stable samples; more simultaneously detectable markers | Lack of suitable commercial antibodies for use; comparatively lower rate of image acquisition; limited extent to which slides can be scanned; costly and only available in high-end facilities |
| Spatial transcriptomics | √ | Visualization and quantitative analysis of the transcriptome with spatial resolution | Small-niche but not real single-cell sequencing; comparatively low resolution |
| Slide-seq | √ | High spatial resolution; high scalability to large tissue volumes; lower cost and better accessibility | Small-niche but not real single-cell sequencing; not suitable for analyzing multiple sections; confined to transcriptomics data |
| HDST | √ | Higher spatial resolution than Slide-seq; high scalability to large tissue volumes; lower cost and better accessibility | Small-niche but not real single-cell sequencing; not suitable for analyzing multiple sections; confined to transcriptomics data |
| DBiT-seq | √ | Unbiased; high spatial resolution multi-omics seq; compatible with different tissues; high accessibility and operability | Small-niche but not real single-cell sequencing; existence of a theoretical limit of the pixel size |
| ZipSeq | √ | Provides a complete map of live tissues; may integrate with multimodal measurements | Confined to transcriptomics data; costly and only available in few facilities |
CODEX codetection by indexing, CyTOF cytometry by time-of-flight, DBiT-seq deterministic barcoding in tissue for spatial omics sequencing, HDST high-definition spatial transcriptome, H&E hematoxylin-eosin, mIHC multiplex immunohistochemistry, mIF multiplex immunofluorescence, IMC imaging mass cytometry, MIBI-TOF multiplexed ion beam imaging by time-of-flight
Fig. 2 Development of single-cell spatial technologies: from germination to maturity. (1) Initiation stage: H&E staining, a conventional but significant method that clearly demonstrates the cellular and tissue structure but underperforms in the discrimination of immune cells. (2) Growing stage: The specific binding of antibodies and antigens drove the spatial technologies to a new height as represented by IHC and IF. In addition, multiplex IHC/IF technologies allow the detection of multiple markers simultaneously on a single slice, improving our understanding of the TIME spatial architecture. (3) Mature stage: Given that spectral overlap limits the further application of mIHC/IF, dye cycling is a main optimization strategy, in which only two or three antibodies are imaged by fluorescence microscopy in each cycle. Then, the fluorophores are cleaved and washed, and this cycle is repeated until all antibodies are imaged, as in CODEX, MxIF, and MELC. IMC and MIBI-TOF, which utilize metal-conjugated antibodies to eliminate confounding factors such as spectral overlap, are also promising. (4) Postmature stage: Combining high-resolution spatial information with single-cell expression data, spatial transcriptomics, Slide-seq, HDST, etc. explore brand-new ideas for the characterization of the spatial architecture. CODEX codetection by indexing, HDST high-definition spatial transcriptome, H&E hematoxylin-eosin, IF immunofluorescence, IHC immunohistochemistry, IMC imaging mass cytometry, MELC multiepitope ligand cartography, MIBI-TOF multiplexed ion beam imaging by time-of-flight, MxIF multiplexed fluorescence microscopy method
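The dye-cycling strategy described in the caption, where a few antibodies are imaged per cycle and the cycle repeats until the panel is exhausted, can be sketched as a simple batching schedule. The marker panel and the function name below are hypothetical, and the sketch ignores practical constraints such as fluorophore assignment or antibody ordering.

```python
def plan_cycles(antibodies, per_cycle=3):
    """Split an antibody panel into consecutive imaging cycles.

    Each cycle images at most `per_cycle` antibodies; fluorophores are then
    assumed to be cleaved and washed before the next cycle begins.
    """
    if per_cycle <= 0:
        raise ValueError("per_cycle must be positive")
    return [antibodies[i:i + per_cycle]
            for i in range(0, len(antibodies), per_cycle)]


# Hypothetical 8-marker panel, for illustration only
PANEL = ["CD3", "CD4", "CD8", "CD20", "CD68", "FOXP3", "PD-1", "PD-L1"]

cycles = plan_cycles(PANEL, per_cycle=3)
for number, batch in enumerate(cycles, start=1):
    print(f"cycle {number}: image {batch}, then cleave and wash")
```

The number of cycles grows as the panel size divided by the antibodies imaged per cycle, rounded up, which is one reason large panels are time-consuming with this approach.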
Fig. 3 Radiomics and computational pathology in tumor immunity exploration. Radiological and pathological image-derived omics data enable the investigation of the tumor immune microenvironment (TIME) and response to immunotherapy. For a raw radiology image, regions of interest, generally representing the tumor lesion area, are segmented, while a pathological image is divided into numerous sub-images. Two methods can be applied to analyze these high-dimensional data. First, features, including but not limited to statistical features, tumor volume features, and texture features, are extracted and analyzed by professional clinicians. Alternatively, images are input into a convolutional neural network (CNN). After a complicated deep learning process, robust models are output. These radiomics or digital pathologic models could finally be established to evaluate or predict the immune index, which can be divided into three aspects. First, TIME dissection encompasses distinct immune cell subset classification and TIME spatial architecture characterization, resembling single-cell technologies and CODEX, respectively. Second, immune-related biomarkers, such as the tumor mutation burden (TMB), could be predicted. Third, response to immunotherapy and clinical outcomes could be predicted. ROI regions of interest, TIME tumor immune microenvironment, TMB tumor mutation burden
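The caption notes that a pathological image is divided into numerous sub-images before analysis. As a rough sketch of that tiling step, the function below enumerates tile boxes over an image of given pixel dimensions. The tile size, the stride, and the function name are assumptions chosen for illustration; real pipelines typically also discard background tiles, which is not shown here.

```python
def tile_coordinates(width, height, tile=512, stride=512):
    """Return (x, y, w, h) boxes for full tiles covering a width x height image.

    Partial tiles at the right and bottom edges are dropped in this sketch.
    A stride smaller than the tile size produces overlapping tiles.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    boxes = []
    for y in range(0, height - tile + 1, stride):
        for x in range(0, width - tile + 1, stride):
            boxes.append((x, y, tile, tile))
    return boxes


# Hypothetical slide region of 2048 x 1536 pixels
boxes = tile_coordinates(2048, 1536, tile=512, stride=512)
print(len(boxes), boxes[0], boxes[-1])
```

Each box could then be cropped and passed to a feature extractor or a CNN, with tile-level outputs aggregated to a slide-level prediction.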
Clinical significance of immunomics technologies
| Category | Representative technique | Example of clinical relevance | Cancer type | Ref. |
|---|---|---|---|---|
| Bulk sequencing | Abnormal peptides prediction | Developing tumor neoantigen vaccines | Melanoma | [ |
| Bulk sequencing | HLA typing | Developing tumor neoantigen vaccines | Glioblastoma | [ |
| Bulk sequencing | MHC-antigen binding affinity | Developing tumor neoantigen vaccines | Melanoma, NSCLC | [ |
| Conventional staining on pathological slides | Immunohistochemistry; immunofluorescence | Detecting immunotherapy biomarkers, such as PD-L1 and TILs | Multiple cancer types | [ |
| Single-cell technologies | CyTOF | Identifying multiple prognosis-correlated T cell and macrophage phenotypes | Renal cell carcinoma | [ |
| Single-cell technologies | CyTOF | Revealing immune cell heterogeneity between glioma and brain metastases | Brain cancer | [ |
| Single-cell technologies | CyTOF | Discovering distinct liver TIME driving resistance to immunotherapy | Liver metastatic cancer | [ |
| Single-cell technologies | MIBI-TOF | Dissecting the spatial architecture of the TIME as a promising immunotherapy biomarker | Triple-negative breast cancer | [ |
| Single-cell technologies | Single-cell transcriptomics | ILCregs indicate a poor prognosis | Colorectal cancer | [ |
| Single-cell technologies | Single-cell transcriptomics | TNFRSF9+ Treg cells indicate a poor prognosis | Lung adenocarcinoma | [ |
| Single-cell technologies | Single-cell transcriptomics | CLEC9A+ DCs represent a better clinical outcome | Nasopharyngeal carcinoma | [ |
| Single-cell technologies | Single-cell transcriptomics | TCF1-PD1+ T cells correlate with sensitivity to immunotherapy | Melanoma | [ |
| Single-cell technologies | Single-cell transcriptomics | CD11b+ F4/80+ macrophages lead to resistance to immunotherapy | Liver metastatic cancer | [ |
| Single-cell technologies | Single-cell transcriptomics | CCL22+ cDC1 cells are related to sensitivity to CD40 agonist therapy | Colon cancer | [ |
| Single-cell technologies | Single-cell transcriptomics | Cytotoxic CD4+ T cells serve as a biomarker of responders to anti-PD-L1 treatment | Bladder cancer | [ |
| Artificial intelligence | Radiomics | Radiomics signature predicts immunotherapy biomarker “CytAct” | Lung adenocarcinoma | [ |
| Artificial intelligence | Radiomics | Radiomics signature predicts immunotherapy biomarker TMB | NSCLC | [ |
| Artificial intelligence | Radiomics | Radiomics signature predicts chemotherapy biomarker “ImmunoScore” | Gastric cancer | [ |
| Artificial intelligence | Radiomics | Radiomics signature predicts response to immunotherapy | Multiple cancer types | [ |
| Artificial intelligence | Digital pathology | Discovering the relationship between the number of immune cold regions and tumor relapse risk | Lung adenocarcinoma | [ |
cDC conventional dendritic cell, CytAct cytolytic activity score, CyTOF cytometry by time-of-flight, DC dendritic cell, HLA human leukocyte antigen, ILCreg regulatory innate lymphoid cell, MHC major histocompatibility complex, MIBI-TOF multiplexed ion beam imaging by time-of-flight, NSCLC non-small cell lung cancer, TMB tumor mutation burden, Treg regulatory T cell
Fig. 4 A landscape of immunomics: developmental tendency and future direction. a The timeline of immunomics technologies. b Historical development trajectory and future prospects of tumor immunomics. CODEX codetection by indexing, CyTOF cytometry by time-of-flight, ESTIMATE estimation of stromal and immune cells in malignant tumors using expression data, GATK Genome Analysis Toolkit, GenomeVIP Genome Variant Investigation Platform, HDST high-definition spatial transcriptome, IMC imaging mass cytometry, MCP-counter microenvironment cell populations-counter, MIBI-TOF multiplexed ion beam imaging by time-of-flight, mIF multiplex immunofluorescence, mIHC multiplex immunohistochemistry