Qi Zhang1, Manchao Kuang1, Haiyin An1, Yajing Zhang1, Kai Zhang2, Lin Feng1, Lei Zhang3, Shujun Cheng1. 1. State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China. 2. Department of Cancer Prevention, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China. 3. Department of Endoscopy ,National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Abstract
Understanding of the complex interaction between the peripheral immune system and lung cancer (LC) remains incomplete, limiting patient benefit. Here, we aimed to characterize the host peripheral immune response to LC and investigate its potential prognostic value. Bulk RNA-sequencing data of peripheral blood leucocytes (PBLs) from healthy volunteers and LC patients (n = 142) were analysed for characterization of host systemic immunity in LC. We observed broad blood transcriptome perturbations in LC patients that were heterogeneous, as two new subtypes were established independent of histology. Functionally, the heterogeneity between the two subtypes included dysregulation of diverse biological processes, such as the cell cycle, blood coagulation and inflammatory signalling pathways, together with the abundance and activity of blood cells, particularly lymphocytes and neutrophils, ultimately manifesting as differences in antitumour immune status. Based on these findings, a prognostic model composed of ten genes dysregulated in one LC subtype with relatively poor immune status was developed and validated in a Gene Expression Omnibus (GEO) data set (n = 108), helping to generate a prognostic nomogram. Collectively, our study provides novel and comprehensive insight into the heterogeneity of the host peripheral immune response to LC. The expression heterogeneity-based predictive model may help guide prognostic management for LC patients.
Understanding of the complex interaction between the peripheral immune system and lung cancer (LC) remains incomplete, limiting patient benefit. Here, we aimed to characterize the host peripheral immune response to LC and investigate its potential prognostic value. Bulk RNA-sequencing data of peripheral blood leucocytes (PBLs) from healthy volunteers and LC patients (n = 142) were analysed for characterization of host systemic immunity in LC. We observed broad blood transcriptome perturbations in LC patients that were heterogeneous, as two new subtypes were established independent of histology. Functionally, the heterogeneity between the two subtypes included dysregulation of diverse biological processes, such as the cell cycle, blood coagulation and inflammatory signalling pathways, together with the abundance and activity of blood cells, particularly lymphocytes and neutrophils, ultimately manifesting as differences in antitumour immune status. Based on these findings, a prognostic model composed of ten genes dysregulated in one LC subtype with relatively poor immune status was developed and validated in a Gene Expression Omnibus (GEO) data set (n = 108), helping to generate a prognostic nomogram. Collectively, our study provides novel and comprehensive insight into the heterogeneity of the host peripheral immune response to LC. The expression heterogeneity-based predictive model may help guide prognostic management for LC patients.
The immune response is believed to play a crucial role in the occurrence and progression of cancers, including lung cancer (LC). Past studies have pointed out that, as a highly heterogeneous disease with poor prognosis and high mortality, LC may be promoted via a reprogrammed immune microenvironment by contributing to inflammation, immune regulation and treatment response., Therefore, a variety of diagnostic and prognostic biomarkers have been proposed, and targeted immunotherapy has been applied in clinical therapy based on components of the tumour microenvironment., , From another perspective, cancer is perceived as a systemic disorder. Immune cell abundance, genomic DNA methylation and serological characteristics have been found to be altered in the peripheral immune system in LC., , Nevertheless, a comprehensive and in‐depth understanding of the impacts of LC on the host immune system is still lacking.At present, treatment and prognosis prediction for LC patients are based mainly on the tumour‐node‐metastasis (TNM) staging system. However, there are still considerable disparities in survival among patients with the same clinical characteristics. Researchers have tried to develop new prognostic auxiliary indicators to enhance the accuracy of LC prognosis,, , but intratumour heterogeneity and sampling bias may limit the validity and reproducibility of prognostic markers, particularly for solid tumour specimens. Hence, reliable prognostic biomarkers are needed to guide adjuvant therapy.As an accessible source of immune cells that migrate to and from tumour lesions, peripheral blood is exposed to tumour cells and host‐secreted factors released into the peripheral circulation system and is suitable for evaluating the immune status of cancer patients. Furthermore, genome‐wide expression profile analysis based on high‐throughput transcriptome sequencing (RNA‐Seq) of peripheral blood provides an unbiased and in vitro operational approach to measure transcriptional responses in complex diseases, including cancer.,In this study, we generated transcriptional profiles of peripheral blood leucocytes (PBLs) from LC patients and healthy subjects by RNA‐Seq. We characterized the transcription profile changes of PBLs in LC and parsed their heterogeneity for the first time. Moreover, we developed a risk score (RS) model with signature genes of PBL subtypes as a good indicator to predict overall survival (OS) for LC patients.
MATERIALS AND METHODS
See also Supplementary Methods.
Collection of specimens
We recruited 69 healthy individuals and 73 LC patients and collected four millilitres of fresh peripheral blood from each enrolled individual before clinical treatment. The clinical characteristics of all subjects are summarized in Table S1. The study was reviewed and approved by the Ethics Committee of the Cancer Institute and Hospital of the CAMS and was carried out in accordance with the World Medical Association Declaration of Helsinki Ethical Principles for Medical Research. All subjects provided written informed consent.
Generation and normalization of RNA‐sequencing data
We extracted total RNA from peripheral leucocytes. Eligible libraries were prepared from qualified samples using a NEBNext® Ultra™ RNA Library Prep Kit (New England Biolabs, Ipswich, MA, UK) and sequenced on the Illumina HiSeq 4000 platform. Paired‐end reads (150 bp) were mapped to the human reference genome (GRCh38) by Salmon. Transcript abundances were summarized at the gene level with tximport and were normalized based on transcripts‐per‐million (TPM).
Differentially expressed gene identification and pathway enrichment analysis
The Bioconductor package DESeq2 was used to detect differentially expressed genes (DEGs) with a gene read count matrix. Only genes with a Benjamini‐Hochberg false discovery rate (FDR) < 0.05 and a |log2 fold change (FC)| ≥ 0.59 were defined as DEGs.Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed by the package ClusterProfiler and visualized with the package GOplot and Cytoscape software (v3.7.1) to functionally characterize the identified DEGs., The Benjamini‐Hochberg FDR threshold in each case was set at 0.05.
Estimate the proportion of immune cells
CIBERSORT can differentiate mature human haematopoietic cells by deconvolution with an LM22 signature gene file. To estimate the proportion of 22 immune cell types for each individual, CIBERSORT analysis was performed on the gene expression data with the online tool developed by Newman et al (https://cibersort.stanford.edu/). Routine blood indicators of 67 normal subjects and 37 LC patients collected from medical records were used to verify partial results.To observe DEGs independent of immune cell composition between LC subtypes, we simplified the cell composition to the ratio of myeloid cells to lymphocytes (myeloid cells include monocytes, M0 macrophages, M1 macrophages, M2 macrophages, resting dendritic cells, activated dendritic cells, resting mast cells, activated mast cells, eosinophils and neutrophils; lymphocytes include naive B cells, memory B cells, plasma cells, CD8 T cells, naive CD4 T cells, memory resting CD4 T cells, memory activated CD4 T cells, T follicular helper cells, regulatory T cells, gamma delta T cells, resting natural killer (NK) cells and activated NK cells) and used it as a covariate in the analysis of differential gene expression.
Dimension reduction and cluster analysis
To explore PBL transcriptional profiling–based heterogeneity among samples, we performed t‐distributed stochastic neighbour embedding (t‐SNE) analysis by the package Rtsne using the first 50 principal components and 1000 iterations, combined with cluster analysis employed by the package mclust with default parameters using all genes. Principal component analysis (PCA) of all genes and unsupervised hierarchical clustering analysis of the top 3000 most variable genes using the complete method and Pearson correlation distance were carried out by R to validate the gene expression profile heterogeneity between LC subtypes.
Calculation of molecular distance
A metric presented by Pankla et al defined as molecular distance can be applied to quantify global transcriptome variation degree in patients versus normal people. This method essentially consists of implementing outlier analysis on a gene‐by‐gene basis, where the dispersion of the expression values found in the baseline samples is used to judge whether the expression value of a single case sample lies within two standard deviations of the controls' mean. Here, we calculated the molecular distance of the expression profile for each patient by taking normal samples as the baseline and compared the differences in variation degree between different groups.
Gene set enrichment analysis
To compare differences in the transcriptomes between LC subtypes, we implemented two enrichment methods for the gene expression profiles by taking blood transcription modules (BTMs) devised specifically for blood transcriptome analysis and the ‘hallmark’ MSigDB collection (v7.1) as gene sets: analyses performed with gene set enrichment analysis (GSEA) software (http://www.broad.mit.edu/gsea/) and the tmod package. Pre‐ranked gene lists consisted of genes arranged with a decreasing metric value derived from the DESeq2 analysis result. The formula used for calculating the metric is as follows:For tmod, the CERNO test was employed. Only BTMs that were commonly significantly enriched (FDR < 5%) across the two methods were retained.
Weighted gene correlation network analysis
In this study, two gene co‐expression networks consisting of 86 samples (lung cancer set 1 (LC1) vs the normal group) and 125 samples (lung cancer set 2 (LC2) vs the normal group) were established by the package WGCNA. The two adjacency matrixes were calculated with a beta of 8 or 10, and the minimum cluster size of the clustering dendrograms was 30. The association between eigengene values and clinical traits was assessed by Pearson's correlation, and key co‐expression modules related to class differences were detected according to the correlation and significance P values. Genes in each key co‐expression module were ranked by effect size as the mean expression of the LC class minus the expression of the normal class to generate pre‐ranked gene lists for downstream GSEA with BTMs as gene sets.
Construction and validation of a RS prognostic model
We downloaded a transcriptome data set of peripheral blood mononuclear cells (PBMCs) with clinical outcomes from NCBI’s Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE13255, which contains 108 LC patients. The normalized gene expression matrix transformed by log2 was applied to construct a RS prognostic model. A summary of the sample information is shown in Table S3.All samples were randomly divided equally into training and testing sets. First, we identified a gene panel selected from LC2 DEGs with P < .05 in a univariate regression analysis of the training set (n = 54). Next, we introduced least absolute shrinkage and selection operator (LASSO) Cox regression analysis to the gene panel by the glmnet package. The expected generalization error was estimated by 10‐fold cross‐validation, and the LASSO model consisting of n genes with non‐zero regression coefficients was determined. Those coefficients (c) and the corresponding gene expression values (E) were used to calculate a RS for each patient, as shown below:An optimal cut‐off value determined by performing time‐dependent receiver operating characteristic (ROC) analysis with the survival ROC package dichotomized patients into a high‐risk group or a low‐risk group. The Kaplan‐Meier (KM) method with a log‐rank test was used to perform time‐to‐event analysis to evaluate survival differences between the two groups. The testing set and stage I set were utilized to validate the prognostic ability of the predictive model.Univariate and multivariate Cox proportional hazard regression analyses were performed to assess independent prognostic factors. A novel nomogram was built based on the multivariable analyses and provided visualized risk stratification by the survival, foreign and rms packages. The discrimination of different predictors was appraised by Harrell's concordance index (C‐index).
Statistical analysis
The Mann‐Whitney U test (two‐tailed) was used to compare cell proportions among samples. Pearson's product‐moment correlation test was used to estimate correlations between CIBERSORT analysis results and blood routine indicators of subjects. Fisher's exact chi‐square test, unpaired Student's t test and a chi‐square test with Yates correction for continuity were used to calculate differences in clinical characteristics between different LC subsets. The Kruskal‐Wallis test and Mann‐Whitney U test (two‐tailed) were performed for comparison of molecular distance. All statistical analyses were performed by using R (3.6.3) or SPSS (25.0) software. P < .05 was considered statistically significant.
RESULTS
Broad blood transcriptome perturbations were evident in LC patients
To quantify sample differences in the PBL transcriptome between LC and healthy subjects, we carried out PCA across 142 subjects and found that the LC group was separated from the healthy group (Figure 1A). We identified 1368 DEGs in the LC versus healthy groups to uncover the biological alterations of the peripheral immune system in response to LC (Figure 1B, File S1). Biological processes, including the humoral immune response and regulation of lymphocyte activation, were enriched (Figure 1C). Disturbed pathways identified previously in LC histological studies, such as arachidonic acid metabolism and transcriptional misregulation in cancer, were also observed (Figure S1).,
FIGURE 1
Difference analysis and deconvolution of the PBL transcriptome between LC patients and healthy controls. A, PCA of PBL transcriptional profiles across all subjects. PC1 and PC2 are plotted on the x‐axis and y‐axis, respectively. Ellipses represent 95% CIs. B, Volcano plot of DEGs in LC versus healthy individuals. The horizontal line indicates FDR = 0.05, and the vertical line indicates |log2FC| = 0.59. Blue, black and red dots represent down‐regulated, non‐significantly differentially expressed and up‐regulated genes, respectively. C, GO enrichment analysis of DEGs (FDR < 0.05) (BP, biology process; CC, cellular component; MF, molecular function). D, Estimated proportions of immune cells in healthy and LC individuals (Mann‐Whitney U test, *P < .05; **P < .01; ***P < .001)
Difference analysis and deconvolution of the PBL transcriptome between LC patients and healthy controls. A, PCA of PBL transcriptional profiles across all subjects. PC1 and PC2 are plotted on the x‐axis and y‐axis, respectively. Ellipses represent 95% CIs. B, Volcano plot of DEGs in LC versus healthy individuals. The horizontal line indicates FDR = 0.05, and the vertical line indicates |log2FC| = 0.59. Blue, black and red dots represent down‐regulated, non‐significantly differentially expressed and up‐regulated genes, respectively. C, GO enrichment analysis of DEGs (FDR < 0.05) (BP, biology process; CC, cellular component; MF, molecular function). D, Estimated proportions of immune cells in healthy and LC individuals (Mann‐Whitney U test, *P < .05; **P < .01; ***P < .001)We next estimated the proportions of immune cells for each subject by deconvoluting the PBL RNA‐Seq data to investigate changes at the cellular level (File S2). Nine major cell components are displayed (Figure 1D). Changes in the abundance of monocytes, NK cells, T cells and B cells in the patient group were observed, which is in agreement with past research. Similar changes were found in routine blood indicators that were significantly positively correlated with deconvolution analysis results (Figure S2). In general, the PBL transcriptome of LC patients was widely disturbed and denoted broad aberrancies in innate and adaptive immunity.
PBL transcriptional profiles of LC patients are heterogeneous
We next asked whether there is consistency in peripheral transcriptome perturbations among LC patients. We performed t‐SNE and cluster analyses across all subjects based on the similarity of gene expression profiles (Figure 2A). The LC patients were distinct from normal subjects, in line with the PCA results (Figure 1A). In particular, LC samples were clustered into two subsets defined as LC1 and LC2, both of which contained three LC histological subtypes (Figure S3). We also executed PCA and unsupervised hierarchical cluster analysis of the top 3000 most variable genes to verify between‐group variance (Figure 2B,C). Two clusters were delineated in unsupervised hierarchical cluster analysis, and the vast majority of samples in LC1 and LC2 were divided into different clusters. Detailed clinical traits of the two LC subsets are given in Table S2. Apparently, there were no significant differences in characteristics between the two subsets except for stage (Fisher's exact chi‐square test, P = .004). Early‐stage patients were relatively evenly distributed in the two subsets, but more advanced patients were classified as LC2.
FIGURE 2
PBL transcriptional heterogeneity in LC. A, t‐SNE plot of RNA‐Seq data of all enrolled samples. Ellipses represent 95% CIs. B, PCA of LC samples. Ellipses represent 95% CIs. C, Unsupervised hierarchical clustering heatmap for the top 3000 most variably expressed genes among LC samples using the complete method and Pearson correlation distance. D, Molecular distance of LC subjects classified by new LC subtypes, TNM stages and histological types (LUAD, lung adenocarcinoma; SCLC, small cell lung cancer; LUSC, lung squamous cell carcinoma). The Mann‐Whitney U test P values are displayed on top
PBL transcriptional heterogeneity in LC. A, t‐SNE plot of RNA‐Seq data of all enrolled samples. Ellipses represent 95% CIs. B, PCA of LC samples. Ellipses represent 95% CIs. C, Unsupervised hierarchical clustering heatmap for the top 3000 most variably expressed genes among LC samples using the complete method and Pearson correlation distance. D, Molecular distance of LC subjects classified by new LC subtypes, TNM stages and histological types (LUAD, lung adenocarcinoma; SCLC, small cell lung cancer; LUSC, lung squamous cell carcinoma). The Mann‐Whitney U test P values are displayed on topIn addition, we calculated the molecular distance for each patient to quantify global transcriptional changes in LC over the normal baseline (Figure 2D). Notably, the molecular distance between LC1 and LC2 was significantly different (Mann‐Whitney U test, P < .0001). Although the P value did not indicate statistical significance (Kruskal‐Wallis test, P = .213), there seemed to be a slight difference between subgroups classified based on stage. For the subgroups with different histological types, their PBL expression profiles were very similar (Kruskal‐Wallis test, P = .875). These results further imply that the difference in LC histological type with different origins is not manifested in the PBL expression profiles.In short, considerable heterogeneity of PBL transcriptional profiles independent of histological type was identified in LC. We next conducted analyses aiming to interpret the heterogeneity and its significance.
Global alterations revealed the heterogeneity of the PBL transcriptome in LC patients
First, to characterize the global blood transcriptional profile, we explored BTM signatures of the LC subsets devised specifically for blood transcriptome analysis by performing GSEA and tmod analysis. Significantly up‐regulated BTMs in LC1 versus normal controls (N) across the two methods (FDR < 0.05) included NK cell signature enrichment, cell cycle and DNA repair relevant pathways. Up‐regulated BTMs in LC2 included myeloid cell signature enrichment, blood coagulation and inflammatory signalling pathways (Figure 3A and Figure S4A). We next implemented WGCNA to evaluate class differences in the co‐expression network module derived from LC‐induced data (Figure S5). Core modules related to disease states were found to be differentially enriched in the two subsets, most of which were consistent with the above enrichment analyses results (Figure 3B, File S3). Furthermore, hallmark signatures related to the cell cycle, such as E2F targets, MYC targets and the G2M checkpoint, were salient in LC1., LC2 highlighted the inflammatory response and cytokine‐mediated signalling pathway (Figure 3C and Figure S4B).
FIGURE 3
Global diverse PBL transcriptional alterations revealed by gene enrichment analyses. A, BTMs differentially enriched between classes in both GSEA and tmod (FDR < 0.05 across both methods). Only tmod results are shown (N, normal controls). See also Figure S4A. B, BTM enrichment analysis of genes in significant co‐expression network modules constructed by WGCNA based on LC versus normal controls. Key modules of LC1 network include the brown, magenta and turquoise modules. Key modules of LC2 network include the tan, magenta and turquoise modules. Only significant BTMs (FDR < 0.05) are shown. C, Hallmark gene sets from MSigDB v7.1 differentially enriched between classes in two testing methods (FDR < 0.05 across both methods). Only tmod results are shown. See also Figure S4B. D, Comparison of immune cell proportions in each LC subset. (Mann‐Whitney U test; ns, not significant, P > .05; *P < .05; **P < .01; ***P < .001)
Global diverse PBL transcriptional alterations revealed by gene enrichment analyses. A, BTMs differentially enriched between classes in both GSEA and tmod (FDR < 0.05 across both methods). Only tmod results are shown (N, normal controls). See also Figure S4A. B, BTM enrichment analysis of genes in significant co‐expression network modules constructed by WGCNA based on LC versus normal controls. Key modules of LC1 network include the brown, magenta and turquoise modules. Key modules of LC2 network include the tan, magenta and turquoise modules. Only significant BTMs (FDR < 0.05) are shown. C, Hallmark gene sets from MSigDB v7.1 differentially enriched between classes in two testing methods (FDR < 0.05 across both methods). Only tmod results are shown. See also Figure S4B. D, Comparison of immune cell proportions in each LC subset. (Mann‐Whitney U test; ns, not significant, P > .05; *P < .05; **P < .01; ***P < .001)In view of the enrichment of immune cell signatures, we also focused on different cell abundances between the new LC subtypes. Neutrophils and lymphocytes, especially NK cells and CD8 T cells, showed significant differences between LC1 and LC2 by deconvolution and routine blood analysis (Figure 3D and Figure S6). Moreover, the estimated neutrophil‐to‐lymphocyte ratio of LC2, that is an indicator of the inflammation level, was significantly higher than that of LC1 (1.09 ± 0.55 vs 0.15 ± 0.10).Overall, heterogeneity in the PBL transcriptome of LC patients involved cell cycle–related pathways, blood coagulation, inflammatory signalling pathways and PBL composition, which may play crucial roles in the LC subsets.
Peripheral antitumour immune status is distinct between the LC subsets
Given the global abnormalities related to immunity of the PBL transcriptome and its leading role in the peripheral immune response, we further assessed differences in immune status between two LC subsets based on DEGs. Two DEG panels from LC1 vs N (3015 DEGs) and LC2 vs N (1838 DEGs) were identified, with only 765 genes overlapping (Figure 4A, File S4). We focused on immunological aberrancies by matching the two DEG sets with a list of immune‐associated genes from ImmPort (https://www.immport.org) (Figure 4A). A total of 223 and 145 overlapping DEGs were enriched in different pathways, including up‐regulated NK cell–mediated cytotoxicity in LC1 and cytokine‐cytokine receptor interaction in LC2 (Figure 4B). Considering the immune cell proportion differences, we identified two DEG signatures that were independent of cell components for the two LC subsets (Figure S7A, File S4). Most of the above enriched pathways remained striking and independent of cell composition, emphasizing the critical role in the immune response to different LC subtypes (Figure S7B).
FIGURE 4
Antitumour immunological differences between the LC subsets. A, Gene overlap between immune genes and two DEG sets was identified by a Venn diagram. B, KEGG pathway enrichment analysis with two panels of immune‐related DEGs showing highly significant pathways (FDR < 0.05) in the two subgroups. Only the top five up‐regulated and down‐regulated pathways are shown in each group. The inner circle shows the log2FC values of DEGs in enriched pathways. C, Expression pattern of immunomodulators (IMs) in LC1 and LC2. Log2FC values are shown as a heatmap. White asterisks indicate genes with differential expression (FDR < 0.05). Colours of annotation bars denote different molecular functions and features
Antitumour immunological differences between the LC subsets. A, Gene overlap between immune genes and two DEG sets was identified by a Venn diagram. B, KEGG pathway enrichment analysis with two panels of immune‐related DEGs showing highly significant pathways (FDR < 0.05) in the two subgroups. Only the top five up‐regulated and down‐regulated pathways are shown in each group. The inner circle shows the log2FC values of DEGs in enriched pathways. C, Expression pattern of immunomodulators (IMs) in LC1 and LC2. Log2FC values are shown as a heatmap. White asterisks indicate genes with differential expression (FDR < 0.05). Colours of annotation bars denote different molecular functions and featuresImmunomodulators (IMs) are pivotal for modulation of the immune response level. To gain insight into the peripheral immune status, we next evaluated the expression of a list of IMs that may stimulate or inhibit the immune response to LC (Figure 4C). The results showed that the expression of IMs varied across the LC subsets. In LC1, stimulatory IMs, such as CD70, CXCL9 and NK cell activation receptor‐NKG2D (KLRK1), were up‐regulated. Comparatively, most immunosuppressive molecules, such as PD1 (PDCD1) and PDL1 (CD274), showed significant up‐regulation, while stimulatory IMs, including NKG2D, tended to be down‐regulated in LC2 cells. These findings underscored the possible peripheral immunosuppression in antitumour immunity of LC2 compared to LC1.
Gene signatures of the PBL transcriptome have the capability of predicting LC survival
Remarkable alterations induced by LC in the PBL transcriptome prompted us to determine whether relevant gene signatures have prognostic value for LC. Considering the more advanced status of the patients, the inflammatory and suppressed peripheral immune status in LC2, we speculated that LC2 may have a poorer prognosis and its signature genes have good performance in survival prediction for LC patients. Hence, we took the 1838 DEGs identified from LC2 as candidate genes to develop and validate a prognostic prediction model within a PBMC expression profile data set (n = 108). Eventually, a ten‐gene signature (HK3, SLC36A1, MSR1, CEACAM1, CEACAM6, HCG27, FXYD7, TRPLC1, NR3C2 and RLN2) predictive model was constructed with univariate regression and LASSO Cox regression analysis (Table S4 and Figure S8A,B,E). Most of the ten genes are expressed in monocytes or lymphoid cells and differentially expressed across the LC subsets in our study, suggesting possible differential outcomes between LC1 and LC2 (https://www.proteinatlas.org/) (Figure S8F).We calculated the RS for patients with expression values of the ten genes and corresponding regression coefficients and then divided the training set (n = 54) into high‐risk and low‐risk groups according to the optimum cut‐off value (cut‐off = −6.1; Figure S8C,D). Time‐dependent ROC analysis showed that the areas under the curve (AUCs) at 1, 2, 3 and 5 years were 0.897, 0.853, 0.813 and 0.911, respectively (Figure S9A). The high‐risk patients had poorer OS than patients with lower RS (log‐rank P < .0001, Figure 5A). The model showed good performance with the same cut‐off value in the testing set (n = 54, log‐rank P = .0042), even in patients with stage I disease (n = 38, P = .0062) (Figure 5B,C). The AUCs of ROC analysis at 1, 2, 3 and 5 years in the test set are shown in Figure S9B.
FIGURE 5
Predictive performance of the RS model. A, KM curves of the training set (54 samples) stratified by the RS cut‐off value. B, KM curves of the independent testing set (54 samples) stratified by the same cut‐off value. C, KM curves of the stage I subgroup (38 samples) from the test set. D, Univariate and multivariate regression analysis of the RS model and the other clinical characteristics regarding prognostic value
Predictive performance of the RS model. A, KM curves of the training set (54 samples) stratified by the RS cut‐off value. B, KM curves of the independent testing set (54 samples) stratified by the same cut‐off value. C, KM curves of the stage I subgroup (38 samples) from the test set. D, Univariate and multivariate regression analysis of the RS model and the other clinical characteristics regarding prognostic valueTo determine whether the prognostic value of the RS was independent of other clinical characteristics, we conducted univariate and multivariate Cox regression analyses with the training set. The RS remained an independent prognostic indicator with the highest median hazard ratio (HR = 7.06, 95% confidence interval (CI) = 3.45‐14.44) for LC after adjusting for race, adjuvant chemotherapy, smoking status and annual amount, age, sex, histology, COPD status and pathologic stage (Figure 5D). Thus, the robustness of the RS for independently predicting LC patient OS was confirmed.We next constructed a nomogram that integrated the RS and conventional clinical traits in survival prediction, including age and stage, to provide a quantitative method for predicting the prognosis of LC patients (Figure 6A). In the nomogram, points of variables assigned by a point scale and the sum of all variable points were used as the total points to predict the survival probability of LC patients at 1, 2, 3 and 5 years. Remarkably, prediction with the RS prognosis model showed a higher C‐index than that with age or stage, and the nomogram model had a larger C‐index than the other three predictors in the training set and testing set (Figure 6B). Altogether, the RS model performed better than conventional clinical characteristics in survival prediction and constituted a quantitative method for predicting survival of LC patients, which was better than individual predictors.
FIGURE 6
Nomogram model construction with RS, age and stage. A, Nomogram model for predicting the probability of 1‐, 2‐, 3‐ and 5‐year OS for LC patients. B, The prognostic performance was compared among the RS model, age, stage and the nomogram model by calculating the C‐index
Nomogram model construction with RS, age and stage. A, Nomogram model for predicting the probability of 1‐, 2‐, 3‐ and 5‐year OS for LC patients. B, The prognostic performance was compared among the RS model, age, stage and the nomogram model by calculating the C‐index
DISCUSSION
Lung cancer (LC) is widely recognized as a highly heterogeneous disease that greatly threatens human health. Previously, many targeted studies have been conducted to comprehend the local immune status of LC lesions, but the understanding of variations in the peripheral immune system of the LC host is still limited., , In this study, we characterized the host systemic immune response to LC and decrypted its heterogeneity. Two new LC subtypes were established based on the PBL transcriptome with different antitumour immune statuses and possible outcome differences. In addition, we constructed a RS prognostic model based on DEG signatures of the LC subtype with poor immune status to explore potential clinical application value.The viewpoint that most tumours are systemic diseases that not only affect the organs of the lesion is increasingly accepted. Our study provides a comprehensive characterization of the systemic immune context in LC, which confirmed that the immune macroenvironment of LC has indeed undergone substantial changes at the PBL transcriptome level. An up‐regulated humoral immune response, regulation of lymphocyte activation and changes in immune cell proportions indicated a widespread impact of LC on host peripheral adaptive and innate immunity. Furthermore, enrichment of arachidonic acid metabolism and transcriptional misregulation in cancer supported that cancerous features may be captured from the peripheral blood transcriptome.In this study, we revealed substantial PBL transcriptome heterogeneity independent of histological type in LC patients. In other words, although different histological types have distinct origins, there seem to be no marked differences in influence on the peripheral immune system. Interestingly, similar findings were presented in research on breast cancer, and whether this observation represents a pan‐cancer phenomenon needs additional study to verify., , Additionally, the heterogeneity seems to be related to stage in our study, raising the possibility that the PBL transcription profile can partly reflect the tumour burden of LC patients.We tried to uncover the underlying biological mechanisms and significance of the heterogeneity. For LC1, one possible explanation for the marked up‐regulation of cell cycle–related pathways is that the proliferation of lymphocytes, especially CD8 T cells, is stimulated via antigen presentation by dendritic cells (DCs), thereby producing effector cells and enhancing cell immunity against tumour cells. Important effector T cells, cytotoxic T lymphocytes (CTLs) are derived from the clonal proliferation of CD8 CTL precursor cells and kill tumour cells by releasing granules or inducing FasL‐mediated apoptosis., Analogously, NK cells perform broad‐spectrum antitumour functions employing similar cytotoxicity activities. Increased NK cell abundance and enhanced cytotoxicity in LC1 mean more NK cells were available for recruitment to exert antitumour cytotoxicity. As the only cells that lack a pro‐tumorigenic role, NK cells with a high proportion in peripheral blood or infiltration in tumour tissue often indicate a favourable outcome in cancer patients. Moreover, the high expression of stimulatory IMs may play an important role in the regulation of antitumour immunity in LC1. For example, CXCL9 mediates the trafficking of CD8 T cells, and CD70 is involved in the activation of CD8 T cells.,For LC2, we first observed activation of blood coagulation. Previous findings suggest that activation of blood clotting leads to fibrin deposition around the tumour, which can build up a provisional matrix to sustain and boost angiogenesis and induce the cellular responses of adhesion, proliferation and migration of tumour cells., On the other hand, a higher neutrophil proportion and cytokine‐cytokine receptor interaction, regarded as crucial aspects of inflammation, were prominent in LC2. A growing body of evidence shows that inflammation promotes tumorigenesis and cancer progression. As the first responders to inflammation, active neutrophils release several soluble neutrophil granule proteins that can promote tumour progression by inducing tumour cell proliferation, stimulating angiogenesis and disabling T cell–dependent antitumour immunity., Neutrophils facilitate the metastasis of circulating tumour cells, and cancer‐associated thrombosis by NETosis is considered to be an important factor in tumour progression. Cytokines are a double‐edged sword for carcinoma progression that are produced by host stromal cells and immune cells, either in response to molecules secreted by cancer cells or as part of the inflammation that frequently accompanies tumour growth. Through activating various downstream effectors, cytokines control the immune and inflammatory milieu to either favour antitumour immunity or enhance tumour progression and have direct effects on tumour cell growth and survival. For example, the Th1‐type cytokine IFN‐γ participates in the antitumour immune process, while TGF‐β and IL13 have the opposite action., , In addition, the high expression levels of inhibitory immune checkpoint factors and low expression levels of NK cell activation receptors in LC2 demonstrate restrained lymphocyte activation and depletion of NK cells., In general, the peripheral immunity of LC2 presents an inflammatory status, and antitumour immunity may be suppressed.Considering the PBL transcriptome characteristics, we measured the prognostic relevance of DEG signatures derived from LC2. In terms of the ten signatures, in LC2, four risk factors were highly expressed (HR > 1; CEACAM6, CEACAM1, HK3 and SLC36A1), and four protective factors showed low expression levels (HR < 1; NR3C2, RLN2, TRPC1 and FXYD7), suggesting their potential important roles in the macroenvironment of LC among tumorigenesis and tumour progression (Figure S8F). The risk factors may be related to immunosuppression and inflammation. For instance, CEACAMs mediate cell adherence and transcellular transcytosis, resulting in the suppression of immune cell activities, as predictors in LC.,
HK3 correlated positively with inflammatory activities and multiple immune checkpoints. SLC36A1 may be involved in cancer metabolism. As for the protective factors, they may play a role in immune activation and anti‐inflammatory. NR3C2 is considered to be a tumour suppressor and may be correlated with T cell activation. It is traditionally believed that M2 macrophages are generally associated with immunosuppression and tumour metastasis. TRPC1 and MSR1 can induce M2 to polarize into M1 macrophages., RLN2 has anti‐inflammatory properties. However, the role of FXYD7 and HCG27 in the immune response is unclear.The predictive ability of the RS model developed from PBMC data was validated even for patients with the same stage, proving the reliability of signature gene selection. The good performance of the prognostic model, especially the prognostic prediction for early‐stage LC patients, may help improve follow‐up treatment after surgical resection. More meaningfully, the RS contributed to constructing a comprehensive quantitative method for survival prediction, which performed better than existing clinical prognostic indicators.We recognize limitations in our study. Due to the limitation of insufficient clinical data, we cannot demonstrate the potential clinical value of the heterogeneity between LC subsets directly in our data set. Moreover, both the heterogeneity of the PBL transcriptome and the universality of the RS prognostic model still require further validation with multicentre and larger scale studies.In summary, to our knowledge, this study is the first to cover multiple LC histological types and stages and to target the peripheral blood transcriptome based on RNA‐Seq. We characterized the PBL transcriptional profiles and interpreted the heterogeneity of peripheral immunity in LC hosts for the first time. A robust RS prognostic model, as a non‐invasive and unbiased method to predict LC patient survival, was established and may help guide clinical treatment in the future.
CONFLICT OF INTEREST
No conflicts of interest are declared.
AUTHOR CONTRIBUTIONS
Qi Zhang: Formal analysis (lead); Methodology (lead); Resources (lead); Visualization (lead); Writing‐original draft (lead). Manchao Kuang: Data curation (lead); Methodology (equal); Validation (equal). Haiyin An: Data curation (equal). Yajing Zhang: Data curation (equal). Kai Zhang: Resources (equal). Lin Feng: Supervision (lead); Validation (equal); Writing‐review & editing (lead). Lei Zhang: Project administration (equal); Resources (equal). Shujun Cheng: Conceptualization (lead); Project administration (equal).Fig S1‐S9Click here for additional data file.Table S1‐S4Click here for additional data file.Supplementary MaterialClick here for additional data file.File S1Click here for additional data file.File S2Click here for additional data file.File S3Click here for additional data file.File S4Click here for additional data file.
Authors: Sadna Budhu; David A Schaer; Yongbiao Li; Ricardo Toledo-Crow; Katherine Panageas; Xia Yang; Hong Zhong; Alan N Houghton; Samuel C Silverstein; Taha Merghoub; Jedd D Wolchok Journal: Sci Signal Date: 2017-08-29 Impact factor: 8.192
Authors: R A D Bathgate; M L Halls; E T van der Westhuizen; G E Callander; M Kocan; R J Summers Journal: Physiol Rev Date: 2013-01 Impact factor: 37.312
Authors: Ayden Saffari; Matt Arno; Eric Nasser; Angelica Ronald; Chloe C Y Wong; Leonard C Schalkwyk; Jonathan Mill; Frank Dudbridge; Emma L Meaburn Journal: Mol Autism Date: 2019-11-07 Impact factor: 7.509