Literature DB >> 32592319

Development and validation of an immune-related prognostic signature in lung adenocarcinoma.

Sijin Sun1, Wei Guo1, Zhen Wang1, Xin Wang1, Guochao Zhang1, Hao Zhang1, Renda Li1, Yibo Gao1, Bin Qiu1, Fengwei Tan1, Yushun Gao1, Qi Xue1, Shugeng Gao1, Jie He1.   

Abstract

BACKGROUND: Lung adenocarcinomas (LUAD) is the most common histological subtype of lung cancers. Tumor immune microenvironment (TIME) is involved in tumorigeneses, progressions, and metastases. This study is aimed to develop a robust immune-related signature of LUAD.
METHODS: A total of 1774 LUAD cases sourced from public databases were included in this study. Immune scores were calculated through ESTIMATE algorithm and weighted gene co-expression network analysis (WGCNA) was applied to identify immune-related genes. Stability selections and Lasso COX regressions were implemented to construct prognostic signatures. Validations and comparisons with other immune-related signatures were conducted in independent Gene Expression Omnibus (GEO) cohorts. Abundant infiltrated immune cells and pathway enrichment analyses were carried out, respectively, through ImmuCellAI and gene set enrichment analysis (GSEA).
RESULTS: In Cancer Genome Atlas (TCGA) LUAD cohorts, immune scores of higher levels were significantly associated with better prognoses (P < .05). Yellow (n = 270) and Blue (n = 764) colored genes were selected as immune-related genes, and after univariate Cox regression analysis (P < .005), a total of 133 genes were screened out for subsequent model constructions. A four-gene signature (ARNTL2, ECT2, PPIA, and TUBA4A) named IPSLUAD was developed through stability selection and Lasso COX regression. It was suggested by multivariate and subgroup analyses that IPSLUAD was an independent prognostic factor. It was suggested by Kaplan-Meier survival analysis that eight out of nine patients in high-risk groups had significantly worse prognoses in validation data sets (P < .05). IPSLUAD outperformed other signatures in two independent cohorts.
CONCLUSIONS: A robust immune-related prognostic signature with great performances in multiple LUAD cohorts was developed in this study.
© 2020 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

Entities:  

Keywords:  biomarker; infiltrated immune cell; lung adenocarcinoma; prognostic signature; tumor immune microenvironment

Mesh:

Substances:

Year:  2020        PMID: 32592319      PMCID: PMC7433810          DOI: 10.1002/cam4.3240

Source DB:  PubMed          Journal:  Cancer Med        ISSN: 2045-7634            Impact factor:   4.452


INTRODUCTION

According to estimates of cancer incidence in GLOBOCAN 2018, lung cancers remain a leading cause of cancer‐related deaths, which are cancers with the highest death incidence worldwide. Approximately, 85% of lung cancers are NSCLCs, which can be further divided into three subtypes: large cell carcinomas, squamous cell carcinomas, and lung adenocarcinomas (LUAD). , In most countries, LUAD is the most common histological subtype, which is a result of an increasing number of nonsmokers. With rapid evolvements of precision medicines, novel therapeutic strategies, especially immunotherapies, have been proposed to improve clinical outcomes of LUAD patients. , However, only a fraction of patients was benefited from immunotherapies, leaving urgencies of finding potential biomarkers for efficient and prognostic predictions. Cytotoxic T‐lymphocyte‐associated Antigen 4 (CTLA‐4) and Programmed Death 1 (PD‐1) immune checkpoints start a new paradigm shift in the field of immunotherapies. Mechanisms of immune checkpoint inhibitors (ICIs) targeting at these molecules are to relieve certain inhibitory pathways and thus, boosting the immune system to generate antitumor activities. Accordingly, efficacies of ICIs are considered strongly associated with hosts’ immune systems and tumor immune microenvironments (TIME). Crosstalk between cancer cells and TIME was sophisticated comprising both protumorigenic and antitumorigenic manners. A T cell inflamed TIME induced by a CDK4/6 Inhibitor was shown to enhance efficacies of ICIs in vitro. Therefore, in‐depth understanding of TIME can assist identifications of novel predictive biomarkers and developments of new therapeutic strategies. With diminishing costs, high‐throughput sequencing has emerged as a commonplace technology in the field of molecular biology. Computational algorithms, including Cibersort, Timer, and ImmuCellAI, have also been developed for assessments on abundance of infiltrated immune cells based on gene expression profiles. Altogether, these techniques provided valid and economical methods for providing detailed TIME profiles. Recently, several studies have been devoted to constructing immune‐related signatures in LUAD , and. Gene selections in their studies were based on prior knowledge sourced from external databases or differentially expressed genes (DEGs). Models based on these criteria might exclude immune‐related genes that had not yet been confirmed or prognostic genes without different expressions. In addition, models containing too many genes limited feasibilities of their clinical applications. Therefore, there is an urgent need to construct a robust and simple immune‐related prognostic signature. In this study, immune scores of each case related to TCGA‐LUAD were first calculated through ESTIMATE algorithm, then, weighted gene co‐expression network analysis (WGCNA) was applied to identify immune‐related modules. After that, a three‐gene prognostic model was constructed based on stability selection and Lasso COX regression, whose effectiveness was further evaluated based on nine independent data sets and compared with other previously reported immune‐related signatures. Finally, it was suggested by infiltrated immune cells and pathway enrichment analysis that our prognostic signature was closely related to components of innate immunities. In conclusion, we constructed a robust immune‐related signature based on transcriptomics.

MATERIALS AND METHODS

Data acquisition and preprocessing

RNA‐seq mRNA expression profiles and clinical information of TCGA‐LUAD cohorts were downloaded from the Cancer Genome Atlas (TCGA) Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Microarray‐based data about expressions, which contained nine Gene Expression Omnibus (GEO) data sets, was obtained from official website of GEO (https://www.ncbi.nlm.nih.gov/geo/) via GEOquery R package. Only patients with pathologically confirmed LUAD and complete information about transcriptomics overall survivals (OS) could be considered. Finally, a total of 1774 LUAD cases (TCGA:504, GEO:1270) sourced from TCGA and GEO databases were included in our study. For TCGA‐LUAD cohorts, paired‐normal samples were first removed according to their TCGA barcodes. Then, Fragments per Kilobase Million (FPKM) values were transformed into Transcripts Per Million (TPM). Average expression value of multiple samples corresponding to the same patients was calculated for further analyses. For GEO data sets, probe IDs were converted to gene symbols according to platform annotation files. Normalized expression values were logarithmically transformed and scaled before being used in model validations. Average value of genes with multiple probes was used as their expression value.

Evaluating tumor microenvironment in TCGA‐LUAD cohort

ESTIMATE is a computerized algorithm which can be used to infer infiltration levels of stromal and immune cells in tumor tissues based on expression profiles. Stromal, immune, and ESTIMATE scores of each sample in TCGA‐LUAD cohorts were downloaded from the official website (https://bioinformatics.mdanderson.org/estimate/). Survival and association analyses were performed between tumor stages and these scores.

Construction of weighted gene co‐expression networks and identification of immune‐related modules

WGCNA is a systematic biological method developed by Langfelder et al It can be used to cluster highly correlated genes into modules and relative modules to phenotypes of interest. The WGCNA package of R software (http://www.rproject.org/) was used for network developments and visualizations. In network construction processes, soft thresholding power β was chosen as the lowest power with which fit index of scale‐free topology reached 0.90. The minimum module size was set as 30. After clustering, modules were displayed together through a dendrogram with colored assignments. To identify immune‐related modules, a module‐trait relationship heatmap was drawn with correlated coefficients and P‐value. Two modules, which were positively and negatively related to immune scores, respectively, were selected after comprehensive considerations on both module size (n > 100) and significances of the association.

Development and validation of the immune‐related prognostic signature for lung adenocarcinomas (IPSLUAD)

A prognostic signature was developed in the following steps. First, expression values of 1057 genes were extracted from two selected modules of TCGA‐LUAD cohort. Univariate Cox proportional hazard regressions were performed on these genes to identify factors related to prognoses (P < .05). Then, stability selection method was implemented to further narrow the scope through R package c060. The logic of stability selection is to induce additional randomizations and find out which features are consistently important in every subsampling step. Finally, a Lasso‐penalized Cox model was applied to construct a prognostic model based on genes selected in previous steps through R package glmnet. A 10‐fold cross validation was performed to determine the optimal value of Lasso penalty parameter. Coefficients of each gene in the model were determined by Lasso‐penalized Cox model. The formula is: where β is coefficient of Gene k, and G is the normalized expression value of Gene k. Survival analyses were performed in TCGA‐LUAD cohort using Kaplan‐Meier estimator and multivariate Cox regression. Cutoff value was set as median of IPSLUAD score. To further validate prognostic value of IPSLUAD, Kaplan‐Meier survival analysis and Cox regression were performed in nine independent GEO LUAD data sets (GSE3141, GSE13213, GSE14814, GSE29016, GSE30219, GSE31210, GSE37745, GSE50081, and GSE68465) , , , , , , , and, where cutoff value was set as median of IPSLUAD score. Gene expression values were normalized and scaled before validations. Area under the curve (AUC) of each data set was calculated for detailed evaluations.

Estimating immune cell infiltrations between high‐risk and low‐risk groups stratified by IPSLUAD

Infiltrated immune cells, particularly T cells, play indispensable roles in tumor immunotherapies. To compare infiltrated immune cells in samples with different IPSLUAD scores, ImmuCellAI was adopted to calculate the abundance of 24 immune cell types including 18 T‐cell subsets. For cells with a median abundance that was higher than 0.1 and significant differences among groups (P < .05), association analyses between markers of these cells and IPSLUAD scores were performed to validate those differences. Cell markers were chosen according to CellMarker database.

Functional enrichment analysis of IPSLUAD using gene set enrichment analysis (GSEA)

To further understand biologic functions of IPSLUAD, GSEA was performed on TCGA‐LUAD cohort based on GO Biological Process Ontology and KEGG gene sets. False discovery rate (FDR) was introduced as a control of Type I errors, and FDRs lower than 0.05 were considered significant. All the analyses were implemented through GSEA software (version 4.0.2).

Comparison between IPSLUAD and other existing immune‐related signatures

In order to further evaluate prediction accuracies of IPSLUAD, we compared IPSLUAD with other immune‐related signatures ranging from three to thirty genes (Table S3). Two large independent cohorts (n > 200, GSE31210, GSE68465) were employed for comparisons. , Risk scores of each signature were calculated based on normalized expression values and coefficients provided by original articles. Two‐year and five‐year AUCs were calculated and compared based on two data sets.

Statistical analysis

Continuous variables were summarized through mean and standard deviations and compared through Wilcoxon test. Categorized variables were presented by frequency (n) and proportion (%), and then compared through ANOVA. Both Cox proportional hazard model and Log‐rank test were applied to survival analyses, all of which were performed through R software (Version 3.6.3, The R Foundation for Statistical Computing). P values were two‐side and were considered to be statistically significant if they were lower than .05.

RESULTS

Association between ESTIMATE‐calculated scores and clinicopathological indicators

The whole processes of signature constructions and data analyses were shown in Figure 1. After excluding 13 cases with incomplete stages or survival information, 504 cases in TCGA‐LUAD cohort were eligible for analyses. In each sample, stromal, immune and ESTIMATE scores were obtained, which, respectively, ranged from −1779.3 to 2106.9, −932.6 to 3237.6, and −2338.0 to 4907.6. In difference comparisons, immune and ESTIMATE scores were significantly associated with pathologic stages (Figure 2A‐C, P < .05). The lowest stromal, immune, and ESTIMATE scores were all found in cases of Stage IV, which was the most progressive. In addition, cases with higher stromal, immune or ESTIMATE scores tended to have a better prognosis than those with scores of lower levels (Figure 2D‐F).
Figure 1

Flowchart of developments and validations of IPSLUAD

Figure 2

Associations between immune/stromal/ESTIMATE scores and clinicopathological indicators. A‐C, Differences among stromal/immune/ESTIMATE scores in different pathologic stages. Asterisk (*) indicated a significant difference between two groups (P < .05). D‐F, Survival analyses of stromal/immune/ESTIMATE scores through Kaplan‐Meier curve with log‐rank test

Flowchart of developments and validations of IPSLUAD Associations between immune/stromal/ESTIMATE scores and clinicopathological indicators. A‐C, Differences among stromal/immune/ESTIMATE scores in different pathologic stages. Asterisk (*) indicated a significant difference between two groups (P < .05). D‐F, Survival analyses of stromal/immune/ESTIMATE scores through Kaplan‐Meier curve with log‐rank test

Construction of co‐expression modules and identification of modules associated with immune scores

Network constructions and module detections were followed by a step‐by‐step approach. First, soft threshold power 5 was chosen to calculate adjacencies since it was the lowest power with which fit index of scale‐free topologies reached 0.90 (Figure 3A and B). Then, module identifications were performed through a dynamic tree cut with a deepSplit parameter set as 2. After merging similar modules, a total of 23 modules were identified and clustering dendrograms were presented (Figure 3C). A heatmap was drawn to shown correlated modules, in which red color represented positive correlations and blue color represented negative ones (Figure S2A). To identify immune‐related modules, module‐trait association plots were presented (Figure S1). Results demonstrated that the top three modules with the greatest positive associations were colored with Yellow, Blue, and Tan, while the top three ones with the greatest negative associations were colored with Black, Purple, and Blue. Among these modules, genes colored with Blue (n = 764) and Yellow (n = 270) were selected in module constructions based on module sizes (n > 200) and association significances. Finally, association analyses of Gene Significances (GS) and module membership (MM) were performed on Blue and Yellow modules. GSs of all the above three scores were significantly (P < .05) associated with MM in these two modules (Figure S2B‐G).
Figure 3

Network constructions and module detections of LUAD. A and B, Analyses of network topologies for various soft‐thresholding powers through scale‐free fit index (A) and mean connectivity (B). C, Clustering dendrogram of genes based on topological overlapping. Different colors were assigned to corresponding modules. A total of 23 modules were identified

Network constructions and module detections of LUAD. A and B, Analyses of network topologies for various soft‐thresholding powers through scale‐free fit index (A) and mean connectivity (B). C, Clustering dendrogram of genes based on topological overlapping. Different colors were assigned to corresponding modules. A total of 23 modules were identified

Construction of IPSLUAD in TCGA‐LUAD cohort

A total of 1034 genes were selected through previous steps (Table S1). To explore genes significantly associated with OS of patients with LUAD, univariate Cox regression analysis was performed and genes with P values of lower than .005 were chosen. 133 genes were screened out and selected for constructions of subsequent models (Table S2). Stability selection was first implemented to identify important features, and estimated sets of stable features comprised four genes, namely Aryl Hydrocarbon Receptor Nuclear Translocator Like 2 (ARNTL2), Epithelial Cell Transforming 2 (ECT2), Peptidylprolyl Isomerase A (PPIA), and Tubulin Alpha 4a (TUBA4A) (Figure S3A). Then, the optimal value of Lasso penalty parameter λ was determined as 0.0048 through 10‐fold cross validation (Figure S3B‐C). λ was then substituted into the model to generate coefficients of each gene. The final IPSLUAD was calculated as follows: (0.2319 × EXPARNTL2) + (0.1595 × EXPECT2) + (0.1611 × EXPPPIA) + (0.1286 × EXPTUBA4A). IPSLUAD of each patient in TCGA‐LUAD was calculated, and high‐/low‐risk groups were divided with median IPSLUAD as cutoff. Distributions of risk scores, survival statuses, and four‐gene expression profiles were shown in Figure 4A. It was demonstrated by association analyses that factors including genders, T Stage, N Stage, pathological stages, and survival statuses were significantly correlated (P < .05) with different IPSLUAD groups (Table 1). Patients in high‐risk group had significantly worse prognoses than those in low‐risk group (Log‐rank P < .05) (Figure 4B). In subgroup analyses, a significantly shorter OS was shown in high‐risk group than low‐risk group in Stage I and III (Log‐rank P < .05) (Figure 4C and E). A similar trend was found among patients in Stage II, although there were no statistical significances (Log‐rank P = .050) (Figure 4D). Univariate Cox regressions showed that T Stage, lymph node involvements, distant metastases, pathological stage, and IPSLUAD were significantly associated with OS (Table 2). Further, multivariate Cox regressions demonstrated that T Stage, lymph node involvements, and IPSLUAD were independent prognostic factors (Table 2).
Figure 4

Prediction performances of IPSLUAD in TCGA‐LUAD cohort. A, Distributions of risk scores (top), survival statuses of patients in low‐risk and high‐risk groups (middle), and four‐gene expression profiles of each patient (bottom). B‐E, Kaplan‐Meier curves of OS between low‐risk and high‐risk groups based on whole‐TCGA cohort (B), Stage I (C), Stage II (D), and Stage III subgroup (E)

Table 1

Correlations between IPSLUAD and clinicopathological parameters of 465 patients in TCGA‐LUAD cohort

CategoryCases, 465 (100%)IPSLUAD P value
Low (n = 232)High (n = 233)
Age, mean (SD)66.1 (9.5)64.3 (10.5)0.060
Gender
Male211 (45.4%)90 (38.8%)121 (51.9%) 0.005
Female254 (54.6%)142 (61.2%)112 (48.1%)
Smoking
Ever400 (86.0%)199(85.8%)201 (86.3%)0.985
Never65 (14.0%)33 (14.2%)32 (13.7%)
T stage
T1161 (34.6%)98 (42.2%)63 (27.0%) 0.006
T2245 (52.7%)110 (47.4%)135 (57.9%)
T344 (9.4%)17 (7.3%)27 (11.6%)
T415 (3.2%)7 (3.0%)8 (3.4%)
N stage
N0314 (67.5%)175 (75.4%)139 (59.7%) 0.004
N184 (18.1%)33 (14.2%)51 (21.9%)
N265 (14.0%)23 (9.9%)42 (18.0%)
N32 (0.4%)1 (0.4%)1 (0.4%)
M stage
M0444 (95.5%)225 (97.0%)219 (94.0%)0.184
M121 (4.5%)7 (3.0%)14 (6.0%)
TNM stage
I257 (55.3%)146 (62.9%)111 (47.6%) 0.005
II112 (24.1%)51 (22.0%)61 (26.2%)
III75 (16.1%)28 (12.1%)47 (20.2%)
IV21 (4.5%)7 (3.0%)14 (6.0%)
Status
Survival300 (64.5%)172128 <0.001
Death165 (35.5%)60105

Significant values of P < 0.05 is indicated in bold.

Table 2

Univariate analysis and multivariate analysis of risk factors for prognosis in TCGA‐LUAD cohort

Univariate analysisMultivariate analysis
P valueHR95% CI P valueHR95% CI
Age
≤66 years0.0911.3020.959‐1.769
>66 years
Gender
Female0.3531.1560.851‐1.569
Male
Smoking
Never0.9850.9960.640‐1.549
Ever
T stage
I‐II <0.001 2.3131.557‐3.436 0.003 1.9201.246‐2.959
III‐IV
Lymph node metastasis
Negative <0.001 2.5071.843‐3.411 <0.001 2.0881.429‐3.050
Positive
Distant metastasis
No 0.018 1.1881.030‐1.3690.2831.0910.931‐1.278
Yes
TNM stage
I‐II <0.001 2.3431.687‐3.2540.8251.0540.660‐1.683
III‐IV
IPSLUAD
Low‐risk <0.001 2.1781.582‐2.999 <0.001 1.8751.351‐2.602
High‐risk

Significant values of P < 0.05 is indicated in bold.

Prediction performances of IPSLUAD in TCGA‐LUAD cohort. A, Distributions of risk scores (top), survival statuses of patients in low‐risk and high‐risk groups (middle), and four‐gene expression profiles of each patient (bottom). B‐E, Kaplan‐Meier curves of OS between low‐risk and high‐risk groups based on whole‐TCGA cohort (B), Stage I (C), Stage II (D), and Stage III subgroup (E) Correlations between IPSLUAD and clinicopathological parameters of 465 patients in TCGA‐LUAD cohort Significant values of P < 0.05 is indicated in bold. Univariate analysis and multivariate analysis of risk factors for prognosis in TCGA‐LUAD cohort Significant values of P < 0.05 is indicated in bold.

Validation of IPSLUAD in nine GEO data sets

Summaries of the nine GEO data sets were shown in Table 3. In each data set, patients were stratified into high‐/low‐risk groups according to the median of IPSLUAD. It was suggested by Kaplan‐Meier survival analyses that patients in high‐risk group had significantly worse prognoses in eight out of nine data sets (88.9%) (Figure 5A‐D, I‐K, Q). The same trend was also observed in IPSLUAD of one remaining data set, though it was not statistically significant (Figure 5L). AUC was also calculated to evaluate abilities of IPSLUAD, which was ranged from 0.617 to 0.753 in the nine data sets (Figure 5E‐H, M‐P, R). Importantly, there were five data sets (GSE3141, GSE13213, GSE29016, GSE31210, and GSE68465) with AUC values that were higher than 0.7.
Table 3

Characteristics of GEO data sets used in this study

AuthorGEO AccessionYear of publicationPlatformNumber of patientsEvents
Nevins et al GSE3141 2006GPL570 [HG‐U133_Plus_2]5832
Tomida et al GSE13213 2009GPL648011749
Tsao et al GSE14814 2010GPL96 [HG‐U133A]7135
Staaf et al GSE29016 2012GPL69473828
Rousseaux et al GSE30219 2013GPL570 [HG‐U133_Plus_2]8545
Okayama et al GSE31210 2011GPL570 [HG‐U133_Plus_2]22635
Botling et al GSE37745 2012GPL570 [HG‐U133_Plus_2]10677
Pintilie et al GSE50081 2013GPL570 [HG‐U133_Plus_2]12751
Heiskanen et al GSE68465 2015GPL96 [HG‐U133A]442236
Figure 5

Prediction performances of IPSLUAD in validating data sets. Kaplan‐Meier survival curves of overall survivals in (A) GSE3141, (B) GSE13213, (C) GSE14814, (D) GSE29016, (I) GSE30219, (J) GSE31210, (K) GSE37745, (L) GSE50081, (Q) GSE68465. Receiver operating curve (ROC) analysis of IPSLUAD in (E) GSE3141, (F) GSE13213, (G) GSE14814, (H) GSE29016, (M) GSE30219, (N) GSE31210, (O) GSE37745, (P) GSE50081, (R) GSE68465

Characteristics of GEO data sets used in this study Prediction performances of IPSLUAD in validating data sets. Kaplan‐Meier survival curves of overall survivals in (A) GSE3141, (B) GSE13213, (C) GSE14814, (D) GSE29016, (I) GSE30219, (J) GSE31210, (K) GSE37745, (L) GSE50081, (Q) GSE68465. Receiver operating curve (ROC) analysis of IPSLUAD in (E) GSE3141, (F) GSE13213, (G) GSE14814, (H) GSE29016, (M) GSE30219, (N) GSE31210, (O) GSE37745, (P) GSE50081, (R) GSE68465

Comparisons between IPSLUAD and other existing immune‐related prognostic signatures

Characteristics of other immune‐related signatures used in this study were summarized in Table 4. Two largest GEO cohorts (GSE31210, GSE68465) with sample sizes higher than 200 were used in model comparisons. GSE31210 was an early‐stage cohort: all patients were in Stage I or II, while GSE68465 was a multi‐site cohort comprising both early‐stage patients and those in more advanced stages. For all comparisons in these two cohorts, IPSLUAD outperformed other signatures with superior predictive performances according to AUC values (0.75 vs 0.64‐0.73; 0.70 vs 0.49‐0.66; 0.72 vs 0.40‐0.51; 0.66 vs 0.43‐0.52) (Figure 6A‐D).
Table 4

Summary of existing immune‐related prognostic signatures

StudyNumber of genesGene list
Li et al5SLCO4C1, ELAC1, HLF, ZNF204P, ST3GAL5
Yue et al3ADAM12, BTK, ERG
Song et al30PSMC6, LIFR, PIK3CG, CTF1, RELA, MAP3K8, HLA‐DOB, LGR4, RXRB, CD79A, ADIPOR2, CCL20, PTPN6, HSPA4, GPI, ADM, IL22RA1, ANGPTL4, XCR1, AP3B1,RFXAP, HSPA2, IL23R, PDGFB, DKK1, PAK4, PSMD2,VEGFC, SHC1, HGF
Figure 6

Comparisons of IPSLUAD with other published immune‐related models in GSE31210 and GSE68465 data sets. A, Two‐year ROC in GSE31210. B, Five‐year ROC in GSE31210. C, Two‐year ROC in GSE68465. D, Five‐year ROC in GSE68‐465

Summary of existing immune‐related prognostic signatures Comparisons of IPSLUAD with other published immune‐related models in GSE31210 and GSE68465 data sets. A, Two‐year ROC in GSE31210. B, Five‐year ROC in GSE31210. C, Two‐year ROC in GSE68465. D, Five‐year ROC in GSE68‐465

Estimating the composition of infiltrated immune cells

To further characterize immune microenvironment of tumors, ImmuCellAI was employed to estimate the abundance of 24 types of immune cells in TCGA‐LUAD cohort based on RNA‐Seq data. Abundant immune cell populations with various kinds in each sample were shown in Figure 7A. Several cell types, namely CD4 naïve T cells, exhausted T cells (Tex), Type 1 regulatory T cells (Tr1), natural regulatory T (nTreg), Th1, Th2, Th17, Tfh, effector memory T (Tem) cells, natural killer T (NKT) cells, mucosal‐associated invariant T cells (MAIT), dendritic cells (DC), B cells, monocytes, macrophages, natural killer cells (NK), Neutrophil, Gamma delta T cells (Tgd), CD4 T cells, and CD8 T cells, were significantly different (P < .05) in high‐risk and low‐risk groups (Figure 7B).
Figure 7

Compositions of infiltrated immune cells between low‐risk and high‐risk groups in TCGA‐LUAD cohort through ImmuCellAI. A, Abundance of 24 immune cell types in TCGA. B, Comparisons between immune cells in low‐risk and high‐risk groups in TCGA

Compositions of infiltrated immune cells between low‐risk and high‐risk groups in TCGA‐LUAD cohort through ImmuCellAI. A, Abundance of 24 immune cell types in TCGA. B, Comparisons between immune cells in low‐risk and high‐risk groups in TCGA

Associations between IPSLUAD and immune cell markers

Different abundances of infiltrated immune cells between high‐risk and low‐risk groups were further validated through correlation analyses on gene markers and IPSLUAD. Markers significantly correlated with IPSLUAD included CD4 for CD4 T cells, CD66b and CD11b for neutrophil, CD56, CD16, and CD94 for NK cells, CD206 for M2 macrophages, CD14 for monocytes, CD19 and CD79A for B cells, CD11c and HLA‐DRA for dendritic cells, BCL6 and CD185 for Tfh, CD196, RORC, STAT3 for Th17, STAT1 for Th1 and STAT5B and HELIOS for Tregs (Table 5). It was suggested by these findings that IPSLUAD was strongly linked with innate immunities since it was closely related to several critical innate immunity‐related components (neutrophil, NK cells, and dendritic cells).
Table 5

Correlation analysis between IPSLUAD and immune cell markers

MarkerIPSLUAD P value
Low, median (IQR)High, median (IQR)
CD8+ T cell
CD8A2.39 (1.87, 2.95)2.55 (1.82, 3.18)0.148
CD8B1.56 (1.08, 2.17)1.68 (1.13, 2.35)0.136
CD4+ T cell
CD44.36 (3.80, 4.71)4.19 (3.70, 4.60) 0.035
Neutrophil
CD66b0.12 (0.04, 0.44)0.06 (0.02, 0.15) <0.001
CD11b2.79 (2.26, 3.27)2.72 (2.06, 3.22) 0.315
Natural killer cell
CD560.42 (0.23, 0.71)0.30 (0.16, 0.61) 0.001
CD164.36 (3.78, 4.94)4.61 (4.03, 5.16) <0.001
CD940.41 (0.26, 0.68)0.50 (0.30, 0.79) 0.014
M1 Macrophage
CD862.81 (2.27, 3.18)2.87 (2.41, 3.31)0.139
CD801.04 (0.67, 1.35)1.04 (0.72, 1.40)0.533
iNOS0.57 (0.34, 0.90)0.65 (0.37, 0.97)0.120
M2 Macrophage
CD1633.65 (3.07, 4.15)3.75 (3.03, 4.39)0.178
CD2063.79 (2.92, 4.32)3.49 (2.64, 4.17) 0.033
Monocyte
CD144.86 (4.40, 5.26)5.00 (4.46, 5.48) 0.043
CD331.31 (0.92, 1.72)1.24 (0.83, 1.62)0.156
CD172a3.49 (3.06, 3.84)3.56 (3.08, 3.94)0.107
B cell
CD191.65 (0.94, 2.40)1.25 (0.78, 1.76) <0.001
CD79A4.03 (3.02, 4.78)3.40 (2.76, 4.18) <0.001
Dendritic cell
CD11c2.90 (2.50, 3.38)2.77 (2.39, 3.28) 0.035
CD2051.77 (1.39, 2.13)1.82 (1.38, 2.27)0.160
HLA‐DRA8.17 (7.62, 8.68)8.00 (7.32, 8.46) 0.002
Tfh
BCL63.71 (3.46, 4.00)3.58 (3.34, 3.90) 0.001
CD1850.16 (0.06, 0.35)0.10 (0.04, 0.20) <0.001
CD2781.27 (0.85, 1.69)1.31 (0.90, 1.82)0.417
Th17
CD1960.23 (0.10, 0.41)0.13 (0.05, 0.28) <0.001
RORC3.39 (2.85, 3.86)3.20 (2.52, 3.59) <0.001
STAT34.82 (4.57, 5.03)4.76 (4.46, 4.98) 0.033
Th1
CD1832.26 (1.70,2.79)2.15 (1.60, 2.73)0.472
T‐bet1.00 (0.73, 1.48)1.00 (0.64, 1.56)0.410
STAT14.89 (4.53, 5.30)5.35 (4.86, 5.75) <0.001
STAT41.80 (1.51, 2.18)1.75 (1.38, 2.25)0.331
Treg
FOXP31.96 (1.47, 2.38)1.98 (1.57, 2.40)0.572
STAT5B3.46 (3.27, 3.68)3.39 (3.13, 3.60) 0.002
TGFβ4.32 (3.99, 4.60)4.26 (3.77, 4.66)0.619
CD3043.77 (3.43, 4.18)3.85 (3.35, 4.24)0.612
HELIOS1.71 (1.37, 1.96)1.79 (1.44, 2.06) 0.038

Significant values of P < 0.05 is indicated in bold.

Correlation analysis between IPSLUAD and immune cell markers Significant values of P < 0.05 is indicated in bold.

Pathway enrichment analysis of IPSLUAD

To further investigate molecular functions of IPSLUAD in immune systems, a GSEA was performed based on GO Biological Process Ontology and KEGG gene sets. Two significant immune‐related pathways were identified, namely innate immune response activating cell surface receptor signaling and interleukin 1 mediated signaling pathway (Figure 8A‐D). Again, close associations between genes of IPSLUAD and innate immunities were validated through this result.
Figure 8

Different immune statuses between low‐risk and high‐risk groups in TCGA‐LUAD cohort. A and B, Significant enrichments of immune‐related pathways among high‐risk patients were indicated through gene set enrichment analysis (GSEA). C and D, Gene sets between low‐risk and high‐risk groups were analyzed through expression profiles of the two enrichments

Different immune statuses between low‐risk and high‐risk groups in TCGA‐LUAD cohort. A and B, Significant enrichments of immune‐related pathways among high‐risk patients were indicated through gene set enrichment analysis (GSEA). C and D, Gene sets between low‐risk and high‐risk groups were analyzed through expression profiles of the two enrichments

DISCUSSION

With rapid developments of molecular biology and high‐throughput sequencing, contributions of tumor microenvironment to cancer developments, progressions, and metastases have been increasingly acknowledged in recent years. , There is a complex dynamic crosstalk between tumor and non‐cancer cells including infiltrated immune cells and adjacent stroma cells. However, tumors are considered as autonomous subjects by TNM staging system, which ignores significant effects of TIME. In the present study, a novel 4‐gene immune‐related prognostic signature was developed to assist TNM staging for more accurate predictions. Several immune‐related signatures for LUAD have been used to divide patients into different prognostic subgroups in previous researches. , , A list of genes for model constructions was extracted from public immune databases or through differential analyses. In contrast, we applied a different selection method based on co‐expression networks, which might identify immune‐related genes that had not been reported. In addition, by means of deliberately adding noises during model constructions, IPSLUAD was found to be robust with respect to OS predictions in various independent data sets. Both multivariate and subgroup analyses stratified through TNM staging suggested that IPSLUAD be used as an independent prognostic factor. Furthermore, IPSLUAD outperformed other immune‐related signatures in both early‐stage LUAD and multicenter cohorts. Therefore, IPSLUAD could not only significantly supplement traditional staging systems, but also be used as an accurate OS predictor for patients with LUAD. Among the four genes in IPSLUAD, ECT2, and PPIA were widely studied in previous researches. ECT2 was recognized as an oncogenic gene necessary for KrasTrp53 lung tumorigeneses in vivo. Liu et al also reported on collaborations of ECT2 with growths of lung squamous cell carcinomas promoted by PRKCI. It was suggested by Meta‐Analysis that ECT2 was a promising prognostic factor in cancers. PPIA is a housekeeping gene involved in several cancers including NSCLC, pancreatic adenocarcinomas as well as head, and neck squamous cell carcinomas. , , PPIA was also reported to participate in tumor proliferations and invasions. However, relationships between immune cells and these genes have not been illustrated. Although little is known about roles of ARNTL2 and TUBA4A in LUAD, ARNTL2 was reported to be associated with tumor progressions and metastases in colorectal and breast cancers. , TUBA4A was found highly expressed in exosomes secreted by NSCLC cell lines in vitro. Underlying mechanisms of these two genes in LUAD still need further cellular‐level explorations. Infiltrated immune cells constitute important parts of tumors and have been widely studied in recent years. Certain immune cell types were found to be indicative of responses to immunotherapies and OSs. Based on quantifications and locations of demonstrated T cells, the immunoscore developed by Galon et al showed superior predictive accuracies in prognoses compared to that of traditional TNM staging in colorectal carcinomas. High‐level expressions of PD1 or PD1 and CTLA4 on infiltrated CD8+ T cells were reported to be predictive of responses to ICIs in NSCLC. In addition, dysfunctional T cells were also considered as predictive markers of immunotherapeutic responses in melanomas. Since T cells were at the central position of antitumor immunity, a novel algorithm named ImmuCellAI was implemented to evaluate the abundance of infiltrated immune cells in TCGA cohort. Compositions of 18 T cell subsets can be calculated by ImmuCellAI, which is of superior accuracies than other frequently‐used methods including Cibersort and Timer. , , In addition to differences in multiple T cell subsets among IPSLUAD subgroups, one more intriguing finding was the significantly different pattern of innate immune systems. In respect of innate immunity, some types of cells including DCs and macrophages are context‐specific that may exert either protumorigenic or antitumorigenic functions. The most popular “don't eat me” signals including CD47 and CD24 have been extensively studied, and drugs targeted at these molecules are currently under prudent trials. , High expressions of CD47 were also found to be associated with worse prognoses in various types of tumors. Through previous researches, innate immunity might play a more important role in the field of antitumor therapies. However, clinical significances of different innate immune statuses observed among IPSLUAD subgroups need further systematic explorations. There were several limitations that should be stated in the present study. First, missing data and selection biases were inevitable as this was a retrospective study. Second, values of gene expressions extracted from RNA‐seqs or microarrays were all relative. Thus, absolute thresholds for stratifications among different cohorts could not be calculated. Since median cutoff values were used in each data set, accurate external validations would be needed in the future. Third, due to lack of data about immunotherapies, we were unable to investigate relationships between IPSLUADs and responses of ICIs.

CONCLUSIONS

In conclusion, a robust immune‐related prognostic signature was developed and validated in nine independent data sets. Different innate immune statuses were observed between low‐risk and high‐risk groups. This signature may serve as a promising prognostic biomarker for LUADs in the future.

CONFLICT OF INTEREST

The authors declare that they have no competing interest.

AUTHORS' CONTRIBUTIONS

YG, SG, and JH conceived and designed the study. ZW, XW, GZ, HZ, and RL collected the data. SS performed data analysis. SS and WG wrote the paper. BQ, FT, YG, and QX reviewed and edited the manuscript. All authors read and approved the manuscript. Fig S1 Click here for additional data file. Fig S2 Click here for additional data file. Fig S3 Click here for additional data file. Table S1 Click here for additional data file. Table S2 Click here for additional data file. Table S3 Click here for additional data file.
  50 in total

1.  Releasing the Brakes on Cancer Immunotherapy.

Authors:  Antoni Ribas
Journal:  N Engl J Med       Date:  2015-09-08       Impact factor: 91.245

2.  Ectopic activation of germline and placental genes identifies aggressive metastasis-prone lung cancers.

Authors:  Sophie Rousseaux; Alexandra Debernardi; Baptiste Jacquiau; Anne-Laure Vitte; Aurélien Vesin; Hélène Nagy-Mignotte; Denis Moro-Sibilot; Pierre-Yves Brichon; Sylvie Lantuejoul; Pierre Hainaut; Julien Laffaire; Aurélien de Reyniès; David G Beer; Jean-François Timsit; Christian Brambilla; Elisabeth Brambilla; Saadi Khochbin
Journal:  Sci Transl Med       Date:  2013-05-22       Impact factor: 17.956

3.  ARNTL2 and SERPINE1: potential biomarkers for tumor aggressiveness in colorectal cancer.

Authors:  Gianluigi Mazzoccoli; Valerio Pazienza; Anna Panza; Maria Rosa Valvano; Giorgia Benegiamo; Manlio Vinciguerra; Angelo Andriulli; Ada Piepoli
Journal:  J Cancer Res Clin Oncol       Date:  2011-12-24       Impact factor: 4.553

4.  TIMER: A Web Server for Comprehensive Analysis of Tumor-Infiltrating Immune Cells.

Authors:  Taiwen Li; Jingyu Fan; Binbin Wang; Nicole Traugh; Qianming Chen; Jun S Liu; Bo Li; X Shirley Liu
Journal:  Cancer Res       Date:  2017-11-01       Impact factor: 12.701

5.  Proteomics identification of cyclophilin a as a potential prognostic factor and therapeutic target in endometrial carcinoma.

Authors:  Zhengyu Li; Xia Zhao; Shujun Bai; Zhi Wang; Lijuan Chen; Yuquan Wei; Canhua Huang
Journal:  Mol Cell Proteomics       Date:  2008-04-17       Impact factor: 5.911

Review 6.  Implications of the tumor immune microenvironment for staging and therapeutics.

Authors:  Janis M Taube; Jérôme Galon; Lynette M Sholl; Scott J Rodig; Tricia R Cottrell; Nicolas A Giraldo; Alexander S Baras; Sanjay S Patel; Robert A Anders; David L Rimm; Ashley Cimino-Mathews
Journal:  Mod Pathol       Date:  2017-12-01       Impact factor: 7.842

7.  Validation of a histology-independent prognostic gene signature for early-stage, non-small-cell lung cancer including stage IA patients.

Authors:  Sandy D Der; Jenna Sykes; Melania Pintilie; Chang-Qi Zhu; Dan Strumpf; Ni Liu; Igor Jurisica; Frances A Shepherd; Ming-Sound Tsao
Journal:  J Thorac Oncol       Date:  2014-01       Impact factor: 15.609

8.  Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.

Authors:  Freddie Bray; Jacques Ferlay; Isabelle Soerjomataram; Rebecca L Siegel; Lindsey A Torre; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2018-09-12       Impact factor: 508.702

Review 9.  Immune crosstalk in cancer progression and metastatic spread: a complex conversation.

Authors:  Hannah Garner; Karin E de Visser
Journal:  Nat Rev Immunol       Date:  2020-02-05       Impact factor: 53.106

10.  Development and validation of an immune-related prognostic signature in lung adenocarcinoma.

Authors:  Sijin Sun; Wei Guo; Zhen Wang; Xin Wang; Guochao Zhang; Hao Zhang; Renda Li; Yibo Gao; Bin Qiu; Fengwei Tan; Yushun Gao; Qi Xue; Shugeng Gao; Jie He
Journal:  Cancer Med       Date:  2020-06-26       Impact factor: 4.452

View more
  35 in total

1.  Identified lung adenocarcinoma metabolic phenotypes and their association with tumor immune microenvironment.

Authors:  Xian-Ning Wu; Dan Su; Yi-De Mei; Mei-Qing Xu; Hao Zhang; Ze-Yu Wang; Li-Ling Li; Li Peng; Jun-Yi Jiang; Jia-Yi Yang; Dong-Jie Li; Hui Cao; Zhi-Wei Xia; Wen-Jing Zeng; Quan Cheng; Nan Zhang
Journal:  Cancer Immunol Immunother       Date:  2021-03-03       Impact factor: 6.968

2.  A Novel Prognostic and Predictive Signature for Lung Adenocarcinoma Derived from Combined Hypoxia and Infiltrating Immune Cell-Related Genes in TCGA Patients.

Authors:  Xiaofeng Wu; Jing Zhu; Wei Liu; Meng Jin; Mengqing Xiong; Ke Hu
Journal:  Int J Gen Med       Date:  2021-12-30

3.  Tumor stemness and immune infiltration synergistically predict response of radiotherapy or immunotherapy and relapse in lung adenocarcinoma.

Authors:  Hongjie Shi; Linzhi Han; Jinping Zhao; Kaijie Wang; Ming Xu; Jiajun Shi; Zhe Dong
Journal:  Cancer Med       Date:  2021-11-05       Impact factor: 4.452

4.  A Novel Prognostic Signature Revealed the Interaction of Immune Cells in Tumor Microenvironment Based on Single-Cell RNA Sequencing for Lung Adenocarcinoma.

Authors:  Xing Jin; Zhengyang Hu; Qihai Sui; Mengnan Zhao; Jiaqi Liang; Zhenyu Liao; Yuansheng Zheng; Hao Wang; Yu Shi
Journal:  J Immunol Res       Date:  2022-07-01       Impact factor: 4.493

5.  Comprehensive Analysis and Identification of an Immune-Related Gene Signature with Prognostic Value for Prostate Cancer.

Authors:  Yongrui Zhang; Yaowen Fu
Journal:  Int J Gen Med       Date:  2021-06-28

6.  Development and validation of an immune-related prognostic signature in lung adenocarcinoma.

Authors:  Sijin Sun; Wei Guo; Zhen Wang; Xin Wang; Guochao Zhang; Hao Zhang; Renda Li; Yibo Gao; Bin Qiu; Fengwei Tan; Yushun Gao; Qi Xue; Shugeng Gao; Jie He
Journal:  Cancer Med       Date:  2020-06-26       Impact factor: 4.452

7.  NEIL3 may act as a potential prognostic biomarker for lung adenocarcinoma.

Authors:  Cui Zhao; Jian Liu; Yifei Liu; Jianguo Zhang; Haomiao Zhou; Xin Qian; Hui Sun; Xuewen Chen; Miaosen Zheng; Tingting Bian; Lei Liu
Journal:  Cancer Cell Int       Date:  2021-04-20       Impact factor: 5.722

8.  Identification of the pyroptosis‑related prognostic gene signature and the associated regulation axis in lung adenocarcinoma.

Authors:  Wanli Lin; Ying Chen; Bomeng Wu; Ying Chen; Zuwei Li
Journal:  Cell Death Discov       Date:  2021-06-25

9.  Do mRNA profiles of lung adenocarcinomas provide information that will help individual patients?

Authors:  Mitchell S von Itzstein; David E Gerber; John D Minna
Journal:  EBioMedicine       Date:  2020-09-22       Impact factor: 8.143

10.  Gene signature based on B cell predicts clinical outcome of radiotherapy and immunotherapy for patients with lung adenocarcinoma.

Authors:  Linzhi Han; Hongjie Shi; Yuan Luo; Wenjie Sun; Shuying Li; Nannan Zhang; Xueping Jiang; Yan Gong; Conghua Xie
Journal:  Cancer Med       Date:  2020-10-24       Impact factor: 4.452

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.