Chaoqi Zhang1, Zhihui Zhang1, Nan Sun1, Zhen Zhang2, Guochao Zhang1, Feng Wang1, Yuejun Luo1, Yun Che1, Jie He1. 1. Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China. 2. Biotherapy Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China.
Abstract
Background: Costimulatory molecules play significant roles in mounting anti-tumor immune responses, and antibodies targeting these molecules are recognized as promising adjunctive cancer immunotherapies. Here, we aim to conduct a first full-scale exploration of costimulatory molecules from the B7-CD28 and TNF families in patients with lung adenocarcinoma (LUAD) and generated a costimulatory molecule-based signature (CMS) to predict survival and response to immunotherapy. Methods: We enrolled 1549 LUAD cases across 10 different cohorts and included 502 samples from TCGA for discovery. The validation set included 970 cases from eight different Gene Expression Omnibus (GEO) datasets and 77 frozen tumor tissues with qPCR data. The underlying mechanisms and predictive immunotherapy capabilities of the CMS were also explored. Results: A five gene-based CMS (CD40LG, TNFRSF6B, TNFSF13, TNFRSF13C, and TNFRSF19) was initially constructed using the bioinformatics method from TCGA that classifies cases as high- vs. low-risk groups per OS. Multivariable Cox regression analysis confirmed that the CMS was an independent prognostic factor. As expected, CMS exhibited prognostic significance in the stratified cohorts and different validation cohorts. Additionally, the prognostic meta-analysis revealed that CMS was superior to the previous signature. Samples in high- and low-risk groups exhibited significantly different tumor-infiltrating leukocytes and inflammatory activities. Importantly, we found that the CMS scores were closely related to multiple immunotherapy biomarkers. Conclusion: We conducted the first and most comprehensive costimulatory molecule landscape analysis of patients with LUAD and built a clinically feasible CMS for prognosis and immunotherapy response prediction, which will be helpful for further optimize immunotherapies for cancer.
Background: Costimulatory molecules play significant roles in mounting anti-tumor immune responses, and antibodies targeting these molecules are recognized as promising adjunctive cancer immunotherapies. Here, we aim to conduct a first full-scale exploration of costimulatory molecules from the B7-CD28 and TNF families in patients with lung adenocarcinoma (LUAD) and generated a costimulatory molecule-based signature (CMS) to predict survival and response to immunotherapy. Methods: We enrolled 1549 LUAD cases across 10 different cohorts and included 502 samples from TCGA for discovery. The validation set included 970 cases from eight different Gene Expression Omnibus (GEO) datasets and 77 frozen tumor tissues with qPCR data. The underlying mechanisms and predictive immunotherapy capabilities of the CMS were also explored. Results: A five gene-based CMS (CD40LG, TNFRSF6B, TNFSF13, TNFRSF13C, and TNFRSF19) was initially constructed using the bioinformatics method from TCGA that classifies cases as high- vs. low-risk groups per OS. Multivariable Cox regression analysis confirmed that the CMS was an independent prognostic factor. As expected, CMS exhibited prognostic significance in the stratified cohorts and different validation cohorts. Additionally, the prognostic meta-analysis revealed that CMS was superior to the previous signature. Samples in high- and low-risk groups exhibited significantly different tumor-infiltrating leukocytes and inflammatory activities. Importantly, we found that the CMS scores were closely related to multiple immunotherapy biomarkers. Conclusion: We conducted the first and most comprehensive costimulatory molecule landscape analysis of patients with LUAD and built a clinically feasible CMS for prognosis and immunotherapy response prediction, which will be helpful for further optimize immunotherapies for cancer.
Over the last few years, lung cancer has become the most common malignant tumor and is a grave danger to global human health, with an annual incidence increasing at a rate of 7.5%.[1] Approximately four out of five lung cancers are classified as non-small cell lung cancer (NSCLC). As the major histological subtype of NSCLC, lung adenocarcinoma (LUAD) accounts for over 1 million worldwide deaths annually.[2] Despite the amplification of traditional approaches which – in combination with targeted therapy – have reduced mortality, the five-year OS (OS) rate of LUAD remains about 15%.[3] The introduction of immunotherapy, especially immune checkpoint inhibitors (ICIs) targeting programmed cell death protein 1 (PD-1) and programmed cell death 1 ligand 1 (PD-L1), has revolutionized lung cancer treatment.[4,5] More recently, pembrolizumab monotherapy was approved to replace chemotherapy as the frontline treatment for patients with PD-L1 positive metastatic NSCLC.[6] Although treatment for lung cancer has been improved with the development of ICI-based immunotherapies, only a small proportion of patients with lung cancer can benefit from this schedule. Therefore, we must be able to predict the best candidates for immunotherapy and develop other novel immune checkpoint targets.The success of ICIs has emerged from a deep understanding of the functions of the immune system and immunosuppressive conditions that are generated in the tumor microenvironment (TME).[7,8] In the TME, T cells help distinguish cancer cells from healthy cells and initiate subsequent attacks. Before the attack, the naïve T cells need two signals to be active. The first signal is generated once a specific antigen is recognized by the T cell receptor (TCR). The second signal is a nonspecific costimulatory signal.[9] Based on the fact that the naïve T cells cannot be activated in without costimulatory signals,[10] cancer cells prevent the recognition of these signals by changing the costimulatory molecule signals and expressions in the TME.[11] Hence, ICIs prevent tumor cells from delivering incorrect messages to T cells, thereby selectively restoring a tumor-induced immune deficiency in the TME.[12] In addition to the best described immune checkpoint pathways (PD-L1/PD-1, CD86/CTLA4) that belong to the B7-CD28 family,[13,14] other co-stimulation pathways mainly arise from the tumor necrosis factor (TNF) family.[15]Currently, 13 molecules are classified as B7-CD28 family members, including eight molecules (CD80, CD86, PD-L1, PD-L2, ICOSLG, B7-H3, B7x, and HHLA2) that belong to the B7 family and five molecules (CD28, CTLA4, ICOS, PD-1, and TMIGD2) that belong to the CD28 family.[13] The TNF family consists of the TNF ligand superfamily (TNFSF) and the TNF receptor superfamily (TNFRSF) with 48 molecules.[16] Nineteen legends were defined as TNFSF, and other 29 receptors considered members of the TNFRSF (Table 1). These costimulatory molecules – consisting of members of the B7-CD28 and TNF families – constitute potential molecular targets for the development of novel ICIs and may make excellent additions to existing immunotherapeutic strategies.[17,18] However, the expression patterns and clinical significance of the majority of these members remain unknown. There is a need for full-scale investigations of these molecules in patients with LUAD.
Table 1.
Univariate Cox analysis of costimulatory molecule genes in TCGA Cohort.
Official symbol
Aliases
Family
HR
95%CI
P value
CD27
TNFRSF7
TNFRSF
0.8682
0.7858-0.9592
0.0055
CD274
PD-L1, B7-H1
B7
1.0125
0.9239-1.1097
0.7897
CD276
B7-H3
B7
1.3865
1.0728-1.7919
0.0125
CD28
Tp44
CD28
0.8635
0.7721-0.9658
0.0102
CD40
TNFRSF5
TNFRSF
0.9089
0.8063-1.0246
0.1182
CD40LG
TNFSF5, CD154, CD40L
TNFSF
0.8194
0.7421-0.9049
0.0000
CD70
TNFSF7, CD27L
TNFSF
1.0688
0.9757-1.1708
0.1522
CD80
B7-1, CD28LG1
B7
0.8878
0.7986-0.9869
0.0275
CD86
B7-2, CD28LG2
B7
0.9099
0.8076-1.0252
0.1210
CTLA4
CD152
CD28
0.8772
0.7924-0.9711
0.0116
EDA
EDA-A1, EDA-A2
TNFSF
0.9561
0.8727-1.0474
0.3345
EDA2R
TNFRSF27, XEDAR
TNFRSF
0.9107
0.8387-0.9888
0.0259
EDAR
EDA-A1R
TNFRSF
1.0425
0.9744-1.1154
0.2275
FAS
TNFRSF6, CD95
TNFRSF
0.9501
0.8396-1.0752
0.4176
FASLG
TNFSF6, CD95-L
TNFSF
0.9083
0.8204-1.0056
0.0641
HHLA2
B7-H5
B7
1.0016
0.9602-1.0448
0.9414
ICOS
CD278, CVID1
CD28
0.8970
0.8092-0.9942
0.0384
ICOSLG
B7-H2, CD275
B7
0.8427
0.7104-0.9996
0.0495
LTA
TNFSF1
TNFSF
0.8918
0.8002-0.9939
0.0385
LTB
TNFSF3
TNFSF
0.8771
0.7949-0.9679
0.0091
LTBR
TNFRSF3
TNFRSF
1.3077
1.0591-1.6147
0.0126
NGFR
TNFRSF16, CD271
TNFRSF
1.0011
0.9123-1.0986
0.9808
PDCD1
PD-1, CD279
CD28
0.9625
0.8693-1.0658
0.4630
PDCD1LG2
PD-L2, B7DC, CD273
B7
0.9691
0.8733-1.0754
0.5544
RELT
TNFRSF19L
TNFRSF
0.9259
0.7884-1.0873
0.3477
TMIGD2
CD28H
CD28
0.9290
0.8279-1.0423
0.2096
TNF
TNFSF2, TNFA
TNFSF
0.9098
0.8257-1.0024
0.0560
TNFRSF10A
TRAILR1, CD261
TNFRSF
1.0582
0.9025-1.2408
0.4858
TNFRSF10B
TRAILR2, CD262
TNFRSF
1.0336
0.8482-1.2596
0.7428
TNFRSF10C
TRAILR3, CD263
TNFRSF
0.8628
0.7727-0.9634
0.0087
TNFRSF10D
TRAILR4, CD264
TNFRSF
1.0799
0.9334-1.2495
0.3015
TNFRSF11A
RANK, CD265
TNFRSF
1.1316
0.9925-1.2902
0.0647
TNFRSF11B
OPG
TNFRSF
1.0088
0.9139-1.1135
0.8619
TNFRSF12A
FN14, TWEAKR, CD266
TNFRSF
1.1040
0.9768-1.2477
0.1131
TNFRSF13B
TACI, TNFRSF14B, CD267
TNFRSF
0.8682
0.7987-0.9438
0.0009
TNFRSF13C
BAFFR, CD268
TNFRSF
0.8788
0.7977-0.9682
0.0090
TNFRSF14
LIGHTR, HVEM, CD270
TNFRSF
0.8253
0.7112-0.9577
0.0114
TNFRSF17
BCMA, TNFRSF13A, CD269
TNFRSF
0.9023
0.8370-0.9727
0.0073
TNFRSF18
GITR, AITR, CD357
TNFRSF
0.9919
0.9060-1.0860
0.8607
TNFRSF19
TROY, TAJ
TNFRSF
0.8596
0.7735-0.9553
0.0050
TNFRSF1A
TNFR1, CD120A
TNFRSF
1.3583
1.0542-1.7500
0.0179
TNFRSF1B
TNFR2, CD120B
TNFRSF
0.8640
0.7360-1.0142
0.0739
TNFRSF21
DR6, CD358
TNFRSF
1.1041
0.9425-1.2935
0.2201
TNFRSF25
DR3, TNFRSF12
TNFRSF
0.9262
0.8376-1.0241
0.1349
TNFRSF4
OX40, CD134
TNFRSF
0.9510
0.8323-1.0867
0.4607
TNFRSF6B
DCR3
TNFRSF
1.1399
1.0481-1.2397
0.0022
TNFRSF8
CD30
TNFRSF
0.9513
0.8343-1.0847
0.4560
TNFRSF9
4-1BB, CD137, ILA
TNFRSF
0.9837
0.9007-1.0744
0.7156
TNFSF10
TRAIL, CD253
TNFSF
0.9741
0.8621-1.1007
0.6740
TNFSF11
RANKL, CD254
TNFSF
1.0685
0.9942-1.1484
0.0717
TNFSF12
TWEAK
TNFSF
0.7643
0.6331-0.9226
0.0051
TNFSF13
APRIL, CD256
TNFSF
0.7978
0.6847-0.9296
0.0038
TNFSF13B
BAFF, CD257
TNFSF
0.9889
0.8846-1.1056
0.8448
TNFSF14
LIGHT, HVEML, CD258
TNFSF
1.0094
0.9061-1.1246
0.8648
TNFSF15
TL1A
TNFSF
0.9212
0.8391-1.0113
0.0848
TNFSF18
GITRL
TNFSF
0.8992
0.7838-1.0316
0.1296
TNFSF4
OX-40L, CD134L, CD252
TNFSF
1.0323
0.9090-1.1723
0.6246
TNFSF8
CD30L, CD153
TNFSF
0.8864
0.7979-0.9848
0.0247
TNFSF9
4-1BB-L, CD137L
TNFSF
1.0888
0.9787-1.2112
0.1177
VTCN1
B7-H4
B7
0.9732
0.9192-1.0304
0.3513
TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.
Univariate Cox analysis of costimulatory molecule genes in TCGA Cohort.TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.We used LUAD gene expression data from The Cancer Genome Atlas (TCGA) to systematically explore the expression patterns and prognoses of these costimulatory molecules. Then, through a series of statistical methods, we built a costimulatory molecule-based signature (CMS) with significantly different prognoses. The CMS was well-validated in nine different cohorts from Gene Expression Omnibus (GEO) datasets and an independent cohort using clinical samples. Also, according to a prognostic meta-analysis, we determined that CMS was superior to the previous costimulatory molecule-related model. We also found that the CMS was characterized by distinct inflammatory profiles and specific immune infiltrating lymphocytes. What’s more, the CMS was able to predict the immunotherapy response in patients with LUAD. Therefore, our work describes the systemic landscape of costimulatory molecules based on B7-CD28 and TNF families and highlights the potential underlying clinical applications for the CMS, thereby supporting the development of rationales to guide prognosis management and immunotherapy in patients with LUAD.
Materials and methods
mRNA expression datasets and clinical information
A total of nine public datasets, including 1472 cases with corresponding mRNA expression data and clinical data, were gathered in this study. The training set consisted of data from 502 patients with genetic information (Illumina HiSeq 2000, log2 transformed RSEM normalized read count) and matching OS data from TCGA that were downloaded from the Cancer Genomics Browser of University of California Santa Cruz (UCSC) (https://genomecancer. ucsc.edu).[19] Eight other public datasets with mRNA microarray data were collected from GEO datasets with processed series matrix files (http://www.ncbi.nlm.nih.gov/geo), including GSE11969 (n = 91, log10 ratio (Cy5/Cy3) normalized read count),[20] GSE13213 (n = 117, log10 ratio (Cy5/Cy3) normalized read count),[21] GSE19188 (n = 40, log2 transformed RMA normalized read count),[22] GSE30219 (n = 83, log2 transformed RMA normalized read count),[23] GSE31210 (n = 226, log2 transformed RMA normalized read count),[24] GSE37745 (n = 106, log2 transformed RMA normalized read count),[25] GSE41271 (n = 180, log2 transformed RMA normalized read count),[26] and GSE50081 (n = 127, log2 transformed RMA normalized read count).[27] Moreover, for the genes with one more probe, mean expression values were recognized as the expression data. The clinical characteristics of these patients from multiple institutions are summarized in Table 2.
Table 2.
Clinical characteristics of the enrolled patients.
Characteristics
TCGAN=502
GSE11969N=90
GSE13213 N=117
GSE19188 N=40
GSE30219 N=83
GSE31210 N=226
GSE37745 N=106
GSE41271 N=180
GSE50081 N=127
IndependentN=77
Age, year
Mean
65.3
61.0
60.7
–
61.1
59.6
63.0
64.1
68.7
60.0
Gender
Male
231
47
60
25
65
105
46
91
65
39
Female
271
43
57
15
18
121
60
89
62
38
Smoking history
Yes
416
45
61
–
–
111
–
–
92
46
No
72
45
56
–
–
115
–
–
23
31
NA
14
0
0
–
–
0
–
–
12
0
TNM stage
I and II
388
65
92
–
83
226
89
129
127
62
III and Ⅳ
105
25
25
–
0
0
17
51
0
15
NA
9
0
0
–
0
0
0
0
0
0
OS state
Alive
320
50
68
16
40
191
29
111
76
57
Death
182
40
49
24
43
35
77
69
51
20
NA, not available; OS, overall survival.
Clinical characteristics of the enrolled patients.NA, not available; OS, overall survival.
RNA extraction and quantitative real-time reverse transcription–PCR
We used 77 surgically resected LUAD tissues, collected from The First Affiliated Hospital of Zhengzhou University between August 2013 and January 2015, as the independent cohort. Then total RNA was extracted from LUAD tissues using the RNAiso Plus reagent (Takara, #9109) according to the manufacturer’s instructions. The first strand of complementary DNA was synthesized from total RNA using the Prime Script™ RT reagent kit (Takara, #RR047A). Quantitative real-time PCR was performed with SYBR Premix Ex Taq II (Takara, #RR820A), and data were analyzed in the Agilent Mx3005P. With the endogenous control for normalization of GAPDH, the expression data of all the selected genes were log2 transformed before signature validation. All the primer sequences in this research are displayed in Supplementary Table 1. All patients were received and signed the informed consents. The samples used in the study were approved by the Institutional Review Boards of the First Affiliated Hospital of Zhengzhou University.
Functional enrichment analysis
After deleting the genes with low expression values (more than half of all genes analyzed had 0 expression), functional enrichment analysis based on CMS related genes were conducted through the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) in DAVID 6.8 (http://david.abcc.ncifcrf.gov).
Estimated the profiling of immune cell infiltration
The FPKM of mRNAseq data of LUAD from TCGA was used for estimating the fractions of 22 immune cell types in the TME by CIBERSORT.[28] The data was not standardized and genes with mean expression 0 were filtered out before submitting to CIBERSORT. The profiling of multiple immune cell types was performed through the leucocyte gene signature matrix, termed LM22, for the CIBERSORT software (http://cibersort.stanford.edu/). LM22 consists of 547 genes that can distinguish 22 immune cells, including different subtypes of B cell types, T cell types, natural killer cells (NKs), plasma cells, and myeloid cell types.
Biomarkers for predicting immunotherapy response
The potential immunotherapy response prediction performance of CMS was estimated with the following biomarkers: tumor mutation burden (TMB), neoantigen, PD-L1 protein expression, and Tumor Immune Dysfunction and Exclusion (TIDE) score. The TMB and neoantigen data of LUAD patients in the TCGA dataset were separated from The Cancer Immunome Atlas (TCIA) (https://tcia.at/home).[29] The protein data of PD-L1 expression was realized through the reverse-phase protein array (RPPA) analysis, which was retrieved from cbioPortal (http://www.cbioportal.org). TIDE has been proven to outperform known immunotherapy biomarkers in predicting immunotherapy response in patients with melanoma and lung cancer, especially those treated with ICIs.[30] TIDE scores, T cell dysfunction scores, and T cell exclusion scores were download from the TIDE web (http://tide.dfci.harvard.edu) after following the instructions on the website to uploaded input data. All the expression details of these biomarkers used in this research are summarized in Supplementary Table 2.
Signature identification and statistical analysis
A univariate Cox proportional hazards regression analysis and stepwise Cox proportional hazards regression model were used to construct the signature. Then, the CMS was constructed using five selected genes (CD40LG, TNFRSF6B, TNFSF13, TNFRSF13C, and TNFRSF19) with a linear combination of their expression values. These inputs were weighted with the regression coefficients from the stepwise Cox regression analyses. The expression details of the five selected genes and the corresponding risk scores in different public cohorts are displayed in Supplementary Table 2. All the patients in different cohorts were divided into high- and low-risk groups based on the optimal cutoff point, which was determined by the “surv_cutpoint” function of the “survminer” R package. The prognostic significance of the CMS between the high- and low-risk groups in different sets and subgroups were calculated with Kaplan-Meier curves and a 2-tailed log-rank test. The Mann-Whitney U-test was applied to analyze the between-group differences for immune cell fractions and immunotherapy biomarkers. Univariate and multivariate Cox regression analyses were conducted to clarify the independent prognostic factors. P < .05 was considered statistically significant for all statistical methods. STATA software (version 12.0) was used to perform the prognostic meta-analysis of CMS and B7-CD 28 signature. The two overall hazard ratio (HR) values were calculated using the random-effects model. R software version 3.5.1 (https://www.r-project.org) was used for other data analyses.
Results
The panorama and prognostic significance of costimulatory molecule genes in LUAD
A total of 60 costimulatory molecule genes were separated from the TCGA LUAD data, which consisted of 13 well-defined B7-CD28 family costimulatory molecules,[13] and 47 TNF family costimulatory molecules (Table 1).[16] The relationships of these molecules are shown in Supplementary Figure 1. Correlation analysis based on TCGA dataset revealed that most of the costimulatory molecules were highly relevant to others. Then, 502 LUAD patients with 60 costimulatory molecule expression data and matched complement OS information from the TCGA data were used to evaluate the prognostic significance of these candidate genes. Univariate cox proportional hazards regression analysis was conducted, and the results showed that 23 genes were significantly associated with OS (P < .05, Table 1). Among the significant genes, four genes (CD276, LTBR, TNFRSF1A and TNFRSF6B) were confirmed as risky factors with HRs (HR)>1, and 19 genes (CD27, CD28, CD40LG, CD80, CTLA4, EDA2R, ICOS, ICOSLG, LTA, LTB, TNFRSF10C, TNFRSF13B, TNFRSF13C, TNFRSF14, TNFRSF17, TNFRSF19, TNFSF12, TNFSF13 and TNFSF8) were confirmed as protective factors with HR<1.
Figure 1.
Identification of the CMS in the TCGA dataset. (a) the distribution of risk score, survival status, and the five-gene expression panel. Kaplan-Meier curves were conducted to estimate overall survival for the high- and low-risk groups based on the risk score; (b) total patients with LUAD (c) patients with early-stage (stage I and II) LUAD. (d) patients with advanced-stage (stage III and Ⅳ) LUAD.
Identification of the CMS in the TCGA dataset. (a) the distribution of risk score, survival status, and the five-gene expression panel. Kaplan-Meier curves were conducted to estimate overall survival for the high- and low-risk groups based on the risk score; (b) total patients with LUAD (c) patients with early-stage (stage I and II) LUAD. (d) patients with advanced-stage (stage III and Ⅳ) LUAD.
Identification of CMS for prognostication
With the tremendous success in the clinical use of ICIs targeting costimulatory factors for lung cancer, we sought to establish CMS for prognostication. A stepwise Cox proportional hazards regression model was then used to filter out the redundant candidate genes and construct a prognostic model. Using the prognostic information of the 502 cases and the corresponding expression details of the 23 significant candidate genes, the stepwise method finally filtered out the combination of the 5 genes. We then developed a risk score formula for patients with LUAD based on the gene’s expression levels to predict patient survival: risk score = (−0.1075× CD40LG) + (0.1418× TNFRSF6B) + (−0.1603× TNFSF13) + (−0.1069× TNFRSF13C) + (−0.0803× TNFRSF19). The expression panel of the five genes, the distribution of risk scores, and survival status of each patient are shown in Figure 1(a). Next, we classified all the patients in the TCGA cohort into high-risk (n = 292) and low-risk groups (n = 210) based on the optimal cutoff point (cutoff value = −2.3834). We found that patients in the high-risk group showed significantly worse OS (Figure 1(b), HR 2.0435, 95% confidence interval (CI) 1.4811–2.8195, P < .0001). When we further applied the signature into different clinical stages, the results indicated that the formula still worked well. Specifically, we observed significant OS time between the high- and low-risk groups both for early-stage (stage I and II) (Figure 1(c), HR 1.9961, 95% CI 1.3641–2.9210, P = .0003) and advanced stage disease (stage III and Ⅳ) (Figure 1(d), HR 2.7529, 95% CI 1.6335–4.6394, P < .0001).To further explore whether the signature-based risk score was an independent factor in patients with LUAD, univariate, and multivariate Cox regression analyses in the TCGA database were conducted. The results of the multivariate Cox regression model confirmed that the risk score was a significant factor (HR = 1.7952, 95%CI 1.2254–2.6298, P = .0027) independent of age, sex, smoking history, clinical stage, and mutation (MUT) status (Table 3).
Table 3.
Univariable and multivariable Cox regression analysis of the costimulatory molecule-based signature and survival in TCGA dataset.
Univariable and multivariable Cox regression analysis of the costimulatory molecule-based signature and survival in TCGA dataset.Abbreviations: HR, hazard ratio; CI, confidence interval; WT, wild-type; MUT, mutation.
Evaluation of the performance of CMS in different clinical subgroups
Sex, age, smoking history, and MUT status were factors that influenced the TME, especially the expression of immune checkpoints. Consequently, patients from TCGA were then divided into different subgroups based on these parameters: sex (male or female), age [older (age ≥ 60) or younger (age<60)], smoking (smoker or nonsmoker), and MUT status [EGFR wide-type (WT), EGFR MUT, KRAS WT, KRAS MUT, or EGFR/KRAS WT]. All the patients in different subgroups were stratified into high- and low- risk groups based on the risk score with the same formula. The results showed that all the high-risk groups had significantly different OS compared to the paired low-risk groups (Supplementary Figures 2 and 3, P < .05).
Figure 2.
The association between CMS and overall survival in nine different validation cohorts. Kaplan-Meier curves were created to estimate overall survival for in high- and low-risk groups based on the risk score. (a) GSE11969 (range from −0.0685 to 0.2331); (b) GSE13213 (range from −0.7562 to 0.5577); (c) GSE19188 (range from −0.2603 to 0.3283); (d) GSE30219 (range from −2.5093 to −1.5032); (e) GSE31210 (range from −2.9149 to −1.5841); (f) GSE37745 (range from −2.5732 to −1.6179); (g) GSE41271 (range from −1.8310 to −0.0865); (h) GSE50081 (range from −2.4230 to −1.4792); (i) an independent cohort with qPCR data.
Figure 3.
Compare CMS with previous costimulatory molecules signature. (a) a meta‐analysis was performed using the prognostic results of CMS in nine public datasets. (b) a meta‐analysis was performed using the prognostic results of the B7-CD28 signature in nine public datasets.
The association between CMS and overall survival in nine different validation cohorts. Kaplan-Meier curves were created to estimate overall survival for in high- and low-risk groups based on the risk score. (a) GSE11969 (range from −0.0685 to 0.2331); (b) GSE13213 (range from −0.7562 to 0.5577); (c) GSE19188 (range from −0.2603 to 0.3283); (d) GSE30219 (range from −2.5093 to −1.5032); (e) GSE31210 (range from −2.9149 to −1.5841); (f) GSE37745 (range from −2.5732 to −1.6179); (g) GSE41271 (range from −1.8310 to −0.0865); (h) GSE50081 (range from −2.4230 to −1.4792); (i) an independent cohort with qPCR data.Compare CMS with previous costimulatory molecules signature. (a) a meta‐analysis was performed using the prognostic results of CMS in nine public datasets. (b) a meta‐analysis was performed using the prognostic results of the B7-CD28 signature in nine public datasets.
Validation of the CMS in nine independent cohorts
To identify whether the CMS derived from the TCGA cohort was robust, we first evaluated its performance in eight independent public validation cohorts. These consisted of the remaining GSE11969, GSE13213, GSE19188, GSE30219, GSE31210, GSE37745, GSE41271, and GSE50081 datasets. The CMS stratified all patients from different public cohorts into the high- and low-risk groups using the same formula [risk score = (−0.1075× CD40LG) + (0.1418× TNFRSF6B) + (−0.1603× TNFSF13) + (−0.1069× TNFRSF13C) + (−0.0803× TNFRSF19)] with the optimal cutoff points. As shown in Figure 2, significant differences between the high- and low-risk groups were found in most of the GEO datasets, including GSE13213 (cutoff value = −0.2261, HR 2.5990, 95% CI 1.3539–4.9890, P = .0029), GSE19188 (cutoff value = −0.0061, HR 2.4817, 95% CI 1.0571–5.8262, P = .0308), GSE30219 (cutoff value = −2.2083, HR 2.2955, 95% CI 1.1495–4.5839, P = .0156), GSE31210 (cutoff value = −2.5365, HR 2.2037, 95% CI 1.0960–4.4308, P = .0229), GSE41271 (cutoff value = −1.0183, HR 2.3023, 95% CI 1.4267–3.7153, P = .0004) and GSE50081 (cutoff value = −2.1254, HR 2.2958, 95% CI 1.2393–4.2530, P = .0066). Meanwhile, in the GSE11969 (cutoff value = 0.0607) and GSE37745 (cutoff value = −2.1263) datasets, the signature showed a borderline difference between the high- and low-risk groups with P values of 0.1015 and 0.1192, respectively (Figure 2(a and f)). The different performance of CMS in different datasets may be caused by the different race or the high spatial heterogeneity of the immune microenvironment.To further measure whether the signature could be used in clinical practice, we validated the signature in an independent cohort that contained 77 frozen tissue samples with qRT-PCR data. By using the same model [risk score = (−0.1075× CD40LG) + (0.1418× TNFRSF6B) + (−0.1603× TNFSF13) + (−0.1069× TNFRSF13C) + (−0.0803× TNFRSF19)] and the optimal cutoff point (cutoff value = −0.1300), patients were classified into high- (n = 32) and low-risk groups (n = 45). As expected, a significant difference in mortality was found between these two groups (Figure 2(i), HR 2.9189, 95% CI 1.1622–7.3309, P = .0169).
Compare the CMS with the previous model
Prior to the creation of our signature, Shanbo Zheng et al. constructed a signature for LUAD based on the costimulatory molecules from the B7-CD28 family (B7-CD28 signature) with a risk score of 0.3313× CD276 – 0.1559× CD28.[31] We then comprehensively assessed the prognostic significance of our CMS and the B7-CD28 signature by examining public datasets and conducting prognostic meta-analyses based on the nine groups (n = 1472) of the two different signature groups. As shown in Figure 3(a), our CMS performed very well in the different cohorts, producing HRs larger than 1. On the contrary, the B7-CD28 signature was not that stable in different cohorts and some of the HRs were smaller than 1 (Figure 3(b)). More importantly, the meta-analysis combined HR of our CMS was far larger than that of the B7-CD28 signature. These findings indicate that our signature was superior to the previous model.
CMS related biological processes and pathways
The consistent prognostic performance of the CMS was confirmed in 10 different cohorts. This prompted us to investigate the biological features of patients with different risk scores. We first filtered out 2771 low-expression genes (genes where half or more than half of the values were 0) and then extracted the genes that strongly correlated with risk score (Pearson |R| > 0.45, P < .0001) from the remaining 17759 genes in TCGA dataset. Collectively, 14 positively related genes and 399 negatively related genes were screened out (Figure 4(a)). Then, these selected genes were chosen for GO and KEGG analyses through use of the online DAVID tool (https://david.ncifcrf.gov). The results revealed that signature-related genes were more involved in the biological process of the immune response, especially B cell and T cell-related immune response (Figure 4(b)). KEGG analysis further confirmed that these genes were closely related to immune-specific pathways (Figure 4(c)).
Figure 4.
CMS-related biological pathways. (a) the most related genes of TNF family-based signature in patients with LUAD (Pearson |R| > 0.45, P < .0001). (b and c) GO and KEGG analyses of the related genes.
CMS-related biological pathways. (a) the most related genes of TNF family-based signature in patients with LUAD (Pearson |R| > 0.45, P < .0001). (b and c) GO and KEGG analyses of the related genes.
CMS-related immune cell infiltration and inflammatory activities
To further increase our understating of the CMS-related immune landscape, we first explored the relationship between CMS and immune cell infiltration. The estimated fractions of different immune cells in the TME of LUAD were calculated by CIBERSORT, in combination with the LM22. The results demonstrated that the panorama of immune cells between high- and low-risk patients were dramatically different (Figure 5(a)). In particular, high-risk patients showed a significantly higher proportion of activated NK cells, activated dendritic cells (DCs), neutrophils, macrophages M0, resting DCs, and regulator T cells (Tregs) (Figure 5(b and c)). On the contrary, low-risk patients featured a high proportion of memory B cells, resting CD4 memory T cells, and gamma delta T cells (Figure 5(b and c)).
Figure 5.
CMS-related immune cell infiltration and inflammatory activities. (a) the relative proportion of immune cell expression in high- and low-risk patients. (b and c) differentially expression immune cells in high- and low-risk patients. (d) the details of seven inflammatory metagenes and risk score. (e) correlogram of risk score, and inflammatory metagenes. *, **, ***, and **** represent P < .05, P < .01, P < .001 and P < .0001, respectively.
CMS-related immune cell infiltration and inflammatory activities. (a) the relative proportion of immune cell expression in high- and low-risk patients. (b and c) differentially expression immune cells in high- and low-risk patients. (d) the details of seven inflammatory metagenes and risk score. (e) correlogram of risk score, and inflammatory metagenes. *, **, ***, and **** represent P < .05, P < .01, P < .001 and P < .0001, respectively.Next, to increase our understanding of CMS-related inflammatory activities, we assessed the relationship between CMS and seven clusters of metagenes. These consisted of 104 genes and represented different inflammation and immune response.[32] The expression details of the collected genes and risk scores were displayed in Figure 5(d). Then, to explore the correlation between CMS and the entire metagenes of every cluster, the expression of corresponding gene clusters was calculated by Gene Sets Variation Analysis (GSVA).[33] Finally, the correlations were portrayed according to Pearson r-values between risk scores and metagenes (Figure 5(e)). The results revealed that CMS was negatively associated with HCK, LCK, MHC-I, and MHC-II. This indicated that patients with high CMS scores were characterized by an immune-suppressive state.
Association of CMS and immunotherapy response in patients with LUAD
Presently, immunotherapy is considered a first-line treatment for patients with LUAD. Costimulatory molecules are major candidates for immunotherapy. Therefore, we further assessed the association of CMS and immunotherapy response through analyzing the correlation of CMS and widely recognized immunotherapy biomarkers.[30] Totally, we enrolled eight indices, including TMB, the number of neoantigens, the number of clonal neoantigens, the number of subclonal neoantigens, the protein level of PD-L1, the TIDE score, the T cell dysfunction score, and the T cell exclusion score, to get a comprehensive evaluation. The results, as depicted in Figure 6, illustrated that high-risk patients were distinguished by a high level of TMB, neoantigens, protein level of the PD-L1 and T cell exclusion scores, and low level of the TIDE and T cell dysfunction scores. These results indicate that CMS-based high-risk patients may benefit from immunotherapy, especially ICIs.
Figure 6.
The expression pattern of immunotherapy response makers in high- and low-risk groups. The distribution of TMB (a), number of neoantigens (b), number of clonal neoantigens (c), number of subclonal neoantigens (d), protein level of PD-L1 (e), TIDE score (f), T cell dysfunction score (g) and T cell exclusion score (h) in high- and low-risk groups. *, **, ***, and **** represent P < .05, P < .01, P < .001 and P < .0001, respectively.
The expression pattern of immunotherapy response makers in high- and low-risk groups. The distribution of TMB (a), number of neoantigens (b), number of clonal neoantigens (c), number of subclonal neoantigens (d), protein level of PD-L1 (e), TIDE score (f), T cell dysfunction score (g) and T cell exclusion score (h) in high- and low-risk groups. *, **, ***, and **** represent P < .05, P < .01, P < .001 and P < .0001, respectively.
Discussion
There is plenty of evidence pointing out that the immunosuppressive TME exhausts T cells and renders them anergic. This subsequently enables tumor cells to evade host immune-mediated elimination.[34] Costimulatory molecules, especially the immune checkpoints, expressed on cancer cells or tumor-infiltrating lymphocytes play vital roles in regulating the anti-tumor immune response. Further, the blocking antibody targeting PD-L1/PD-1 has directly prolonged survival in patients with metastatic cancer.[35,36] Presently, the costimulatory molecules mainly consist of two major families: the B7-CD28 family and the TNF family.[37] In this study, we simultaneously detected the expression pattern and clinical significance of 60 costimulatory molecules in patients with LUAD. Based on the significant genes, we developed a novel survival prediction model (CMS) based on the expression of five costimulatory molecular features in the TCGA dataset. The CMS score was found as an independent risk factor for patients with LUAD. Furthermore, the CMS was well validated in eight different public GEO datasets and 77 cases from frozen tissues with qRT-PCR data. Interestingly, through prognostic meta-analysis, we proved that our CMS had better prognostic value than the previous costimulatory molecule-related signature. We also explored the immune panorama – including immune cell distribution and inflammatory activities – in CMS high- and low-risk patients. Additionally, we found that the CMS score was positively related to different immunotherapy biomarkers. To our knowledge, this is the first and most comprehensive study to date to describe the prognostic and immunotherapy response prediction value of a CMS in patients with LUAD.To get the whole picture of costimulatory molecule expression in patients with LUAD, we collected the 13 members from the B7-CD28 family and the 47 members from the TNF family into our analysis.[13,16] After the univariate Cox proportional hazards regression analysis and stepwise Cox proportional hazards regression model, we found that all five selected genes (CD40LG, TNFRSF6B, TNFSF13, TNFRSF13C, and TNFRSF19) belonged to the TNF family. This indicated that costimulatory signals and pathways in the TNF family had a more important prognostic value than those in the B7-CD28 family in patients with LUAD. CD40LG – also known as CD40L, TNFSF5, or CD154 – is a membrane-bound protein belonging to the TNFSF family. CD40LG has been a therapy target in cancer treatment because of its ability to trigger Th1-type immune responses.[38] The expression and prognostic states of the CD40LG-CD40 axis was previously reported in lung cancer.[39] TNFRSF6B, a soluble decoy receptor, is also known as Decoy receptor 3 (DcR3), belongs to the TNFRSF family.[40] TNFRSF6B inhibits apoptosis and promotes angiogenesis through binding with FASL, LIGHT, and TL1A.[41,42] Moreover, studies found that DcR3 is a potential immunotherapy target for cancer treatment.[43] TNFSF13, also known as APRIL and CD256, is a proliferation-inducing ligand that plays an important role in B cell development.[44] The clinical significance of TNFSF13 in several cancers was previously revealed and included NSCLC,[45] breast cancer,[46] B-cell chronic lymphocytic leukemia,[47] and other tumor types. TNFRSF13C (BAFFR or CD268), a receptor of BAFF, is a crucial regulatory factor in B cell proliferation, development, and maturation.[48] Hong Qin et al. reported that a novel anti-BAFFR antibody may be a promising strategy for drug-resistant B-cell malignancies.[49] TNFRSF19, also known as TROY or TAJ, is a member of the TNFRSF family and demonstrates complex and pleiotropic functions in different cellular contexts.[50] Present evidence displayed that TNFRSF19 acted as a tumor suppressor in patients with lung cancer.[51] Although the expression details of these five members in various cancer types have been described, the combination and functions of these molecules still warrants further exploration.To verify the robustness of CMS, we reproduce the model in nine different cohorts, and the significance of CMS was finally confirmed by prognosis meta-analysis. It is worth mentioning that the number of validation cohorts in our research was larger than that of any other studies in the LUAD population. This made our signature more reliable and clinically feasible. Before our study, a signature based on the expression of costimulatory molecules from the B7-CD28 family was constructed.[31] Through meta-analysis, we obtained two crucial conclusions: the CMS signature had prognostic significance across these public datasets, although some of the P-values were not statistically significant and our CMS model demonstrated an advantage over the reported B7-CD28 model. These conclusions are consistent with our finding that the TNF family has a more important prognostic value for patients with LUAD.Through analysis, the most related genes of CMS, the potential mechanisms of CMS in LUAD was proved to be closely associated with the immune-related process. Hence, the details of CMS-specific immune profiles were further analyzed. We found that there were higher proportions of DCs, NKs, and Tregs in CMS high-risk patients TME. Simultaneously, inflammatory metagene analysis revealed that CMS score was negatively related to monocyte/myeloid lineage- and T cell-specific functions (HCK and LCK). What’s more, CMS score was also found negatively related to the antigen-presenting process of T cells (MHC-I and MHC-II) in LUAD. Thus, CMS high-risk patients appear to exhibit a high immune cell infiltration microenvironment while in an immune-suppressive state.Interestingly, this research highlighted the potential role of CMS in predicting the response to immunotherapy in patients with LUAD. Because the immune checkpoint targets (PD-L1 and PD-1) are costimulatory molecules, CMS may have the ability to predict the response to ICIs-based immunotherapy. Due to the lack of details regarding mRNA expression in cases with immunotherapy, we had to evaluate the relationship indirectly. We collected TMB, the number of neoantigens, the protein level of PD-L1, and the TIDE scores. TMB is one of the classic biomarkers for immunotherapy response, and neoantigen burden is always increased by TMB. This will be useful for T cell recognition.[52,53] The PD-L1 expression level was another well-known biomarker for ICIs in lung cancer.[54] The TIDE score is a newly-developed method for immunotherapy response prediction, and considered a more accurate biomarker than TMB or PD-L1 expression.[30] Collectively, high-risk patients exhibited high TMB and PD-L1 expression. From a mechanical standpoint, this resonated with the results of the immune profile analysis. By comparing the CMS scores with these different verified biomarkers, we preliminarily speculate that CMS high-risk patients may be suitable for immunotherapy. These findings give us additional confidence that the CMS scores may act as a novel predictive biomarker for immunotherapy response.There are some limitations to this study that warrant consideration. Firstly, although we tried our best to include as many independent datasets as possible for validation, this study was retrospective. Secondly, the CMS-specific immune landscape was realized through bioinformatic methods with RNA-seq data. This analysis may have been influenced by noise. Thirdly, because the mRNA expression data from patients with immunotherapy was not available, the prediction ability of CMS for immunotherapy response was estimated indirectly. Future prospective studies could affirm the complete prediction ability and a molecular picture of the CMS signature.
Conclusions
Here, we have performed a first costimulatory molecule landscape analysis in patients with LUAD. We built a reliable, clinically feasible prognostic signature named CMS and identified the potential underlying immune-related mechanisms of this signature. Importantly, the CMS was tightly related to well validated immunotherapy biomarkers. Thus, the CMS could be a clinically useful tool for prognostic management and predicting immunotherapy response in patients with LUAD. Future validation of the predictive capability of this formula will be helpful for patients seeking counseling and individualized treatment.Click here for additional data file.
Authors: Ahmedin Jemal; Freddie Bray; Melissa M Center; Jacques Ferlay; Elizabeth Ward; David Forman Journal: CA Cancer J Clin Date: 2011-02-04 Impact factor: 508.702
Authors: Johan Botling; Karolina Edlund; Miriam Lohr; Birte Hellwig; Lars Holmberg; Mats Lambe; Anders Berglund; Simon Ekman; Michael Bergqvist; Fredrik Pontén; André König; Oswaldo Fernandes; Mats Karlsson; Gisela Helenius; Christina Karlsson; Jörg Rahnenführer; Jan G Hengstler; Patrick Micke Journal: Clin Cancer Res Date: 2012-10-02 Impact factor: 12.531
Authors: Hossein Borghaei; Luis Paz-Ares; Leora Horn; David R Spigel; Martin Steins; Neal E Ready; Laura Q Chow; Everett E Vokes; Enriqueta Felip; Esther Holgado; Fabrice Barlesi; Martin Kohlhäufl; Oscar Arrieta; Marco Angelo Burgio; Jérôme Fayette; Hervé Lena; Elena Poddubskaya; David E Gerber; Scott N Gettinger; Charles M Rudin; Naiyer Rizvi; Lucio Crinò; George R Blumenschein; Scott J Antonia; Cécile Dorange; Christopher T Harbison; Friedrich Graf Finckenstein; Julie R Brahmer Journal: N Engl J Med Date: 2015-09-27 Impact factor: 91.245
Authors: Freddie Bray; Jacques Ferlay; Isabelle Soerjomataram; Rebecca L Siegel; Lindsey A Torre; Ahmedin Jemal Journal: CA Cancer J Clin Date: 2018-09-12 Impact factor: 508.702