Abstract
Lung squamous cell carcinoma (LUSC) is a common type of lung cancer with high incidence and mortality. Tumor mutational burden (TMB) is an emerging biomarker for selecting patients with non-small cell lung cancer (NSCLC) for immunotherapy. This study aimed to clarify how TMB is involved in the mechanisms of LUSC and to develop a model that predicts the overall survival (OS) of LUSC patients. Data on patients with LUSC were obtained from The Cancer Genome Atlas (TCGA) database. Differentially expressed genes (DEGs) between the low- and high-TMB groups were identified and used as nodes to construct a protein-protein interaction (PPI) network. Gene Ontology (GO) enrichment analysis and gene set enrichment analysis (GSEA) were used to investigate the potential molecular mechanisms. We then identified factors affecting the prognosis of LUSC through Cox analysis and developed a risk score signature. The Kaplan-Meier method was used to compare survival between the high- and low-risk groups. We constructed a nomogram based on the risk score model and clinical characteristics to predict the overall survival of patients with LUSC. Finally, the signature and nomogram were validated using gene expression data downloaded from the Gene Expression Omnibus (GEO) database. Thirty DEGs between the high- and low-TMB groups were identified. PPI analysis identified CD22, TLR10, PIGR and SELE as hub genes. Cox analysis indicated that FAM107A, IGLL1, SELE and T stage were independent prognostic factors for LUSC. Patients with low risk scores survived longer than those with high risk scores. We built a nomogram that integrated clinical characteristics (TNM stage, age, gender) with the three-gene signature to predict the survival probability of LUSC patients, and the signature and nomogram were further verified in the GEO dataset. TMB might contribute to the pathogenesis of LUSC, and TMB-associated genes can be used to develop a model to predict the OS of lung squamous cell carcinoma patients.
Year: 2021 PMID: 33907270 PMCID: PMC8079676 DOI: 10.1038/s41598-021-88694-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Differentially expressed genes between low-TMB and high-TMB groups.
| Gene | logFC | pValue |
|---|---|---|
| CYSLTR2 | − 1.6898991 | 0.00010269 |
| MS4A1 | − 1.1441338 | 0.00058734 |
| FAM107A | − 1.0200621 | 7.73E−05 |
| IGLL1 | − 1.7638948 | 0.00533523 |
| LRRC55 | − 1.785876 | 0.00055621 |
| MS4A8 | − 1.8295919 | 0.00021727 |
| C20orf85 | − 1.1821732 | 0.00062144 |
| SELE | − 1.2084087 | 0.00472061 |
| TNFSF8 | − 1.1427382 | 0.0032476 |
| NR5A1 | 1.9226092 | 7.87E−06 |
| CADM3 | − 1.2047907 | 5.53E−05 |
| FCRL2 | − 1.159438 | 0.0008174 |
| BPIFB1 | − 1.4510685 | 0.00011556 |
| ADH1B | − 1.1416372 | 9.09E−06 |
| INHA | 1.85340517 | 0.00362178 |
| SCGB1A1 | − 1.1115467 | 1.63E−05 |
| PIGR | − 1.0412167 | 0.00022015 |
| C1orf189 | − 1.1341047 | 0.00026482 |
| WFDC12 | − 1.2491682 | 0.00164705 |
| FAM216B | − 1.3900118 | 2.55E−06 |
| HS3ST4 | − 1.504201 | 0.00026443 |
| PIP | − 1.0171485 | 0.00026387 |
| CD22 | − 1.0772369 | 0.00013723 |
| FCER2 | − 1.3194606 | 0.00204346 |
| C2orf40 | − 1.2412141 | 3.72E−05 |
| CCL19 | − 1.073972 | 0.00058595 |
| TLR10 | − 1.0166526 | 0.00302647 |
| C1orf194 | − 1.0624677 | 0.00026991 |
| APOA1 | 6.52638412 | 0.00622402 |
| SMIM24 | 1.10145976 | 0.00793992 |
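The values in this table are typeset with a Unicode minus sign (sometimes followed by a space), which most numeric parsers reject. The following is a minimal Python sketch, not part of the original analysis, that parses such strings and splits genes by the sign of logFC. The helper names `parse_number` and `split_by_sign` are hypothetical, and only a subset of rows from the table is included. The source does not state which TMB group is the reference for logFC, so the sketch reports only the sign.

```python
def parse_number(text: str) -> float:
    """Convert table strings such as '− 1.6898991' or '7.73E−05' to floats."""
    # Replace the Unicode minus (U+2212) with ASCII '-' and drop any spaces.
    cleaned = text.replace("\u2212", "-").replace(" ", "")
    return float(cleaned)


# A subset of rows from the DEG table: (gene, logFC, pValue).
ROWS = [
    ("CYSLTR2", "\u2212 1.6898991", "0.00010269"),
    ("FAM107A", "\u2212 1.0200621", "7.73E\u221205"),
    ("NR5A1", "1.9226092", "7.87E\u221206"),
    ("INHA", "1.85340517", "0.00362178"),
    ("APOA1", "6.52638412", "0.00622402"),
    ("SMIM24", "1.10145976", "0.00793992"),
]


def split_by_sign(rows):
    """Return (genes with positive logFC, genes with negative logFC)."""
    positive, negative = [], []
    for gene, logfc, _pvalue in rows:
        (positive if parse_number(logfc) > 0 else negative).append(gene)
    return positive, negative


positive, negative = split_by_sign(ROWS)
print("positive logFC:", positive)
print("negative logFC:", negative)
```

In the full table, only NR5A1, INHA, APOA1 and SMIM24 have a positive logFC; the remaining 26 genes have a negative logFC.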
Figure 1. Identification of DEGs in LUSC between tumor and normal tissues. (a) Heatmap of DEGs between the high-TMB and low-TMB groups in LUSC from analysis of the TCGA datasets. Each column represents a sample, and each row represents one DEG. The levels of the DEGs are shown in colors that transition from green to red as the level increases. The lines beside the heatmap show the dendrogram from the cluster analysis of the DEGs. (b) The protein–protein interaction (PPI) network was constructed from all 30 DEGs using the STRING database. (c) Four hub genes (PIGR, TLR10, SELE and CD22) in the PPI network were screened with Cytoscape based on their connectivity degree. Red circles indicate the four hub genes.
Figure 5. Kaplan–Meier survival curves. (a, b) Patients from the TCGA and GSE73403 datasets were stratified into two groups at the median risk score calculated from the three-gene risk score signature. (a) Kaplan–Meier survival curves of the signature in the TCGA dataset. (b) Kaplan–Meier survival curves of the signature in the GSE73403 dataset. (c) Kaplan–Meier survival curves of the TMB groups, with TMB calculated by VarScan. (d) Kaplan–Meier survival curves of the TMB groups, with TMB calculated by MuTect. Red indicates the high-TMB group and blue the low-TMB group.
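The stratification described in this caption (split at the median risk score, then estimate survival per group) can be illustrated with a short Python sketch. This is a plain product-limit (Kaplan-Meier) estimator written for illustration, not the authors' code. The function names and the toy scores, follow-up times and event flags are all hypothetical, and the handling of scores exactly equal to the median (assigned to the low group here) is an assumption.

```python
from statistics import median


def split_at_median(scores):
    """Label each score 'high' if above the median, otherwise 'low' (assumed tie rule)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct event time.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Hypothetical toy cohort: risk score, follow-up in months, event flag.
scores = [0.2, 0.9, 0.4, 1.3, 0.7, 1.1]
months = [30, 8, 24, 5, 20, 12]
died = [0, 1, 1, 1, 0, 1]

groups = split_at_median(scores)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([months[i] for i in idx], [died[i] for i in idx])
    print(label, curve)
```

A formal comparison of the two curves would normally add a log-rank test, which is omitted here to keep the sketch short.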
Figure 2. Expression of DEGs in the high-TMB and low-TMB groups.
Verification of DEGs using TMB evaluated by different analysis pipelines.
| Gene | SomaticSniper logFC | SomaticSniper pValue | MuTect logFC | MuTect pValue | Muse logFC | Muse pValue |
|---|---|---|---|---|---|---|
| CD22 | − 0.738376 | 1.22E−09 | − 0.8605324 | 0.00718201 | − 0.4142631 | 0.00088519 |
| SELE | − 1.3754053 | 7.29E−05 | − 1.0018508 | 0.01206434 | − 0.5736792 | 0.00823645 |
| PIGR | − 1.1479969 | 1.06E−06 | − 0.7435242 | 0.00391641 | − 1.0187546 | 0.00018397 |
| TLR10 | − 1.0942878 | 1.68E−07 | − 0.9334027 | 0.03117443 | − 0.9074117 | 0.01194325 |
| FAM107A | − 1.2206673 | 1.58E−07 | − 0.6545071 | 0.00402084 | − 0.860165 | 0.00012132 |
| IGLL1 | − 2.1140051 | 2.21E−07 | − 0.9500279 | 0.02153361 | − 1.5941 | 0.01274359 |
GO enrichment analysis of TMB-associated DEGs in LUSC.
| Description | | geneID |
|---|---|---|
| Lymphocyte proliferation | 0.007 | MS4A1/TNFSF8/SCGB1A1/CD22/CCL19 |
| Mononuclear cell proliferation | 0.007 | MS4A1/TNFSF8/SCGB1A1/CD22/CCL19 |
| Regulation of lymphocyte activation | 0.007 | IGLL1/TNFSF8/INHA/SCGB1A1/CD22/CCL19 |
| Leukocyte proliferation | 0.007 | MS4A1/TNFSF8/SCGB1A1/CD22/CCL19 |
| Regulation of lymphocyte proliferation | 0.018 | TNFSF8/SCGB1A1/CD22/CCL19 |
| Regulation of mononuclear cell proliferation | 0.018 | TNFSF8/SCGB1A1/CD22/CCL19 |
| Regulation of leukocyte proliferation | 0.020 | TNFSF8/SCGB1A1/CD22/CCL19 |
| Negative regulation of immune system process | 0.026 | BPIFB1/INHA/SCGB1A1/CD22/APOA1 |
| Regulation of endocytosis | 0.037 | SELE/CD22/CCL19/APOA1 |
| Adrenal gland development | 0.037 | NR5A1/APOA1 |
| B cell receptor signaling pathway | 0.038 | MS4A1/IGLL1/CD22 |
| B cell activation | 0.038 | MS4A1/IGLL1/INHA/CD22 |
| Response to tumor necrosis factor | 0.038 | SELE/TNFSF8/CCL19/APOA1 |
| Negative regulation of production of molecular mediator of immune response | 0.041 | CD22/APOA1 |
| Negative regulation of interferon-gamma production | 0.041 | INHA/SCGB1A1 |
| Positive regulation of endocytosis | 0.041 | SELE/CCL19/APOA1 |
| Mucosal immune response | 0.041 | BPIFB1/PIGR |
| Lymphocyte differentiation | 0.041 | MS4A1/TNFSF8/INHA/CCL19 |
| Regulation of T cell proliferation | 0.041 | TNFSF8/SCGB1A1/CCL19 |
Figure 3. GO enrichment analysis of the DEGs between high- and low-TMB groups.
Figure 4. GSEA was performed to further screen for significant pathways between the high-TMB and low-TMB groups. A q-value < 0.05 was considered significant. (a) Significant pathway identified in the high-TMB group. (b) Significant pathway identified in the low-TMB group.
Cox proportional hazards model analysis of prognostic factors.
| Variables | Univariate HR | Univariate 95% CI | Univariate P value | Multivariate HR | Multivariate 95% CI | Multivariate P value |
|---|---|---|---|---|---|---|
| FAM107A | 1.015 | 1.01–1.025 | 0.0035 | 1.0156 | 1.00–1.03 | 0.044 |
| IGLL1 | 1.062 | 1.01–1.113 | 0.0124 | 1.0723 | 1.02–1.127 | 0.006 |
| SELE | 1.025 | 0.99–1.054 | 0.05 | 1.0006 | 1.01–1.045 | 0.012 |
| PIGR | 1.011 | 0.98–1.023 | 0.542 | 0.9977 | 0.99–1.004 | 0.66 |
| ADH1B | 1.022 | 0.097–1.02 | 0.548 | 1.1697 | 0.99–1.007 | 0.62 |
| T | 1.94 | 0.95–3.959 | 0.068 | 1.8135 | 0.767–1.78 | 0.03 |
| M | 1.85 | 0.59–5.841 | 0.291 | 1.7957 | 1.05–3.12 | 0.47 |
| N | 1.156 | 0.84–1.564 | 0.379 | 1.2321 | 0.84–3.86 | 0.12 |
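A Cox model reports each hazard ratio as HR = exp(β), so the regression coefficient can be recovered as β = ln(HR). The Python sketch below uses the multivariate HRs printed in the table for FAM107A, IGLL1 and SELE to illustrate how a three-gene risk score could be computed as a linear combination of expression values. The source does not give the exact formula, so the form of the score, the function name `risk_score`, and the patient expression values are assumptions made for illustration only.

```python
import math
from statistics import median

# Multivariate hazard ratios as printed in the Cox table.
HAZARD_RATIOS = {"FAM107A": 1.0156, "IGLL1": 1.0723, "SELE": 1.0006}

# Cox coefficient for each gene: beta = ln(HR).
COEFFICIENTS = {gene: math.log(hr) for gene, hr in HAZARD_RATIOS.items()}


def risk_score(expression):
    """Assumed signature: sum of coefficient * expression over the three genes."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


# Hypothetical expression values for four patients (not from the source).
patients = {
    "P1": {"FAM107A": 10.0, "IGLL1": 2.0, "SELE": 5.0},
    "P2": {"FAM107A": 40.0, "IGLL1": 8.0, "SELE": 20.0},
    "P3": {"FAM107A": 5.0, "IGLL1": 1.0, "SELE": 2.0},
    "P4": {"FAM107A": 25.0, "IGLL1": 12.0, "SELE": 9.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
cutoff = median(scores.values())
risk_groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}

for pid in patients:
    print(pid, round(scores[pid], 4), risk_groups[pid])
```

Because all three HRs are greater than 1, every coefficient is positive, so higher expression of any of the three genes raises the score under this assumed formula.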
Figure 6. A prognostic nomogram predicting 1-, 2-, and 3-year overall survival of LUSC patients.
Figure 7. ROC curves for 1-, 3-, and 5-year overall survival predictions of the nomogram. (a) ROC curves for 1-year, 3-year and 5-year overall survival in the training cohort. (b) ROC curves for 1-year, 3-year and 5-year overall survival in the validation cohort.