Hongxia Ma, Lihong Tong, Qian Zhang, Wenjun Chang, Fengsen Li.
Abstract
BACKGROUND: Lung squamous cell carcinoma (LSCC) is a frequently diagnosed cancer worldwide with a poor prognosis. This study aimed to develop a prognostic model for LSCC by integrating multiomics data (transcriptome, copy number variation, and mutation data) in order to predict patient survival and identify new therapeutic targets.
Year: 2020 PMID: 32596344 PMCID: PMC7298313 DOI: 10.1155/2020/6427483
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Clinical characteristics of the three datasets.
| Characteristic | Category | TCGA training datasets | TCGA all datasets | GSE42127 |
|---|---|---|---|---|
| Age (years) | ≤50 | 8 | 19 | 0 |
| | >50 | 238 | 470 | 43 |
| Survival status | Living | 144 | 282 | 22 |
| | Dead | 103 | 212 | 21 |
| Gender | Female | 62 | 128 | 18 |
| | Male | 185 | 366 | 25 |
| Smoke years | ≤20 | 11 | 21 | |
| | >20 | 94 | 199 | |
| pathologic_T | T1 | 54 | 114 | |
| | T2 | 146 | 287 | |
| | T3 | 36 | 70 | |
| | T4 | 11 | 23 | |
| pathologic_N | N0 | 158 | 316 | |
| | N1 | 67 | 127 | |
| | N2 | 18 | 40 | |
| | N3 | 1 | 5 | |
| pathologic_M | M0 | 194 | 406 | |
| | M1/MX | 50 | 84 | |
| Tumor stage | Stage I | 120 | 242 | 23 |
| | Stage II | 81 | 158 | 10 |
| | Stage III | 41 | 83 | 10 |
| | Stage IV | 4 | 7 | |
Top 20 prognosis-related gene information.
| ENSG ID | HR | Coefficient | z-score | p value |
|---|---|---|---|---|
| ENSG00000229859 | 1.381 | 0.323 | 4.186 | 2.84E-05 |
| ENSG00000133055 | 1.293 | 0.257 | 4.070 | 4.71E-05 |
| ENSG00000249158 | 1.386 | 0.327 | 3.952 | 7.74E-05 |
| ENSG00000100632 | 0.645 | -0.439 | -3.904 | 9.48E-05 |
| ENSG00000188467 | 1.456 | 0.376 | 3.821 | 0.000132923 |
| ENSG00000080511 | 1.283 | 0.249 | 3.794 | 0.000148025 |
| ENSG00000069509 | 0.673 | -0.396 | -3.780 | 0.000156912 |
| ENSG00000187733 | 1.265 | 0.235 | 3.759 | 0.000170307 |
| ENSG00000072657 | 1.348 | 0.298 | 3.730 | 0.000191851 |
| ENSG00000099937 | 1.341 | 0.293 | 3.718 | 0.000200903 |
| ENSG00000126752 | 1.267 | 0.237 | 3.706 | 0.000210942 |
| ENSG00000179520 | 1.244 | 0.218 | 3.702 | 0.000213691 |
| ENSG00000100994 | 1.430 | 0.358 | 3.462 | 0.000535801 |
| ENSG00000041353 | 1.409 | 0.343 | 3.448 | 0.0005653 |
| ENSG00000162551 | 1.412 | 0.345 | 3.392 | 0.000694076 |
| ENSG00000271447 | 1.328 | 0.284 | 3.383 | 0.000717711 |
| ENSG00000165762 | 1.239 | 0.214 | 3.356 | 0.000791736 |
| ENSG00000172789 | 1.306 | 0.267 | 3.335 | 0.000852719 |
| ENSG00000105650 | 1.276 | 0.244 | 3.306 | 0.000947676 |
| ENSG00000253537 | 1.371 | 0.316 | 3.305 | 0.000949615 |
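
The columns of the table above can be cross-checked against one another. The following is a minimal sketch, assuming that HR is the exponential of the Cox coefficient and that the last two columns are a Wald z statistic and its two-sided normal p value; the source does not state this explicitly, and the function names `hazard_ratio` and `two_sided_p` are hypothetical helpers introduced only for illustration.

```python
import math


def hazard_ratio(coefficient: float) -> float:
    """Hazard ratio implied by a Cox regression coefficient: HR = exp(beta)."""
    return math.exp(coefficient)


def two_sided_p(z: float) -> float:
    """Two-sided p value of a standard normal (Wald) z statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values copied from the ENSG00000188467 row of the table above.
coefficient, z = 0.376, 3.821

hr = hazard_ratio(coefficient)
p = two_sided_p(z)
print(f"HR = {hr:.3f}, p = {p:.6f}")
```

Under these assumptions the computed values should agree with the tabulated HR and p value up to rounding of the published coefficient and z statistic.
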
Figure 1. (a) Significantly amplified fragments of the lung squamous cell carcinoma (LSCC) genome. (b) Significantly deleted fragments of the LSCC genome. (c) Distribution of the 50 genes with the most significant p values in patients with LSCC; the bar chart at the top shows the total number of synonymous and nonsynonymous mutations in the 50 genes for each patient, and the bar chart at the right shows the number of mutations in the 50 genes across all samples.
Figure 2. (a) KEGG pathways involving the 1385 genes with copy number variation and mutation. (b) GO biological processes (GO BP) involving the 1385 genes with copy number variation and mutation.
Figure 3. (a) Relationship between the error rate and the number of classification trees. (b) Out-of-bag importance ranking of the 5 genes. (c) KM survival curves of the 5-gene signature in the TCGA training set. (d) ROC curve and AUC of the 5-gene signature classification. (e) Risk score, survival time, survival status, and expression of the 5 genes in the TCGA training set.
Five genes significantly associated with the overall survival in the training-set patients.
| Ensembl gene ID | Symbol | HR | z-score | p value | Importance | Relative importance |
|---|---|---|---|---|---|---|
| ENSG00000275713 | HIST1H2BH | 0.69 | -3.239003 | 1.20E-03 | 0.0178 | 1 |
| ENSG00000099937 | SERPIND1 | 1.34 | 3.717879 | 2.01E-04 | 0.0115 | 0.648 |
| ENSG00000169436 | COL22A1 | 1.29 | 2.732945 | 6.28E-03 | 0.0105 | 0.593 |
| ENSG00000244057 | LCE3C | 1.26 | 3.234898 | 1.22E-03 | 0.0093 | 0.5216 |
| ENSG00000140470 | ADAMTS17 | 0.74 | -2.686823 | 7.21E-03 | 0.0049 | 0.2729 |
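
The relationship between the last two columns of the table above can be illustrated with a short sketch. It assumes that relative importance is each gene's importance divided by the largest importance in the set; the source does not define the quantity, so this is an interpretation, and `relative_importance` is a hypothetical helper name.

```python
# Importance values copied from the five-gene table above.
importance = {
    "HIST1H2BH": 0.0178,
    "SERPIND1": 0.0115,
    "COL22A1": 0.0105,
    "LCE3C": 0.0093,
    "ADAMTS17": 0.0049,
}


def relative_importance(scores: dict) -> dict:
    """Scale every score by the largest one (assumed definition)."""
    top = max(scores.values())
    return {gene: value / top for gene, value in scores.items()}


relative = relative_importance(importance)

# Print genes from most to least important.
for gene, value in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{gene}: {value:.3f}")
```

The results match the tabulated relative importance only approximately, which would be expected if the published importance values were rounded to four decimals before being reported.
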
Figure 4. (a) Kaplan-Meier (KM) survival curves of the 5-gene signature in the TCGA test set. (b) ROC curve and AUC of the 5-gene signature classification. (c) Risk score, survival time, survival status, and expression of the 5 genes in the TCGA test set.
Figure 5. (a) KM survival curves of the 5-gene signature in GSE42127. (b) ROC curve and AUC of the 5-gene signature classification. (c) Risk score, survival time, survival status, and expression of the 5 genes in GSE42127.
Figure 6. (a) ROC curve and AUC of the 5-gene signature in GSE37745. (b) Risk score, survival time, survival status, and expression of the 5 genes in GSE37745.
Univariate and multivariate Cox regression analysis to identify prognostic clinical factors and clinical independence.
| Variables | Univariate HR | Univariate 95% CI of HR | Univariate p value | Multivariable HR | Multivariable 95% CI of HR | Multivariable p value |
|---|---|---|---|---|---|---|
| TCGA training datasets | | | | | | |
| 5-gene risk score | | | | | | |
| Low-risk group | 1 (reference) | | | 1 (reference) | | |
| High-risk group | 2.72 | 2.01-3.66 | 5.310 | 2.27 | 1.30601-3.936 | 0.004 |
| Age | 1.00 | 0.97-1.02 | 9.980 | 1.03 | 0.97001-1.093 | 0.33624 |
| Gender female | 1 (reference) | | | 1 (reference) | | |
| Gender male | 0.92 | 0.58-1.43 | 0.70 | 0.63 | 0.30321-1.301 | 0.21085 |
| Smoke years | 0.98 | 0.96-1.01 | 0.37 | 0.99 | 0.96278-1.015 | 0.40262 |
| Pathologic T1 | 1 (reference) | | | 1 (reference) | | |
| Pathologic T2 | 1.41 | 0.8314-2.381 | 0.20341 | 1.48 | 0.52948-4.114 | 0.45674 |
| Pathologic T3 | 1.69 | 0.8693-3.283 | 0.122 | 1.74 | 0.2188-13.78 | 0.602 |
| Pathologic T4 | 3.90 | 1.4234-10.69 | 0.008 | 13.87 | 0.25483-754.666 | 0.197 |
| Pathologic N0 | 1 (reference) | | | 1 (reference) | | |
| Pathologic N1 | 1.21 | 0.7879-1.857 | 0.384 | 0.54 | 0.10297-2.854 | 0.470 |
| Pathologic N2/N3 | 0.77 | 0.3517-1.686 | 0.513 | 0.91 | 0.02173-37.89 | 0.959 |
| Pathologic M0 | 1 (reference) | | | 1 (reference) | | |
| Pathologic M1/MX | 1.70 | 1.059-2.717 | 2.80E-02 | 3.65 | 1.58507-8.395 | 2.34 |
| Tumor stage I | 1 (reference) | | | 1 (reference) | | |
| Tumor stage II | 1.02 | 0.6474-1.615 | 0.924 | 1.46 | 0.32211-6.599 | 0.62452 |
| Tumor stage III | 1.27 | 0.7519-2.132 | 0.3749 | 0.89 | 0.0297-26.834 | 0.94791 |
| Tumor stage IV | 3.23 | 0.9983-10.429 | 0.050 | 1.31 | 0.08057-21.187 | 0.85 |
| | | | | | | |
| 5-gene risk score | | | | | | |
| Low-risk group | 1 (reference) | | | 1 (reference) | | |
| High-risk group | 1.85 | 1.501-2.283 | 8.89 | 1.737 | 1.2456-2.423 | 1.14 |
| Age | 1.02 | 0.9995-1.033 | 0.058 | 1.029 | 0.9916-1.067 | 0.132 |
| Gender female | 1 (reference) | | | 1 (reference) | | |
| Gender male | 1.20 | 0.8669-1.646 | 0.277 | 1.240 | 0.7443-2.067 | 0.408 |
| Smoke years | 0.99 | 0.9739-1.007 | 0.26 | 0.99 | 0.973-1.01 | 0.349 |
| Pathologic T1 | 1 (reference) | | | 1 (reference) | | |
| Pathologic T2 | 1.25 | 0.8779-1.765 | 0.219 | 0.89 | 0.4878-1.629 | 0.708 |
| Pathologic T3 | 1.82 | 1.1618-2.847 | 0.009 | 0.897 | 0.27-2.98 | 0.859 |
| Pathologic T4 | 2.32 | 1.2481-4.327 | 0.008 | 1.014 | 0.2338-4.396 | 0.985 |
| Pathologic N0 | 1 (reference) | | | 1 (reference) | | |
| Pathologic N1 | 1.07 | 0.7824-1.466 | 0.669 | 0.687 | 0.2744-1.721 | 0.423 |
| Pathologic N2 | 1.32 | 0.831-2.093 | 2.40 | 1.475 | 0.3304-6.588 | 0.611 |
| Pathologic N3 | 2.51 | 0.6183-10.212 | 1.98 | 3.030 | 0.3645-25.18 | 0.305 |
| Pathologic M0 | 1 (reference) | | | 1 (reference) | | |
| Pathologic M1 | 3.18 | 1.3-7.778 | 1.13 | 1.472 | 0.1424-15.22 | 0.746 |
| Pathologic MX | 1.55 | 1.049-2.299 | 0.028 | 2.250 | 1.1797-4.293 | 0.014 |
| Tumor stage I | 1 (reference) | | | 1 (reference) | | |
| Tumor stage II | 1.13 | 0.8234-1.559 | 4.43 | 1.011 | 0.4121-2.48 | 0.981 |
| Tumor stage III | 1.64 | 1.1622-2.311 | 4.84 | 1.351 | 0.2684-6.796 | 0.716 |
| | | | | | | |
| 5-gene risk score | | | | | | |
| Low-risk group | 1 (reference) | | | 1 (reference) | | |
| High-risk group | 2.07 | 1.133-3.778 | 0.018 | 2.33 | 1.1539-4.697 | 0.018 |
| Age | 1.03 | 0.9751-1.084 | 0.306 | 1.0127 | 0.9497-1.08 | 0.7002 |
| Gender female | 1 (reference) | | | 1 (reference) | | |
| Gender male | 1.196 | 0.4723-3.029 | 0.706 | 1.1037 | 0.4032-3.021 | 0.848 |
| Tumor stage I | 1 (reference) | | | 1 (reference) | | |
| Tumor stage II | 0.82 | 0.2549-2.643 | 0.741 | 1.00 | 0.2773-3.576 | 0.995 |
| Tumor stage III | 2.0311 | 0.7574-5.447 | 0.159 | 2.7395 | 0.9225-8.136 | 0.070 |
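
A reader who wants to sanity-check a row of the table above can approximate the Wald p value from the reported HR and its 95% CI. The sketch below assumes the interval was computed as exp(beta ± 1.96 × SE) on the log-hazard scale, which is the usual convention for Cox models but is not stated in the source; `wald_p_from_ci` is a hypothetical helper name.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_p_from_ci(hr: float, lower: float, upper: float):
    """Return (z, two-sided p) recovered from an HR and its 95% CI.

    Assumes the CI is symmetric around log(HR) on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Univariate values from the final high-risk group row of the table above.
z, p = wald_p_from_ci(2.07, 1.133, 3.778)
print(f"z = {z:.3f}, p = {p:.4f}")
```

Because the published HR and interval bounds are rounded, the recovered p value is only approximate; it is useful as a consistency check rather than as a replacement for the reported value.
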
Figure 7. Comparison of the 5-gene signature model with other existing models. (a) AUC and KM curves of the autophagy-related gene prognostic signature by Zhu et al. (b) AUC and KM curves of the immune-related signature by Zhang et al. (c) AUC and KM curves of the sixteen-gene prognostic biomarker by Zhang et al. (d) AUC and KM curves of the glycolysis-related gene signature by Zhang et al. (e) RMS curves of the four models and the 5-gene signature. (f) DCA curves of the four models and the 5-gene signature.
KEGG pathways significantly enriched in the high-risk and low-risk groups, as identified by GSEA.
| Name | Size | ES | NES | NOM p-val | FDR q-val | FWER p-val |
|---|---|---|---|---|---|---|
| KEGG_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION | 243 | -0.609 | -2.002 | 0.0001 | 0.055 | 0.037 |
| KEGG_LEISHMANIA_INFECTION | 64 | -0.709 | -1.985 | 0.0001 | 0.039 | 0.052 |
| KEGG_COMPLEMENT_AND_COAGULATION_CASCADES | 68 | -0.726 | -1.969 | 0.0001 | 0.035 | 0.064 |
| KEGG_HEMATOPOIETIC_CELL_LINEAGE | 84 | -0.708 | -1.967 | 0.0001 | 0.026 | 0.064 |
| KEGG_CELL_ADHESION_MOLECULES_CAMS | 122 | -0.623 | -1.961 | 0.002 | 0.023 | 0.07 |
| KEGG_VIRAL_MYOCARDITIS | 67 | -0.636 | -1.933 | 0.0001 | 0.028 | 0.1 |
| KEGG_LEUKOCYTE_TRANSENDOTHELIAL_MIGRATION | 108 | -0.551 | -1.920 | 0.002 | 0.029 | 0.115 |
| KEGG_AUTOIMMUNE_THYROID_DISEASE | 49 | -0.710 | -1.817 | 0.002 | 0.075 | 0.259 |
| KEGG_CHEMOKINE_SIGNALING_PATHWAY | 178 | -0.491 | -1.799 | 0.020 | 0.080 | 0.3 |
| KEGG_ASTHMA | 27 | -0.790 | -1.795 | 0.0001 | 0.075 | 0.306 |
| KEGG_TYPE_I_DIABETES_MELLITUS | 40 | -0.747 | -1.791 | 0.008 | 0.070 | 0.316 |
| KEGG_GLYCOSPHINGOLIPID_BIOSYNTHESIS_GANGLIO_SERIES | 14 | -0.681 | -1.732 | 0.008 | 0.107 | 0.454 |
| KEGG_INTESTINAL_IMMUNE_NETWORK_FOR_IGA_PRODUCTION | 45 | -0.727 | -1.720 | 0.009 | 0.109 | 0.476 |
| KEGG_NATURAL_KILLER_CELL_MEDIATED_CYTOTOXICITY | 127 | -0.510 | -1.710 | 0.018 | 0.112 | 0.505 |
| KEGG_ALLOGRAFT_REJECTION | 34 | -0.807 | -1.709 | 0.006 | 0.105 | 0.508 |
| KEGG_ECM_RECEPTOR_INTERACTION | 82 | -0.592 | -1.708 | 0.023 | 0.100 | 0.511 |
| KEGG_JAK_STAT_SIGNALING_PATHWAY | 147 | -0.468 | -1.701 | 0.014 | 0.100 | 0.525 |
| KEGG_ANTIGEN_PROCESSING_AND_PRESENTATION | 78 | -0.581 | -1.692 | 0.033 | 0.102 | 0.546 |
| KEGG_LYSOSOME | 115 | -0.483 | -1.683 | 0.024 | 0.103 | 0.562 |
| KEGG_GRAFT_VERSUS_HOST_DISEASE | 36 | -0.780 | -1.669 | 0.016 | 0.109 | 0.595 |
| KEGG_FOCAL_ADHESION | 189 | -0.484 | -1.664 | 0.031 | 0.109 | 0.607 |
| KEGG_PRION_DISEASES | 35 | -0.487 | -1.662 | 0.006 | 0.106 | 0.613 |
| KEGG_RENIN_ANGIOTENSIN_SYSTEM | 16 | -0.603 | -1.655 | 0.028 | 0.107 | 0.624 |
| KEGG_NOD_LIKE_RECEPTOR_SIGNALING_PATHWAY | 54 | -0.489 | -1.608 | 0.041 | 0.142 | 0.722 |
| KEGG_HISTIDINE_METABOLISM | 26 | -0.484 | -1.592 | 0.046 | 0.154 | 0.757 |
| KEGG_HYPERTROPHIC_CARDIOMYOPATHY_HCM | 82 | -0.445 | -1.583 | 0.037 | 0.157 | 0.776 |
| KEGG_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY | 99 | -0.441 | -1.574 | 0.048 | 0.160 | 0.787 |
| KEGG_OTHER_GLYCAN_DEGRADATION | 15 | -0.587 | -1.532 | 0.060 | 0.201 | 0.852 |
| KEGG_PATHOGENIC_ESCHERICHIA_COLI_INFECTION | 47 | -0.415 | -1.515 | 0.026 | 0.216 | 0.874 |
| KEGG_BASAL_TRANSCRIPTION_FACTORS | 34 | 0.584 | 1.895 | 0.004 | 0.141 | 0.131 |
| KEGG_NUCLEOTIDE_EXCISION_REPAIR | 44 | 0.607 | 1.855 | 0.004 | 0.124 | 0.2 |
| KEGG_CELL_CYCLE | 112 | 0.531 | 1.841 | 0.004 | 0.100 | 0.231 |
| KEGG_HOMOLOGOUS_RECOMBINATION | 24 | 0.700 | 1.831 | 0.004 | 0.082 | 0.246 |
| KEGG_SPLICEOSOME | 90 | 0.605 | 1.811 | 0.004 | 0.081 | 0.283 |
| KEGG_DNA_REPLICATION | 32 | 0.732 | 1.767 | 0.004 | 0.106 | 0.392 |
| KEGG_MISMATCH_REPAIR | 23 | 0.674 | 1.718 | 0.016 | 0.141 | 0.503 |
| KEGG_BASE_EXCISION_REPAIR | 33 | 0.609 | 1.680 | 0.024 | 0.165 | 0.59 |
| KEGG_RNA_DEGRADATION | 47 | 0.530 | 1.676 | 0.012 | 0.152 | 0.6 |
| KEGG_GLYCOSYLPHOSPHATIDYLINOSITOL_GPI_ANCHOR_BIOSYNTHESIS | 22 | 0.570 | 1.638 | 0.024 | 0.179 | 0.675 |
| KEGG_RNA_POLYMERASE | 28 | 0.556 | 1.622 | 0.033 | 0.181 | 0.711 |
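
The table above mixes pathways with negative and positive enrichment scores and a wide range of FDR values. The following sketch shows one way to split rows by the sign of NES and to apply an FDR cutoff. The cutoff of 0.05 is a hypothetical choice for illustration, since the source does not state the threshold used, and only a handful of rows are copied from the table.

```python
from typing import List, NamedTuple, Tuple


class Pathway(NamedTuple):
    name: str
    nes: float  # normalized enrichment score
    fdr: float  # FDR column of the table


# A few rows copied from the table above.
rows = [
    Pathway("KEGG_CYTOKINE_CYTOKINE_RECEPTOR_INTERACTION", -2.002, 0.055),
    Pathway("KEGG_LEISHMANIA_INFECTION", -1.985, 0.039),
    Pathway("KEGG_COMPLEMENT_AND_COAGULATION_CASCADES", -1.969, 0.035),
    Pathway("KEGG_CELL_CYCLE", 1.841, 0.100),
    Pathway("KEGG_HOMOLOGOUS_RECOMBINATION", 1.831, 0.082),
]


def filter_by_fdr(pathways: List[Pathway], cutoff: float) -> List[Pathway]:
    """Keep pathways whose FDR is strictly below the cutoff."""
    return [p for p in pathways if p.fdr < cutoff]


def split_by_direction(pathways: List[Pathway]) -> Tuple[List[str], List[str]]:
    """Return (names with positive NES, names with negative NES)."""
    positive = [p.name for p in pathways if p.nes > 0]
    negative = [p.name for p in pathways if p.nes < 0]
    return positive, negative


strict = filter_by_fdr(rows, 0.05)  # hypothetical cutoff, not from the source
positive, negative = split_by_direction(rows)
print([p.name for p in strict])
```

Which risk group corresponds to positive or negative NES depends on how the phenotype comparison was set up in GSEA, so the sketch only separates the two directions without labeling them.
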
Figure 8. Five-gene signature-enriched pathways in the high-risk and low-risk groups.