Yifan Wang1,2,3, Ying Wang1,5,6, Ying Wang1,5,6, Yongjun Zhang7,8,9.
Abstract
Non-small cell lung cancer (NSCLC) is a common malignancy with a very poor outcome. Accurate prediction of prognosis can better guide patient risk stratification and treatment decisions, and could improve outcomes. Using clinical and methylation/expression data from The Cancer Genome Atlas (TCGA), we comprehensively evaluated early-stage NSCLC to identify a methylation signature for survival prediction. A total of 349 qualifying NSCLC cases treated with curative surgery were included and divided into training and validation cohorts. We identified 4000 methylation loci with prognostic influence on univariate and multivariate regression analysis in the training cohort. KEGG pathway analysis was conducted to identify the key pathways. Hierarchical clustering and WGCNA co-expression analysis were performed to classify the sample phenotypes and molecular subtypes. Hub 5'-C-phosphate-G-3' (CpG) loci were identified by network analysis and then used to construct the prognostic signature. The predictive power of the prognostic model was then tested in the validation cohort. Based on clustering analysis, we identified 6 clinical molecular subtypes, which were associated with different clinical characteristics and overall survival; clusters 4 and 6 demonstrated the best and worst outcomes, respectively. We identified 17 hub CpG loci, and their weighted combination was used to establish a prognostic model (RiskScore). The RiskScore correlated significantly with post-surgical outcome; patients with a higher RiskScore had worse overall survival in both the training and validation cohorts (P < 0.01). We developed a novel methylation signature that can reliably predict prognosis for patients with NSCLC.
Year: 2020 PMID: 32444802 PMCID: PMC7244759 DOI: 10.1038/s41598-020-65479-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Basic clinical information of Training cohort and Validation cohort.
| Characteristic | Category | Training cohort | Validation cohort | P value (Chi-square) |
|---|---|---|---|---|
| Gender | Male | 129 | 129 | 0.9282 |
| | Female | 45 | 46 | |
| Age (year) | 40–49 | 7 | 7 | 0.7341 |
| | 50–59 | 24 | 26 | |
| | 60–69 | 53 | 63 | |
| | 70–79 | 79 | 66 | |
| | 80–84 | 8 | 11 | |
| | Not Available | 3 | 2 | |
| T | T1 | 18 | 13 | 0.5684 |
| | T1a | 9 | 11 | |
| | T1b | 17 | 18 | |
| | T2 | 38 | 46 | |
| | T2a | 43 | 35 | |
| | T2b | 17 | 15 | |
| | T3 | 29 | 28 | |
| | T4 | 3 | 9 | |
| N | N0 | 112 | 111 | 0.4453 |
| | N1 | 44 | 50 | |
| | N2 | 17 | 11 | |
| | NX | 1 | 3 | |
| M | M0 | 138 | 135 | 0.5505 |
| | M1 | 0 | 1 | |
| | M1a | 0 | 1 | |
| | M1b | 0 | 1 | |
| | MX | 36 | 37 | |
| Smoking | Smoked | 156 | 153 | 0.5138 |
| | Not Available | 18 | 22 | |
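
The P values in this table come from chi-square tests comparing the two cohorts. As a minimal sketch, assuming a Pearson chi-square test without continuity correction (the source does not say which variant was used), the gender comparison can be checked in plain Python. The function name `chi_square_2x2` is illustrative.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns (statistic, p_value). With 1 degree of freedom, the upper tail
    of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Gender counts from the table: rows = male/female, columns = training/validation.
gender = [[129, 129], [45, 46]]
stat, p = chi_square_2x2(gender)
print(f"chi2 = {stat:.4f}, p = {p:.4f}")
```

Under that assumption the gender comparison gives a P value of about 0.93, in line with the 0.9282 reported. Characteristics with more than two categories have more degrees of freedom and would need a general chi-square tail function.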
Figure 1. (A) Cumulative distribution function (CDF) curve; (B) CDF delta area curve of consensus clustering, with the x-axis representing the number of categories k and the y-axis denoting the relative change in the area under the CDF curve for k categories compared with k − 1.
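
The delta area plotted in panel B can be illustrated with a short sketch. The areas below are hypothetical placeholders, not values from the study, and using the area itself for the smallest k (which has no predecessor) is an assumed convention borrowed from common consensus-clustering implementations.

```python
def delta_area(areas_by_k):
    """Relative change in area under the consensus CDF between k - 1 and k.

    `areas_by_k` maps k (number of clusters) to the area under its CDF curve.
    The smallest k has no predecessor, so its area is used as-is
    (an assumed convention).
    """
    ks = sorted(areas_by_k)
    deltas = {ks[0]: areas_by_k[ks[0]]}
    for prev_k, k in zip(ks, ks[1:]):
        prev_area = areas_by_k[prev_k]
        deltas[k] = (areas_by_k[k] - prev_area) / prev_area
    return deltas

# Hypothetical areas for k = 2..6 (illustrative only).
example_areas = {2: 0.40, 3: 0.50, 4: 0.55, 5: 0.57, 6: 0.58}
deltas = delta_area(example_areas)
for k in sorted(deltas):
    print(k, round(deltas[k], 4))
```

A value of k is usually considered adequate once the delta area levels off, meaning that adding another cluster changes the consensus CDF very little.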
Figure 2. (A,B) Network topology analysis for different soft-thresholding powers; (C) gene dendrogram and module colors; and (D) correlation between gene modules and characteristic clusters.
Top 20 methylation loci with significant prognostic influence.
| CpGs | P value | HR | Low 95%CI | High 95%CI |
|---|---|---|---|---|
| cg15804782 | 4.86E-07 | 8.59E+14 | 1.31E+09 | 5.64E+20 |
| cg05767633 | 7.58E-07 | 8.45E+18 | 2.67E+11 | 2.67E+26 |
| cg09038676 | 8.63E-07 | 3166428 | 8152.608 | 1.23E+09 |
| cg01097611 | 1.12E-06 | 1.44E+10 | 1177287 | 1.77E+14 |
| cg21348997 | 1.55E-06 | 1.52E+12 | 16342722 | 1.42E+17 |
| cg04216397 | 2.59E-06 | 1.27E+20 | 5.29E+11 | 3.07E+28 |
| cg06894812 | 4.25E-06 | 2.23E+12 | 12164167 | 4.08E+17 |
| cg05324014 | 4.45E-06 | 6.32E-07 | 1.42E-09 | 0.000281 |
| cg27628312 | 7.92E-06 | 3.82E+18 | 2.69E+10 | 5.42E+26 |
| cg09110402 | 9.79E-06 | 85.08886 | 11.87047 | 609.9263 |
| cg02726924 | 1.12E-05 | 3.09E+19 | 6.19E+10 | 1.54E+28 |
| cg22294241 | 2.43E-05 | 654985 | 1305.684 | 3.29E+08 |
| cg26820911 | 2.46E-05 | 1.88E+11 | 1086987 | 3.26E+16 |
| cg00191629 | 2.59E-05 | 3545.787 | 78.70704 | 159739.3 |
| cg06742044 | 2.86E-05 | 2.62E+10 | 345090.2 | 1.99E+15 |
| cg17074000 | 3.11E-05 | 82.28059 | 10.32936 | 655.4223 |
| cg10070969 | 3.29E-05 | 108264.7 | 455.2979 | 25744130 |
| cg03862040 | 3.48E-05 | 7.86E+31 | 6.22E+16 | 9.94E+46 |
| cg02442412 | 3.51E-05 | 18951994 | 6766.073 | 5.31E+10 |
| cg26944011 | 3.68E-05 | 7.62E+09 | 154362.6 | 3.76E+14 |
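
The hazard ratios and confidence limits in this table can be related back to the underlying regression coefficient. As a sketch, assuming each interval is a standard Wald interval on the log-hazard scale, exp(β ± 1.96·SE) (the source does not state this), the coefficient, its standard error, and a two-sided Wald P value can be recovered from a single row. The function name `wald_from_hr` is illustrative.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

def wald_from_hr(hr, ci_low, ci_high):
    """Recover log-HR, standard error, z and two-sided P from an HR and its 95% CI.

    Assumes a Wald interval: CI = exp(beta +/- Z_975 * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, z, p_value

# Row for cg09110402: HR 85.08886, 95% CI 11.87047 to 609.9263.
beta, se, z, p = wald_from_hr(85.08886, 11.87047, 609.9263)
print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

For this row the recovered P value is close to the 9.79E-06 listed for that locus, which is consistent with the first numeric column holding Wald P values.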
Figure 3. (A) Clustering heatmap for consensus k = 6; (B) methylation heatmap of the 4000 methylation loci in the training cohort.
Figure 4. (A) Prognostic differences among the 6 models; (B) proportion of different T stages in the 6 models; (C) proportion of different N stages in the 6 models; (D) proportion of different clinical stages in the 6 models; and (E) age distribution in the 6 models.
Figure 5. (A) KEGG pathway enrichment analysis of the 4000 methylation loci with prognostic significance; (B) expression profile of the 2747 genes corresponding to the 4000 methylation loci with prognostic significance.
Number of CpG loci in each module.
| Module | CpG count |
|---|---|
| Brown | 187 |
| Green | 144 |
| Greenish yellow | 62 |
| Magenta | 93 |
| Pink | 99 |
| Purple | 71 |
| Red | 117 |
| Tan | 35 |
| Turquoise | 570 |
| Yellow | 145 |
| Black | 111 |
| Blue | 199 |
Figure 6. (A) Association between hub CpG loci and different modules; (B) association between hub CpG loci and characteristic clusters.
Annotation information of the 17 hub CpG loci.
| CpG | Chrom | Start | End | GeneSymbol | Feature_Type | MM | GS | Module |
|---|---|---|---|---|---|---|---|---|
| cg02606808 | chr5 | 72107675 | 72107676 | MAP1B | Island | 0.947354 | 0.24915 | black |
| cg19940437 | chr14 | 89954878 | 89954879 | EFCAB11 | S_Shore | 0.934014 | 0.275199 | tan |
| cg18901116 | chr10 | 71397101 | 71397102 | CDH23 | Island | 0.939698 | 0.259864 | tan |
| cg19940437 | chr14 | 89954878 | 89954879 | TDP1 | S_Shore | 0.934014 | 0.275199 | tan |
| cg00919016 | chr7 | 1.39E+08 | 1.39E+08 | KLRG2 | Island | 0.930312 | 0.289498 | black |
| cg25191850 | chr1 | 2.34E+08 | 2.34E+08 | KCNK1 | Island | 0.900188 | 0.49867 | turquoise |
| cg14831838 | chr2 | 2.19E+08 | 2.19E+08 | CDK5R2 | Island | 0.909691 | 0.264816 | black |
| cg26682866 | chr2 | 2.19E+08 | 2.19E+08 | CDK5R2 | Island | 0.935384 | 0.262505 | black |
| cg19584875 | chr14 | 90061869 | 90061870 | KCNK13 | Island | 0.942315 | 0.279599 | tan |
| cg21231789 | chr14 | 90061855 | 90061856 | KCNK13 | Island | 0.932377 | 0.254928 | tan |
| cg16581536 | chr14 | 37595644 | 37595645 | TTC6 | Island | 0.916525 | 0.46462 | turquoise |
| cg16581536 | chr14 | 37595644 | 37595645 | FOXA1 | Island | 0.916525 | 0.46462 | turquoise |
| cg06706183 | chr6 | 53545058 | 53545059 | GCLC | Island | 0.923999 | 0.204057 | black |
| cg26752263 | chr6 | 53545055 | 53545056 | GCLC | Island | 0.930967 | 0.217814 | black |
| cg19940437 | chr14 | 89954878 | 89954879 | RP11-33N16.3 | S_Shore | 0.934014 | 0.275199 | tan |
| cg23466060 | chr4 | 13544858 | 13544859 | NKX3-2 | Island | 0.908867 | 0.326585 | black |
| cg06061966 | chr11 | 46345093 | 46345094 | DGKZ | N_Shore | 0.905048 | 0.551368 | green |
| cg01244124 | chr15 | 70763776 | 70763777 | UACA | Island | 0.925507 | 0.263363 | black |
| cg09272849 | chr15 | 70763496 | 70763497 | UACA | Island | 0.917523 | 0.35689 | black |
| cg07436991 | chr20 | 11890663 | 11890664 | BTBD3 | N_Shore | 0.911026 | 0.45042 | turquoise |
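
The table has 20 rows but describes 17 loci, because a CpG annotated to more than one gene is listed once per gene. A small sketch that collapses the rows by CpG ID makes this explicit. The tuple layout and variable names are illustrative choices; the values are copied from the table.

```python
from collections import Counter, defaultdict

# (CpG ID, gene symbol, module) copied from the annotation table.
ROWS = [
    ("cg02606808", "MAP1B", "black"),
    ("cg19940437", "EFCAB11", "tan"),
    ("cg18901116", "CDH23", "tan"),
    ("cg19940437", "TDP1", "tan"),
    ("cg00919016", "KLRG2", "black"),
    ("cg25191850", "KCNK1", "turquoise"),
    ("cg14831838", "CDK5R2", "black"),
    ("cg26682866", "CDK5R2", "black"),
    ("cg19584875", "KCNK13", "tan"),
    ("cg21231789", "KCNK13", "tan"),
    ("cg16581536", "TTC6", "turquoise"),
    ("cg16581536", "FOXA1", "turquoise"),
    ("cg06706183", "GCLC", "black"),
    ("cg26752263", "GCLC", "black"),
    ("cg19940437", "RP11-33N16.3", "tan"),
    ("cg23466060", "NKX3-2", "black"),
    ("cg06061966", "DGKZ", "green"),
    ("cg01244124", "UACA", "black"),
    ("cg09272849", "UACA", "black"),
    ("cg07436991", "BTBD3", "turquoise"),
]

genes_by_cpg = defaultdict(list)  # CpG ID -> genes it is annotated to
module_by_cpg = {}                # CpG ID -> module
for cpg, gene, module in ROWS:
    genes_by_cpg[cpg].append(gene)
    module_by_cpg[cpg] = module

# Loci that appear on more than one row because they map to several genes.
multi_gene = {cpg: genes for cpg, genes in genes_by_cpg.items() if len(genes) > 1}
# Number of unique loci per module.
module_counts = Counter(module_by_cpg.values())

print(len(ROWS), "rows,", len(genes_by_cpg), "unique CpG loci")
print(multi_gene)
print(dict(module_counts))
```

Counting unique IDs this way gives 17 loci, with cg19940437 and cg16581536 accounting for the three extra rows.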
Figure 7. (A) Correlation of RiskScore with the methylation pattern and overall survival in the training cohort; (B) Kaplan–Meier survival analysis of patients with high RiskScore vs low RiskScore in the training cohort.
Figure 8. (A) Correlation of RiskScore with the methylation pattern and overall survival in the validation cohort; (B) Kaplan–Meier survival analysis of patients with high RiskScore vs low RiskScore in the validation cohort.
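
The RiskScore shown in Figures 7 and 8 is described as a weighted combination of the hub CpG loci. A minimal sketch of that idea follows, assuming a linear score (weight times methylation value, summed over loci) and a median cutoff for the high and low groups. Neither the weights nor the cutoff rule is given here, so the locus names, weights, and patient values below are all hypothetical.

```python
from statistics import median

# Hypothetical weights (for example, regression coefficients); not from the study.
WEIGHTS = {"cpg_a": 1.2, "cpg_b": -0.8, "cpg_c": 0.5}

def risk_score(methylation, weights=WEIGHTS):
    """Weighted sum of methylation values over the loci in `weights`."""
    return sum(w * methylation[locus] for locus, w in weights.items())

def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical methylation values (0 to 1) for four patients.
patients = {
    "P1": {"cpg_a": 0.9, "cpg_b": 0.1, "cpg_c": 0.6},
    "P2": {"cpg_a": 0.2, "cpg_b": 0.7, "cpg_c": 0.3},
    "P3": {"cpg_a": 0.5, "cpg_b": 0.5, "cpg_c": 0.5},
    "P4": {"cpg_a": 0.8, "cpg_b": 0.3, "cpg_c": 0.9},
}
scores = {pid: risk_score(m) for pid, m in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

The resulting high and low groups are what a Kaplan–Meier analysis, as in panel B of each figure, would then compare.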