Jiabao Ma, Rui Li, Jie Wang.
Abstract
Head and neck squamous cell carcinoma (HNSCC) remains one of the most common malignancies associated with poor prognosis. DNA methylation has emerged as an important mechanism underlying the radio-resistance of tumors, but prognostic biomarkers based on radiotherapy-related aberrant DNA methylation are limited. Methylation profiles of 388 patients with HNSCC were acquired from The Cancer Genome Atlas (TCGA) portal. Genes with differentially methylated CpG sites (DMGs) were screened between patients with favorable and poor prognoses, with or without radiotherapy. A weighted gene co-methylation network was constructed using the Weighted Gene Co-expression Network Analysis (WGCNA) package. A lasso Cox-PH model was used to identify the optimal panel of genes able to predict survival in these patients. Prognostic performance of the multi-gene methylation signature was assessed in a training set and confirmed in a validation set. A total of 976 DMGs were observed between favorable and poor prognostic samples. Four DMG-enriched co-methylation modules were identified. The lasso Cox-PH model yielded a four-gene methylation signature consisting of ZNF10, TMPRSS12, ERGIC2, and RNF215. The risk score based on the four-gene signature divided both the training set and the validation set into two risk groups with significantly different overall survival. Thus, the present study revealed a radiotherapy-related four-gene methylation signature that predicts survival outcomes of patients with HNSCC and provides candidate therapeutic targets for novel therapy against HNSCC. However, substantial validation experiments are required.
Year: 2019 PMID: 31180552 PMCID: PMC6579992 DOI: 10.3892/mmr.2019.10294
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Clinical covariates of patients in the training set and the validation set.
| Clinical covariates | Training set (n=388) | Validation set (n=53) |
|---|---|---|
| Age (mean ± SD, years) | 60.81±11.66 | 49.36±13.47 |
| Sex (male/female) | 288/100 | 42/11 |
| Death (dead/alive) | 117/271 | 15/38 |
| OS time (mean ± SD, months) | 26.43±26.21 | 30.46±26.98 |
OS, overall survival; SD, standard deviation.
Figure 1. Analysis of DMGs between favorable and poor prognostic samples of the training set. (A) Volcano plot of effect size (log2[fold change]) against -log10(FDR) of DMGs. The red spots stand for DMGs with FDR <0.05 and |log2FC|>0.1, while black spots stand for non-significant DMGs. The horizontal green dashed line denotes FDR <0.05; the vertical green dashed lines denote |log2FC|>0.1. (B) Kernel density plot of log2(fold change) of DMGs. (C) Two-way hierarchical clustering analysis of favorable and poor prognostic samples based on methylation levels of the top 100 DMGs. Color mapping from green to red indicates methylation level from low to high. DMGs, genes with differentially methylated CpG sites; FDR, false discovery rate.
Top 20 DMGs between favorable and poor prognostic samples.
| ID | Chr | Position | Gene | Location | β-favorable | β-poor | Effect | Pnominal | FDR |
|---|---|---|---|---|---|---|---|---|---|
| cg26934993 | chr17 | 37528248 | KAT2A | TSS200 | 0.3667 | 0.6062 | −0.7253 | 5.43×10⁻⁸ | 3.980×10⁻⁶ |
| cg26370886 | chr19 | 53930240 | RASIP1 | Body | 0.3655 | 0.5390 | −0.5604 | 5.12×10⁻⁷ | 3.750×10⁻⁵ |
| cg24351916 | chr3 | 39427639 | RPSA | Body | 0.7753 | 0.8339 | −0.1051 | 5.82×10⁻⁷ | 4.270×10⁻⁵ |
| cg09938490 | chr15 | 41876101 | SERINC4 | Body | 0.8672 | 0.7346 | 0.2394 | 8.47×10⁻⁷ | 6.210×10⁻⁵ |
| cg08069338 | chr6 | 160127888 | SNORA29 | TSS1500 | 0.8294 | 0.7166 | 0.2110 | 9.56×10⁻⁷ | 7.010×10⁻⁵ |
| cg27211576 | chr15 | 72880218 | CSK | Body | 0.6376 | 0.5534 | 0.2045 | 1.60×10⁻⁶ | 1.171×10⁻⁴ |
| cg26264697 | chr19 | 3529064 | HMG20B | Body | 0.4428 | 0.5928 | −0.4210 | 1.78×10⁻⁶ | 1.307×10⁻⁴ |
| cg27266479 | chr1 | 9217469 | H6PD | Promoter | 0.1574 | 0.1290 | 0.2868 | 1.95×10⁻⁶ | 1.432×10⁻⁴ |
| cg21657521 | chr19 | 7650447 | C19orf59 | TSS1500 | 0.0907 | 0.1250 | −0.4637 | 2.75×10⁻⁶ | 2.017×10⁻⁴ |
| cg05657416 | chr6 | 27213771 | HIST1H4I | TSS1500 | 0.8716 | 0.8073 | 0.1106 | 2.86×10⁻⁶ | 2.099×10⁻⁴ |
| cg21205305 | chr19 | 54884437 | C19orf76 | Promoter | 0.2889 | 0.3970 | −0.4583 | 3.09×10⁻⁶ | 2.263×10⁻⁴ |
| cg27537591 | chr10 | 116687888 | TRUB1 | Promoter | 0.0501 | 0.0402 | 0.3154 | 3.30×10⁻⁶ | 2.422×10⁻⁴ |
| cg25739003 | chr11 | 62097836 | EEF1G | Promoter | 0.0751 | 0.0606 | 0.3092 | 4.23×10⁻⁶ | 3.100×10⁻⁴ |
| cg26740494 | chr1 | 1556994 | MMP23B | TSS1500 | 0.4795 | 0.6275 | −0.3881 | 4.68×10⁻⁶ | 3.434×10⁻⁴ |
| cg26222042 | chr5 | 31567957 | C5orf22 | Promoter | 0.0580 | 0.0447 | 0.3762 | 4.68×10⁻⁶ | 3.434×10⁻⁴ |
| cg26615259 | chr1 | 154977769 | MRPL24 | Promoter | 0.0558 | 0.0447 | 0.3208 | 4.75×10⁻⁶ | 3.483×10⁻⁴ |
| cg27085584 | chr5 | 61735043 | DIMT1L | Promoter | 0.0392 | 0.0496 | −0.3411 | 4.85×10⁻⁶ | 3.555×10⁻⁴ |
| cg27535410 | chr19 | 797354 | PRTN3 | Body | 0.8207 | 0.9326 | −0.1844 | 5.20×10⁻⁶ | 3.812×10⁻⁴ |
| cg27065374 | chrX | 67976906 | EFNB1 | Body | 0.5659 | 0.4863 | 0.2185 | 5.20×10⁻⁶ | 3.816×10⁻⁴ |
| cg22961457 | chr20 | 61840754 | SLC2A4RG | 3′UTR | 0.2397 | 0.4364 | −0.8641 | 5.41×10⁻⁶ | 3.965×10⁻⁴ |
β-favorable and β-poor represent the mean methylation level of favorable and poor prognostic samples, respectively. Chr, chromosome; FDR, false discovery rate; DMG, genes with differentially methylated CpG sites.
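The Effect column appears to be the log2 ratio of the two mean β-values (favorable over poor); this is an inference from the tabulated numbers, not something the record states. The minimal sketch below recomputes it for three rows and applies the significance thresholds quoted in the Figure 1 legend (FDR <0.05 and |log2FC|>0.1). The function names are illustrative only.

```python
import math

def effect_size(beta_favorable: float, beta_poor: float) -> float:
    """log2 ratio of mean methylation, favorable over poor (assumed definition)."""
    return math.log2(beta_favorable / beta_poor)

def is_significant(effect: float, fdr: float,
                   fdr_cutoff: float = 0.05, effect_cutoff: float = 0.1) -> bool:
    """Thresholds taken from the Figure 1 legend."""
    return fdr < fdr_cutoff and abs(effect) > effect_cutoff

# (CpG ID, gene, beta_favorable, beta_poor, FDR) copied from the table above
rows = [
    ("cg26934993", "KAT2A", 0.3667, 0.6062, 3.980e-6),
    ("cg26370886", "RASIP1", 0.3655, 0.5390, 3.750e-5),
    ("cg09938490", "SERINC4", 0.8672, 0.7346, 6.210e-5),
]

for cpg, gene, beta_fav, beta_poor, fdr in rows:
    eff = effect_size(beta_fav, beta_poor)
    print(f"{cpg} {gene}: effect={eff:.4f} significant={is_significant(eff, fdr)}")
```

Because the β-values in the table are rounded to four decimals, the recomputed effects agree with the Effect column only to about three decimal places.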
WGCNA network analysis identified gene modules with co-methylated CpG sites.
| Module color | Count of CpGs | Correlation | Pcorr | Count of DM CpGs | Enrichment fold (95% CI) | Phyper |
|---|---|---|---|---|---|---|
| Black | 301 | 0.702 | 7.01×10⁻³² | 30 | 1.033 (0.679–1.521) | 8.43×10⁻¹ |
| Blue | 562 | 0.514 | 1.70×10⁻⁶ | 38 | 0.701 (0.486–0.985) | 4.01×10⁻² |
| Brown | 469 | 0.689 | 6.68×10⁻¹¹ | 28 | 0.619 (0.403–0.915) | 1.31×10⁻² |
| Green | 397 | 0.561 | 1.33×10⁻⁷ | 46 | 1.201 (0.856–1.652) | 2.63×10⁻¹ |
| Green-yellow | 196 | 0.687 | 2.14×10⁻³³ | 44 | 2.327 (1.622–3.277) | 5.59×10⁻⁶ |
| Grey | 2,623 | 0.215 | 8.56×10⁻² | 180 | 0.711 (0.596–0.846) | 7.10×10⁻⁵ |
| Magenta | 221 | 0.642 | 7.05×10⁻¹⁰ | 92 | 4.314 (3.301–5.604) | 2.20×10⁻¹⁶ |
| Pink | 248 | 0.565 | 1.59×10⁻⁴ | 2 | 0.084 (0.010–0.306) | 9.75×10⁻⁸ |
| Purple | 208 | 0.687 | 2.71×10⁻²³ | 45 | 2.243 (1.571–3.142) | 1.00×10⁻⁵ |
| Red | 324 | 0.701 | 5.87×10⁻⁴ | 13 | 0.416 (0.218–0.727) | 6.86×10⁻⁴ |
| Tan | 182 | 0.772 | 4.53×10⁻²⁶ | 16 | 0.911 (0.507–1.532) | 8.99×10⁻¹ |
| Turquoise | 649 | 0.645 | 1.47×10⁻²⁷ | 113 | 1.805 (1.442–2.245) | 2.96×10⁻⁷ |
| Yellow | 431 | 0.538 | 1.66×10⁻¹⁵ | 10 | 0.241 (0.114–0.449) | 4.86×10⁻⁸ |
WGCNA, Weighted Gene Co-expression Network Analysis; Count of CpGs, the number of CpGs in a module; Count of DM CpGs, the number of differentially methylated CpGs enriched in a module; Pcorr, P-value for correlation coefficient; Phyper, P-value for enrichment analysis.
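The enrichment fold values can be approximately reproduced from the counts alone. The sketch below assumes that the background rate is the share of differentially methylated CpGs across all thirteen modules listed (657 of 6,811); the record does not state the background actually used. It then selects modules with fold >1 and Phyper <0.05, taking the P-values from the table rather than recomputing the hypergeometric test. Variable and function names are illustrative.

```python
# (module, CpGs in module, DM CpGs in module, Phyper) copied from the table above
MODULES = [
    ("black", 301, 30, 8.43e-1),
    ("blue", 562, 38, 4.01e-2),
    ("brown", 469, 28, 1.31e-2),
    ("green", 397, 46, 2.63e-1),
    ("green-yellow", 196, 44, 5.59e-6),
    ("grey", 2623, 180, 7.10e-5),
    ("magenta", 221, 92, 2.20e-16),
    ("pink", 248, 2, 9.75e-8),
    ("purple", 208, 45, 1.00e-5),
    ("red", 324, 13, 6.86e-4),
    ("tan", 182, 16, 8.99e-1),
    ("turquoise", 649, 113, 2.96e-7),
    ("yellow", 431, 10, 4.86e-8),
]

def fold_enrichment(modules):
    """Module DM rate divided by the pooled DM rate (assumed background)."""
    total_cpgs = sum(n for _, n, _, _ in modules)
    total_dm = sum(k for _, _, k, _ in modules)
    background = total_dm / total_cpgs
    return {name: (k / n) / background for name, n, k, _ in modules}

def enriched_modules(modules, alpha=0.05):
    """Modules over-represented for DM CpGs: fold > 1 and tabulated P < alpha."""
    folds = fold_enrichment(modules)
    return sorted(name for name, _, _, p in modules
                  if folds[name] > 1 and p < alpha)

folds = fold_enrichment(MODULES)
for name, fold in folds.items():
    print(f"{name}: {fold:.3f}")
print(enriched_modules(MODULES))
```

Under this assumption the recomputed folds match the table to within a few thousandths, and the selection returns the green-yellow, magenta, purple, and turquoise modules.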
Figure 2. Results of the WGCNA. (A) Clustering dendrograms of gene modules associated with co-methylated CpG sites. Genes in the same branch were highly connected. Each color indicates a gene module. (B) The count of significant DMGs mapped in each gene module. Each color represents a gene module, and the number represents the genes with differentially methylated CpGs. (C) Fold-enrichment value of each module. The vertical axis shows the fold-enrichment value. The horizontal black dashed line denotes fold enrichment=1. *P<0.05. WGCNA, Weighted Gene Co-expression Network Analysis; DMGs, genes with differentially methylated CpG sites.
Significantly enriched GO terms for genes with differentially methylated CpGs in four important gene modules.
| GO term | Count of genes | P-value | Genes |
|---|---|---|---|
| Protein-DNA complex assembly | 8 | 2.71×10⁻⁴ | HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, HIST1H2AH, MIS12 |
| Nucleosome organization | 8 | 3.10×10⁻⁴ | HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, SUPT16H, HIST1H2AH |
| Protein folding | 10 | 8.25×10⁻⁴ | GRPEL1, CRYAA, PFDN5, CCT8, C19ORF2, CCT3, CCT6A, DNAJC2, CLPX, PIN1 |
| Nucleosome assembly | 7 | 1.10×10⁻³ | HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, HIST1H2AH |
| DNA packaging | 8 | 1.23×10⁻³ | HIST2H2AA3, CHMP1A, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, HIST1H2AH |
| Chromatin assembly | 7 | 1.32×10⁻³ | HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, HIST1H2AH |
| Chromatin assembly or disassembly | 8 | 1.97×10⁻³ | HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, SUPT16H, HIST1H2AH |
| Translation | 13 | 2.31×10⁻³ | MRPL24, EIF4G1, RPSA, MRPS16, MRPL27, RARS, EIF2S2, RPL35, MARS2, EIF5A, DPH1, RPL10A, MRPL34 |
| Chromosome organization | 16 | 3.19×10⁻³ | KAT2A, HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BG, NDC80, LIG4, MIS12, C20ORF20, KDM1A, CHMP1A, HIST1H2BL, CENPA, HIST1H3A, SUPT16H, BRE, HIST1H2AH |
| Cell cycle | 21 | 5.67×10⁻³ | CCNT2, MAD1L1, CRYAA, NDC80, PMF1, PBK, LIG4, TACC3, ESCO2, UHMK1, MIS12, PIN1, CHMP1A, PSMB6, CENPA, PSMA3, SKA2, RAD51L3, MAPK7, KPNA2, DNAJC2 |
| DNA metabolic process | 15 | 1.11×10⁻² | NEIL3, SMC5, PPT1, LIG4, ESCO2, PCNA, SUPT16H, PSIP1, BRE, DDB2, RAD51L3, KPNA2, DNAJC2, APEX1, DUT |
| Cell cycle process | 16 | 1.24×10⁻² | MAD1L1, CRYAA, NDC80, PMF1, PBK, TACC3, UHMK1, MIS12, CHMP1A, PSMB6, CENPA, PSMA3, SKA2, RAD51L3, KPNA2, DNAJC2 |
| Mitotic cell cycle | 12 | 1.44×10⁻² | MAD1L1, CHMP1A, PSMB6, CENPA, PSMA3, NDC80, SKA2, PMF1, PBK, DNAJC2, KPNA2, MIS12 |
| Response to DNA damage stimulus | 12 | 1.52×10⁻² | NEIL3, DDB2, BRE, SUPT16H, SMC5, PCNA, AATF, RAD51L3, LIG4, ATMIN, APEX1, ESCO2 |
| Chromatin organization | 12 | 1.67×10⁻² | KAT2A, KDM1A, HIST2H2AA3, HIST4H4, HIST1H2AG, HIST1H2BL, CENPA, HIST1H2BG, HIST1H3A, BRE, SUPT16H, HIST1H2AH, C20ORF20 |
| M phase | 11 | 1.68×10⁻² | MAD1L1, CHMP1A, CRYAA, NDC80, SKA2, RAD51L3, PMF1, PBK, TACC3, KPNA2, MIS12 |
Figure 3. Scatter plot of the overall correlation between methylation and gene expression levels of the paired samples in TCGA portal for the 294 genes contained in the four significantly enriched gene modules (green-yellow, magenta, purple, and turquoise). The black spots represent genes. The red line is the trend line of the genes. Cor. denotes Pearson's correlation coefficient between methylation and gene expression. The P-value indicates the significance of the correlation.
Figure 4. A Venn diagram depicting overlap between the genes significantly related to prognosis in methylation level (left) and the genes significantly associated with prognosis in gene expression (right). Two groups of genes were identified by univariate Cox regression analysis.
Figure 5. Kaplan-Meier estimates for patients grouped based on (A) median methylation level or (B) median gene expression level of a gene in the training cohort.
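The Kaplan-Meier curves in these figures come from the standard product-limit estimator applied to groups split at a median. The following is a minimal sketch of that estimator with a median split. All patient values in it are hypothetical and the function names are illustrative; it is not the authors' code.

```python
from statistics import median

def split_by_median(values):
    """Label each value 'high' if above the median, otherwise 'low'."""
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]

def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort: methylation level, OS time in months, death indicator
methylation = [0.12, 0.55, 0.31, 0.78, 0.44, 0.09]
os_months = [40, 8, 25, 5, 12, 60]
death = [0, 1, 1, 1, 0, 0]

groups = split_by_median(methylation)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([os_months[i] for i in idx], [death[i] for i in idx])
    print(label, curve)
```

In practice a survival library would also supply confidence bands and the log-rank test used to compare the two groups.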
Risk score model based on a four-gene methylation signature.
| ID | Gene | Chr. | Position | Location | Coef | Hazard ratio (95% CI) | P-value |
|---|---|---|---|---|---|---|---|
| cg25577680 | ZNF10 | chr12 | 132217652 | Promoter | 0.964 | 6.259 (1.274–10.74) | 0.0230 |
| cg27261219 | TMPRSS12 | chr12 | 49522905 | TSS | 1.035 | 3.179 (1.131–8.935) | 0.0277 |
| cg25338581 | ERGIC2 | chr12 | 29425828 | TSS | 7.166 | 7.576 (4.142–10.506) | 0.0023 |
| cg25964984 | RNF215 | chr22 | 29113371 | TSS | −4.896 | 0.437 (0.0405–0.719) | 0.0223 |
Chr, chromosome; coef, Cox-PH coefficient.
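A signature-based risk score is conventionally the coefficient-weighted sum of the methylation levels of the selected CpGs. The sketch below applies the Cox-PH coefficients from the table in that way and splits patients at the median score. The median cutoff is an assumption, since this record does not state the threshold used, and the patient β-values are hypothetical.

```python
from statistics import median

# Cox-PH coefficients copied from the table above
COEF = {
    "cg25577680": 0.964,   # ZNF10
    "cg27261219": 1.035,   # TMPRSS12
    "cg25338581": 7.166,   # ERGIC2
    "cg25964984": -4.896,  # RNF215
}

def risk_score(beta):
    """Weighted sum of methylation beta-values for the four signature CpGs."""
    return sum(COEF[cpg] * beta[cpg] for cpg in COEF)

def assign_groups(scores):
    """Split patients at the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}

# Hypothetical beta-values for three patients
patients = {
    "P1": {"cg25577680": 0.10, "cg27261219": 0.20, "cg25338581": 0.05, "cg25964984": 0.60},
    "P2": {"cg25577680": 0.30, "cg27261219": 0.25, "cg25338581": 0.10, "cg25964984": 0.40},
    "P3": {"cg25577680": 0.50, "cg27261219": 0.40, "cg25338581": 0.15, "cg25964984": 0.20},
}

scores = {pid: risk_score(beta) for pid, beta in patients.items()}
print(scores)
print(assign_groups(scores))
```

The negative coefficient for RNF215 means that higher methylation at that CpG lowers the score, while higher methylation at the other three raises it.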
Figure 6. Kaplan-Meier and ROC curves for the four-gene methylation signature in (A) the training set and (B) the validation set. Patients were classified by methylation risk score into high-risk and low-risk groups. Differences between the two groups were evaluated by log-rank test.
Figure 7. Kaplan-Meier curves for high- and low-risk groups of the patients (A) without or (B) with radiotherapy.