Yongyue Wei, Junya Liang, Ruyang Zhang, Yichen Guo, Sipeng Shen, Li Su, Xihong Lin, Sebastian Moran, Åslaug Helland, Maria M Bjaanæs, Anna Karlsson, Maria Planck, Manel Esteller, Thomas Fleischer, Johan Staaf, Yang Zhao, Feng Chen, David C Christiani.
Abstract
Background: Lysine demethylase (KDM) family members are associated with lung cancer clinical outcomes and are potential biomarkers for chemotherapeutics. However, little is known about epigenetic alterations in KDM genes and their roles in lung cancer survival.
Keywords: DNA methylation; KDM; Lung cancer; Lysine demethylase; Survival
Year: 2018 PMID: 29619118 PMCID: PMC5879927 DOI: 10.1186/s13148-018-0474-3
Source DB: PubMed Journal: Clin Epigenetics ISSN: 1868-7075 Impact factor: 6.551
Demographic and clinicopathological descriptions of the study populations
| Variables | Discovery set: Harvard | Discovery set: Spain | Discovery set: Norway | Discovery set: Sweden | Discovery set: All | Validation set: TCGA |
|---|---|---|---|---|---|---|
| Age (years), mean ± SD | 67.67 ± 9.92 | 65.67 ± 10.58 | 65.52 ± 9.34 | 67.54 ± 9.99 | 66.44 ± 10.08 | 66.51 ± 9.47 |
| Gender, n (%) * | | | | | | |
| Female | 67 (44.37) | 105 (46.46) | 71 (53.38) | 54 (52.43) | 297 (48.45) | 255 (41.33) |
| Male | 84 (55.63) | 121 (53.54) | 62 (46.62) | 49 (47.57) | 316 (51.55) | 362 (58.67) |
| Smoking status, n (%) * | | | | | | |
| Never | 18 (11.92) | 30 (13.57) | 17 (12.78) | 18 (17.48) | 83 (13.65) | 55 (9.18) |
| Former | 81 (53.64) | 120 (54.30) | 74 (55.64) | 54 (52.43) | 329 (54.11) | 376 (62.77) |
| Current | 52 (34.44) | 71 (32.13) | 42 (31.58) | 31 (30.10) | 196 (32.24) | 168 (28.05) |
| Unknown | 0 | 5 | 0 | 0 | 5 | 18 |
| TNM stage, n (%) * | | | | | | |
| I | 104 (68.87) | 183 (80.97) | 93 (69.92) | 95 (92.23) | 475 (77.49) | 393 (63.70) |
| II | 47 (31.13) | 43 (19.03) | 40 (30.08) | 8 (7.77) | 138 (22.51) | 224 (36.30) |
| Histology, n (%) * | | | | | | |
| Adenocarcinoma (AC) | 96 (63.58) | 183 (80.97) | 133 (100.00) | 80 (77.67) | 492 (80.26) | 332 (53.81) |
| Squamous cell carcinoma (SCC) | 55 (36.42) | 43 (19.03) | 0 (0.00) | 23 (22.33) | 121 (19.74) | 285 (46.19) |
| Chemotherapy, n (%) * | | | | | | |
| No | 142 (94.04) | 177 (90.77) | 102 (76.69) | 67 (90.54) | 488 (88.25) | 194 (76.98) |
| Yes | 9 (5.96) | 18 (9.23) | 31 (23.31) | 7 (9.46) | 65 (11.75) | 58 (23.02) |
| Unknown | 0 | 31 | 0 | 29 | 60 | 365 |
| Radiotherapy, n (%) | | | | | | |
| No | 132 (87.42) | 184 (94.36) | 132 (99.25) | 74 (100.00) | 522 (94.39) | 239 (94.84) |
| Yes | 19 (12.58) | 11 (5.64) | 1 (0.75) | 0 (0.00) | 31 (5.61) | 13 (5.16) |
| Unknown | 0 | 31 | 0 | 29 | 60 | 365 |
| Adjuvant therapy, n (%) * | | | | | | |
| No | 127 (84.11) | 168 (86.15) | 101 (75.94) | 67 (90.54) | 463 (83.73) | 187 (74.21) |
| Yes | 24 (15.89) | 27 (13.85) | 32 (24.06) | 7 (9.46) | 90 (16.27) | 65 (25.79) |
| Unknown | 0 | 31 | 0 | 29 | 60 | 365 |
| Survival year, month | | | | | | |
| Median (95%CI) | 6.66 (5.41–7.87) | 7.12 (5.06–9.63) | 7.36 (6.77–7.95)* | 7.39 (4.98–9.12) | 7.39 (6.50–8.23) | 4.54 (3.68–5.41) |
| Dead, n (%) | 122 (80.79) | 101 (44.69) | 42 (31.58) | 58 (31.58) | 323 (52.69) | 142 (23.01) |
*A statistically significant difference (P ≤ 0.05) was observed between the combined discovery set and the validation set (TCGA).
Fig. 1 Analysis workflow. Adenocarcinoma and squamous cell carcinoma samples from the Harvard, Spain, Norway, and Sweden cohorts were used for the discovery phase of the analysis. Data from The Cancer Genome Atlas (TCGA) were used for validation. Ranger is a weighted version of random forest that controls for covariates including age, gender, smoking status, and histological stage. A variable importance score (VIS) was estimated for each CpG site, and the sites were ranked in descending order of VIS. CpG sites ranked in the top 5% in both the discovery and validation sets were selected for further evaluation by Cox regression. Multiple testing correction by the false discovery rate (FDR) method was applied where necessary.
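The FDR correction mentioned in the workflow can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the caption does not name explicitly; the function name `benjamini_hochberg` and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the caption says only "FDR method"; BH is one common choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical Cox-regression p-values for four CpG sites (illustrative only).
p_vals = [0.01, 0.04, 0.03, 0.20]
q_vals = benjamini_hochberg(p_vals)
```

A site would then be reported as significant when its adjusted value falls below the chosen FDR threshold.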
Fig. 2 Ranger results in the discovery (a) and validation (b) sets for adenocarcinomas. Weighted random forest (Ranger, in which confounders such as age, gender, smoking status, and stage are adjusted for by assigning them 100% weight) was applied to the discovery (a) and validation (b) sets to evaluate variable importance. Ranger provides a variable importance score (VIS) for each methylation site. Variables ranked in the top 5% (red lollipop) or top 10% (yellow lollipop, a more flexible criterion) in both the discovery (a) and validation (b) sets were carried forward.
Fig. 3 Ranger results in the discovery (a) and validation (b) sets for squamous cell carcinomas. Weighted random forest (Ranger, in which confounders such as age, gender, smoking status, and stage are adjusted for by assigning them 100% weight) was applied to the discovery (a) and validation (b) sets to evaluate variable importance. Ranger provides a variable importance score (VIS) for each methylation site. Variables ranked in the top 5% (red lollipop) or top 10% (yellow lollipop, a more flexible criterion) in both the discovery (a) and validation (b) sets were carried forward.
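The "top 5% (or top 10%) in both sets" rule used in Figs. 2 and 3 can be sketched as a rank-based intersection. This is an illustrative sketch only: the function names, the CpG identifiers, and the VIS values are made up, and the sketch does not reproduce how the random forest itself computes VIS.

```python
import math


def top_fraction(vis, fraction):
    """Return the set of CpG IDs whose VIS ranks in the top `fraction`."""
    ranked = sorted(vis, key=vis.get, reverse=True)  # descending VIS
    n_keep = max(1, math.ceil(len(ranked) * fraction))
    return set(ranked[:n_keep])


def carried_forward(vis_discovery, vis_validation, fraction=0.05):
    """CpG sites in the top `fraction` of BOTH the discovery and validation sets."""
    return top_fraction(vis_discovery, fraction) & top_fraction(vis_validation, fraction)


# Made-up VIS values for 20 hypothetical sites; "cpg00" scores highest in discovery.
discovery = {f"cpg{i:02d}": float(20 - i) for i in range(20)}
validation = dict(discovery)
# In the hypothetical validation set, cpg00 and cpg05 trade places.
validation["cpg00"], validation["cpg05"] = validation["cpg05"], validation["cpg00"]

strict = carried_forward(discovery, validation, fraction=0.05)    # red-lollipop rule
flexible = carried_forward(discovery, validation, fraction=0.10)  # yellow-lollipop rule
```

In this toy example no site passes the strict 5% rule in both sets, while one site passes the more flexible 10% rule, which shows why a looser tier can retain candidates that the strict tier drops.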
Fig. 4 Association of DNA methylation at cg11637544 in KDM2A and cg26662347 in KDM1A with overall survival in squamous cell carcinomas, and correlation between methylation at these two sites and expression of the corresponding genes. Fixed-effects meta-analysis was used to combine the results from the discovery and validation sets for squamous cell carcinomas (SCC) (a cg11637544; b cg26662347). I² and the corresponding P value were used to evaluate heterogeneity across studies. DNA methylation level was categorized into six quantiles, and a box plot of gene expression was drawn for each quantile. Pearson correlation was used to estimate the correlation coefficient (r) and the P value; gene expression was log2 transformed before analysis (c cg11637544; d cg26662347).
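As a rough illustration of the fixed-effects pooling and the I² heterogeneity statistic named in the Fig. 4 caption, the sketch below applies standard inverse-variance weighting to log hazard ratios. The function name and the input values are hypothetical and do not come from the study; the heterogeneity P value is omitted because it requires a chi-square distribution function.

```python
import math


def fixed_effects_meta(log_hrs, std_errs):
    """Inverse-variance fixed-effects pooling with Cochran's Q and I^2 (in %)."""
    weights = [1.0 / se ** 2 for se in std_errs]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    # I^2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical log hazard ratios and standard errors for two cohorts.
log_hrs = [0.4, 0.6]
std_errs = [0.2, 0.1]
pooled, pooled_se, q, i2 = fixed_effects_meta(log_hrs, std_errs)

# Pooled hazard ratio with a 95% confidence interval on the HR scale.
hr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
```

The cohort with the smaller standard error receives the larger weight, so the pooled estimate sits closer to its log hazard ratio.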
Fig. 5 Survival classification tree for adenocarcinomas. A survival classification tree was built with seven CpG sites and covariates using the merged data of the discovery and validation sets among adenocarcinoma cases (a), which identified five clusters with significantly different survival curves (b). Cox regression was used to compare outcomes among clusters (cluster 4 as reference); results are presented as the hazard ratio (HR), 95% confidence interval (95% CI), and P value (c).
Fig. 6 Survival classification tree for squamous cell carcinomas. A survival classification tree was built with five CpG sites and covariates using the merged data of the discovery and validation sets among squamous cell carcinoma cases (a), which identified four clusters with significantly different survival curves (b). Cox regression was used to compare outcomes among clusters (cluster 4 as reference); results are presented as the hazard ratio (HR), 95% confidence interval (95% CI), and P value (c).