Dusan Petrovic, Barbara Bodinier, Florence Guida, Marc Chadeau-Hyam, Sonia Dagnino, Matthew Whitaker, Maryam Karimi, Gianluca Campanella, Therese Haugdahl Nøst, Silvia Polidoro, Domenico Palli, Vittorio Krogh, Rosario Tumino, Carlotta Sacerdote, Salvatore Panico, Eiliv Lund, Pierre-Antoine Dugué, Graham G Giles, Gianluca Severi, Melissa Southey, Paolo Vineis, Silvia Stringhini, Murielle Bochud, Torkjel M Sandanger, Roel C H Vermeulen.
Abstract
Smoking-related epigenetic changes have been linked to lung cancer, but the contribution of epigenetic alterations unrelated to smoking remains unclear. We sought a sparse set of CpG sites predicting lung cancer and explored the role of smoking in these associations. We analysed CpGs in relation to lung cancer in participants from two nested case-control studies, using LASSO-penalised regression. We accounted for the effects of smoking using known smoking-related CpGs and through a conditional-independence network. We identified 29 CpGs (8 smoking-related, 21 smoking-unrelated) associated with lung cancer. Models additionally adjusted for the Comprehensive Smoking Index (CSI) selected 1 smoking-related and 49 smoking-unrelated CpGs. The selected CpGs yielded excellent discriminatory performance, outperforming the information provided by CSI alone. Of the 8 selected smoking-related CpGs, two captured lung cancer-relevant effects of smoking that were missed by CSI. Further, the 50 CpGs identified in the CSI-adjusted model provided complementary information for explaining lung cancer risk. These markers may provide further insight into lung carcinogenesis and help improve early identification of high-risk patients.
Keywords: DNA methylation; Lung cancer; Partial correlation network; Smoking; Variable selection
Year: 2022 PMID: 35595947 PMCID: PMC9288379 DOI: 10.1007/s10654-022-00877-2
Source DB: PubMed Journal: Eur J Epidemiol ISSN: 0393-2990 Impact factor: 12.434
Characteristics of study participants stratified by cohort and future lung cancer status. Means (standard deviations) are reported for continuous variables and counts (percentages) for categorical variables.
| Characteristic | EPIC-Italy controls (N = 512) | EPIC-Italy cases (N = 185) | NOWAC controls (N = 314) | NOWAC cases (N = 128) | Full population |
|---|---|---|---|---|---|
| Sex (women) | 331 (65%) | 81 (44%) | 314 (100%) | 128 (100%) | 854 (75%) |
| Age (years) | 54 (6.8) | 54.5 (6.3) | 51.1 (6.9) | 56 (4.2) | 53.5 (6.7) |
| Smoking status: never | 257 (50%) | 26 (14%) | 136 (43%) | 14 (11%) | 433 (38%) |
| Smoking status: former | 143 (28%) | 59 (32%) | 97 (31%) | 34 (27%) | 333 (29%) |
| Smoking status: current | 112 (22%) | 100 (54%) | 81 (26%) | 80 (62%) | 373 (33%) |
| Smoking duration (years) | 12.1 (14.2) | 27.3 (14.3) | 15.9 (16.5) | 31.6 (14.8) | 17.8 (16.6) |
| Smoking intensity (cig./day) | 6 (8.9) | 14.4 (9.4) | 5.5 (5.8) | 10.3 (5.5) | 7.7 (8.6) |
| Comprehensive Smoking Index (CSI) | 0.5 (0.7) | 1.4 (0.8) | 0.7 (0.8) | 1.4 (0.7) | 0.8 (0.8) |
| Time to diagnosis (years) | | 7.2 (3.7) | | 3.9 (2.0) | 5.9 (3.5) |
| Centre: Florence | 92 (18%) | 63 (34%) | | | |
| Centre: Naples | 11 (2%) | 3 (2%) | | | |
| Centre: Ragusa | 29 (6%) | 14 (8%) | | | |
| Centre: Turin | 246 (48%) | 60 (32%) | | | |
| Centre: Varese | 134 (26%) | 45 (24%) | | | |
Fig. 1 Stability selection models exploring the joint associations between CpG sites and the future risk of lung cancer. Selection proportions of stably selected CpG sites are derived from LASSO-penalised logistic models for the risk of lung cancer including all N = 443,150 CpG sites as predictors and adjusted for age and sex (A) or for age, sex and CSI (B). Comparison of selection proportions (C) or β-coefficients (D) from the base versus CSI-adjusted models for CpG sites that are stably selected in at least one of these two models. The list of stably selected CpG sites is reported at the bottom, with overlapping signals in bold. CpG sites related to smoking at a Bonferroni-corrected significance level ensuring a family-wise error rate below 0.05 are presented in red, and sites unrelated to smoking in blue.
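The selection proportions described in the Fig. 1 caption can be illustrated with a minimal sketch: an L1-penalised logistic model is refitted on random subsamples, and each predictor is scored by the fraction of fits in which its coefficient is non-zero. Everything below is an assumption made for illustration and not the authors' pipeline: the data are simulated, the solver is a plain proximal-gradient loop, the penalty `lam` is fixed rather than calibrated, and the function names are hypothetical.

```python
import numpy as np


def lasso_logistic(X, y, lam, n_iter=300):
    """L1-penalised logistic regression fitted by proximal gradient descent."""
    n, p = X.shape
    beta, intercept = np.zeros(p), 0.0
    # Step size from an upper bound on the Lipschitz constant of the gradient.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
        resid = prob - y
        intercept -= step * resid.mean()          # intercept is not penalised
        z = beta - step * (X.T @ resid) / n       # gradient step
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return beta


def selection_proportions(X, y, lam, n_subsamples=50, frac=0.5, seed=0):
    """Fraction of subsample fits in which each predictor has a non-zero coefficient."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        counts += (lasso_logistic(X[idx], y[idx], lam) != 0)
    return counts / n_subsamples


# Simulated example (assumption): 20 standardised predictors, only the first two
# of which influence the binary outcome.
rng = np.random.default_rng(1)
X = rng.standard_normal((400, 20))
logit = 1.5 * X[:, 0] + 1.5 * X[:, 1]
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

props = selection_proportions(X, y, lam=0.1)
print(np.round(props, 2))
```

In this toy setting the two informative predictors should have selection proportions close to 1, while the noise predictors stay low. In practice the penalty and the selection threshold would be calibrated rather than fixed by hand.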
Fig. 2 Receiver Operating Characteristic curve for lung cancer prediction. Mean and 5th–95th percentiles of the Area Under the Curve (AUC) were calculated across the 1000 recalibrated models including (i) age and sex (green), (ii) age, sex and stably selected CpG sites from the base model (dark blue), (iii) age, sex and CSI (orange), and (iv) age, sex, CSI and the stably selected CpG sites from the adjusted model (dark red) (A). Mean and 5th–95th percentiles of the AUC are reported for models sequentially including the first 50 CpG sites by order of selection proportion in the base (B) and adjusted (C) models. Calibrated stability selection models are indicated by a black dashed vertical line. CpG sites related to smoking at a Bonferroni-corrected significance level ensuring a family-wise error rate below 0.05 are presented in red, and sites unrelated to smoking are presented in blue.
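As a rough illustration of how a mean AUC with 5th–95th percentiles can be summarised, the sketch below computes the AUC as the proportion of case-control pairs in which the case has the higher score, and repeats this over bootstrap resamples. This is a simplification under stated assumptions: the caption refers to 1000 recalibrated models, whereas here a single fixed set of simulated risk scores is resampled, and the function names are hypothetical.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the share of case-control pairs where the case scores higher (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def auc_summary(scores, labels, n_resamples=1000, seed=0):
    """Mean, 5th and 95th percentile of the AUC over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # need at least one case and one control
            aucs.append(auc([scores[i] for i in idx], ys))
    cuts = statistics.quantiles(aucs, n=20)  # cut points at 5%, 10%, ..., 95%
    return statistics.mean(aucs), cuts[0], cuts[-1]


# Simulated risk scores (assumption): cases score one unit higher on average.
rng = random.Random(1)
labels = [i % 2 for i in range(60)]
scores = [rng.gauss(y, 1.0) for y in labels]

mean_auc, p5, p95 = auc_summary(scores, labels, n_resamples=200)
print(round(mean_auc, 3), round(p5, 3), round(p95, 3))
```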
Fig. 3 Conditional independence network including stably selected CpGs identified in relation to lung cancer in the base LASSO model (N = 29 stably selected CpG markers) and CSI (black square). CpG sites related to smoking at a Bonferroni-corrected significance level ensuring a family-wise error rate below 0.05 are presented in red, and sites unrelated to smoking are presented in blue.
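The idea behind a conditional independence (partial correlation) network can be sketched as follows: two variables are linked only if they remain correlated after conditioning on all the others, which can be read off the inverse covariance matrix. The example is a minimal, unpenalised sketch on simulated data. The variable names (`smoking`, `cpg_a`, `cpg_b`) are hypothetical, and the estimator and calibration actually used for Fig. 3 are not described in this section.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of each pair of columns given all remaining columns."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)  # rho_ij = -P_ij / sqrt(P_ii * P_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated chain (assumption): smoking -> cpg_a -> cpg_b.
rng = np.random.default_rng(0)
n = 5000
smoking = rng.standard_normal(n)
cpg_a = smoking + rng.standard_normal(n)
cpg_b = cpg_a + rng.standard_normal(n)
data = np.column_stack([smoking, cpg_a, cpg_b])

marginal = np.corrcoef(data, rowvar=False)
pcor = partial_correlations(data)
print(np.round(marginal, 2))
print(np.round(pcor, 2))
```

In this simulation `smoking` and `cpg_b` are clearly correlated marginally, but their partial correlation is close to zero because the association passes entirely through `cpg_a`. A network built on partial correlations would therefore draw no direct edge between them.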