Ruyang Zhang, Chao Chen, Xuesi Dong, Sipeng Shen, Linjing Lai, Jieyu He, Dongfang You, Lijuan Lin, Ying Zhu, Hui Huang, Jiajin Chen, Liangmin Wei, Xin Chen, Yi Li, Yichen Guo, Weiwei Duan, Liya Liu, Li Su, Andrea Shafer, Thomas Fleischer, Maria Moksnes Bjaanæs, Anna Karlsson, Maria Planck, Rui Wang, Johan Staaf, Åslaug Helland, Manel Esteller, Yongyue Wei, Feng Chen, David C Christiani.
Abstract
BACKGROUND: DNA methylation and gene expression are promising biomarkers of various cancers, including non-small cell lung cancer (NSCLC). Besides the main effects of biomarkers, the progression of complex diseases is also influenced by gene-gene (G×G) interactions.
RESEARCH QUESTION: Would screening the functional capacity of biomarkers on the basis of main effects or interactions, using multiomics data, improve the accuracy of cancer prognosis?
STUDY DESIGN AND METHODS: Biomarker screening and model validation were used to construct and validate a prognostic prediction model. NSCLC prognosis-associated biomarkers were identified on the basis of either their main effects or interactions with two types of omics data. A prognostic score incorporating epigenetic and transcriptional biomarkers, as well as clinical information, was independently validated.
Keywords: early stage; interaction; multiomics; non-small cell lung cancer; prognostic score
Year: 2020 PMID: 32113923 PMCID: PMC7417380 DOI: 10.1016/j.chest.2020.01.048
Source DB: PubMed Journal: Chest ISSN: 0012-3692 Impact factor: 9.410
Figure 1. Flow chart of study design and statistical analyses. In the epigenetic analysis, patients with lung adenocarcinoma and lung squamous cell carcinoma from the Harvard, Spain, Norway, and Sweden cohorts were used in the discovery phase for screening, whereas data from The Cancer Genome Atlas (TCGA) were used for validation. In the transcriptional analysis, gene expression data from Gene Expression Omnibus and TCGA were used in the discovery phase and the validation phase, respectively. Both main effect and G×G interaction analyses were performed. G×G = gene by gene; NSCLC = non-small cell lung cancer.
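The distinction between a main effect and a G×G interaction can be written down for a pair of biomarkers. The following is a generic sketch of a Cox proportional hazards model with an interaction term, not the authors' exact specification; the symbols $x_1$, $x_2$ (two biomarker levels), $\mathbf{z}$ (clinical covariates), and the coefficients are notation assumed here for illustration.

```latex
% Generic Cox model with a pairwise interaction (assumed notation).
% h_0(t): baseline hazard; x_1, x_2: two biomarkers; z: clinical covariates.
\[
  h(t \mid x_1, x_2, \mathbf{z})
    = h_0(t)\,\exp\!\bigl(\beta_1 x_1 + \beta_2 x_2
      + \beta_{12}\, x_1 x_2 + \boldsymbol{\gamma}^{\top}\mathbf{z}\bigr)
\]
% Main effect of biomarker 1: test of beta_1 = 0 in a model without x_2.
% G x G interaction: test of beta_{12} = 0 in the model above.
```

Under this notation, a biomarker pair can show a significant interaction ($\beta_{12} \neq 0$) even when neither marker has a detectable main effect on its own.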
Figure 2. Estimated survival curves for patients grouped by various biomarker-based scores. A, Epigenetic score of DNA methylation. B, Transcriptional score of gene expression. C, Integrative score of DNA methylation and gene expression. D, Prognostic score of DNA methylation, gene expression, and clinical information. Patients were categorized into low-, medium-, and high-score groups by using the tertiles of each score as the cutoffs. E, Discriminative ability of the prognostic score. Results of 3- and 5-year survival rate, median survival time, and hazard ratio (HR) were compared across five groups, defined by using the quintiles of the prognostic score as the cutoffs. F, HR and P values were derived from the Cox proportional hazards model for patients with different quintile levels of the prognostic score. HR(H vs L) = HR for high vs low; HR(M vs L) = HR for medium vs low.
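The tertile and quintile grouping described in this legend can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `group_by_quantiles`, the toy scores, and the choice of the "inclusive" quantile method are assumptions made here, not details taken from the study.

```python
from statistics import quantiles

def group_by_quantiles(scores, n_groups=3):
    """Assign each score to a group 0..n_groups-1 (0 = lowest).

    Cutoffs are the sample quantiles that split the scores into
    n_groups parts (tertiles for 3, quintiles for 5).
    """
    cutoffs = quantiles(scores, n=n_groups, method="inclusive")
    # A score's group index is the number of cutoffs it exceeds.
    groups = [sum(s > c for c in cutoffs) for s in scores]
    return groups, cutoffs

# Hypothetical prognostic scores for nine patients.
scores = [0.1, 0.4, 0.35, 0.8, 0.95, 0.2, 0.6, 0.7, 0.05]

tertile_groups, tertile_cutoffs = group_by_quantiles(scores, 3)
quintile_groups, quintile_cutoffs = group_by_quantiles(scores, 5)
```

With three groups the indices 0, 1, and 2 correspond to the low-, medium-, and high-score groups; the resulting group labels would then be used as a categorical covariate in a survival model.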
Figure 3. Forest plots of results from stratification analysis of prognostic score. HR with 95% CI of the prognostic score on non-small cell lung cancer survival in various subgroups is stratified by clinical characteristics. LUAD = lung adenocarcinoma; LUSC = lung squamous cell carcinoma. See Figure 2 legend for expansion of other abbreviation.
Figure 4. Receiver operating characteristic curves for various predictive models using the clinical information (C), the main and interaction effects of DNA methylation (M), and gene expression (E). A, Three-year survival prediction. B, Five-year survival prediction. The AUC increase (%) was evaluated by comparing each model with the model containing only the clinical information. P values and 95% CIs were calculated by using 1,000 bootstrap samples. AUC = area under the receiver operating characteristic curve; ROC = receiver operating characteristic. See Figure 1 legend for expansion of other abbreviations.
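A bootstrap comparison of AUC between a clinical-only model and an extended model can be sketched as below. This is an illustration under simplifying assumptions, not the authors' procedure: the outcome is treated as a binary status at a fixed time point (censoring is ignored), the AUC increase is taken as an absolute difference in percentage points, and the data and function names (`auc`, `bootstrap_auc_gain`) are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_gain(labels, base, full, n_boot=1000, seed=0):
    """Percentile 95% CI for the AUC gain (percentage points) of `full` over `base`."""
    rng = random.Random(seed)
    n = len(labels)
    gains = []
    while len(gains) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined when a resample contains only one class
        a_base = auc(y, [base[i] for i in idx])
        a_full = auc(y, [full[i] for i in idx])
        gains.append(100.0 * (a_full - a_base))
    gains.sort()
    lo = gains[int(0.025 * n_boot)]
    hi = gains[min(n_boot - 1, int(0.975 * n_boot))]
    return lo, hi

# Hypothetical data: 1 = event by the time point, 0 = no event.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
clinical_only = [0.6, 0.4, 0.7, 0.3, 0.5, 0.2, 0.65, 0.1]
clinical_plus_omics = [0.9, 0.7, 0.8, 0.6, 0.4, 0.3, 0.5, 0.2]

point_gain = 100.0 * (auc(labels, clinical_plus_omics) - auc(labels, clinical_only))
ci_low, ci_high = bootstrap_auc_gain(labels, clinical_only, clinical_plus_omics)
```

Resampling patients (rather than scores) keeps each patient's two model predictions paired, so the interval reflects the difference between models on the same individuals.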
Figure 5. Gene network and gene enrichment analysis of 49 genes to which 25 pairs of CpG probes with interaction and one CpG probe with main effect are mapped. A, The gene network plot constructed by GeneMANIA. Central nodes with boldface outline represent hub genes, and the size represents the connectivity degree of each node. B, Bar plot of gene pathways enriched with significant genes, colored by P values. C, The pathway network plot of these pathways enriched with significant genes. Significant pathways with a similarity > 0.3 are connected by edges. Each node represents an enriched term and is colored by its cluster identification. The size of the node represents the number of genes in the pathway. The edge represents potential biologic relationships between two pathways. GO = Gene Ontology.