Haixiang Zhang (1), Yinan Zheng (2), Lifang Hou (2), Cheng Zheng (3), Lei Liu (4)
1. Center for Applied Mathematics, Tianjin University, Tianjin, 300072, China.
2. Department of Preventive Medicine, Northwestern University, Chicago, IL, 60611, USA.
3. Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE, 68198, USA.
4. Division of Biostatistics, Washington University in St. Louis, St. Louis, MO, 63110, USA.
Abstract
MOTIVATION: Mediation analysis has become a prevalent method to identify causal pathway(s) between an independent variable and a dependent variable through intermediate variable(s). However, little work has been done when the intermediate variables (mediators) are high-dimensional and the outcome is a survival endpoint. In this paper, we introduce a novel method to identify potential mediators in a causal framework of high-dimensional Cox regression.
RESULTS: We first reduce the data dimension through a mediation-based sure independence screening (SIS) method. A de-biased Lasso inference procedure is used for Cox's regression parameters. We adopt a multiple-testing procedure to accurately control the false discovery rate (FDR) when testing high-dimensional mediation hypotheses. Simulation studies are conducted to demonstrate the performance of our method. We apply this approach to explore the mediation mechanisms of 379,330 DNA methylation markers between smoking and overall survival among lung cancer patients in the TCGA lung cancer cohort. Two methylation sites (cg08108679 and cg26478297) are identified as potential mediating epigenetic markers.
AVAILABILITY: Our proposed method is available with the R package HIMA at https://cran.r-project.org/web/packages/HIMA/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
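The final testing step described in RESULTS can be illustrated with a small sketch. The abstract does not specify which multiple-testing procedure the authors use, so the code below substitutes a plain Benjamini-Hochberg step applied to joint-significance p-values (the larger of the exposure-to-mediator and mediator-to-outcome p-values for each candidate mediator). The function names and the p-values are hypothetical and chosen only for illustration; this is not the HIMA package's API and not necessarily the authors' procedure.

```python
# Illustrative sketch only: joint-significance p-values followed by a
# standard Benjamini-Hochberg (BH) step. Names and numbers are hypothetical.

def joint_significance_pvalues(p_alpha, p_beta):
    """For each mediator, take the larger of its two path p-values.

    p_alpha: p-values for the exposure -> mediator path.
    p_beta:  p-values for the mediator -> survival outcome path.
    """
    return [max(a, b) for a, b in zip(p_alpha, p_beta)]


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(pvalues)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank whose p-value is under its BH threshold.
        if pvalues[idx] <= fdr * rank / m:
            k_max = rank
    return sorted(order[:k_max])


# Made-up p-values for five candidate mediators (assumption, not real data).
p_alpha = [0.001, 0.20, 0.0004, 0.03, 0.60]
p_beta = [0.002, 0.001, 0.0100, 0.70, 0.50]

p_joint = joint_significance_pvalues(p_alpha, p_beta)
selected = benjamini_hochberg(p_joint, fdr=0.05)
print(selected)
```

In this toy input, only mediators with small p-values on both paths survive, which reflects the idea that a mediator must be associated with the exposure and with the outcome at the same time.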
Authors: Yi Zhe Wang; Wei Zhao; Farah Ammous; Yanyi Song; Jiacong Du; Lulu Shang; Scott M Ratliff; Kari Moore; Kristen M Kelly; Belinda L Needham; Ana V Diez Roux; Yongmei Liu; Kenneth R Butler; Sharon L R Kardia; Bhramar Mukherjee; Xiang Zhou; Jennifer A Smith Journal: Front Cardiovasc Med Date: 2022-05-19