Mingyang Ren, Sanguo Zhang, Shuangge Ma, Qingzhao Zhang.
Abstract
In high-throughput cancer studies, gene-environment interactions associated with outcomes have important implications. Some commonly adopted identification methods do not respect the "main effect, interaction" hierarchical structure. In addition, they can be challenged by data contamination and/or long-tailed distributions, which are not uncommon. In this article, robust methods based on γ-divergence and density power divergence are proposed to accommodate contaminated data/long-tailed distributions. A hierarchical sparse group penalty is adopted for regularized estimation and selection and can identify important gene-environment interactions and respect the "main effect, interaction" hierarchical structure. The proposed methods are implemented using an effective group coordinate descent algorithm. Simulation shows that when contamination occurs, the proposed methods can significantly outperform the existing alternatives with more accurate identification. The proposed approach is applied to the analysis of The Cancer Genome Atlas (TCGA) triple-negative breast cancer data and Gene Environment Association Studies (GENEVA) Type 2 Diabetes data.
Keywords: divergence; gene-environment interaction; hierarchical structure; penalized identification; robustness
Year: 2021 PMID: 34725857 PMCID: PMC9386692 DOI: 10.1002/bimj.202000157
Source DB: PubMed Journal: Biom J ISSN: 0323-3847 Impact factor: 1.715