| Literature DB >> 29954287 |
Jiyun Choi1, Kipoong Kim1, Hokeun Sun1.
Abstract
In genetic association studies, regularization methods are often used due to their computational efficiency for analysis of high-dimensional genomic data. DNA methylation data generated from Infinium HumanMethylation450 BeadChip Kit have a group structure where an individual gene consists of multiple Cytosine-phosphate-Guanine (CpG) sites. Consequently, group-based regularization can precisely detect outcome-related CpG sites. Representative examples are sparse group lasso (SGL) and network-based regularization. The former is powerful when most of the CpG sites within the same gene are associated with a phenotype outcome. In contrast, the latter is preferred when only a few of the CpG sites within the same gene are related to the outcome. In this paper, we propose new variable selection strategy based on a selection probability that measures selection frequency of individual variables selected by both SGL and network-based regularization. In extensive simulation study, we demonstrated that the proposed strategy can show relatively outstanding selection performance under any situation, compared with both SGL and network-based regularization. Also, we applied the proposed strategy to identify differentially methylated CpG sites and their corresponding genes from ovarian cancer data.Entities:
Keywords: DNA methylation; SGL; network-based regularization; selection probability
Mesh:
Year: 2018 PMID: 29954287 DOI: 10.1142/S0219720018500105
Source DB: PubMed Journal: J Bioinform Comput Biol ISSN: 0219-7200 Impact factor: 1.122