Juho A J Kontio, Mikko J Sillanpää.
Abstract
Gaussian process (GP)-based automatic relevance determination (ARD) is known to be an efficient technique for identifying determinants of gene-by-gene interactions important to trait variation. However, the estimation of GP models is feasible only for low-dimensional datasets (∼200 variables), which severely limits application of the GP-based ARD method to high-throughput sequencing data. In this paper, we provide a nonparametric prescreening method that preserves virtually all the major benefits of the GP-based ARD method and extends its scalability to the typical high-dimensional datasets used in practice. In several simulated test scenarios, the proposed method compared favorably with existing nonparametric dimension reduction/prescreening methods suitable for higher-order interaction searches. As a real-data example, the proposed method was applied to a high-throughput dataset downloaded from The Cancer Genome Atlas (TCGA) with measured expression levels of 16,976 genes (after preprocessing) from patients diagnosed with acute myeloid leukemia.
Keywords: Gaussian kernel models; Gaussian process regression; Haseman-Elston regression; acute myeloid leukemia; higher-order gene-by-gene interactions; nonlinear dimension reduction
Year: 2019 PMID: 31585953 PMCID: PMC6893368 DOI: 10.1534/genetics.119.302658
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562