Multigenic modeling of complex disease by random forests.

Yan V Sun

Abstract

The genetics and heredity of complex human traits have been studied for over a century, and many genes have been implicated in these traits. Genome-wide association studies (GWAS) were designed to investigate the association between common genetic variation and complex human traits using high-throughput platforms that measure hundreds of thousands of common single-nucleotide polymorphisms (SNPs). Using a univariate regression-based approach, GWAS have successfully identified many novel genetic loci associated with complex traits. Yet even for traits with a large number of identified variants, only a small fraction of the interindividual variation in risk phenotypes has been explained. In biological systems, proteins, DNA, RNA, and metabolites frequently interact with each other to perform their biological functions and to respond to environmental factors. The complex interactions among genes, and between genes and the environment, may partially explain the "missing heritability." Traditional regression-based methods are limited in their ability to address the complex interactions among hundreds of thousands of SNPs and their environmental context, because of both modeling and computational challenges. Random Forests (RF), a powerful machine learning method, is regarded as a useful alternative that can capture complex interaction effects in GWAS data and potentially address the genetic heterogeneity underlying these complex traits within a computationally efficient framework. In this chapter, the prediction and variable selection features of RF, and their applications in genetic association studies, are reviewed and discussed. Further improvements to the original RF method are warranted to make its application to GWAS more successful.
Copyright © 2010 Elsevier Inc. All rights reserved.

Year:  2010        PMID: 21029849     DOI: 10.1016/B978-0-12-380862-2.00004-7

Source DB:  PubMed          Journal:  Adv Genet        ISSN: 0065-2660            Impact factor:   1.944


  18 in total

1.  Correction for population stratification in random forest analysis.

Authors:  Yang Zhao; Feng Chen; Rihong Zhai; Xihong Lin; Zhaoxi Wang; Li Su; David C Christiani
Journal:  Int J Epidemiol       Date:  2012-11-12       Impact factor: 7.196

Review 2.  Random forests for genetic association studies.

Authors:  Benjamin A Goldstein; Eric C Polley; Farren B S Briggs
Journal:  Stat Appl Genet Mol Biol       Date:  2011-07-12

3.  Development and assessment of machine learning algorithms for predicting remission after transsphenoidal surgery among patients with acromegaly.

Authors:  Yanghua Fan; Yansheng Li; Yichao Li; Shanshan Feng; Xinjie Bao; Ming Feng; Renzhi Wang
Journal:  Endocrine       Date:  2019-10-30       Impact factor: 3.633

Review 4.  Systems biology data analysis methodology in pharmacogenomics.

Authors:  Andrei S Rodin; Grigoriy Gogoshin; Eric Boerwinkle
Journal:  Pharmacogenomics       Date:  2011-09       Impact factor: 2.533

Review 5.  Brief review of regression-based and machine learning methods in genetic epidemiology: the Genetic Analysis Workshop 17 experience.

Authors:  Abhijit Dasgupta; Yan V Sun; Inke R König; Joan E Bailey-Wilson; James D Malley
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

6.  Extending the Distributed Lag Model framework to handle chemical mixtures.

Authors:  Ghalib A Bello; Manish Arora; Christine Austin; Megan K Horton; Robert O Wright; Chris Gennings
Journal:  Environ Res       Date:  2017-04-03       Impact factor: 6.498

Review 7.  Genomics and bioinformatics of Parkinson's disease.

Authors:  Sonja W Scholz; Tim Mhyre; Habtom Ressom; Salim Shah; Howard J Federoff
Journal:  Cold Spring Harb Perspect Med       Date:  2012-07       Impact factor: 6.915

Review 8.  Integrative systems biology approaches in asthma pharmacogenomics.

Authors:  Amber Dahlin; Kelan G Tantisira
Journal:  Pharmacogenomics       Date:  2012-09       Impact factor: 2.533

9.  Beyond the fourth wave of genome-wide obesity association studies.

Authors:  C H Sandholt; T Hansen; O Pedersen
Journal:  Nutr Diabetes       Date:  2012-07-30       Impact factor: 5.097

10.  SNP interaction detection with Random Forests in high-dimensional genetic data.

Authors:  Stacey J Winham; Colin L Colby; Robert R Freimuth; Xin Wang; Mariza de Andrade; Marianne Huebner; Joanna M Biernacka
Journal:  BMC Bioinformatics       Date:  2012-07-15       Impact factor: 3.169
