EM-random forest and new measures of variable importance for multi-locus quantitative trait linkage analysis.

Sophia S F Lee, Lei Sun, Rafal Kustra, Shelley B Bull.

Abstract

MOTIVATION: We developed an EM-random forest (EMRF) for Haseman-Elston quantitative trait linkage analysis that accounts for marker ambiguity and weights each sib-pair according to its posterior identical-by-descent (IBD) distribution. The usual random forest (RF) variable importance (VI) index used to rank markers for variable selection is not optimal when applied to linkage data because markers are correlated. We define new VI indices that borrow information from linked markers using the correlation structure inherent in IBD linkage data.
RESULTS: Using simulations, we found that the new VI indices in EMRF performed better than the original RF VI index and performed similarly to or better than the EM-Haseman-Elston regression LOD score for various genetic models. Moreover, tree size and the size of the marker subset evaluated at each node are important considerations in RFs.
AVAILABILITY: The source code for EMRF, written in C, is available at www.infornomics.utoronto.ca/downloads/EMRF.
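
The weighting idea described in the abstract, where each sib-pair contributes according to its posterior IBD distribution rather than a single inferred IBD value, can be illustrated with a small sketch. This is not the authors' EMRF code (which is written in C) and it omits the EM iteration and the random forest entirely. It only shows a weighted Haseman-Elston-style regression of the squared trait difference on the IBD count, with each pair expanded into three pseudo-observations. The function name `weighted_he_fit` and the input layout are assumptions made for illustration.

```python
def weighted_he_fit(pairs):
    """Weighted least-squares fit of squared sib-pair trait difference on IBD count.

    pairs: iterable of (y1, y2, (p0, p1, p2)), where p_k is the (assumed given)
    posterior probability that the pair shares k alleles IBD at the marker.
    Each pair is expanded into pseudo-observations (ibd=k, weight=p_k).
    Returns (intercept, slope).
    """
    sw = swx = swy = swxx = swxy = 0.0
    for y1, y2, posterior in pairs:
        d2 = (y1 - y2) ** 2  # Haseman-Elston response: squared trait difference
        for ibd, w in enumerate(posterior):
            sw += w
            swx += w * ibd
            swy += w * d2
            swxx += w * ibd * ibd
            swxy += w * ibd * d2
    denom = sw * swxx - swx ** 2
    if denom == 0:
        raise ValueError("no variation in IBD values; slope is undefined")
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical data: two pairs with known IBD and one ambiguous pair.
    example = [
        (0.0, 2.0, (1.0, 0.0, 0.0)),
        (1.0, 1.0, (0.0, 0.0, 1.0)),
        (0.5, 1.5, (0.25, 0.5, 0.25)),
    ]
    print(weighted_he_fit(example))
```

In classical Haseman-Elston regression, a negative slope is the signal of linkage: pairs sharing more alleles IBD tend to have more similar trait values. In this sketch the posterior weights are fixed inputs; a full EM treatment would update them between fits.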

Year:  2008        PMID: 18499695      PMCID: PMC2638262          DOI: 10.1093/bioinformatics/btn239

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  8 in total

1.  Genome-wide strategies for discovering genetic influences on cognition and cognitive disorders: methodological considerations.

Authors:  Steven G Potkin; Jessica A Turner; Guia Guffanti; Anita Lakatos; Federica Torri; David B Keator; Fabio Macciardi
Journal:  Cogn Neuropsychiatry       Date:  2009       Impact factor: 1.871

Review 2.  Random forests for genetic association studies.

Authors:  Benjamin A Goldstein; Eric C Polley; Farren B S Briggs
Journal:  Stat Appl Genet Mol Biol       Date:  2011-07-12

3.  A novel targeted learning method for quantitative trait loci mapping.

Authors:  Hui Wang; Zhongyang Zhang; Sherri Rose; Mark van der Laan
Journal:  Genetics       Date:  2014-09-24       Impact factor: 4.562

4.  An enhanced machine learning tool for cis-eQTL mapping with regularization and confounder adjustments.

Authors:  Kang K Yan; Hongyu Zhao; Joseph T Wu; Herbert Pang
Journal:  Genet Epidemiol       Date:  2020-07-22       Impact factor: 2.135

5.  Data-driven assessment of eQTL mapping methods.

Authors:  Jacob J Michaelson; Rudi Alberts; Klaus Schughart; Andreas Beyer
Journal:  BMC Genomics       Date:  2010-09-17       Impact factor: 3.969

6.  Impact of natural genetic variation on gene expression dynamics.

Authors:  Marit Ackermann; Weronika Sikora-Wohlfeld; Andreas Beyer
Journal:  PLoS Genet       Date:  2013-06-06       Impact factor: 5.917

7.  An experimental study of the intrinsic stability of random forest variable importance measures.

Authors:  Huazhen Wang; Fan Yang; Zhiyuan Luo
Journal:  BMC Bioinformatics       Date:  2016-02-03       Impact factor: 3.169

8.  Application of data mining for predicting hemodynamics instability during pheochromocytoma surgery.

Authors:  Yueyang Zhao; Li Fang; Lei Cui; Song Bai
Journal:  BMC Med Inform Decis Mak       Date:  2020-07-20       Impact factor: 2.796
