Brief review of regression-based and machine learning methods in genetic epidemiology: the Genetic Analysis Workshop 17 experience.

Abhijit Dasgupta, Yan V Sun, Inke R König, Joan E Bailey-Wilson, James D Malley.

Abstract

Genetic Analysis Workshop 17 provided common and rare genetic variants from exome sequencing data and simulated binary and quantitative traits in 200 replicates. We provide a brief review of the machine learning and regression-based methods used in the analyses of these data. Several regression and machine learning methods were used to address different problems inherent in the analyses of these data, which are high-dimensional, low-sample-size data typical of many genetic association studies. Unsupervised methods, such as cluster analysis, were used for data segmentation and subset selection. Supervised learning methods, which include regression-based methods (e.g., generalized linear models, logic regression, and regularized regression) and tree-based methods (e.g., decision trees and random forests), were used for variable selection (selecting the genetic and clinical features most associated with or predictive of outcome) and prediction (developing models that use common and rare genetic variants to accurately predict outcome), with the outcome being case-control status or quantitative trait value. We include a discussion of cross-validation for model selection and assessment, and a description of available software resources for these methods.
© 2011 Wiley Periodicals, Inc.
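
The cross-validation for model selection and assessment mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the procedure used by the authors or by any Genetic Analysis Workshop 17 contributor. The toy genotype matrix, the single-variant "carrier" rule, and all function names are assumptions made for this example. The one point it tries to show is that variable selection is repeated inside each training fold, so the held-out samples play no part in choosing the variant.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def carrier_rule_accuracy(genotypes, labels, variant, samples):
    """Accuracy of the rule 'carries a minor allele => case' on the given samples."""
    hits = sum((genotypes[i][variant] > 0) == bool(labels[i]) for i in samples)
    return hits / len(samples)


def select_variant(genotypes, labels, train):
    """Variable selection: pick the variant whose carrier rule fits the training samples best."""
    n_variants = len(genotypes[0])
    return max(
        range(n_variants),
        key=lambda j: carrier_rule_accuracy(genotypes, labels, j, train),
    )


def cross_validate(genotypes, labels, k=5, seed=0):
    """Mean held-out accuracy over k folds, with selection redone in every fold."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        variant = select_variant(genotypes, labels, train)  # training samples only
        accuracies.append(carrier_rule_accuracy(genotypes, labels, variant, held_out))
    return sum(accuracies) / k


# Hypothetical toy data: 60 samples, 200 variants coded as minor-allele counts (0, 1, 2).
rng = random.Random(1)
n_samples, n_variants = 60, 200
genotypes = [
    [rng.choice([0, 0, 0, 1, 2]) for _ in range(n_variants)]
    for _ in range(n_samples)
]
# Toy outcome (assumption): carriers of variant 0 are cases, everyone else is a control.
labels = [int(row[0] > 0) for row in genotypes]

cv_accuracy = cross_validate(genotypes, labels, k=5)
print(f"5-fold cross-validated accuracy: {cv_accuracy:.2f}")
```

With far more variants than samples, as in this toy setup, selecting the variant on the full data set before splitting would let the held-out samples influence the model and would tend to overstate accuracy. That is why selection sits inside the loop here.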

Year:  2011        PMID: 22128059      PMCID: PMC3345521          DOI: 10.1002/gepi.20642

Source DB:  PubMed          Journal:  Genet Epidemiol        ISSN: 0741-0395            Impact factor:   2.135


  7 in total

1.  On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data.

Authors:  Daniel F Schwarz; Inke R König; Andreas Ziegler
Journal:  Bioinformatics       Date:  2010-05-26       Impact factor: 6.937

2.  Genome-wide association analysis by lasso penalized logistic regression.

Authors:  Tong Tong Wu; Yi Fang Chen; Trevor Hastie; Eric Sobel; Kenneth Lange
Journal:  Bioinformatics       Date:  2009-01-28       Impact factor: 6.937

3.  Machine learning in genome-wide association studies.

Authors:  Silke Szymczak; Joanna M Biernacka; Heather J Cordell; Oscar González-Recio; Inke R König; Heping Zhang; Yan V Sun
Journal:  Genet Epidemiol       Date:  2009       Impact factor: 2.135

4.  Multigenic modeling of complex disease by random forests. (Review)

Authors:  Yan V Sun
Journal:  Adv Genet       Date:  2010       Impact factor: 1.944

5.  Lessons learned from Genetic Analysis Workshop 17: transitioning from genome-wide association studies to whole-genome statistical genetic analysis.

Authors:  Alexander F Wilson; Andreas Ziegler
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

6.  Genetic Analysis Workshop 17 mini-exome simulation.

Authors:  Laura Almasy; Thomas D Dyer; Juan Manuel Peralta; Jack W Kent; Jac C Charlesworth; Joanne E Curran; John Blangero
Journal:  BMC Proc       Date:  2011-11-29

7.  GeneSrF and varSelRF: a web-based tool and R package for gene selection and classification using random forest.

Authors:  Ramón Diaz-Uriarte
Journal:  BMC Bioinformatics       Date:  2007-09-03       Impact factor: 3.169

  39 in total

1.  Multiple testing in high-throughput sequence data: experiences from Group 8 of Genetic Analysis Workshop 17.

Authors:  Inke R König; Jeremie Nsengimana; Charalampos Papachristou; Matthew A Simonson; Kai Wang; Jason A Weisburd
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

2.  Automated grouping of action potentials of human embryonic stem cell-derived cardiomyocytes.

Authors:  Giann Gorospe; Renjun Zhu; Michal A Millrod; Elias T Zambidis; Leslie Tung; Rene Vidal
Journal:  IEEE Trans Biomed Eng       Date:  2014-09       Impact factor: 4.538

3.  Detecting rare variant associations: methods for testing haplotypes and multiallelic genotypes.

Authors:  Rita M Cantor; Marsha Wilcox
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

4.  Quality control issues and the identification of rare functional variants with next-generation sequencing data.

Authors:  Claudia Hemmelmann; E Warwick Daw; Alexander F Wilson
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

5.  Regression and data mining methods for analyses of multiple rare variants in the Genetic Analysis Workshop 17 mini-exome data.

Authors:  Joan E Bailey-Wilson; Jennifer S Brennan; Shelley B Bull; Robert Culverhouse; Yoonhee Kim; Yuan Jiang; Jeesun Jung; Qing Li; Claudia Lamina; Ying Liu; Reedik Mägi; Yue S Niu; Claire L Simpson; Libo Wang; Yildiz E Yilmaz; Heping Zhang; Zhaogong Zhang
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

6.  Identification of genetic association of multiple rare variants using collapsing methods.

Authors:  Yan V Sun; Yun Ju Sung; Nathan Tintle; Andreas Ziegler
Journal:  Genet Epidemiol       Date:  2011       Impact factor: 2.135

7.  13C NMR metabolomic evaluation of immediate and delayed mild hypothermia in cerebrocortical slices after oxygen-glucose deprivation.

Authors:  Jia Liu; Mark R Segal; Mark J S Kelly; Jeffrey G Pelton; Myungwon Kim; Thomas L James; Lawrence Litt
Journal:  Anesthesiology       Date:  2013-11       Impact factor: 7.892

8.  Genetic interactions effects for cancer disease identification using computational models: a review. (Review)

Authors:  R Manavalan; S Priya
Journal:  Med Biol Eng Comput       Date:  2021-04-11       Impact factor: 2.602

9.  A Genome-Wide Association Study and Machine-Learning Algorithm Analysis on the Prediction of Facial Phenotypes by Genotypes in Korean Women.

Authors:  Hye-Young Yoo; Ki-Chan Lee; Ji-Eun Woo; Sung-Ha Park; Sunghoon Lee; Joungsu Joo; Jin-Sik Bae; Hyuk-Jung Kwon; Byoung-Jun Park
Journal:  Clin Cosmet Investig Dermatol       Date:  2022-03-11

10.  An algorithm for candidate sequencing in non-dystrophic skeletal muscle channelopathies.

Authors:  Tai-Seung Nam; Christoph Lossin; Dong-Uk Kim; Myeong-Kyu Kim; Young-Ok Kim; Kang-Ho Choi; Seok-Yong Choi; Sang-Cheol Park; In-Seop Na
Journal:  J Neurol       Date:  2013-03-03       Impact factor: 4.849
