
Penalized Regression and Risk Prediction in Genome-Wide Association Studies.

Erin Austin, Wei Pan, Xiaotong Shen.

Abstract

An important task in personalized medicine is to predict disease risk from a person's genome, e.g. from a large number of single-nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS) make SNP and phenotype data available to researchers. A critical question for researchers is how best to predict disease risk. Penalized regression equipped with variable selection, such as LASSO and SCAD, is considered promising in this setting. However, the sparsity assumption made by LASSO, SCAD and many other penalized regression techniques may not hold here: it is now hypothesized that many common diseases are associated with many SNPs, each with a small to moderate effect. In this article, we use the GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) to investigate the performance of various unpenalized and penalized regression approaches when the true model is sparse or non-sparse. We find that, in general, penalized regression outperformed unpenalized regression; SCAD, TLP and LASSO performed best for sparse models, while elastic net regression was the winner, followed by ridge, TLP and LASSO, for non-sparse models.
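The methods compared in the abstract differ mainly in the penalty each one adds to the logistic log-likelihood, and performance is summarized by AUC (see the keywords). The block below is a minimal Python sketch of the standard textbook forms of those penalties for a single coefficient, plus a rank-based AUC. It is not the authors' code: the function names, the glmnet-style elastic net parametrization, the SCAD default `a = 3.7`, and the TLP threshold `tau` are assumptions chosen for illustration.

```python
def lasso(beta, lam):
    """L1 penalty: grows linearly, so large effects keep being shrunk."""
    return lam * abs(beta)


def ridge(beta, lam):
    """Squared L2 penalty: shrinks all coefficients but never sets one to zero."""
    return 0.5 * lam * beta ** 2


def elastic_net(beta, lam, alpha=0.5):
    """Mix of L1 and L2 (assumed glmnet-style form); alpha=1 is LASSO, alpha=0 is ridge."""
    return lam * (alpha * abs(beta) + 0.5 * (1.0 - alpha) * beta ** 2)


def scad(beta, lam, a=3.7):
    """SCAD penalty: L1 near zero, quadratic transition, constant beyond a*lam."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def tlp(beta, lam, tau=1.0):
    """Truncated L1 penalty: L1 up to tau, constant afterwards."""
    return lam * min(abs(beta) / tau, 1.0)


def auc(scores, labels):
    """Share of case/control pairs in which the case scores higher (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Compare how each penalty charges a small and a large coefficient.
    for beta in (0.5, 10.0):
        print(beta, lasso(beta, 1.0), ridge(beta, 1.0),
              elastic_net(beta, 1.0), scad(beta, 1.0), tlp(beta, 1.0))
```

SCAD and TLP stop growing once a coefficient is large, so strong signals are penalized less than under LASSO, while ridge and elastic net keep many small effects in the model, which matches the sparse versus non-sparse contrast the abstract reports.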


Keywords:  AUC; Elastic Net; GWAS; LASSO; Logistic regression; MLE; Ridge; SCAD; SNP; TLP

Year:  2013        PMID: 24348893      PMCID: PMC3859439          DOI: 10.1002/sam.11183

Source DB:  PubMed          Journal:  Stat Anal Data Min        ISSN: 1932-1864            Impact factor:   1.051


