Penalized Regression and Risk Prediction in Genome-Wide Association Studies.

Abstract

An important task in personalized medicine is to predict disease risk from a person's genome, for example from a large number of single-nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS) make SNP and phenotype data available to researchers, and a critical question is how best to use these data to predict disease risk. Penalized regression with built-in variable selection, such as LASSO and SCAD, is considered promising in this setting. However, the sparsity assumption made by LASSO, SCAD and many other penalized regression techniques may not hold here: it is now hypothesized that many common diseases are associated with many SNPs, each with a small to moderate effect. In this article, we use the GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) to investigate the performance of various unpenalized and penalized regression approaches when the true model is sparse or non-sparse. We find that, in general, penalized regression outperformed unpenalized regression. SCAD, TLP and LASSO performed best for sparse models, while elastic net regression performed best for non-sparse models, followed by ridge, TLP and LASSO.
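The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the WTCCC data. It assumes a toy simulation with independent SNPs (no linkage disequilibrium), a fixed rather than cross-validated penalty weight `lam`, and a plain proximal gradient solver for logistic regression with an elastic net penalty, where `alpha=1` gives LASSO and `alpha=0` gives ridge. SCAD and TLP are left out because they are non-convex and need a different solver. All function names and parameter values are hypothetical.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def fit_penalized_logistic(X, y, lam, alpha, step=0.5, n_iter=200):
    """Proximal gradient descent for the mean logistic loss plus
    lam * (alpha * ||b||_1 + (1 - alpha) / 2 * ||b||_2^2).
    The intercept b0 is not penalized; lam=0 gives the unpenalized fit."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
                 for row, yi in zip(X, y)]
        # Gradient of the smooth part (logistic loss + ridge term).
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                + lam * (1.0 - alpha) * b[j] for j in range(p)]
        b0 -= step * sum(resid) / n
        # Gradient step followed by soft-thresholding for the L1 term.
        b = [soft_threshold(bj - step * gj, step * lam * alpha)
             for bj, gj in zip(b, grad)]
    return b0, b


def auc(y, scores):
    """Probability that a random case scores above a random control
    (ties count one half)."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, p, n_causal, effect, rng):
    """Toy data: independent SNPs coded 0/1/2 (allele frequency 0.3),
    centered at 0.6; the first n_causal SNPs share the same effect."""
    beta = [effect] * n_causal + [0.0] * (p - n_causal)
    X, y = [], []
    for _ in range(n):
        row = [(rng.random() < 0.3) + (rng.random() < 0.3) - 0.6
               for _ in range(p)]
        eta = sum(bj * xj for bj, xj in zip(beta, row))
        X.append(row)
        y.append(1 if rng.random() < sigmoid(eta) else 0)
    return X, y


def demo():
    rng = random.Random(1)
    scenarios = {"sparse": (2, 0.8), "non-sparse": (20, 0.25)}
    methods = {"unpenalized": (0.0, 0.0), "LASSO": (0.05, 1.0),
               "ridge": (0.05, 0.0), "elastic net": (0.05, 0.5)}
    for name, (n_causal, effect) in scenarios.items():
        X_tr, y_tr = simulate(150, 20, n_causal, effect, rng)
        X_te, y_te = simulate(150, 20, n_causal, effect, rng)
        for method, (lam, alpha) in methods.items():
            b0, b = fit_penalized_logistic(X_tr, y_tr, lam, alpha)
            scores = [b0 + sum(bj * xj for bj, xj in zip(b, row))
                      for row in X_te]
            print(f"{name:10s} {method:12s} test AUC = {auc(y_te, scores):.3f}")


if __name__ == "__main__":
    demo()
```

Because the sample is small and `lam` is not tuned, the ranking printed by this toy run need not match the ranking reported in the abstract. In practice the penalty weight would be chosen by cross-validation on the training data.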

Keywords: AUC; Elastic Net; GWAS; LASSO; Logistic regression; MLE; Ridge; SCAD; SNP; TLP

Year: 2013; PMID: 24348893; PMCID: PMC3859439; DOI: 10.1002/sam.11183

Source DB: PubMed; Journal: Stat Anal Data Min; ISSN: 1932-1864; Impact factor: 1.051
