Iterative hard thresholding in genome-wide association studies: Generalized linear models, prior weights, and double sparsity.

Benjamin B Chu, Kevin L Keys, Christopher A German, Hua Zhou, Jin J Zhou, Eric M Sobel, Janet S Sinsheimer, Kenneth Lange.

Abstract

BACKGROUND: Consecutive testing of single nucleotide polymorphisms (SNPs) is usually employed to identify genetic variants associated with complex traits. Ideally one should model all covariates in unison, but most existing analysis methods for genome-wide association studies (GWAS) perform only univariate regression.
RESULTS: We extend and efficiently implement iterative hard thresholding (IHT) for multiple regression, treating all SNPs simultaneously. Our extensions accommodate generalized linear models, prior information on genetic variants, and grouping of variants. In our simulations, IHT recovers up to 30% more true predictors than SNP-by-SNP association testing and exhibits a 2-3 orders of magnitude decrease in false-positive rates compared with lasso regression. We also test IHT on the UK Biobank hypertension phenotypes and the Northern Finland Birth Cohort of 1966 cardiovascular phenotypes. We find that IHT scales to the large datasets of contemporary human genetics and recovers the plausible genetic variants identified by previous studies.
CONCLUSIONS: Our real data analysis and simulation studies suggest that IHT can (i) recover highly correlated predictors, (ii) avoid over-fitting, (iii) deliver better true-positive and false-positive rates than either marginal testing or lasso regression, (iv) recover unbiased regression coefficients, (v) exploit prior information and group-sparsity, and (vi) be used with biobank-sized datasets. Although we study these advances in the context of GWAS inference, our extensions are pertinent to other regression problems with large numbers of predictors.
© The Author(s) 2020. Published by Oxford University Press.
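
The abstract names iterative hard thresholding without showing what an iteration looks like. The sketch below illustrates the basic idea for ordinary least squares only: take a gradient step, then keep the k largest-magnitude coefficients and zero the rest. It is not the authors' implementation and omits the generalized linear models, prior weights, and group sparsity that the paper adds. The function names (`hard_threshold`, `iht_least_squares`, `hadamard`), the fixed step size, and the toy design with orthogonal columns are all assumptions chosen to keep the example easy to check; real genotype matrices have correlated columns and typically need an adaptive step size.

```python
def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and set the others to zero."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:k])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def iht_least_squares(X, y, k, step, n_iter=100):
    """Projected gradient descent on 0.5 * ||y - X beta||^2 with at most k nonzeros.

    X is a list of rows, y a list of responses. The fixed `step` is an
    assumption of this sketch; it must be small relative to the largest
    eigenvalue of X^T X for the iteration to be stable.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residuals r = y - X beta for the current sparse estimate.
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Gradient of the loss with respect to beta is -X^T r.
        grad = [-sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by projection onto the set of k-sparse vectors.
        beta = hard_threshold([beta[j] - step * grad[j] for j in range(p)], k)
    return beta


def hadamard(m):
    """Sylvester construction: a 2**m x 2**m matrix of +/-1 with orthogonal columns."""
    H = [[1.0]]
    for _ in range(m):
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


if __name__ == "__main__":
    # Toy, noise-free problem: 8 observations, 8 predictors, 3 true effects.
    X = hadamard(3)
    truth = [0.0, 2.0, 0.0, -1.5, 0.0, 0.0, 1.0, 0.0]
    y = [sum(x * b for x, b in zip(row, truth)) for row in X]
    print(iht_least_squares(X, y, k=3, step=0.1))
```

Unlike SNP-by-SNP testing, every coefficient is updated from the same joint residual, and unlike the lasso, the sparsity level k is imposed directly by the projection with no shrinkage of the retained coefficients.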

Keywords:  GWAS; biobank; high dimensional inference; iterative hard thresholding; multiple regression

Year:  2020        PMID: 32491161      PMCID: PMC7268817          DOI: 10.1093/gigascience/giaa044

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


