Cross-trait prediction accuracy of summary statistics in genome-wide association studies.

Bingxin Zhao, Fei Zou, Hongtu Zhu.

Abstract

In the era of big data, univariate models are widely used as a workhorse tool for quickly producing marginal estimators, even in high-dimensional dense settings in which many features are "true" but weak signals. Genome-wide association studies (GWAS) epitomize this type of setting. Although the GWAS marginal estimator is popular, it has long been criticized for ignoring the correlation structure of genetic variants (i.e., the linkage disequilibrium [LD] pattern). In this paper, we study the effects of the LD pattern on the GWAS marginal estimator and investigate whether additionally accounting for LD can improve the prediction accuracy of complex traits. We consider a general high-dimensional dense setting for GWAS and study a class of ridge-type estimators that includes the popular marginal estimator and the best linear unbiased prediction (BLUP) estimator as two special cases. We show that the performance of the GWAS marginal estimator depends on the LD pattern through the first three moments of its eigenvalue distribution. Furthermore, we find that the relative performance of the GWAS marginal and BLUP estimators depends strongly on the ratio of the GWAS sample size to the number of genetic variants. In particular, the marginal estimator can easily become near-optimal within this class when the sample size is relatively small, even though it ignores the LD pattern. On the other hand, the BLUP estimator performs substantially better than the marginal estimator as the sample size increases toward the number of genetic variants, which is typically in the millions. Therefore, adjusting for LD (as in the BLUP) is most needed when the GWAS sample size is large. We illustrate the importance of our results using simulated data and real GWAS.
© 2022 The International Biometric Society.
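
The ridge-type class described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code. The parameterization (X^T X + lam * I)^{-1} X^T y, the AR(1) correlation used as a stand-in for an LD pattern, the toy sizes, and all function names are assumptions made here for illustration. Under this parameterization the estimate becomes proportional to the marginal estimate X^T y / n as lam grows large, and the standard mixed-model BLUP corresponds to lam equal to the noise variance divided by the per-variant effect variance.

```python
import numpy as np


def marginal_estimator(X, y):
    """Marginal (one-variant-at-a-time) estimate: X^T y / n."""
    return X.T @ y / X.shape[0]


def ridge_type_estimator(X, y, lam):
    """Ridge-type estimate: (X^T X + lam * I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def simulate(n, p, rho, h2, rng):
    """Toy dense-signal data; AR(1) correlation is an assumed LD stand-in."""
    idx = np.arange(p)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(sigma)
    beta = rng.standard_normal(p) * np.sqrt(h2 / p)  # every feature has a weak effect

    def draw(m):
        X = rng.standard_normal((m, p)) @ chol.T
        y = X @ beta + rng.standard_normal(m) * np.sqrt(1.0 - h2)
        return X, y

    # One training sample and one independent sample for evaluation.
    return draw(n), draw(n)


def prediction_accuracy(X_test, y_test, b):
    """Out-of-sample correlation between the predicted and observed trait."""
    return np.corrcoef(X_test @ b, y_test)[0, 1]


def cosine(a, b):
    """Cosine similarity between two coefficient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
n, p, h2 = 200, 400, 0.5  # toy sizes; real GWAS have p in the millions
(X, y), (X_new, y_new) = simulate(n, p, rho=0.5, h2=h2, rng=rng)

b_marginal = marginal_estimator(X, y)
lam_blup = p * (1.0 - h2) / h2  # noise variance / per-variant effect variance
b_blup = ridge_type_estimator(X, y, lam_blup)
b_large_lam = ridge_type_estimator(X, y, 1e8)  # approaches the marginal direction

print("marginal accuracy:", prediction_accuracy(X_new, y_new, b_marginal))
print("BLUP accuracy:    ", prediction_accuracy(X_new, y_new, b_blup))
print("cosine(ridge with huge lam, marginal):", cosine(b_large_lam, b_marginal))
```

Varying n relative to p in this sketch is one way to explore the sample-size dependence the abstract describes, though a toy simulation of this size should not be expected to reproduce the paper's asymptotic results.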

Keywords:  BLUP; GWAS; high-dimension prediction; marginal estimator; polygenic risk score; ridge-type estimator

Year:  2022        PMID: 35278218      PMCID: PMC9464799          DOI: 10.1111/biom.13661

Source DB:  PubMed          Journal:  Biometrics        ISSN: 0006-341X            Impact factor:   1.701


