A Unifying Framework for Imputing Summary Statistics in Genome-Wide Association Studies.

Yue Wu, Eleazar Eskin, Sriram Sankararaman.

Abstract

Methods to impute missing data are routinely used to increase power in genome-wide association studies. There are two broad classes of imputation methods. The first class imputes genotypes at the untyped variants, given those at the typed variants, and then performs a statistical test of association at the imputed variants. The second class, summary statistic imputation (SSI), directly imputes association statistics at the untyped variants, given the association statistics observed at the typed variants. The second class is appealing because it tends to be computationally efficient and requires only the summary statistics from a study, whereas the first class requires access to individual-level data that can be difficult to obtain. The statistical properties of these two classes of imputation methods are not yet fully understood. In this study, we show that the two classes of imputation methods yield association statistics with similar distributions for sufficiently large sample sizes. Using this relationship, we can understand the effect of the imputation method on power. We show that a commonly used approach to SSI, which we term SSI with variance reweighting, generally leads to a loss in power. In contrast, our proposed SSI method, which does not perform variance reweighting, fully accounts for imputation uncertainty while achieving better power.
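
The abstract gives no formulas, so the following is only a minimal sketch of how the two SSI variants might differ mechanically. It assumes the standard Gaussian conditional-mean form, in which the imputed statistic at an untyped variant is a linear combination of the typed z-scores weighted by linkage disequilibrium (LD), and in which "variance reweighting" means dividing that statistic by the square root of its null variance. The function names, the LD values, and the z-scores are all hypothetical, and the sketch does not reproduce the power comparison reported in the paper.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix of typed variants is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def impute_z(ld_tt, ld_ut, z_typed):
    """Impute the z-score at one untyped variant (assumed Gaussian model).

    ld_tt   : LD (correlation) matrix among the typed variants
    ld_ut   : LD between the untyped variant and each typed variant
    z_typed : observed z-scores at the typed variants
    Returns (z_plain, z_reweighted, r2).
    """
    # Weights w solve ld_tt w = ld_ut, so z_plain = w' z_typed.
    w = solve(ld_tt, ld_ut)
    z_plain = sum(wi * zi for wi, zi in zip(w, z_typed))
    # Null variance of z_plain under this model: ld_ut' ld_tt^{-1} ld_ut.
    r2 = sum(wi * ri for wi, ri in zip(w, ld_ut))
    if r2 <= 0.0:
        raise ValueError("untyped variant is not predicted by the typed variants")
    # Variance reweighting rescales the statistic to unit null variance.
    z_reweighted = z_plain / math.sqrt(r2)
    return z_plain, z_reweighted, r2


# Hypothetical toy input: two typed variants and one untyped variant.
z_plain, z_reweighted, r2 = impute_z(
    ld_tt=[[1.0, 0.5], [0.5, 1.0]],
    ld_ut=[0.6, 0.4],
    z_typed=[3.0, 2.0],
)
print(z_plain, z_reweighted, r2)
```

In this sketch `z_plain` corresponds to SSI without variance reweighting and `z_reweighted` to SSI with it; the two differ only by the factor `1 / sqrt(r2)`, which grows as the typed variants predict the untyped variant less well.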

Keywords:  genome-wide association studies; imputation; summary statistics

Year:  2020        PMID: 32053016      PMCID: PMC7081249          DOI: 10.1089/cmb.2019.0449

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


