
A data harmonization pipeline to leverage external controls and boost power in GWAS.

Danfeng Chen, Katherine Tashman, Duncan S Palmer, Benjamin Neale, Kathryn Roeder, Alex Bloemendal, Claire Churchhouse, Zheng Tracy Ke.   

Abstract

The use of external controls in genome-wide association studies (GWAS) can significantly increase the size and diversity of the control sample, enabling high-resolution ancestry matching and enhancing the power to detect association signals. However, aggregating controls from multiple sources is challenging because of batch effects, the difficulty of identifying genotyping errors and the use of different genotyping platforms. These obstacles have impeded the use of external controls in GWAS and can lead to spurious results if not carefully addressed. We propose a unified data harmonization pipeline that includes an iterative approach to quality control and imputation, implemented before and after merging cohorts and arrays. We apply this harmonization pipeline to aggregate 27 517 European control samples from 16 collections within dbGaP. We leverage these harmonized controls to conduct a GWAS of Crohn's disease. We demonstrate a boost in power over using the cohort samples alone and show that our procedure yields summary statistics free of any significant batch effects. This harmonization pipeline for aggregating genotype data from multiple sources can also serve other applications where individual-level genotypes, rather than summary statistics, are required.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
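
The abstract does not say how batch effects between control collections were tested. As an illustration only, one common check is to compare alternate-allele counts for each variant between two control batches and flag variants whose frequencies differ more than chance would allow. The sketch below assumes per-variant allele counts are already available; the function names, the input layout and the `alpha` threshold are hypothetical and are not taken from the paper.

```python
import math


def allele_count_chi2(alt_a, total_a, alt_b, total_b):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    alt_*   : number of alternate alleles observed in each batch
    total_* : total number of alleles (2 x samples) in each batch
    Returns (chi2, p_value).
    """
    ref_a = total_a - alt_a
    ref_b = total_b - alt_b
    n = total_a + total_b
    alt = alt_a + alt_b
    ref = ref_a + ref_b
    # Monomorphic variants or empty batches carry no evidence of a difference.
    if alt == 0 or ref == 0 or total_a == 0 or total_b == 0:
        return 0.0, 1.0
    chi2 = n * (alt_a * ref_b - alt_b * ref_a) ** 2 / (alt * ref * total_a * total_b)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def flag_batch_variants(counts, alpha=1e-4):
    """Return variant IDs whose allele frequency differs between two batches.

    counts : dict mapping variant ID -> (alt_a, total_a, alt_b, total_b)
    alpha  : illustrative p-value threshold (an assumption, not from the paper)
    """
    flagged = []
    for variant_id, (alt_a, total_a, alt_b, total_b) in counts.items():
        _, p_value = allele_count_chi2(alt_a, total_a, alt_b, total_b)
        if p_value < alpha:
            flagged.append(variant_id)
    return flagged


# Toy counts for two hypothetical control batches of 1000 samples each.
example_counts = {
    "variant_1": (200, 2000, 205, 2000),  # similar frequencies
    "variant_2": (200, 2000, 400, 2000),  # frequency doubles in batch B
}
flagged_variants = flag_batch_variants(example_counts)
```

In practice a check like this would be run only on ancestry-matched controls, since frequency differences between batches can reflect population structure rather than genotyping artefacts.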

Year:  2022        PMID: 34508597      PMCID: PMC8825237          DOI: 10.1093/hmg/ddab261

Source DB:  PubMed          Journal:  Hum Mol Genet        ISSN: 0964-6906            Impact factor:   5.121

