
Imputation and quality control steps for combining multiple genome-wide datasets.

Shefali S Verma, Mariza de Andrade, Gerard Tromp, Helena Kuivaniemi, Elizabeth Pugh, Bahram Namjou-Khales, Shubhabrata Mukherjee, Gail P Jarvik, Leah C Kottyan, Amber Burt, Yuki Bradford, Gretta D Armstrong, Kimberly Derr, Dana C Crawford, Jonathan L Haines, Rongling Li, David Crosslin, Marylyn D Ritchie.

Abstract

The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R² (estimated correlation between the imputed and true genotypes), and the relationship between allelic R² and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
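
The abstract names two evaluation metrics, allelic R² and its relationship to minor allele frequency (MAF). As a rough illustration only, the following Python sketch computes an empirical squared correlation between imputed dosages and known genotypes for each SNP, then averages it within MAF bins. It is not the eMERGE pipeline and not the estimator reported by BEAGLE or IMPUTE2; the function names, the input layout (one pair of per-sample lists per SNP), and the bin edges are all assumptions made for this example.

```python
import math
from statistics import mean


def squared_correlation(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages (0..2) and
    true allele counts (0, 1, 2). Returns NaN if either side is constant."""
    mx = mean(imputed_dosages)
    my = mean(true_genotypes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(imputed_dosages, true_genotypes))
    sxx = sum((x - mx) ** 2 for x in imputed_dosages)
    syy = sum((y - my) ** 2 for y in true_genotypes)
    if sxx == 0 or syy == 0:
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


def minor_allele_frequency(genotypes):
    """MAF from allele counts coded 0/1/2 in diploid samples."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def mean_r2_by_maf_bin(snps, bin_edges):
    """Average per-SNP squared correlation within MAF bins.

    snps: iterable of (imputed_dosages, true_genotypes) pairs (hypothetical layout).
    bin_edges: ascending edges, e.g. [0.0, 0.05, 0.5]; the last bin includes
    its upper edge so that MAF == 0.5 is not dropped.
    """
    bins = {}
    for imputed, true in snps:
        r2 = squared_correlation(imputed, true)
        if math.isnan(r2):
            continue  # monomorphic SNPs carry no correlation information
        maf = minor_allele_frequency(true)
        for lo, hi in zip(bin_edges, bin_edges[1:]):
            if lo <= maf < hi or (hi == bin_edges[-1] and maf == hi):
                bins.setdefault((lo, hi), []).append(r2)
                break
    return {edges: mean(values) for edges, values in bins.items()}
```

In practice the true genotypes would come from genotyped markers deliberately masked before imputation; the software-reported allelic R² is instead estimated from the posterior genotype probabilities without access to the truth.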


Keywords:  eMERGE; electronic health records; genome-wide association; imputation

Year:  2014        PMID: 25566314      PMCID: PMC4263197          DOI: 10.3389/fgene.2014.00370

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


