Literature DB >> 34184051

BIGwas: Single-command quality control and association testing for multi-cohort and biobank-scale GWAS/PheWAS data.

Jan Christian Kässens1,2, Lars Wienbrandt1, David Ellinghaus1,3.   

Abstract

BACKGROUND: Genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) involving 1 million GWAS samples from dozens of population-based biobanks present a considerable computational challenge and are carried out by large scientific groups under great expenditure of time and personnel. Automating these processes requires highly efficient and scalable methods and software, but so far there is no workflow solution to easily process 1 million GWAS samples.
RESULTS: Here we present BIGwas, a portable, fully automated quality control and association testing pipeline for large-scale binary and quantitative trait GWAS data provided by biobank resources. By using Nextflow workflow and Singularity software container technology, BIGwas performs resource-efficient and reproducible analyses on a local computer or any high-performance compute (HPC) system with just 1 command, with no need to manually install a software execution environment or various software packages. For a single-command GWAS analysis with 974,818 individuals and 92 million genetic markers, BIGwas takes ∼16 days on a small HPC system with only 7 compute nodes to perform a complete GWAS QC and association analysis protocol. Our dynamic parallelization approach enables shorter runtimes for large HPCs.
CONCLUSIONS: Researchers without extensive bioinformatics knowledge and with few computer resources can use BIGwas to perform multi-cohort GWAS with 1 million GWAS samples and, if desired, use it to build their own (genome-wide) PheWAS resource. BIGwas is freely available for download from http://github.com/ikmb/gwas-qc and http://github.com/ikmb/gwas-assoc.
© The Author(s) 2021. Published by Oxford University Press GigaScience.

Entities:  

Keywords:  GWAS; Nextflow; PheWAS; Singularity; association testing; biobank; pipeline; quality control; scalability

Mesh:

Year:  2021        PMID: 34184051      PMCID: PMC8239664          DOI: 10.1093/gigascience/giab047

Source DB:  PubMed          Journal:  Gigascience        ISSN: 2047-217X            Impact factor:   6.524


  38 in total

1.  QCGWAS: A flexible R package for automated quality control of genome-wide association results.

Authors:  Peter J van der Most; Ahmad Vaez; Bram P Prins; M Loretto Munoz; Harold Snieder; Behrooz Z Alizadeh; Ilja M Nolte
Journal:  Bioinformatics       Date:  2014-01-05       Impact factor: 6.937

2.  Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts.

Authors:  Wei Zhou; Zhangchen Zhao; Jonas B Nielsen; Lars G Fritsche; Jonathon LeFaive; Sarah A Gagliano Taliun; Wenjian Bi; Maiken E Gabrielsen; Mark J Daly; Benjamin M Neale; Kristian Hveem; Goncalo R Abecasis; Cristen J Willer; Seunggeun Lee
Journal:  Nat Genet       Date:  2020-05-18       Impact factor: 38.330

3.  Data quality control in genetic case-control association studies.

Authors:  Carl A Anderson; Fredrik H Pettersson; Geraldine M Clarke; Lon R Cardon; Andrew P Morris; Krina T Zondervan
Journal:  Nat Protoc       Date:  2010-08-26       Impact factor: 13.491

4.  Next-generation genotype imputation service and methods.

Authors:  Sayantan Das; Lukas Forer; Sebastian Schönherr; Carlo Sidore; Adam E Locke; Alan Kwong; Scott I Vrieze; Emily Y Chew; Shawn Levy; Matt McGue; David Schlessinger; Dwight Stambolian; Po-Ru Loh; William G Iacono; Anand Swaroop; Laura J Scott; Francesco Cucca; Florian Kronenberg; Michael Boehnke; Gonçalo R Abecasis; Christian Fuchsberger
Journal:  Nat Genet       Date:  2016-08-29       Impact factor: 38.330

5.  The variant call format and VCFtools.

Authors:  Petr Danecek; Adam Auton; Goncalo Abecasis; Cornelis A Albers; Eric Banks; Mark A DePristo; Robert E Handsaker; Gerton Lunter; Gabor T Marth; Stephen T Sherry; Gilean McVean; Richard Durbin
Journal:  Bioinformatics       Date:  2011-06-07       Impact factor: 6.937

6.  Twelve years of SAMtools and BCFtools.

Authors:  Petr Danecek; James K Bonfield; Jennifer Liddle; John Marshall; Valeriu Ohan; Martin O Pollard; Andrew Whitwham; Thomas Keane; Shane A McCarthy; Robert M Davies; Heng Li
Journal:  Gigascience       Date:  2021-02-16       Impact factor: 6.524

7.  A reference panel of 64,976 haplotypes for genotype imputation.

Authors:  Shane McCarthy; Sayantan Das; Warren Kretzschmar; Olivier Delaneau; Andrew R Wood; Alexander Teumer; Hyun Min Kang; Christian Fuchsberger; Petr Danecek; Kevin Sharp; Yang Luo; Carlo Sidore; Alan Kwong; Nicholas Timpson; Seppo Koskinen; Scott Vrieze; Laura J Scott; He Zhang; Anubha Mahajan; Jan Veldink; Ulrike Peters; Carlos Pato; Cornelia M van Duijn; Christopher E Gillies; Ilaria Gandin; Massimo Mezzavilla; Arthur Gilly; Massimiliano Cocca; Michela Traglia; Andrea Angius; Jeffrey C Barrett; Dorrett Boomsma; Kari Branham; Gerome Breen; Chad M Brummett; Fabio Busonero; Harry Campbell; Andrew Chan; Sai Chen; Emily Chew; Francis S Collins; Laura J Corbin; George Davey Smith; George Dedoussis; Marcus Dorr; Aliki-Eleni Farmaki; Luigi Ferrucci; Lukas Forer; Ross M Fraser; Stacey Gabriel; Shawn Levy; Leif Groop; Tabitha Harrison; Andrew Hattersley; Oddgeir L Holmen; Kristian Hveem; Matthias Kretzler; James C Lee; Matt McGue; Thomas Meitinger; David Melzer; Josine L Min; Karen L Mohlke; John B Vincent; Matthias Nauck; Deborah Nickerson; Aarno Palotie; Michele Pato; Nicola Pirastu; Melvin McInnis; J Brent Richards; Cinzia Sala; Veikko Salomaa; David Schlessinger; Sebastian Schoenherr; P Eline Slagboom; Kerrin Small; Timothy Spector; Dwight Stambolian; Marcus Tuke; Jaakko Tuomilehto; Leonard H Van den Berg; Wouter Van Rheenen; Uwe Volker; Cisca Wijmenga; Daniela Toniolo; Eleftheria Zeggini; Paolo Gasparini; Matthew G Sampson; James F Wilson; Timothy Frayling; Paul I W de Bakker; Morris A Swertz; Steven McCarroll; Charles Kooperberg; Annelot Dekker; David Altshuler; Cristen Willer; William Iacono; Samuli Ripatti; Nicole Soranzo; Klaudia Walter; Anand Swaroop; Francesco Cucca; Carl A Anderson; Richard M Myers; Michael Boehnke; Mark I McCarthy; Richard Durbin
Journal:  Nat Genet       Date:  2016-08-22       Impact factor: 38.330

8.  The UCSC genome browser and associated tools.

Authors:  Robert M Kuhn; David Haussler; W James Kent
Journal:  Brief Bioinform       Date:  2012-08-20       Impact factor: 11.622

9.  Genetic analysis of over 1 million people identifies 535 new loci associated with blood pressure traits.

Authors:  Evangelos Evangelou; Helen R Warren; David Mosen-Ansorena; Borbala Mifsud; Raha Pazoki; He Gao; Georgios Ntritsos; Niki Dimou; Claudia P Cabrera; Ibrahim Karaman; Fu Liang Ng; Marina Evangelou; Katarzyna Witkowska; Evan Tzanis; Jacklyn N Hellwege; Ayush Giri; Digna R Velez Edwards; Yan V Sun; Kelly Cho; J Michael Gaziano; Peter W F Wilson; Philip S Tsao; Csaba P Kovesdy; Tonu Esko; Reedik Mägi; Lili Milani; Peter Almgren; Thibaud Boutin; Stéphanie Debette; Jun Ding; Franco Giulianini; Elizabeth G Holliday; Anne U Jackson; Ruifang Li-Gao; Wei-Yu Lin; Jian'an Luan; Massimo Mangino; Christopher Oldmeadow; Bram Peter Prins; Yong Qian; Muralidharan Sargurupremraj; Nabi Shah; Praveen Surendran; Sébastien Thériault; Niek Verweij; Sara M Willems; Jing-Hua Zhao; Philippe Amouyel; John Connell; Renée de Mutsert; Alex S F Doney; Martin Farrall; Cristina Menni; Andrew D Morris; Raymond Noordam; Guillaume Paré; Neil R Poulter; Denis C Shields; Alice Stanton; Simon Thom; Gonçalo Abecasis; Najaf Amin; Dan E Arking; Kristin L Ayers; Caterina M Barbieri; Chiara Batini; Joshua C Bis; Tineka Blake; Murielle Bochud; Michael Boehnke; Eric Boerwinkle; Dorret I Boomsma; Erwin P Bottinger; Peter S Braund; Marco Brumat; Archie Campbell; Harry Campbell; Aravinda Chakravarti; John C Chambers; Ganesh Chauhan; Marina Ciullo; Massimiliano Cocca; Francis Collins; Heather J Cordell; Gail Davies; Martin H de Borst; Eco J de Geus; Ian J Deary; Joris Deelen; Fabiola Del Greco M; Cumhur Yusuf Demirkale; Marcus Dörr; Georg B Ehret; Roberto Elosua; Stefan Enroth; A Mesut Erzurumluoglu; Teresa Ferreira; Mattias Frånberg; Oscar H Franco; Ilaria Gandin; Paolo Gasparini; Vilmantas Giedraitis; Christian Gieger; Giorgia Girotto; Anuj Goel; Alan J Gow; Vilmundur Gudnason; Xiuqing Guo; Ulf Gyllensten; Anders Hamsten; Tamara B Harris; Sarah E Harris; Catharina A Hartman; Aki S Havulinna; Andrew A Hicks; Edith Hofer; Albert Hofman; Jouke-Jan Hottenga; Jennifer E Huffman; Shih-Jen Hwang; Erik Ingelsson; Alan James; Rick Jansen; Marjo-Riitta Jarvelin; Roby Joehanes; Åsa Johansson; Andrew D Johnson; Peter K Joshi; Pekka Jousilahti; J Wouter Jukema; Antti Jula; Mika Kähönen; Sekar Kathiresan; Bernard D Keavney; Kay-Tee Khaw; Paul Knekt; Joanne Knight; Ivana Kolcic; Jaspal S Kooner; Seppo Koskinen; Kati Kristiansson; Zoltan Kutalik; Maris Laan; Marty Larson; Lenore J Launer; Benjamin Lehne; Terho Lehtimäki; David C M Liewald; Li Lin; Lars Lind; Cecilia M Lindgren; YongMei Liu; Ruth J F Loos; Lorna M Lopez; Yingchang Lu; Leo-Pekka Lyytikäinen; Anubha Mahajan; Chrysovalanto Mamasoula; Jaume Marrugat; Jonathan Marten; Yuri Milaneschi; Anna Morgan; Andrew P Morris; Alanna C Morrison; Peter J Munson; Mike A Nalls; Priyanka Nandakumar; Christopher P Nelson; Teemu Niiranen; Ilja M Nolte; Teresa Nutile; Albertine J Oldehinkel; Ben A Oostra; Paul F O'Reilly; Elin Org; Sandosh Padmanabhan; Walter Palmas; Aarno Palotie; Alison Pattie; Brenda W J H Penninx; Markus Perola; Annette Peters; Ozren Polasek; Peter P Pramstaller; Quang Tri Nguyen; Olli T Raitakari; Meixia Ren; Rainer Rettig; Kenneth Rice; Paul M Ridker; Janina S Ried; Harriëtte Riese; Samuli Ripatti; Antonietta Robino; Lynda M Rose; Jerome I Rotter; Igor Rudan; Daniela Ruggiero; Yasaman Saba; Cinzia F Sala; Veikko Salomaa; Nilesh J Samani; Antti-Pekka Sarin; Reinhold Schmidt; Helena Schmidt; Nick Shrine; David Siscovick; Albert V Smith; Harold Snieder; Siim Sõber; Rossella Sorice; John M Starr; David J Stott; David P Strachan; Rona J Strawbridge; Johan Sundström; Morris A Swertz; Kent D Taylor; Alexander Teumer; Martin D Tobin; Maciej Tomaszewski; Daniela Toniolo; Michela Traglia; Stella Trompet; Jaakko Tuomilehto; Christophe Tzourio; André G Uitterlinden; Ahmad Vaez; Peter J van der Most; Cornelia M van Duijn; Anne-Claire Vergnaud; Germaine C Verwoert; Veronique Vitart; Uwe Völker; Peter Vollenweider; Dragana Vuckovic; Hugh Watkins; Sarah H Wild; Gonneke Willemsen; James F Wilson; Alan F Wright; Jie Yao; Tatijana Zemunik; Weihua Zhang; John R Attia; Adam S Butterworth; Daniel I Chasman; David Conen; Francesco Cucca; John Danesh; Caroline Hayward; Joanna M M Howson; Markku Laakso; Edward G Lakatta; Claudia Langenberg; Olle Melander; Dennis O Mook-Kanamori; Colin N A Palmer; Lorenz Risch; Robert A Scott; Rodney J Scott; Peter Sever; Tim D Spector; Pim van der Harst; Nicholas J Wareham; Eleftheria Zeggini; Daniel Levy; Patricia B Munroe; Christopher Newton-Cheh; Morris J Brown; Andres Metspalu; Adriana M Hung; Christopher J O'Donnell; Todd L Edwards; Bruce M Psaty; Ioanna Tzoulaki; Michael R Barnes; Louise V Wain; Paul Elliott; Mark J Caulfield
Journal:  Nat Genet       Date:  2018-09-17       Impact factor: 41.307

10.  Genetics of blood lipids among ~300,000 multi-ethnic participants of the Million Veteran Program.

Authors:  Derek Klarin; Scott M Damrauer; Kelly Cho; Yan V Sun; Tanya M Teslovich; Jacqueline Honerlaw; David R Gagnon; Scott L DuVall; Jin Li; Gina M Peloso; Mark Chaffin; Aeron M Small; Jie Huang; Hua Tang; Julie A Lynch; Yuk-Lam Ho; Dajiang J Liu; Connor A Emdin; Alexander H Li; Jennifer E Huffman; Jennifer S Lee; Pradeep Natarajan; Rajiv Chowdhury; Danish Saleheen; Marijana Vujkovic; Aris Baras; Saiju Pyarajan; Emanuele Di Angelantonio; Benjamin M Neale; Aliya Naheed; Amit V Khera; John Danesh; Kyong-Mi Chang; Gonçalo Abecasis; Cristen Willer; Frederick E Dewey; David J Carey; John Concato; J Michael Gaziano; Christopher J O'Donnell; Philip S Tsao; Sekar Kathiresan; Daniel J Rader; Peter W F Wilson; Themistocles L Assimes
Journal:  Nat Genet       Date:  2018-10-01       Impact factor: 38.330

View more
  3 in total

1.  Clinical autism subscales have common genetic liabilities that are heritable, pleiotropic, and generalizable to the general population.

Authors:  Taylor R Thomas; Tanner Koomar; Lucas G Casten; Ashton J Tener; Ethan Bahl; Jacob J Michaelson
Journal:  Transl Psychiatry       Date:  2022-06-13       Impact factor: 7.989

2.  The use of genomic variants to drive drug repurposing for chronic hepatitis B.

Authors:  Lalu Muhammad Irham; Wirawan Adikusuma; Dyah Aryani Perwitasari; Haafizah Dania; Rita Maliza; Imaniar Noor Faridah; Ichtiarini Nurullita Santri; Yohane Vincent Abero Phiri; Rocky Cheung
Journal:  Biochem Biophys Rep       Date:  2022-07-08

3.  BIGwas: Single-command quality control and association testing for multi-cohort and biobank-scale GWAS/PheWAS data.

Authors:  Jan Christian Kässens; Lars Wienbrandt; David Ellinghaus
Journal:  Gigascience       Date:  2021-06-29       Impact factor: 6.524

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.