
Molgenis-impute: imputation pipeline in a box.

Alexandros Kanterakis, Patrick Deelen, Freerk van Dijk, Heorhiy Byelas, Martijn Dijkstra, Morris A Swertz.

Abstract

BACKGROUND: Genotype imputation is an important procedure in current genomic analyses such as genome-wide association studies, meta-analyses and fine mapping. Although high-quality tools are available for the individual steps of this process, considerable effort and expertise are required to set up and run a best-practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters.
RESULTS: Here we present MOLGENIS-impute, an 'imputation in a box' solution that seamlessly and transparently automates the setup and running of all the steps of the imputation process. These steps include genome build liftover, genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submitting and monitoring bioinformatics tasks in High Performance Computing (HPC) environments such as local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute at different locations and have imputed over 30,000 samples so far, using the 1000 Genomes Project and the new Genome of the Netherlands data as imputation references. The tests were performed on PBS/SGE clusters, cloud VMs and a grid HPC environment.
CONCLUSIONS: MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing the flexibility to adapt it or limiting the options of the underlying imputation tools. It does not require knowledge of a workflow system or of programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. It is built on the MOLGENIS-compute workflow framework, so it can be customized with additional computational steps or included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation.
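
ILLUSTRATION: The sample and chromosomal chunking step mentioned in the results can be pictured with a small sketch. This is not the MOLGENIS-impute code: the function names, the 5 Mb default window and the batch size are assumptions chosen for illustration only. It shows how a chromosome region and a sample list might be split into independent jobs that a cluster could run in parallel before the results are merged.

```python
from typing import Iterator, List, Sequence, Tuple


def chromosome_chunks(
    start: int, end: int, chunk_size: int = 5_000_000
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (chunk_start, chunk_end) windows covering [start, end].

    The 5 Mb default is an assumed, illustrative window size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    pos = start
    while pos <= end:
        # The last window is clipped to the end of the region.
        yield pos, min(pos + chunk_size - 1, end)
        pos += chunk_size


def sample_batches(samples: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split a sample list into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(samples[i:i + batch_size])
        for i in range(0, len(samples), batch_size)
    ]


def plan_jobs(
    samples: Sequence[str],
    start: int,
    end: int,
    chunk_size: int = 5_000_000,
    batch_size: int = 500,
) -> List[Tuple[int, int, int]]:
    """Return one (batch_index, chunk_start, chunk_end) tuple per job.

    Each job pairs one sample batch with one chromosomal window, so the
    jobs are independent and could be submitted in parallel.
    """
    batches = sample_batches(samples, batch_size)
    windows = list(chromosome_chunks(start, end, chunk_size))
    return [
        (batch_index, chunk_start, chunk_end)
        for batch_index in range(len(batches))
        for chunk_start, chunk_end in windows
    ]
```

In this sketch, a 12 Mb region with three samples and a batch size of two gives three windows and two batches, so six independent jobs. Merging would then concatenate the windows per batch and combine the batches.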

Year: 2015; PMID: 26286716; PMCID: PMC4541731; DOI: 10.1186/s13104-015-1309-3

Source DB: PubMed; Journal: BMC Res Notes; ISSN: 1756-0500


