snpQT: flexible, reproducible, and comprehensive quality control and imputation of genomic data.

Christina Vasilopoulou, Benjamin Wingfield, Andrew P Morris, William Duddy.

Abstract

Quality control of genomic data is an essential but complicated multi-step procedure, often requiring separate installation and expert familiarity with a combination of different bioinformatics tools. Software incompatibilities and inconsistencies across computing environments are recurrent challenges, leading to poor reproducibility. Existing semi-automated or automated solutions lack comprehensive quality checks, flexible workflow architecture, and user control. To address these challenges, we have developed snpQT: a scalable, stand-alone software pipeline using Nextflow and BioContainers, for comprehensive, reproducible and interactive quality control of human genomic data. snpQT offers some 36 discrete quality filters or correction steps in a complete standardised pipeline, producing graphical reports to demonstrate the state of data before and after each quality control procedure. This includes human genome build conversion, population stratification against data from the 1000 Genomes Project, automated population outlier removal, and built-in imputation with its own pre- and post-imputation quality controls. Common input formats are used, and a synthetic dataset and comprehensive online tutorial are provided for testing, educational purposes, and demonstration. The snpQT pipeline is designed to run with minimal user input and coding experience; quality control steps are implemented with numerous user-modifiable thresholds, and workflows can be flexibly combined in custom combinations. snpQT is open source and freely available at https://github.com/nebfield/snpQT. A comprehensive online tutorial and installation guide, covering the workflow through to GWAS, are provided (https://snpqt.readthedocs.io/en/latest/), introducing snpQT using a synthetic demonstration dataset and a real-world Amyotrophic Lateral Sclerosis SNP-array dataset.

Copyright: © 2021 Vasilopoulou C et al.
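
The abstract mentions quality filters with user-modifiable thresholds but does not list them. As a rough illustration of what one threshold-based variant filter can look like, the sketch below drops variants with a high missing-call rate or a low minor allele frequency. It is not snpQT code: the function names, the toy genotypes, and the 0.05 thresholds are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Optional

# One genotype call per sample: 0, 1 or 2 copies of the alternate allele,
# or None when the call is missing. (Encoding assumed for this sketch.)
Genotype = Optional[int]


def variant_missingness(calls: List[Genotype]) -> float:
    """Fraction of samples with no genotype call for one variant."""
    if not calls:
        raise ValueError("a variant needs at least one sample")
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))  # diploid: 2 alleles per sample
    return min(alt_freq, 1.0 - alt_freq)


def passes_qc(calls: List[Genotype],
              max_missing: float = 0.05,   # hypothetical threshold
              min_maf: float = 0.05) -> bool:  # hypothetical threshold
    """Keep a variant only if it is well genotyped and not too rare."""
    return (variant_missingness(calls) <= max_missing
            and minor_allele_frequency(calls) >= min_maf)


# Toy data: three made-up variants across ten samples.
variants: Dict[str, List[Genotype]] = {
    "rs_a": [0, 1, 2, 1, 0, 1, 0, 2, 1, 0],           # complete, common
    "rs_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],           # monomorphic
    "rs_c": [0, 1, None, None, 2, 1, 0, None, 1, 0],  # 30% missing
}

kept = [name for name, calls in variants.items() if passes_qc(calls)]
print(kept)
```

In this toy data only `rs_a` passes both filters. Raising `max_missing` or lowering `min_maf` changes which variants are kept, which is the kind of user control over thresholds the abstract describes in general terms.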

Keywords:  Anaconda; BioContainers; GWAS; GWAS pipeline; Genomic Variants; Imputation; Nextflow; Population Stratification; QC; Quality Control; SNPs

Year: 2021    PMID: 34900230    PMCID: PMC8637247.2    DOI: 10.12688/f1000research.53821.2

Source DB: PubMed    Journal: F1000Res    ISSN: 2046-1402


