Practical Computational Reproducibility in the Life Sciences.

Björn Grüning, John Chilton, Johannes Köster, Ryan Dale, Nicola Soranzo, Marius van den Beek, Jeremy Goecks, Rolf Backofen, Anton Nekrutenko, James Taylor.

Abstract

Many areas of research suffer from poor reproducibility, particularly in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but implementation of these practices remains difficult due to the challenge of assembling software tools plus associated libraries, connecting tools together into pipelines, and specifying parameters. Here, we discuss a suite of cutting-edge technologies that make computational reproducibility not just possible, but practical in both time and effort. This suite combines three well-tested components (a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and workflow systems that automatically orchestrate the composition of these packages for entire pipelines) to achieve an unprecedented level of computational reproducibility. We also provide a practical implementation and five recommendations to help set a typical researcher on the path to performing data analyses reproducibly.
Copyright © 2018 Elsevier Inc. All rights reserved.
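
The abstract names three things that must be pinned down for an analysis to be repeatable: the tools and their libraries, how the tools are connected, and the parameters. As a rough illustration only (this is not the authors' implementation and uses none of the packaging, container, or workflow systems the paper discusses), the sketch below records those three things in a plain data structure and derives a stable fingerprint from it. The tool names, version strings, parameters, and the `step` and `fingerprint` helpers are all hypothetical.

```python
import hashlib
import json


def step(tool, version, params, inputs=()):
    """Describe one pipeline step: a pinned tool version, explicit
    parameters, and the names of the upstream steps it consumes."""
    return {
        "tool": tool,
        "version": version,
        "params": dict(params),
        "inputs": list(inputs),
    }


def fingerprint(workflow):
    """Return a SHA-256 hex digest of the workflow description.

    Keys are sorted and separators fixed so that the same description
    always serializes to the same bytes, regardless of dict ordering.
    """
    canonical = json.dumps(workflow, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical two-step pipeline; names and versions are placeholders.
workflow = [
    step("aligner", "1.0.0", {"threads": 4}),
    step("assembler", "2.1.0", {"min_coverage": 2.5}, inputs=["aligner"]),
]

if __name__ == "__main__":
    print(fingerprint(workflow))
```

Two runs that report the same fingerprint were described by the same tools, versions, wiring, and parameters; changing any one of them changes the digest. A matching fingerprint says nothing about the underlying operating system or hardware, which is the gap the abstract assigns to containerization and virtualization.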

Year: 2018; PMID: 29953862; PMCID: PMC6263957; DOI: 10.1016/j.cels.2018.03.014

Source DB: PubMed; Journal: Cell Syst; ISSN: 2405-4712; Impact factor: 10.304


