Ten simple rules for writing Dockerfiles for reproducible data science.

Daniel Nüst, Vanessa Sochat, Ben Marwick, Stephen J Eglen, Tim Head, Tony Hirst, Benjamin D Evans.

Abstract

Computational science has been greatly improved by the use of containers for packaging software and data dependencies. In a scholarly context, the main drivers for using these containers are transparency and support of reproducibility; in turn, a workflow's reproducibility can be greatly affected by the choices that are made with respect to building containers. In many cases, the build process for the container's image is created from instructions provided in a Dockerfile format. In support of this approach, we present a set of rules to help researchers write understandable Dockerfiles for typical data science workflows. By following the rules in this article, researchers can create containers suitable for sharing with fellow scientists, for including in scholarly communication such as education or scientific papers, and for effective and sustainable personal workflows.

Year:  2020        PMID: 33170857      PMCID: PMC7654784          DOI: 10.1371/journal.pcbi.1008316

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


  11 in total

1.  An invitation to reproducible computational research.

Authors:  David L Donoho
Journal:  Biostatistics       Date:  2010-07       Impact factor: 5.899

2.  Before reproducibility must come preproducibility.

Authors:  Philip B Stark
Journal:  Nature       Date:  2018-05       Impact factor: 49.962

3.  Training students for the Open Science future.

Authors:  Felix Schönbrodt
Journal:  Nat Hum Behav       Date:  2019-10

4.  The Scientific Filesystem.

Authors:  Vanessa Sochat
Journal:  Gigascience       Date:  2018-05-01       Impact factor: 6.524

5.  Ten simple rules for reproducible computational research.

Authors:  Geir Kjetil Sandve; Anton Nekrutenko; James Taylor; Eivind Hovig
Journal:  PLoS Comput Biol       Date:  2013-10-24       Impact factor: 4.475

6.  Best practices for scientific computing.

Authors:  Greg Wilson; D A Aruliah; C Titus Brown; Neil P Chue Hong; Matt Davis; Richard T Guy; Steven H D Haddock; Kathryn D Huff; Ian M Mitchell; Mark D Plumbley; Ben Waugh; Ethan P White; Paul Wilson
Journal:  PLoS Biol       Date:  2014-01-07       Impact factor: 8.029

7.  Good enough practices in scientific computing.

Authors:  Greg Wilson; Jennifer Bryan; Karen Cranston; Justin Kitzes; Lex Nederbragt; Tracy K Teal
Journal:  PLoS Comput Biol       Date:  2017-06-22       Impact factor: 4.475

8.  Singularity: Scientific containers for mobility of compute.

Authors:  Gregory M Kurtzer; Vanessa Sochat; Michael W Bauer
Journal:  PLoS One       Date:  2017-05-11       Impact factor: 3.240

9.  Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks.

Authors:  Adam Rule; Amanda Birmingham; Cristal Zuniga; Ilkay Altintas; Shih-Cheng Huang; Rob Knight; Niema Moshiri; Mai H Nguyen; Sara Brin Rosenthal; Fernando Pérez; Peter W Rose
Journal:  PLoS Comput Biol       Date:  2019-07-25       Impact factor: 4.475

10.  Recommendations for the packaging and containerizing of bioinformatics software.

Authors:  Bjorn Gruening; Olivier Sallou; Pablo Moreno; Felipe da Veiga Leprevost; Hervé Ménager; Dan Søndergaard; Hannes Röst; Timo Sachsenberg; Brian O'Connor; Fábio Madeira; Victoria Dominguez Del Angel; Michael R Crusoe; Susheel Varma; Daniel Blankenberg; Rafael C Jimenez; Yasset Perez-Riverol
Journal:  F1000Res       Date:  2018-06-14
  5 in total

1.  Ten simple rules for researchers who want to develop web apps.

Authors:  Sheila M Saia; Natalie G Nelson; Sierra N Young; Stanton Parham; Micah Vandegrift
Journal:  PLoS Comput Biol       Date:  2022-01-06       Impact factor: 4.475

2.  Consensus-based guidance for conducting and reporting multi-analyst studies.

Authors:  Balazs Aczel; Barnabas Szaszi; Gustav Nilsonne; Olmo R van den Akker; Casper J Albers; Marcel Alm van Assen; Jojanneke A Bastiaansen; Daniel Benjamin; Udo Boehm; Rotem Botvinik-Nezer; Laura F Bringmann; Niko A Busch; Emmanuel Caruyer; Andrea M Cataldo; Nelson Cowan; Andrew Delios; Noah Nn van Dongen; Chris Donkin; Johnny B van Doorn; Anna Dreber; Gilles Dutilh; Gary F Egan; Morton Ann Gernsbacher; Rink Hoekstra; Sabine Hoffmann; Felix Holzmeister; Juergen Huber; Magnus Johannesson; Kai J Jonas; Alexander T Kindel; Michael Kirchler; Yoram K Kunkels; D Stephen Lindsay; Jean-Francois Mangin; Dora Matzke; Marcus R Munafò; Ben R Newell; Brian A Nosek; Russell A Poldrack; Don van Ravenzwaaij; Jörg Rieskamp; Matthew J Salganik; Alexandra Sarafoglou; Tom Schonberg; Martin Schweinsberg; David Shanks; Raphael Silberzahn; Daniel J Simons; Barbara A Spellman; Samuel St-Jean; Jeffrey J Starns; Eric Luis Uhlmann; Jelte Wicherts; Eric-Jan Wagenmakers
Journal:  Elife       Date:  2021-11-09       Impact factor: 8.140

3.  Architect: A tool for aiding the reconstruction of high-quality metabolic models through improved enzyme annotation.

Authors:  Nirvana Nursimulu; Alan M Moses; John Parkinson
Journal:  PLoS Comput Biol       Date:  2022-09-08       Impact factor: 4.779

4.  Toward practical transparent verifiable and long-term reproducible research using Guix.

Authors:  Nicolas Vallet; David Michonneau; Simon Tournier
Journal:  Sci Data       Date:  2022-10-04       Impact factor: 8.501

5.  FAIRly big: A framework for computationally reproducible processing of large-scale data.

Authors:  Adina S Wagner; Laura K Waite; Małgorzata Wierzba; Felix Hoffstaedter; Alexander Q Waite; Benjamin Poldrack; Simon B Eickhoff; Michael Hanke
Journal:  Sci Data       Date:  2022-03-11       Impact factor: 6.444

