Principles for data analysis workflows.

Sara Stoudt, Váleri N Vásquez, Ciera C Martinez.

Abstract

A systematic and reproducible "workflow" (the process that moves a scientific investigation from raw data to coherent research question to insightful contribution) should be a fundamental part of academic data-intensive research practice. In this paper, we elaborate basic principles of a reproducible data analysis workflow by defining 3 phases: the Explore, Refine, and Produce Phases. Each phase is roughly centered around the audience to whom research decisions, methodologies, and results are being immediately communicated. Importantly, each phase can also give rise to a number of research products beyond traditional academic publications. Where relevant, we draw analogies between design principles and established practice in software development. The guidance provided here is not intended to be a strict rulebook; rather, the suggestions for practices and tools to advance reproducible, sound data-intensive analysis may furnish support for both students new to research and current researchers who are new to data-intensive work.

Year:  2021        PMID: 33735208      PMCID: PMC7971542          DOI: 10.1371/journal.pcbi.1008770

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


