
Using prototyping to choose a bioinformatics workflow management system.

Michael Jackson, Kostas Kavoussanakis, Edward W J Wallace

Abstract

Workflow management systems represent, manage, and execute multistep computational analyses and offer many benefits to bioinformaticians. They provide a common language for describing analysis workflows, contributing to reproducibility and to building libraries of reusable components. They can support both incremental build and re-entrancy: the ability to selectively re-execute parts of a workflow in the presence of additional inputs or changes in configuration and to resume execution from where a workflow previously stopped. Many workflow management systems enhance portability by supporting the use of containers, high-performance computing (HPC) systems, and clouds. Most importantly, workflow management systems allow bioinformaticians to delegate how their workflows are run to the workflow management system and its developers. This frees the bioinformaticians to focus on what these workflows should do, on their data analyses, and on their science. RiboViz is a package to extract biological insight from ribosome profiling data to help advance understanding of protein synthesis. At the heart of RiboViz is an analysis workflow, implemented in a Python script. To conform to best practices for scientific computing, which recommend the use of build tools to automate workflows and to reuse code instead of rewriting it, the authors reimplemented this workflow within a workflow management system. To select a workflow management system, a rapid survey of available systems was undertaken, and candidates were shortlisted: Snakemake, cwltool, Toil, and Nextflow. Each candidate was evaluated by quickly prototyping a subset of the RiboViz workflow, and Nextflow was chosen. The selection process took 10 person-days, a small cost for the assurance that Nextflow satisfied the authors' requirements.
The use of prototyping can offer a low-cost way of making a more informed selection of software to use within projects, rather than relying solely upon reviews and recommendations by others.
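
The incremental build and re-entrancy described in the abstract can be illustrated with a minimal Python sketch. It is not taken from RiboViz or from any of the shortlisted systems. It assumes a toy two-step workflow (hypothetical steps `trim` and `count`, hypothetical file names) and a simple in-memory cache keyed on a hash of each step's input files. A step is re-executed only when its inputs have changed or an output is missing.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Sequence


def digest(paths: Sequence[Path]) -> str:
    """Hash the names and contents of a step's input files."""
    h = hashlib.sha256()
    for p in paths:
        h.update(p.name.encode())
        h.update(p.read_bytes())
    return h.hexdigest()


def run_step(name: str, inputs: Sequence[Path], outputs: Sequence[Path],
             action: Callable[[], None], cache: Dict[str, str],
             log: List[str]) -> bool:
    """Run `action` only if inputs changed or an output is missing."""
    key = digest(inputs)
    if cache.get(name) == key and all(o.exists() for o in outputs):
        return False  # up to date: skip (incremental build)
    action()
    cache[name] = key
    log.append(name)
    return True


def demo() -> List[str]:
    """Run the toy workflow four times and return the steps executed."""
    cache: Dict[str, str] = {}
    log: List[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "reads.txt"        # hypothetical input file
        trimmed = Path(tmp) / "trimmed.txt"  # hypothetical intermediate
        counts = Path(tmp) / "counts.txt"    # hypothetical final output
        raw.write_text("ACGT\nACGTNN\n")

        def trim() -> None:
            trimmed.write_text(raw.read_text().replace("N", ""))

        def count() -> None:
            counts.write_text(str(len(trimmed.read_text().split())))

        def run_all() -> None:
            run_step("trim", [raw], [trimmed], trim, cache, log)
            run_step("count", [trimmed], [counts], count, cache, log)

        run_all()        # first run: both steps execute
        run_all()        # nothing changed: no step executes
        counts.unlink()  # simulate a run that stopped before the last step
        run_all()        # resume: only "count" executes (re-entrancy)
        raw.write_text("ACGT\nACGTNN\nGGCC\n")  # additional input data
        run_all()        # changed input: both steps execute again
    return log


if __name__ == "__main__":
    print(demo())
```

Real workflow management systems differ in how they detect stale steps (for example, file timestamps versus content hashes) and persist their state between runs; the in-memory dictionary here is only a stand-in for that mechanism.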

Year:  2021        PMID: 33630841      PMCID: PMC7906312          DOI: 10.1371/journal.pcbi.1008622

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


