Bioinformatic pipelines in Python with Leaf.

Francesco Napolitano, Renato Mariani-Costantini, Roberto Tagliaferri.

Abstract

BACKGROUND: Bioinformatic studies often rely on an incremental, loosely planned development approach when custom data analysis must keep pace with a rapidly changing environment. Unfortunately, the lack of rigorous software structuring can undermine the maintainability, communicability and replicability of the process. To mitigate this problem we propose the Leaf system, which aims to introduce the formality of a pipeline on top of a dynamic development process with minimal overhead for the programmer, thus providing a simple layer of software structuring.
RESULTS: Leaf includes a formal language for the definition of pipelines, with code that can be transparently inserted into the user's Python code. Its syntax is designed to visually highlight the dependencies in the pipeline structure it defines. While encouraging the developer to think in terms of bioinformatic pipelines, Leaf supports a number of automated features, including data and session persistence, consistency checks between steps of the analysis, processing optimization and publication of the analytic protocol as hypertext.
CONCLUSIONS: Leaf offers a powerful balance between plan-driven and change-driven development environments in the design, management and communication of bioinformatic pipelines. Its unique features make it a valuable alternative to other related tools.
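
ILLUSTRATION: The general idea described under RESULTS, a pipeline declared as a dependency structure whose steps are run in order and whose intermediate results are reused, can be sketched in plain Python. This is a minimal sketch under stated assumptions and does not reproduce Leaf's pipeline language or API: the step names, the DEPENDENCIES mapping and the run() helper are all hypothetical, and only the standard-library graphlib module (Python 3.9 or later) is used.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline declaration: each step maps to the steps it depends on.
# This is NOT Leaf's syntax; it only illustrates the dependency-graph idea.
DEPENDENCIES = {
    "load": [],
    "normalize": ["load"],
    "summarize": ["normalize"],
    "report": ["normalize", "summarize"],
}


def load():
    # Stand-in for reading raw data.
    return [2.0, 4.0, 6.0]


def normalize(values):
    top = max(values)
    return [v / top for v in values]


def summarize(values):
    return sum(values) / len(values)


def report(values, mean):
    return f"{len(values)} values, mean {mean:.2f}"


STEPS = {"load": load, "normalize": normalize,
         "summarize": summarize, "report": report}


def required_steps(target):
    """Collect the target and every step it transitively depends on."""
    needed, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(DEPENDENCIES[name])
    return needed


def run(target, cache=None):
    """Run the target and its ancestors in dependency order.

    Results already present in `cache` are reused rather than recomputed,
    a simple in-memory stand-in for persistence between runs.
    """
    cache = {} if cache is None else cache
    needed = required_steps(target)
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        if name in needed and name not in cache:
            args = [cache[dep] for dep in DEPENDENCIES[name]]
            cache[name] = STEPS[name](*args)
    return cache[target]


if __name__ == "__main__":
    print(run("report"))
```

Passing the same cache dictionary to successive run() calls skips steps that were already computed, which is the kind of processing optimization the abstract mentions, although a real system would also persist results to disk and check them for consistency.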

Year:  2013        PMID: 23786315      PMCID: PMC3747863          DOI: 10.1186/1471-2105-14-201

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


