
DistributedFBA.jl: high-level, high-performance flux balance analysis in Julia.

Laurent Heirendt, Ines Thiele, Ronan M T Fleming.

Abstract

Motivation: Flux balance analysis and its variants are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations.
Results: DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes.
Availability and Implementation: The code is freely available on github.com/opencobra/COBRA.jl. The documentation can be found at opencobra.github.io/COBRA.jl.
Contact: ronan.mt.fleming@gmail.com.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2017. Published by Oxford University Press.

Year:  2017        PMID: 28453682      PMCID: PMC5408791          DOI: 10.1093/bioinformatics/btw838

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Constraint-based reconstruction and analysis (COBRA) (Palsson) is a widely used approach for modeling genome-scale biochemical networks and for the integrative analysis of omics data in a network context. All COBRA predictions are derived from optimization problems, typically formulated in the form

$$\min_{v} \; \psi(v) \quad \text{s.t.} \quad Sv = b, \quad Cv \leq d, \quad l \leq v \leq u, \qquad (1)$$

where $v \in \mathbb{R}^n$ represents the rate of each biochemical reaction, $\psi: \mathbb{R}^n \to \mathbb{R}$ is a lower semi-continuous and convex function, $S \in \mathbb{R}^{m \times n}$ is a stoichiometric matrix for m molecular species and n reactions, and b is a vector of known metabolic exchanges. Additional linear inequalities (with matrix C and vector d) may be used to constrain combinations of reaction rates, and each reaction rate is kept between lower and upper bounds, l and u, respectively.
In flux balance analysis (FBA), one obtains a steady state by choosing a coefficient vector $c \in \mathbb{R}^n$ and letting $\psi(v) := c^T v$ and $b := 0$. However, the biologically correct coefficient vector is usually not known, so exploration of the set of steady states relies on the embarrassingly parallel problem of solving (1) for many c. Moreover, while the optimal objective value $\psi(v^\star) = c^T v^\star$ is unique for an optimal flux vector $v^\star$, the flux vector itself need not be: there may be alternate optimal solutions. In flux variability analysis (FVA), one finds the extremes of each reaction rate over the set of optimal solutions by choosing a coefficient vector $e_j \in \mathbb{R}^n$ with one nonzero entry, then minimizing and maximizing $\psi(v) := e_j^T v$, subject to the additional constraint that the FBA objective $c^T v$ remains within a prescribed fraction $\gamma$ of its optimal value $c^T v^\star$, for each reaction in turn.
For kilo-scale models ($n \approx 10^3$), the 2n linear optimization problems required for FVA can currently be solved efficiently using existing methods, e.g. FVA of the COBRA Toolbox, fastFVA, or the COBRApy implementation (Schellenberger et al., 2011; Gudmundsson and Thiele, 2010; Ebrahim et al., 2013). However, these implementations perform best when using only one computing node with a few cores, which becomes the limiting factor in run time when exploring the steady-state solution space of larger models. Julia is a high-level, high-performance dynamic programming language for technical computing (Bezanson).
Here, we exploit Julia to distribute sets of FBA problems and compare the performance of the resulting implementation with that of existing implementations.
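
The 2n problems of an FVA run are independent of one another, which is what makes the workload embarrassingly parallel. A minimal Python sketch of how such a problem list could be enumerated is given below. The function name `fva_problems` and the tuple layout are illustrative assumptions and are not taken from COBRA.jl.

```python
from typing import List, Tuple


def fva_problems(n: int) -> List[Tuple[int, str, List[float]]]:
    """Enumerate the 2n FVA subproblems for a network with n reactions.

    Each entry is (reaction index, sense, coefficient vector), where the
    coefficient vector has exactly one nonzero entry. The layout is a
    hypothetical choice made for this sketch.
    """
    problems = []
    for j in range(n):
        coeff = [0.0] * n
        coeff[j] = 1.0  # select reaction j as the objective
        for sense in ("min", "max"):
            problems.append((j, sense, coeff))
    return problems


problems = fva_problems(4)
print(len(problems))  # 8 problems for 4 reactions
```

Each tuple can be handed to a separate worker, since no problem depends on the result of another.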

2 Overview and implementation

DistributedFBA.jl, part of a novel COBRA.jl package, is implemented in Julia and makes use of the high-level interface MathProgBase.jl (Lubin; see Supplementary Material). A key feature is the integrated capability of spawning any number of processes synchronously on local and remote workers. Parallelization is primarily achieved through the distribution of FBA problems (outer layer), while parallelization of the solution algorithm itself is left to the solver (inner layer). COBRA.jl extends the COBRA Toolbox (Schellenberger et al., 2011), and existing COBRA models (Orth et al., 2010) can be used as input.
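
The outer layer can be pictured as a pool of workers that each receive independent optimization problems. The Python sketch below illustrates the idea only: a standard-library thread pool stands in for Julia's local and remote worker processes, and `solve_box_lp` is a hypothetical placeholder that handles bound constraints alone, whereas a real FBA problem also includes the constraint Sv = b and requires a linear programming solver.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def solve_box_lp(c: Sequence[float], lower: Sequence[float],
                 upper: Sequence[float]) -> float:
    """Placeholder solver: minimize c.v subject only to lower <= v <= upper.

    With bounds as the only constraints, each variable sits at its lower
    bound when its coefficient is positive and at its upper bound otherwise.
    """
    return sum(ci * (lo if ci > 0 else hi)
               for ci, lo, hi in zip(c, lower, upper))


def solve_many(objectives: Sequence[Sequence[float]], lower: Sequence[float],
               upper: Sequence[float], workers: int = 4) -> List[float]:
    """Outer layer: hand independent problems to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: solve_box_lp(c, lower, upper),
                             objectives))


lower = [0.0, -1.0, 0.0]
upper = [10.0, 1.0, 5.0]
objectives = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
print(solve_many(objectives, lower, upper))  # [0.0, -1.0, -5.0]
```

Any parallelism inside the solver (the inner layer) would be configured separately and is not shown here.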

3 Benchmark results

DistributedFBA.jl and fastFVA (Gudmundsson and Thiele, 2010) were benchmarked on a set of models of varying dimension (Table 1). All experiments were run on several DELL R630 computing nodes with 2 × 36 threads and 768 GB RAM running Linux. As Julia is a just-in-time compiled language, pre-compilation (warm-up) was done on a small-scale model (Orth et al., 2010) before benchmarking. The time needed to create the parallel pool of workers and to spawn the processes is not included in the reported times.
Table 1. Sizes of S for benchmark models

#   Model name       Metabolites m   Reactions n   References
1   Recon1                    2785          3820   Duarte et al. (2007)
2   Recon2                    5063          7440   Thiele et al. (2013)
3   Recon3 (a)                7866        12 566
4   Recon2 + 11M            19 714        28 199   Heinken et al. (2015)
5   Multi-organ (b)         47 123        61 230
6   SRS064645               89 756        99 104   Magnusdottir et al. (2016)
7   SRS011061              126 682       139 420   Magnusdottir et al. (2016)
8   SRS012273              186 662       208 714   Magnusdottir et al. (2016)

(a) Brunk, E. et al. (2016) Recon 3d: a three-dimensional view of human metabolism and disease (in revision).
(b) Thiele, I. et al. (2016) Multi-organ model (prototype model) (in preparation).

The serial performance of the two implementations agrees to within 10%. On a single node, fastFVA is slightly faster on a few threads, but distributedFBA.jl is faster on a higher number of threads (Fig. 1A). The way the FBA problems are distributed among workers (distribution strategy s, see Supplementary Material) yields an additional speedup of 10–20% on a larger number of threads.
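
How a list of problems is split among workers can affect load balance. The distribution strategies actually used by the package are described in the Supplementary Material and are not reproduced here. The Python sketch below only illustrates two generic, hypothetical ways of assigning problem indices to workers (contiguous blocks and round-robin); the function names are assumptions made for this example.

```python
from typing import List


def split_blocks(n_problems: int, n_workers: int) -> List[List[int]]:
    """Contiguous blocks: each worker gets a consecutive range of indices."""
    base, extra = divmod(n_problems, n_workers)
    chunks, start = [], 0
    for k in range(n_workers):
        size = base + (1 if k < extra else 0)  # spread the remainder
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def split_round_robin(n_problems: int, n_workers: int) -> List[List[int]]:
    """Interleaved: problem i goes to worker i mod n_workers."""
    return [list(range(k, n_problems, n_workers)) for k in range(n_workers)]


print(split_blocks(10, 3))       # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(split_round_robin(10, 3))  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

If neighbouring problems tend to have similar solve times, interleaving spreads expensive problems more evenly, whereas contiguous blocks keep related problems on the same worker.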
Fig. 1. Performance of distributedFBA for selected benchmark models given in Table 1. (A) Speedup factor relative to fastFVA as a function of threads and distribution strategy s (1 node). (B) Multi-nodal speedup in latency and Amdahl’s law (s = 0).

According to Amdahl’s law, the theoretical speedup factor is $1 / ((1 - p) + p/N)$, where N is the number of threads and p is the fraction of the code (including the model) that can be parallelized. The fraction p increases with increasing model size (Fig. 1B). The maximum speedup factor for a very large number of threads N is $1 / (1 - p)$. All reactions of models 6–8 given in Table 1 have been optimized (with full output, s = 0) on 4 nodes/256 threads in only […], […] and […], respectively. This demonstrates that for high-dimensional models, a large number of threads on multiple high-memory nodes is critical to achieving a significant speedup.
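
Amdahl's law gives a speedup of 1 / ((1 - p) + p/N) for a parallel fraction p on N threads, with an upper bound of 1 / (1 - p) as N grows. The short Python sketch below evaluates both expressions. The values of p are illustrative assumptions and are not the fractions measured for the benchmark models.

```python
def amdahl_speedup(p: float, n_threads: int) -> float:
    """Theoretical speedup for parallel fraction p on n_threads threads."""
    return 1.0 / ((1.0 - p) + p / n_threads)


def amdahl_limit(p: float) -> float:
    """Upper bound on the speedup as the number of threads grows."""
    return 1.0 / (1.0 - p)


for p in (0.90, 0.99):  # illustrative fractions, not measured values
    print(p, round(amdahl_speedup(p, 256), 1), round(amdahl_limit(p), 1))
```

The bound rises steeply as p approaches 1, which is consistent with the observation that larger models, having a larger parallel fraction, benefit more from additional threads.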

4 Discussion

The multi-nodal performance of distributedFBA.jl is unparalleled: its scalability matches theoretical predictions, and resources are optimally used. Key advantages are that the present implementation is open-source and platform independent, and that it imposes no limits on pool size, memory, or the number of nodes and threads. Its uninodal performance is similar to that of fastFVA on a few threads and about 2–3 times higher on a larger number of threads. Key reasons are the direct parallelization capabilities of Julia and the wrapper-free interface to the solver. Although it relies on solvers written in other languages, the unilingual and easy-to-use implementation allows the analysis of large and huge-scale biochemical networks in a timely manner, and lifts the analysis possibilities in the COBRA community to another level.
References

1.  Global reconstruction of the human metabolic network based on genomic and bibliomic data.

Authors:  Natalie C Duarte; Scott A Becker; Neema Jamshidi; Ines Thiele; Monica L Mo; Thuy D Vo; Rohith Srivas; Bernhard Ø Palsson
Journal:  Proc Natl Acad Sci U S A       Date:  2007-01-31       Impact factor: 11.205

2.  Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0.

Authors:  Jan Schellenberger; Richard Que; Ronan M T Fleming; Ines Thiele; Jeffrey D Orth; Adam M Feist; Daniel C Zielinski; Aarash Bordbar; Nathan E Lewis; Sorena Rahmanian; Joseph Kang; Daniel R Hyduke; Bernhard Ø Palsson
Journal:  Nat Protoc       Date:  2011-08-04       Impact factor: 13.491

3.  Generation of genome-scale metabolic reconstructions for 773 members of the human gut microbiota.

Authors:  Stefanía Magnúsdóttir; Almut Heinken; Laura Kutt; Dmitry A Ravcheev; Eugen Bauer; Alberto Noronha; Kacy Greenhalgh; Christian Jäger; Joanna Baginska; Paul Wilmes; Ronan M T Fleming; Ines Thiele
Journal:  Nat Biotechnol       Date:  2016-11-28       Impact factor: 54.908

4.  A community-driven global reconstruction of human metabolism.

Authors:  Ines Thiele; Neil Swainston; Ronan M T Fleming; Andreas Hoppe; Swagatika Sahoo; Maike K Aurich; Hulda Haraldsdottir; Monica L Mo; Ottar Rolfsson; Miranda D Stobbe; Stefan G Thorleifsson; Rasmus Agren; Christian Bölling; Sergio Bordel; Arvind K Chavali; Paul Dobson; Warwick B Dunn; Lukas Endler; David Hala; Michael Hucka; Duncan Hull; Daniel Jameson; Neema Jamshidi; Jon J Jonsson; Nick Juty; Sarah Keating; Intawat Nookaew; Nicolas Le Novère; Naglis Malys; Alexander Mazein; Jason A Papin; Nathan D Price; Evgeni Selkov; Martin I Sigurdsson; Evangelos Simeonidis; Nikolaus Sonnenschein; Kieran Smallbone; Anatoly Sorokin; Johannes H G M van Beek; Dieter Weichart; Igor Goryanin; Jens Nielsen; Hans V Westerhoff; Douglas B Kell; Pedro Mendes; Bernhard Ø Palsson
Journal:  Nat Biotechnol       Date:  2013-03-03       Impact factor: 54.908

5.  Computationally efficient flux variability analysis.

Authors:  Steinn Gudmundsson; Ines Thiele
Journal:  BMC Bioinformatics       Date:  2010-09-29       Impact factor: 3.169

6.  Reconstruction and Use of Microbial Metabolic Networks: the Core Escherichia coli Metabolic Model as an Educational Guide.

Authors:  Jeffrey D Orth; R M T Fleming; Bernhard Ø Palsson
Journal:  EcoSal Plus       Date:  2010-09

7.  Systematic prediction of health-relevant human-microbial co-metabolism through a computational framework.

Authors:  Almut Heinken; Ines Thiele
Journal:  Gut Microbes       Date:  2015

8.  COBRApy: COnstraints-Based Reconstruction and Analysis for Python.

Authors:  Ali Ebrahim; Joshua A Lerman; Bernhard O Palsson; Daniel R Hyduke
Journal:  BMC Syst Biol       Date:  2013-08-08