
GPU-powered model analysis with PySB/cupSODA.

Leonard A Harris, Marco S Nobile, James C Pino, Alexander L R Lubbock, Daniela Besozzi, Giancarlo Mauri, Paolo Cazzaniga, Carlos F Lopez.

Abstract

SUMMARY: A major barrier to the practical utilization of large, complex models of biochemical systems is the lack of open-source computational tools to evaluate model behaviors over high-dimensional parameter spaces. This is due to the high computational expense of performing thousands to millions of model simulations required for statistical analysis. To address this need, we have implemented a user-friendly interface between cupSODA, a GPU-powered kinetic simulator, and PySB, a Python-based modeling and simulation framework. For three example models of varying size, we show that for large numbers of simulations PySB/cupSODA achieves order-of-magnitude speedups relative to a CPU-based ordinary differential equation integrator.
AVAILABILITY AND IMPLEMENTATION: The PySB/cupSODA interface has been integrated into the PySB modeling framework (version 1.4.0), which can be installed from the Python Package Index (PyPI) using a Python package manager such as pip. cupSODA source code and precompiled binaries (Linux, Mac OS/X, Windows) are available at github.com/aresio/cupSODA (requires an Nvidia GPU; developer.nvidia.com/cuda-gpus). Additional information about PySB is available at pysb.org.
CONTACT: paolo.cazzaniga@unibg.it or c.lopez@vanderbilt.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28666314      PMCID: PMC5860165          DOI: 10.1093/bioinformatics/btx420

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Kinetic modeling of complex biochemical systems is central to the emerging field of systems biology (Kitano, 2002; Le Novère, 2015). Kinetic models require definition of numerous free parameters, usually obtained by calibration to experimental data, that specify initial species concentrations and kinetic rate constants. Once calibrated, a model should be analyzed for its sensitivity and predictive power over ranges of parameter values (Fisher and Henzinger, 2007). Both model calibration and analysis can require thousands to millions of model simulations for statistical convergence and significance (Eydgahi et al., 2013; Gutenkunst et al.). In many cases, the computational expense of simulation at this scale makes detailed model analysis infeasible. Recently, efforts have been made to leverage the highly parallel structure of graphics processing units (GPUs) to accelerate scientific computations (Dematté and Prandi, 2010; Nobile et al.). GPUs are well suited for applications in which the same arithmetic operations are applied to many independent data elements, e.g. solving independently parameterized systems of ordinary differential equations (ODEs). GPU-based kinetic simulators thus hold great promise for accelerating tasks such as model calibration and analysis, but are challenging for non-experts to use because they require specialized settings and inputs. To address this problem, we have created a user-friendly interface between the GPU-based kinetic simulator cupSODA (Nobile et al., 2014) and PySB, a Python-based modeling and simulation platform (Lopez et al., 2013). cupSODA is built around the well-known adaptive stiff/non-stiff ODE integrator LSODA (Petzold, 1983). It is designed to perform thousands of parallel simulations of mass-action kinetic models, each independently parameterized, by leveraging the high-performance memories on the GPU, specifically the cached, read-only constant memory and the low-latency on-chip shared memory.
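To make the notion of independently parameterized simulations concrete, the following minimal sketch integrates the single mass-action reaction A -> B for many rate constants. It is an illustration only: it uses a fixed-step fourth-order Runge-Kutta scheme in pure Python as a stand-in for LSODA (it is not suitable for stiff systems), and the function name rk4_decay and all numerical values are assumptions made for this example rather than part of cupSODA or PySB.

```python
import math

def rk4_decay(k, a0, t_end, n_steps):
    """Integrate dA/dt = -k*A (mass-action A -> B) with fixed-step RK4.

    Stand-in for an adaptive integrator such as LSODA; illustrative only.
    """
    dt = t_end / n_steps
    a = a0
    for _ in range(n_steps):
        k1 = -k * a
        k2 = -k * (a + 0.5 * dt * k1)
        k3 = -k * (a + 0.5 * dt * k2)
        k4 = -k * (a + dt * k3)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

# One rate constant per simulation (hypothetical values). No simulation
# reads another simulation's state, so each call could in principle be
# assigned to its own GPU thread.
rate_constants = [0.01 * (i + 1) for i in range(1000)]
finals = [rk4_decay(k, a0=1.0, t_end=1.0, n_steps=100) for k in rate_constants]

# Compare against the analytic solution A(t) = A0 * exp(-k * t).
max_error = max(abs(a - math.exp(-k)) for a, k in zip(finals, rate_constants))
```

Because the loop over rate_constants has no data dependencies between iterations, it is the kind of workload that maps naturally onto the many-thread execution model described above.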
PySB is a rule-based modeling (Chylek et al., 2013, 2015) platform for constructing and analyzing complex models of biochemical systems. Models can be constructed in native Python code or imported from various formats, including the Systems Biology Markup Language (SBML) (Hucka et al.). PySB leverages powerful libraries within the Python ecosystem, such as NumPy, SymPy and SciPy (Perez et al.), and provides user-friendly interfaces to numerous third-party simulation and analysis tools, including BioNetGen (Faeder et al.; Harris et al., 2016), KaSim (Suderman and Deeds, 2013) and StochKit (Sanft et al., 2011). Below, we briefly describe the main features of the PySB/cupSODA interface and showcase its utility by performing run time and sensitivity analyses for three model systems of varying size (Table 1).
Table 1. Models used for PySB/cupSODA performance testing

Model              Species   Reactions   End time   Output steps
Cell cycle (a)     5         7           100        100
Ras/cAMP/PKA (b)   33        39          1500       100
EARM (c)           77        105         20 000     100

(a) Tyson (1991). (b) Besozzi et al. (2012). (c) Lopez et al. (2013).

2 Features and implementation

cupSODA is designed to exploit the massive parallelism of the CUDA architecture (Nickolls et al.). To run simulations with cupSODA, one must construct multiple input files containing, e.g. the reaction stoichiometries and the initial species concentrations and rate parameter values for each specified simulation. Numerous simulator-specific parameters must also be defined, such as the number of CUDA ‘blocks’ to use and the desired cupSODA memory configuration (see Supplementary Information). The number of simulations that cupSODA can run in parallel is limited by the number of CUDA ‘cores’ on the GPU (usually a few thousand; see Supplementary Table S1), but the number of simulations that can be loaded onto the GPU at one time is usually many more than this, limited by the available memory (Nobile et al., 2014).

The PySB/cupSODA interface simplifies and streamlines the use of cupSODA via a CupSodaSimulator class, available within the PySB package. The class constructor accepts the following arguments:

model: A PySB model object (required)
tspan: A list of output time points (default: None)
initials: A list or dictionary of initial species concentrations for each simulation (default: None)
param_values: A list or dictionary of rate parameter values for each simulation (default: None)
verbose: Verbose output (default: False)

The CupSodaSimulator constructor also recognizes numerous keyword arguments (kwargs), such as n_blocks, the number of CUDA blocks, and memory_usage, the desired memory configuration. Importantly, default values are defined for each kwarg, removing the need for user input. For example, if a user-defined value is not provided, the number of CUDA blocks is automatically calculated by querying the specifications of the GPU in use.

The CupSodaSimulator.run() method performs the simulations by constructing the cupSODA input files and invoking cupSODA as a subprocess (the method takes tspan, initials and param_values as optional arguments).
Additionally, the method reads the results of the simulations (species time courses), which cupSODA outputs to (typically thousands of) separate text files, into a three-dimensional array. The user can then analyze and/or visualize the results using tools available within the Python ecosystem, e.g. plotting the time courses using the Matplotlib library (Perez et al.). For convenience, a run_cupSODA wrapper function has also been implemented that combines invocations of the CupSodaSimulator constructor and run method into a single step. A workflow diagram and example Python script using the run_cupSODA function are provided in Supplementary Figures S1 and S2, respectively.
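The step of collecting per-simulation text outputs into a three-dimensional array can be pictured with the short sketch below. It is not the interface's actual parsing code: the file layout (one row per time point, whitespace-separated, time in the first column), the index order (simulation, time point, species) and the helper name parse_time_course are all assumptions chosen for illustration, and the data are made up.

```python
def parse_time_course(text):
    """Parse one simulation's output into a list of rows of species values.

    Assumed (hypothetical) layout: one row per time point, whitespace-
    separated, with the time value in the first column.
    """
    rows = []
    for line in text.strip().splitlines():
        fields = [float(x) for x in line.split()]
        rows.append(fields[1:])  # drop the time column, keep species values
    return rows

# Made-up contents of two output files for a two-species model.
outputs = [
    "0.0 1.0 0.0\n1.0 0.5 0.5\n2.0 0.25 0.75\n",
    "0.0 2.0 0.0\n1.0 1.5 0.5\n2.0 1.0 1.0\n",
]

# Nested list indexed as results[simulation][time_point][species].
results = [parse_time_course(text) for text in outputs]
shape = (len(results), len(results[0]), len(results[0][0]))
```

With this layout, results[1][2][0] is the value of the first species at the last time point of the second simulation. In practice a NumPy array would be the natural container; nested lists are used here only to keep the sketch dependency-free.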

3 Results

In Figure 1A–C and Supplementary Figure S3, we compare the run time efficiency of PySB/cupSODA to the CPU-bound ODE integrator LSODA, available in the Python package SciPy (Oliphant, 2007), for the three example models listed in Table 1 (see Supplementary Information for descriptions). These include models of the eukaryotic cell cycle (Tyson, 1991), the Ras/cAMP/PKA signaling pathway in Saccharomyces cerevisiae (Besozzi et al., 2012), and extrinsically induced apoptosis in mammalian cells (EARM: extrinsic apoptosis reaction model) (Lopez et al., 2013). Run time comparisons show that in all cases SciPy/LSODA is faster for small numbers of simulations but PySB/cupSODA overtakes it for large numbers of simulations, achieving a maximum speedup of approximately one order of magnitude. Comparable speedups are achieved for other memory settings and GPUs (Supplementary Information and Supplementary Figs S4 and S5).
Fig. 1. (A–C) Run time comparisons between PySB/cupSODA and SciPy/LSODA for the example models in Table 1 (all simulations performed with the same initial protein concentrations and rate parameters). (D) Sensitivity in time-to-death in EARM to variations (±20%; 25 410 total simulations) in the initial protein concentrations (gold lines are medians; boxes range from the first to third quartile; whiskers extend to the minimum and maximum values). PySB/cupSODA simulations were run using cupSODA 1.0.0 on a GeForce GTX 980 Ti GPU (2816 cores, 16 threads/block); SciPy/LSODA simulations were run on an Intel Xeon E5-2667 v3 @ 3.20 GHz CPU (see Supplementary Table S1).

For each model in Table 1, we also performed sensitivity analyses (Fig. 1D and Supplementary Figs S6–S12) by quantifying changes in defined model outputs in response to variations (±20%) in initial protein concentrations around a set of reference values (Supplementary Tables S2–S4; see Supplementary Information for further details). The ability to efficiently perform such analyses is critical since non-genetic variability within isogenic cell populations has been attributed to significant variations in protein concentrations across cells (Spencer et al., 2009). In Figure 1D and Supplementary Figure S11, we analyze the sensitivity in ‘time-to-death’ in EARM (defined as the time at which Smac reaches 50% cleavage; see Supplementary Information and Supplementary Fig. S10) for a specific set of rate parameters. Our results show that time-to-death is sensitive to the initial levels of six of the 21 proteins considered. Of particular interest is the sensitivity to Bak. The same analysis performed for a different set of rate parameters (Supplementary Fig. S12) shows insensitivity to Bak but sensitivity to Bax. This indicates that the model harbors at least two alternative pathways to apoptosis induction. The analysis comprised 25 410 total simulations and took ∼11 min with PySB/cupSODA and ∼35 min with SciPy/LSODA (for both parameter sets).
Similar accelerations were seen for the cell cycle and Ras/cAMP/PKA models (Supplementary Information).
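The bookkeeping behind such an analysis can be sketched in a few lines. The example below is a hedged illustration, not the authors' code: time_to_threshold is a hypothetical helper that locates a 50% crossing by linear interpolation between output time points, the time course is made up, and the pairwise design (every pair of the 21 proteins, each scaled over 11 evenly spaced values from −20% to +20%) is an assumption chosen because it reproduces the reported simulation count; the actual design is described in the Supplementary Information.

```python
from itertools import combinations

def time_to_threshold(times, values, threshold):
    """Return the first time at which `values` reaches `threshold`.

    Uses linear interpolation between output points; returns None if the
    threshold is never reached. Hypothetical helper for illustration.
    """
    if values[0] >= threshold:
        return times[0]
    for i in range(1, len(values)):
        lo, hi = values[i - 1], values[i]
        if lo < threshold <= hi:
            frac = (threshold - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

# Made-up fraction of cleaved Smac over time (illustrative values only).
times = [0.0, 100.0, 200.0, 300.0]
cleaved_fraction = [0.0, 0.1, 0.7, 1.0]
t_death = time_to_threshold(times, cleaved_fraction, 0.5)

# Assumed pairwise design: each pair of proteins, 11 x 11 scaling factors.
n_proteins = 21
scalings = [0.80 + 0.04 * i for i in range(11)]        # 0.80, 0.84, ..., 1.20
pairs = list(combinations(range(n_proteins), 2))       # 210 unordered pairs
n_simulations = len(pairs) * len(scalings) ** 2        # 210 * 121 = 25410

# Ratio of the approximate wall-clock times quoted in the text.
speedup = 35 / 11
```

Under these assumptions the count matches the 25 410 simulations reported, and the quoted run times correspond to a speedup of roughly threefold for this particular analysis.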

4 Conclusion

The PySB/cupSODA interface provides the modeling community with a high-performance GPU-based kinetic simulator that can run thousands of parallel simulations on a common desktop workstation, within the easy-to-use framework of a full-fledged, open-source programming and analysis environment in Python. This will greatly accelerate and streamline the process of analyzing complex biochemical models for systems biology applications.
References (10 of 18 listed)

1. J J Tyson. Modeling the cell division cycle: cdc2 and cyclin interactions. Proc Natl Acad Sci U S A. 1991-08-15.
2. Lily A Chylek; Leonard A Harris; James R Faeder; William S Hlavacek. Modeling for (physical) biologists: an introduction to the rule-based approach. Phys Biol. 2015-07-16.
3. Lily A Chylek; Leonard A Harris; Chang-Shung Tung; James R Faeder; Carlos F Lopez; William S Hlavacek. Rule-based modeling: a computational approach for studying biomolecular site dynamics in cell signaling systems. Wiley Interdiscip Rev Syst Biol Med. 2013-09-30.
4. Kevin R Sanft; Sheng Wu; Min Roh; Jin Fu; Rone Kwei Lim; Linda R Petzold. StochKit2: software for discrete stochastic simulation of biochemical systems with events. Bioinformatics. 2011-07-04.
5. Leonard A Harris; Justin S Hogg; José-Juan Tapia; John A P Sekar; Sanjana Gupta; Ilya Korsunsky; Arshi Arora; Dipak Barua; Robert P Sheehan; James R Faeder. BioNetGen 2.2: advances in rule-based modeling. Bioinformatics. 2016-07-08.
6. Nicolas Le Novère. Quantitative and logic modelling of molecular and gene networks. Nat Rev Genet. 2015-02-03.
7. Daniela Besozzi; Paolo Cazzaniga; Dario Pescini; Giancarlo Mauri; Sonia Colombo; Enzo Martegani. The role of feedback control mechanisms on the establishment of oscillatory regimes in the Ras/cAMP/PKA pathway in S. cerevisiae. EURASIP J Bioinform Syst Biol. 2012-07-20.
8. Sabrina L Spencer; Suzanne Gaudet; John G Albeck; John M Burke; Peter K Sorger. Non-genetic origins of cell-to-cell variability in TRAIL-induced apoptosis. Nature. 2009-04-12.
9. Carlos F Lopez; Jeremy L Muhlich; John A Bachman; Peter K Sorger. Programming biological models in Python using PySB. Mol Syst Biol. 2013.
10. Hoda Eydgahi; William W Chen; Jeremy L Muhlich; Dennis Vitkup; John N Tsitsiklis; Peter K Sorger. Properties of cell death models calibrated and compared using Bayesian approaches. Mol Syst Biol. 2013.
