WholeCellViz: data visualization for whole-cell models.

Ruby Lee, Jonathan R Karr, Markus W Covert.

Abstract

BACKGROUND: Whole-cell models promise to accelerate biomedical science and engineering. However, discovering new biology from whole-cell models and other high-throughput technologies requires novel tools for exploring and analyzing complex, high-dimensional data.
RESULTS: We developed WholeCellViz, a web-based software program for visually exploring and analyzing whole-cell simulations. WholeCellViz provides 14 animated visualizations, including metabolic and chromosome maps. These visualizations help researchers analyze model predictions by displaying predictions in their biological context. Furthermore, WholeCellViz enables researchers to compare predictions within and across simulations by allowing users to simultaneously display multiple visualizations.
CONCLUSION: WholeCellViz was designed to facilitate exploration, analysis, and communication of whole-cell model data. Taken together, WholeCellViz helps researchers use whole-cell model simulations to drive advances in biology and bioengineering.

Year: 2013 PMID: 23964998 PMCID: PMC3765349 DOI: 10.1186/1471-2105-14-253

Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169

Background

Whole-cell computational models promise to predict how complex cellular behaviors such as growth and replication arise from individual molecules and their interactions. Recently, we developed the first whole-cell model of a single cell, the Gram-positive bacterium Mycoplasma genitalium [1]. The model predicts the dynamics of every molecular species over the entire cell cycle, accounting for the specific function of every annotated gene product. The model’s simulations produce rich data containing valuable insights into cellular behavior. For example, the model’s simulations have generated new insights into cell cycle regulation, energy usage, and gene essentiality [1]. However, the large number of whole-cell model predictions – over 50 billion data points in a typical dataset – makes directly analyzing the predictions time consuming and cumbersome. Furthermore, directly analyzing the model’s predictions requires deep knowledge of mathematical modeling, computer programming, and the unique data structures used to represent the model’s predictions. Data visualization software is critically needed to help researchers realize the full potential of whole-cell models by enabling researchers to more quickly and efficiently analyze whole-cell model simulations.

We developed WholeCellViz to enable researchers to easily visualize whole-cell model predictions. WholeCellViz provides researchers interactive animations as well as time series plots to easily explore whole-cell model predictions. Furthermore, WholeCellViz facilitates comparisons within and across simulations by enabling researchers to view grids of animations and plots.

Interactive data visualization is becoming increasingly important as biological data continues to grow in complexity and volume. Data visualization can help scientists identify subtle patterns in large data sets leading to important scientific findings. For example, Lum et al. used Iris to visualize genetic data from 272 breast cancer patients [2]. Iris revealed a specific genetic profile for women with low estrogen receptor expression, but high survival rates, a group which now receives targeted treatment for breast cancer. Shannon et al. used Cytoscape to visually link biomolecular networks with high-throughput data on various molecular states and functional annotations [3]. Baliga et al. used Cytoscape to obtain a systems-level understanding of Halobacterium energy transduction by visualizing its protein interaction network [4]. Pathway Tools enables researchers to visually integrate genomic, proteomic, and metabolomic data [5]. Chang et al. and Paley et al. used the Pathway Tools Omics Viewer to investigate the role of individual metabolic networks in bacterial infection [6,7]. MulteeSum was developed to visualize three-dimensional gene expression data, and has been used to gain insight into Drosophila development [8,9].

Here we describe WholeCellViz’s implementation, features, and visualizations. We also provide two examples of how WholeCellViz can be used to analyze whole-cell model predictions.

Implementation

Software overview

WholeCellViz is composed of a web-based front-end application and a back-end web server. The front-end displays visualizations to the user. The back-end server stores over 2 TB of simulation data using a combination of a MySQL relational database and JSON (JavaScript Object Notation) files, and sends this data to the front-end as requested by the user. WholeCellViz was developed as a web application in order to enable platform independence, simple installation, instant developer updates, and data streaming.

Back-end storage server

Our whole-cell model software stores the predicted values of each biological variable at each time point using a set of MATLAB data files. We converted this data into the JSON format using custom Python scripts. We stored the metadata for each simulation, and the label and units for each data point in the database. The WholeCellViz front-end requests metadata and JSON file(s) from the back-end server as needed to display visualizations.
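The hybrid storage scheme described above can be sketched roughly as follows. This is a minimal illustration, not the WholeCellViz implementation: Python's built-in sqlite3 stands in for MySQL, an in-memory dictionary stands in for the JSON files on disk, and the table layout, column names, and function names are all assumptions. Loading the original MATLAB files is omitted.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory metadata table (sqlite3 stands in for MySQL)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE series ("
        "sim_id INTEGER, name TEXT, label TEXT, units TEXT, "
        "PRIMARY KEY (sim_id, name))"
    )
    return db


def store_series(db, json_store, sim_id, name, label, units, values):
    """Put label and units in the database and the bulk values in JSON."""
    json_store[(sim_id, name)] = json.dumps(list(values))
    db.execute(
        "INSERT INTO series (sim_id, name, label, units) VALUES (?, ?, ?, ?)",
        (sim_id, name, label, units),
    )


def load_series(db, json_store, sim_id, name):
    """Join the metadata row with the decoded JSON values for one series."""
    row = db.execute(
        "SELECT label, units FROM series WHERE sim_id = ? AND name = ?",
        (sim_id, name),
    ).fetchone()
    if row is None:
        raise KeyError((sim_id, name))
    values = json.loads(json_store[(sim_id, name)])
    return {"label": row[0], "units": row[1], "values": values}
```

Keeping the small, queryable metadata in a relational table and the large numeric arrays in flat JSON means a client can look up what a series is without reading the bulk data.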

Graphical user interface

The WholeCellViz front-end was implemented in HTML5 and JavaScript using the native canvas to maximize performance. We used JQuery (http://jquery.com) to implement event handling, animations, and AJAX calls. The visualizations were implemented using an extensible framework designed to enable additional visualizations to be easily added to WholeCellViz. Specifically, each visualization extends a common class by defining methods for requesting and displaying data. The source code contains a template for constructing additional visualizations. We developed the time series plots using the Flot (http://www.flotcharts.org) plotting library. We used the JQuery and JQuery UI (http://jqueryui.com) libraries to implement WholeCellViz’s grid layout and animation controls.
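The extension pattern described above, in which each visualization extends a common class and defines how to request and display data, might look roughly like the sketch below. The actual front-end is written in JavaScript; this Python version only illustrates the pattern, and the class names, method names, and data keys are hypothetical.

```python
class Visualization:
    """Common base class: subclasses declare their data and how to draw it."""

    def data_keys(self):
        """Return the names of the data series this visualization needs."""
        raise NotImplementedError

    def draw(self, data, time_index):
        """Render one frame from the fetched data."""
        raise NotImplementedError

    def render(self, fetch, time_index):
        """Fetch every required series, then delegate drawing to the subclass."""
        data = {key: fetch(key) for key in self.data_keys()}
        return self.draw(data, time_index)


class CellShapeText(Visualization):
    """Toy visualization that 'draws' the cell dimensions as a string."""

    def data_keys(self):
        return ["length", "width"]

    def draw(self, data, time_index):
        length = data["length"][time_index]
        width = data["width"][time_index]
        return f"cell {length:.2f} x {width:.2f}"
```

A new visualization then only has to supply `data_keys` and `draw`; the shared `render` method handles the fetching step identically for every panel.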

Results and discussion

We developed WholeCellViz to accelerate data-driven discovery by visualizing whole-cell model simulation data. WholeCellViz uses simulation data to render 14 visualizations that display model predictions in their biological context. Time series plots supplement the visualizations by showing the detailed dynamics of one or multiple biological variables over time. WholeCellViz lays out these visualizations in an easily configurable grid. The animation timeline controls the simultaneous playback of all displayed animations in the grid. Hence, WholeCellViz is able to simultaneously visualize and animate multiple model predictions.

Features

Figure 1 is a sample screenshot of WholeCellViz. We use this figure to describe the features of WholeCellViz.

Figure 1

Cell cycle dynamics view of one wild-type cell at 7 h post-cell cycle initiation. This view includes six animations which highlight the dynamics of the predicted metabolic fluxes and RNA and protein expression over the cell cycle. In particular, the view shows the onset of DNA replication, and the subsequent bidirectional movement of DNA polymerase on the chromosome. The view also highlights the onset of cytokinesis following the completion of DNA replication. (a) Instantaneous shape of M. genitalium as it initially elongates and later pinches at the septum, forming two daughter cells. (b) Metabolic map illustrating metabolite concentrations and reaction fluxes. Each metabolite is normalized to its mean concentration, and each reaction is normalized to its mean flux. Dark blue arrows indicate high reaction flux; light blue arrows indicate low reaction flux. Large circles indicate high metabolite concentrations; small circles indicate low metabolite concentrations. (c) Heatmap of the copy number of each RNA, protein monomer, and protein complex species. Each gene product is normalized to its mean copy number. Yellow indicates high expression; blue indicates low expression. (d) Instantaneous polymerization (blue), methylation (orange), strand break (red), and protein-binding status of the M. genitalium chromosomes. (e) Space-time plot illustrating the instantaneous chromosomal locations of the replication initiator DnaA and DNA polymerase. (f) Map of the protein-coding genes indicating protein synthesis. Each gene is colored according to the length of its longest nascent polypeptide. Green represents genes with one active ribosome; blue represents genes with multiple active ribosomes. An interactive version is available at http://wholecellviz.stanford.edu/cellCycle.

Visualizations

WholeCellViz contains 14 visualizations that animate specific model predictions within their biological context. These visualizations are listed in Table 1 and illustrated in Figures 1 and 2. Together, these 14 visualizations are capable of displaying 88% of the model’s predictions. These visualizations are also interactive. For example, hovering over the metabolism (Figure 1b) visualization reveals tooltips which display metabolite names, compartments, and concentrations. The gene expression panel’s tooltips display gene names, descriptions, and instantaneous copy numbers (Figure 1c). Clicking on a gene in the translation panel (Figure 1f) opens a new tab which displays the gene’s entry in the WholeCellKB model organism database [10].

Table 1

WholeCellViz visualizations

Visualization	Figure	URL
Cell shape	1a	http://wholecellviz.stanford.edu/CellShape
Cell shape (3D)	2a	http://wholecellviz.stanford.edu/CellShape3D
Chromosome (linear)	2c	http://wholecellviz.stanford.edu/Chromosome1
Chromosome (circular)	1d	http://wholecellviz.stanford.edu/Chromosome2
Chromosome (space-time)	1e	http://wholecellviz.stanford.edu/ChrSpaceTime
Cytokinesis	2b	http://wholecellviz.stanford.edu/Cytokinesis
Gene expression	1c	http://wholecellviz.stanford.edu/GeneExp
Immature protein expression	2d	http://wholecellviz.stanford.edu/NascentProtExp
Immature RNA expression	2e	http://wholecellviz.stanford.edu/NascentRnaExp
Metabolism	1b	http://wholecellviz.stanford.edu/Metabolism
Mature protein expression	2f	http://wholecellviz.stanford.edu/MatureProtExp
Mature RNA expression	2g	http://wholecellviz.stanford.edu/MatureRnaExp
Replication initiation	2h	http://wholecellviz.stanford.edu/RepInit
Translation	1f	http://wholecellviz.stanford.edu/Translation

Figure 2

Additional WholeCellViz visualizations. Visualizations highlight one wild-type in silico cell at various time points. (a) Instantaneous shape of M. genitalium as it initially elongates and later pinches at the septum, forming two daughter cells. (b) Instantaneous FtsZ contractile ring size. FtsZ rings iteratively contract at the cell septum to pinch the cell membrane during cytokinesis. (c) Instantaneous polymerization (blue), methylation (orange), strand break (red), and protein-binding status of the M. genitalium chromosomes. (d–g) Heatmaps of the copy number dynamics of immature proteins (d), immature RNA (e), mature proteins (f), and mature RNA (g). Each gene product is normalized to its maximal expression. Yellow indicates high expression; blue indicates low expression. (h) Occupancy of the oriC functional DnaA boxes which recruit DNA polymerase to the oriC to initiate replication.
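The figure captions mention two normalizations, to the mean (Figure 1) and to the maximum (Figure 2), along with a blue-to-yellow heatmap scale. The sketch below shows one way these could be computed. The linear color interpolation and all function names are assumptions; the source does not state the exact color map used.

```python
def normalize_to_mean(values):
    """Divide each value by the series mean (all zeros if the mean is zero)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return [0.0 for _ in values]
    return [v / mean for v in values]


def normalize_to_max(values):
    """Divide each value by the series maximum, giving values in [0, 1]."""
    peak = max(values)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]


def heat_color(x):
    """Map x in [0, 1] to an RGB tuple from blue (low) to yellow (high)."""
    x = max(0.0, min(1.0, x))  # clamp out-of-range input
    level = round(255 * x)
    return (level, level, 255 - level)
```

Normalizing to the maximum bounds every value by 1, which suits a fixed color scale, whereas normalizing to the mean leaves values above 1 that would need rescaling or clamping before coloring.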

Time series plots

WholeCellViz can also display line plots showing the values of one or multiple biological variables over time. For example, the middle-left panel of Figure 3 illustrates the temporal dynamics of the intracellular ATP copy number. Time series plots can also display the dynamics of biological variables across simulations, facilitating comparisons across simulations.

Figure 3

Replication dynamics view of one wild-type cell at 7.5 h post-cell cycle initiation. (a) Instantaneous cell shape. (b) Instantaneous polymerization (blue), methylation (orange), strand break (red), and protein-binding status of the M. genitalium chromosomes. (c) Intracellular dNTP copy number dynamics. (d) Instantaneous FtsZ and cell septum sizes. (e) Instantaneous oriC DnaA box occupancy. (f) Superhelicity dynamics. An interactive version is available at http://wholecellviz.stanford.edu/replication.

Animation timeline

The animation timeline at the bottom of the screen controls the simultaneous playback of all displayed visualizations. It provides play/pause, seek, speed, and repeat controls.
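One way to keep every panel in step is a single shared clock that pushes the current frame to all registered panels. The sketch below illustrates that idea with the play/pause, seek, speed, and repeat controls listed above. It is written in Python for illustration only (the real front-end is JavaScript), and the class and method names are hypothetical.

```python
class Timeline:
    """Shared clock that sends the current frame to every registered panel."""

    def __init__(self, n_frames, speed=1, repeat=False):
        self.n_frames = n_frames
        self.speed = speed      # frames advanced per tick
        self.repeat = repeat    # wrap around at the end instead of stopping
        self.frame = 0
        self.playing = False
        self.panels = []

    def add_panel(self, on_frame):
        """Register a callback that redraws one panel for a given frame."""
        self.panels.append(on_frame)

    def play(self):
        self.playing = True

    def pause(self):
        self.playing = False

    def seek(self, frame):
        """Jump to a frame (clamped to range) and redraw all panels."""
        self.frame = max(0, min(frame, self.n_frames - 1))
        for on_frame in self.panels:
            on_frame(self.frame)

    def tick(self):
        """Advance by `speed` frames if playing; stop or wrap at the end."""
        if not self.playing:
            return
        nxt = self.frame + self.speed
        if nxt >= self.n_frames:
            if self.repeat:
                nxt %= self.n_frames
            else:
                nxt = self.n_frames - 1
                self.playing = False
        self.seek(nxt)
```

Because panels never advance on their own, they cannot drift apart: every redraw is triggered by the same `seek` call with the same frame number.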

Layout editor

The layout editor is accessed by clicking the gear icon in the top-right corner of the visualization panels. The layout editor enables users to configure the grid dimensions and select the visualization or time series plot displayed in each panel.

Data import

Users can visualize data from any server running the server-side WholeCellViz software. The hosted version at http://wholecellviz.stanford.edu provides the over 3,000 simulations described in Karr et al., 2012 [1]. Users can install the whole-cell model and WholeCellViz server software on their own machines, or use the whole-cell Linux virtual machine to execute and visualize new simulations. See below for more information about availability.

Graphical & data export

WholeCellViz exports the plotted data in JSON format and exports graphics in SVG format.
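As a rough illustration of the two export formats, the sketch below writes one time series as JSON and as a minimal SVG polyline. The JSON field names, the function names, and the SVG layout are assumptions for illustration; the source does not describe the structure of the files WholeCellViz actually exports.

```python
import json


def export_json(label, times, values):
    """Serialize the plotted series to a JSON string."""
    return json.dumps({"label": label, "time": list(times), "values": list(values)})


def export_svg(times, values, width=200, height=100):
    """Return a minimal SVG document containing one polyline for the series."""
    t_lo, t_hi = min(times), max(times)
    v_lo, v_hi = min(values), max(values)

    def scale(x, lo, hi, size):
        # Map x from [lo, hi] to [0, size]; a flat series maps to 0.
        return 0.0 if hi == lo else (x - lo) / (hi - lo) * size

    # SVG's y axis points down, so flip the vertical coordinate.
    points = " ".join(
        f"{scale(t, t_lo, t_hi, width):.1f},"
        f"{height - scale(v, v_lo, v_hi, height):.1f}"
        for t, v in zip(times, values)
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{points}"/></svg>'
    )
```

JSON keeps the underlying numbers for reanalysis, while SVG keeps the figure as resolution-independent vector graphics.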

Data exploration using WholeCellViz

WholeCellViz can display multiple visualization panels to facilitate comparative and simultaneous analysis of multiple aspects of simulated cell physiology. In particular, WholeCellViz provides six preconfigured views to help users quickly get started. Each of the six views is a grid of visualizations selected to represent a particular aspect of cellular or population dynamics. These views enable users to explore hypotheses about the data. Here we discuss two case studies to illustrate the power of WholeCellViz to facilitate data exploration.

Replication dynamics

Figure 3 shows a screenshot of the replication dynamics view. This view displays several perspectives on DNA replication and cytokinesis: cell shape, chromosome dynamics, cytokinesis, replication initiation, and dNTP copy number. First, the view shows that before replication initiates the cell contains a single chromosome and steadily accumulates an increasingly large pool of dNTPs. Second, the view shows that once a sufficiently large oriC DnaA complex forms, replication begins, accompanied by a sharp drop in the dNTP level. Third, the view shows that replication then proceeds quickly until the dNTP supply is depleted, at which point the rate of replication slows. Finally, the view shows that the FtsZ ring contracts immediately following replication completion.

Population variance

Figure 4 shows a screenshot of the population variance view. This view presents summary statistics – growth rate, ATP copy number, dNTP copy number, DNA mass, RNA mass, and protein mass – for eight wild-type in silico cells. The view shows that the growth rate, ATP copy number, RNA mass, and protein mass have relatively little variance at the population level. The dNTP copy number and DNA mass have substantially more variance. In three simulations, the dNTP copy number is depleted more than two hours earlier than in the other simulations, and the DNA mass increases earlier in these simulations. This suggests that the timing of DNA replication initiation does not impact the cellular growth rate, ATP copy number, RNA mass, protein mass, or cell cycle length. Rather, the view suggests that metabolism is the primary factor controlling and coordinating the cell’s growth, chemical content, and division.
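The kind of comparison made visually in this view could also be quantified. The sketch below computes the coefficient of variation of a variable across cells at one time point, and finds when a series first drops below a threshold, as a stand-in for the dNTP depletion time. The function names and the threshold-based definition of depletion are assumptions, not the authors' analysis.

```python
from statistics import mean, pstdev


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean across cells."""
    m = mean(samples)
    if m == 0:
        return float("inf")
    return pstdev(samples) / m


def first_time_below(times, values, threshold):
    """Return the first time at which the series drops below the threshold."""
    for t, v in zip(times, values):
        if v < threshold:
            return t
    return None  # the series never dropped below the threshold
```

Comparing `first_time_below` across simulations would give the spread in depletion timing, while `coefficient_of_variation` lets variables with different units be compared on one scale.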

Figure 4

Population variance view of eight wild-type cells at 6 h post-cell cycle initiation. View illustrates the temporal dynamics of the cellular growth rate (a), ATP and total dNTP copy numbers (b, c), and DNA, RNA, and protein masses (d–f). Colors indicate the eight in silico cells. An interactive version is available at http://wholecellviz.stanford.edu/population.

Conclusions

WholeCellViz is a web-based program designed to facilitate exploration and analysis of in silico biological experiments of whole-cell models. The software enables users to fully explore whole-cell model simulations, and displays whole-cell model predictions in their biological context using visualizations and time series plots. Furthermore, WholeCellViz’s grid layout feature enables users to display multiple visualizations and plots, enabling comparative analysis both within and across in silico cells.

Going forward, we plan to improve WholeCellViz as a tool for novel model analysis. We plan to develop new visualizations to communicate additional model predictions including DNA supercoiling and RNA and protein maturation. We also plan to develop enhanced plotting tools for detecting complex relationships among model predictions and analyzing stochastic variation. For example, scatter plots could be used to drill down to specific time points and examine correlations among multiple variables in a single simulation, or among one variable across multiple simulations. Box plots could be used to compare the variance of variables across simulations.

To date only one whole-cell model has been developed. Consequently, we chose to focus WholeCellViz on the over 3,000 M. genitalium simulations described in Karr et al., 2012 [1]. Going forward, we plan to integrate WholeCellViz with other whole-cell models and simulation data servers as they become publicly available. Currently users can visualize alternative whole-cell model simulations by (1) running their own simulations using either our M. genitalium model or a similarly detailed model, (2) storing their simulations on their own server using the hybrid MySQL/JSON format described here, and (3) editing the back-end server URL configuration option from the WholeCellViz front-end. Researchers can achieve this either by installing the whole-cell model and WholeCellViz software on their own machine or by using our Linux virtual machine which contains both the whole-cell model and WholeCellViz software (see below for more information about availability). In the future, we also plan to enable researchers to configure and run whole-cell simulations through a simple graphical interface within WholeCellViz. However, this will require the development of more computationally efficient whole-cell model simulations.

Overall, whole-cell modeling is an emerging field that has the potential to accelerate the pace of biological discovery and enable rational bioengineering and personalized medicine. Data visualization software such as WholeCellViz is critically needed to help researchers access, explore, and analyze complex, high-dimensional whole-cell model simulations, as well as to accelerate model-driven biological discovery. With the current influx of big data in research and industry, WholeCellViz also serves as an example of how to use animation for scientific communication. We anticipate that WholeCellViz will play a critical role in realizing the full potential of whole-cell models.

Availability and requirements

Project name: WholeCellViz
Project home page: http://wholecellviz.stanford.edu
Operating system(s): Platform independent
Programming language: HTML, JavaScript, PHP
Other requirements: Web browser
License: MIT license
Any restrictions to use by non-academics: None

WholeCellViz is available under the MIT license at http://wholecellviz.stanford.edu. The hosted version visualizes the over 3,000 simulations described in Karr et al., 2012 [1], and is also capable of visualizing simulations stored on other servers running the WholeCellViz server-side software. Researchers can install the whole-cell model and WholeCellViz software locally to execute and visualize new simulations. All source code is available open-source at SimTK: http://simtk.org/home/wholecell. A Linux virtual machine containing the whole-cell model and WholeCellViz server and client software is also available at SimTK.

Abbreviations

AJAX: Asynchronous javascript and XML; ATP: Adenosine triphosphate; dNTP: Deoxynucleotide triphosphate; HTML: Hypertext markup language; JSON: Javascript object notation; oriC: Origin of replication; PHP: PHP: hypertext preprocessor; SVG: Scalable vector graphics; TB: Terabyte; URL: Uniform resource locator; XML: Extensible markup language.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

RL and JK contributed equally to the conception and development of WholeCellViz. MC supervised the project. All authors wrote and approved the final manuscript.

9 in total

1. The Pathway Tools software.

Authors: Peter D Karp; Suzanne Paley; Pedro Romero
Journal: Bioinformatics Date: 2002 Impact factor: 6.937

2. Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors: Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal: Genome Res Date: 2003-11 Impact factor: 9.043

3. MulteeSum: a tool for comparative spatial and temporal gene expression data.

Authors: Miriah Meyer; Tamara Munzner; Angela DePace; Hanspeter Pfister
Journal: IEEE Trans Vis Comput Graph Date: 2010 Nov-Dec Impact factor: 4.579

4. Coordinate regulation of energy transduction modules in Halobacterium sp. analyzed by a global systems approach.

Authors: Nitin S Baliga; Min Pan; Young Ah Goo; Eugene C Yi; David R Goodlett; Krassen Dimitrov; Paul Shannon; Ruedi Aebersold; Wailap Victor Ng; Leroy Hood
Journal: Proc Natl Acad Sci U S A Date: 2002-10-28 Impact factor: 11.205

5. A whole-cell computational model predicts phenotype from genotype.

Authors: Jonathan R Karr; Jayodita C Sanghvi; Derek N Macklin; Miriam V Gutschow; Jared M Jacobs; Benjamin Bolival; Nacyra Assad-Garcia; John I Glass; Markus W Covert
Journal: Cell Date: 2012-07-20 Impact factor: 41.582

6. Carbon nutrition of Escherichia coli in the mouse intestine.

Authors: Dong-Eun Chang; Darren J Smalley; Don L Tucker; Mary P Leatham; Wendy E Norris; Sarah J Stevenson; April B Anderson; Joe E Grissom; David C Laux; Paul S Cohen; Tyrrell Conway
Journal: Proc Natl Acad Sci U S A Date: 2004-05-03 Impact factor: 11.205

7. WholeCellKB: model organism databases for comprehensive whole-cell models.

Authors: Jonathan R Karr; Jayodita C Sanghvi; Derek N Macklin; Abhishek Arora; Markus W Covert
Journal: Nucleic Acids Res Date: 2012-11-21 Impact factor: 16.971

8. A conserved developmental patterning network produces quantitatively different output in multiple species of Drosophila.

Authors: Charless C Fowlkes; Kelly B Eckenrode; Meghan D Bragdon; Miriah Meyer; Zeba Wunderlich; Lisa Simirenko; Cris L Luengo Hendriks; Soile V E Keränen; Clara Henriquez; David W Knowles; Mark D Biggin; Michael B Eisen; Angela H DePace
Journal: PLoS Genet Date: 2011-10-27 Impact factor: 5.917

9. The Pathway Tools cellular overview diagram and Omics Viewer.

Authors: Suzanne M Paley; Peter D Karp
Journal: Nucleic Acids Res Date: 2006-08-07 Impact factor: 16.971
