Gennady Gorin, Mengyu Wang, Ido Golding, Heng Xu.
Abstract
Recent advances in single-molecule fluorescent imaging have enabled quantitative measurements of transcription at a single gene copy, yet an accurate understanding of transcriptional kinetics is still lacking due to the difficulty of solving detailed biophysical models. Here we introduce a stochastic simulation and statistical inference platform for a detailed model of transcriptional kinetics in prokaryotic systems, a model that has not been solved analytically. The model includes stochastic two-state gene activation, mRNA synthesis initiation and stepwise elongation, release to the cytoplasm, and stepwise co-transcriptional degradation. Using the Gillespie algorithm, the platform simulates nascent and mature mRNA kinetics of a single gene copy and predicts the fluorescent signals measurable by time-lapse single-cell mRNA imaging under different experimental conditions. To approach the inverse problem of estimating the kinetic parameters of the model from experimental data, we develop a heuristic optimization method based on the genetic algorithm and the empirical distribution of mRNA generated by simulation. As a demonstration, we show that the optimization algorithm can successfully recover the transcriptional kinetics of simulated and experimental gene expression data. The platform is available as a MATLAB software package at https://data.caltech.edu/records/1287.
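To illustrate the simulation approach named in the abstract, the following is a minimal Python sketch of the Gillespie algorithm applied to a reduced two-state gene model. It is not the code of the MATLAB package: it omits stepwise elongation, release, and co-transcriptional degradation, and treats degradation as a single first-order step. The function name `simulate_two_state_gene` and the rate names `k_on`, `k_off`, `k_ini`, and `k_deg` are hypothetical labels chosen for this example.

```python
import random


def simulate_two_state_gene(k_on, k_off, k_ini, k_deg, t_end, seed=None):
    """Gillespie simulation of a simplified two-state gene (illustrative only).

    Reactions (all rate names are assumptions made for this sketch):
      0: gene OFF -> ON            at rate k_on
      1: gene ON  -> OFF           at rate k_off
      2: gene ON  -> ON + mRNA     at rate k_ini
      3: mRNA     -> (degraded)    at rate k_deg per molecule
    Returns a list of (time, gene_on, mrna_count) recorded after each event.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, 0, 0
    trajectory = [(t, gene_on, mrna)]
    while True:
        # Propensity of each reaction in the current state.
        rates = [
            k_on if gene_on == 0 else 0.0,
            k_off if gene_on == 1 else 0.0,
            k_ini if gene_on == 1 else 0.0,
            k_deg * mrna,
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction can fire
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its rate.
        reaction = rng.choices(range(4), weights=rates)[0]
        if reaction == 0:
            gene_on = 1
        elif reaction == 1:
            gene_on = 0
        elif reaction == 2:
            mrna += 1
        else:
            mrna -= 1
        trajectory.append((t, gene_on, mrna))
    return trajectory
```

Calling, for example, `simulate_two_state_gene(3.0, 10.0, 100.0, 0.5, t_end=10.0, seed=1)` yields one single-cell trajectory; repeating the call with different seeds gives an empirical distribution of mRNA counts at any chosen time. The numerical rates in this call are placeholders, not a mapping onto the parameters listed in Fig 1.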
Year: 2020 PMID: 32214380 PMCID: PMC7098607 DOI: 10.1371/journal.pone.0230736
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Model and simulation platform.
A: Model schematic and probe parameterization (gold: probe coverage, P3: 3′-most edge of the probe, P5: 5′-most edge of the probe). B: Time-dependent molecule-level visualizations available through the GUI. Trajectory generated using k = 100 min⁻¹, k = 3 min⁻¹, k = 10 min⁻¹, k = 0.5 min⁻¹, v = 41.5 nt s⁻¹, T = 10 min, L = 5300 nt, 241 steps of elongation to complete transcription (dark line: intact RNA stretches, light line: degraded RNA stretches, pink circle: RNase molecule). C: Single-cell trajectory with simulated nascent and mature fluorescent signals. Parameters same as in B (red: total signal, blue: nascent signal, green: mature signal, shaded regions: times displayed in B).
Fig 2. Parameter estimation process and performance.
A: Parallelized calculation of the search objective function for a set of trial parameters (ΔMean: mean squared error, ΔCDF: Wasserstein distance, Objective: error function value). B: Convergence of the genetic algorithm at the end of each stage of the search (red: ground truth target, gray: population of parameter estimates). C: Final trial parameter population from B (red: ground truth target, histogram: estimate population, gray line: mean estimate, gray region: one-sigma region of estimates). D: Evolution of parameter estimates throughout the search process (red: ground truth target, gray line: mean estimate, gray region: one-sigma region of estimates). E: Comparison of mean probe signal between target and fit (circles: target data, dotted line: mean parameter estimate, shaded region around dotted line: signal spanned by fifty estimates sampled from the one-sigma region). Colors as in Fig 1. F: Comparison of copy-number distributions between target and fit (shaded gray regions: target histogram, colored lines: histogram generated from mean parameter estimate, top row/blue: nascent mRNA distribution, bottom row/red: total mRNA distribution). G: Comparison of mean probe signal between target and fit in turn-off cross-validation experiment. Convention as given for E. H: Estimation of modulated parameters. Top trial modulates k, bottom trial modulates k. All other parameters are constant but unknown to the search algorithm and are fit independently (red: ground truth target, gray dots and error bars: mean estimate and one-sigma region of three replicates).
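Panel A of Fig 2 names two error terms, a mean squared error on the mean signal (ΔMean) and a Wasserstein distance between distributions (ΔCDF). The Python sketch below shows one way such terms could be computed and combined. How the platform actually weights or combines them is not stated here, so the additive form and the `weight` parameter are assumptions, and all function names are hypothetical.

```python
def mean_squared_error(target_means, simulated_means):
    """Mean squared difference between two equally long mean-signal series."""
    n = len(target_means)
    return sum((a - b) ** 2 for a, b in zip(target_means, simulated_means)) / n


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two empirical samples.

    Computed as the area between the two empirical CDFs.
    """
    xs, ys = sorted(xs), sorted(ys)
    points = sorted(set(xs) | set(ys))
    distance = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # Advance counters so i/len(xs) and j/len(ys) are the CDF values at `left`.
        while i < len(xs) and xs[i] <= left:
            i += 1
        while j < len(ys) and ys[j] <= left:
            j += 1
        distance += abs(i / len(xs) - j / len(ys)) * (right - left)
    return distance


def objective(target_means, simulated_means, target_counts, simulated_counts,
              weight=1.0):
    """Assumed additive objective: Delta_Mean + weight * Delta_CDF."""
    delta_mean = mean_squared_error(target_means, simulated_means)
    delta_cdf = wasserstein_1d(target_counts, simulated_counts)
    return delta_mean + weight * delta_cdf
```

In a genetic-algorithm search, a function of this kind would be evaluated once per trial parameter set, with `simulated_means` and `simulated_counts` taken from stochastic simulations run at those parameters.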