| Literature DB >> 26405142 |
Christoph Zimmer1, Sven Sahle2.
Abstract
Estimating model parameters from experimental data is a crucial technique for working with computational models in systems biology. Since stochastic models are increasingly important, parameter estimation methods for stochastic modelling are also of increasing interest. This study presents an extension to the 'multiple shooting for stochastic systems (MSS)' method for parameter estimation. The transition probabilities of the likelihood function are approximated with normal distributions. Means and variances are calculated with a linear noise approximation on the interval between succeeding measurements. The fact that the system is only approximated on intervals which are short in comparison with the total observation horizon allows to deal with effects of the intrinsic stochasticity. The study presents scenarios in which the extension is essential for successfully estimating the parameters and scenarios in which the extension is of modest benefit. Furthermore, it compares the estimation results with reversible jump techniques showing that the approximation does not lead to a loss of accuracy. Since the method is not based on stochastic simulations or approximative sampling of distributions, its computational speed is comparable with conventional least-squares parameter estimation methods.Entities:
Mesh:
Year: 2015 PMID: 26405142 PMCID: PMC8687418 DOI: 10.1049/iet-syb.2014.0020
Source DB: PubMed Journal: IET Syst Biol ISSN: 1751-8849 Impact factor: 1.615
Fig. 1Outline of the simulation study
Time courses are simulated (5 small graphics) and used as pseudo data for a parameter estimation (rep resented by the arrows). Each time series results in one estimator plotted in the coordinate system in the middle. As each time series is intrinsically stochastic, the estimators are random variables clustering around the true parameter value. To judge “how close” they are, two statistical quantities are displayed: their average av,. to see if the method is unbiased, and their average relative error ARE, to see how far they spread around the true value.
Note that for a parameter estimation one time series is necessary. Only for the simulation study more than one time series is needed.
Statistics of the estimation results for immigration‐death model
|
| (0.6, 0.03) | (0.6, 0.06) | (0.6, 0.1) | (0.1, 0.03) | (0.2, 0.03) |
|---|---|---|---|---|---|
| (21 2) | av:(0.59, 0.03) | av:(0.62, 0.06) | av:(0.64,0.11) | av:(0.106, 0.032) | av:(0.2,0.027) |
| ARE:(33%, 34%) | ARE:(31%, 34%) | ARE:(28%, 34%) | ARE:(46%, 47%) | ARE:(35%, 32%) | |
|
|
|
|
|
| |
| rank 1 | rank 68 | rank 14 | rank 51 | rank 57 | |
| (51 2) | av:(0.6, 0.03) | av:(0.61, 0.063) | av:(0.6, 0.103) | av:(0.09, 0.033) | av:(0.21, 0.033) |
| ARE:(16%, 16%) | ARE:(21%, 20%) | ARE:(24%, 24%) | ARE:(28%, 37%) | ARE:(27%, 25%) | |
|
|
|
|
|
| |
| rank 59 | rank 72 | rank 28 | rank 21 | rank 26 | |
| (21 5) | av:(0.62, 0.031) | av:(0.61, 0.063) | av:(0.67, 0.108) | av:(0.1, 0.032) | av:(0.21, 0.034) |
| ARE:(31%, 31%) | ARE:(30%, 30%) | ARE:(35%, 31%) | ARE:(37%, 28%) | ARE:(36%, 35%) | |
|
|
| w:(0.69, 0.12) |
|
| |
| rank 21 | rank 49 | rank 37 | rank 10 | rank 6 | |
| (101 10) | av = (0.62, 0.031) | av:(0.63, 0.063) | av:(0.76, 0.126) | av:(0.11, 0.032) | av:(0.20, 0.031) |
| ARE:(13%, 15%) | ARE:(16%, 16%) | ARE:(40%, 39%) | ARE:(16%, 17%) | ARE:(14%, 14%) | |
|
|
|
|
|
| |
| rank 56 | rank 91 | rank 73 | rank 47 | rank 57 |
This table shows the performance of the extended MSS method on an immigration‐death model and compares it with results calculated by Wang et al. [8]. Different parametrisation are considered (columns) as well as different measurement scenarios (rows). θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For each of the 20 scenarios, 100 simulated time series are calculated with the Gillespie algorithm [2] within COPASI [37] Version 4.11‐65. For each of the time series, an estimator is calculated. The table shows The rank of w among the 100 estimates of this paper with respect to relative error (details on how this rank is calculated can be found in Sections 3.1.2).
the average of the estimators:
their ARE:
w is the estimate of one time series shown by Wang et al. [8]
The table shows that the extended MSS method estimates the parameters with acceptably small relative error for all considered scenarios. In comparison with the reversible jump method by Wang et al. [8], the relative error is not significantly larger, which shows that the high computational speed of the method does not lead to a loss of accuracy.
Fig. 2Estimates for an immigration‐death scenario
For the scenario of 21 measurements with Δt = 2 (table 1, first row, first column) and θ (0) = (0.6,0.03) the graphic shows the 100 estimates (black dots) and the true parameter value θ (0) (big grey dot). Using a deterministic model, only the ratio but not the absolute value of the parameters is identifiable. Using the MSS method, one can see that not only the ratio but also the absolute value is identifiable, however, with a slightly worse accuracy.
Fig. 3Checking the approximation for one time course
This figure checks how well the approximation works for a time course of the Immigration‐Death model.
As the data sets contain 21 measurements, 21‐1 N(0, 1) random variables are drawn and a density estimation is performed with the SmoothKernelDistribution function in Mathematica 9 [39]. This action is repeated 1000 times. The solid line shows the mean of the 1000 densities and the yellow area fills the area from the 10%‐quantile to the 90%‐quantile, the grey area from the 1%‐quantile to the 99%‐quantile. The graphic shows an example of the ID11 scenario. The dashed line shows the density estimate of the (r 1,…,r 20) of equation (5). One can see that these numbers are compatible with the assumption of a N(0, 1) distribution which is a condition for the validity of the approximation.
Statistics of the estimation results for an auto‐regulatory gene network – fully observed
|
|
|
|
|
|
|
|
| |||
|---|---|---|---|---|---|---|---|---|---|---|
| (# Δ | 0.1 | 0.7 | 0.35 | 0.3 | 0.1 | 0.9 | 0.2 | 0.1 | rank | |
| (50 1) | av | 0.294 | 2.051 | 0.367 | 0.323 | 0.405 | 3.624 | 0.197 | 0.1 | div 18 |
| DNA | ARE | 222% | 226% | 20% | 20% | 344% | 341% | 18% | 18% | |
| medRE | 31% | 32% | 16% | 17% | 47% | 44% | 15% | 17% | ||
|
| 0.114 | 0.81 | 0.346 | 0.229 | 0.051 | 0.418 | 0.221 | 0.074 | rank 21 | |
|
| 0.094 | 0.72 | 0.435 | 0.344 | 0.052 | 0.485 | 0.265 | 0.119 | rank 21 | |
| (100 0.5) | av | 0.111 | 0.753 | 0.351 | 0.299 | 0.257 | 2.337 | 0.194 | 0.097 | div 4 |
| DNA | ARE | 21% | 19% | 18% | 16% | 185% | 184% | 17% | 17% | |
| medRE | 13% | 13% | 17% | 12% | 28% | 26% | 13% | 12% | ||
|
| 0.113 | 0.82 | 0.408 | 0.321 | 0.075 | 0.75 | 0.226 | 0.095 | rank 20 | |
|
| 0.113 | 0.71 | 0.276 | 0.253 | 0.086 | 0.77 | 0.223 | 0.1 | rank 11 | |
| (500 0.1) | av | 0.101 | 0.696 | 0.349 | 0.305 | 0.094 | 0.905 | 0.202 | 0.101 | div 0 |
| DNA | ARE | 8% | 7% | 13% | 12% | 9% | 9% | 14% | 15% | |
| medRE | 7% | 6% | 10% | 11% | 8% | 8% | 11% | 12% | ||
|
| 0.079 | 0.74 | 0.349 | 0.286 | 0.101 | 0.86 | 0.183 | 0.094 | rank 10 | |
| (50 1) | av | 0.421 | 2.275 | 0.36 | 0.327 | 0.394 | 3.358 | 0.197 | 0.101 | div 22 |
| DNA | ARE | 340% | 246% | 26% | 25% | 336% | 309% | 35% | 16% | |
| medRE | 21% | 24% | 18% | 19% | 36% | 30% | 31% | 13% | ||
|
| 0.095 | 0.42 | 0.321 | 0.277 | 0.10 | 0.73 | 0.235 | 0.104 | rank 2 | |
|
| 0.097 | 0.9 | 0.35 | 0.335 | 0.079 | 0.92 | 0.312 | 0.12 | rank 8 | |
| (100 0.5) | av | 0.105 | 0.687 | 0.359 | 0.309 | 0.298 | 2.552 | 0.199 | 0.1 | div 8 |
| DNA | ARE | 23% | 23% | 29% | 18% | 220% | 203% | 25% | 17% | |
| medRE | 21% | 20% | 26% | 15% | 25% | 25% | 20% | 14% | ||
|
| 0.12 | 0.4 | 0.52 | 0.38 | 0.092 | 0.998 | 0.215 | 0.081 | rank 42 | |
|
| 0.116 | 0.96 | 0.41 | 0.41 | 0.101 | 1.01 | 0.144 | 0.094 | rank 27 | |
| (500 0.1) | av | 0.107 | 0.623 | 0.302 | 0.315 | 0.092 | 0.895 | 0.215 | 0.096 | div 8 |
| DNA | ARE | 17% | 23% | 32% | 25% | 11% | 7% | 34% | 16% | |
| medRE | 14% | 19% | 27% | 18% | 10% | 5% | 28% | 16% | ||
|
| 0.052 | 0.91 | 0.277 | 0.35 | 0.128 | 0.93 | 0.137 | 0.075 | rank 73 |
This table shows the performance of the extended MSS method on an auto‐regulatory gene network model and compares it with results calculated by Wang et al. [8]. Different measurement scenarios are considered as well as different amounts of DNA . The initial values are for DNA = 10: DNA0 = 6, DNA.P 2(0) = 4 as well as for DNA = 2: DNA0 = 2, DNA.P 2(0) = 0. θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For each scenarios, 100 simulated time series are calculated with the Gillespie algorithm [2] within COPASI [37] Version 4.11‐65. For each of the time series an estimator is calculated. The table shows
the average of the estimators:
their ARE:
medRE: The median of the 100 relative errors
w1: the estimate of one time series shown by Wang et al. [8]
w2: the estimate of a second time series shown by Wang et al. [8]
the rank of w1 and w2 among the 100 estimates of this paper with respect to relative error (details on how this rank is calculated can be found in Sections 3.2.2); and
div stands for the number of time series for which no convergence could be achieved
Statistics of the estimation results for an auto‐regulatory gene network – fully observed plus measurement noise
|
|
|
|
|
|
|
|
|
| |||
|---|---|---|---|---|---|---|---|---|---|---|---|
| (# Δ | 0.1 | 0.7 | 0.35 | 0.3 | 0.1 | 0.9 | 0.2 | 0.1 | div | ||
| (500 0.1) | av | 0.105 | 0.679 | 0.413 | 0.39 | 0.094 | 0.909 | 0.397 | 0.208 | 0.451 | 2 |
|
| ARE | 10% | 8% | 26% | 32% | 10% | 10% | 108% | 115% | 11% | |
|
| medRE | 9% | 7% | 21% | 29% | 9% | 8% | 85% | 101% | 10% | |
| (500 0.1) | av | 0.145 | 0.886 | 0.947 | 0.937 | 0.098 | 0.969 | 0.883 | 0.499 | 1.603 | 1 |
|
| ARE | 45% | 28% | 171% | 212% | 10% | 13% | 343% | 400% | 60% | |
|
| medRE | 43% | 26% | 171% | 207% | 8% | 11% | 323% | 387% | 60% |
This table shows the performance of the extended MSS method on an auto‐regulatory gene network model with measurement noise. Different measurement scenarios are considered as well as different amounts of DNA . The initial values are DNA0 = 6, DNA.P 2(0) = 4, and therefore DNA = 10. θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For both scenarios 100 simulated time series are calculated with the Gillespie algorithm [2] within COPASI [37] Version 4.11‐65. Normally distributed measurement noise is added to each of the simulated time courses with a standard deviation σ displayed in the table. For each of the time series, an estimator is calculated. The table shows
the average of the estimators:
their ARE:
medRE: The median of the 100 relative errors
div stands for the number of time series for which no convergence could be achieved
Statistics of the estimation results for an auto‐regulatory gene network – partially observed
|
|
|
|
|
|
|
|
| |||
|---|---|---|---|---|---|---|---|---|---|---|
| (# Δ | 0.1 | 0.7 | 0.35 | 0.3 | 0.1 | 0.9 | 0.2 | 0.1 | ||
| (50 1) | av | 0.519 | 5.476 | 0.293 | 0.322 | 0.258 | 2.299 | 0.439 | 0.224 | div 36 |
| DNA | ARE | 438% | 689% | 29% | 25% | 201% | 196% | 136% | 137% | |
| medRE | 80% | 138% | 26% | 15% | 51% | 50% | 95% | 113% | ||
|
| 0.102 | 0.47 | 0.44 | 0.214 | 0.04 | 0.326 | 0.4 | 0.156 | rank 9 | |
|
| 0.09 | 0.70 | 0.440 | 0.348 | 0.052 | 0.483 | 0.263 | 0.119 | rank 1 | |
| (100 0.5) | av | 0.111 | 0.753 | 0.351 | 0.299 | 0.257 | 2.337 | 0.194 | 0.097 | div 4 |
| DNA | ARE | 21% | 19% | 18% | 16% | 185% | 184% | 17% | 17% | |
| medRE | 13% | 13% | 17% | 12% | 28% | 26% | 13% | 12% | ||
|
| 0.125 | 0.91 | 0.402 | 0.316 | 0.077 | 0.78 | 0.23 | 0.097 | rank 26 | |
|
| 0.188 | 0.64 | 0.413 | 0.25 | 0.072 | 0.64 | 0.43 | 0.196 | rank 84 | |
| (500 0.1) | Av | 0.101 | 0.696 | 0.349 | 0.305 | 0.094 | 0.905 | 0.202 | 0.101 | div 0 |
| DNA | ARE | 8% | 7% | 13% | 12% | 9% | 9% | 14% | 15% | |
| medRE | 7% | 6% | 10% | 11% | 8% | 8% | 11% | 12% | ||
|
| 0.078 | 0.76 | 0.35 | 0.3 | 0.103 | 0.88 | 0.188 | 0.097 | rank 7 | |
| (50 1) | av | 0.421 | 2.275 | 0.36 | 0.327 | 0.394 | 3.358 | 0.197 | 0.101 | div 22 |
| DNA | ARE | 340% | 246% | 26% | 25% | 336% | 309% | 35% | 16% | |
| medRE | 21% | 24% | 18% | 19% | 36% | 30% | 31% | 13% | ||
|
| 0.108 | 0.41 | 0.303 | 0.247 | 0.131 | 0.955 | 0.214 | 0.107 | rank 7 | |
|
| 0.079 | 0.56 | 0.383 | 0.332 | 0.073 | 0.82 | 0.228 | 0.099 | rank 3 | |
| (100 0.5) | Av | 0.105 | 0.687 | 0.359 | 0.309 | 0.298 | 2.552 | 0.199 | 0.1 | div 8 |
| DNA | ARE | 23% | 23% | 29% | 18% | 220% | 203% | 25% | 17% | |
| medRE | 21% | 20% | 26% | 15% | 25% | 25% | 20% | 14% | ||
|
| 0.123 | 0.41 | 0.55 | 0.386 | 0.079 | 0.87 | 0.213 | 0.085 | rank 48 | |
|
| 0.103 | 0.81 | 0.419 | 0.421 | 0.102 | 1.04 | 0.142 | 0.097 | rank 13 | |
| (500 0.1) | av | 0.107 | 0.623 | 0.302 | 0.315 | 0.092 | 0.895 | 0.215 | 0.096 | div 8 |
| DNA | ARE | 17% | 23% | 32% | 25% | 11% | 7% | 34% | 16% | |
| medRE | 14% | 19% | 27% | 18% | 10% | 5% | 28% | 16% | ||
|
| 0.075 | 0.75 | 0.37 | 0.3 | 0.13 | 0.96 | 0.25 | 0.11 | rank 14 |
This table shows the performance of the extended MSS method on a partially observed auto‐regulatory gene network model, only mRNA, P and P 2 observed, and compares it with results calculated by Wang et al. [8]. Different measurement scenarios are considered as well as different amounts of DNA . The initial values are for DNA = 10: DNA0 = 6, DNA.P 2(0) = 4 as well as for DNA = 2: DNA0 = 2, DNA.P 2(0) = 0. θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For each scenarios, 100 simulated time series are calculated with the Gillespie algorithm [2] within Copasi [37] Version 4.11‐65. For each of the time series, an estimator is calculated. The table shows
the average of the estimators:
their ARE:
medRE: The median of the 100 relative errors
w1 is the estimate of one time series shown by Wang et al. [8]
w2 is the estimate of a second time series shown by Wang et al. [8]
the rank of w1 and w2 among the 100 estimates of this paper with respect to relative error (details on how this rank is calculated can be found in the “Design of the simulation study section”) and
div stands for the number of time series for which no convergence could be achieved
Fig. 4Different scenarios for Lotka–Volterra
The stochastic Lotka–Volterra model can show a behaviour that is qualitatively different from the deterministic Lotka–Volterra model. Panel A shows oscillations in which the intrinsic stochasticity influences amplitude and frequency. Panel B shows a case in which the prey Y (1) dies out and then after that the predator Y (2). Panel C shows a scenario in which the predator dies out and then the prey population explodes
Statistics of the estimation results for a Lotka–Volterra model – fully observed
|
|
|
| |||
|---|---|---|---|---|---|
| (# Δ | 0.5 | 0.0025 | 0.3 | ||
| (40 1) | av | 0.5 | 0.002504 | 0.3 | |
| ARE | 3% | 2% | 3% | ||
| rj | 0.48 | 0.00255 | 0.308 | rank 69 | |
| bu | 0.48 | 0.00247 | 0.307 | rank 66 | |
| a | 0.484 | 0.00307 | 0.31 | rank 101 | |
| d | 0.48 | 0.00254 | 0.307 | rank 68 | |
| old av | 0.501 | 0.002515 | 0.302 | ||
| old ARE | 3% | 3% | 3% | ||
| (200 1) | av | 0.501 | 0.002499 | 0.3 | |
| normal | ARE | 1% | 1% | 1% | |
| rj | 0.5 | 0.0025 | 0.304 | rank 10 | |
| bu | 0.5 | 0.00251 | 0.304 | rank 16 | |
| a | 0.507 | 0.00254 | 0.308 | rank 83 | |
| d | 0.503 | 0.00252 | 0.306 | rank 52 | |
| old av | 0.5 | 0.002508 | 0.301 | ||
| old ARE | 1% | 2% | 2% | ||
| (200 1) | av | 0.5 | 0.002502 | 0.3 | |
| die‐out | ARE | 1% | 1% | 1% | |
| old av | 0.503 | 0.0025 | 0.299 | ||
| old ARE | 2% | 2% | 2% | ||
| (200 1) | av | 0.436 | 0.002363 | 0.286 | |
| explode | ARE | 13% | 6% | 5% | |
| old av | 0.214 | 0.001971 | 0.226 | ||
| old ARE | 57% | 21% | 25% | ||
| (40 5) | av | 0.499 | 0.002492 | 0.299 | |
| ARE | 2% | 3% | 3% | ||
| rj | 0.493 | 0.00262 | 0.314 | rank 85 | |
| bu | 0.493 | 0.00262 | 0.314 | rank 85 | |
| a | 0.493 | 0.00262 | 0.314 | rank 85 | |
| d | 0.492 | 0.00263 | 0.315 | rank 87 | |
| old av | 0.5 | 0.0025 | 0.301 | ||
| old ARE | 3% | 3% | 3% |
This table shows the performance of the extended MSS method on a Lotka–Volterra model and compares it with results calculated by Boys et al. [7]. Different measurement scenarios are considered. The initial values are (Y (1), Y (2)) = (71, 79). θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For each scenarios, 100 simulated time series are calculated with the Gillespie algorithm [2] within COPASI [37] Version 4.11‐65. For each of the time series, an estimator is calculated. The table shows
the average of the estimators:
their ARE:
(rj) an estimate by Boys et al. [7] using a reversible jump method
(bu) an estimate by Boys et al. [7] using a block updating method
an estimate by Boys et al. [7] using an approximation of a block updating method
(d) an estimate by Boys et al. [7] using a diffusion approximation
the rank of rj, bu, a and d among the 100 estimates of this paper with respect to relative error (details on how this rank is calculated can be found in the “Design of the simulation study section”)
Statistics of the estimation results for a Lotka–Volterra model – partially observed
|
|
|
| |||
|---|---|---|---|---|---|
| (# Δ | 0.5 | 0.0025 | 0.3 | ||
| (40 1) | av | 0.494 | 0.002536 | 0.307 | |
|
| ARE | 6% | 7% | 9% | |
| rj | 0.469 | 0.00244 | 0.287 | rank 34 | |
| bu | 0.469 | 0.00244 | 0.287 | rank 34 | |
| a | 0.49 | 0.00236 | 0.273 | rank 49 | |
| d | 0.473 | 0.00256 | 0.304 | rank 11 | |
| old av | 0.495 | 0.002554 | 0.31 | ||
| old ARE | 6% | 8% | 10% | ||
| (40 1) | av | 0.491 | 0.002639 | 0.319 | |
| only | ARE | 12% | 15% | 17% | |
| bu | 0.572 | 0.00201 | 0.236 | rank 39 | |
| a | 0.752 | 0.00151 | 0.176 | rank 78 | |
| d | 0.432 | 0.00291 | 0.347 | rank 32 | |
| old av | 0.552 | 0.002628 | 0.319 | ||
| old ARE | 27% | 23% | 25% |
This table shows the performance of the extended MSS method on a partially observed Lotka–Volterra model and compares it with results calculated by Boys et al. [7]. Different measurement scenarios are considered. The initial values are (Y (1), Y (2)) = (71, 79). θ (0) stands for the true parameter, # stands for number of measurement points and Δt stands for the time interval between two succeeding observations. For each scenarios, 100 simulated time series are calculated with the Gillespie algorithm [2] within COPASI [37] Version 4.11‐65. For each of the time series, an estimator is calculated. The table shows
the average of the estimators:
their average relative error:
(rj) an estimate by Boys et al. [7] using a reversible jump method
(bu) an estimate by Boys et al. [7] using a block updating method
an estimate by Boys et al. [7] using an approximation of a block updating method
(d) an estimate by Boys et al. [7] using a diffusion approximation
the rank of rj, bu, a and d among the 100 estimates of this paper with respect to relative error (details on how this rank is calculated can be found in “Design of the simulation study sections”)