Literature DB >> 28095403

A Likelihood Approach for Real-Time Calibration of Stochastic Compartmental Epidemic Models.

Christoph Zimmer¹, Reza Yaesoubi², Ted Cohen¹.

Abstract

Stochastic transmission dynamic models are especially useful for studying the early emergence of novel pathogens given the importance of chance events when the number of infectious individuals is small. However, methods for parameter estimation and prediction for these types of stochastic models remain limited. In this manuscript, we describe a calibration and prediction framework for stochastic compartmental transmission models of epidemics. The proposed method, Multiple Shooting for Stochastic systems (MSS), applies a linear noise approximation to describe the size of the fluctuations, and uses each new surveillance observation to update the belief about the true epidemic state. Using simulated outbreaks of a novel viral pathogen, we evaluate the accuracy of MSS for real-time parameter estimation and prediction during epidemics. We assume that weekly counts for the number of new diagnosed cases are available and serve as an imperfect proxy of incidence. We show that MSS produces accurate estimates of key epidemic parameters (i.e. mean duration of infectiousness, R0, and Reff) and can provide an accurate estimate of the unobserved number of infectious individuals during the course of an epidemic. MSS also allows for accurate prediction of the number and timing of future hospitalizations and the overall attack rate. We compare the performance of MSS to three state-of-the-art benchmark methods: 1) a likelihood approximation with an assumption of independent Poisson observations; 2) a particle filtering method; and 3) an ensemble Kalman filter method. We find that MSS significantly outperforms each of these three benchmark methods in the majority of epidemic scenarios tested. In summary, MSS is a promising method that may improve on current approaches for calibration and prediction using stochastic models of epidemics.

Entities: Chemical Disease Gene Species

Mesh：

Year: 2017 PMID： 28095403 PMCID： PMC5240920 DOI： 10.1371/journal.pcbi.1005257

Source DB: PubMed Journal: PLoS Comput Biol ISSN： 1553-734X Impact factor: 4.475

Introduction

The sporadic emergence of novel human pathogens (e.g. SARS, MERS, new strains of influenza) serves as a reminder of the importance of monitoring the spillover into and spread of pathogens in human populations. Accurate estimates of the fundamental epidemic parameters during the earliest phase of disease emergence can facilitate rational public health policy decisions [1, 2] and are thus of high priority [3, 4]. Estimating epidemic parameters is challenging during this initial period of disease emergence because the dynamics are stochastic and the processes governing disease spread are complex and usually only partially observable. Public health decision makers, therefore, require tools to make inference about the true state of the epidemic and the values of fundamental epidemic parameters using observations that can only imperfectly represent the true epidemic state (e.g. reported disease-related illnesses serve as an imperfect measure of incident cases). There has been substantial recent progress in the development of methods to infer epidemic parameters from partial epidemic observations [5-11]. For example, infection network models are commonly used when symptom onset date for each reported case is available [12-17]. In these models, disease spread is described as a directed network in which the nodes represent cases and the directed edges between nodes represent transmission links. In the absence of the type of detailed individual-level data required by infection network methods, compartmental transmission dynamic models have been used for parameter estimation and for projecting epidemic trajectory. These models divide the population into disjoint subgroups (e.g. susceptible, infectious, and recovered) and transitions between the epidemic states are described using ordinary differential equations (for deterministic models) [18] or Markov chains (for stochastic models) [19]. A wide variety of methods have been described to calibrate these compartmental models which differ based how they estimate the likelihood of observations. These methods make use of deterministic [20-22] or stochastic [23] models to describe the underlying epidemic process, and may involve Markov Chain Monte Carlo [24] or filtering techniques [25-32] to sample parameter space and the unobserved states. In this paper, we describe an alternative method for calibrating a general class of stochastic compartmental models using the types of data that would be available during the early period of epidemic spread. Unlike deterministic models, stochastic models capture chance events, a feature that proves essential for the accuracy of model-based parameter estimation and prediction during pathogen emergence. Our calibration method, Multiple Shooting for Stochastic systems (MSS), utilizes a linear noise approximation (LNA) approach and a state-updating procedure to approximate the likelihood of observations while explicitly accounting for the interdependency between subsequent epidemic observations. Using simulation experiments, we compare the performance of MSS with several competing approaches in terms of accuracy in estimating parameters, predicting future behaviors, and inferring the current epidemic state. We test the sensitivity of the performance of these approaches to observational noise and model misspecification.

Methods

Problem formulation

During an epidemic, surveillance systems may be able to capture a variety of measures, such as the number of disease-related diagnoses, hospitalizations or deaths. We use y to denote the vector of epidemic measures that can be observed during the period [t, t] (see Fig 1). Let Y = (y1, y2, …, y) be the epidemic history up to time t. To measure the fit of an epidemic model to the observations accumulated up to time t, we use the following likelihood function: where is the probability of observing y given the previous observations (y1, y2, …, y) and the model parameters θ.

Fig 1

Sequence of observations during an epidemic.

For many epidemics, the probability function may not be readily available and can be computationally expensive to evaluate. Therefore, the likelihood function (1) is often approximated by assuming that the epidemic observations in each period follow a discrete probability distribution (e.g. Poisson distribution) and are independently distributed across observation periods [23, 33]. For example, to approximate the likelihood function (1), Riley and colleagues assume that epidemic observations follow independent Poisson distributions where means are chosen to match the means generated from 250 model replications [23]. The method we develop here to approximate the likelihood function (1) does not retain the unrealistic assumption that these observations are independent. Below, we first describe an algorithm to efficiently approximate the probability function which can then be used to evaluate the likelihood function (1) for observations accumulated up to any point in the epidemic.

Approximating the likelihood function

In a compartmental model, the state of the epidemic at a given time t is defined as the number of individuals in each compartment at time t [19, 34], which we will refer to as ν. In most settings, particularly if the disease has a complicated natural history, the true distribution of individuals in each state over the course of epidemic progression, ν, i ∈ {0, 1, 2, …}, is not fully observable. Hence, to form statistical inference about the true epidemic state, we utilize belief states which define a probability distribution over all possible epidemic states given past observations. Let Π(⋅|Y) denote the belief state at time t given the accumulated observations Y. Now by conditioning on the epidemic state at time t, i.e. ν, the probability function in Eq (1) can be calculated as: where Ω is the set of feasible epidemic states (i.e. the support of the belief state Π) at time t. By conditioning on the state of the epidemic at time t, the probability function in Eq (2) can be calculated as: Calculating the probability function (3) can be computationally difficult. First, it requires calculating or approximating the transition probability p(ν|ν; θ) for each pair (ν, ν) ∈ Ω × Ω, and second, it involves enumeration over the set Ω × Ω, which can be prohibitively large even for simple epidemic models. One way to simplify the computational complexity of Eq (3) is to represent the belief state Π(⋅|Y) as a step function that takes 1 for the most probable state (denoted by ) and 0 elsewhere. This allows us to approximate the function in Eq (3) with: where represents the most likely epidemic state given observations Y = (y1, y2, …, y). By the definition of epidemic states, the transition from state ν to ν generates a unique set of observations, and it is trivial to find whether the state transition to ν can generate the observation y (see design of performance analysis for an illustrative example). Therefore, for a given observation y, the probability in Eq (4) is equal to 1 if the transition from state to ν results in observing y, and is zero otherwise. To calculate in Eq (4), it only remains to identify the state transition probability function and a state estimation scheme. In the next subsections, we propose an algorithm to achieve this.

Approximating state transition probabilities

Finding the exact state transition probability function p(⋅) can be difficult, and in many cases impossible, as state spaces in epidemic models can be quite large or unbounded. To overcome this problem, we employ a linear noise approximation (LNA) method to approximate the probability distribution of the new epidemic state ν given the previous state ν, i.e. p(ν|ν; θ). The LNA has been previously used to estimate parameters of stochastic biochemical reaction models [35, 36]. Here we extend Zimmer and Sahle’s method [36] to calibrate stochastic epidemic models where the true epidemic state is only partially observable. To approximate the probability distribution of ν given the state ν, the LNA method uses an ordinary differential equations (ODE) model to approximate the expected behavior of the epidemic over the period [t, t] and to identify a co-variance matrix to characterize the uncertainty around the epidemic behavior over this interval. We use the following notation to denote the ODE epidemic model used by the LNA method: In the ODE system (5), the vector x(t, x0; θ) is the epidemic state of the ODE model at time t given the initial state x0, the vector Λ(x(t, x0; θ), θ) denotes the instantaneous changes in the epidemic when at state x(t, x0; θ), and the matrix Γ describes how the instantaneous changes at time t impact the epidemic state at time t + Δt (see subsequent sections for an example). The LNA assumes that the probability distribution of ν|ν can be properly approximated by a normal distribution . The mean vector μ is the solution of ODE system (5) with ν as the initial condition (i.e. μ = x(t − t, ν; θ)) and the variance matrix cov = Σ(t − t, ν; θ) is the solution of the following ODE systems [37, 38]: In the ODE system (6), and D is a K × K matrix with the (i, j) entity equal to , where K is the number of compartments in the epidemic model. An important question remains about how well this proposed LNA method approximates the probability distribution of epidemic states. Relying on an extensive numerical analysis, we will demonstrate in the Results section that for the epidemic scenarios considered, our method yields accurate parameter estimations and reliable predictions. We also note that our method does not rely on a single LNA model to approximate the entire epidemic trajectory. For each observation period [t, t], it generates a new LNA model to approximate the epidemic behavior only over this particular period.

Updating belief states

We now describe how to update our belief about the true state of the epidemic, denoted by Π(⋅), once new observations y are obtained. We first note that Π(⋅) is defined to return 1 for the most likely state , and zero elsewhere. Therefore, given the state at time t, the probability of observing y during the interval [t, t] is equal to (see the discussion prior to Eq (3)). Now, the most probable state at time t, , is the one that leads to the highest probability of observing y: As discussed above, is either 0 or 1, and the probability function is approximated with a Normal distribution. Therefore, it is straightforward to solve the optimization problem (7). Note that in compartmental models, the support Ω can be recursively updated as new observations become available (see [39] for details and the S1 Text subsection 8 “Detailed pseudo code for our MSS” for an illustration).

Estimating model parameters

One of the main goals of model calibration is to leverage information from accumulating observations to characterize the uncertainty around model parameters. To this end, we employ a Bayesian approach that updates the probabilistic information on model parameters as new observations become available. This updating is performed according to the following Bayes rule: where π(⋅, Y) is the prior distribution before obtaining the i observation and π(⋅|Y) is the posterior distribution after obtaining the i observation. In Eq (8), the likelihood function L(MSS)(⋅|θ) is estimated using the approach described in the preceding sections. Note that π is used for both prior and posterior as the posterior is recursively calculated from the prior upon obtaining subsequent observations. Fig 2 summarizes the steps of our proposed method for calibrating stochastic epidemic models. The set Θ in Step 1.b includes parameter values for which the likelihood function (1) should be calculated. This set can be built by sampling from the prior distribution π0(θ) or via other sampling methods including random, Latin hypercube, or orthogonal sampling. Obviously, the sample set Θ can be modified as new observations become available and the prior distribution π0(θ) is being updated. However, the main advantage of fixing the sample set Θ at the beginning of the algorithm is to allow the algorithm to update the likelihood function (1) in a recursive manner as new observations are accumulating over time. That is, to find the likelihood value L(MSS)(Y; θ) at time t, we only need to know the value of the likelihood function from the previous time step (i.e. L(MSS)(Y; θ)) and the probability P(y|Y; θ) (see Step 2.c). The parameter posterior distribution obtained by using observations Y can be used as a prior distribution once a new observation y becomes available.

Fig 2

An algorithm for real-time calibration of stochastic compartmental epidemic models.

Predicting epidemic behavior

A central motivation for developing better methods for calibration of stochastic epidemic models is to improve predictions about the future course of epidemics. To demonstrate how our method can be used for prediction, let Z denote the quantity that we wish to predict (e.g. the number of diagnoses during the subsequent week) and denote the probability density function of the random variable Z if the epidemic is at state ν ∈ Ω at time t and the epidemic parameters have value θ ∈ Θ. Now, the probability distribution of the random variable Z given the accumulated observations at time t, Y = (y1, y2, …, y), can be calculated as: In Eq (9), the belief state Π(⋅|Y; θ) and the parameter posterior distribution π(⋅|Y) are both obtained from the algorithm described in Fig 2. To calculate the distribution of the random variable Z|Y, we assume that we have access to a stochastic simulation model to sample from the random variable Z|ν for a given set of parameter values θ. For the numerical results presented here, we use the Gillespie algorithm [40] (see SI document) to generate realizations for Z|Y that can be used to estimate the probability function P(Z|Y) in Eq (9), or to calculate mean or other moments of the random variable Z|Y.

Benchmark methods

We compare the performance of our method with three competing state-of-the-art approaches. The first benchmark method, which is based on an assumption of independent Poisson observations, is straightforward to understand and to implement and has been used to infer basic fundamental parameters during the early appearance of novel pathogens [23]. For the second and third approaches, we consider Particle Filter and Ensemble Kalman Filter [27] as these methods won the 2014 “Predict the Influenza Season Challenge” sponsored by the U.S. Center for Disease Control and Prevention [41] and also demonstrate competitive performance against four other methods in a recent comparison study [27].

Benchmark method A: Likelihood approximation with the assumption of independent Poisson observations (I.Poi)

We use the method from Riley and colleagues [23] as the first benchmark. This method assumes that the observations y are independent and factorizes the likelihood function L(Y; θ) (Eq (1)) to where . To calculate the mean μ, we obtain 1000 stochastic trajectories using parameter values θ and shift them so that the first observation for each trajectory lies within the first observation period [t0, t1]. We then calculate the mean of observations over period [t, t] (i.e. μ) using these simulated trajectories. This approach ignores the inter-dependencies between observations y1, y2, …, y and approximates by . Without the knowledge of an updated belief state Π(⋅|ν; θ), Eq (9) for prediction reduces to The mean and variance of random variable Z can be estimated using simulated trajectories with parameter values selected according to the probability function π(θ|Y). These simulated trajectories can also be used to make inference about the epidemic state at time t. To make predictions at time t, our implementation of the benchmark approach uses only simulated trajectories that have not been extinguished before t.

Benchmark method B: Particle Filter

The Particle Filter (PF) approach described by [27] also seeks to approximate the likelihood function (1) to calculate an a-posteriori distribution π(θ|Y) using the epidemic history Y. This method, however, uses a different approximation to calculate the probability : While the belief state is also a point distribution, the state update is performed based on the solution of ODE system (5): In Eq (12), the forward propagation p(PF)(ν|ν; θ) is a point distribution with mass 1 at , and P(PF)(y|ν, ν; θ) is assumed to follow a normal distribution where the function h maps state to observations and the observational variance is assumed to be: This observational variance is identical to the one used in [27]. Our sensitivity analysis reveals that the choice of observational variance, Eq (13), does not have a significant effect on the comparative performance of PF compared to the methods studied here (see Discussion). Since this approach calculates belief states and posterior distribution for θ, the prediction is performed in an identical fashion as for the MSS approach as described in Eq (9). The pseudo code and the full implementation of this approach for an epidemic model is described in the SI (S1 Text subsection 6 “Pseudo code for the PF” and S1 Text subsection 10 “Detailed pseudo code for PF”).

Benchmark method C: Ensemble Kalman Filter (EnKF)

We selected the ensemble Kalman Filter (EnKF) [27] for the third benchmark method. EnKF does not use a likelihood function L(Y; θ) and instead it propagates an initial ensemble that contains both epidemic states and epidemic parameters. Each new observation is then used to update this ensemble. After such updating, the ensemble is considered as a sample from the posterior distribution for both the parameters and the states. Similar to PF and MSS, this approach calculates belief states and posterior distribution for θ, and hence predictions can be performed using Eq (9). The pseudo code and the full implementation of this approach for an epidemic model is described in the SI (S1 Text subsection 7 “Pseudo code for the EnKF” and S1 Text subsection 11 “Detailed pseudo code for the EnKF”).

Design of the performance analysis

To compare the performance of our approach with that of the three competing benchmark methods (described above), we use several different simulated scenarios after the introduction of a novel viral pathogen epidemic. We develop a simple stochastic compartmental model where population members are grouped into four mutually exclusive compartments (see Fig 3). In this model, Infective individuals may infect Susceptibles with whom they come into contact. We assume that Infective individuals will eventually seek treatment due to worsening symptoms (and move to the Treatment compartment). Here, cases under treatment do not transmit infection and those who recover from the disease have full immunity against reinfection with the pathogen (and moved to the Recovered compartment).

Fig 3

A model for the outbreak of a novel viral pathogen.

Let x(t), x(t), x(t) and x(t), respectively, denote the number of individuals in compartments Susceptible, Infective, Treatment, and Recovered, and N(t) = x(t) + x(t) + x(t) + x(t) denote the population size at time t. Disease transmission can be modeled using the following ODE model: where θ1 is the disease transmission rate, θ2 is the rate of seeking treatment while infectious, and θ3 is the rate of recovering. This model can be presented in the format of the ODE system (5) by choosing In our analysis, we assume that at time t = 0 one population member becomes infected. The model initial condition can therefore be defined as x0 = (x(0), x(0), x(0), x(0)) = (N − 1, 1, 0, 0). To evaluate the performance of the calibration methods considered here, we used nine epidemic scenarios that differ by population size N ∈ {1000, 10,000, 10,0000} and the epidemic attack rate (moderate: [30% − 50%], severe: [50% − 70%] and extreme [70% − 100%]). To produce epidemic trajectories for each scenario, we first drew R0 from U([1, 3]) and mean duration of infectiousness from U([1, 20]), and then used the Gillespie algorithm [40] (see S1 Text subsection 1 “The Gillespie Algorithm” implemented in the software COPASI [42]) to simulate the SITR model of Fig 3 with and (Eq 14). We assumed that the delay until recovery after the onset of treatment is 4 days and that this is a known quantity (that is, θ3 = 0.25). To ensure that our selected simulated trajectories reflect expected shapes of outbreaks (rather than slow, simmering transmission), we only included trajectories with an epidemic peak that occurs between weeks 10 and 20 after take-off. Fig 4A and S1 to S8 Figs show the 50 simulated trajectories chosen for each scenario. In our evaluation, we assumed that the weekly number of disease-associated diagnoses was observed throughout the outbreak; this was calculated for each week as the number of people that transitioned from I to T. For our model, the number of diagnoses in week i is calculated as x(t)+x(t) − x(t) − x(t), where t − t = 7 days to obtain weekly data.

Fig 4

Median integrated relative errors (mIRE) in estimating model parameters and infection prevalence as well as prediction for simulated epidemics in a population of 10 000 with 50% − 70% attack rate.

Median integrated relative errors (mIRE) in estimating model parameters and infection prevalence as well as prediction for simulated epidemics in a population of 10 000 with 50% − 70% attack rate.

A) Epidemic scenarios used to evaluate the performance of calibration methods. Each scenario consists of 50 stochastic trajectories obtained by simulating the model in Fig 3 using the Gillespie algorithm. B) results for estimating R0, C) results for estimating Reff, D) results for estimating the mean duration of infectiousness, E) results for estimating the infection prevalence, F) results for predicting the next week diagnoses, G) results for predicting the diagnoses 3 weeks from now, H) results for predicting the diagnoses over the next 3 weeks, I) results for predicting the attack rate. In each panel: MSS: Multiple Shooting for Stochastic systems (the method proposed here); I.Poi: Independent Poisson (Benchmark Method A); PF: Particle Filter (Benchmark Method B); EnKF: Ensemble Kalman Filter (Benchmark Method C). P-values are from Wilcoxon Signed-Rank test evaluating the hypothesis that the median of relative errors for the MSS approach is smaller than that of I.Poi (first row), PF (second row), and EnKF (third row); p-values smaller than 0.001 are displayed as ***, p-values in between 0.001 and 0.01 as ** and between 0.01 and 0.05 as *. The values of mIRE for some scenarios fall above the vertical axis range and are not displayed. We implemented each of the calibration methods in Mathematica [43] and evaluated their performance based on their ability to accurately (1) estimate R0, effective R (Reff), the duration of infectiousness, and the current number of infectious individuals, and (2) predict the incident number of diagnoses (for the subsequent week and 3 weeks in the future) and the total cumulative cases (over the subsequent 3 weeks and over the whole epidemic). We note that in our analysis, we calibrate R0 and mean duration of infectiousness, and θ1 and θ2 are calculated based on the samples of R0 and mean duration of infectiousness for likelihood evaluations. For each simulated trajectory shown in Fig 4A and S1A to S8A Figs, we evaluated the performance of our method at several different times during the epidemic: 8 weeks to peak incidence, 4 weeks to peak, at peak, 4 weeks after the peak, and 8 weeks after the peak. To quantify the performance of these methods based on each target metric M (e.g. estimating R0 or prediction of the number of incident cases in the subsequent period after observing Y), we used the integrated relative error (IRE): where f(m|θ) is the probability density function of target M given the estimated parameter values θ, is the true value of target M. Ψ is the set of possible values that the target M can take (i.e. the support of the probability density function f), and represents the set of all parameter values (see SI for further details). We chose IRE as the performance metric since it provides a precise measure for how well the mass of the posterior distribution covers the true value of the targets to estimate. Alternative performance metrics such as relative error or mean square error do not account for the variance of predictions and hence are not suitable for this purpose. When fitting the model in Fig 3 to the simulated trajectories, we did not assume that the length of the period between the onset of the outbreak and the first observation is known. We therefore considered this delay as an additional model parameter that must be estimated through the calibration procedures in MSS.

Results

Performance evaluation: Parameter estimation

Fig 4B–4E displays the median of IRE (defined in Eq (16)) in estimating R0, effective R, duration of infectiousness, and infection prevalence from applying calibration methods described above on 50 simulated trajectories shown in Fig 4A for a severe scenario with attack rate between 50% and 70% for a population of 10000. See S1 to S8 Figs for the performance of these methods under other epidemic and population size scenarios. In all 9 scenarios (defined by epidemic severity and population size) and 5 estimation times (8 and 4 weeks to peak, at the peak, and 4 and 8 weeks after the peak), the MSS method either outperforms the competing benchmark methods or demonstrates similar performance. In particular, we note that in contrast to other approaches, the MSS method offers continuous improvement in the accuracy of parameter estimation as epidemics progress and more observations become available. At the peak and thereafter, MSS displays dominating performance, which is attributable to the more accurate likelihood approximation offered by this approach (see Fig 4B–4E and S1B–S1E to S8B–S8E Figs. Furthermore, the MSS method performs significantly better than the other competing methods in estimating the true (and unobserved) prevalence of infection. This suggests that the state updating procedure utilized by MSS method is more effective than the state updating methods employed by PF and EnKF. It is worth noting that the high IRE for infection prevalence at the beginning and the end of epidemics is the result of the small number of infectious individuals during these phases. While this impacts the performance of all methods (including MSS), the effect is small for MSS as the state updating procedure leads to more reliable state estimates. Also, since the estimation of the effective R requires an estimate of the current number of susceptibles, the MSS method again shows a superior behavior. Fig 5A–5C shows the results of testing the hypothesis that the MSS method outperforms all benchmark methods at 0.05 significance level at different epidemic times. This figure suggests that, in the majority of cases, the MSS method demonstrates either statistically superior or similar performance to the benchmark methods (represented by Green, Blue and Black colors). We note that even for the small number of situations in which MSS is statistically dominated by another benchmark method, the absolute difference in mIRE (as shown in Fig 4B–4E and S1B–S1E to S8B–S8E Figs is small.

Fig 5

Identifying the calibration method with statistically dominant performance at 0.05 significance level for estimating model parameters and infection prevalence.

Identifying the calibration method with statistically dominant performance at 0.05 significance level for estimating model parameters and infection prevalence.

Red, if MSS is statistically better than all benchmark methods. White, if all methods fail to demonstrate statistically dominant performance. Blue if I.Poi (Benchmark A), Green if Particle Filter (Benchmark B) and Black if the Ensemble Kalman filter (Benchmark C) statistically outperform the MSS method. Multiple colors are displayed if more than one method is significantly better than MSS. The performance of each method tested does not change markedly between scenarios which differ by host population size (Fig 4B–4E and S1B–S1E to S8B–S8E Figs). Even though the trajectories appear less stochastic when larger host populations are considered, because all scenarios assume a small number of infective persons in the early phase, there remains substantial stochasticity during the period of initial take-off which influences the timing of the epidemic peak. We note that the performance of I.Poi (in terms of estimating R0 and R) deteriorates as more observations accumulate (Fig 4B and 4C and S1B–S1C to S8B –S8C Figs. A potential explanation of this observation is that I.Poi uses the mean of 1000 stochastic simulations which is likely to be close to the deterministic solution. If the observed stochastic trajectory peaks early or late, each additional observation will deviate from the expected behavior (for the true parameter) and, therefore, the precision worsens. In the analyses presented so far, we assumed that the model structure is correct and that the calibration target (i.e. weekly number of diagnoses) can be accurately measured. To test the robustness of the results to imperfect calibration targets, we allowed for accumulating observations to be disturbed by an error term that follows a Gaussian distribution with mean zero and standard deviation 100. In this sensitivity analysis, the MSS method demonstrated the ability to cope with noisy data and sustain its performance (S9 Fig). We next considered a scenario for which the epidemic trajectories are provided by an SEITR model (see S1 Text subsection 2 “Equations of the SEITR model”) but an SITR model is chosen for parameter estimation and prediction. This allowed us to test to what extent model misspecification could erode the performance of these approaches, which is important because the true epidemic process may not be known. This sensitivity analysis revealed that each of the calibration methods considered here maintained their performance under this particular model misspecification scenario with respect to most performance criteria, but had problems with estimates for Reff (S10 and S11 Figs). Each method failed to accurately estimate Reff after the peak, which may be explained by the fact that in extreme epidemic scenarios used in this experiment, only a few number of individuals remain in the Susceptible compartment. Since the misspecified model used for calibration does not include the additional Exposed compartment, the size of Susceptible compartment is overestimated by all calibration methods, which leads to inaccurate Reff estimates after the peak.

Performance evaluation: Prediction

The F–I panels of Fig 4 and S1 to S8 Figs display the median of integrated relative errors (defined in Eq (16)) in predicting the number of diagnoses during the subsequent week, three weeks from now, accumulated over the next three weeks, and accumulated until the end of the epidemic from applying calibration methods described above on 50 simulated trajectories shown their A panels. In all 9 scenarios (defined by epidemic severity and population size) and 5 epidemic times (8 and 4 weeks to peak, at the peak, and 4 and 8 weeks after the peak), the MSS method demonstrates superior or similar performance against the competing benchmark methods. As for the estimation targets, benchmark methods show higher integrated relative errors during both early and late phases of epidemics. This behavior occurs due to the small number of infectious individuals (which results in a low number of diagnoses) during these epidemic phases. The state updating procedure employed by the MSS method plays a central role in sustaining the performance of MSS method under these conditions. Fig 6A–6C displays the results of testing the hypothesis that the MSS method outperforms all benchmark methods at 0.05 significance level at different epidemic times. This figure suggest that in the majority of cases the MSS method demonstrates statistically superior than or similar performance to the benchmark methods (represented by Green, Blue and Black colors). We note that even for case where MSS is statistically dominated by another benchmark method, the absolute difference in mIRE (as shown in Fig 4 and S1 to S8 Figs) is not substantial.

Fig 6

Identifying the calibration method with statistically dominant performance for predictions at 0.05 significance level.

Identifying the calibration method with statistically dominant performance for predictions at 0.05 significance level.

Red, if MSS is statistically better than all benchmark methods. White, if all methods fail to demonstrate statistically dominant performance. Blue if I.Poi (benchmark A), Green if Particle Filter (benchmark B) and Black if the Ensemble Kalman filter (benchmark C) statistically outperform the MSS method. Multiple colors can occur if more than one method is significantly better than MSS. The prediction ability of all methods is not strongly influenced by a slight model mis-specification as shwon in S10 Fig, but, importantly, the superiority holds as well despite lack of knowledge of the exact model.

Discussion

Our results suggest that the MSS calibration method can address several major challenges inherent in parameter estimation and model-based prediction during outbreaks, when the true epidemic state can be only partially observed and the behavior of disease spread is most stochastic. Using a comprehensive set of simulation studies, we compare the performance of our method to several existing state-of-the-art calibration methods including a particle filter (PF) [27], an ensemble Kalman filter (EnKF) [27], and a likelihood approximation with the assumption of independent Poisson observations (I.Poi) [23]. For the majority of epidemic scenarios we evaluated, we find that MSS either outperforms or does as well as existing methods in terms of the ability to provide accurate estimates for the basic reproductive number (R0), effective R, mean duration of infectiousness, unobserved number of infected individuals, the epidemic final attack rate and the number of cases during future weeks. The superior performance of MSS can be attributed to the use of LNA to capture the correlations between epidemic compartments throughout the simulated outbreak, and to the state updating procedure employed by MSS. While it would be in principal possible to incorporate an LNA approximation into a PF framework, we used the version of PF that is commonly used [27] for the purpose of this comparison study. The main difference between EnKF and MSS is that EnKF does not use an approximation for the likelihood for an observation but rather updates states and parameters based on correlation between compartments and data using multiple parameter vectors. The MSS state updating procedure allows the calculation of these correlations using a single parameter vector. This results in a more accurate description of correlations between compartments as epidemic data are accumulating. The LNA has been successfully applied to describe intrinsic stochastic fluctuations in various contexts [44-46] and has a solid theoretical basis [47, 48]. The LNA provides accurate approximation for large populations or when the fluctuations are small compared to the steady state of the system. As also suggested by the results presented here, LNA can perform well even when the system is not in steady state as long as the intervals over which the LNA projections are made are sufficiently small (here, observations occurred at weekly time steps for epidemics that lasted up to 100 weeks). A main advantage of the MSS technique is that the LNA needs only to be employed on the relatively short time intervals between recordings and not over the whole course of the epidemics. This is possible by employing the state estimation procedure to initialize the LNA on the succeeding interval. Additionally, the simulation studies presented here and previous work in a Systems Biology context [35, 36] show that LNA can yield accurate estimates for parameters for stochastic systems. Nevertheless, if required, LNA accuracy can be improved by using higher order terms for correction [49]. For additional discussion on the LNA in a calibration setting (LNA on a whole system) see [50], for additional theoretical remarks [48] or the original work [47]. The computationally challenging part of the MSS is the interval-wise evaluation of the LNA which consists of n + n (n + 1)/2 ODEs where n stands for the number of compartments. PF and EnKF, on the other hand, only require the solution of the ODE system that consists of n equations. The I.Poi. requires the execution of stochastic simulations for each parameter and is the most cost intensive of all four alternatives tested. MSS can be easily carried out without the use of a computing cluster: calibration and prediction for one time-series at the peak of the epidemic takes roughly 5 minutes on an Intel Xeon CPU E5-2630 v3 with 16GM RAM, see the S1 Text subsection 4 “Computational Effort” for a comparison of runtimes. We note that the performance of filtering methods can be impaired by filter degeneracy, a phenomenon where particles or members of an ensemble or filter have a very small likelihood value that varies over orders of magnitude even for the best members/particles. Our implementation of PF and EnKF is consistent with the original implementation by [27] and both methods are equipped with suitable mechanisms to handle filter degeneracy. Despite the fact that the MSS approach proposed here does not include such a mechanism to address filter degeneracy, MSS demonstrated very competitive performance in these simulated epidemic scenarios. We believe that speaks to the promise of MSS and potential extensions of this methodology. The I.Poi method assumes that epidemic observations gathered over time are independent, an assumption that may be violated in many epidemic scenarios. For example, S12 Fig shows a clear correlation pattern among observations of epidemics trajectories produced by the SITR model of Fig 3. Despite lacking a mechanism to account for correlations between epidemic observations, I.Poi remained competative under several of the epidemic scenarios studied here. Our numerical analysis suggests that the combination of noisy observations and model mis-specification will erode the performance of all calibration methods considered here S11 Fig. However, our results indicate that mis-specification of model structure far outweighs the impact of observational noise in these simulations (compare S8 and S9 Figs which show similar results fort he noise-free and noisy scenarios). This finding highlights the importance of selecting a model structure that adequately reflects the complexity of the disease. The performance of PF and EnKF can be affected by the choice of observational variance (Eq 13). The observation variance we selected for our analyses is adapted from [27], which is a heuristic choice but has proved reliable in previous studies [26]. We ran a sensitivity analysis on this parameter by dropping the absolute variance term from 10,000 to 1, and observed no major change in the performance of PF and EnKF methods in populations of various size. This suggests that the absolute term in the observational variance is not the determining factor in the relative performance of EnKF, PF and MSS. We note four main limitations of our study. First, while including the necessary complexity to offer a suitable platform for our analyses, the epidemic model used to evaluate the comparative performance of calibration methods is highly simplified. Investigating the impact of model complexity level on the performance of calibration methods would be of immediate interest for future research. Second, our evaluation are based on “simulated” epidemic trajectories. While only simulated scenarios can allow us to calculate the relative errors in estimating the unknown parameters (e.g. R0, mean duration of infectiousness), further investigation using real-world data is required to comprehensively evaluate the performance of MSS along with other existing calibration methods. Third, in our analyses, we used an SITR model of influenza epidemics that allows for a delay between the time of infection and time of diagnosis. Our model, however, does not account for the fact that in reality only a portion of influenza cases will be reported as not all infected individuals will have severe enough symptoms to seek care. If the fraction of cases that remains undetected is known and fixed, this is easily handled in the model with the addition of compartments and parameters to reflect the observation process, However, the degree of underreporting is often unknown and may change over the course of an epidemic. Such underreporting impacts the performance of all available calibration methods, and the development of novel approaches that can better account for this phenomenon is an important direction for future research. And finally, our simulation study assumed that epidemic parameters, such as contact rate, remain constant through the outbreak. In reality, however, some epidemic parameters may vary over time due to changes in population behavior and interactions. We also note that for simplicity, our analysis used only one type of real-time observations (i.e. weekly diagnoses), but the methods presented here can be extended for scenarios where multiple sources of real-time data (e.g. disease-related hospitalizations and deaths) are available. In summary, precise and timely estimation of key epidemic parameters (e.g. expected number of secondary cases, or mean duration of infectiousness) remains a critical component of accurate model-based prediction and effective response to infectious disease outbreaks. The increase in accuracy offered by the MSS method could enable public health officials to response more effectively to epidemic threats. (PDF) Click here for additional data file.

Median integrated relative errors (mIRE) in estimating model parameters and infection prevalence as well as prediction for simulated epidemics in a population of 1 000 with 30% − 50% attack rate.