Robert C Cope, Joshua V Ross.
Abstract
In an outbreak of an emerging disease, the epidemiological characteristics of the pathogen may be largely unknown. A key determinant of the ability to control the outbreak is the relative timing of infectiousness and symptom onset. We provide a method for identifying this relationship with high accuracy, based on simulated household-stratified symptom-onset data. Further, this can be achieved with observations taken on only a few specific days, chosen optimally, within each household. The information provided by this method may inform decision-making processes for outbreak response. An accurate and computationally efficient heuristic for determining the optimal surveillance scheme is introduced. This heuristic provides a novel approach to optimal design for Bayesian model discrimination.
Keywords: Bayesian model discrimination; Epidemiology; Optimal experimental design; Random forests.
Year: 2019 PMID: 31734243 PMCID: PMC7094159 DOI: 10.1016/j.jtbi.2019.110079
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691
Fig. 1 (a) Model schematic describing the transitions between states within each household continuous-time Markov chain, the five observation models being discriminated between, and the way in which these household-level data are observed. The data observed under each model are the number of observations of the relevant transition each day, within each household; data from four illustrative sample households are shown. (b) Random forest feature importance for the full 14-day design, used to construct the heuristic for smaller designs. Each bar represents a feature: for households of size 5 there are 6 features per day, corresponding to the proportion of households with each incidence count (0 to 5) on that day. (c) Resulting random forest accuracy (proportion of test simulations assigned to the correct model) as design size increases, for the true optimal design (solid lines) and the heuristic solution (crosses with dashed line). In this case, random assignment would produce accuracy 0.2. (d) Two-class accuracy of random forest model discrimination: this measures the accuracy of discrimination between models with symptoms before or coincident with infectiousness and models with symptoms beginning after infectiousness. In this case, random assignment would produce accuracy 0.52 (red dashed line). These results correspond to households of size 5, with 10,000 training samples from each model, each with parameters drawn from the distributions displayed in Supplemental Figure S1.
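The feature layout described in panel (b) and the chance-level accuracies quoted in panels (c) and (d) can be illustrated with a minimal sketch. The function names and the toy household data below are hypothetical. The chance calculation assumes a guesser that assigns labels at random in proportion to class frequencies, and the 3-versus-2 split of the five models in the two-class case is an assumption chosen because it reproduces the quoted 0.52; the caption itself does not state the split.

```python
from collections import Counter


def daily_count_features(households, household_size):
    """Flatten household data into per-day proportions of each incidence count.

    households: list of per-household lists of daily observed counts.
    Returns, for each day in turn, the proportion of households whose count
    on that day was 0, 1, ..., household_size.
    """
    n_households = len(households)
    n_days = len(households[0])
    features = []
    for day in range(n_days):
        tally = Counter(h[day] for h in households)
        for count in range(household_size + 1):
            features.append(tally[count] / n_households)
    return features


def proportional_chance_accuracy(class_shares):
    """Expected accuracy when labels are guessed in proportion to class shares."""
    return sum(p * p for p in class_shares)


# Hypothetical toy data: 4 households observed on 3 days.
households = [
    [0, 1, 0],
    [0, 0, 2],
    [1, 1, 0],
    [0, 1, 0],
]
features = daily_count_features(households, household_size=5)

five_model_chance = proportional_chance_accuracy([0.2] * 5)
# Assumed split: 3 of the 5 models in one class, 2 in the other.
two_class_chance = proportional_chance_accuracy([0.6, 0.4])
```

With households of size 5, each day contributes 6 features, so the 3-day toy example yields 18 features and a full 14-day design would yield 84.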
Events, transitions and rates within a household. N is the (fixed) household size, β, γ and σ are the rates of infection, gaining infectiousness and recovery, respectively.
| Description | Transition | Rate |
|---|---|---|
| Infection | ||
| 2 | ||
| Infectiousness | 2 | |
| 2 | ||
| Recovery | 2 | |
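
A household continuous-time Markov chain of this kind can be simulated with the Gillespie algorithm. The sketch below is illustrative only, because the transitions and rates are missing from the table above. It assumes two exposed and two infectious stages (suggested by the stray factors of 2), a frequency-dependent infection rate scaled by N - 1, and a single initially exposed individual. Parameter names are descriptive stand-ins for the rates of infection, gaining infectiousness, and recovery named in the caption, and the numerical values are hypothetical.

```python
import random


def simulate_household(n, infection_rate, infectiousness_rate, recovery_rate,
                       n_days, rng):
    """Gillespie simulation of one household under an ASSUMED staged structure.

    Returns a list of n_days rows; row d holds the number of each of the five
    transition types that occurred during day d.
    """
    # Assumed initial state: one exposed individual, everyone else susceptible.
    s, e1, e2, i1, i2 = n - 1, 1, 0, 0, 0
    daily = [[0] * 5 for _ in range(n_days)]
    t = 0.0
    while True:
        rates = [
            infection_rate * s * (i1 + i2) / (n - 1),  # S  -> E1 (assumed form)
            2 * infectiousness_rate * e1,              # E1 -> E2
            2 * infectiousness_rate * e2,              # E2 -> I1
            2 * recovery_rate * i1,                    # I1 -> I2
            2 * recovery_rate * i2,                    # I2 -> R
        ]
        total = sum(rates)
        if total == 0:
            break  # No further events are possible.
        t += rng.expovariate(total)  # Exponential waiting time to next event.
        if t >= n_days:
            break  # Next event falls outside the observation window.
        event = rng.choices(range(5), weights=rates)[0]
        if event == 0:
            s, e1 = s - 1, e1 + 1
        elif event == 1:
            e1, e2 = e1 - 1, e2 + 1
        elif event == 2:
            e2, i1 = e2 - 1, i1 + 1
        elif event == 3:
            i1, i2 = i1 - 1, i2 + 1
        else:
            i2 -= 1
        daily[int(t)][event] += 1
    return daily


rng = random.Random(1)
daily = simulate_household(5, 1.0, 0.5, 0.5, n_days=14, rng=rng)
```

Each column of `daily` corresponds to one candidate observed transition, so selecting a single column gives the daily counts that one observation model would record for this household.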
Fig. 2 (a) Accuracy of model discrimination in designs of size 5, as the number of households increases, and under partial observation. Note that p_obs is not a fixed parameter but is sampled from a distribution: the Beta(5,5) distribution has mean 0.5, and the Beta(7.5,2.5) distribution has mean 0.75. Figure S3 shows the equivalent result with a design of size 14. (b) Difference between heuristic designs (coloured points) and optimal designs (black boxes) as the design size increases. Note that we do not evaluate optimal designs of size 1 or 2, and so there are no optimal designs in these columns. (c) Distribution of training sample observations (under each model and number of households) for the most important feature under the heuristic: the proportion of households with 1 case observed on day 2. Each coloured point represents an observation in the training sample. These results correspond to households of size 5, with 10,000 training samples from each model, each with parameters drawn from the distributions that appear in Supplemental Figure S1.
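The means quoted in panel (a) follow from the standard result that a Beta(a, b) distribution has mean a / (a + b). The sketch below checks both values and shows one plausible way partial observation could be applied, namely binomial thinning of daily counts. The thinning mechanism, the single draw of the observation probability per simulation, and all names here are assumptions; the caption states only that the observation probability is sampled from a Beta distribution.

```python
import random


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def thin_counts(daily_counts, p_obs, rng):
    """Keep each case independently with probability p_obs (binomial thinning)."""
    return [sum(1 for _ in range(c) if rng.random() < p_obs)
            for c in daily_counts]


mean_low = beta_mean(5, 5)       # 0.5
mean_high = beta_mean(7.5, 2.5)  # 0.75

rng = random.Random(0)
# Assumed: one observation probability drawn per simulation.
p_obs = rng.betavariate(7.5, 2.5)
true_counts = [0, 1, 2, 1, 0]    # hypothetical daily counts for one household
observed = thin_counts(true_counts, p_obs, rng)
```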