Christopher I Jarvis, Amy Gimma, Flavio Finger, Tim P Morris, Jennifer A Thompson, Olivier le Polain de Waroux, W John Edmunds, Sebastian Funk, Thibaut Jombart.
Abstract
The fraction of cases reported, known as 'reporting', is a key performance indicator in an outbreak response, and an essential factor to consider when modelling epidemics and assessing their impact on populations. Unfortunately, its estimation is inherently difficult, as it relates to the part of an epidemic which is, by definition, not observed. We introduce a simple statistical method for estimating reporting, initially developed for the response to Ebola in Eastern Democratic Republic of the Congo (DRC), 2018-2020. This approach uses transmission chain data typically gathered through case investigation and contact tracing, and uses the proportion of investigated cases with a known, reported infector as a proxy for reporting. Using simulated epidemics, we study how this method performs for different outbreak sizes and reporting levels. Results suggest that our method has low bias, reasonable precision, and despite sub-optimal coverage, usually provides estimates within close range (5-10%) of the true value. Being fast and simple, this method could be useful for estimating reporting in real-time in settings where person-to-person transmission is the main driver of the epidemic, and where case investigation is routinely performed as part of surveillance and contact tracing activities.
Year: 2022 PMID: 35604952 PMCID: PMC9166360 DOI: 10.1371/journal.pcbi.1008800
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
Fig 1. Rationale of the method for estimating reporting.
This diagram illustrates transmission events inferred by case investigation of reported secondary cases, with arrows pointing from infectors to infectees. Darker shades indicate documented transmission events, while lighter shades show unknown infectors. The numbers of secondary cases with (blue) or without (orange) known infectors are used to estimate the reporting probability. In this example, reporting is approximately 50%.
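The counting rule in Fig 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `estimate_reporting` is hypothetical, and the Wilson score interval is an assumption, since the interval used by the method is not stated in this record.

```python
import math

def estimate_reporting(n_known, n_unknown, z=1.96):
    """Estimate reporting as the share of investigated secondary cases
    whose infector is a known, reported case.

    Returns (estimate, lower, upper). The interval is a Wilson score
    interval, chosen here for illustration only.
    """
    n = n_known + n_unknown
    if n == 0:
        raise ValueError("need at least one investigated case")
    p = n_known / n
    # Wilson score interval for a binomial proportion
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Example: 50 cases with a known infector, 50 without
print(estimate_reporting(50, 50))
```

With equal counts of cases with and without a known infector, the point estimate is 0.5, matching the approximate reporting level shown in the figure.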
Parameters used for simulating outbreaks.
This table details input parameters used for simulating outbreaks using the R package simulacr. Fixed values were used for all simulations, and reflect the natural history of the 2018–2020 Eastern DRC Ebola outbreak. Variable values changed across simulations.
| Parameter | Value |
|---|---|
| Fixed | |
| Maximum duration of the outbreak | 365 days |
| Incubation time distribution | Discretised gamma distribution |
| Infectious period distribution | Discretised gamma distribution |
| Reproduction number distribution | Gamma distribution: |
| Variable | |
| Population size | 200, 500, 1000, 2000, 5000, 7500, 10000, 15000, 20000 |
| Outbreak size | 10–99, 100–499, 500–999, 1000+ |
| Proportion of cases not reported | 0.25, 0.50, 0.75 |
*Population size is set for each simulation; outbreak sizes are determined after the outbreaks have been simulated and the unreported cases have been removed.
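The simulation design can be mimicked with a toy branching process. The sketch below is not simulacr and ignores timing altogether (no incubation or infectious periods, no 365-day limit). The function names and the default values `r_mean` and `r_shape` are hypothetical placeholders, not the paper's parameters. Each case draws an individual reproduction number from a gamma distribution, infects a Poisson number of others until the population cap is reached, and is then reported with probability `p_report`. The index case is left out of the denominator because it has no infector to report.

```python
import math
import random

def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's algorithm (fine for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_reporting(pop_size, p_report, r_mean=1.5, r_shape=2.0, seed=0):
    """Simulate one outbreak and return (n_reported, n_known, n_secondary)."""
    rng = random.Random(seed)
    infector = {0: None}   # case id -> infector id (None for the index case)
    queue = [0]
    while queue and len(infector) < pop_size:
        case = queue.pop(0)
        # Individual reproduction number: gamma with mean r_mean
        r_i = rng.gammavariate(r_shape, r_mean / r_shape)
        for _ in range(poisson(r_i, rng)):
            if len(infector) >= pop_size:
                break
            new_id = len(infector)
            infector[new_id] = case
            queue.append(new_id)
    # Each case is reported independently with probability p_report
    reported = {c for c in infector if rng.random() < p_report}
    # Reported secondary cases, and those whose infector was also reported
    secondary = [c for c in reported if infector[c] is not None]
    known = sum(infector[c] in reported for c in secondary)
    return len(reported), known, len(secondary)

n_rep, known, sec = simulate_reporting(pop_size=5000, p_report=0.5, seed=1)
if sec:
    print(n_rep, known / sec)
```

In this toy setting, the share of reported secondary cases with a reported infector should approach `p_report` as the outbreak grows, which is the intuition behind the method.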
Metrics used to measure performance in the simulation study.
| Performance measure | Definition |
|---|---|
| Bias | |
| Coverage | If we define a confidence interval |
| Precision: model-based standard error | |
| Precision: empirical standard error | |
| Absolute error | |
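The definitions in this table did not survive extraction. As a hedged stand-in, the sketch below computes the measures using the formulas commonly used in simulation studies: bias as the mean estimate minus the truth, empirical standard error as the standard deviation of the estimates, model-based standard error as the root mean of the squared estimated standard errors, and coverage as the share of intervals containing the truth. The function name `performance`, the Wald-type interval, and the Monte Carlo standard error formulas are assumptions and may differ from the paper's notation.

```python
import math
from statistics import mean, stdev

def performance(estimates, std_errors, truth, z=1.96):
    """Summarise repeated simulations of one scenario.

    estimates  -- one point estimate per simulation
    std_errors -- the matching estimated standard errors
    truth      -- the true parameter value used in the simulations
    """
    n = len(estimates)
    bias = mean(estimates) - truth
    emp_se = stdev(estimates)                       # empirical standard error
    mod_se = math.sqrt(mean(se**2 for se in std_errors))  # model-based SE
    # Coverage of Wald-type intervals: estimate +/- z * SE (an assumption)
    covered = sum(abs(e - truth) <= z * se
                  for e, se in zip(estimates, std_errors))
    coverage = covered / n
    return {
        "bias": bias,
        "bias_mcse": emp_se / math.sqrt(n),
        "empirical_se": emp_se,
        "model_se": mod_se,
        "coverage": coverage,
        "coverage_mcse": math.sqrt(coverage * (1 - coverage) / n),
    }

print(performance([0.4, 0.5, 0.6], [0.1, 0.1, 0.1], truth=0.5))
```

A model-based standard error smaller than the empirical one means the intervals are too narrow on average, which shows up as coverage below the nominal level.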
Performance measures from 4000 simulations by reported outbreak size and true reporting level.
Estimate (Monte Carlo standard error).
| Performance measure (MCSE) | Proportion reported | 10–99 | 100–499 | 500–999 | 1000 or more |
|---|---|---|---|---|---|
| Bias | 0.25 | 0 (0.07) | 0 (0.03) | 0 (0.02) | 0 (0.01) |
| | 0.5 | -0.01 (0.07) | 0 (0.04) | 0 (0.02) | 0 (0.01) |
| | 0.75 | -0.01 (0.07) | 0 (0.04) | 0 (0.02) | 0 (0.01) |
| Coverage | 0.25 | 95.7% (0.3) | 94.1% (0.4) | 94.4% (0.4) | 93% (0.4) |
| | 0.5 | 92.6% (0.4) | 92.4% (0.4) | 91.3% (0.4) | 91.2% (0.4) |
| | 0.75 | 92.3% (0.4) | 91.5% (0.4) | 89.2% (0.5) | 88.6% (0.5) |
| Model-based standard error | 0.25 | 0.065 (0) | 0.024 (0) | 0.015 (0) | 0.01 (0) |
| | 0.5 | 0.061 (0) | 0.038 (0) | 0.019 (0) | 0.011 (0) |
| | 0.75 | 0.059 (0.001) | 0.036 (0) | 0.014 (0) | 0.011 (0) |
| Empirical standard error | 0.25 | 0.071 (0.001) | 0.025 (0) | 0.016 (0) | 0.01 (0) |
| | 0.5 | 0.07 (0.001) | 0.044 (0) | 0.022 (0) | 0.012 (0) |
| | 0.75 | 0.068 (0.001) | 0.043 (0) | 0.017 (0) | 0.013 (0) |
Fig 2. Comparison of estimated versus actual reporting.
This graph shows reporting as estimated by the method for 4000 simulated outbreaks, broken down by outbreak size category (y-axis). Each dot corresponds to an independent simulation. The vertical red bars indicate the average within each category. Colors indicate the true reporting used in the simulations.
Fig 3. Zip plot showing coverage results.
This graph shows the 95% confidence intervals estimated by the method, broken down by reported outbreak size category and true reporting value. The vertical axis represents the fractional centile of |Z|, where $Z = (\hat{\pi} - \pi) / \widehat{\mathrm{SE}}(\hat{\pi})$, π is the true reporting and $\hat{\pi}$ is its estimate. The confidence intervals are ranked by their level of coverage, so the vertical axis can be used to read off the proportion of confidence intervals that contain the true value; for example, 0.95 represents a coverage of 95%.
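The ranking behind a zip plot can be sketched as follows. This assumes the usual construction, in which Z is the estimate minus the true value divided by the estimated standard error, and Wald-type 95% intervals. The function name `zip_plot_data` and the dictionary keys are hypothetical.

```python
def zip_plot_data(estimates, std_errors, truth, z_crit=1.96):
    """Return one row per simulation, ranked by |Z|.

    Each row holds the interval bounds, whether the interval covers the
    truth, and the fractional centile of |Z| used as the vertical axis.
    """
    rows = []
    for est, se in zip(estimates, std_errors):
        z = (est - truth) / se
        rows.append({
            "abs_z": abs(z),
            "lower": est - z_crit * se,
            "upper": est + z_crit * se,
            "covers": abs(z) <= z_crit,
        })
    # Intervals with the smallest |Z| sit at the bottom of the plot
    rows.sort(key=lambda r: r["abs_z"])
    n = len(rows)
    for rank, row in enumerate(rows, start=1):
        row["centile"] = rank / n
    return rows

for row in zip_plot_data([0.5, 0.8, 0.45], [0.05, 0.05, 0.05], truth=0.5):
    print(row)
```

Intervals that miss the true value gather at the top of the ranking, so the centile at which non-covering intervals start gives the observed coverage.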
Fig 4. Absolute error in reporting estimation.
This graph shows, for different simulation settings, the proportion of results within a given margin of absolute error, expressed as the absolute difference between the true and the estimated reporting (in %). Rows correspond to different outbreak size categories (outbreak size as reported). True reporting is indicated in color.
Absolute error between true and estimated reporting from 4000 simulations, by reported outbreak size and true reporting level. Cells give the number (%) of simulations whose absolute error from the true value falls within each margin.
| Proportion reported | Reported outbreak size | ≤ 5% | ≤ 10% | ≤ 15% | ≤ 20% |
|---|---|---|---|---|---|
| 0.25 | 10–99 | 2213 (55.3%) | 3376 (84.4%) | 3849 (96.2%) | 3973 (99.3%) |
| | 100–499 | 3817 (95.4%) | 4000 | 4000 | 4000 |
| | 500–999 | 3995 (99.9%) | 4000 | 4000 | 4000 |
| | 1000+ | 3999 (100%) | 4000 | 4000 | 4000 |
| 0.5 | 10–99 | 2110 (52.8%) | 3430 (85.8%) | 3860 (96.5%) | 3978 (99.4%) |
| | 100–499 | 2981 (74.5%) | 3899 (97.5%) | 3998 (100%) | 4000 |
| | 500–999 | 3905 (97.6%) | 4000 | 4000 | 4000 |
| | 1000+ | 4000 | 4000 | 4000 | 4000 |
| 0.75 | 10–99 | 2400 (60%) | 3575 (89.4%) | 3835 (95.9%) | 3942 (98.6%) |
| | 100–499 | 3067 (76.7%) | 3890 (97.2%) | 3991 (99.8%) | 4000 |
| | 500–999 | 3988 (99.7%) | 4000 | 4000 | 4000 |
| | 1000+ | 3992 (99.8%) | 4000 | 4000 | 4000 |
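Counts of this kind can be obtained by comparing each simulation's absolute error with a set of margins. The sketch below is illustrative only: the function name `within_margin` is hypothetical, and the margins are expressed as proportions (0.05 for 5%).

```python
def within_margin(estimates, truth, margins=(0.05, 0.10, 0.15, 0.20)):
    """For each margin, return (count, share) of estimates whose absolute
    error from the true value does not exceed that margin."""
    abs_err = [abs(e - truth) for e in estimates]
    n = len(abs_err)
    result = {}
    for m in margins:
        count = sum(err <= m for err in abs_err)
        result[m] = (count, count / n)
    return result

# Example: four estimates of a true reporting level of 0.5
print(within_margin([0.5, 0.53, 0.62, 0.9], truth=0.5))
```

Because the comparison uses floating-point numbers, estimates lying exactly on a margin may fall on either side; rounding the errors first is one way to avoid this.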