Marc Lipsitch1, Christl A Donnelly2, Christophe Fraser2, Isobel M Blake2, Anne Cori2, Ilaria Dorigatti2, Neil M Ferguson2, Tini Garske2, Harriet L Mills2, Steven Riley2, Maria D Van Kerkhove3, Miguel A Hernán4.
Abstract
Estimating the case-fatality risk (CFR), the probability that a person dies from an infection given that they are a case, is a high priority in epidemiologic investigation of newly emerging infectious diseases and sometimes in new outbreaks of known infectious diseases. The data available to estimate the overall CFR are often gathered for other purposes (e.g., surveillance) in challenging circumstances. We describe two forms of bias that may affect the estimation of the overall CFR (preferential ascertainment of severe cases and bias from reporting delays) and review solutions that have been proposed and implemented in past epidemics. Also of interest is the estimation of the causal impact of specific interventions (e.g., hospitalization, or hospitalization at a particular hospital) on survival, which can be estimated as a relative CFR for two or more groups. When observational data are used for this purpose, three more sources of bias may arise: confounding, survivorship bias, and selection due to preferential inclusion in surveillance datasets of those who are hospitalized and/or die. We illustrate these biases and caution against causal interpretation of differential CFR among those receiving different interventions in observational datasets. Again, we discuss ways to reduce these biases, particularly by estimating outcomes in smaller but more systematically defined cohorts ascertained before the onset of symptoms, such as those identified by forward contact tracing. Finally, we discuss the circumstances in which these biases may affect non-causal interpretation of risk factors for death among cases.
Year: 2015 PMID: 26181387 PMCID: PMC4504518 DOI: 10.1371/journal.pntd.0003846
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Potential biases that can affect the estimation of CFR (and thereby also the comparison of CFR across groups).
| Bias | Direction | Outbreaks in which analysts have noted this bias may be operating | Possible solutions |
|---|---|---|---|
| Preferential ascertainment of severe cases | Spuriously increases estimate of CFR | Influenza H1N1pdm [ | Note: these solutions are listed in approximately the temporal order in which they may be practical, from early in the outbreak to later on; details will depend on the epidemiology of the outbreak. |
| | | | Use sentinel surveillance sites to estimate multipliers between various levels of severity and extrapolate to a larger population [ |
| | | | Survey- or health-facility–based surveillance for symptomatic infection [ |
| | | | Use travelers from high-burden areas with low ascertainment to low-burden areas with higher ascertainment to estimate incidence of infection in source population [ |
| | | | Surveillance pyramid approaches: reconstruct conditional probabilities of appearing at one severity level conditional on reaching a lower severity level; combine data sources that have relatively complete ascertainment of higher severity levels (e.g., hospitalization, ICU, death) with those having relatively complete ascertainment of lower levels (e.g., seeking medical attention, hospitalization) [ |
| | | | Serologic ascertainment of infection [ |
| | | | Individuals ascertained by a different mechanism, e.g., named healthy contacts of cases who subsequently test positive, could be a more representative group in whom to assess severity [ |
| Bias from reporting delays | Spuriously decreases estimate of CFR | SARS [ | Limit analysis to those cases with sufficiently long follow-up for a death to have been recorded had a death occurred. While this may lead to extremely small sample sizes near the beginning of an epidemic, this strategy is more feasible after a local epidemic wave, including reporting delays, has passed or nearly passed [ |
| | | | Limit analysis to those cases known either to have died or recovered, excluding those with unknown outcome (biased if severity affects outcome ascertainment) [ |
| | | | Apply a competing-risk Kaplan-Meier–like method or a parametric mixture model to the full dataset (biased if the times to death and time to recovery have different distributions) [ |
| | | | Fit the distribution of times to death and to recovery to estimate the true CFR [ |
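The surveillance pyramid approach listed in the table amounts to multiplying conditional probabilities of moving from one severity level to the next. The following is a minimal Python sketch of that arithmetic; the function name `pyramid_cfr`, the level labels, and all numeric values are hypothetical and chosen only for illustration, not taken from any outbreak.

```python
from math import prod


def pyramid_cfr(conditional_probs):
    """Multiply the conditional probabilities of climbing the severity pyramid.

    conditional_probs maps a label such as "hospitalized | medically attended"
    to the probability of reaching the higher level given the lower one.
    """
    for label, p in conditional_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {label!r}: {p}")
    return prod(conditional_probs.values())


# Hypothetical inputs: each could come from a different data source with
# relatively complete ascertainment at the two levels it links.
steps = {
    "medically attended | symptomatic": 0.5,
    "hospitalized | medically attended": 0.04,
    "died | hospitalized": 0.1,
}

symptomatic_cfr = pyramid_cfr(steps)
print(f"Estimated CFR among symptomatic cases: {symptomatic_cfr:.4f}")
```

The point of the decomposition is that no single data source needs to observe both the mild cases in the denominator and the deaths in the numerator.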
Fig 1. Illustration of delayed reporting bias in an exponentially growing epidemic.
In an ongoing epidemic, there will typically be a delay between the reporting of a case and the reporting of the death of that case, if the infected person dies. Thus, at any moment, there will be some cases reported who will die of the infection but who have not yet died, or whose deaths have not yet been reported. Simple division of the number of deaths reported by week w (green), by the number of cases reported by week w (blue) will underestimate the CFR because the numerator does not include all those cases in the denominator who will eventually die. With a reporting delay of 3 weeks for deaths compared to cases, the reported deaths curve will be shifted 3 weeks to the right, relative to the curve of the total number of cases reported by week w who will die (red). If the epidemic doubling time is 2 weeks, as shown here, the underestimate of CFR will be by a factor of about $2^{3/2} \approx 2.8$, with the exponent being the number of epidemic doubling times that pass between case reporting and death reporting. In reality, there will be a distribution of reporting delays rather than a fixed delay, making this a heuristic rather than exact approach. The problem is ameliorated in an epidemic that grows more slowly or less than exponentially. For more details, see references in Table 1.
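The heuristic in this caption can be reproduced with a few lines of Python. This is a sketch under the caption's own simplifying assumptions (purely exponential growth of cumulative reported cases and a single fixed delay); the function names and the true CFR of 0.5 are hypothetical choices for illustration.

```python
def delay_underestimate_factor(delay_weeks, doubling_time_weeks):
    """Factor by which deaths/cases understates the CFR under a fixed delay."""
    return 2.0 ** (delay_weeks / doubling_time_weeks)


def naive_cfr(true_cfr, week, delay_weeks, doubling_time_weeks):
    """Reported deaths by `week` divided by reported cases by `week`."""
    # Cumulative reported cases grow exponentially (arbitrary scale).
    cases = 2.0 ** (week / doubling_time_weeks)
    # Deaths reported by `week` arise from cases reported `delay_weeks` earlier.
    deaths = true_cfr * 2.0 ** ((week - delay_weeks) / doubling_time_weeks)
    return deaths / cases


factor = delay_underestimate_factor(delay_weeks=3, doubling_time_weeks=2)
observed = naive_cfr(true_cfr=0.5, week=10, delay_weeks=3, doubling_time_weeks=2)

print(f"underestimate factor: {factor:.2f}")
print(f"naive CFR: {observed:.3f}; corrected: {observed * factor:.3f}")
```

Under these assumptions the naive ratio does not depend on the week chosen, only on the ratio of the delay to the doubling time.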
Potential biases that can affect the comparison of CFR across groups (relative CFR), using the example of comparing the CFR among hospitalized and non-hospitalized persons to assess the relative CFR for hospitalization.
| Bias | Direction | Outbreaks in which analysts have noted this bias may be operating | Possible solutions/means of detecting the bias |
|---|---|---|---|
| Survivorship bias | Spurious protective effect of hospitalization on risk of death | Ebola (this article) | Conditioning analysis on survival up to day |
| | | | Individuals identified before becoming cases (e.g., as healthy contacts of infected persons) and actively followed regardless of clinical severity could be analyzed separately as a prospective cohort for whom the course of disease could be observed and this restriction readily made. |
| Confounding | May be in either direction, depending on whether those receiving the intervention have better or worse prognosis. | Ebola (this article), H1N1pdm (effect of antiviral treatment on death) [ | In principle, analysis can adjust for prognostic factors that also predict hospitalization via matching, stratification, or multivariable analysis. In practice, such information may be unavailable [ |
| | | | Such adjustments will be more readily made if data are obtained prospectively from a cohort of cases identified before becoming cases. |
| Selection bias | Direction of bias depends on the probabilities of inclusion in the dataset depending on exposure and outcome. | Ebola (this article) | Without knowledge of how cases came to enter a dataset, the magnitude of this bias cannot be evaluated. Under assumptions about the proportion of cases entering the dataset for various reasons, a sensitivity analysis could be performed to assess the plausibility of assigning any observed protective effect to this bias [ |
| | | | This bias too may be avoided by prospectively following a cohort of individuals who are identified before becoming cases. |
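Survivorship bias and the effect of conditioning on survival can be illustrated with deterministic arithmetic. The Python sketch below assumes a hypothetical cohort in which hospitalization has no effect on death, hospitalization can only happen at a landmark day, and some deaths occur before that day; the function name and every number are illustrative assumptions rather than data from the article.

```python
def survivorship_example(n_cases=1000, cfr=0.5,
                         frac_deaths_before_landmark=0.5,
                         p_hosp_if_alive=0.5):
    """Return (naive RR, RR conditioned on survival to the landmark day)."""
    deaths = n_cases * cfr
    survivors = n_cases - deaths

    # Cases who die before the landmark day can never be hospitalized.
    early_deaths = deaths * frac_deaths_before_landmark
    late_deaths = deaths - early_deaths

    # Among those alive at the landmark day, hospitalization is assigned
    # independently of eventual outcome (that is, it has no true effect).
    hosp_deaths = late_deaths * p_hosp_if_alive
    hosp_survivors = survivors * p_hosp_if_alive
    nonhosp_late_deaths = late_deaths - hosp_deaths
    nonhosp_survivors = survivors - hosp_survivors

    cfr_hosp = hosp_deaths / (hosp_deaths + hosp_survivors)

    # Naive comparison: early deaths are counted as "not hospitalized".
    naive_nonhosp_deaths = nonhosp_late_deaths + early_deaths
    cfr_nonhosp_naive = naive_nonhosp_deaths / (naive_nonhosp_deaths + nonhosp_survivors)

    # Conditioned comparison: restrict to cases alive at the landmark day.
    cfr_nonhosp_cond = nonhosp_late_deaths / (nonhosp_late_deaths + nonhosp_survivors)

    return cfr_hosp / cfr_nonhosp_naive, cfr_hosp / cfr_nonhosp_cond


naive_rr, conditioned_rr = survivorship_example()
print(f"naive RR: {naive_rr:.3f}, conditioned RR: {conditioned_rr:.3f}")
```

With these inputs the naive risk ratio falls well below 1 even though hospitalization does nothing, while the comparison restricted to those alive at the landmark day recovers a risk ratio of 1.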
Effect of selection bias on estimates of relative CFR on the risk ratio (RR), odds ratio (OR), and risk difference (RD) scales.
| Joint frequencies of hospitalization and death in the whole population among those alive at day 8 of symptoms | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 200 | 400 | 600 | RR_P = 0.75 |
| Die | 800 | 600 | 1,400 | OR_P = 0.375 |
| Total | 1,000 | 1,000 | 2,000 | RD_P = -0.20 |

| Assumed probability of being in the database sample given hospitalization and death | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Average | Measure |
|---|---|---|---|---|
| Survive | 5% | 40% | 28% | Extent of selection bias = 0.16 |
| Die | 40% | 50% | 44% | |
| Average | 33% | 46% | 40% | |

| Frequencies of persons in the database | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 10 | 160 | 170 | RR_D = 0.67 |
| Die | 320 | 300 | 620 | OR_D = 0.06 |
| Total | 330 | 460 | 790 | RD_D = -0.32 |
Subscript P represents the population values, while subscript D represents the values measured for those cases included in the database; selection bias produces the discrepancy. The extent of selection bias may be measured as $OR_D / OR_P = \frac{S_{11} S_{00}}{S_{10} S_{01}}$, where $S_{ij}$ is the probability that a case with exposure (hospitalization at day 8) $i$ and outcome (mortality) $j$ appears in the database. In this example, selection bias spuriously enhances the negative association between hospitalization on day 8 and death, on all scales: RR, OR, and RD.
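The table's arithmetic can be checked with a short Python sketch. The dictionary layout keyed by `(hospitalized, died)` and the function name `association` are illustrative assumptions; the counts and selection probabilities are those tabulated above, and expected database counts are taken to be population counts multiplied by the selection probabilities.

```python
def association(counts):
    """Return (RR, OR, RD) for death comparing hospitalized vs. not.

    counts[(h, d)] is the number of cases with hospitalization h and death d,
    each coded 0 (no) or 1 (yes).
    """
    risk_hosp = counts[(1, 1)] / (counts[(1, 1)] + counts[(1, 0)])
    risk_not = counts[(0, 1)] / (counts[(0, 1)] + counts[(0, 0)])
    rr = risk_hosp / risk_not
    odds_ratio = (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])
    rd = risk_hosp - risk_not
    return rr, odds_ratio, rd


# Population among those alive at day 8 of symptoms.
population = {(0, 0): 200, (1, 0): 400, (0, 1): 800, (1, 1): 600}
# Assumed probability of appearing in the database for each cell.
selection = {(0, 0): 0.05, (1, 0): 0.40, (0, 1): 0.40, (1, 1): 0.50}

# Expected database counts.
database = {cell: population[cell] * selection[cell] for cell in population}

rr_p, or_p, rd_p = association(population)
rr_d, or_d, rd_d = association(database)
bias = (selection[(1, 1)] * selection[(0, 0)]) / (selection[(1, 0)] * selection[(0, 1)])

print(f"population: RR={rr_p:.2f} OR={or_p:.3f} RD={rd_p:.2f}")
print(f"database:   RR={rr_d:.2f} OR={or_d:.2f} RD={rd_d:.2f}")
print(f"selection bias measure: {bias:.2f}")
```

Swapping in a different `selection` dictionary gives a quick sensitivity analysis of how the three measures respond to other inclusion probabilities.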
Effect of selection bias on estimates of relative CFR on the risk ratio (RR), odds ratio (OR), and risk difference (RD) scales.
| Joint frequencies of hospitalization and death in the whole population among those alive at day 8 of symptoms | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 200 | 400 | 600 | RR_P = 0.75 |
| Die | 800 | 600 | 1,400 | OR_P = 0.375 |
| Total | 1,000 | 1,000 | 2,000 | RD_P = -0.20 |

| Assumed probability of being in the database sample given hospitalization and death | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Average | Measure |
|---|---|---|---|---|
| Survive | 20% | 40% | 33% | Extent of selection bias = 1.13 |
| Die | 40% | 90% | 61% | |
| Average | 36% | 70% | 53% | |

| Frequencies of persons in the database | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 40 | 160 | 200 | RR_D = 0.87 |
| Die | 320 | 540 | 860 | OR_D = 0.42 |
| Total | 360 | 700 | 1,060 | RD_D = -0.12 |
In this example, selection bias spuriously reduces the negative association between hospitalization on day 8 and death, on all three scales: RR, OR, and RD.
Effect of selection bias on estimates of relative CFR on the risk ratio (RR), odds ratio (OR), and risk difference (RD) scales.
| Joint frequencies of hospitalization and death in the whole population among those alive at day 8 of symptoms | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 200 | 400 | 600 | RR_P = 0.75 |
| Die | 800 | 600 | 1,400 | OR_P = 0.375 |
| Total | 1,000 | 1,000 | 2,000 | RD_P = -0.20 |

| Assumed probability of being in the database sample given hospitalization and death | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Average | Measure |
|---|---|---|---|---|
| Survive | 20% | 40% | 33% | Extent of selection bias = 0.88 |
| Die | 40% | 70% | 53% | |
| Average | 36% | 58% | 47% | |

| Frequencies of persons in the database | Not hospitalized on day 8 of symptoms | Hospitalized on day 8 of symptoms | Total | Measure |
|---|---|---|---|---|
| Survive | 40 | 160 | 200 | RR_D = 0.81 |
| Die | 320 | 420 | 740 | OR_D = 0.33 |
| Total | 360 | 580 | 940 | RD_D = -0.16 |
In this example, selection bias spuriously reduces the negative association (biases toward a null association) between hospitalization on day 8 and death on the RR and RD scales (RR moves from 0.75 to 0.81, RD from -0.20 to -0.16) and enhances it on the OR scale (OR moves from 0.375 to 0.33).
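The reason the OR scale can move differently from the RR and RD scales follows from a short derivation, given here as a sketch. It assumes that expected database counts equal population counts times the inclusion probabilities; the symbols $N^P_{ij}$ and $N^D_{ij}$ (population and database counts for hospitalization $i$ and death $j$) are notation introduced for this illustration.

```latex
\begin{align*}
N^D_{ij} &= S_{ij}\, N^P_{ij} \\[4pt]
OR_D &= \frac{N^D_{11}\, N^D_{00}}{N^D_{10}\, N^D_{01}}
      = \frac{S_{11} S_{00}}{S_{10} S_{01}} \cdot
        \frac{N^P_{11}\, N^P_{00}}{N^P_{10}\, N^P_{01}}
      = \frac{S_{11} S_{00}}{S_{10} S_{01}} \; OR_P \\[4pt]
RR_D &= \frac{S_{11} N^P_{11} \,/\, \left(S_{11} N^P_{11} + S_{10} N^P_{10}\right)}
             {S_{01} N^P_{01} \,/\, \left(S_{01} N^P_{01} + S_{00} N^P_{00}\right)}
\end{align*}
```

The OR is rescaled by the cross-product of inclusion probabilities alone, whereas the RR (and likewise the RD) also depends on the population counts, so the scales need not shift in the same direction. With the values in the table above, $\frac{0.70 \times 0.20}{0.40 \times 0.40} \approx 0.88$ and $OR_D \approx 0.88 \times 0.375 \approx 0.33$.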