Sharon Liu, Scott R Garrison.
Abstract
BACKGROUND: Stopping trials early because of a favourable interim analysis can exaggerate benefit. This study simulated trials typical of those stopping early for benefit in the real world and estimated the degree to which early stopping likely overestimates benefit.
Keywords: Interim analysis; Monte Carlo methods; Overestimation; Selection bias; Stopping early; Truncated trials; Winner’s curse
Year: 2022 PMID: 36064448 PMCID: PMC9446780 DOI: 10.1186/s13063-022-06689-9
Source DB: PubMed Journal: Trials ISSN: 1745-6215 Impact factor: 2.728
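The selection effect described in the abstract can be illustrated with a minimal Monte Carlo sketch. This is not the authors' simulation. It assumes a single fixed true relative risk reduction (a hypothetical 25%), a normal approximation for the log rate ratio with events split evenly between arms, and one interim look. The defaults of 72 events and 2.782 are borrowed from the table (the median event count for typical truncated trials and the first value in the table's unlabelled row), and treating 2.782 as a z-score stopping threshold is an assumption. The function and parameter names are hypothetical.

```python
import math
import random
import statistics


def simulate_early_stopping(true_rrr=0.25, events_at_interim=72,
                            z_threshold=2.782, n_trials=20000, seed=0):
    """Return observed RRRs for simulated trials that cross the stopping boundary.

    Assumptions (not taken from the source): one interim look, a fixed true
    RRR, and a normal approximation for the log rate ratio.
    """
    rng = random.Random(seed)
    true_log_rr = math.log(1.0 - true_rrr)
    # Approximate standard error of a log rate ratio when the events are
    # split evenly between arms: sqrt(1/(E/2) + 1/(E/2)) = sqrt(4/E).
    se = math.sqrt(4.0 / events_at_interim)
    stopped = []
    for _ in range(n_trials):
        observed_log_rr = rng.gauss(true_log_rr, se)
        z = -observed_log_rr / se  # positive z favours treatment
        if z >= z_threshold:
            # Only trials that look good enough at the interim are kept.
            stopped.append(1.0 - math.exp(observed_log_rr))
    return stopped


stopped = simulate_early_stopping()
median_observed = statistics.median(stopped)
print(f"trials stopped early: {len(stopped)}")
print(f"true RRR: 25.0%, median observed RRR: {100 * median_observed:.1f}%")
```

Under these assumptions, a trial can cross the boundary only if its observed RRR is at least about 48%, so every trial that stops early overestimates the hypothetical 25% true effect. The source reports a mix of over- and underestimates in its own simulations, which suggests a wider range of true effects and trial sizes than this sketch covers.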
Characteristics of simulated trials observing statistically significant benefit
| Characteristic | Typical truncated trials ( | Typical truncated trials if carried to completion ( | Large truncated trials ( |
|---|---|---|---|
| Number of participants | 541 (456–626) | 771 (648–896) | 9,616 (8070–11,160) |
| Average follow-up (months) | 27.1 (23.0–31.2) | 49.9 (42.0–57.9) | 40.2 (33.8–46.7) |
| Placebo event rate (per 100 person-years) | 8.6 (7.3–10.0) | 7.7 (6.5–9.0) | 2.3 (1.9–2.7) |
| | 2.782 | 1.960 | 2.358 |
| Number of events | 72 (53–96) | 194 (142–259) | 588 (429–784) |
| Number of trials that overestimate benefit | 42,191 (89.4) | 181,218 (74.6) | 195,854 (66.3) |
| Number of trials that underestimate benefit | 4983 (10.6) | 61,611 (25.4) | 99,748 (33.7) |
| True RRR (%) | 38.6 (27.9–50.9) | 27.0 (18.7–36.6) | 26.0 (18.9–35.0) |
| Observed RRR (%) | 54.8 (47.5–63.6) | 32.3 (25.6–41.2) | 28.1 (21.9–36.7) |
| Absolute RRR overestimate (%) | 14.9 (6.4 to 24.6) | 5.3 (−0.1 to 11.4) | 2.3 (−1.3 to 6.3) |
| Observed RRR/true RRR | 1.37 (1.12–1.83) | 1.18 (0.99–1.51) | 1.08 (0.96–1.28) |
| Number of trials with negative true benefit (i.e. harm) | 265 (0.6) | 2172 (0.9) | 456 (0.2) |
| Number of trials with observed RRR/true RRR in range | | | |
| 1.0–1.2 | 10,559 (22.4) | 63,803 (26.3) | 99,369 (33.6) |
| 1.2–1.5 | 12,194 (25.8) | 53,165 (21.9) | 56,779 (19.2) |
| > 1.5 | 19,173 (40.6) | 62,078 (25.6) | 39,250 (13.3) |
Data are median (IQR), or number (%)
For “Typical truncated trials” and “Large truncated trials”, this refers to statistical significance at the time of interim analysis, when an early stopping decision is being made. For “Typical truncated trials if carried to completion”, this refers to statistical significance at trial conclusion if no interim analysis is carried out
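The over- and underestimate counts in the table can be cross-checked with a few lines of arithmetic. The sketch below recomputes, for each column, the percentage of significant trials that overestimate benefit and the ratio of overestimates to underestimates. The counts are copied from the table; the dictionary keys and the helper name are illustrative only.

```python
# Counts of statistically significant simulated trials that overestimate or
# underestimate benefit, copied from the table as (overestimates, underestimates).
counts = {
    "typical truncated": (42191, 4983),
    "typical carried to completion": (181218, 61611),
    "large truncated": (195854, 99748),
}


def overestimate_summary(over, under):
    """Return (percent of trials overestimating, overestimates per underestimate)."""
    total = over + under
    return round(100 * over / total, 1), round(over / under, 1)


summary = {name: overestimate_summary(o, u) for name, (o, u) in counts.items()}
for name, (percent, ratio) in summary.items():
    print(f"{name}: {percent}% overestimate, {ratio} overestimates per underestimate")
```

The recomputed percentages (89.4, 74.6 and 66.3) match the table, and the ratio of about 8.5 for typical truncated trials matches the figure quoted for Fig. 1.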
Fig. 1 Scatterplot of observed relative risk reduction versus true relative risk reduction for all simulated typical truncated trials where true benefit existed. Figure shows overestimates in purple and underestimates in orange. Overestimates are 8.5 times more numerous than underestimates. Superimposed is a histogram of observed RRR/true RRR to show the range of overestimation
Fig. 2 Absolute overestimate of the relative risk reduction versus the total number of outcome events in simulated trials. Coloured trendlines, along with grey-shaded areas representing the 95% confidence intervals, are created using a generalized additive model
Fig. 3 Absolute overestimate of the relative risk reduction as a function of the number of observed events and the observed relative risk reduction in simulated typical truncated trials. Data are a “heatmap” of the absolute overestimate of the relative risk reduction showing all cells for which there were at least 50 simulated trials. The range over which data is displayed is the range over which our model predictions can be applied.
Fig. 4 Comparison of our model’s predictions for overestimation of benefit with real-world overestimates of effect. Figure shows our model predictions, displayed as observed relative risk reduction/true relative risk reduction, overlaid on the summary figure from Bassler et al. [9]. The Bassler estimates come from comparing the effect estimate in truncated trials with that in non-truncated trials answering the same question. As in Fig. 2, the purple line represents typical truncated trials, the red line represents typical truncated trials if instead carried to completion, and the orange line represents large truncated trials