Brian M Alexander, Lorenzo Trippa, Steffen Ventz, Sean Khozin, Bill Louv, Jacob Sands, Patrick Y Wen, Rifaquat Rahman, Leah Comment.
Abstract
Patient-level data from completed clinical studies or electronic health records can be used in the design and analysis of clinical trials. However, these external data can bias the evaluation of the experimental treatment when the statistical design does not appropriately account for potential confounders. In this work, we introduce a hybrid clinical trial design that combines the use of external control datasets and randomization to experimental and control arms, with the aim of producing efficient inference on the experimental treatment effects. Our analysis of the hybrid trial design includes scenarios where the distributions of measured and unmeasured prognostic patient characteristics differ across studies. Using simulations and datasets from clinical studies in extensive-stage small cell lung cancer and glioblastoma, we illustrate the potential advantages of hybrid trial designs compared to externally controlled trials and randomized trial designs.
Year: 2022 PMID: 36184621 PMCID: PMC9527257 DOI: 10.1038/s41467-022-33192-1
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 Two-stage hybrid trial (HT) and externally controlled trial (ECT) designs.
Panel (A) shows a two-stage HT design, with n1 and n2 enrollments to the internal control (IC) and experimental arms in ratios r1,C:r1,E and r2,C:r2,E during the first and second stages of the study, respectively. An interim analysis (IA) determines whether the study is closed for futility, and potentially updates the randomization ratio from r1,C:r1,E during the first stage to r2,C:r2,E for the second stage. These decisions are supported by an index of dissimilarity (see Methods) between the IC and external control (EC) populations. The same index of dissimilarity is recomputed at completion of the study and supports the decision whether to leverage the EC data for estimating the treatment effects of the experimental therapeutic. Panel (B) describes an ECT design that enrolls all n = n1 + n2 patients to the experimental arm. The ECT uses patient-level data from the experimental arm and external control data for a futility IA and for the estimation and testing of treatment effects at the final analysis.
Model-based simulation scenarios
| Scenario | EC distribution: X1 | EC distribution: X2 | EC distribution: X3 | Effect on outcome, EC (and HT): X1 | Effect on outcome, EC (and HT): X2 | Effect on outcome, EC (and HT): X3 | Response rate: EC | Response rate: IC and EXPT (TE = 0) | Response rate: EXPT (TE > 0) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.2 | 0.8 | 0.5 | 0.5 | −0.5 | 0.0 | 0.43 | 0.50 | 0.68 |
| 2 | 0.2 | 0.8 | 0.1 | 0.5 | −0.5 | 1.5 | 0.46 | 0.66 | 0.79 |
| 3 | 0.2 | 0.8 | 0.9 | 0.5 | −0.5 | 1.5 | 0.73 | 0.66 | 0.79 |
| 4 | 0.2 | 0.8 | 0.1 | 0.5 | −1.5(1.5) | 1.5 | 0.30 | 0.66 | 0.79 |
| 5 | 0.2 | 0.8 | 0.9 | 0.5 | −1.5(1.5) | 1.5 | 0.55 | 0.66 | 0.79 |
We consider three binary pre-treatment variables X = (X1, X2, X3). One of the three variables is not available and is not used in the interim and final analyses. For patients enrolled in the hybrid trial (HT), the three pre-treatment variables are independent. Columns 2–4 report the distribution of the three independent variables in the external control (EC) population. Patient outcomes Y, given the pre-treatment variables, were randomly generated from a logistic model. Columns 5–7 show the effects (log odds ratios) of the pre-treatment variables on the expected outcome Y in the EC (S = 1) and HT (S = 0) populations; in some cells the value in parentheses is omitted. The treatment effect (TE) for ineffective and effective experimental treatments is specified as a log odds ratio. Columns 8–10 show the average response probability for the EC, internal control (IC), and experimental treatment (EXPT) populations with and without treatment effects.
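The link between the covariate distributions, the log odds ratios, and the average response rates in the table can be illustrated with a short calculation. The sketch below is not the authors' code and rests on assumptions that the surviving text does not state: that columns 2–4 give Pr(Xj = 1) in the EC population, that Pr(Xj = 1) = 0.5 for every variable in the HT population, and that the logistic model has a zero intercept. The function name `average_response` is hypothetical.

```python
import math
from itertools import product


def expit(x):
    """Inverse logit: maps a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def average_response(prob_x, beta, intercept=0.0, treatment_effect=0.0):
    """Average response probability over independent binary covariates.

    prob_x : assumed Pr(Xj = 1) for each covariate
    beta   : log odds ratio of each covariate
    The zero default intercept is an assumption, not taken from the source.
    """
    total = 0.0
    for x in product((0, 1), repeat=len(prob_x)):
        # Probability of this covariate pattern under independence.
        weight = 1.0
        for xj, pj in zip(x, prob_x):
            weight *= pj if xj else 1.0 - pj
        linear = intercept + treatment_effect + sum(b * xj for b, xj in zip(beta, x))
        total += weight * expit(linear)
    return total


# Scenario 1 inputs from the table: EC distribution and effects of X1, X2, X3.
ec_rate = average_response(prob_x=(0.2, 0.8, 0.5), beta=(0.5, -0.5, 0.0))
# Assumed HT distribution: Pr(Xj = 1) = 0.5 for all three variables.
ic_rate = average_response(prob_x=(0.5, 0.5, 0.5), beta=(0.5, -0.5, 0.0))
print(round(ec_rate, 2), round(ic_rate, 2))
```

Under these assumptions the computed EC and IC rates for scenario 1 round to the tabled 0.43 and 0.50, and the scenario 2 inputs give 0.46 and 0.66, which suggests the assumptions are at least compatible with the table. The experimental-arm rates are not reproduced because the TE value is not recoverable from the text.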
Operating characteristics of the HT, ECT and RCT designs
| Design | HT | HT | HT | ECT | RCT | HT | HT | HT | ECT | RCT |
|---|---|---|---|---|---|---|---|---|---|---|
| Treatment effect | TE = 0 | TE = 0 | TE = 0 | TE = 0 | TE = 0 | TE > 0 | TE > 0 | TE > 0 | TE > 0 | TE > 0 |
| Randomization ratio | 1:1 | 1:2 | 0:1 | 0:1 | 1:1 | 1:1 | 1:2 | 0:1 | 0:1 | 1:1 |
| Scenario 1 | | | | | | | | | | |
| Type I error rate (%) | 6 | 4 | 6 | 4 | 5 | – | – | – | – | – |
| Power (%) | – | – | – | – | – | 70 | 71 | 73 | 93 | 67 |
| % of trials stopped at IA | 20 | 15 | 15 | 44 | 7 | 0 | 0 | 0 | 0 | 0 |
| Average study duration (months) | 21 | 22 | 22 | 18 | 23 | 24 | 24 | 24 | 24 | 24 |
| Average sample size | 108 | 112 | 111 | 93 | 115 | 120 | 120 | 120 | 120 | 120 |
| Scenario 2 | | | | | | | | | | |
| Type I error rate (%) | 6 | 7 | 8 | 71 | 5 | – | – | – | – | – |
| Power (%) | – | – | – | – | – | 54 | 55 | 56 | 100 | 54 |
| % of trials stopped at IA | 8 | 8 | 8 | 2 | 7 | 0 | 0 | 0 | 0 | 0 |
| Average study duration (months) | 23 | 23 | 23 | 24 | 23 | 24 | 24 | 24 | 24 | 24 |
| Average sample size | 115 | 115 | 115 | 119 | 115 | 120 | 120 | 120 | 120 | 120 |
| Scenario 3 | | | | | | | | | | |
| Type I error rate (%) | 5 | 5 | 6 | 0 | 5 | – | – | – | – | – |
| Power (%) | – | – | – | – | – | 54 | 53 | 54 | 12 | 53 |
| % of trials stopped at IA | 18 | 12 | 12 | 98 | 7 | 4 | 3 | 3 | 42 | 0 |
| Average study duration (months) | 22 | 22 | 22 | 12 | 23 | 23 | 23 | 23 | 19 | 24 |
| Average sample size | 109 | 113 | 113 | 61 | 116 | 118 | 118 | 118 | 95 | 120 |
| Scenario 4 | | | | | | | | | | |
| Type I error rate (%) | 5 | 5 | 6 | >99 | 5 | – | – | – | – | – |
| Power (%) | – | – | – | – | – | 53 | 53 | 53 | 100 | 54 |
| % of trials stopped at IA | 7 | 7 | 8 | 0 | 7 | 1 | 0 | 1 | 0 | 1 |
| Average study duration (months) | 23 | 23 | 23 | 24 | 23 | 24 | 24 | 24 | 24 | 24 |
| Average sample size | 116 | 116 | 115 | 120 | 116 | 120 | 120 | 120 | 120 | 120 |
| Scenario 5 | | | | | | | | | | |
| Type I error rate (%) | 6 | 7 | 9 | 15 | 5 | – | – | – | – | – |
| Power (%) | – | – | – | – | – | 65 | 65 | 65 | 93 | 53 |
| % of trials stopped at IA | 22 | 15 | 15 | 55 | 7 | 1 | 1 | 1 | 1 | 1 |
| Average study duration (months) | 21 | 22 | 22 | 17 | 23 | 24 | 24 | 24 | 24 | 24 |
| Average sample size | 107 | 111 | 111 | 87 | 115 | 119 | 119 | 119 | 119 | 120 |
We consider different distributions of measured and unmeasured patient pre-treatment characteristics (see Table 1 for details). We provide results for an experimental treatment with (columns 7–11) and without (columns 2–6) positive treatment effects (TEs). For each scenario, we report the type I error rate (i.e., the probability of rejecting the null hypothesis when TE = 0), the power (i.e., the probability of rejecting the null hypothesis when TE > 0), the proportion of trials stopped early for futility, the average sample size, and the average study duration (months) across 2000 simulations.
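As an illustration of how type I error rate and power are estimated by simulation, the following sketch repeats a simple 1:1 randomized trial with binary outcomes many times and counts rejections. It is a deliberately simplified stand-in, not the HT, ECT, or RCT procedures evaluated in the table: there is no interim analysis, no covariate adjustment, and no external data. The one-sided two-proportion z-test, the 0.05 significance level, and the 60 patients per arm are assumptions chosen for illustration; the response rates 0.50 and 0.68 are the scenario 1 values from Table 1.

```python
import math
import random
from statistics import NormalDist


def simulate_rejection_rate(p_control, p_experimental, n_per_arm=60,
                            alpha=0.05, n_sims=2000, seed=1):
    """Monte Carlo estimate of how often a one-sided two-proportion z-test
    rejects the null hypothesis of no benefit in a 1:1 randomized trial."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(n_sims):
        # Simulate binary responses in each arm.
        resp_c = sum(rng.random() < p_control for _ in range(n_per_arm))
        resp_e = sum(rng.random() < p_experimental for _ in range(n_per_arm))
        # Pooled standard error under the null hypothesis.
        pooled = (resp_c + resp_e) / (2 * n_per_arm)
        se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
        if se == 0.0:
            continue  # identical outcomes in every patient: do not reject
        z = (resp_e - resp_c) / n_per_arm / se
        if z > z_crit:
            rejections += 1
    return rejections / n_sims


# TE = 0: both arms share the control response rate, so rejections are false positives.
type_one_error = simulate_rejection_rate(0.50, 0.50)
# TE > 0: the experimental arm has the higher response rate, so rejections are true positives.
power = simulate_rejection_rate(0.50, 0.68)
print(f"type I error: {type_one_error:.3f}, power: {power:.3f}")
```

The same loop structure applies to any design: generate a trial under TE = 0 or TE > 0, apply the decision rule, and average the rejection indicator across replicates.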
Fig. 2 Operating characteristics of in silico HTs, ECTs and RCTs generated by resampling the control arms of the ES-SCLC studies.
The top row shows type I error rates (panel A, solid vertical lines with a cross), power (panel A, dotted vertical lines with an arrow), and the variability and bias of the treatment effect estimates (panel B). In panel (B), the dots indicate the average treatment effect estimates across in silico trials (n = 75) and the vertical bars indicate the 5% and 95% quantiles. Panels (A) and (B) are representative of an ideal setting, without unmeasured confounders and with identical conditional outcome distributions under the standard of care (SOC) across studies. The bottom row (panels C and D) shows the same operating characteristics as the top row when we used the leave-one-study-out resampling algorithm to generate in silico trials (n = 75).
Resampling-based evaluation of the operating characteristics of the HT, ECT, and RCT designs in GBM
| Design | HT | HT | HT | ECT | RCT |
|---|---|---|---|---|---|
| Assignment ratio | 1:1 | 1:2 | 0:1 | 0:1 | 1:1 |
| No treatment effect (TE = 0) | | | | | |
| Chinot et al. | | | | | |
| Type I error rate (%) | 5 | 6 | 5 | 3 | 5 |
| % of trials stopped at IA | 24 | 24 | 24 | 42 | 7 |
| Average study duration (months) | 17 | 17 | 17 | 16 | 19 |
| Average sample size | 88 | 88 | 88 | 78 | 96 |
| Average TE estimate | −0.02 | −0.02 | −0.02 | −0.01 | −0.01 |
| (10% and 90% quantiles) | (−0.16,0.12) | (−0.15,0.12) | (−0.14,0.10) | (−0.09,0.07) | (−0.16,0.12) |
| DFCI | | | | | |
| Type I error rate (%) | 6 | 5 | 4 | 3 | 6 |
| % of trials stopped at IA | 27 | 27 | 25 | 50 | 7 |
| Average study duration (months) | 17 | 17 | 17 | 15 | 19 |
| Average sample size | 86 | 87 | 88 | 75 | 96 |
| Average TE estimate | −0.02 | −0.02 | −0.01 | −0.02 | −0.01 |
| (10% and 90% quantiles) | (−0.16,0.12) | (−0.15,0.11) | (−0.14,0.10) | (−0.09,0.07) | (−0.16,0.12) |
| UCLA | | | | | |
| Type I error rate (%) | 5 | 6 | 5 | 3 | 4 |
| % of trials stopped at IA | 25 | 24 | 24 | 42 | 6 |
| Average study duration (months) | 17 | 17 | 17 | 16 | 19 |
| Average sample size | 88 | 88 | 88 | 79 | 97 |
| Average TE estimate | −0.02 | −0.02 | −0.02 | −0.02 | −0.01 |
| (10% and 90% quantiles) | (−0.14,0.12) | (−0.15,0.11) | (−0.14,0.09) | (−0.09,0.06) | (−0.14,0.12) |
| Positive treatment effect (TE > 0) | | | | | |
| Chinot et al. | | | | | |
| Power (%) | 73 | 78 | 84 | 85 | 58 |
| % of trials stopped at IA | <1 | <1 | <1 | <1 | <1 |
| Average study duration (months) | 20 | 20 | 20 | 20 | 20 |
| Average sample size | 100 | 100 | 100 | 100 | 100 |
| Average TE estimate | 0.15 | 0.15 | 0.15 | 0.15 | 0.15 |
| (10% and 90% quantiles) | (0.04,0.25) | (0.05,0.25) | (0.05,0.24) | (0.09,0.21) | (0.06,0.26) |
| DFCI | | | | | |
| Power (%) | 77 | 82 | 88 | 92 | 63 |
| % of trials stopped at IA | <1 | <1 | <1 | <1 | <1 |
| Average study duration (months) | 20 | 20 | 20 | 20 | 20 |
| Average sample size | 100 | 100 | 100 | 100 | 100 |
| Average TE estimate | 0.17 | 0.17 | 0.17 | 0.18 | 0.17 |
| (10% and 90% quantiles) | (0.04,0.3) | (0.06,0.27) | (0.08,0.26) | (0.11,0.24) | (0.06,0.28) |
| UCLA | | | | | |
| Power (%) | 73 | 78 | 85 | 86 | 58 |
| % of trials stopped at IA | <1 | 1 | <1 | <1 | <1 |
| Average study duration (months) | 20 | 20 | 20 | 20 | 20 |
| Average sample size | 100 | 100 | 100 | 100 | 100 |
| Average TE estimate | 0.15 | 0.15 | 0.15 | 0.14 | 0.15 |
| (10% and 90% quantiles) | (0.04,0.26) | (0.04,0.25) | (0.05,0.24) | (0.08,0.21) | (0.04,0.26) |
We used individual-level data from patients treated with temozolomide plus radiotherapy (TMZ+RT) from five GBM datasets. Rows 3–24 show results for an experimental treatment without a treatment effect (TE), and rows 25–46 show results for one with a positive TE. We report the type I error rate (i.e., the probability of rejecting the null hypothesis when TE = 0), the power (i.e., the probability of rejecting the null hypothesis when TE > 0), the proportion of trials stopped early for futility, the average sample size, the average study duration (months), and the average (10% and 90% quantiles) estimate of the treatment effect, across 2000 in silico trials.
Fig. 3 Graphical representation of the leave-one-study-out resampling algorithm.
In Step (i), we randomly sample with replacement n patient profiles and the corresponding outcomes from the control arm (SOC) of study k. In Step (ii), we use the control arms of the remaining studies as external control (EC) data. In Step (iii), we randomize n1 of the patients from Step (i) to the experimental treatment (EXPT) and SOC arms of our in silico trial and compute the dissimilarity index. Depending on the value of the index, the futility interim analysis (IA) either leverages or does not leverage EC data, and we use the ratio r2,C:r2,E or r1,C:r1,E, respectively, for the remaining n2 = n − n1 patients during the second stage. For the final analysis, we recompute the dissimilarity index and, depending on its value, use or do not use EC data for inference on treatment effects. We repeated Steps (i) to (iii) 2000 times using different random samples.
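Steps (i) and (ii) and the first-stage randomization of Step (iii) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `leave_one_study_out_trial`, the dictionary layout of `control_arms`, and the use of independent coin-flip randomization (rather than, say, blocked randomization) are all hypothetical choices. The dissimilarity index and the decisions that depend on it are left out because their definition is in the Methods and not in this caption.

```python
import random


def leave_one_study_out_trial(control_arms, held_out, n, n1,
                              r1_c=1, r1_e=1, rng=None):
    """Generate one in silico trial from the control arm of a held-out study.

    control_arms : dict mapping a study name to a list of patient records
                   (hypothetical layout; a record can be any object)
    held_out     : name of the study k that supplies the trial patients
    n, n1        : total and first-stage sample sizes
    r1_c, r1_e   : first-stage randomization ratio, control : experimental
    """
    rng = rng or random.Random()
    # Step (i): sample n patients with replacement from the control arm of study k.
    trial_patients = rng.choices(control_arms[held_out], k=n)
    # Step (ii): the control arms of all remaining studies form the external controls.
    external_controls = [patient
                         for study, arm in control_arms.items()
                         if study != held_out
                         for patient in arm]
    # Step (iii), first stage only: assign the first n1 patients to SOC or EXPT.
    # Simple independent randomization is an assumption made for this sketch.
    p_soc = r1_c / (r1_c + r1_e)
    stage1 = [("SOC" if rng.random() < p_soc else "EXPT", patient)
              for patient in trial_patients[:n1]]
    stage2_pool = trial_patients[n1:]  # randomized after the interim analysis
    return stage1, stage2_pool, external_controls


# Toy data: three studies whose records are (study, patient id) pairs.
toy_arms = {name: [(name, i) for i in range(size)]
            for name, size in (("A", 10), ("B", 5), ("C", 7))}
stage1, stage2_pool, ec = leave_one_study_out_trial(
    toy_arms, held_out="A", n=20, n1=8, rng=random.Random(0))
print(len(stage1), len(stage2_pool), len(ec))
```

Because the trial patients all come from a control arm, assigning the EXPT label without altering outcomes corresponds to a treatment with no effect; modeling a positive TE would require an additional outcome-modification step that this sketch does not attempt.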