James Wason, Dominic Magirr, Martin Law, Thomas Jaki.
Abstract
Multi-arm multi-stage designs can improve the efficiency of the drug-development process by evaluating multiple experimental arms against a common control within one trial. This reduces the number of patients required compared with a series of trials testing each experimental arm separately against control. By allowing for multiple stages, experimental treatments can be eliminated from the study early if they are unlikely to be significantly better than control. Using the TAILoR trial as a motivating example, we explore a broad range of statistical issues related to multi-arm multi-stage trials, including: a comparison of different ways to power a multi-arm multi-stage trial; choosing the allocation ratio to the control group compared with the experimental arms; the consequences of adding experimental arms during a multi-arm multi-stage trial, and how one might control the type-I error rate when this is necessary; and modifying the stopping boundaries of a multi-arm multi-stage design to account for unknown variance in the treatment outcome. Multi-arm multi-stage trials represent a large financial investment, so designing them carefully is important to ensure that they are efficient and have a good chance of succeeding.
Keywords: Clinical trial design; group-sequential designs; interim analysis; multi-arm multi-stage designs; multiple-testing; statistical design
Year: 2012 PMID: 23242385 PMCID: PMC4843088 DOI: 10.1177/0962280212465498
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
Group size and power of designs 1-3 under different power scenarios. Design 1 has sample size chosen so that power at the LFC with δ1 = 0.545 and δ0 = 0.178 is 0.9; design 2 has sample size chosen so that power at the LFC with δ1 = 0.545 and δ0 = 0 is 0.9; design 3 has sample size chosen so that the power to recommend any treatment when all have effect δ = 0.545 is 0.9.
| | Design 1 | Design 2 | Design 3 |
|---|---|---|---|
| Required group size | 36 | 32 | 17 |
| ℙ (Recommend treatment 1) when δ1 = 0.545, δ0 = 0.178 | 0.904 | 0.872 | 0.605 |
| ℙ (Recommend treatment 1) when δ1 = 0.545, δ0 = 0 | 0.938 | 0.908 | 0.643 |
| ℙ (Recommend any treatment) when δ = (0.545,…, 0.545) | 0.996 | 0.992 | 0.905 |
Allocation ratio giving lowest maximum sample size as J (number of stages) and K (number of experimental arms) varies
| K | J = 2 | J = 3 | J = 4 |
|---|---|---|---|
| 2 | 1.24 | 1.20 | 1.18 |
| 3 | 1.35 | 1.32 | 1.35 |
| 4 | 1.43 | 1.43 | 1.47 |
| 6 | 1.59 | 1.49 | 1.47 |
| 8 | 1.59 | 1.53 | 1.49 |
Figure 1. Maximum sample size and maximum cost (arbitrary units) of treatment as the allocation ratio changes. Designs are chosen using triangular stopping boundaries such that they give 5% type-I error and 90% power. Maximum cost assumes that the cost of allocating a patient to the control group is c, and the cost of allocating a patient to an experimental treatment is 1, where c ∈ {1, 0.5, 0.25, 0.1}.
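
The cost definition in the Figure 1 caption can be illustrated with a minimal sketch. It assumes, hypothetically, that each experimental arm recruits `n` patients per stage, that the control arm recruits `ratio * n` patients per stage, and that the maximum is reached when no arm is dropped and all stages are run. The function names and the example values of `n`, `ratio`, `n_arms` and `n_stages` are illustrative and not taken from the paper. In an actual design, `n` would have to be recomputed for each allocation ratio so that the 5% type-I error and 90% power requirements still hold.

```python
def max_sample_size(n, n_arms, n_stages, ratio):
    """Maximum total number of patients when no arm is dropped early.

    n        : patients per experimental arm per stage (assumed input)
    n_arms   : number of experimental arms (K)
    n_stages : number of stages (J)
    ratio    : control patients per experimental-arm patient
    """
    return n_stages * n * (n_arms + ratio)


def max_cost(n, n_arms, n_stages, ratio, control_cost):
    """Maximum cost when an experimental patient costs 1 and a control patient costs c."""
    return n_stages * n * (n_arms + control_cost * ratio)


if __name__ == "__main__":
    # Illustrative values only: K = 4 arms, J = 3 stages, n = 10, ratio = 1.5.
    for c in (1, 0.5, 0.25, 0.1):
        print(c, max_sample_size(10, 4, 3, 1.5), max_cost(10, 4, 3, 1.5, c))
```

When c = 1, the maximum cost equals the maximum sample size. As c decreases, extra control patients become cheaper, so a larger allocation ratio is penalised less in the cost.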
FWER and power estimates as the true standard deviation varies from the assumed value of 1 for a three-stage design with four experimental arms, n = 35, f = (0, 1.44, 2.34), e = (2.71, 2.39, 2.34). 100,000 independent replicates were used to estimate type-I error and power. Z-test uses the original boundaries with a Z-statistic, t-test uses the original boundaries with a t-statistic, and t-test_corr uses a t-statistic with corrected boundaries. Monte Carlo standard error for the estimated type-I error ≈ 0.0007. Maximum Monte Carlo standard error for the power estimate ≈ 0.0015.
| σ | Type-I error: Z-test | Type-I error: t-test | Type-I error: t-test_corr | Power: Z-test | Power: t-test | Power: t-test_corr |
|---|---|---|---|---|---|---|
| 0.25 | 0.000 | 0.054 | 0.050 | 1.000 | 1.000 | 1.000 |
| 0.5 | 0.000 | 0.054 | 0.050 | 0.999 | 0.997 | 0.997 |
| 0.75 | 0.005 | 0.056 | 0.051 | 0.975 | 0.973 | 0.975 |
| 1 | 0.049 | 0.054 | 0.049 | 0.900 | 0.892 | 0.893 |
| 1.25 | 0.140 | 0.055 | 0.050 | 0.791 | 0.730 | 0.728 |
| 1.5 | 0.236 | 0.053 | 0.049 | 0.691 | 0.562 | 0.558 |
| 1.75 | 0.327 | 0.054 | 0.050 | 0.613 | 0.432 | 0.426 |
| 2 | 0.396 | 0.054 | 0.050 | 0.549 | 0.330 | 0.325 |
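
The behaviour of the Z-test as σ moves away from its assumed value can be sketched with a small simulation under the global null hypothesis. This is not the authors' code; it is a minimal sketch under stated assumptions: equal allocation with n patients per arm per stage, a control arm that always continues, an experimental arm dropped when its statistic falls below the lower boundary f, and the trial stopping with a rejection as soon as any statistic exceeds the upper boundary e. Stage means are simulated directly rather than individual patients, and the function name `simulate_fwer` is hypothetical.

```python
import math
import random


def simulate_fwer(true_sd, n=35, n_arms=4,
                  lower=(0.0, 1.44, 2.34), upper=(2.71, 2.39, 2.34),
                  assumed_sd=1.0, reps=20000, seed=1):
    """Monte Carlo FWER under the global null for a Z-test that assumes assumed_sd."""
    rng = random.Random(seed)
    stage_se = true_sd / math.sqrt(n)  # SD of one stage's sample mean in one arm
    rejections = 0
    for _ in range(reps):
        sums = [0.0] * (n_arms + 1)    # running sums of stage means; index 0 = control
        active = [True] * n_arms       # experimental arms still in the trial
        for j, (f_j, e_j) in enumerate(zip(lower, upper), start=1):
            sums[0] += rng.gauss(0.0, stage_se)       # control always recruits
            for k in range(n_arms):
                if active[k]:
                    sums[k + 1] += rng.gauss(0.0, stage_se)
            # Standard error of a difference in cumulative means, using the ASSUMED SD.
            denom = assumed_sd * math.sqrt(2.0 / (j * n))
            z = [(sums[k + 1] - sums[0]) / j / denom if active[k] else None
                 for k in range(n_arms)]
            if any(v is not None and v > e_j for v in z):
                rejections += 1        # at least one false rejection: trial stops
                break
            active = [v is not None and v >= f_j for v in z]
            if not any(active):
                break                  # every experimental arm dropped for futility
    return rejections / reps


if __name__ == "__main__":
    for sd in (0.5, 1.0, 1.5, 2.0):
        print(sd, simulate_fwer(sd))
```

Because the Z-statistic divides by the assumed standard deviation, a true σ above 1 inflates the statistics and the error rate, while a true σ below 1 makes the test conservative. This is the pattern shown in the Z-test columns above.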
FWER and power estimates as the true standard deviation varies from the assumed value of 1 for a three-stage design with four experimental treatments, n = 10, f = (0, 1.43, 2.34), e = (2.70, 2.39, 2.34). 100,000 independent replicates were used to estimate type-I error and power. Z-test uses the original boundaries with a Z-statistic, t-test uses the original boundaries with a t-statistic, and t-test_corr uses a t-statistic with corrected boundaries. Monte Carlo standard error for the estimated type-I error ≈ 0.0007. Maximum Monte Carlo standard error for the power estimate ≈ 0.0015.
| σ | Type-I error: Z-test | Type-I error: t-test | Type-I error: t-test_corr | Power: Z-test | Power: t-test | Power: t-test_corr |
|---|---|---|---|---|---|---|
| 0.25 | 0.000 | 0.069 | 0.053 | 1.000 | 1.000 | 1.000 |
| 0.5 | 0.000 | 0.069 | 0.052 | 0.999 | 1.000 | 1.000 |
| 0.75 | 0.005 | 0.069 | 0.052 | 0.976 | 0.993 | 0.993 |
| 1 | 0.051 | 0.070 | 0.052 | 0.910 | 0.918 | 0.911 |
| 1.25 | 0.140 | 0.068 | 0.051 | 0.853 | 0.758 | 0.740 |
| 1.5 | 0.238 | 0.070 | 0.053 | 0.777 | 0.587 | 0.562 |
| 1.75 | 0.326 | 0.069 | 0.052 | 0.707 | 0.455 | 0.429 |
| 2 | 0.398 | 0.069 | 0.052 | 0.642 | 0.355 | 0.328 |
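
The Monte Carlo standard errors quoted in the captions are consistent with the usual binomial formula for an estimated proportion, sqrt(p(1 − p)/N), with N = 100,000 replicates. The short sketch below shows the arithmetic; the function name is hypothetical and the formula is the standard one, not something stated explicitly in the source.

```python
import math


def monte_carlo_se(p, reps):
    """Standard error of a proportion p estimated from reps independent replicates."""
    return math.sqrt(p * (1.0 - p) / reps)


if __name__ == "__main__":
    # Type-I error near 0.05 with 100,000 replicates.
    print(monte_carlo_se(0.05, 100_000))
    # The standard error is largest at p = 0.5, which bounds the error of any power estimate.
    print(monte_carlo_se(0.5, 100_000))
```

For p = 0.05 this gives roughly 0.0007, matching the caption. The bound at p = 0.5 is roughly 0.0016, in line with the maximum standard error of about 0.0015 quoted for the power estimates.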
Error rates when treatment is added at interim, keeping the original boundaries. Based on 100,000 simulations
| Design | l = (l1, l2) | u = (u1, u2) | n | | | |
|---|---|---|---|---|---|---|
| OBF | (0,2.169) | (3.068,2.169) | 44 | 0.059 | 0.903 | 0.870 |
| P | (0,2.375) | (2.375,2.375) | 50 | 0.056 | 0.903 | 0.739 |
| T | (0.811,2.293) | (2.432,2.293) | 50 | 0.057 | 0.901 | 0.767 |
Error rates when treatment is added at interim, adjusting the upper boundary at the second stage. Based on 100,000 simulations
| Design | Adjusted second-stage upper boundary | | | |
|---|---|---|---|---|
| OBF | 2.245 | 0.051 | 0.893 | 0.862 |
| P | 2.455 | 0.051 | 0.894 | 0.730 |
| T | 2.384 | 0.051 | 0.892 | 0.755 |
Monte Carlo estimates of familywise error rate (target α = α+ = 0.05) when Knew new treatments are added independently or on the basis of disappointing first stage results. Based on original OBF design, l1 = 0, u1 = 3.068, n = 44 and 100,000 simulations
| Knew | | | |
|---|---|---|---|
| 1 | 2.245 | 0.051 | 0.052 |
| 3 | 2.353 | 0.051 | 0.053 |
| 10 | 2.561 | 0.050 | 0.054 |