
Bayesian selective response-adaptive design using the historical control.

Mi-Ok Kim¹,², Nusrat Harun¹, Chunyan Liu³, Jane C Khoury³, Joseph P Broderick⁴.

Abstract

High quality historical control data, if incorporated, may reduce sample size, trial cost, and duration. An overly optimistic use of the data, however, may result in bias under prior-data conflict. Motivated by well-publicized two-arm comparative trials in stroke, we propose a Bayesian design that both adaptively incorporates historical control data and selectively adapts the treatment allocation ratios within an ongoing trial in response to the relative treatment effects. The proposed design differs from existing designs that borrow from historical controls. Rather than blindly reducing the number of subjects assigned to the control arm, this design does so adaptively to the relative treatment effects, and only if evaluation of the accumulated current trial data combined with the historical control suggests the superiority of the intervention arm. We used the effective historical sample size approach to quantify the information borrowed on the control arm and modified the treatment allocation rules of the doubly adaptive biased coin design to incorporate this quantity. The modified allocation rules were then implemented under the Bayesian framework with commensurate priors addressing prior-data conflict. Trials were also more frequently concluded earlier in line with the underlying truth, reducing trial cost and duration, and yielded parameter estimates with smaller standard errors.


Keywords: Bayesian design with commensurate priors; borrowing on the historical control data; doubly adaptive biased coin design; response-adaptive design


Year: 2018 PMID: 29900577 PMCID: PMC6221103 DOI: 10.1002/sim.7836

Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373

INTRODUCTION

Historical data have conventionally been used for establishing parameters needed for designing a proposed clinical trial but have rarely been used for evaluating the scientific aims. High quality historical control data exist, for example, when the treatment of intervention arm of a successful historical trial serves as the treatment in the control arm of a proposed trial, as illustrated in a published stroke trial described later. Since a seminal article by Pocock,1 several researchers have proposed combining historical and concurrent controls in analysis2, 3, 4 by discounting historical data to account for between‐trial heterogeneity. Despite careful selection of historical trials, study design, conduct, or subject population may differ so past information may not be relevant for the proposed trial. Bayesian designs were introduced to address such prior‐data conflict by data adaptively determining the degree of borrowing on the control arm using power priors,5, 6, 7 commensurate priors,8, 9, 10 or meta‐analytic‐predictive priors.11 All this work, however, focused on reducing the number of study subjects assigned to the control group without taking into account the true relative treatment effects. This meant assigning more trial participants to the intervention arm even if the intervention arm was inferior, which in many cases is not desirable, as the safety of the intervention arm has not been well established. In our work, we propose a Bayesian design that adaptively incorporates historical control data while selectively adapting the allocation ratios in response to the true response rates only when evaluation of the combined current trial and historical control data suggests superiority of the intervention arm. The proposed design, therefore, aims to reduce the number of subjects treated with the control arm only when the intervention arm is truly superior. 
In this sense, the proposed design is doubly adaptive: adaptive both to the prior‐data conflict and to the true response rates. The proposed design combines a response‐adaptive randomization scheme selected from the literature with a Bayesian design that adaptively borrows on the historical control. We specifically combine the response‐adaptive randomization scheme used in the doubly adaptive biased coin design (DBCD)12 and a Bayesian design using commensurate priors.10 These choices have some advantages; the DBCD randomization is known to be less variable than sequential maximum likelihood estimate procedures.13 Nevertheless, the choices are largely arbitrary, and other response‐adaptive randomization schemes and other Bayesian designs borrowing on the historical control can be considered. From the response‐adaptive randomization point of view, the proposed design may better realize the purported therapeutic advantages of adaptive randomization in a two‐arm comparative trial. Compared with conventional fixed designs, response‐adaptive randomization is believed to improve the trial participants' outcomes by skewing allocation probabilities in favor of better performing arms at the time of randomization and, hence, to be ethically superior and desirable.14, 15 Apart from controversies over the ethics of adaptive randomization,16 the purported therapeutic advantages have also been questioned: the advantages were reported to exist only when the true relative effects differ greatly, and consequently, they were rarely realized in practice.17, 18, 19 The advantages, if they exist, also disappeared when trials were allowed to stop early for efficacy or futility.17 We shall evaluate whether the proposed Bayesian design overcomes these shortcomings. Comparative two‐arm trials typically include interim analyses for early stopping for efficacy or futility, and the primary endpoints are frequently observed after a defined time delay.
The challenge of delayed responses has been addressed by utilizing correlated short‐term endpoints, and efficiency gains that vary with the strength of the correlation have been reported (see the work of Kim et al20 for example). Huang et al21 proposed a Bayesian joint modeling approach utilizing short‐term outcomes, which we adapted here. We also accommodated interim analyses in the proposed two‐arm trial design. This paper is organized as follows. Section 2 describes a published stroke trial as a motivating study. The proposed new Bayesian design is introduced in Section 3. Empirical study results are presented in Section 4, and a summary and discussion are presented in Section 5.

CASE STUDY

Two well‐known stroke trials, the National Institute of Neurological Disorders and Stroke t‐PA Stroke Study (NINDS) and the Interventional Management of Stroke (IMS) III trial, are used to illustrate the proposed design. The NINDS trial demonstrated the efficacy of recombinant tissue plasminogen activator (rt‐PA) compared with a placebo control.22 A small randomized trial23 and two single‐group trials24, 25 reported improved efficacy of intravenous rt‐PA (IV rt‐PA) followed by endovascular therapy. Based on this preliminary work, the IMS III trial evaluated the efficacy of combining IV rt‐PA with intra‐arterial recanalization endovascular therapy against the IV rt‐PA alone control in moderate/severe stroke patients. The NINDS trial included 126 subjects with moderate and 56 subjects with severe stroke (severity at baseline as measured by the NIH Stroke Scale [NIHSS]) treated with IV rt‐PA alone who would have satisfied the eligibility criteria of the IMS III trial. The two trials were similar in subject population and study conduct. We refer to the work of Khatri et al26 for details. The primary outcome in the IMS III trial was a successful recovery assessed at 90 days post stroke, defined by the modified Rankin Score (0‐2 on a 6‐point scale). The trial aimed to enroll a maximum of 900 participants, with three interim analyses planned when 25%, 50%, and 75% of the target sample, respectively, had the primary outcome observed. The participants were randomized in a 2:1 ratio to favor the combined therapy. The O'Brien and Fleming spending rules27 were used to determine the efficacy boundaries of the interim analyses to control the overall type I error rate at 5%. The criterion for early stopping for futility was prespecified as having less than 20% conditional power under the alternative hypothesis. The IMS III trial was designed to have 80% power to detect an overall 10% absolute difference in the primary outcome (assuming 40% for the control and 50% for the intervention arm).
The trial stopped for futility after enrolling 656 participants. A post hoc analysis by stroke severity observed that the intervention arm had a lower response rate in the moderate stroke stratum by 1.0% but a higher response rate by 6.8% in the severe stroke stratum. The post hoc analysis and the preceding NINDS rt‐PA trial motivated the proposed design. The proposed design is stratified by baseline stroke severity to allow independent adaptation of treatment allocation and evaluation of the relative effect of the combined approach within each stratum. The fixed 2:1 ratio in the original trial also suggests that a new design, which incorporates the NINDS trial data to reduce the number of subjects needed in the concurrent control arm if the intervention arm is truly superior, is a potentially better approach. The IMS III stroke trial exhibits common features of two‐arm comparative trials that can be challenging to accommodate in a response‐adaptive trial. The primary outcome of a successful recovery assessed at 90 days post stroke was not completely observed in all previously enrolled study subjects when a new subject was available for randomization and allocation probabilities needed to be updated. Fifty percent of the time, 20 or more new subjects in the moderate stratum and 6 or more new subjects in the severe stroke stratum had been enrolled before the primary outcome was completely observed in previously randomized subjects. Randomization can be adapted based on complete data only. The complete‐data‐only application, however, was shown to reduce the efficiency of adaptive randomization.20, 21 Clinical trials commonly collect many short‐term outcomes, some of which are predictive of the long‐term outcomes. Approaches that mitigate the impact of the delay by utilizing observed predictors of the delayed outcome20, 21 have been proposed.
In the IMS III stroke trial, subjects were assessed at 24 hours post stroke for the severity of their condition. An improvement (decrease) of 4 points or more from the baseline assessment on the NIHSS score was predictive of a successful recovery at 90 days post stroke. We utilized this short‐term predictor information, in a manner similar to the joint likelihood approach of Huang et al,21 to mitigate the impact of the delayed primary outcome.

METHODS

The proposed design incorporates the DBCD response‐adaptive randomization scheme12 in a Bayesian design using commensurate priors.10 It uses both the cumulating concurrent trial data and the historical control data to estimate the expected values of the primary outcome in each treatment group and updates allocation probabilities only if the primary outcome estimates suggest superiority of the intervention arm. We modify the treatment allocation probability computation to account for the additional information provided to the control group by the historical data. The effective historical sample size (EHSS) approach10 is used to quantify the borrowed historical data information in the control arm. For clarity of exposition, we first describe the likelihood for the delayed primary endpoint that incorporates the short‐term predictor outcome. We then describe how the commensurate priors and the effective sample size computation are adapted for the proposed design. Lastly, we describe how to modify the allocation probability computation to selectively adapt to the response and to account for the additional historical control data.

The joint outcome model

We use the trial context of the motivating example. For subject i, we let $T_i$ be the treatment indicator with $T_i = 1$ for the intervention arm, $X_i$ denote the stratum, $Z_i$ denote the short‐term predictor, and $Y_i$ denote the primary endpoint. We assume $Y_i$ may not be observable immediately, whereas $X_i$ and $Z_i$ are. In the motivating example, the primary endpoint of a successful recovery was determined at 90 days post stroke if a study subject survived beyond the assessment, or was determined as a failure whenever a subject died within 90 days. In order to account for the delayed response, we let $U_i$ denote the time to death, censored at a fixed time $u_0$ (the 90 days post stroke assessment time), with $\delta_i = 1$ denoting an observed death within the time $u_0$. Given $(X_i, T_i, Z_i) = (x, t, z)$, we assume the $U_i$ are independent and identically distributed with the hazard and survival functions denoted by $h(u \mid x,t,z)$ and $S(u \mid x,t,z)$, respectively. We further assume the following models:

$$Z_i \mid X_i = x, T_i = t \sim \mathrm{Bernoulli}\{\alpha(x,t)\}, \qquad Y_i \mid X_i = x, T_i = t, Z_i = z, \delta_i = 0 \sim \mathrm{Bernoulli}\{\beta(x,t,z)\}, \tag{1}$$

where $\alpha(x,t) = P(Z_i = 1 \mid X_i = x, T_i = t)$ and $\beta(x,t,z) = P(Y_i = 1 \mid X_i = x, T_i = t, Z_i = z, \delta_i = 0)$. Then, the probability of a successful recovery in the treatment arm t of the stroke stratum x is given by

$$\mu_x(t) = \sum_{z \in \{0,1\}} \alpha(x,t)^{z} \{1 - \alpha(x,t)\}^{1-z} \, S(u_0 \mid x,t,z) \, \beta(x,t,z). \tag{2}$$

When the mth subject is enrolled and ready for treatment assignment, observations are incomplete for the previously enrolled subjects who are alive ($\delta_i = 0$) but have not yet survived to the primary assessment time $u_0$. We treat the unobserved $(U_i, \delta_i, Y_i)$ as missing at random and use single imputation. We suppose that a maximum of N subjects enroll in the current trial sequentially, with $d_1, \ldots, d_N$ denoting the entry times since the inception of the trial. With $d_m - d_i$ denoting the survival time of subject i censored at the entry of the mth subject, we impute using the posterior distribution conditional on $U_i > d_m - d_i$. We let $(u_i^{(m)}, \delta_i^{(m)}, y_i^{(m)})$ denote the imputed values for $U_i > d_m - d_i$ or the observed values for $U_i \le d_m - d_i$.
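The success probability described above marginalizes over the short‐term predictor: a subject contributes a success only by surviving to the assessment time and then recovering. A minimal Python sketch of that sum follows; the function name and the numeric inputs are hypothetical and chosen only for illustration, not taken from the trial.

```python
def success_probability(alpha, surv, beta):
    """Probability of a successful recovery for one stratum and arm.

    alpha: P(Z = 1 | x, t), probability of early improvement
    surv:  dict z -> S(u0 | x, t, z), survival to the assessment time
    beta:  dict z -> P(Y = 1 | x, t, z, alive at u0)
    """
    p_z = {1: alpha, 0: 1.0 - alpha}
    # Sum over z of P(Z = z) * P(alive at u0 | z) * P(recover | z, alive).
    return sum(p_z[z] * surv[z] * beta[z] for z in (0, 1))


# Hypothetical inputs for illustration only.
mu = success_probability(alpha=0.4, surv={0: 0.8, 1: 0.9}, beta={0: 0.2, 1: 0.6})
print(round(mu, 3))  # 0.4*0.9*0.6 + 0.6*0.8*0.2 = 0.312
```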
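With $\theta$ denoting the vector of parameters involved in $\alpha(x,t)$, $\beta(x,t,z)$, and $h(u \mid x,t,z)$ for the current trial, the log‐likelihood contribution of concurrent trial subject i is

$$\ell_i^{(m)}(\theta) = z_i \log \alpha(x_i,t_i) + (1 - z_i) \log\{1 - \alpha(x_i,t_i)\} + \delta_i^{(m)} \log h(u_i^{(m)} \mid x_i,t_i,z_i) + \log S(u_i^{(m)} \mid x_i,t_i,z_i) + (1 - \delta_i^{(m)}) \left[ y_i^{(m)} \log \beta(x_i,t_i,z_i) + (1 - y_i^{(m)}) \log\{1 - \beta(x_i,t_i,z_i)\} \right].$$

With $D^{(m)}$ denoting the current trial data available at the time of treatment allocation of the mth subject, the likelihood for $\theta$ is given by $L(\theta \mid D^{(m)}) = \exp\{\sum_{i=1}^{m-1} \ell_i^{(m)}(\theta)\}$. We assume the same modeling assumptions also hold for the historical control data $D_0$ of sample size $N_0$. With $\theta_0$ denoting the vector of the parameters involved in the historical control models, $\ell_{0j}(\theta_0)$ denotes the control group log‐likelihood from historical trial subject j, whose observation $(u_j, \delta_j, y_j)$ is completely observed. Then, the likelihood for $\theta_0$ is given by $L(\theta_0 \mid D_0) = \exp\{\sum_{j=1}^{N_0} \ell_{0j}(\theta_0)\}$, and the likelihood that combines both the current trial and the historical control data is given by

$$L(\theta, \theta_0 \mid D^{(m)}, D_0) = L(\theta \mid D^{(m)}) \, L(\theta_0 \mid D_0).$$

We extend the commensurate priors10 to this multivariate setting. We assume $\theta$ and $\theta_0$ are ν‐variate vectors with elements denoted by $\theta^{(l)}$ and $\theta_0^{(l)}$, respectively, for l = 1,…,ν. The prior of $\theta$ given $\theta_0$ is defined as follows:

$$\pi(\theta \mid \theta_0, \tau) = \prod_{l=1}^{\nu} \varphi(\theta^{(l)} \mid \theta_0^{(l)}, \tau_l),$$

where $\varphi(\cdot \mid \theta_0^{(l)}, \tau_l)$ is the probability density function of a Gaussian random variable with mean $\theta_0^{(l)}$ and variance $1/\tau_l$, l = 1,…,ν. The amount of commensurability for cross‐study borrowing is controlled by the hyperparameters $\tau_l$, and the priors $\pi(\tau_l)$ are given by “spike and slab” distributions28 that are mixtures of uniform distributions and point probability distributions (see the Appendix for details). This prior has shown desirable bias‐variance trade‐offs for estimating concurrent effects in a univariate setting where a time‐to‐event outcome model was similarly assumed.10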

Effective historical sample size

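We compute the EHSS following the work of Hobbs et al.10 Under the sequential enrollment considered here, when the mth study subject is ready for treatment assignment, the posterior distribution of the current trial parameter vector $\theta$ based only on the current trial data $D^{(m)}$ is

$$q(\theta \mid D^{(m)}) \propto L(\theta \mid D^{(m)}) \, p(\theta), \tag{5}$$

where $p(\theta)$ is a noninformative prior distribution for $\theta$. In contrast, the posterior distribution of $\theta$ based on both the historical data $D_0$ and the current trial data is

$$q(\theta \mid D^{(m)}, D_0) \propto \int\!\!\int L(\theta \mid D^{(m)}) \, L(\theta_0 \mid D_0) \prod_{l=1}^{\nu} \varphi(\theta^{(l)} \mid \theta_0^{(l)}, \tau_l) \, \pi(\tau_l) \; p_0(\theta_0) \, d\theta_0 \, d\tau, \tag{6}$$

where $p_0(\theta_0)$ similarly denotes a noninformative prior distribution for the historical parameter vector $\theta_0$. We let $P_{x,\mathrm{C}}^{(m)}$ and $P_{x,\mathrm{CH}}^{(m)}$ be the posterior precisions of $\mu_x(t = 0)$, x = 0, 1, the control arm response rate estimates by stratum, drawn from the posterior distributions (5) and (6), respectively. We also let $n^{(m)}(x,t)$ denote the number of subjects previously assigned to the treatment arm t in the stratum x at the time when the mth subject is ready for treatment assignment. Given a linear relationship between the sample size and the posterior precision, the EHSS in the stroke stratum x is approximated by

$$EHSS_x^{(m)} = n^{(m)}(x, 0) \left( \frac{P_{x,\mathrm{CH}}^{(m)}}{P_{x,\mathrm{C}}^{(m)}} - 1 \right). \tag{7}$$

If the historical data are highly commensurate with the current trial data, there will be a large gain in precision and a large value of EHSS will result.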
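To make the precision‐ratio idea concrete, the sketch below estimates the two posterior precisions from MCMC draws of the control arm response rate and converts the relative gain into an equivalent number of subjects. It assumes the form n × (precision with borrowing / precision without borrowing − 1) from Hobbs et al; the function names and the toy draws are hypothetical.

```python
import statistics


def posterior_precision(draws):
    """Posterior precision estimated as 1 / sample variance of the draws."""
    return 1.0 / statistics.variance(draws)


def effective_historical_sample_size(n_control, draws_borrow, draws_current):
    """Approximate EHSS for one stratum.

    n_control:     concurrent control subjects enrolled so far in the stratum
    draws_borrow:  posterior draws of the control rate, current + historical data
    draws_current: posterior draws of the control rate, current data only
    """
    ratio = posterior_precision(draws_borrow) / posterior_precision(draws_current)
    # A ratio above 1 means borrowing added information; the value can be
    # negative if borrowing made the posterior less precise.
    return n_control * (ratio - 1.0)


# Toy draws: borrowing halves the spread, so precision is 4 times larger.
current_only = [0.1, 0.3, 0.5, 0.7, 0.9]
with_borrowing = [0.3, 0.4, 0.5, 0.6, 0.7]
ehss = effective_historical_sample_size(50, with_borrowing, current_only)
```

In this toy case the precision ratio is 4, so 50 concurrent controls are treated as if roughly 150 additional control subjects had been borrowed.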

Response‐adaptive randomization

We adapt the DBCD response‐adaptive randomization scheme12 to incorporate borrowing from the historical control via the EHSS and to adapt only if the superiority of the intervention arm is supported. The DBCD is an adaptive randomization design constructed to target prespecified allocation proportions. With the probability of a successful recovery given by $\mu_x(t)$ in (2), we used stratum‐specific allocation targets $\rho_x$ for the intervention group (t = 1) [the display equation defining the targets is not recoverable from the source]. The targets were chosen based on the minimum effect size considered in the motivating example, that is, increasing the response rates per stratum by 10% or more. The most common choice for the allocation target in the severe stroke stratum (x = 0) is one with γ = 1/2, but it was not expected to yield the intended 10% or more increase. We suppose that the mth subject ready for randomization is enrolled in the stratum x. The DBCD randomization scheme12 assesses the proximity of the current sample proportion assigned to the intervention to the estimated target and determines the probability of assigning the mth subject to the intervention arm using the following function:

$$g(r_x^{(m)}, \hat{\rho}_x^{(m)}) = \frac{\hat{\rho}_x^{(m)} \left( \hat{\rho}_x^{(m)} / r_x^{(m)} \right)^{\xi}}{\hat{\rho}_x^{(m)} \left( \hat{\rho}_x^{(m)} / r_x^{(m)} \right)^{\xi} + (1 - \hat{\rho}_x^{(m)}) \left\{ (1 - \hat{\rho}_x^{(m)}) / (1 - r_x^{(m)}) \right\}^{\xi}}, \tag{8}$$

where $\hat{\rho}_x^{(m)}$ and $r_x^{(m)}$ denote the estimated allocation target and the current sample proportion assigned to the intervention at the time, respectively. $\hat{\rho}_x^{(m)}$ was computed using the posterior means, and ξ = 2 was recommended in the work of Rosenberger and Hu.29 We modified the computation of $r_x^{(m)}$ to account for the additional information provided by the historical control:

$$r_x^{(m)} = \frac{n^{(m)}(x, 1)}{n^{(m)}(x, 1) + n^{(m)}(x, 0) + EHSS_x^{(m)}}.$$

Compared with no borrowing, the denominator increases by the borrowed historical control information, making $r_x^{(m)}$ smaller. If adaptive randomization is invoked, then the allocation probability to the intervention arm will be more favorably skewed. The proposed Bayesian design adapts the allocation probabilities selectively, only if evidence exists for the superiority of the intervention arm. It has the following adaptive randomization scheme.
1. Enroll a total of $N^{*}$ subjects across the strata using equal randomization within each stratum.
2. For each subsequent subject, if $P\{\mu_x(t=1) - \mu_x(t=0) > \eta_0\} > \zeta_0$ for some $0 < \eta_0, \zeta_0 < 1$, allocate the subject to the intervention arm adaptively with the allocation probability given in (8). Otherwise, allocate equally between the treatment arms.
3. Conduct interim analyses when the total number of enrolled subjects across the strata reaches 50% or 75% of the maximum sample size N. At each interim analysis, stop for efficacy if $P\{\mu_x(t=1) - \mu_x(t=0) > \eta_E\} > \zeta_E$; stop for futility if $P\{\mu_x(t=1) - \mu_x(t=0) > \eta_F\} < \zeta_F$.
4. When the total number of enrolled subjects across the strata reaches the maximum sample size N, conduct the final analysis. At the final analysis, conclude for efficacy if $P\{\mu_x(t=1) - \mu_x(t=0) > \eta_E\} > \rho_E$; conclude for futility if $P\{\mu_x(t=1) - \mu_x(t=0) > \eta_F\} < \rho_F$.

Here the subscripts E and F refer to the efficacy and futility criteria, respectively. $\eta_0$, $\eta_E$, and $\eta_F$ are predetermined by clinically meaningful differences, as illustrated in the motivating example later. $\zeta_0$, $\zeta_E$, $\zeta_F$, $\rho_E$, and $\rho_F$ are tuning parameters. $\zeta_0$ is set to reduce the probability of assigning sizably more subjects, for example, 10% of the enrolled subjects, to the inferior treatment arm. $\zeta_E$ and $\zeta_F$ can be set differently for each interim analysis but, along with $\rho_E$ and $\rho_F$, must collectively control the overall type I and type II error rates at specified levels. More details on determining the values of the tuning parameters are provided in the next section.
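The allocation step of the scheme above can be sketched in Python as follows. The sketch assumes the standard Hu and Zhang form of the DBCD allocation function with ξ = 2 and the modified sample proportion whose denominator is inflated by the EHSS; all function and argument names are hypothetical, and the posterior probability of superiority is taken as a given input rather than computed by MCMC.

```python
def dbcd_probability(r, rho, xi=2.0):
    """DBCD allocation function g(r, rho): probability of assigning the
    next subject to the intervention, given the current proportion r on
    the intervention and the estimated target rho."""
    if r <= 0.0:
        return 1.0
    if r >= 1.0:
        return 0.0
    a = rho * (rho / r) ** xi
    b = (1.0 - rho) * ((1.0 - rho) / (1.0 - r)) ** xi
    return a / (a + b)


def intervention_proportion(n_trt, n_ctl, ehss=0.0):
    """Proportion assigned to the intervention, with the borrowed
    historical control information added to the denominator."""
    return n_trt / (n_trt + n_ctl + ehss)


def allocation_probability(prob_superior, zeta0, r, rho_hat, xi=2.0):
    """Selective rule: adapt only when the posterior probability that the
    intervention beats control by the margin exceeds zeta0."""
    if prob_superior > zeta0:
        return dbcd_probability(r, rho_hat, xi)
    return 0.5


# Illustration with made-up counts: 50 per arm, 25 borrowed controls.
r_no_borrow = intervention_proportion(50, 50)      # 0.5
r_borrow = intervention_proportion(50, 50, 25.0)   # 0.4
p_no_borrow = allocation_probability(0.6, 0.4, r_no_borrow, rho_hat=0.6)
p_borrow = allocation_probability(0.6, 0.4, r_borrow, rho_hat=0.6)
```

With the same estimated target, the borrowed information lowers the current proportion and so pushes the next allocation probability further toward the intervention, which is the qualitative behavior the text describes.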

EMPIRICAL STUDIES

We used the motivating stroke trial data and designed simulation studies to evaluate the performance of the proposed design. We assumed perfect commensurability and generated both the historical control and the concurrent trial data in the same way: simulated data consist of observations (d ,X ,Z ,U ,δ ,Y ), where (d ,X ,U ) were resampled from the observed data and (Z ,δ ,Y ) were generated from common outcome models. A subset of 182 subjects in the NINDS trial met the eligibility criteria of the IMS III trial, and we simulated a sample of 182 for the historical control data. For the current trial data, we simulated a sample of N = 900 (600 for the moderate and 300 for the severe stratum, approximately). We used one historical control data for all simulations conducted under the same parameter settings, whereas samples for the current trial data were created each time.

Data generation and simulation setup

Each observation was generated sequentially by the following scheme, i.e., for each 1 ≤ i ≤ N:

1. Sample values of stroke severity $X_i$ and enrollment time $d_i$ from the observed data.
2. For i ≤ 150, assign treatment $T_i$ at a 1:1 ratio within each stratum via block randomization. For i > 150, determine $T_i$ by the proposed randomization scheme in Section 3.3.
3. Simulate $Z_i$: $Z_i \mid X_i = x, T_i = t \sim \mathrm{Bernoulli}(\alpha(x,t))$.
4. Simulate $\delta_i$: $\delta_i \mid X_i = x, T_i = t, Z_i = z \sim \mathrm{Bernoulli}(1 - S(u_0 = 90 \mid x,t,z))$, with $1 - S(u_0 \mid x,t,z) = P(U_i < 90 \mid x,t,z)$.
5. If $\delta_i = 1$, resample the death time $U_i$. Otherwise, set $U_i = 90$ and simulate $Y_i$: $Y_i \mid X_i = x, T_i = t, Z_i = z, \delta_i = 0 \sim \mathrm{Bernoulli}(\beta(x,t,z))$.

The survival times greater than 90 days are arbitrarily censored since the evaluation time is at 90 days. The observed 90‐day survival rates ($P(\delta_i = 0)$) differed by the short‐term predictor but were similar between the treatment arms and across the strata conditional on the short‐term predictor. We hence assumed $S(u_0 = 90 \mid x,t,z) = S(u_0 = 90 \mid z)$ and resampled the observed death times $U_i$ stratified by strata. We used piecewise exponential distributions to model $S(u_0 = 90 \mid z)$ and logistic regressions to model $\alpha(x,t)$ and $\beta(x,z,t)$. The related parameters were determined to equate $1 - S(u_0 = 90 \mid z)$ with the observed rates and to attain the response rates specified in Table 1. The response rates assumed under the null were based on the observed rates, whereas the ones assumed under the alternative were obtained by inflating the null rates to yield 80% power given the maximum concurrent trial sample size. We refer to the Appendix for detailed specifications of $S(u_0 = 90 \mid x,t,z)$, $\alpha(x,t)$, $\beta(x,z,t)$ and other details, including the full conditional posterior distributions.
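The outcome part of this scheme, once stratum and treatment are fixed, can be sketched as below. This is an illustration only: the function name, argument layout, and pool of death times are hypothetical, and the actual study drew its parameter values from the fitted models described in the Appendix.

```python
import random


def simulate_outcomes(alpha, surv90, beta, death_times, rng):
    """Simulate (z, delta, u, y) for one subject with fixed stratum and arm.

    alpha:       P(Z = 1 | x, t)
    surv90:      dict z -> 90-day survival probability S(90 | z)
    beta:        dict z -> P(Y = 1 | x, t, z, alive at 90 days)
    death_times: observed death times (days) to resample from
    rng:         a random.Random instance
    """
    z = int(rng.random() < alpha)                 # short-term predictor
    delta = int(rng.random() < 1.0 - surv90[z])   # death within 90 days
    if delta == 1:
        u = rng.choice(death_times)               # resampled death time
        y = 0                                     # death counts as failure
    else:
        u = 90.0                                  # censored at assessment
        y = int(rng.random() < beta[z])           # 90-day recovery
    return z, delta, u, y


# Hypothetical parameter values for illustration only.
rng = random.Random(2018)
subject = simulate_outcomes(0.4, {0: 0.8, 1: 0.9}, {0: 0.2, 1: 0.6},
                            [3.0, 12.0, 45.0], rng)
```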

Table 1

Table of outcome model settings

Stratum	Hypothesis	Response rate, T = 0	Response rate, T = 1
Moderate	H₀	50%	50%
Moderate	H₁	50%	63%
Severe	H₀	17%	17%
Severe	H₁	17%	31%

We first investigated the linearity assumption required for the approximate calculation of the EHSS in (7). We simulated 1000 control group datasets under the null setting, varying the sample size from 100 through 1000, and computed the posterior precision of μ(t = 0) based on 5000 Markov chain Monte Carlo (MCMC) iterations from the posterior distribution in (5). The plot of the computed posterior precision against the sample size shows that the assumed linear relationship holds well enough to allow the approximation (Figure 1).

Figure 1

Posterior precision of the control arm response rate as a function of sample size to test the linearity assumption

We set $\eta_0 = \eta_E = 10\%$, where $\eta_0$ is the margin for invoking adaptive randomization and $\eta_E$ is the efficacy margin. The 10% increase in successful recovery was clinically meaningful in the stroke example and was the minimum effect size sought in the original trial. As in the original trial, futility in each simulation was assessed against no difference in successful recovery, and we set the futility margin $\eta_F = 0$. We then determined the values of the tuning parameters $\zeta_0$, $\zeta_E$, $\zeta_F$, $\rho_E$, and $\rho_F$ (the adaptation, interim efficacy, interim futility, final efficacy, and final futility thresholds) via simulation. $\zeta_0$ governs whether the response‐adaptive randomization scheme is invoked, through the probability $P\{\mu_x(t=1) - \mu_x(t=0) > 10\%\}$. A large value of $\zeta_0$ reduces the likelihood of assigning more subjects to the intervention arm under the null, but it also reduces the number of subjects assigned to the intervention under the alternative. This trade‐off is illustrated in Figure 2. Based on this consideration, we set $\zeta_0 = 0.4$. We conducted 5000 simulations each under the null and the alternative to determine the values of $\zeta_E$, $\zeta_F$, $\rho_E$, and $\rho_F$ that control the overall type I error rate at 5% and the type II error rate at 10%. The Bonferroni method was used to control the overall error rates across the strata.
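The interim stopping rules that these tuning parameters feed into can be sketched from posterior draws of the two response rates. In the sketch below, `eta_e` = 0.10 and `eta_f` = 0 follow the margins stated in the text, while `zeta_e` and `zeta_f` are placeholders: the calibrated threshold values are not reported in this section, and the names and draws are hypothetical.

```python
def prob_exceeds(mu_trt_draws, mu_ctl_draws, eta):
    """Monte Carlo estimate of P(mu_trt - mu_ctl > eta) from paired draws."""
    diffs = [a - b for a, b in zip(mu_trt_draws, mu_ctl_draws)]
    return sum(d > eta for d in diffs) / len(diffs)


def interim_decision(mu_trt_draws, mu_ctl_draws, zeta_e, zeta_f,
                     eta_e=0.10, eta_f=0.0):
    """Return 'efficacy', 'futility', or 'continue' for one stratum."""
    if prob_exceeds(mu_trt_draws, mu_ctl_draws, eta_e) > zeta_e:
        return "efficacy"
    if prob_exceeds(mu_trt_draws, mu_ctl_draws, eta_f) < zeta_f:
        return "futility"
    return "continue"


# Made-up posterior draws and placeholder thresholds for illustration.
trt = [0.35, 0.30, 0.32, 0.28]
ctl = [0.17, 0.18, 0.16, 0.20]
decision = interim_decision(trt, ctl, zeta_e=0.95, zeta_f=0.05)
print(decision)  # continue: P(diff > 0.10) = 0.75, P(diff > 0) = 1.0
```

In practice the draws would come from the MCMC output at each interim look, and thousands of draws rather than four would be used.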

Figure 2

Relationship between parameter values for triggering selective adaptive‐randomization and difference in percent treated

We used the DBCD for the adaptive randomization both with and without borrowing. For the no‐borrowing case, we also considered equal randomization as a reference. The allocation probabilities were updated after every batch of 10 new subjects in the severe stratum and 20 new subjects in the moderate stratum, based on 500 MCMC iterations after a burn‐in of 100. This choice was made to limit computation time and resources. In contrast, the interim and final analyses used 5000 MCMC iterations.

Simulation study results

The results presented in this section are based on 5000 Monte Carlo simulations each under the null and the alternative. The allocation probabilities to the intervention arm, averaged over the Monte Carlo simulations, changed over the course of the current trial adaptively to the underlying response rates (Figure 3). The proposed design selectively invokes the response‐adaptive randomization scheme only if the superiority of the intervention arm is supported. After the first 150 subjects were allocated equally (∼50 in the severe and ∼100 in the moderate stratum), the allocation probabilities skewed to favor the intervention arm under the alternative as more subjects were enrolled. The probabilities were more heavily skewed with borrowing on the control from the historical data than without borrowing. The trend persisted throughout the trial. Under the null, the allocation probabilities initially skewed to favor the intervention arm but returned to 0.5 as more subjects enrolled and the criterion invoking selective adaptive randomization was more reliably assessed. The allocation probabilities were less stable with borrowing on the control arm. This is because the commensurability with the historical sample changed from simulation to simulation, although perfect commensurability was assumed.

Figure 3

Allocation probability to the intervention arm with accrual of patients by stratum

Table 2 summarizes the operating characteristics of the proposed design with and without borrowing and those of equal randomization when each design was calibrated to have the same type I and type II errors. Borrowing on the control improved the operating characteristics of the response‐adaptive design. Under the alternative, borrowing increased the power from 81.52% to 90.36% in the moderate stroke stratum and from 61.62% to 77.16% in the severe stroke stratum. It also enabled the design to stop early correctly more frequently. The percentage of trials that stopped early correctly under the alternative with and without borrowing differs noticeably in the severe stroke stratum in particular (69.52 versus 27.52). The information borrowed from the historical data consequently led to reductions in sample size and trial duration. Under H1, on average 63 (25%) fewer subjects were enrolled in the severe stroke stratum with borrowing, which is anticipated to reduce the trial duration by 637 days based on the real trial enrollment data. In the moderate stroke stratum, 45 (11%) fewer subjects were enrolled, which is anticipated to reduce the trial duration by 192 days. Under the alternative, more subjects were treated with the intervention, and the observed response rates were higher with borrowing. The benefits were more pronounced when compared with equal randomization fixed designs with no borrowing. With or without borrowing, the observed response rates were higher with the response‐adaptive design (27.55% and 26.07% versus 24.41% in the severe stroke stratum, 58.52% and 58.35% versus 57.34% in the moderate stroke stratum). Without borrowing, it is known that response‐adaptive randomization may have lower power than equal randomization, since the goal of treating more patients with the better treatment may conflict with the goal of maximizing power. That was clearly the case in the severe stroke stratum.
The percentage of trials correctly stopping early was also lower (27.52% vs 58.62%), and hence, the adaptive randomization required a larger sample on average (256.31 vs 210.80) under the alternative. Borrowing on the control, however, offsets such compromises, even increasing power and reducing the sample size compared with equal randomization.

Table 2

Table of operating characteristics

AR, adaptive randomization; ER, equal randomization.

Severe stroke stratum	H0, borrowing, AR	H0, no borrowing, AR	H0, no borrowing, ER	H1, borrowing, AR	H1, no borrowing, AR	H1, no borrowing, ER
Power (%)	1.26	1.26	1.26	77.16	61.62	78.16
% Early stopped correctly	83.56	82.52	83.36	69.52	27.52	58.62
% Early stopped wrongfully	0.94	0.94	0.94	7.04	7.02	7.04
% Successfully recovered (SD)	16.20 (3.51)	16.98 (2.87)	16.98 (2.94)	27.55 (3.66)	26.07 (2.91)	24.41 (3.07)
Average sample size	180.33	183.57	181.42	192.91	256.31	210.80

Moderate stroke stratum	H0, borrowing, AR	H0, no borrowing, AR	H0, no borrowing, ER	H1, borrowing, AR	H1, no borrowing, AR	H1, no borrowing, ER
Power (%)	1.04	1.04	1.04	90.36	81.52	81.46
% Early stopped correctly	94.74	91.94	90.80	79.14	69.16	70.48
% Early stopped wrongfully	0.92	0.92	0.92	7.00	7.00	7.00
% Successfully recovered (SD)	50.26 (2.41)	50.20 (2.75)	50.21 (2.75)	58.52 (2.83)	58.35 (2.56)	57.34 (2.49)
Average sample size	324.61	341.67	346.21	365.05	409.66	410.03

We computed the observed treatment allocations by stratum in each simulation in order to further examine whether the purported advantage of the response‐adaptive design was better realized by the proposed design with borrowing on the control arm (Figure 4). Because interim analyses stopped some simulations early, the observed treatment allocations were normalized as percentages of the per stratum enrolled sample sizes. Borrowing on the control led to assigning many more subjects to the intervention arm under the alternative. On average, 69.2% and 60.2% of subjects were assigned to the intervention in the severe and the moderate stratum, respectively, with borrowing on the control. These were higher than the observed averages of 62.3% and 57.4% obtained via response‐adaptive randomization alone without borrowing. For the response‐adaptive design alone to target similarly high proportions of intervention‐treated subjects without borrowing, the true response rates would need to be as high as 50.0% in the severe stratum and 66.9% in the moderate stratum, as opposed to the 31% and 63% assumed here. Under the null, the proposed design performed similarly with and without borrowing. As the design adapted the allocation rates only selectively, if the superiority of the intervention was supported at the time of assessment, on average more than 50% of subjects were treated with the intervention, but only to a modest degree.

Figure 4

Percent treated with intervention by stratum with and without borrowing

In the moderate stroke stratum, the expected response rate under equal randomization is 56.6%, whereas the average observed rates were 58.52% with borrowing and 58.35% without borrowing. In the severe stroke stratum, the expected response rate is 24.0% with equal randomization, whereas the average observed rates were 27.55% with borrowing and 26.07% without borrowing. The distributions of observed successful recovery rates are available in Figure S3 of the Supplemental Material. Compared with the expected response rate under equal randomization, in the moderate stroke stratum on average ∼2 more subjects for every 100 subjects would recover from stroke with no or little disability. This amounts to ∼12 more subjects with desirable outcomes if the maximum target were enrolled. In the severe stroke stratum, on average 3∼4 more subjects with borrowing and 2 more without borrowing for every 100 subjects would have desirable outcomes. Since a nonsuccessful recovery is death or a significant disability throughout the rest of a patient's life, the observed increase in the percentage of successful recoveries is meaningful with or without borrowing. The additional increase with borrowing was modest in both strata, since it corresponds to the difference in the response rate between the treatment arms realized in the subjects additionally treated with the intervention with borrowing. Suppose that 10% more patients were assigned to the intervention arm in the severe stratum. The increase in the number treated with the intervention raises the successful recovery rate by only 1.4%, because only the fraction of that 10% who would not have recovered successfully had they been treated in the control arm contributes to the increase: 10% × (0.31 − 0.17) = 1.4%. Borrowing on the control also increased the total amount of information and led to better precision.
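The arithmetic behind the severe stratum figures can be checked with a short calculation. The sketch below uses the assumed response rates of 31% and 17% from Table 1; the function name is hypothetical, and the shift from a 50% to a 60% allocation share is the illustrative case described above.

```python
def expected_response(p_trt, p_ctl, frac_trt):
    """Expected overall response rate when a fraction frac_trt of subjects
    receives the intervention and the rest receive the control."""
    return frac_trt * p_trt + (1.0 - frac_trt) * p_ctl


# Severe stratum under the alternative: 31% vs 17% response.
equal = expected_response(0.31, 0.17, 0.5)    # about 0.24, the 24.0% in the text
skewed = expected_response(0.31, 0.17, 0.6)   # 10% more subjects on intervention
gain = skewed - equal                         # about 0.014, i.e. 1.4%
```

The gain equals the extra allocation share times the arm difference, 0.10 × (0.31 − 0.17), which is why even a sizable shift in allocation moves the overall recovery rate only modestly.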
With the selective randomization adaptation, we anticipated that borrowing on the control would lead to better precision only in the control arm under the null, whereas it would improve the precision in the intervention arm under the alternative. The posterior standard error of the successful recovery rate observed in the severe stratum by treatment arm with and without borrowing (Figure 5) partially supports this conjecture. Under the null, the posterior standard error of the successful recovery rate was smaller with borrowing in both treatment arms. Under the alternative, the posterior standard error was smaller with borrowing in the intervention arm but not in the control arm. The slightly larger control group standard error may be due to commensurability varying from sample to sample: although perfect commensurability was assumed, the commensurability with the historical sample changed from simulation to simulation. Similar results were observed in the moderate stratum (Figure S4 in the Supplemental Material).

Figure 5

Standard error of successful recovery rate with and without borrowing by treatment arm

The EHSS is summarized in Figure S2 of the Supplemental Material. The EHSS in the severe stroke stratum is much larger under H1 than under H0, whereas the EHSS in the moderate stroke stratum is similar under H1 and H0. This is because the EHSS quantifies the relative gain in posterior precision due to borrowing from the historical control and therefore depends on the size of the historical control sample relative to the current trial control sample. In the severe stroke stratum, the much smaller current trial control sample under H1 than under H0 (on average 30.8% vs 47.5% treated with the control) made the impact of the historical control much greater and consequently the EHSS larger under H1. In the moderate stroke stratum, the difference was smaller (39.8% vs 48.2%), and the equal allocation of the first ∼100 patients made the impact of the historical control data similar under H1 and H0.

CONCLUSION

In this paper, we propose a Bayesian design that is adaptive both in incorporating historical control data and in responding to the relative treatment effects. This proposal differs from existing work on borrowing historical data for the control group. As opposed to reducing the number of subjects assigned to the control group regardless of the true relative effects, we aim to reduce the current control group sample size adaptively to the true response rates and only if the intervention arm is superior. This required modifying existing response‐adaptive randomization schemes so that they adapt the allocation ratios only when evaluation of cumulated current trial data combined with the historical control suggests the superiority of the intervention arm, and so that they account for the information borrowed from the historical control. We used the EHSS approach10 and modified the response‐adaptive randomization scheme of the DBCD12 to incorporate the EHSS in determining the allocation probabilities adaptively to the response. The modified response‐adaptive randomization scheme was incorporated in a Bayesian design with commensurate priors. The primary limitation of using historical controls is that the standard of care may change over time. Other priors that address the potential conflict, such as meta‐analytic‐predictive priors,11 exist and can be used instead. Similarly, other response‐adaptive randomization schemes can be used if they can be appropriately modified. Only binary outcomes were considered here, but the proposed design can be readily modified for other outcome types. Valid high quality historical control data need to exist for the proposed design. As shown in the motivating stroke example, data from a historical trial similarly conducted in the same subject population with a similar study design are not uncommon.
With such high quality historical data available, and compared with no borrowing, the proposed response‐adaptive design with borrowing on the control arm treated more subjects with the intervention and was more likely to conclude early in line with the underlying truth and with better precision, even when accommodating early stopping by interim analyses. The additional improvement in the subjects' outcomes brought by borrowing, however, was modest for treatment differences commonly observed in practice, as shown in the motivating stroke example. As borrowing on the historical control does not add logistical burden in the implementation, the additional improvement, albeit modest, is worthwhile in clinical settings where a nonfavorable outcome is death or involves significant life‐long burdens.
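The selective adaptation summarized in this section can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses the allocation function commonly used with the DBCD (usually attributed to Hu and Zhang), it does not reproduce the EHSS modification developed in the paper, and the square-root target allocation, the `threshold` value, the `gamma` value, and all function names are hypothetical choices made for the example.

```python
import math


def dbcd_allocation_prob(current_prop, target_prop, gamma=2.0):
    """Standard DBCD allocation function g(x, y) for the intervention arm.

    current_prop (x): proportion of subjects so far on the intervention.
    target_prop (y): estimated target proportion for the intervention, in (0, 1).
    gamma: tuning parameter; larger values force x toward y more strongly.
    """
    x, y = current_prop, target_prop
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    num = y * (y / x) ** gamma
    den = num + (1.0 - y) * ((1.0 - y) / (1.0 - x)) ** gamma
    return num / den


def selective_allocation_prob(n_int, n_ctl, p_int_hat, p_ctl_hat,
                              prob_superior, threshold=0.5, gamma=2.0):
    """Probability of assigning the next subject to the intervention.

    Adaptation is selective: unless the current evidence favors the
    intervention (prob_superior > threshold), equal randomization is kept.
    prob_superior is assumed to be a posterior probability of superiority
    computed elsewhere; threshold=0.5 is an arbitrary illustrative value.
    """
    if prob_superior <= threshold or (p_int_hat + p_ctl_hat) == 0.0:
        return 0.5
    # Assumed target: square-root allocation rule for binary outcomes.
    root_int, root_ctl = math.sqrt(p_int_hat), math.sqrt(p_ctl_hat)
    target = root_int / (root_int + root_ctl)
    current = n_int / (n_int + n_ctl)
    return dbcd_allocation_prob(current, target, gamma)


if __name__ == "__main__":
    # Evidence favors the intervention: allocation tilts toward it.
    print(selective_allocation_prob(50, 50, 0.31, 0.17, prob_superior=0.9))
    # Evidence does not favor the intervention: equal randomization is kept.
    print(selective_allocation_prob(50, 50, 0.17, 0.17, prob_superior=0.3))
```

In the design described in this paper, the evaluation of superiority also uses the historical control, and the allocation rule accounts for the borrowed information through the EHSS; neither step is modeled in this sketch.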

23 in total

1. The Interventional Management of Stroke (IMS) II Study.

Authors:
Journal: Stroke Date: 2007-05-24 Impact factor: 7.914

2. Are outcome-adaptive allocation trials ethical?

Authors: Spencer Phillips Hey; Jonathan Kimmelman
Journal: Clin Trials Date: 2015-02-03 Impact factor: 2.486

Review 3. The combination of randomized and historical controls in clinical trials.

Authors: S J Pocock
Journal: J Chronic Dis Date: 1976-03

4. A multiple testing procedure for clinical trials.

Authors: P C O'Brien; T R Fleming
Journal: Biometrics Date: 1979-09 Impact factor: 2.571

5. Hierarchical commensurate and power prior models for adaptive incorporation of historical information in clinical trials.

Authors: Brian P Hobbs; Bradley P Carlin; Sumithra J Mandrekar; Daniel J Sargent
Journal: Biometrics Date: 2011-03-01 Impact factor: 2.571

6. Worth adapting? Revisiting the usefulness of outcome-adaptive randomization.

Authors: J Jack Lee; Nan Chen; Guosheng Yin
Journal: Clin Cancer Res Date: 2012-07-02 Impact factor: 12.531

7. Tissue plasminogen activator for acute ischemic stroke.

Authors:
Journal: N Engl J Med Date: 1995-12-14 Impact factor: 91.245

8. Methodology of the Interventional Management of Stroke III Trial.

Authors: Pooja Khatri; Michael D Hill; Yuko Y Palesch; Judith Spilker; Edward C Jauch; Janice A Carrozzella; Andrew M Demchuk; Renee' Martin; Patrick Mauldin; Catherine Dillon; Karla J Ryckborst; Scott Janis; Thomas A Tomsick; Joseph P Broderick
Journal: Int J Stroke Date: 2008-05 Impact factor: 5.266