
Estimation after subpopulation selection in adaptive seamless trials.

Peter K Kimani¹, Susan Todd², Nigel Stallard¹.

Abstract

During the development of new therapies, it is not uncommon to test whether a new treatment works better than the existing treatment for all patients who suffer from a condition (full population) or for a subset of the full population (subpopulation). One approach that may be used for this objective is to have two separate trials, where in the first trial, data are collected to determine if the new treatment benefits the full population or the subpopulation. The second trial is a confirmatory trial to test the new treatment in the population selected in the first trial. In this paper, we consider the more efficient two-stage adaptive seamless designs (ASDs), where in stage 1, data are collected to select the population to test in stage 2. In stage 2, additional data are collected to perform confirmatory analysis for the selected population. Unlike the approach that uses two separate trials, for ASDs, stage 1 data are also used in the confirmatory analysis. Although ASDs are efficient, using stage 1 data both for selection and for confirmatory analysis introduces selection bias and consequently statistical challenges in making inference. We focus on point estimation for such trials. We describe the extent of bias for estimators that ignore both the multiplicity of hypotheses and the selection, based on observed stage 1 data, of the population that is most likely to give positive trial results. We then derive conditionally unbiased estimators and examine their mean squared errors for different scenarios. © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.


Keywords: adaptive seamless designs; multi-arm multi-stage trials; phase II/III clinical trials; subgroup analysis; subpopulation


Year: 2015; PMID: 25903293; PMCID: PMC4973856; DOI: 10.1002/sim.6506

Source DB: PubMed; Journal: Stat Med; ISSN: 0277-6715; Impact factor: 2.373

Introduction

In drug development, it is not uncommon to have a hypothesis selection stage followed by a confirmatory analysis stage. In the hypothesis selection stage, data are collected to test multiple hypotheses, with the hypothesis that is most likely to give positive trial results selected to be tested in the confirmatory analysis stage. In this paper, we will consider two-stage adaptive seamless designs (ASDs) in which the hypothesis selection stage (stage 1) and the confirmatory analysis stage (stage 2) are two parts of a single trial, with hypothesis selection performed at an interim analysis. An alternative to an ASD is to have two separate trials, separate in the sense that stage 1 data are only used for hypothesis selection and the confirmatory analysis uses stage 2 data only. However, an ASD is more efficient than having two separate trials: because data from both stages of an adaptive seamless trial are used in the final confirmatory analysis, fewer patients are required in stage 2 for the same power, hence saving resources. The two-stage adaptive seamless trial can also be designed so that it is more efficient than a trial with a single stage, where a single analysis is used to select and test the best hypothesis, for example using the Bonferroni test or the Dunnett test [1].

Much work has been undertaken on ASDs where the multiple hypotheses arise as a result of comparing a control to several experimental treatments in stage 1. Based on stage 1 data, the most promising experimental treatment is selected to continue to stage 2 together with the control. We refer to this as treatment selection. In stage 1, available patients are randomly allocated to the control and all the experimental treatments while in stage 2, patients are randomly allocated to the control and the most promising experimental treatment. The experimental treatments may be distinct treatments or different doses of a single experimental treatment. Treatment selection in ASDs is described in more detail in [2-10] among others. A challenge with such adaptive seamless trials is that selecting the most promising experimental treatment in stage 1 introduces selection bias because the superiority of the selected experimental treatment may be by chance. Consequently, appropriate confirmatory analysis needs to account for using biased stage 1 data. Hypothesis testing methods that control the type I error rate have been developed or described in [2-8]. Point estimators that adjust for treatment selection have been developed in [11-15] while confidence intervals that adjust for treatment selection have been considered in [7, 13, 16-19].

In this paper, we consider the case where multiple hypotheses arise because in stage 1, a control is compared with a single experimental treatment in several subpopulations. Based on stage 1 data, the subpopulation in which the experimental treatment shows most benefit over the control is selected to be tested further in stage 2. We refer to this as subpopulation selection. In stage 1, patients are recruited from all subpopulations while in stage 2, patients are recruited from the selected subpopulation only and randomly allocated to the control and the experimental treatment. Subpopulation (subgroup) analysis has been considered in many trials, encompassing many disease areas such as Alzheimer's [20], epilepsy [21] and cancer [22]. Most of these trials are single stage but investigators are beginning to design two-stage adaptive seamless trials for subpopulation selection such as the trial described in [23]. The subpopulation may be defined based on baseline disease severity [20, 21], age group [24] or a genetic biomarker [22] among other criteria. As in [23], we will assume that the subpopulations are pre-specified.

The case of subpopulation selection in ASDs is described in more detail in [23, 25-28]. As in the case of treatment selection, subpopulation selection introduces selection bias because the most promising subpopulation is selected to be tested in stage 2. Methods for hypothesis testing in two-stage adaptive seamless trials with subpopulation selection that control the type I error rate have been developed [5, 23, 26]. Some of these methods were initially developed for hypothesis testing following treatment selection. It has been possible to test hypotheses after subpopulation selection using some hypothesis testing methods developed for treatment selection because these methods are not fully parametric. For example, Brannath et al. [23] have shown that the method described in [5, 8] can be used for hypothesis testing in the case of subpopulation selection. Estimation after adaptive seamless trials with subpopulation selection has not been considered. However, for confidence intervals, it is possible to use the duality between hypothesis testing and confidence intervals as described for the case of treatment selection in [7, 18, 19]. For point estimation, the methods proposed for treatment selection [11-15] are based on explicit distributions and so their extension for use in subpopulation selection is not straightforward. In this paper, we will consider point estimation after two-stage ASDs where stage 1 data are used to perform subpopulation selection. Spiessens and Debois [25] have described the possible scenarios for subgroup analysis based on how the subpopulations are nested within each other and about which subpopulations the investigators want to draw inference. We will consider the scenario where the effect is considered in the full population and in a single subpopulation. This scenario seems to be of most practical importance, having been considered in methodological work related to actual trial designs [23, 26]. In the discussion, we will describe how the estimators we develop can be extended to some of the other scenarios in [25].

We organise the rest of the paper as follows. In Section 2, we first describe the setting of interest while introducing notation and then define the naive estimator, which ignores subpopulation selection, before deriving a conditionally unbiased estimator. Section 3 gives an example that is used to demonstrate how to compute the naive and unbiased estimators and compare the two estimators for specific cases. We assess the mean squared error of the unbiased estimator in relation to the naive estimator in Section 4. The findings in the paper are discussed in Section 5.

Estimation in adaptive seamless designs for subpopulation selection

Setting and notation

As described in Section 1, we will consider an ASD in which a control is compared with an experimental treatment in a population of patients that consists of a subpopulation that may benefit from the experimental treatment more than the full population. In stage 1, patients are recruited from the full population but it is expected that a subpopulation may benefit more so that the focus at the end of the trial may be in the subpopulation only. Figure 1 shows how the patients in stage 1 are partitioned. The subpopulation, defined by some characteristics such as a biomarker and which we refer to as $S$, is part of the full population. We refer to the full population as $F$ and the part of $F$ that is not part of $S$ as $S^c$. We assume $S$ comprises a proportion $p_S$ of $F$. At first, we focus on the case of known $p_S$ before considering the case of unknown $p_S$ in Section 2.4. We will use subscripts $S$, $S^c$ and $F$ to indicate notation that corresponds to populations $S$, $S^c$ and $F$, respectively. The patients are randomised to the control treatment and the experimental treatment. We assume randomisation is stratified such that in each of $S$ and $S^c$, the number of patients randomised to the control is equal to the number of patients randomised to the experimental treatment. Based on stage 1 data, the trial continues to stage 2 either with $F$ or with $S$.

Figure 1

Partitioning of the full population.

We assume outcomes from patients are normally distributed with unknown means and known common variance $\sigma^2$. We are interested in the unknown treatment difference between means for the control and the experimental treatment. Table 1 shows the key notation that we will use in this paper. We denote the unknown true treatment differences in $S$ and $S^c$ by $\theta_S$ and $\theta_{S^c}$, respectively. We denote stage 1 sample mean differences for $S$ and $S^c$ by $X$ and $Y$, respectively, and the stage 1 sample mean difference for $F$ by $Z$, which can be expressed by $Z = p_S X + p_{S^c} Y$, where $p_{S^c} = 1 - p_S$. We assume that a total of $n_1$ patients are recruited in stage 1 so that $s_1 = p_S n_1$ patients are from $S$ with $s_1/2$ randomly allocated each to the control and the experimental treatment. The remaining $(n_1 - s_1)$ patients are from $S^c$ with $(n_1 - s_1)/2$ randomly allocated each to the control and the experimental treatment. Note that $X \sim N(\theta_S, \sigma_X^2)$, where $\sigma_X^2 = 4\sigma^2/s_1$, and $Y \sim N(\theta_{S^c}, \sigma_Y^2)$, where $\sigma_Y^2 = 4\sigma^2/(n_1 - s_1)$.
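The stage 1 quantities in this paragraph can be sketched numerically. The snippet below is a minimal illustration, assuming equal allocation within each subgroup so that a difference of two arm means based on $m$ patients in total has variance $4\sigma^2/m$; the function name `stage1_summary` and the input values (taken from the worked example later in the paper) are illustrative choices, not part of the original text.

```python
def stage1_summary(x, y, p_s, n1, sigma):
    """Return (z, var_x, var_y) for stage 1.

    x, y  : stage 1 mean differences in S and its complement
    p_s   : known prevalence of S in the full population
    n1    : total number of stage 1 patients
    sigma : known common standard deviation of the outcome
    """
    s1 = p_s * n1                         # patients recruited from S
    var_x = 4.0 * sigma**2 / s1           # s1/2 patients per arm in S
    var_y = 4.0 * sigma**2 / (n1 - s1)    # (n1 - s1)/2 per arm in the complement
    z = p_s * x + (1.0 - p_s) * y         # full-population mean difference
    return z, var_x, var_y


# Illustrative values: sigma = 13.2, n1 = 200, p_S = 0.5
z, var_x, var_y = stage1_summary(x=6.5, y=5.6, p_s=0.5, n1=200, sigma=13.2)
print(z, var_x, var_y)
```

With these inputs both subgroups contribute 100 patients, so the two variances coincide.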

Table 1

Summary of notation.

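Selected population	Subpopulation	True parameter value	Stage 1 sample mean	Stage 1 variance of sample mean	Stage 2 sample mean	Stage 2 variance of sample mean	Naive estimator (stages 1 and 2)	Sufficient statistic (stages 1 and 2)	Unbiased estimator (stages 1 and 2)
S	S	$\theta_S$	$X$	$\sigma_X^2$	$U$	$\tau_U^2$	$D_{S,N}$	$Z_S$	$D_{S,U}$
S	S^c	$\theta_{S^c}$	$Y$	$\sigma_Y^2$	-	-	-	-	-
F	S	$\theta_S$	$X$	$\sigma_X^2$	$V$	$\tau_V^2$	$D^F_{S,N}$	$Z^F_S$	$D^F_{S,U}$
F	S^c	$\theta_{S^c}$	$Y$	$\sigma_Y^2$	$W$	$\tau_W^2$	$D^F_{S^c,N}$	$Z_{S^c}$	$D^F_{S^c,U}$
F	S + S^c	$\theta_F$	$Z$	-	-	-	$D_{F,N}$	-	$D_{F,U}$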

The observed values for $X$, $Y$ and $Z$ are denoted by $x$, $y$ and $z$, respectively. The trial continues to stage 2 with $S$ if $x > z + b$, which is equivalent to $x > y + b/(1 - p_S)$, where $b$ is a number chosen such that the trial continues with $S$ if the effect of the new treatment is sufficiently larger in $S$ than in $F$. The trial continues to stage 2 with $F$ if $x \le z + b$, which is equivalent to $x \le y + b/(1 - p_S)$. In stage 2, a total of $n_2$ patients are recruited. If $S$ is selected, all the $n_2$ patients will be from $S$ with $n_2/2$ patients randomly allocated each to the control and the experimental treatment. If $F$ is selected, $s_2 = p_S n_2$ patients will be from $S$ and $(n_2 - s_2)$ patients will be from $S^c$.

If $S$ is selected to continue to stage 2, the objective is to estimate $\theta_S$ while if $F$ is selected to continue to stage 2, the objective is to estimate $\theta_F = p_S\theta_S + (1 - p_S)\theta_{S^c}$. Therefore, the parameter of interest at the end of the two-stage trial, which we denote by $\theta$, is random and is defined by

$$\theta = \begin{cases} \theta_S & \text{if } S \text{ is selected}, \\ \theta_F & \text{if } F \text{ is selected}. \end{cases} \qquad (1)$$

We will consider two estimators for $\theta$, namely, the naive and the unbiased estimators. As shown in Table 1, when $S$ is selected, we denote the naive estimator for $\theta_S$ by $D_{S,N}$. When $F$ is selected, we denote the naive estimators for $\theta_S$, $\theta_{S^c}$ and $\theta_F$ by $D^F_{S,N}$, $D^F_{S^c,N}$ and $D_{F,N}$, respectively. We define the naive estimator for $\theta$ as

$$D_N = \begin{cases} D_{S,N} & \text{if } S \text{ is selected}, \\ D_{F,N} & \text{if } F \text{ is selected}. \end{cases} \qquad (2)$$

We give the expressions for the naive estimators $D_{S,N}$, $D^F_{S,N}$, $D^F_{S^c,N}$ and $D_{F,N}$ and derive their bias functions in Section 2.2.

In the following, we derive uniformly minimum variance unbiased estimators (UMVUEs) for $\theta_S$ and $\theta_{S^c}$. As indicated in Table 1, when $S$ is selected, we denote the UMVUE for $\theta_S$ by $D_{S,U}$. When $F$ is selected, we denote the UMVUEs for $\theta_S$ and $\theta_{S^c}$ by $D^F_{S,U}$ and $D^F_{S^c,U}$, respectively. Note that $D_{F,U} = p_S D^F_{S,U} + (1 - p_S) D^F_{S^c,U}$ is an unbiased estimator for $\theta_F$. We define the unbiased estimator for $\theta$ as

$$D_U = \begin{cases} D_{S,U} & \text{if } S \text{ is selected}, \\ D_{F,U} & \text{if } F \text{ is selected}. \end{cases} \qquad (3)$$

We derive the expressions for the UMVUEs $D_{S,U}$, $D^F_{S,U}$ and $D^F_{S^c,U}$ in Section 2.3.

We will compare the naive ($D_N$) and the unbiased ($D_U$) estimators for $\theta$ by evaluating the bias for $D_N$ and the mean squared errors (MSEs) for $D_N$ and $D_U$. We will evaluate biases and MSEs conditional on the selection made and so for the naive estimator, we will derive expressions for the biases of $D_{S,N}$ and $D_{F,N}$ separately. Similarly, the MSEs for $D_N$ and $D_U$ will be evaluated conditional on the selection made. Note that if an estimator is unbiased conditional on selection, it is also unconditionally unbiased.
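The selection rule and its stated equivalence can be checked with a few lines of code. This is a minimal sketch; the function names are illustrative, and the two forms are algebraically identical only for $0 < p_S < 1$, which is assumed here.

```python
def select_population(x, y, p_s, b=0.0):
    """Selection rule in its original form: continue with S if x > z + b."""
    z = p_s * x + (1.0 - p_s) * y
    return "S" if x > z + b else "F"


def select_population_equiv(x, y, p_s, b=0.0):
    """Equivalent form: continue with S if x > y + b / (1 - p_S)."""
    return "S" if x > y + b / (1.0 - p_s) else "F"


# Stage 1 values used for two of the scenarios in the worked example (b = 0)
print(select_population(6.5, 5.6, 0.5))   # x exceeds y, so S is carried forward
print(select_population(5.4, 6.0, 0.5))   # x is below y, so F is carried forward
```

Rearranging $x > p_S x + (1 - p_S) y + b$ gives $(1 - p_S)x > (1 - p_S)y + b$, which is why the two functions agree.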

The naive estimator

In this section, we describe the naive estimator for θ defined by equation (2) and derive simple expressions for its bias function. When S is selected, a possible naive estimator for θ , which we denote by D in expression (2), is the two‐stage sample mean given by where U denotes the stage 2 sample mean for patients in S and t =S /(S +n 2) is the proportion of patients in S who are in stage 1. The expected value for D can be expressed as E(D ) = t E(X|X > Y *) + (1 − t )θ , where Y *=Y + b/(1 − p ) so that the bias for D is given by where denotes the indicator function for X > Y *. Following the last expression in Appendix C.1 in 15, Pr(X > Y *) can be expressed as follows where σ and σ are as defined in Section 2.1, and φ and Φ denote the density and distribution functions of the standard normal, respectively. Also, following Appendix C.1 in 15, can be expressed as The expressions for Pr(X > Y *)and are substituted in expression (5) to obtain the bias function for D . If F is selected to continue to stage 2, we are seeking an estimator for θ . Let and denote the proportion of patients recruited in stage 1 from S and S , respectively, and as indicated in Table 1, let V and W denote the stage 2 sample means for S and S , respectively. If F is selected to continue to stage 2, possible naive estimators for θ and , which we denote by and , respectively in Section 2.1, are the two‐stage sample means and . Consequently, a naive estimator for θ , which we denote by D in expression (2), could be The bias for D can be expressed as where and are given by where denotes the indicator function for . As for the expressions for Pr(X > Y *) and , and in the earlier expressions can be respectively expressed as where and For the case we consider here where the population has two partitions, a simple expression for is . Appendix C.2 in 15 has expressions with a single integral that can be modified when the partitioning of the population is more complex. 
The aforementioned expressions for , and are used to obtain the bias functions for , and D . We will use the bias functions for D , , and D that we have derived in this section to show the extent of the bias for the naive estimator in Section 4.1, which necessitates the need for an unbiased estimator for θ such as the one we derive in the following section.
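As a rough numerical companion to this section, the sketch below evaluates the naive two-stage mean when S is selected and its conditional bias, using the relation E(D) = t E(X | X > Y*) + (1 − t)θ stated above. It assumes that Pr(X > Y*) and E(X·1[X > Y*]) can be written as single integrals of a normal density times a normal distribution function, and approximates them with a trapezoidal rule. The function names, integration limits and step count are illustrative assumptions, not the authors' implementation.

```python
import math


def phi(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def Phi(v):
    """Standard normal distribution function (erfc keeps tail accuracy)."""
    return 0.5 * math.erfc(-v / math.sqrt(2.0))


def naive_estimate_S(x, u, s1, n2):
    """Two-stage sample mean for S: t*x + (1 - t)*u with t = s1 / (s1 + n2)."""
    t_s = s1 / (s1 + n2)
    return t_s * x + (1.0 - t_s) * u


def naive_bias_S(theta_s, theta_sc, p_s, n1, n2, sigma, b=0.0, steps=4000):
    """Bias of the naive estimator conditional on selecting S."""
    s1 = p_s * n1
    sd_x = math.sqrt(4.0 * sigma**2 / s1)
    sd_y = math.sqrt(4.0 * sigma**2 / (n1 - s1))
    shift = b / (1.0 - p_s)
    lo, hi = theta_s - 10.0 * sd_x, theta_s + 10.0 * sd_x
    h = (hi - lo) / steps
    prob, partial_mean = 0.0, 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end points
        dens = phi((t - theta_s) / sd_x) / sd_x * Phi((t - theta_sc - shift) / sd_y)
        prob += weight * dens * h                  # Pr(X > Y*)
        partial_mean += weight * t * dens * h      # E(X * 1[X > Y*])
    t_s = s1 / (s1 + n2)
    return t_s * (partial_mean / prob - theta_s)


d_sn = naive_estimate_S(x=6.5, u=7.42, s1=100, n2=200)
bias = naive_bias_S(theta_s=0.0, theta_sc=0.0, p_s=0.3, n1=200, n2=200, sigma=1.0)
print(round(d_sn, 2), bias)
```

Dividing `bias` by sqrt(4σ²/(s1 + n2)), an assumed form for the approximate standard error, gives a standardised bias of roughly 0.32 for this configuration, which appears consistent with the value quoted in the simulation discussion later in the paper.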

Conditionally unbiased estimator for θ when the prevalence of the subpopulation is known

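In this section, we derive an estimator for $\theta$ that is unbiased conditional on the selection made. To do this, we need the densities of the stage 2 means. The notation for the variances of the stage 2 sample means is given in Table 1. If $S$ is selected to continue to stage 2, $U$ is normally distributed with variance $\tau_U^2 = 4\sigma^2/n_2$. If $F$ is selected to continue to stage 2, $V$ and $W$ are normally distributed with variances $\tau_V^2 = 4\sigma^2/s_2$ and $\tau_W^2 = 4\sigma^2/(n_2 - s_2)$, respectively. Also, to derive the unbiased estimators, we need sufficient statistics and these will be vectors that include the weighted sums of stages 1 and 2 means. The notation for the weighted means for the two alternative choices of population is given in the second last column in Table 1.

To obtain the unbiased estimator, we use the Rao-Blackwell theorem (for example, [29]). This states that, to obtain the UMVUE for a parameter, one identifies an unbiased estimator for the parameter of interest and then derives its expectation conditional on a complete and sufficient statistic. Let $Q_S$ denote the event $X > Y + b/(1 - p_S)$. Conditional on $Q_S$, $U$ is an unbiased estimator for $\theta_S$ so that if we can identify a sufficient and complete statistic for estimating $\theta_S$, we can use the Rao-Blackwell theorem to derive the UMVUE for $\theta_S$. Define $Z_S = (\tau_U/\sigma_X)X + (\sigma_X/\tau_U)U$. We describe in Appendix A that conditional on $Q_S$, $(Y, Z_S)$ is the sufficient and complete statistic for $\theta_S$ and that the UMVUE for $\theta_S$, $E[U \mid Y, Z_S, Q_S]$, which we denote by $D_{S,U}$ in Section 2.1, is given by

$$D_{S,U} = \frac{\sigma_X\tau_U}{\sigma_X^2 + \tau_U^2} Z_S - \frac{\tau_U^2}{\sqrt{\sigma_X^2 + \tau_U^2}}\, \frac{\phi\left(f_U(X,Y)\right)}{\Phi\left(f_U(X,Y)\right)}, \qquad (11)$$

where, after substituting $p_S$ with $s_1/n_1$ in the expression for $f_U(x,y)$ given in Appendix A,

$$f_U(X,Y) = \frac{\sqrt{\sigma_X^2 + \tau_U^2}}{\sigma_X^2}\left(\frac{\sigma_X\tau_U}{\sigma_X^2 + \tau_U^2} Z_S - Y - \frac{n_1 b}{n_1 - s_1}\right).$$

We have substituted $p_S$ with $s_1/n_1$ in the expression for $f_U(X,Y)$ and also in the expressions for $f_V(X,Y)$ and $f_W(X,Y)$ defined in the following so that estimators in this section and corresponding estimators in Section 2.4 have the same expressions.

Let $Q_F$ denote the event $X \le Y + b/(1 - p_S)$. Conditional on $Q_F$, $V$ and $W$ are unbiased estimators for $\theta_S$ and $\theta_{S^c}$, respectively, so that if appropriate sufficient and complete statistics for $\theta_S$ and $\theta_{S^c}$ can be identified, the UMVUEs for $\theta_S$ and $\theta_{S^c}$ can be obtained using the Rao-Blackwell theorem. Define $Z^F_S = (\tau_V/\sigma_X)X + (\sigma_X/\tau_V)V$ and $Z_{S^c} = (\tau_W/\sigma_Y)Y + (\sigma_Y/\tau_W)W$. We show in Appendix B that conditional on $Q_F$, $(Y, Z^F_S)$ and $(X, Z_{S^c})$ are sufficient and complete statistics for $\theta_S$ and $\theta_{S^c}$, respectively, and that the UMVUE for $\theta_S$, $E[V \mid Y, Z^F_S, Q_F]$, which we denote by $D^F_{S,U}$ in Section 2.1, is given by

$$D^F_{S,U} = \frac{\sigma_X\tau_V}{\sigma_X^2 + \tau_V^2} Z^F_S + \frac{\tau_V^2}{\sqrt{\sigma_X^2 + \tau_V^2}}\, \frac{\phi\left(f_V(X,Y)\right)}{\Phi\left(f_V(X,Y)\right)}, \qquad (12)$$

where, after substituting $p_S$ with $s_1/n_1$ in the expression for $f_V(x,y)$ given in Appendix B,

$$f_V(X,Y) = \frac{\sqrt{\sigma_X^2 + \tau_V^2}}{\sigma_X^2}\left(Y + \frac{n_1 b}{n_1 - s_1} - \frac{\sigma_X\tau_V}{\sigma_X^2 + \tau_V^2} Z^F_S\right),$$

and that the UMVUE for $\theta_{S^c}$, $E[W \mid X, Z_{S^c}, Q_F]$, which we denote by $D^F_{S^c,U}$ in Section 2.1, is given by

$$D^F_{S^c,U} = \frac{\sigma_Y\tau_W}{\sigma_Y^2 + \tau_W^2} Z_{S^c} - \frac{\tau_W^2}{\sqrt{\sigma_Y^2 + \tau_W^2}}\, \frac{\phi\left(f_W(X,Y)\right)}{\Phi\left(f_W(X,Y)\right)}, \qquad (13)$$

where, after substituting $p_S$ with $s_1/n_1$ in the expression for $f_W(x,y)$ given in Appendix B,

$$f_W(X,Y) = \frac{\sqrt{\sigma_Y^2 + \tau_W^2}}{\sigma_Y^2}\left(\frac{\sigma_Y\tau_W}{\sigma_Y^2 + \tau_W^2} Z_{S^c} - X + \frac{n_1 b}{n_1 - s_1}\right).$$

Consequently, an unbiased estimator for $\theta_F$ is $D_{F,U} = p_S D^F_{S,U} + (1 - p_S) D^F_{S^c,U}$, where $D^F_{S,U}$ and $D^F_{S^c,U}$ are given by expressions (12) and (13), respectively.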
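The Rao-Blackwell construction described in this section can be sketched in code. The block below is an illustration under stated assumptions rather than the authors' implementation: it takes the conditional expectation of the stage 2 mean given the inverse-variance weighted two-stage mean (a rescaling of the weighted sums defined above), the other subgroup's stage 1 mean, and the selection event, which yields a truncated-normal correction term. The function names are hypothetical, `shift` stands for b/(1 − p), and the full-population inputs assume w = 3.82 (the moderate-AD difference quoted in the worked example).

```python
import math


def phi(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def Phi(v):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-v / math.sqrt(2.0))


def umvue_S_selected(x, y, u, var_x, var_u, shift=0.0):
    """Conditionally unbiased estimate of the effect in S when S is selected."""
    tot = var_x + var_u
    m = (var_u * x + var_x * u) / tot              # weighted two-stage mean
    f = math.sqrt(tot) / var_x * (m - y - shift)
    return m - var_u / math.sqrt(tot) * phi(f) / Phi(f)


def umvue_F_selected(x, y, v, w, var_x, var_y, var_v, var_w, p_s, shift=0.0):
    """Conditionally unbiased estimates (S, complement, F) when F is selected."""
    tot_s = var_x + var_v
    m_s = (var_v * x + var_x * v) / tot_s
    f_v = math.sqrt(tot_s) / var_x * (y + shift - m_s)
    d_s = m_s + var_v / math.sqrt(tot_s) * phi(f_v) / Phi(f_v)

    tot_c = var_y + var_w
    m_c = (var_w * y + var_y * w) / tot_c
    f_w = math.sqrt(tot_c) / var_y * (m_c - x + shift)
    d_c = m_c - var_w / math.sqrt(tot_c) * phi(f_w) / Phi(f_w)
    return d_s, d_c, p_s * d_s + (1.0 - p_s) * d_c


# Worked-example inputs: sigma = 13.2, n1 = n2 = 200, p_S = 0.5, b = 0
sigma2 = 13.2**2
var_100 = 4.0 * sigma2 / 100     # variance of a mean difference from 100 patients
var_200 = 4.0 * sigma2 / 200     # variance of a mean difference from 200 patients

d1 = umvue_S_selected(6.5, 5.6, 7.42, var_100, var_200)
d2 = umvue_S_selected(6.5, 3.8, 7.42, var_100, var_200)
d3 = umvue_F_selected(5.4, 6.0, 7.42, 3.82, var_100, var_100, var_100, var_100, 0.5)
d4 = umvue_F_selected(5.7, 5.7, 7.42, 3.82, var_100, var_100, var_100, var_100, 0.5)
print(round(d1, 2), round(d2, 2), round(d3[2], 2), round(d4[2], 2))
```

Under these assumptions the sketch gives 6.67 and 6.97 when S is selected and 5.63 for the full population in both F-selected cases, which agrees with the values that survive in the worked example.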

Conditionally unbiased estimator for θ when the prevalence of the subpopulation is unknown

In the previous sections, we have assumed that p , the true proportion of patients in S, is known. In some instances, this may not be a reasonable assumption. In this section, we derive conditionally unbiased estimator for θ when p is unknown. Unlike in Sections 2.2 and 2.3, for this case, S and S are random. We will assume S , the number of patients from S in stage 1, is Binomial(n 1,p ) so that consequently and are now random. Define , where s is the observed value for S and . We assume that the trial continues to stage 2 with S if x > (z *+b), which is equivalent to and with F if , which is equivalent to . Note that when S is selected, if we derive an estimator for θ that is unbiased conditional on S =s , then the estimator is unconditionally unbiased. We show in Appendix C that the UMVUE for θ when S is random is given by expression (11). For the case where F is selected, we assume S , the number of patients in S in stage 2, is Binomial(n 2,p ) so that consequently and are now random. We show in Appendix D that the UMVUEs for θ and are given by expressions (12) and (13), respectively. Let As and similarly , is an unbiased estimator for θ .

Worked example

In this section, we use an example to demonstrate how the various estimates described in Sections 2.2 and 2.3 are computed and how they compare. Computation of most estimates described in Section 2.4 would be similar to the computation of estimates in Section 2.3. Several trials for Alzheimer's disease (AD) consider continuous outcomes. In some AD trials, the primary outcome is continuous 30 so that our methodology can be used. Also, subgroup analysis has been considered in AD trials 20. Therefore, to construct the example, we use the AD trial reported in 31. This trial recruited patients with moderate or severe AD, with subgroup analysis performed later for patients with severe AD 20. We take the full population to consist of the patients with moderate or severe AD and the subpopulation to be the patients with severe AD that are thought to potentially benefit more from the new treatment. The primary outcome in 31 is not continuous and so for our example, we imagine that the primary outcome is Severe Impairment Battery (SIB) score, a 51‐item scale with scores ranging from 0 to 100. This was a secondary outcome in the original trial. For the AD trial in 31, the observed mean differences in SIB scores for patients with severe AD and the full population are 7.42 20 and 5.62 31, respectively. Based on these values, the observed mean difference for patients with moderate AD is approximately 3.82. Using the results for the severe AD patients, we will assume σ = 13.2. The AD trial 20 is single stage with approximately 290 patients. In the examples constructed here, we will assume a two‐stage ASD with n 1=n 2=200. Using the definitions of Section 2.1, patients with severe AD form subpopulation S. Therefore, we denote the proportion and the true mean difference in SIB scores for patients with severe AD by p and θ , respectively. 
In stage 1, the observed mean difference in SIB scores for patients with severe AD is denoted by x, and in stage 2, the observed mean difference in SIB scores for patients with severe AD is denoted by u if testing is only conducted for patients with severe AD and by v if the full population is tested. Also, from the definitions in Section 2.1, patients with moderate AD would form S so that we denote the proportion and the true mean difference in SIB scores for patients with moderate AD by and , respectively. In stage 1, the observed mean difference in SIB scores for patients with moderate AD is denoted by y, and in stage 2, if the full population is tested, we denote the observed mean difference in SIB scores for patients with moderate AD by w. The proportion of patients with severe AD in 31 is approximately 0.5 so that for the example we take . Because n 1=n 2=200 and p =0.5 so that S =p n 1=100 and S =p n 2=100, using the definitions of Section 2.2, t =S /(S +n 2) = 1/3, and . We assume the trial continues with S if the effect for patients with severe AD is greater than the effect for all patients so that b = 0 and b/(1 − p ) = 0. To compute the various unbiased estimates, we need and . If stage 2 data are only available for patients with severe AD, to compute an unbiased estimate for θ , we need . If the full population is tested in stage 2, to obtain the unbiased estimates for θ and , we need and . We will compute estimates for four scenarios. In the first two scenarios, S (patients with severe AD) is selected to continue to stage 2. In both scenarios, we suppose that u = 7.42. For Scenario 1, we suppose x = 6.5 and y = 5.6 and so the naive estimate for the mean difference for patients with severe AD, d =t x + (1 − t )u = 7.11. For the unbiased estimate, we use equation (11), with the unbiased estimate . The values for and have been evaluated earlier and so that d =6.67. 
In the second scenario, we suppose x = 6.5 and y = 3.8 and using similar computation, $d_{S,N} = 7.11$ and $d_{S,U} = 6.97$. The naive estimates for Scenarios 1 and 2 are equal while the unbiased estimates are not, with the unbiased estimate for Scenario 2 closer to the naive estimate. Scenarios 1 and 2 differ only in the value of y, which is why the naive estimates are equal: conditional on selecting S, the naive estimate depends on x and u only. However, the unbiased estimate depends on y and, as can be deduced from the expression for $f_U(x,y)$, moves further from the naive estimate as the difference between the naive estimate and y decreases. This is reasonable because when data suggest that treatment effects for patients with moderate and severe AD are similar, selection bias is likely to be high. Equal naive estimates and different unbiased estimates for Scenarios 1 and 2 may indicate more variability for the unbiased estimator for $\theta_S$ ($D_{S,U}$) developed in Section 2.3 compared with the naive estimator $D_{S,N}$.

The other two scenarios are for the case where the full population is tested in stage 2 and in both scenarios, we suppose that v = 7.42 and w = 3.82. For the third scenario, we suppose that x = 5.4 and y = 6.0 so that the naive estimates for $\theta_S$, $\theta_{S^c}$ and $\theta_F$ are $d^F_{S,N} = 6.41$, $d^F_{S^c,N} = 4.91$ and $d_{F,N} = 5.66$, respectively. Using equations (12) and (13) to obtain the unbiased estimates $d^F_{S,U}$ and $d^F_{S^c,U}$ for $\theta_S$ and $\theta_{S^c}$, the unbiased estimate for $\theta_F$ is $d_{F,U} = p_S d^F_{S,U} + (1 - p_S) d^F_{S^c,U} = 5.63$. The corresponding naive and unbiased estimates are not equal. In the fourth scenario, we suppose x = 5.7 and y = 5.7 and using similar formulae as in Scenario 3, $d^F_{S,N} = 6.56$, $d^F_{S^c,N} = 4.76$, $d_{F,N} = 5.66$ and $d_{F,U} = 5.63$. In both scenarios, the naive estimates for $\theta_F$ are equal. The unbiased estimates for $\theta_F$ in Scenarios 3 and 4 are also equal. However, the corresponding naive estimates for $\theta_S$ and $\theta_{S^c}$ in Scenarios 3 and 4 are different. The corresponding unbiased estimates for $\theta_S$ and $\theta_{S^c}$ in Scenarios 3 and 4 are also different.
For both θ and , the difference between the naive estimates in Scenarios 3 and 4 is greater than the difference between the unbiased estimates in Scenarios 3 and 4. This is because the unbiased estimators for θ and are functions of all stage 1 data while the naive estimators for θ and only use data from populations S and S , respectively. The larger differences between the unbiased estimates in Scenarios 3 and 4 than the differences between the naive estimates, may indicate more variability for the unbiased estimator for θ (D ) developed in Section 2.3 than the naive estimator D . In Scenario 4, compared with Scenario 3, and are further from and , respectively. This is reasonable because although in both scenarios is smaller than so that the observed data suggest a correct decision for both scenarios would be to continue to stage 2 with patients with severe AD only, in Scenario 4, is much smaller than providing more evidence that the correct decision would have been to continue with patients with severe AD only, and hence more adjustments to the naive estimates are required.
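The naive estimates for the two full-population scenarios are simple enough to reproduce directly. The sketch below assumes that each subgroup's two-stage sample mean weights the stage means by that subgroup's stage 1 and stage 2 sample sizes, and that the stage 2 moderate-AD difference is w = 3.82 (the value derived earlier in this section), since that is the value that reproduces the reported full-population naive estimate of 5.66. Function names are illustrative.

```python
def two_stage_mean(stage1_mean, stage2_mean, n_stage1, n_stage2):
    """Sample-size weighted mean of the stage 1 and stage 2 mean differences."""
    t = n_stage1 / (n_stage1 + n_stage2)
    return t * stage1_mean + (1.0 - t) * stage2_mean


def naive_full_population(x, y, v, w, p_s, n1, n2):
    """Naive estimates for S, its complement and F when F is selected."""
    s1, s2 = p_s * n1, p_s * n2
    d_s = two_stage_mean(x, v, s1, s2)
    d_sc = two_stage_mean(y, w, n1 - s1, n2 - s2)
    return d_s, d_sc, p_s * d_s + (1.0 - p_s) * d_sc


scenario3 = naive_full_population(5.4, 6.0, 7.42, 3.82, 0.5, 200, 200)
scenario4 = naive_full_population(5.7, 5.7, 7.42, 3.82, 0.5, 200, 200)
print([round(d, 2) for d in scenario3])
print([round(d, 2) for d in scenario4])
```

The subgroup estimates differ between the two scenarios while the full-population estimate does not, because with p = 0.5 the latter depends on x and y only through their sum.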

Comparison of the estimators

In this section, we assess the bias of the naive estimator and use a simulation study to compare the mean squared errors for the naive and unbiased estimators for several scenarios.

Characteristics of the calculated bias for the naive estimator

From the bias functions derived in Section 2.2, we note that the bias for the naive estimator depends on θ , and p and so we will vary the values of these parameters. The bias also depends on t =n 1/(n 1+n 2) but we will only present results for the scenario where n 1=n 2=200 so that t =0.5. From the expressions for biases, one can demonstrate that biases increase as one makes selection later in the trial, that is, as t increases. The other parameters that bias depends on are b, σ 2 and n 1+n 2. In this section, we will take σ 2=1. For a given value of t , to make the results approximately invariant of σ 2 and n 1+n 2, we will divide D by , which is the approximate standard error (SE) for D and we will divide estimators , and D by , which is the approximate SE for D . We will comment how b influences bias after describing results in Figure 2 for which b = 0. The top row in Figure 2 explores the bias for the naive estimator and how the naive estimators for treatment effects in S and S contribute to the bias. From left to right, the plots correspond to p =0.3, p =0.5 and p =0.7. The y‐axes give the biases. The x‐axes correspond to different values for θ and in all plots, . We have taken a fixed value for because bias depends on θ and only through . This can be observed by noting that if we add some value δ to θ and , the expressions for bias in Section 2.2 change by having t − δ − θ , , and in place of t − θ , , and , respectively. If we let r = t − δ and integrate with respect to r and use subscript δ for the new expressions used to obtain bias, these can be expressed as Pr(X > Y *) = Pr(X > Y *), , and . Substituting the new expressions in equations (5) and (10), we obtain the same forms for bias and hence the same bias when δ is added to both θ and .

Figure 2

Plots showing bias (top row) and mean squared error (bottom row) for the case where n 1=n 2=200, σ 2=1 and . The x‐axes correspond to the values for θ . Each column corresponds to a different value for p . MSE, mean squared error; SE, standard error. In Figure 2, the legends at the bottom of the plots describe the line types for each estimator. In the first legend, the continuous lines (—) correspond to the case where S is selected to continue to stage 2 and hence gives the bias for D as an estimator for θ . The bias for D decreases as increases. This is reasonable because as θ becomes larger than , Pr(X > Y *) approaches 1 so that the density of X conditional on X > Y * approaches the unconditional density of X and consequently the bias for D approaches zero. The decrease of bias for D can also be explained by the expressions for Pr(X > Y *) and that are given by equations (6) and (7), respectively. As θ becomes larger than , the density for the sample mean difference in S becomes stochastically larger than the density for the sample mean difference in S and consequently, for the values of t, where the term with φ in Pr(X > Y *) and is non‐zero, the term with Φ approaches 1. Hence, Pr(X > Y *) and approach one and E(X), respectively so that the bias approaches zero. The dotted lines (···) show the bias for , the naive estimator for when F is selected. The bias is positive and increases as increases. The dashed lines (‐ ‐ ‐) show the bias for , the naive estimator for θ when F is selected. The bias is negative and increases as increases. The explanation for the behaviours of the biases for and is similar to the explanation for the behaviour of the bias for D . When p =0.5 (middle panel), except for the sign, the bias for is equal to the bias for . For the other values for p (other panels in Figure 2), except for the sign, we note that multiplied by p is equal to multiplied by . 
Therefore, if F is selected, although the naive components and are biased, as can be seen from the short and long dashed lines (– ‐ – ‐ –), the naive estimator D is unbiased. The dashed and dotted line (·−·−·) shows the bias for D , the naive estimator for θ. The bias is maximal when . Based on results not presented here, for b ≠ 0, the lines in Figure 2 shift by b/(1 − p ) so that bias is maximal when . Noting that the selection is based on max{X,Y + b/(1 − p )}, the proof that the bias is maximal when is given by Carreras and Brannath 32. From the aforementioned assessment of the bias for the naive estimator, we note that, when S is selected, the bias for the naive estimator for θ is substantial. If F is selected, the naive estimator for θ is unbiased. However, the naive estimators for θ and are substantially biased. It is our view that we need unbiased estimators for θ and such as those developed in Section 2.3 when F is selected because we believe investigators would still want to learn about θ and .

Simulation of mean squared errors

In this section, we perform a simulation study to compare the MSEs of the naive and unbiased estimators. In Section 4.1, for the case where F is selected, as well as exploring the bias of the naive estimator for θ_F, we also explored the bias of the naive estimators for θ_S and for the treatment effect in the complement of S, which are the components of θ_F. In this section, we focus only on the estimators that will be used for inference after stage 2. Hence, when F is selected, we only compare MSEs for the naive and unbiased estimators for θ_F, and when S is selected, we only compare MSEs for the naive and unbiased estimators for θ_S. For each combination of the treatment effects in S and in its complement, t and p, we perform 1,000,000 simulation runs. As in Section 4.1, we only present the MSE results for the case where b = 0. For simulations with b = 0, in each simulation run, we simulate stage 1 data (x and y). If x > y, in which case S would be selected to continue to stage 2, we simulate u; if x ≤ y, in which case F would be selected to continue to stage 2, we simulate v and w.
The bottom row in Figure 2 gives the square root of the MSEs divided by the approximate SE. As indicated earlier, these plots are for the case where b = 0. Based on results not presented here, for b ≠ 0, the lines in Figure 2 shift by b/(1 − p). For both the naive and unbiased estimators, the SE used depends on whether S or F is selected to continue to stage 2. From left to right, the columns correspond to p = 0.3, p = 0.5 and p = 0.7, respectively. The second legend at the bottom of the plots describes the line types for each estimator.
The continuous lines correspond to the naive estimator for θ_S when S is selected to continue to stage 2. The MSE for this estimator decreases as the treatment effect on the x-axis increases and varies with p, but not monotonically. The thick continuous lines correspond to the UMVUE for θ_S when S is selected to continue to stage 2. The MSE for the UMVUE also decreases as the treatment effect on the x-axis increases and seems to increase as p increases. For most scenarios, the MSE for the UMVUE is larger than the MSE for the naive estimator. As in the bias plots, the dashed and dotted lines (·−·−·) correspond to the naive estimator for θ_F when F is selected to continue to stage 2; for all scenarios, the square root of its MSE divided by the SE is approximately 1. The thick dashed and dotted lines (‐ · ‐·‐) correspond to the unbiased estimator for θ_F when F is selected to continue to stage 2. The MSE for this estimator increases with the treatment effect on the x-axis and with p.
For the case where S is selected to continue to stage 2, comparing the biases and MSEs for the naive estimator and the UMVUE, we recommend using the UMVUE. Although for most scenarios in Figure 2 the MSE for the UMVUE is greater than the MSE for the naive estimator, the gain from unbiasedness outweighs the loss of precision. For example, from the results in the top left and bottom left plots, when θ = 0, the bias of the naive estimator divided by the SE is 0.32, while the square root of its MSE divided by the SE is only 0.07 less than that for the UMVUE, so the UMVUE removes substantial bias at the expense of a slight loss of precision around the true treatment effect. Similar results are observed in the other plots. For the case where F is selected to continue to stage 2, from the results in Figure 2, the naive estimator seems a better estimator for θ_F than the unbiased estimator because both are mean unbiased but the naive estimator has the smaller MSE.
The summary finding from the simulation study is that bias for the naive estimators can be substantial, but the naive estimators have lower MSEs than the unbiased estimators we derived in Section 2.3. Balancing the gain of having an unbiased estimator against the loss of precision, when S is selected, we recommend using the unbiased estimator for θ_S given by expression (11). When F is selected, both the naive estimator and the unbiased estimator are mean unbiased but the naive estimator has better precision, and so we recommend using the naive estimator for θ_F given by expression (8).
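As an illustration of the simulation loop described in this section for b = 0, the following minimal Python sketch estimates the conditional bias and MSE of the naive estimator for the subpopulation effect when S is selected. It is not the authors' code. It assumes normally distributed sample mean differences with a known common variance σ², so that a mean difference based on m patients split equally between two arms has variance 4σ²/m. It also assumes that the stage 1 number of patients in S is p·n1, that the naive estimate is the weighted average (s1 x + n2 u)/(s1 + n2), and that the scaling SE is sqrt(4σ²/(s1 + n2)). The function name, argument names and the reduced number of runs are hypothetical choices.

```python
import math
import random


def simulate_naive_s(theta_s, theta_sc, p, n1=200, n2=200, sigma=1.0,
                     runs=20000, seed=1):
    """Monte Carlo bias and MSE of the naive estimate of theta_s,
    conditional on S being selected (rule x > y, i.e. b = 0)."""
    rng = random.Random(seed)
    s1 = p * n1                               # stage 1 patients in S (p known)
    sd_x = 2 * sigma / math.sqrt(s1)          # SD of stage 1 mean difference in S
    sd_y = 2 * sigma / math.sqrt(n1 - s1)     # SD in the complement of S
    sd_u = 2 * sigma / math.sqrt(n2)          # SD of stage 2 mean difference in S
    errors = []
    for _ in range(runs):
        x = rng.gauss(theta_s, sd_x)
        y = rng.gauss(theta_sc, sd_y)
        if x > y:                             # S continues to stage 2
            u = rng.gauss(theta_s, sd_u)
            d = (s1 * x + n2 * u) / (s1 + n2)  # naive estimate
            errors.append(d - theta_s)
    k = len(errors)
    if k == 0:
        raise ValueError("S was never selected; increase runs")
    bias = sum(errors) / k
    mse = sum(e * e for e in errors) / k
    se = 2 * sigma / math.sqrt(s1 + n2)       # assumed form of the approximate SE
    return {
        "selected": k,
        "bias": bias,
        "mse": mse,
        "bias_over_se": bias / se,
        "rmse_over_se": math.sqrt(mse) / se,
    }
```

Calling `simulate_naive_s(0.0, 0.0, 0.3)` sets both treatment effects to 0 with p = 0.3. Under the assumptions above, the bias-to-SE ratio should come out at roughly 0.3, which is in line with the value of 0.32 quoted in the text for the left-hand plots. This agreement suggests, but does not confirm, that the assumed SE resembles the one used for Figure 2.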

Properties of the estimators when the prevalence of the subpopulation is unknown

The results in Sections 4.1 and 4.2 are for the case of known p. In this section, we assess the performance of the various estimators when p is unknown. To do this, we use the true value of p to simulate the number of patients in S in stage 1 (s1) as Binomial(n1, p) and calculate the corresponding estimate of p. After simulating s1, we simulate stage 1 sample mean differences x and y for S and its complement, respectively. As in Sections 4.1 and 4.2, we only present results for the case where b = 0. For this case, we select S if x > y and select F if x ≤ y. If S is selected, we simulate the stage 2 sample mean difference u from a sample consisting of n2/2 patients in each of the control and experimental arms. The naive estimate for θ_S is d = (s1 x + n2 u)/(s1 + n2). The unbiased estimate for θ_S is obtained using expression (11). If F is selected, we use the true value of p to simulate the number of patients in S in stage 2 (s2) as Binomial(n2, p). Within each of S and its complement, we assume that patients are equally allocated to the control and experimental treatments. Based on s2 patients from S and (n2 − s2) patients from its complement, we simulate stage 2 sample mean differences v and w for S and its complement, respectively. We then compute the naive estimates for θ_S, for the treatment effect in the complement of S, and for θ_F. The naive estimator for θ_F obtained in this way can reasonably be compared with the estimator given by expression (8) for the case where p is known. The unbiased estimates for the treatment effects in S and in its complement are obtained using expressions (12) and (13), respectively, and are combined to give the unbiased estimate for θ_F. Figure 3 gives the simulation results for the configurations considered in Figure 2. The forms of the SEs used in Figure 3 are the same as those used in Figure 2. We have not presented bias plots because the estimators obtained assuming that p is known have very similar biases to the estimators obtained assuming that p is unknown.
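The procedure for unknown p can be sketched in the same way. The Python fragment below is a hedged illustration rather than the authors' implementation. It draws the stage 1 count of patients in S (called `s1` in the code) from Binomial(n1, p), simulates x and y, applies the rule x > y, and, when S is selected, forms the naive estimate (s1 x + n2 u)/(s1 + n2). Normality with known common variance σ² and the variance 4σ²/m for a mean difference based on m patients are assumptions. Only the branch where S is selected is covered, because the expressions needed for the other branch and for the unbiased estimators are not reproduced in this section. All names are hypothetical.

```python
import math
import random


def simulate_unknown_p(theta_s, theta_sc, p, n1=200, n2=200, sigma=1.0,
                       runs=5000, seed=1):
    """Bias and MSE of the naive estimate of theta_s when S is selected
    and the stage 1 number of patients in S is random (b = 0)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(runs):
        # Stage 1 count in S as a sum of n1 Bernoulli(p) draws, i.e. Binomial(n1, p).
        s1 = sum(rng.random() < p for _ in range(n1))
        if s1 == 0 or s1 == n1:
            continue  # no mean difference can be formed for an empty group
        x = rng.gauss(theta_s, 2 * sigma / math.sqrt(s1))
        y = rng.gauss(theta_sc, 2 * sigma / math.sqrt(n1 - s1))
        if x > y:  # S continues to stage 2
            u = rng.gauss(theta_s, 2 * sigma / math.sqrt(n2))
            d = (s1 * x + n2 * u) / (s1 + n2)  # naive estimate with random s1
            errors.append(d - theta_s)
    k = len(errors)
    if k == 0:
        raise ValueError("S was never selected; increase runs")
    bias = sum(errors) / k
    mse = sum(e * e for e in errors) / k
    return {"selected": k, "bias": bias, "mse": mse}
```

Running this alongside a version with `s1` fixed at p·n1 gives a rough feel for how little the extra randomness in the stage 1 count changes the MSE when n1 is moderately large.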
When S is selected, for the naive estimator for θ_S, there is no noticeable difference in MSEs between the case where p is assumed known and the case where p is assumed unknown, so that the number of stage 1 patients in S is random. Similar results are observed for the unbiased estimator for θ_S when S is selected. For the case where F is selected, for both the naive estimator and the unbiased estimator for θ_F, the MSEs for the case where p is estimated are approximately equal to the MSEs for the case where p is assumed known in some scenarios and slightly higher in the others. Based on results not presented here, when b ≠ 0, as in the case where p is assumed known, we noted that the lines in Figure 3 shift by b/(1 − p).
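The shift of b/(1 − p) can be motivated by a short calculation under two assumptions that are not stated in this section and are made here only for illustration: that z, the stage 1 estimate for the full population, is the prevalence-weighted average z = p x + (1 − p) y, and that S is selected when x exceeds z by more than b.

```latex
\begin{align*}
x > z + b
  &\iff x > p\,x + (1-p)\,y + b \\
  &\iff (1-p)\,(x - y) > b \\
  &\iff x > y + \frac{b}{1-p}, \qquad 0 < p < 1 .
\end{align*}
```

With b = 0 this reduces to the rule x > y used in the simulations. For b ≠ 0 the threshold on x − y moves by b/(1 − p), which is consistent with the reported shift of the lines in Figures 2 and 3.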

Figure 3

Mean squared error for the case where n1 = n2 = 200 and σ² = 1. The x‐axes correspond to the values for θ. Each column corresponds to a different value for p. MSE, mean squared error; SE, standard error.

To summarise, the results obtained when p is estimated are very similar to the results when p is assumed known. The biases for the different estimators for θ_S and θ_F are almost identical and the MSEs are only slightly higher. The increases in MSEs may not be substantial when p is estimated because the stage 1 sample size is adequate and hence the estimator for p has good precision. An estimator for p with good precision would not add a great deal of variability to the estimators for θ_S and θ_F. Thus, if stage 1 data are adequate to estimate p, the estimators developed in this paper perform almost as well as when p is known.

Discussion

ASDs have been proposed to make the testing of new interventions more efficient. Such designs have been used for trials with subpopulation selection, which is the case we consider in this paper. Specifically, we have considered a design that has two stages, with data collected from stage 1 used to select the population to test in stage 2. In stage 2, additional data for a sample drawn from the selected population are collected. The final confirmatory analysis uses data from both stages. Statistical methods that have previously been developed to adjust for the selection bias that arises from using stage 1 data have addressed hypothesis testing without inflating the type I error. In this work, we have focussed on point estimation. We have derived formulae for obtaining unbiased point estimators, both for the case where the prevalence of the subpopulation is considered known and for the case where it is unknown. To obtain unbiased estimators when the prevalence of the subpopulation is unknown, we have derived formulae for unbiased estimators that do not require the proportion of patients from the subpopulation to equal the prevalence of the subpopulation. This means that the estimators we have derived can be used to obtain unbiased estimates for trials that use enrichment designs, where the proportion of the subpopulation in the trial is not equal to the prevalence of the subpopulation. The rest of this discussion focusses on the case where the prevalence of the subpopulation is assumed known, but most points also hold for the case where the prevalence is unknown. The unbiased estimators we have developed have higher MSEs than the naive estimators. Balancing unbiasedness against precision, we recommend using the unbiased estimator we have derived when the subpopulation is selected and the naive estimator when the full population is selected.
The unbiased estimator for θ_F that we derived conditional on continuing to stage 2 with the full population, although based on UMVUEs for the treatment effects in the subpopulation and its complement, may not be a UMVUE among estimators for θ_F that are functions of unbiased estimators for those two effects. More research is required to check whether it is a UMVUE and, if not, to seek a UMVUE. The estimators we have developed in this paper are unbiased conditional on the selection made. For the case where the full population is selected, we have derived separate unbiased estimators for the treatment effects for the subpopulation and its complement. These estimators are unbiased only if we do not make a selection after stage 2. That is, for the case where the full population continues to stage 2, if we use the observed separate estimates to make a claim that the treatment effect is larger in the subpopulation or in the full population, then the estimators developed in this paper are no longer unbiased. In this case, the same data are used both for selection and estimation, and Stallard et al. [33] have shown that there is no unbiased estimator. We have considered the case where the selection rule is pre‐defined and based on the efficacy outcome. In terms of estimation, a pre‐defined selection rule makes it possible to derive point estimators and evaluate their biases: bias is an expectation, and it is not clear what all the possible outcomes are when the selection rule is not pre‐defined. The Food and Drug Administration draft guidance also acknowledges the difficulty in interpreting trial results when adaptation is not pre‐defined [34]. A compromise between the pre‐defined selection rule used in this paper and a setting where the selection rule is not pre‐defined is a pre‐defined selection rule that includes additional aspects such as safety. More work is required to develop point estimators for such settings.
The unbiased estimators we have developed are for the case where the subpopulations are pre‐specified, and they cannot be assumed to be unbiased in trials where subpopulations are not pre‐specified. Not pre‐specifying subpopulations offers flexibility, but it makes the bias of point estimators hard to evaluate because bias is an expectation, and it is not clear what all the possible outcomes are when the subpopulations are not pre‐specified. Hence, it is not possible to quantify the bias of the estimators developed here when the subpopulations are not pre‐specified [35]. Depending on the number of subpopulations and how they are defined, there are several configurations in which the subpopulations can be nested within each other [25]. We have focussed on a simple and common configuration where a single subpopulation is thought to benefit more, so that based on stage 1 data, the investigators want to choose between continuing with the full population or the subpopulation. Because we obtained unbiased estimators by partitioning the full population into distinct parts, the formulae we have developed for this configuration can be extended to other configurations. If the full population is not of interest and the subpopulations are not nested within each other, the subpopulations already form distinct parts and the formulae derived for treatment selection, such as those in [15], can be used directly. If the full population is of interest or some subpopulations are nested within each other, it is possible to partition the full population into distinct parts and use our methodology to obtain unbiased estimators for the distinct parts.
However, we found that for some selections the naive estimator is unbiased and has better precision than an estimator that combines unbiased estimators for the distinct parts. Therefore, to recommend the best estimator for the case where subpopulations are nested, we suggest comparing the characteristics of the naive estimators with those of the estimator that combines unbiased estimators for the distinct parts of the population. We have assumed that whether the subpopulation or the full population is selected, the total sample size in stage 2 is fixed. The results also hold for the case where the stage 2 sample sizes for continuing with the subpopulation and the full population are different but fixed in advance for each selection made. The results may not hold when the stage 2 sample size depends on the observed data in some other way. Finally, if there is a futility rule that requires the trial to continue to stage 2 only if the mean difference for the selected population exceeds some pre‐specified value and, as in [15], estimation is conditional on continuing to stage 2, the unbiased estimators developed in Section 2.3 can be extended to account for this. If we denote the pre‐specified futility value by B so that the trial stops if max{x,z}

