
Two-period linear mixed effects models to analyze clinical trials with run-in data when the primary outcome is continuous: Applications to Alzheimer's disease.

Guoqiao Wang1, Andrew J Aschenbrenner2, Yan Li2, Eric McDade2, Lei Liu1, Tammie L S Benzinger3, Randall J Bateman2, John C Morris2, Jason J Hassenstab2,4, Chengjie Xiong1.   

Abstract

INTRODUCTION: Study outcomes can be measured repeatedly based on the clinical trial protocol before randomization during what is known as the "run-in" period. However, it has not been established how best to incorporate run-in data into the primary analysis of the trial.
METHODS: We proposed two-period (run-in period and randomization period) linear mixed effects models to simultaneously model the run-in data and the postrandomization data.
RESULTS: Compared with the traditional models, the two-period linear mixed effects models can increase power by up to 15% and yield similar power under unequal and equal randomization.
DISCUSSION: Given that analysis of run-in data using the two-period linear mixed effects models allows more participants (unequal randomization) to be on the active treatment with power similar to that of equal-randomization trials, it may reduce dropout by assigning more participants to the active treatment and thus improve the efficiency of AD clinical trials.


Keywords:  Alzheimer's disease; Linear mixed effects model; Run-in clinical trials; Two-period models; Unequal randomization

Year:  2019        PMID: 31517032      PMCID: PMC6732759          DOI: 10.1016/j.trci.2019.07.007

Source DB:  PubMed          Journal:  Alzheimers Dement (N Y)        ISSN: 2352-8737


Introduction

To facilitate the development of disease-modifying therapies for Alzheimer's disease (AD), trial-ready cohorts have been established in which participants provide longitudinal measurements on clinical, cognitive, or other measures while investigational drugs are being identified [1], [2]. In this prerandomization period, the primary end points for the future clinical trials, such as clinical or cognitive tests, are assessed based on the master protocol of the platform trials, allowing for easy incorporation of the prerandomization data into the primary analysis. This longitudinal period before randomization is historically referred to as the run-in period, during which potential participants who have met all entry criteria for a randomized clinical trial are assigned no regimen or the same regimen (e.g., placebo) [3]. A run-in period before randomization has been implemented in many landmark clinical trials [4], [5], [6], [7], including trials for AD [7], and it is expected to continue to be an essential design element [8]. The run-in design has been implemented in the Dominantly Inherited Alzheimer Network (DIAN) Trials Unit platform trial [1] and the European Prevention of Alzheimer's Dementia Proof of Concept Platform [2]. In these settings, each participant's run-in duration and number of primary end point assessments may vary and depend on the timing of enrollment. The assessments of the primary outcome collected during run-in can potentially be used in the primary efficacy analysis at the end of the clinical trial. However, it has not been fully established how best to incorporate run-in data into final analyses.
When only a single assessment is collected in the run-in period, the run-in data are often used as a covariate in the primary analysis model [9], whereas when multiple assessments are available, the rate of change (slope) in the run-in period can be used as a covariate [10] within linear mixed effects (LME) models or mixed effects models for repeated measures. Although these methods are helpful, they do not take full advantage of the run-in data, especially when multiple run-in assessments are present. In addition, when the run-in duration varies by individual, the variability of the run-in data over time is not fully accounted for. In AD clinical trials, the primary end points are continuous, and the primary efficacy inference is based on the slowing of the rate of decline in cognition. For these types of end points, we propose a two-period (run-in period and randomization period) LME model to simultaneously model the run-in data and the randomization data. We investigated the behavior of the two-period LME by simulating clinical trials using parameters estimated from the DIAN study and evaluated the gain in power compared with LME models using run-in data (baseline or rate of change) as a covariate. The remainder of this article is organized as follows: Section 2 presents the model formulations of the LME with a covariate and the two-period LME; Section 3 evaluates model behavior through simulated hypothetical clinical trials; Section 4 presents the power formulas; and Section 5 presents the discussion.

Methods

Using information from run-in period as a covariate

As mentioned, the traditional model for analyzing clinical trials with run-in data is the LME model, in which the baseline assessment or the rate of change estimated from the run-in assessments is included as a covariate. Let y_ijk denote the longitudinal assessment for subject i at time t_ijk in treatment group k. The traditional model can be expressed as

y_ijk = μ0 + β1·X1i + μ1k·t_ijk + u0i + u1i·t_ijk + ε_ijk,  (1)

where u0i and u1i are the random effects for the intercept and the slope and follow a bivariate normal distribution; the residual ε_ijk follows a normal distribution N(0, σe²); β1 is the coefficient associated with the run-in covariate X1i (the baseline assessment or the run-in rate of change); μ0 is the baseline group mean and is assumed to be the same for the treatment group and the placebo group because of randomization; μ1k represents the rate of change; i = 1, 2, …, n indexes subjects; j = 0, 1, …, ni indexes assessments; and k = 1, 2 represents the placebo group and the treatment group. The primary efficacy test compares the rate of change of the treatment group (μ12) with that of the placebo group (μ11) during the randomization period.
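Model (1) can be illustrated with a minimal simulation sketch in Python (the paper's own code is in SAS; the function and argument names here are our illustrative assumptions, not the authors'):

```python
import numpy as np

def simulate_subject(mu0, mu1k, beta1, x1, times, cov_u, sigma_e, rng):
    """Draw one subject's outcomes under model (1):
    y = mu0 + beta1*X1 + mu1k*t + u0 + u1*t + eps."""
    # Subject-level random intercept/slope (u0, u1) ~ bivariate normal.
    u0, u1 = rng.multivariate_normal([0.0, 0.0], cov_u)
    t = np.asarray(times, dtype=float)
    eps = rng.normal(0.0, sigma_e, size=t.size)  # within-subject residuals
    return mu0 + beta1 * x1 + mu1k * t + u0 + u1 * t + eps
```

Here `x1` plays the role of the run-in covariate X1i (a baseline value or a run-in slope estimate), and `mu1k` is the group-specific rate of change being tested.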

Two-period LME

We propose the two-period LME to model the run-in period and the randomization period simultaneously. We investigate two scenarios: the slope of the placebo group in the run-in period is the same as (scenario 1) or different from (scenario 2) its slope in the postrandomization period.

Scenario 1

When the slopes are the same, the two-period LME model can be written as

y_ijk = μ0 + μ1·t_ij + Δμ·(t_ij − t_bl,i)+ + u0i + u1i·t_ij + ε_ijk,  (2)

where Δμ represents the treatment effect and equals 0 for the placebo group; t_bl,i represents the individual baseline time of the randomization period; (t − t_bl)+ = max(t − t_bl, 0); j = 0, 1, …, bl, bl + 1, bl + 2, bl + 3, …; t = 0 represents the baseline of the run-in period; μ0, u0i, u1i, and ε_ijk are defined in the same way as in Section 2.1; and μ1 is the common slope of the placebo group in the run-in period and the randomization period.

Scenario 2

Similarly, when the slopes are different, the two-period LME model can be written as

y_ijk = μ0 + μ1·min(t_ij, t_bl,i) + (μ2 + Δμ)·(t_ij − t_bl,i)+ + u0i + u1i·min(t_ij, t_bl,i) + u2i·(t_ij − t_bl,i)+ + ε_ijk,  (3)

where μ1 and μ2 are the slopes of the placebo arm during the run-in period and the randomization period; Δμ, (t − t_bl)+, and t_bl are defined as in equation (2); μ0 and ε_ijk are defined in the same way as in Section 2.1; and u0i, u1i, and u2i follow a multivariate normal distribution with mean zero. The duration of the run-in period can differ across individuals, and there can be multiple assessments during the run-in period.
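The time covariates shared by equations (2) and (3) form a linear spline with a knot at each subject's randomization time t_bl. A small sketch of the corresponding design columns (illustrative, not the paper's code):

```python
import numpy as np

def two_period_columns(times, t_bl):
    """Design columns for the two-period spline with knot at t_bl:
    min(t, t_bl) carries the run-in slope mu1, and (t - t_bl)+ carries
    the randomization-period slope (mu2, plus delta-mu for treatment)."""
    t = np.asarray(times, dtype=float)
    run_in_time = np.minimum(t, t_bl)
    post_time = np.maximum(t - t_bl, 0.0)  # (t - t_bl)+ = max(t - t_bl, 0)
    return np.column_stack([np.ones_like(t), run_in_time, post_time])
```

In scenario 1 the two time columns share a single slope μ1 plus the treatment bend Δμ, while scenario 2 gives each column its own fixed and random slope.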

Evaluation of the behavior of various LMEs

Participants from DIAN study

The DIAN study is an international, longitudinal observational study established in 2008. As of June 2018, it had enrolled 529 participants from families with a confirmed causal autosomal dominant Alzheimer's disease mutation, each participant having a 50% chance of inheriting the mutation. Details of the participants' demographics and clinical, cognitive, imaging, and biochemical measures have been reported in previous publications [11], [12]. For this study, only mutation carriers were included because mutation noncarriers are healthy control subjects and are not allowed to be given any treatment. The data comprise DIAN quality-controlled data from July 2008 to June 2018 on 310 mutation carriers. As many clinical trials use a cognitive composite score as the primary outcome [1], [13], we formed a cognitive composite consisting of the Digit Symbol Substitution task from the Wechsler Adult Intelligence Scale-Revised [14], the Mini-Mental State Examination [15], the DIAN word list delayed recall test [16], and the Wechsler Memory Scale-Revised Logical Memory delayed recall test [17]. The cognitive composite is the average of the z scores of these four tests [11], [12].

Power comparison

We first estimated the baseline mean (μ0), the annual slope (μ1), the variance-covariance matrix of the random intercept and the random slope (u0i, u1i), and the residual variance (σe²) from the DIAN data. Furthermore, we assumed μ2 = 0.9·μ1, a variance for u2i as shown in Table 1, a correlation of 0.4 between u0 and u2, and a correlation of 0.8 between u1 and u2. The values of these parameters are presented in Table 1.
Table 1

Estimated simulation parameters for the cognitive composite using the DIAN observational study

Parameter    Variance-covariance matrix                  Mean
             u0i        u1i        u2i
u0i          1.0656     0.09253    0.05674               −0.6289
u1i          0.09253    0.02331    0.01678               −0.09506
u2i          0.05674    0.01678    0.01888               −0.08555
σe²          0.05160

Abbreviation: DIAN, dominantly inherited Alzheimer network.

To evaluate the advantage of the two-period model relative to the traditional LME with or without run-in data as a covariate, we simulated clinical trials based on data from the DIAN study to closely mimic AD trials. This creates four models for comparison: (1) traditional LME without run-in; (2) traditional LME with the first run-in assessment as a covariate; (3) traditional LME with the slope of change across all run-in visits as a covariate; and (4) the two-period model with run-in. Simulation SAS codes are provided in the Supplementary Material. We simulated trials with 1:1 and 3:1 treatment-to-placebo randomization ratios for a total of 400 patients. Overall, we made the following assumptions for our simulated trials: a 4-year trial after randomization, without/with a run-in period (Fig. 1).
Fig. 1

The run-in period and the randomization period. The run-in period was simulated using a uniform distribution (0.3, 1.2). The “BL” assessments of the randomization period were measured at the time of randomization and could be very close to the last run-in assessments (participant 2). The run-in period had at least one (participant 3) and up to three (participant 2) assessments. Abbreviation: BL, baseline.

The remaining assumptions were as follows. Individual duration of the run-in period: uniform distribution (0.3, 1.2) (Fig. 1). The primary outcome was measured every 0.5 years in the run-in period until the individual was randomized to treatment, and then every 1 year in the randomization period. The baseline measurement of the randomization period was taken at the time of randomization, regardless of how close it was to the last run-in measurement (Fig. 1). For scenario 1, the slopes of the placebo group in the run-in period and the randomization period were the same, and the primary outcome was simulated based on formula (2); for scenario 2, the slopes were different, and the primary outcome was simulated based on formula (3). Effect sizes (% reduction in the slope) were 0%, 30%, 40%, 50%, and 60%. For each of the models mentioned previously, we simulated 1000 clinical trials and calculated type I error and power as the proportion of the 1000 simulated trials per scenario with P values less than .05. The 4-year trials without run-in were used as the anchor point to demonstrate the power improvement of run-in trials. The power/type I error comparisons are presented in Figs. 2 and 3. Each figure includes the comparison among the four types of design/models with 1:1 randomization (left panel) and the comparison between the 1:1 and the 3:1 randomization (right panel). Fig. 2 represents the scenario where the slope of the placebo group in the run-in period is the same as that in the randomization period, whereas Fig. 3 displays the case where the two slopes are different. For both scenarios, the type I error is well controlled for all models. The two-period LME leads to up to a 15% increase in power for the same-slope scenario with 1:1 randomization. When comparing the 3:1 with the 1:1 randomization, the two-period LME yields almost identical power, whereas the traditional LME yields more power for the equal randomization. For the two-slope scenario, the power improvement for the two-period model is up to 11% compared with the LME with a covariate. The 3:1 randomization has slightly less power than the 1:1, but the discrepancy for the two-period LME is much smaller than that for the traditional LME.
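The simulation design above can be sketched in Python for scenario 1, using the Table 1 estimates (the paper's simulations were in SAS; the function name, the data layout, and the use of only the 2×2 (u0, u1) covariance block are our illustrative assumptions):

```python
import numpy as np

# Scenario-1 parameters estimated from DIAN (Table 1); (u0, u1) block only.
MU0, MU1 = -0.6289, -0.09506
COV_U = np.array([[1.0656, 0.09253],
                  [0.09253, 0.02331]])
SD_E = np.sqrt(0.05160)

def simulate_trial(n_trt=300, n_pbo=100, effect=0.3, seed=0):
    """One simulated run-in trial under formula (2): run-in duration
    ~ Uniform(0.3, 1.2) with visits every 0.5 y, then a baseline visit at
    randomization and annual visits over a 4-year randomization period."""
    rng = np.random.default_rng(seed)
    subjects = []
    for i in range(n_trt + n_pbo):
        treated = i < n_trt
        t_bl = rng.uniform(0.3, 1.2)                 # individual run-in length
        run_in = np.arange(0.0, t_bl, 0.5)           # 1 to 3 run-in visits
        post = t_bl + np.arange(0.0, 4.5, 1.0)       # BL + 4 annual visits
        t = np.concatenate([run_in, post])
        u0, u1 = rng.multivariate_normal([0.0, 0.0], COV_U)
        delta_mu = -effect * MU1 if treated else 0.0  # slowing of decline
        y = (MU0 + u0 + (MU1 + u1) * t
             + delta_mu * np.maximum(t - t_bl, 0.0)
             + rng.normal(0.0, SD_E, t.size))
        subjects.append({"id": i, "treated": treated, "time": t, "y": y})
    return subjects
```

Each simulated subject thus has 1 to 3 run-in assessments and 5 postrandomization assessments, matching Fig. 1; power would be estimated by fitting each competing model to many such trials.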
Fig. 2

Power/type I error for each design (with/without RI), different analysis models, and different randomization ratios, assuming the same rate of change in the RI period and the randomization period. Sample size for the left panel: 200/arm. With RI/slope: with RI, LME with the individual slope as a covariate; with RI/baseline: with RI, LME with the individual baseline value as a covariate. 300:100: 300 on treatment and 100 on placebo; 200:200: 200 on treatment and 200 on placebo. Abbreviations: LME, linear mixed effects; RI, run-in.

Fig. 3

Power/type I error for each design (with/without RI), different analysis models, and different randomization ratios, assuming different rates of change in the RI period and the randomization period. Sample size for the left panel: 200/arm. With RI/slope: with RI, LME with the individual slope as a covariate; with RI/baseline: with RI, LME with the individual baseline value as a covariate. 300:100: 300 on treatment and 100 on placebo; 200:200: 200 on treatment and 200 on placebo. Abbreviations: LME, linear mixed effects; RI, run-in.


Power estimation of the two-period LME

Under the LME framework, we first present the power estimation formulas for the two-period model assuming no dropout and no intermittent missing data, and then propose an algorithm to account for dropout.

The same slope for the placebo group in the run-in period and the randomization period

To obtain a closed-form formula, we rewrote the treatment group of equation (2) as

y_ij2 = μ0 + μ1·t_ij + Δμ·(t_ij − t_bl,i)+ + u0i + u1i·t_ij + ε_ij2.

The null hypothesis is H0: Δμ = 0 and the alternative is H1: Δμ ≠ 0. For the fixed effects, the design matrix X of the treatment group has rows

[1, t_ij, (t_ij − t_bl,i)+],

whereas for the placebo group it includes only the first two columns. The design matrix Z for the random effects also includes only the first two columns. Thus E(Y|u) = Xβ + Zu, where β represents the fixed effects, u represents the random effects, and u ~ N(0, G). The fixed effects can be estimated by β̂ = (XᵀΣ⁻¹X)⁻¹XᵀΣ⁻¹Y, with V(β̂) = (XᵀΣ⁻¹X)⁻¹, where Σ = ZGZᵀ + R and R is the diagonal residual matrix. To determine the power for a complex run-in design, we adopted the same strategy as in a previous study [10]: calculate the variance/standard deviation (s) for a single subject and then estimate the standard error for a given sample size. Briefly, using pilot data or published results, we first estimate the residual variance R and the covariance matrix G of the random intercepts and random slopes. We then plug the design matrices X and Z for a single subject into Σ and V(β̂) sequentially to obtain s for Δμ̂. Next, the power for a trial with NT subjects in the treatment group and NP subjects in the placebo group can be determined from

Power = 1 − γ = Φ(|Δμ|/(s/√NT) − z_{α/2}),

where α is the type I error, often set to 5%; γ is the type II error, often set to 20%; z_{α/2} is the upper (α/2)th quantile of the standard normal distribution; and Φ is the standard normal cumulative distribution function. Note that the variance of Δμ̂ is estimated using all the data from the NT + NP subjects, but the standard error (s/√NT) depends only on NT. Thus, theoretically, given the total sample size, the larger NT is, the more power the run-in design has, leading to more power for unequal randomization than for equal randomization. This benefit is attributed to two facts: (1) the placebo group has the same slope in both periods; and (2) the run-in data help estimate the slope of the placebo group and the variances of the random effects and the residuals.
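The single-subject strategy above can be implemented in a few lines (a sketch under the scenario-1 assumptions; the function name, arguments, and visit schedule are illustrative):

```python
import numpy as np
from statistics import NormalDist

def power_scenario1(times, t_bl, G, sigma2_e, delta_mu, n_trt, alpha=0.05):
    """Power via the single-subject variance strategy:
    s = SD of the treatment-effect estimate from one treated subject's
    design matrix; the trial-level standard error is s / sqrt(n_trt)."""
    t = np.asarray(times, dtype=float)
    # Fixed-effect design [1, t, (t - t_bl)+]; random effects use first 2 cols.
    X = np.column_stack([np.ones_like(t), t, np.maximum(t - t_bl, 0.0)])
    Z = X[:, :2]
    Sigma = Z @ G @ Z.T + sigma2_e * np.eye(t.size)     # Z G Z' + R
    V = np.linalg.inv(X.T @ np.linalg.solve(Sigma, X))  # V(beta-hat)
    s = np.sqrt(V[2, 2])             # single-subject SD of delta-mu-hat
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)                   # upper alpha/2 quantile
    return nd.cdf(abs(delta_mu) / (s / np.sqrt(n_trt)) - z)
```

As the formula suggests, holding the total sample size fixed while increasing `n_trt` (unequal randomization) does not reduce this power expression.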

Different slopes for the placebo group in the run-in period and the randomization period

In this scenario, we rewrote equation (3) as

y_ijk = μ0 + μ1·min(t_ij, t_bl,i) + μ2k·(t_ij − t_bl,i)+ + u0i + u1i·min(t_ij, t_bl,i) + u2i·(t_ij − t_bl,i)+ + ε_ijk,

where μ21 = μ2, μ22 = μ2 + Δμ, and k = 1, 2 represents the placebo group and the treatment group. The null hypothesis is H0: μ21 − μ22 = 0 and the alternative is H1: μ21 − μ22 ≠ 0. The design matrices for the fixed effects and the random effects for formula (3) are then the same, and they are also the same for both groups, with rows

[1, min(t_ij, t_bl,i), (t_ij − t_bl,i)+].

As in Section 4.1, V(β̂k) can be obtained for a single subject using the aforementioned formulas for Σ and V(β̂). The power for a total sample size of NT + NP can then be estimated from

Power = 1 − γ = Φ(|μ21 − μ22|/(s·√(1/NT + 1/NP)) − z_{α/2}),

where α, γ, z_{α/2}, and Φ are defined as in Section 4.1.

Algorithm to account for dropout

For scenarios with dropout, the sample size in the power formulas can be approximated by N_dropout = N_no-dropout/(1 − m)^n, where m is the annual dropout rate, n is the total duration in years, and N_dropout and N_no-dropout are the sample sizes for each treatment group with and without dropout. This method assumes that participants who drop out before the end of the study do not contribute to the estimate of the treatment effect and its variance at all, and thus it underestimates the power and overestimates the sample size. An alternative method that accounts for the contribution of early-dropout participants has been proposed in previous research [10], [18]. Briefly, assuming that the proportion and the sample size for each dropout pattern i are p_i and n_i for a given treatment group, the total sample size for that treatment group is approximated by [10], [18]

N ≈ Σ_{i=1}^{k} p_i·n_i,

where k is the total number of dropout patterns for the given treatment group. This method, however, assumes that there are no intermittent missing data within each dropout pattern, or that data after the intermittent missing data do not contribute.
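Both adjustments reduce to one-line computations; a sketch (function names are ours, and the completer fraction (1 − m)^n is the assumption stated above):

```python
def n_adjusted_conservative(n_no_dropout, m, years):
    """Conservative inflation: dropouts contribute nothing, so divide by
    the completer fraction (1 - m)**years at annual dropout rate m."""
    return n_no_dropout / (1.0 - m) ** years

def n_adjusted_patterns(proportions, pattern_ns):
    """Pattern-weighted approximation: N ~ sum_i p_i * n_i, where n_i is
    the sample size required if all subjects followed dropout pattern i."""
    assert abs(sum(proportions) - 1.0) < 1e-9, "proportions must sum to 1"
    return sum(p * n for p, n in zip(proportions, pattern_ns))
```

For example, 100 subjects per arm with 10% annual dropout over 4 years inflates to about 152 under the conservative rule, whereas the pattern-weighted rule typically yields a smaller total because early dropouts still contribute partial information.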

Discussion

In this article, we proposed the two-period LME model to analyze clinical trials with a run-in design when the efficacy inference is based on the rate of change. This two-period LME model offers two important benefits over a traditional LME that uses run-in measures as covariates: (1) it models the run-in data directly instead of using them as covariates; and (2) it allows more participants to be assigned to the active treatment (unequal randomization) without losing power compared with traditional equal-randomization clinical trials, because the run-in data serve as placebo data. The first advantage makes it possible to fully account for the run-in information in terms of the number and frequency of assessments and yields more accurate estimation of the variance-covariance matrix of the random effects and the within-subject error. The second may make trials more appealing to participants, encouraging enrollment, retention, and drug compliance (as participants are more likely to be assigned to the treatment arm), which is especially important for diseases without any effective treatments, such as AD. Furthermore, we provided concise power estimation formulas for the two-period LME model by manipulating the design matrices of the fixed effects and the random effects. Similar manipulation of the design matrices can generalize the two-period model to other variations of run-in designs, such as designs in which all participants in the run-in period are given the active treatment. The proposed two-period model is very flexible in that it allows the fixed effects (slopes), the random effects, and even the ancillary parameters to differ between the two periods. This flexibility can alleviate various concerns about the run-in design. For example, allowing the slope in the run-in period to differ from that in the randomization period addresses the concern that participants may behave differently before and after randomization.
Using the parameters estimated from the DIAN study, we conducted extensive simulations to evaluate the model's behavior while mimicking real AD clinical trials. We showed that the two-period LME model yielded accurate estimates of the treatment effect, controlled the type I error, and led to large increases in power compared with models that used the run-in data as covariates. An additional advantage of the two-period LME is that it can be implemented using well-established SAS procedures such as PROC NLMIXED (see Supplementary Material for details), which makes these models easy to use. It is important to note that our focus was to propose an optimal model for the analysis of run-in clinical trials; it was not our intent to compare trials with and without a run-in design, although we anchored the comparison on the trials without run-in. For such a comparison, extensive research has been done by Frost et al. [10]. Under the LME framework and using three data points (one run-in assessment, a baseline assessment, and one postrandomization assessment), Frost et al. demonstrated that, given the same follow-up duration, run-in designs can be more efficient (requiring a smaller sample size) than designs without run-in, provided that the true between-subject variability in the rate of change (slopes) is large relative to the within-subject error [10]. Our study was inspired by theirs but differs in that the two-period LME is more general, and its power calculation formula can handle any number of assessments and any assessment schedule in both the run-in period and the randomization period. Because both studies are under the same framework, the conclusions of Frost et al. also apply to the two-period LME model. For AD clinical trials, the primary outcome is usually a cognitive test [19], [20], [21] or a composite of multiple cognitive tests [1], [13].
For these cognitive outcomes, the between-subject variability in the rate of change (slopes) is typically small relative to the within-subject error; thus, given the same follow-up duration and the same sample size, trials without a run-in design should have larger power than those with run-in, because the former put participants on the treatment from the beginning and the latter only after the run-in period. Of course, it is always optimal to start participants on a treatment as soon as possible: a 4-year AD trial with 1 year of run-in (in which treatment begins only after the first year) is always less powerful than a 5-year AD trial without run-in (in which treatment begins at baseline). However, our results show that if run-in data are available (e.g., from a prior observational study), or if some cognitive data can be collected while other aspects of the clinical trial are still being developed (e.g., while a drug is being finalized), then the two-period model provides an optimal way to combine run-in data with trial data to maximize the probability of detecting a significant treatment effect. Our study has some limitations. First, the two-period LME assumes that the rate of change during follow-up is linear. Although multiple studies have shown that the decline in cognition is linear, especially within a relatively short period such as 2 years [22], [23], it is not clear whether this linearity assumption holds over a longer course of follow-up or under the influence of disease-modifying treatments. Second, although some clinical trials with run-in designs have been conducted, we were not able to obtain these real clinical trial data to validate the two-period LME model. Instead, we simulated clinical trials using parameters estimated from a longitudinal observational AD study to mimic real clinical trials as closely as possible.
In summary, the two-period LME model optimizes the use of run-in data, is flexible enough to accommodate design variations, can increase the power of clinical trials, and allows more participants (unequal randomization) to be on the active treatment without losing power compared with equal-randomization trials. It may serve as a superior primary analysis model for platform clinical trials in which "trial-ready" populations are enrolled in longitudinal observational studies while awaiting randomization, such as DIAN and the European Prevention of Alzheimer's Dementia.

Systematic review: We reviewed the existing literature on statistical models that can be used to analyze clinical trials with a run-in design. Most methods use the run-in data as a covariate, leading to inefficient use of the run-in data.

Interpretation: The proposed two-period linear mixed effects models jointly model the run-in data and the double-blinded randomized data, can lead to up to a 15% power increase, and allow unequal randomization without significant power loss compared with equal randomization.

Future directions: Generalizing the two-period models to other mixed effects models, such as the mixed effects model for repeated measures (MMRM) with time treated as categorical, is of great interest because MMRM does not require the linearity assumption.
References (22 in total)

1.  Longitudinal decline in mild-to-moderate Alzheimer's disease: Analyses of placebo data from clinical trials.

Authors:  Ronald G Thomas; Marilyn Albert; Ronald C Petersen; Paul S Aisen
Journal:  Alzheimers Dement       Date:  2016-02-23       Impact factor: 21.566

2.  Longitudinal cognitive and biomarker changes in dominantly inherited Alzheimer disease.

Authors:  Eric McDade; Guoqiao Wang; Brian A Gordon; Jason Hassenstab; Tammie L S Benzinger; Virginia Buckles; Anne M Fagan; David M Holtzman; Nigel J Cairns; Alison M Goate; Daniel S Marcus; John C Morris; Katrina Paumier; Chengjie Xiong; Ricardo Allegri; Sarah B Berman; William Klunk; James Noble; John Ringman; Bernardino Ghetti; Martin Farlow; Reisa A Sperling; Jasmeer Chhatwal; Stephen Salloway; Neill R Graff-Radford; Peter R Schofield; Colin Masters; Martin N Rossor; Nick C Fox; Johannes Levin; Mathias Jucker; Randall J Bateman
Journal:  Neurology       Date:  2018-09-14       Impact factor: 9.910

3.  Effect of enalapril on survival in patients with reduced left ventricular ejection fractions and congestive heart failure.

Authors:  Salim Yusuf; Bertram Pitt; Clarence E Davis; William B Hood; Jay N Cohn
Journal:  N Engl J Med       Date:  1991-08-01       Impact factor: 91.245

4.  Mixed model of repeated measures versus slope models in Alzheimer's disease clinical trials.

Authors:  M C Donohue; P S Aisen
Journal:  J Nutr Health Aging       Date:  2012-04       Impact factor: 4.075

5.  Final report on the aspirin component of the ongoing Physicians' Health Study.

Authors: 
Journal:  N Engl J Med       Date:  1989-07-20       Impact factor: 91.245

6.  A double-blind, placebo-controlled multicenter study of tacrine for Alzheimer's disease. The Tacrine Collaborative Study Group.

Authors:  K L Davis; L J Thal; E R Gamzu; C S Davis; R F Woolson; S I Gracon; D A Drachman; L S Schneider; P J Whitehouse; T M Hoover
Journal:  N Engl J Med       Date:  1992-10-29       Impact factor: 91.245

7.  A novel cognitive disease progression model for clinical trials in autosomal-dominant Alzheimer's disease.

Authors:  Guoqiao Wang; Scott Berry; Chengjie Xiong; Jason Hassenstab; Melanie Quintana; Eric M McDade; Paul Delmar; Matteo Vestrucci; Gopalan Sethuraman; Randall J Bateman
Journal:  Stat Med       Date:  2018-05-14       Impact factor: 2.373

8.  Effect of enalapril on mortality and the development of heart failure in asymptomatic patients with reduced left ventricular ejection fractions.

Authors:  S Yusuf; B Pitt; C E Davis; W B Hood; J N Cohn
Journal:  N Engl J Med       Date:  1992-09-03       Impact factor: 91.245

9.  The A4 study: stopping AD before symptoms begin?

Authors:  Reisa A Sperling; Dorene M Rentz; Keith A Johnson; Jason Karlawish; Michael Donohue; David P Salmon; Paul Aisen
Journal:  Sci Transl Med       Date:  2014-03-19       Impact factor: 17.956

10.  Using baseline cognitive severity for enriching Alzheimer's disease clinical trials: How does Mini-Mental State Examination predict rate of change?

Authors:  Richard E Kennedy; Gary R Cutter; Guoqiao Wang; Lon S Schneider
Journal:  Alzheimers Dement (N Y)       Date:  2015-06
