Li Su1, Joseph W Hogan. 1. MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK. li.su@mrc-bsu.cam.ac.uk
Abstract
Dropout is a common occurrence in longitudinal studies. Building upon the pattern-mixture modeling approach within the Bayesian paradigm, we propose a general framework of varying-coefficient models for longitudinal data with informative dropout, where measurement times can be irregular and dropout can occur at any point in continuous time (not just at observation times) together with administrative censoring. Specifically, we assume that the longitudinal outcome process depends on the dropout process through its model parameters. The unconditional distribution of the repeated measures is a mixture over the dropout (administrative censoring) time distribution, and the continuous dropout time distribution with administrative censoring is left completely unspecified. We use Markov chain Monte Carlo to sample from the posterior distribution of the repeated measures given the dropout (administrative censoring) times; Bayesian bootstrapping on the observed dropout (administrative censoring) times is carried out to obtain marginal covariate effects. We illustrate the proposed framework using data from a longitudinal study of depression in HIV-infected women; the strategy for sensitivity analysis on unverifiable assumption is also demonstrated.
Many longitudinal studies suffer from dropout, which is termed “informative” if the dropout process depends on the unobserved outcomes even after conditioning on the observed data. To account for informative dropout, a number of model-based approaches have been proposed for the joint modeling of the dropout and longitudinal outcome processes (Little, 1995; Hogan and Laird, 1997b; Kenward and Molenberghs, 1999). These approaches can be generally classified as “selection models” (Wu and Carroll, 1988; Diggle and Kenward, 1994; Follman and Wu, 1995; Ten Have and others, 1998), “pattern-mixture models” (PMMs) (Wu and Bailey, 1989; Little, 1993, 1994; Hogan and Laird, 1997a), or “shared parameter models” (Wulfsohn and Tsiatis, 1997; Henderson; Tsiatis and Davidian, 2004). In all these frameworks, inference is based on the assumptions that are not verifiable from the observed data, and tools for evaluating the sensitivity to these assumptions are required (Rotnitzky; Verbeke; Molenberghs; Lee, 2007).
Focusing on the pattern-mixture modeling approach, in this article we develop a general framework of varying-coefficient models (VCMs) (Hastie and Tibshirani, 1993) for longitudinal data, where measurement times may be irregular across individuals and where dropout can occur at any point in continuous time (not just at observation times) and be unobserved due to administrative censoring. Specifically, the conditional distribution of the longitudinal outcomes given the dropout time (or the administrative censoring time) follows a VCM, where the outcome model parameters such as regression coefficients, variance components, and correlation parameters depend on the dropout time (or the administrative censoring time) through unspecified smooth functions. Two separate VCMs are specified to distinguish the administratively censored individuals from those who actually drop out. The full-data distribution is a mixture over the dropout/administrative censoring time distribution, which is left unspecified.
The proposed framework generalizes the pattern-mixture models (Little, 1993, 1994; Fitzmaurice and Laird, 2000), the conditional linear models (CLMs) (Wu and Bailey, 1989), and the class of VCMs developed for continuous outcomes in Hogan. Specifically, our approach is distinguished from the work in Hogan by (a) handling administrative censoring separately from other types of dropout that could be related to the outcome process, (b) allowing all model parameters to depend on the dropout process, and (c) accommodating binary outcomes. In this article, we demonstrate the proposed approach using both continuous and binary longitudinal data with continuous-time dropout. The unspecified smooth functions are modeled by Bayesian penalized splines (Ruppert). When the marginal covariate effects on the outcome process are of interest, Rubin's (1981) Bayesian bootstrap (BB) is used for averaging over the dropout time distribution with administrative censoring. The advantage of building our VCM framework within the Bayesian paradigm is that there is no need to model the continuous dropout time distribution parametrically. With a frequentist approach to model the dropout times nonparametrically, extra simulation by bootstrapping the continuous dropout times is necessary for standard error estimation if the delta method fails (Hogan). On the other hand, the BB is naturally merged with the Markov chain Monte Carlo (MCMC) for the outcome process model, and the variability of the observed dropout/administrative censoring times is appropriately taken into account when making inferences on the marginal covariate effects.
The HIV epidemiology research study (HERS) (Smith; Ickovics and others, 2001) was a longitudinal study of women with, or at high risk for, HIV infection. Twelve core visits (each in a calendar time window) were scheduled for 1310 women, where a variety of clinical, behavioral, and sociologic outcomes were to be recorded approximately every 6 months. If women came to the sites on a date out of the visit window, the visit procedures were not performed. Further, mid-interval visits were added for severely immunosuppressed women (CD4 count < 100). The actual measurement times correspond to assessment dates and vary across participants.
Our interest is in studying the course of depression in the 753 women who had HIV infection at baseline and did not drop out of the study due to HIV-related death before the study end. Depression was measured using the Center for Epidemiologic Studies Depression Scale (CES-D). The CES-D includes 20 questions related to mood, each of which can take a value from 0 (symptom rarely present) to 3 (symptom almost always present); scores therefore range from 0 to 60. Because the distribution of CES-D for a general population can be very skewed, in practice transformations or nonparametric methods need to be applied (Radloff, 1977). In HIV research, a score of 16 or greater for CES-D is frequently used as a cutoff for clinical depression (Ickovics and others, 2001; Cook; Leserman, 2008). This can avoid the potential nonnormality problem for continuous CES-D data and can be useful for screening depression cases in study populations. As in the original analysis of the HERS CES-D data presented in Ickovics and others (2001), our proposed VCMs were originally motivated by the analysis of the dichotomized HERS CES-D data. However, we will also illustrate our framework using continuous HERS CES-D data and compare the results from the 2 analyses.
A challenge with the analysis of these HERS depression data is that dropout could be related to the disease progression and associated depressive symptoms. Figure 1 presents the Kaplan–Meier curves for the dropout time by race and baseline CD4 count. Only 173 women finished the 12 scheduled visits, and their dropout times are treated as administratively censored. We distinguish these women from those who dropped out prematurely due to reasons other than HIV-related death.
Fig. 1.
Kaplan–Meier curves of the dropout time by the race and baseline CD4 count groups in the HERS application. Cross represents right-censored dropout time due to administrative reasons.
The remainder of this article is organized as follows. The proposed modeling framework is described in Section 2. Estimation procedures are detailed in Section 3. In Section 4, we apply our methods to the HERS depression data. Conclusions and discussion follow in Section 5.
VCMS FOR INFORMATIVE DROPOUT
Suppose that the data come from N individuals, and for the ith (i = 1,…, N) individual, there is an outcome process {Y_i(t)}, where t (t ≥ 0) is the time since enrollment. Correspondingly, there is a p-dimensional covariate process {x_i(t)} associated with {Y_i(t)}. In the absence of dropout, the conditional distribution of the variable Y_i(t) given x_i(t) can be described by a model F with parameters θ,
\[ Y_i(t) \mid x_i(t) \sim F\{\theta; x_i(t)\}. \]
For example, if Y_i(t) is continuous, we might assume that F{θ; x_i(t)} is a Gaussian process with mean function μ_i(t) = x_i(t)β and variance–covariance function cov{Y_i(s), Y_i(t) ∣ x_i(s), x_i(t)} = V(s,t) (s ≤ t), where β is a p × 1 vector of regression coefficients. Parametric forms can be used for V(s,t), for instance, V(s,t) = σ² exp(−γ|t − s|). In this case, θ = (β^T, σ², γ)^T.
If {Y_i(t)} is a binary process, we might assume that F{θ; x_i(t)} follows a marginalized transition model (MTM) (Heagerty, 2002), with marginal mean g{μ_i^M(t)} = x_i(t)β, where g(·) is a link function. The serial dependence is modeled by the conditional mean of Y_i(t) given its history ℋ_i^Y(t−) before t and the covariate process history ℋ_i^x(t), that is, μ_i^C(t) = E{Y_i(t) ∣ ℋ_i^Y(t−), ℋ_i^x(t); φ}. Here, φ is the dependence parameter vector and θ = (β^T, φ^T)^T.
For the ith individual, let T_i denote the administrative censoring time (or the scheduled study end) and let D_i denote the dropout time. The observed data consist of the total follow-up time, U_i = min(D_i, T_i), and the indicator for dropout, δ_i = I(D_i < T_i). In other words, for individuals who are administratively censored (or finish the study), D_i is right censored and δ_i = 0; for individuals who drop out prematurely, D_i is observed and δ_i = 1. At the time points t_{i1},…, t_{in_i} (t_{in_i} ≤ U_i), we also observe the outcome measurements Y_i = {Y_i(t_{i1}),…, Y_i(t_{in_i})}^T.
When D_i ≥ T_i for all i, it is not necessary to consider the dropout process while modeling the outcome process {Y_i(t)}. Otherwise, the dropout process is potentially informative. To deal with this situation, we assume that the full data comprise {Y_i(t), x_i(t), U_i, δ_i} and factor the joint distribution as
\[ f(y, u, \delta \mid x) = f(y \mid u, \delta, x)\, f(u, \delta \mid x). \]
To induce the dependence between y and (u,δ) in the first factor, we assume that
\[ Y_i(t) \mid x_i(t), U_i = u, \delta_i = \delta \sim F\{\theta(u,\delta); x_i(t)\}, \]
where F is an appropriate outcome model and
\[ \theta(u,\delta) = \delta\,\theta_1(u) + (1-\delta)\,\theta_0(u). \]
Here, θ1(·) and θ0(·) are the vectors of functions for the dropout time D and the administrative censoring time T. Therefore, the administratively censored individuals are distinguished from those who drop out by allowing them to have different model parameters for the outcome process. The second factor f(u,δ∣x) can be specified using any distribution for event times, where the dependence on x can be checked using standard event time regression analysis methods. In the HERS application, the covariates are discrete and we allow f(u,δ∣x) to be completely unspecified within the levels of x.
Different assumptions can be made for the form of θ(u,δ). For example, if θ(u,δ) are constant in u, then the dropout process is ignorable and methods for modeling Y(t) given x(t) can be used without explicitly considering (U,δ). When the dropout/administrative censoring times and the values of θ(u,δ) are discrete, we have a PMM (Little, 1993; Fitzmaurice and Laird, 2000). When the dropout/administrative censoring times are continuous and θ(u,δ) are polynomial functions, we have the CLM (Wu and Bailey, 1989). The VCMs by Hogan generalized the CLM for continuous outcome data by allowing the mean parameters to be unspecified smooth functions. Unlike in Hogan, our approach handles the administrative censoring differently from other outcome-related dropout. For example, in the HERS analysis reported in Section 4, we assume that θ1(u) are unspecified smooth functions for the dropped-out individuals and θ0(u) are constants for the administratively censored individuals. Furthermore, we extend the work in Hogan by allowing all model parameters to depend on u and accommodating binary outcomes.
In a linear mixed model (LMM) for the HERS depression data introduced in Section 1, “missingness at random” (MAR) is assumed such that the conditional distribution of missing CES-D scores given the observed ones for those who remained in the study at u is the same as the corresponding conditional distribution for those who left the study at u (Molenberghs), that is,
\[ f(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, x, U > u) = f(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, x, U = u), \]
where y_obs and y_mis denote the CES-D scores up to and after u, respectively. For our VCM, if the marginal distribution of {Y(t)∣x(t)} is of interest, we assume that conditional on u and the covariates, the outcome distribution after u can be characterized by the same parameters in the distribution for the observed data. For example, the time trend of the CES-D score estimated from the observed data can be extrapolated for the missing CES-D scores beyond u up to the study end.
Neither the assumption in the LMM nor the one in the VCM can be verified from the observed data. One advantage of the pattern-mixture modeling approach is that the extrapolation of the missing data is transparent, which makes the substantive critique and empirical sensitivity analysis relatively straightforward (Little and Wang, 1996; Daniels and Hogan, 2000; Rotnitzky). For example, in the HERS analysis reported in Section 4, we can assume a different time slope for the CES-D scores beyond u. The sensitivity parameters would be the difference between the time slopes before u and beyond u, which cannot be identified by the observed data. Then, we can recompute the quantities of interest (such as the marginal CES-D profiles) to check their sensitivity to the nonidentifiable parameters. Because the unidentifiable part of the model is distinguished from the part identifiable from the observed data, in the VCM the inferences based on the observed data remain the same regardless of the sensitivity parameters.
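The sensitivity analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expected_profile`, the parameter `slope_shift` (the difference between the time slopes before and beyond u), and the choice to keep the profile continuous at u are all hypothetical and are not taken from the HERS analysis.

```python
import numpy as np

def expected_profile(times, intercept, slope, u, slope_shift=0.0):
    """Expected outcome over time for one dropout pattern (illustrative).

    Up to the dropout time u, the identified line intercept + slope * t is used.
    Beyond u, the slope is slope + slope_shift, where slope_shift is the
    sensitivity parameter that the observed data cannot identify.
    Assumption: the profile is continuous at u.
    """
    t = np.asarray(times, dtype=float)
    before = intercept + slope * t
    after = intercept + slope * u + (slope + slope_shift) * (t - u)
    return np.where(t <= u, before, after)

times = np.array([0.0, 1.0, 2.0, 3.0])
# slope_shift = 0 extrapolates the observed time trend beyond u.
base = expected_profile(times, intercept=20.0, slope=1.0, u=1.0)
# A nonzero slope_shift changes only the part of the profile beyond u.
shifted = expected_profile(times, intercept=20.0, slope=1.0, u=1.0, slope_shift=0.5)
```

Varying `slope_shift` over a plausible range and recomputing the marginal profile shows how strongly the conclusions lean on the extrapolation; the fitted part of the profile (t ≤ u) is unaffected.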
A model for continuous longitudinal data
Hogan developed varying-coefficient LMMs for continuous longitudinal data, where the mean parameters were allowed to depend on the dropout time, but the variance components were constants. In addition, they did not distinguish administrative censoring from other outcome-related dropouts. We generalize their model by allowing variance-component parameters to vary by the dropout/administrative censoring times.
Recall that for the ith subject, Y_i is an n_i × 1 continuous outcome vector, U_i = u is the observed dropout/administrative censoring time, and δ_i = 0,1 is the indicator for dropout. Let x_i = {x_i(t_{i1})^T,…, x_i(t_{in_i})^T}^T be the n_i × p exogenous covariate matrix associated with the fixed effects and z_i = {z_i(t_{i1})^T,…, z_i(t_{in_i})^T}^T be an n_i × q covariate matrix associated with the random effects. Conditional on (U_i, δ_i), we assume that
\[ (Y_i \mid b_i, U_i = u, \delta_i = \delta) \sim N\big[x_i\beta_\delta(u) + z_i b_i,\; R_i\{\phi_\delta(u)\}\big], \qquad (b_i \mid U_i = u, \delta_i = \delta) \sim N\big[0,\; G\{\phi_\delta(u)\}\big], \tag{2.1} \]
with β_δ(u) = δβ1(u) + (1 − δ)β0(u) and φ_δ(u) = δφ1(u) + (1 − δ)φ0(u), where β1(u) and β0(u) are two p × 1 vectors of unknown regression coefficient functions and φ1(u) and φ0(u) are the vectors of unknown variance-component functions.
We use a Cholesky decomposition for modeling the variance components as the functions of u (Daniels and Zhao, 2003). Other formulations using multivariate normal distributions are possible; this one is chosen for convenience. Details are given in the supplementary material available at http://www.biostatistics.oxfordjournals.org.
In the HERS analysis reported in Section 4.1, we assume that β1(u) and φ1(u) are unspecified smooth functions that are modeled by penalized splines. Because the administrative censoring times in these data are similar, we assume that β0(u) and φ0(u) are constant functions. In practice, when study participants have staggered entry and the administratively censored individuals are a heterogeneous group with respect to the outcome distribution, we could also allow β0(u) and φ0(u) to be unspecified smooth functions. Note that with variance components varying by the dropout/administrative censoring times, we need to pay attention to the effective number of parameters that are incorporated in the VCM (Spiegelhalter). If the results for variance-component functions suggest particular parametric forms, we could reduce the model complexity accordingly.
A model for binary longitudinal data
To illustrate the VCM for binary longitudinal data, we build on MTMs (Heagerty, 2002), where the mean and serial dependence structures, and their dependence on the dropout process, are separately specified.
Specifically, let μ_{ij}^M(u) = E{Y_i(t_{ij}) ∣ x_i(t_{ij}), U_i = u, δ_i} (j = 1,…, n_i) and
\[ g\{\mu_{ij}^{M}(u)\} = x_i(t_{ij})\{\delta_i\beta_1(u) + (1-\delta_i)\beta_0(u)\}, \tag{2.2} \]
where g(·) is a link function, x_i(t_{ij}) is a 1 × p covariate vector, and β1(u) and β0(u) are two p × 1 vectors of unknown regression coefficient functions.
Serial dependence between the outcomes within individuals follows an rth-order Markov model; that is,
\[ f\{Y_i(t_{ij}) \mid Y_i(t_{i,j-1}),\dots,Y_i(t_{i1}), U_i = u, \delta_i\} = f\{Y_i(t_{ij}) \mid Y_i(t_{i,j-1}),\dots,Y_i(t_{i,j-r}), U_i = u, \delta_i\}. \]
The dependence structure is modeled via
\[ \operatorname{logit}\{\mu_{ij}^{C}(u)\} = \Delta_{ij}(u) + \sum_{l=1}^{r} \gamma_{ij,l}(u)\, Y_i(t_{i,j-l}), \tag{2.3} \]
where μ_{ij}^C(u) = E{Y_i(t_{ij}) ∣ Y_i(t_{i,j−1}),…, Y_i(t_{i,j−r}), U_i = u, δ_i} is the conditional mean, although in principle any valid link function can be used (Heagerty, 2002). Note that, for simplicity, the dependence of Δ_{ij}(u) and γ_{ij,l}(u) on x_i(t_{ij}) is suppressed for now. The log-odds ratios γ_{ij,l}(u) measure the dependence between Y_i(t_{ij}) and Y_i(t_{i,j−1}),…, Y_i(t_{i,j−r}) among those with U_i = u and δ_i = δ; the intercept Δ_{ij}(u) is determined such that the mean structure in (2.2) and the Markov dependence structure in (2.3) are simultaneously satisfied (Azzalini, 1994; Heagerty, 2002).
We further assume that the serial dependence γ_{ij,l}(u) can be modeled via
\[ \gamma_{ij,l}(u) = z_{ij,l}\{\delta_i\alpha_{1l}(u) + (1-\delta_i)\alpha_{0l}(u)\}, \tag{2.4} \]
where z_{ij,l} is a subset of the covariates x_i(t_{ij}), while α_{1l}(u) and α_{0l}(u) are two d × 1 (l = 1,…, r) vectors of unknown functions of u. For example, if r = 1 and γ_{ij,1}(u) is the sum of an intercept function of u and a second function of u multiplied by Z_i, where Z_i is a treatment group indicator, individuals for whom Z_i = 1 are allowed to have different serial dependence compared with individuals for whom Z_i = 0, given that they drop out at u.
As with the VCM for the continuous HERS depression data, we assume that each element of the MTM parameters is an unspecified smooth function of u for individuals who dropped out of the HERS, while for administratively censored individuals, we assume that the MTM parameters are constant in u.
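To make concrete how the intercept Δ is "determined" by the marginal mean and the Markov dependence, the following Python sketch solves for it in the first-order case with a logit link. It assumes that, for r = 1, the marginal mean at the current time must equal the conditional mean averaged over the previous outcome. The function names `expit` and `mtm1_intercept` and the bisection bounds are hypothetical choices for this illustration, not part of the authors' WinBUGS implementation.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def mtm1_intercept(mu_m, mu_m_prev, gamma, lo=-30.0, hi=30.0, tol=1e-12):
    """Solve for the intercept Delta of a first-order MTM by bisection.

    Constraint (first-order case):
        mu_m = expit(Delta + gamma) * mu_m_prev + expit(Delta) * (1 - mu_m_prev),
    where mu_m and mu_m_prev are the marginal means at the current and previous
    measurement times and gamma is the log-odds ratio for serial dependence.
    The right-hand side is strictly increasing in Delta, so the root is unique.
    """
    def gap(delta):
        implied = expit(delta + gamma) * mu_m_prev + expit(delta) * (1.0 - mu_m_prev)
        return implied - mu_m

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

delta = mtm1_intercept(mu_m=0.4, mu_m_prev=0.5, gamma=1.2)
implied = expit(delta + 1.2) * 0.5 + expit(delta) * 0.5
```

When gamma is zero the outcomes are serially independent and the intercept reduces to the logit of the marginal mean, which gives a quick sanity check on the solver.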
ESTIMATION
Joint likelihood
Suppose π indexes the dropout/administrative censoring time distribution f(u,δ∣x;π), and let Θ denote the set of parameters in the VCM for the outcome process. The likelihood from the ith individual can be partitioned as
\[ f(y_i, u_i, \delta_i \mid x_i; \Theta, \pi) = f(y_i \mid x_i, u_i, \delta_i; \Theta)\, f(u_i, \delta_i \mid x_i; \pi). \]
If the priors for π and Θ are independent, it follows that π is not a part of the posterior for Θ. The inference for f(u,δ∣x;π) can be based on the marginal likelihood ∏_{i=1}^{N} f(u_i, δ_i ∣ x_i; π), whereas the inference for Θ is based on ∏_{i=1}^{N} f(y_i ∣ x_i, u_i, δ_i; Θ).
Likelihood for the model with continuous data.
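In the HERS analysis reported in Section 4.1, the set of parameters Θ here includes those indexing the smooth functions for the regression coefficients and variance components when the dropout time is observed and the parameter vector (constant in u) when the dropout time is administratively censored. Using the same notation as in Section 2.1, and writing β_{δ_i}(u_i) and φ_{δ_i}(u_i) for the regression coefficients and variance-component parameters of the ith individual's group evaluated at u_i, the log-likelihood associated with the continuous outcome process can be written, up to an additive constant, as
\[ \ell(\Theta) = -\frac{1}{2}\sum_{i=1}^{N}\Big[\log|V_i| + \{y_i - x_i\beta_{\delta_i}(u_i)\}^{T} V_i^{-1} \{y_i - x_i\beta_{\delta_i}(u_i)\}\Big], \]
where V_i = z_i G{φ_{δ_i}(u_i)} z_i′ + R_i{φ_{δ_i}(u_i)}.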
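As a hedged illustration of one individual's contribution to this likelihood, the NumPy sketch below evaluates the multivariate normal log-density with marginal covariance V = zGz′ + R. The function name `gaussian_loglik` and its argument layout are assumptions made for this example; in the VCM the inputs `beta`, `G`, and `R` would be the coefficient and variance-component functions already evaluated at that individual's dropout or censoring time.

```python
import numpy as np

def gaussian_loglik(y, x, z, beta, G, R):
    """Marginal Gaussian log-likelihood for one individual (illustrative).

    y: (n,) outcomes; x: (n, p) fixed-effect design; z: (n, q) random-effect design;
    beta: (p,) coefficients; G: (q, q) random-effect covariance; R: (n, n) error covariance.
    """
    V = z @ G @ z.T + R                      # marginal covariance V = z G z' + R
    resid = y - x @ beta
    _, logdet = np.linalg.slogdet(V)         # log|V|, numerically stable
    quad = resid @ np.linalg.solve(V, resid) # resid' V^{-1} resid without inverting V
    n = y.shape[0]
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# Tiny check case: one observation, random intercept variance 1, error variance 1.
ll = gaussian_loglik(
    y=np.array([1.0]), x=np.array([[1.0]]), z=np.array([[1.0]]),
    beta=np.array([0.0]), G=np.array([[1.0]]), R=np.array([[1.0]]),
)
```

Solving the linear system instead of forming the inverse of V is a common numerical choice and does not change the value of the likelihood.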
Likelihood for the model with binary data.
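In the HERS analysis reported in Section 4.2, we assume a first-order serial dependence structure. The likelihood contribution for the ith individual corresponding to the model in (2.2–2.4) can be written as
\[ L_i(\Theta) = \{\mu_{i1}^{M}(u_i)\}^{y_{i1}}\{1-\mu_{i1}^{M}(u_i)\}^{1-y_{i1}} \prod_{j=2}^{n_i} \{\mu_{ij}^{C}(u_i)\}^{y_{ij}}\{1-\mu_{ij}^{C}(u_i)\}^{1-y_{ij}}, \]
where μ_{ij}^M and μ_{ij}^C denote the marginal mean in (2.2) and the conditional mean in (2.3), respectively, and y_{ij} is the observed value of Y_i(t_{ij}).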
Bayesian penalized splines
The smooth functions in (2.1) and (2.2–2.4) are modeled by Bayesian penalized splines with low-rank thin-plate spline bases (Ruppert; Crainiceanu).
The low-rank thin-plate spline representation of a scalar smooth function θ(·) is
\[ \theta(u) = \xi_0 + \xi_1 u + \sum_{k=1}^{K} \psi_k\,|u - \nu_k|^3, \tag{3.1} \]
where η = (ξ_0, ξ_1, ψ_1,…, ψ_K)^T is a vector of regression coefficients and ν_1 < ⋯ < ν_K are fixed knots. We set ν_k at the k/(K + 1) sample quantile of the u_i's (Ruppert, 2002; Ruppert; Crainiceanu). Let ξ = (ξ_0, ξ_1)^T, ψ = (ψ_1,…, ψ_K)^T, U_1 = (1, u), U_2 = (|u − ν_1|³,…, |u − ν_K|³), and Ω be a K × K matrix whose (l,k)th entry is |ν_l − ν_k|³. Using the reparameterization ψ* = Ω^{1/2}ψ and U_2* = U_2 Ω^{−1/2}, (3.1) can be rewritten as θ(u) = U_1 ξ + U_2* ψ*.
In the HERS analysis reported in Section 4, we assign to ξ independent normal priors with mean zero and large variance and to ψ* the prior N(0, λ·I), where I is a K × K identity matrix. Estimating the smoothing parameter λ is similar to estimating variance components in Bayesian hierarchical models (Gelman, 2006), and the curve estimation by penalized splines can be sensitive to the choice of prior for λ. Crainiceanu discussed this issue and found that inverse-Gamma priors can be used in practice when certain conditions are met such that the posterior inference of λ is insensitive to the hyperparameters in the prior for λ. In our applications, we use inverse-Gamma priors for λ and the estimated curves fit the observed data reasonably well. Additional analyses using Uniform priors for λ^{1/2} give similar results for curve estimation. Therefore, we only present the results with inverse-Gamma priors for λ.
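The construction of the spline design matrices can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function name `thin_plate_design` is hypothetical, and the matrix square root of Ω is taken through a singular value decomposition, which is one common device in the penalized-spline literature and is an assumption here rather than something stated in the text.

```python
import numpy as np

def thin_plate_design(u, num_knots):
    """Low-rank thin-plate spline design matrices for a scalar covariate u (illustrative).

    Returns (U1, Z, knots): U1 is the (n, 2) unpenalized part [1, u], Z is the
    (n, K) reparameterized penalized part, and knots are the K quantile knots.
    """
    u = np.asarray(u, dtype=float)
    # Knot k is placed at the k / (K + 1) sample quantile of u.
    probs = np.arange(1, num_knots + 1) / (num_knots + 1)
    knots = np.quantile(u, probs)
    U1 = np.column_stack([np.ones_like(u), u])
    U2 = np.abs(u[:, None] - knots[None, :]) ** 3        # |u - nu_k|^3
    Omega = np.abs(knots[:, None] - knots[None, :]) ** 3  # |nu_l - nu_k|^3
    # Assumption: SVD-based square root of Omega (Omega is symmetric but not
    # positive definite, so this is a surrogate for Omega^{1/2}).
    left, d, right_t = np.linalg.svd(Omega)
    sqrt_omega = left @ np.diag(np.sqrt(d)) @ right_t
    # Z = U2 * sqrt_omega^{-T}, computed with a linear solve.
    Z = np.linalg.solve(sqrt_omega, U2.T).T
    return U1, Z, knots

u_obs = np.linspace(0.0, 5.0, 50)   # made-up dropout times for illustration
U1, Z, knots = thin_plate_design(u_obs, num_knots=4)
```

With this reparameterization the penalized coefficients can be given independent N(0, λ) priors, which is what makes the model convenient to fit with standard MCMC software.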
Bayesian bootstrap
In our VCM approach, we leave the dropout/administrative censoring time distribution f(u,δ∣x;π) completely unspecified and use Rubin's (1981) BB (Kim) to obtain the posterior for P(U = u_i, δ = δ_i ∣ x).
We now briefly describe the BB procedure. Suppose 𝒰 = (U_1,…, U_n) is a random sample from an unknown distribution. For simplicity, we assume that there are no ties in 𝒰. The BB posterior for π_i = P(U = U_i) (i = 1,…, n, with ∑π_i = 1) can be obtained by
\[ p(\pi_1,\dots,\pi_n \mid \mathcal{U}) \propto L(\pi_1,\dots,\pi_n; \mathcal{U})\, p(\pi_1,\dots,\pi_n). \]
Because there is only one observation at each U_i, the empirical likelihood is given by
\[ L(\pi_1,\dots,\pi_n; \mathcal{U}) = \prod_{i=1}^{n} \pi_i. \]
Using a noninformative prior ∏π_i^{−1} for (π_1,…, π_n), we have Rubin's BB posterior
\[ p(\pi_1,\dots,\pi_n \mid \mathcal{U}) \propto \prod_{i=1}^{n} \pi_i^{1-1}, \]
that is, a Dirichlet(1,…, 1) distribution. In the HERS analysis reported in Section 4, at each iteration of the MCMC we then simulate P(U = u_i, δ = δ_i ∣ x) from Dirichlet(1,…, 1) for each combination of the discrete covariates.
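A draw from the BB posterior is simply a vector of Dirichlet(1,…, 1) weights over the observed times. The short NumPy sketch below illustrates this; the function name `bayesian_bootstrap_weights` and the sample sizes are made up for the example.

```python
import numpy as np

def bayesian_bootstrap_weights(n_obs, n_draws, rng):
    """Draw Bayesian bootstrap weights: each row is one Dirichlet(1, ..., 1) draw
    over n_obs observed dropout/censoring times (no ties assumed)."""
    return rng.dirichlet(np.ones(n_obs), size=n_draws)

rng = np.random.default_rng(0)
weights = bayesian_bootstrap_weights(n_obs=5, n_draws=1000, rng=rng)
```

Each row of `weights` is nonnegative and sums to one, so it can be read as one posterior draw of the probabilities attached to the observed times within a covariate stratum.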
Summarizing marginal covariate effects
To obtain inference on the marginal mean E(Y∣x), the empirical averages ∑_{i=1}^{N*} P(U = u_i, δ = δ_i ∣ x) E(Y ∣ x, u_i, δ_i) can be computed using the posterior samples of P(U = u_i, δ = δ_i ∣ x) and E(Y ∣ x, u_i, δ_i) from the VCM, where N* is the sample size corresponding to a specific combination of the discrete covariate values.
For example, in (2.1), when the identity link is used for modeling the mean structure and f(u,δ∣x) = f(u,δ), the marginal covariate effects can be approximated by ∑_{i=1}^{N} P(U = u_i, δ = δ_i) β_{δ_i}(u_i). However, when other link functions are used for the mean structure and/or f(u,δ∣x) ≠ f(u,δ), the marginal covariate effects might not be readily available. Here, the effect of covariate difference x − x′ is
\[ E(Y \mid x) - E(Y \mid x') = E_{U,\delta \mid x}\big[g^{-1}\{x\beta_{\delta}(U)\}\big] - E_{U,\delta \mid x'}\big[g^{-1}\{x'\beta_{\delta}(U)\}\big], \]
which cannot be simplified to (x − x′)E{β(u)} because of its dependence on x. In a simple scenario with treatment groups and measurement times as the covariates, we can compute ∑_{i=1}^{N*} P(U = u_i, δ = δ_i ∣ x) E(Y ∣ x, u_i, δ_i) and plot summaries of the posteriors to demonstrate the marginal covariate effects. For other more complicated situations with many confounders or a number of quantitative covariates of interest, a simple summary of the marginal effects in PMMs might not be immediately obtainable (Fitzmaurice and Laird, 2000).
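The averaging step can be sketched as below: at each posterior draw, the conditional means for the individuals in one covariate stratum are combined with a fresh set of BB weights. This is an illustrative NumPy sketch only; the function name `marginal_mean_draws` and the input layout (one row per MCMC draw, one column per individual) are assumptions, and the conditional means would in practice come from the fitted VCM.

```python
import numpy as np

def marginal_mean_draws(cond_mean_draws, rng):
    """Posterior draws of a marginal mean within one covariate stratum (illustrative).

    cond_mean_draws: (S, n) array; row s holds E(Y | x, u_i, delta_i) for the n
    individuals in the stratum, evaluated at MCMC draw s.
    Returns an (S,) array: each entry is the weighted average of row s using an
    independent Dirichlet(1, ..., 1) weight vector (Bayesian bootstrap).
    """
    cond = np.asarray(cond_mean_draws, dtype=float)
    n_draws, n_obs = cond.shape
    weights = rng.dirichlet(np.ones(n_obs), size=n_draws)
    return np.sum(weights * cond, axis=1)

rng = np.random.default_rng(1)
# Made-up conditional means: 200 draws for 6 individuals.
fake_cond = rng.normal(loc=20.0, scale=2.0, size=(200, 6))
draws = marginal_mean_draws(fake_cond, rng)
```

Summaries of `draws` (posterior mean, 2.5% and 97.5% quantiles) then reflect both the uncertainty in the outcome model and the variability of the observed dropout/administrative censoring times.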
Markov Chain Monte Carlo
The prior specification for Bayesian penalized splines is discussed in Section 3.2. For the constant parameter vector in the administrative censoring group, independent vague normal priors with mean zero and large variance are assigned. We use MCMC to sample from the posterior distributions; summary statistics such as posterior means and 95% credible bands are then used for inference. The MCMC is implemented in the WinBUGS package (version 1.4.1) and its development interface (WBDev; Spiegelhalter ). The programs for the HERS analysis reported in Section 4 are provided in the supplementary material available at Biostatistics online.
APPLICATION: THE HERS STUDY
Our goal is to describe the depression changes over time by baseline characteristics, such as race (Black, White, Latina including others) and baseline CD4 count (CD4 > 200), for the 753 women who did not suffer HIV-related deaths in the HERS. We first present the analysis using the continuous CES-D data and then analyze the binary data using the cutoff CES-D ≥ 16 to define clinical depression.
Continuous CES-D data
Models.
We fit 3 models. The first is a LMM assuming that (Y_ij ∣ b_i, X_ij, Z_ij) ∼ N(X_ij β + Z_ij b_i, τ²), where Y_ij is the CES-D score at time t_ij; X_ij is the covariate vector that includes intercept, race, baseline CD4 count, time, and the interaction between time and baseline CD4 count; and Z_ij is the covariate vector associated with a random intercept and a random time slope b_i = (b_i0, b_i1)^T. In addition, we fit 2 VCMs. The first has unspecified smooth functions of u in the mean structure only (VCM1). Specifically, it is assumed that (Y_ij ∣ b_i, X_ij, Z_ij, u_i, δ_i) ∼ N(X_ij β_{δ_i}(u_i) + Z_ij b_i, τ²), where β0(u) = β0 is a vector that is constant in u for the administrative censoring group. The second VCM also includes the variance components as smooth functions of u (VCM2). Further details about variance-component parametrization, prior specification as well as posterior inference can be found in the supplementary material available at Biostatistics online.
Results.
We first focus on the results from VCM2. Additional results from the LMM and VCM1 as well as an example of sensitivity analysis based on VCM2 can be found in the supplementary material available at Biostatistics online. Figure 2 gives the results for β1(u) and β0. The intercept and race effects are fairly constant over u. The main effect of baseline CD4 count is decreasing as u increases. The main effect of time has a downward trend toward zero, which suggests that for the group with baseline CD4 ≤ 200 earlier dropout was associated with steeper change in expected CES-D scores over time. The interaction between time and baseline CD4 count increases toward zero, which shows that the positive time slopes for the group with baseline CD4 > 200 are less steep than those with baseline CD4 ≤ 200. Overall, we expect that VCM2 will adjust the expected CES-D profiles upward and the adjustment might differ between the baseline CD4 groups.
Fig. 2.
Estimated smooth functions of the observed dropout times in the mean structure from VCM2 for the continuous CES-D data in the HERS. Gray shades are the pointwise 95% credible bands and dashed lines are the corresponding estimates of the regression coefficients β0 in the administrative censoring group.
Figure 3 shows the estimated smooth functions for the variance components. Since later dropouts are generally associated with more observations within patients, we would expect that the estimated variability of the random intercepts, random slopes, and residual errors decreases as u increases. As seen from Figure 3, this is true for the estimated random-slope standard deviation (SD) and the error SD; but the estimated random-intercept SD increases as u increases. This upward trend for random-intercept SD suggests that those early dropouts in the HERS might be a more homogeneous group in terms of their baseline CES-D levels. Overall, all estimated variance components are not constant over u. In fact, by allowing the variance components to vary by u, the within-individual correlation structure in VCM2 is different from the one in VCM1 (see the supplementary material available at Biostatistics online for variance-component estimates from the LMM and VCM1). It is well known that with complete data and likelihood-based approaches, properly modeling the within-individual correlation structure can affect the variability estimates more than the point estimates of the mean regression coefficients (Diggle). However, with missing data, even point estimates can be biased if the correlation structure is not carefully modeled (Daniels and Hogan, 2008). In our case, because of the apparent dependence between variance components and dropout times in the observed data, we would expect that the 2 VCMs might provide different point estimates and variability estimates for the marginal covariate effects.
Fig. 3.
Estimated smooth functions of the observed dropout times for the variance components from VCM2 for the continuous CES-D data in the HERS. Gray shades are the pointwise 95% credible bands and dashed lines are the corresponding estimates of the variance components in the administrative censoring group.
Table 1 gives the results for the intercept and the time slope in estimated CES-D profile by race and baseline CD4 count. The intercept estimates are close across all models, which is expected because in early study period the influence of dropout is minimal. However, both VCMs adjust the time slope estimates upward compared with the LMM, where for the group with baseline CD4 ≤ 200, the adjustment is larger and the time slopes are changed completely to be positive. Therefore, without taking into account the dropout process such as in the LMM, we might incorrectly conclude that both baseline CD4 groups had downward trends for the CES-D scores over time. Further, both VCMs give similar time slope estimates for the group with baseline CD4 > 200, but the point estimates and variability estimates of the time slopes in the group with baseline CD4 ≤ 200 differ between the VCMs. This might be explained by the different levels of missingness between the baseline CD4 groups. Allowing the variance components to vary by u in VCM2 therefore might have larger impact on the point and variability estimates for the group with baseline CD4 ≤ 200.
Table 1.
Estimated intercept and time slope (posterior mean and 95% credible interval) in the expected linear CES-D profiles (by race and baseline CD4 groups) under 3 different models for the continuous CES-D data in the HERS
            LMM                                   VCM1                                  VCM2
            Intercept          Slope              Intercept          Slope              Intercept          Slope
CD4 ≤ 200
  Latina    22.6 (19.8, 25.3)  −1.3 (−3.0, 0.2)   22.5 (19.5, 25.5)  2.7 (−2.6, 14.7)   22.6 (19.8, 25.4)  1.9 (−3.3, 10.3)
  Black     18.6 (16.2, 21.1)  −1.3 (−3.0, 0.2)   18.3 (15.5, 21.0)  1.5 (−2.3, 8.7)    18.4 (15.9, 21.0)  1.0 (−2.8, 6.3)
  White     20.5 (17.8, 23.4)  −1.3 (−3.0, 0.2)   20.1 (16.6, 23.6)  2.5 (−2.8, 16.4)   20.1 (16.7, 23.5)  1.6 (−3.5, 10.7)
CD4 > 200
  Latina    24.3 (22.6, 26.1)  −2.1 (−2.6, −1.5)  24.5 (22.7, 26.3)  −1.0 (−3.0, 1.9)   24.5 (22.6, 26.3)  −1.0 (−3.0, 2.0)
  Black     20.4 (19.3, 21.4)  −2.1 (−2.6, −1.5)  20.3 (19.2, 21.4)  −1.4 (−2.6, 0.2)   20.3 (19.2, 21.4)  −1.4 (−2.7, 0.3)
  White     22.3 (20.7, 23.9)  −2.1 (−2.6, −1.5)  22.4 (20.6, 24.1)  −1.4 (−2.5, 0.0)   22.2 (20.6, 24.0)  −1.4 (−2.6, 0.1)
In summary, regardless of baseline CD4 count, we observed that Whites and Latinas (including others) had larger CES-D scores over time than Blacks. For all races, the expected CES-D scores for the patients with baseline CD4 ≤ 200 were increasing over time, while the CES-D scores for those with baseline CD4 > 200 were decreasing over time.
Binary CES-D data
First we fit an MTM(1) with the same set of covariates as in Section 4.1. The marginal mean of depression follows logit(μ) = Xβ, and the dependence structure is assumed to follow a first-order Markov model with constant serial dependence α, logit(μ) = Δ + α·y, where μ now denotes the conditional mean given the previous response y. We then fit the varying-coefficient MTM(1) as in Section 2.2. The mean structure for the dropout group follows the varying-coefficient specification given there, while the dependence structure follows logit{μ(u)} = Δ + α(u)·y. Further details are provided in the supplementary material available at Biostatistics online.
For individuals with observed dropout times, Figure 4 presents the estimated smooth functions. The main effect for baseline CD4 count shows a downward trend over u. The main effect for time decreases to approximately zero as u increases, which again suggests that earlier dropout was associated with a larger time slope in depression probability for the group with baseline CD4 ≤ 200. The interaction between time and baseline CD4 count increases toward zero, which suggests that the time slopes for the group with baseline CD4 > 200 vary less over u. Overall, we expect that the VCM could adjust the marginal probability profiles of depression upward for the group with baseline CD4 ≤ 200. The within-individual serial dependence is positive and increases slightly as u increases. Note that in Figure 4 the corresponding estimates from the administrative censoring group are close to those at the right boundary of the observed dropout times.
Table 2 presents the estimated marginal covariate effects from the fitted MTM(1) assuming MAR. The estimated interaction between time and baseline CD4 count is positive, which means that, regardless of race, the group with baseline CD4 ≤ 200 had a steeper decline in depression prevalence over time. Based on substantive knowledge and the results in Section 4.1, this is not sensible and may be an artefact of selection bias due to informative dropout.
Marginal probability profiles estimated from the VCM by race and baseline CD4 count are presented in Figure 4 of the supplementary material available at Biostatistics online. For the group with baseline CD4 ≤ 200, the VCM clearly adjusts the marginal probability of depression, and the downward trends shown in the MTM(1) under MAR are moved upward. The adjustment for the group with baseline CD4 > 200 is minimal. As a result, the estimated interaction between time and baseline CD4 count becomes negative in the VCM, which is shown by the difference between marginal probability profiles in Figures 5 and 6 of the supplementary material available at Biostatistics online. Note that after averaging over the dropout/administrative censoring time distribution, the effects of race, baseline CD4 count, and time are no longer independent. Results for the administrative censoring group are also given in the supplementary material available at Biostatistics online.
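In an MTM(1), the intercept Δ of the transition model is not a free parameter: it is fixed by requiring that the conditional model, averaged over the previous response, reproduces the marginal mean. The sketch below solves that constraint for a single time point by bisection. It assumes the usual first-order constraint μ_now = expit(Δ)(1 − μ_prev) + expit(Δ + α)μ_prev, which may differ in detail from the specification in Section 2.2. The function names and numeric inputs are hypothetical.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def solve_delta(mu_now, mu_prev, alpha, lo=-30.0, hi=30.0, tol=1e-12):
    """Find Delta such that the transition model
        logit P(Y_now = 1 | Y_prev = y) = Delta + alpha * y
    averages to the marginal probability mu_now when
    P(Y_prev = 1) = mu_prev.  The implied marginal probability is
    increasing in Delta, so bisection is sufficient."""
    def implied(delta):
        return (expit(delta) * (1.0 - mu_prev)
                + expit(delta + alpha) * mu_prev)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if implied(mid) < mu_now:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical values: marginal depression prevalence at two
# consecutive visits and a positive serial dependence.
mu_prev, mu_now, alpha = 0.45, 0.40, 2.0
delta = solve_delta(mu_now, mu_prev, alpha)

# Transition probabilities implied by the solved intercept.
p_given_0 = expit(delta)          # P(depressed now | not depressed before)
p_given_1 = expit(delta + alpha)  # P(depressed now | depressed before)
print(delta, p_given_0, p_given_1)
```

In a varying-coefficient version, the same solve would be repeated with the marginal mean and α evaluated at each dropout time u.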
Table 2.
Estimated covariate effects (posterior mean and 95% credible interval) from the fitted MTM(1) (assuming MAR) for the binary CES-D data in the HERS
LCL, lower credible limit; UCL, upper credible limit.
Estimated smooth functions of the observed dropout times (on logit scale) in varying-coefficient MTM(1) for the binary CES-D data in the HERS; these include the coefficients for baseline CD4 > 200 (β3(u)), the time (β4(u)), the interaction between time and baseline CD4 count (β5(u)), and the serial dependence (α(u)). Gray shades are the pointwise 95% credible bands and dashed lines are the corresponding estimates in the administrative censoring group.
In summary, regardless of baseline CD4 count, we observed that Latinas (including others) had higher prevalence of depression over time than Blacks and Whites. Within each race group, the patients with baseline CD4 ≤ 200 had depression prevalence over time similar to that of the patients with baseline CD4 > 200; unlike the results based on continuous CES-D data, the depression prevalence remained relatively constant over time for all race and baseline CD4 groups. It should be noted that the analyses based on continuous and binary CES-D data focused on different scientific questions. With continuous CES-D data, we are interested in the covariate effects on the absolute levels of CES-D, while with binary CES-D data the targets are the covariate effects on the prevalence of clinical depression. We have seen that in both cases the race effects are similar, but the baseline CD4 effects differ.
DISCUSSION
We have proposed a Bayesian VCM approach for longitudinal data with continuous-time informative dropout. Our framework assumes that the parameters in the outcome process depend on the dropout time through unspecified functions. Administratively censored dropout times are handled separately, and no modeling of the continuous dropout time distribution is needed in order to obtain inference for marginal covariate effects. While the VCM is widely applicable, we used both continuous and binary data from an HIV longitudinal study to show that our approach has the potential to adjust for selection biases induced by early dropouts of poor responders.
Our VCM approach provides a convenient framework for sensitivity analysis because the unidentifiable part of the model can be distinguished from the identifiable part, and for the latter the inferences remain the same regardless of the sensitivity parameters. In our analysis of the HERS depression data, we emphasized that sensitivity analysis should be based on those parameters that cannot be identified by the observed data. More in-depth research on this aspect is needed, building on general sensitivity analysis strategies developed for PMMs (Scharfstein ; Daniels and Hogan, 2000; Molenberghs ). For example, informative priors for sensitivity parameters can be introduced using expert opinions and/or prior elicitation based on previous studies (Lee, 2007).
Appropriate summary of marginal covariate effects is a challenge in the pattern-mixture modeling approach to informative dropout. In practice, we might prefer to specify the marginal covariate effects directly in the model. Thus, approaches to marginalizing PMMs are worth further research (Wilkins and Fitzmaurice, 2006; Wilkins and Fitzmaurice, 2007; Roy and Daniels, 2007). In our ongoing research, we plan to extend the VCM for binary data by separately specifying the marginal model and the conditional model given the dropout/administrative censoring time, with constraints imposed such that both are satisfied simultaneously.
In our VCM, we distinguished administrative censoring from other dropout. In the HERS application, we assumed that for the administrative censoring group the outcome model parameters do not vary with the administrative censoring times but are distinct from the parameters in the dropout group. However, our VCM specification is flexible, and in practice, similar unspecified smooth functions can also be used to capture the heterogeneity within the administrative censoring group with respect to the outcome process. This is particularly useful when study participants have staggered entry and the observed administrative censoring times vary considerably. When there is no dropout, variation in these administrative censoring times is usually ignorable. However, when informative dropout is present, for example, in the context of the HERS analysis, it is possible that the longer a participant stays on a study without dropping out, the less steep the patient's true depression trend is likely to be. In this situation, modeling the relationship between the outcome process and the administrative censoring times would be necessary.
Outcome-related death mixed with dropout is another problem that warrants further research. Because extrapolating the missing data beyond death is inappropriate, instead of modeling the marginal mean of the outcomes, a more meaningful quantity of interest would be the mean of the longitudinal outcomes conditional on being alive (Kurland and Heagerty, 2005). When the survival information is available, we could extrapolate the missing data in the VCM up to the observed survival times for summarizing marginal covariate effects. If survival times are censored, further work on joint modeling is needed.
FUNDING
The National Institutes of Health (R01-AI-50505, R01-HL-79457); the US Centers for Disease Control and Prevention (U64-CCU10675). Funding to pay the Open Access publication charges for this article was provided by Medical Research Council (UK) grant U.1052.00.009.