Abstract
A major challenge for representative longitudinal studies is panel attrition, because some respondents refuse to continue participating across all measurement waves. Depending on the nature of this selection process, statistical inferences based on the observed sample can be biased. Therefore, statistical analyses need to consider a missing-data mechanism. Because each missing-data model hinges on frequently untestable assumptions, sensitivity analyses are indispensable to gauging the robustness of statistical inferences. This article highlights contemporary approaches for applied researchers to acknowledge missing data in longitudinal, multilevel modeling and shows how sensitivity analyses can guide their interpretation. Using a representative sample of N = 13,417 German students, the development of mathematical competence across three years was examined by contrasting seven missing-data models, including listwise deletion, full-information maximum likelihood estimation, inverse probability weighting, multiple imputation, selection models, and pattern mixture models. These analyses identified strong selection effects related to various individual and context factors. Comparative analyses revealed that inverse probability weighting performed rather poorly in growth curve modeling. Moreover, school-specific effects should be acknowledged in missing-data models for educational data. Finally, we demonstrated how sensitivity analyses can be used to gauge the robustness of the identified effects.
Keywords: Competence development; Longitudinal design; Selection bias; Selectivity analysis
Year: 2018 PMID: 29450705 PMCID: PMC6267521 DOI: 10.3758/s13428-018-1021-z
Source DB: PubMed Journal: Behav Res Methods ISSN: 1554-351X
Fig. 1 Example of selection bias in a simulated sample with regression lines, for a response rate of 80%. The correlations between the outcome and the response group were .50 (middle panel) and – .50 (right panel)
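The mechanism shown in Fig. 1 can be reproduced with a short simulation. This is an illustrative sketch, not the authors' code: the .50 correlation and the 80% response rate are taken from the caption, while the sample size and the way response status is generated are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Draw a latent outcome and a correlated response propensity (r = .50),
# mirroring the middle panel of Fig. 1.
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
outcome, propensity = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Keep the 80% of cases with the highest response propensity,
# i.e., a response rate of 80%.
responded = propensity > np.quantile(propensity, 0.20)

full_mean = outcome.mean()
observed_mean = outcome[responded].mean()

# Because participation is positively related to the outcome, the
# responders-only mean overestimates the full-sample mean.
print(f"full sample mean:     {full_mean:.3f}")
print(f"responders-only mean: {observed_mean:.3f}")
```

Flipping the sign of the correlation (the right panel of Fig. 1) biases the observed mean in the opposite direction.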
Fig. 2 Path diagram of the selection and pattern mixture models. The dashed lines mark the selection equation for modeling participation at the second measurement point, indicated by the dummy-coded indicator Ri (A and B), or for determining the membership within one of the two latent dropout classes c (C)
Sensitivity analyses for attrition in longitudinal, multilevel settings
| Missing-Data Model | Individual- and Group-Specific Random Effects | Selection Variables Possible | Dependency on Time Trajectory | Statistical Software (Selected) |
|---|---|---|---|---|
| Full-information maximum likelihood (FIML) | Both^a | yes, for computing the model's correlation matrix^b | overall | Mplus, R (sem, lavaan, OpenMx), Stata (sem) |
| Multivariate imputation by chained equations (MI) | Either individual- or group-specific random effect | yes, in the imputation model | overall | Mplus^c, R (mice), Stata (ice) |
| Inverse probability weighting (WE) | Both | yes, in the response model yielding the weights | overall | Mplus, R^d, Stata |
| Diggle–Kenward selection model (DK) | No individual- or group-specific effects in the selection equation, but both in the analysis model | yes, in the selection model | past and present | Mplus, Stata (gllamm)^e |
| Wu–Carroll selection model (WC) | Individual-specific effect in the selection equation, both in the analysis model | yes, in the selection model | overall | Mplus |
| Pattern mixture model (PM) | No individual- or group-specific effects for assigning latent groups, but both in the analysis model | no | overall | Mplus^e,f |

^a Mplus facilitates the modeling of individual- and group-specific effects, whereas the related R and Stata functions allow only the modeling of individual-specific effects. ^b Only Mplus implements this feature, and solely for single-level models. ^c Mplus offers multiple-imputation analysis options as well; however, the specification of the imputation model is hidden from the user, so it is not possible to implement the chained-regression approach as in mice or ice. ^d So far, R does not explicitly allow inverse probability weights in random-effects models. ^e R can also be used to implement this kind of missing-data model; however, this requires writing one's own estimation routines from scratch, since (up to now) these models are not part of R's officially contributed packages. ^f Stata does not offer a single command for estimating this model, but the related routines may be implemented using Stata and its programming language Mata
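The chained-regression approach referenced in footnote c (mice in R, ice in Stata) can be illustrated with a minimal numpy sketch. This is our own toy example, not the article's analysis: one complete and one incomplete variable, with imputations drawn from a regression fitted on the observed cases.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Two correlated competence-like scores; y2 is missing completely at
# random for roughly 30% of cases (illustrative setup).
y1 = rng.normal(0.0, 1.0, n)
y2 = 0.7 * y1 + rng.normal(0.0, 0.7, n)
y2_obs = np.where(rng.random(n) < 0.30, np.nan, y2)

# One "chained equations" step: regress the incomplete variable on the
# complete one using observed cases, then impute missing values as
# prediction + residual noise. mice/ice cycle such regressions over
# every incomplete variable until the chain stabilizes.
obs = ~np.isnan(y2_obs)
X = np.column_stack([np.ones(obs.sum()), y1[obs]])
beta, *_ = np.linalg.lstsq(X, y2_obs[obs], rcond=None)
resid_sd = np.std(y2_obs[obs] - X @ beta)

mis = ~obs
y2_imp = y2_obs.copy()
y2_imp[mis] = beta[0] + beta[1] * y1[mis] + rng.normal(0.0, resid_sd, mis.sum())

print(f"r(y1, y2) after imputation: {np.corrcoef(y1, y2_imp)[0, 1]:.2f}")
```

Adding residual noise (rather than imputing the bare predictions) is what keeps the variances and correlations of the completed data approximately unbiased; in proper multiple imputation this is repeated over several imputed data sets, with parameter draws for beta as well.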
Means, standard deviations, and bivariate correlations between study variables
|   |   | M | SD | 1. | 2. | 3. | 4. | 5. | 6. | 7. | 8. | 9. | 10. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. | Mathematical competence at grade 9 | 0.037 | 1.281 | | | | | | | | | | |
| 2. | Mathematical competence at grade 12 | 1.081 | 1.111 | .702 | | | | | | | | | |
| 3. | Self-concept | 2.522 | 0.921 | .347 | .405 | | | | | | | | |
| 4. | Reasoning | 8.654 | 2.457 | .494 | .408 | .222 | | | | | | | |
| 5. | Sex^a | 0.497 | 0.500 | – .159 | – .245 | – .263 | – .024 | | | | | | |
| 6. | Migration^b | 0.256 | 0.436 | – .188 | – .160 | – .037 | – .144 | .015 | | | | | |
| 7. | Age (in years) | 14.92 | 0.625 | – .255 | – .221 | – .053 | – .213 | – .094 | .154 | | | | |
| 8. | Assessment mode^c | 0.687 | 0.464 | – .490 | – .361 | – .087 | – .328 | – .071 | .097 | .273 | | | |
| 9. | Basic secondary school^d | 0.238 | 0.426 | – .368 | – .253 | .005 | – .355 | – .069 | .174 | .256 | .377 | | |
| 10. | Intermediate secondary school^d | 0.213 | 0.409 | – .070 | – .170 | – .017 | .033 | – .004 | – .029 | .011 | .351 | – .290 | |
| 11. | Remaining school types^d | 0.200 | 0.400 | – .192 | – .169 | – .032 | – .093 | – .005 | – .012 | .031 | .132 | – .279 | – .260 |

All correlations are significant at p < .001, on the basis of pairwise complete observations. Basic secondary school = "Hauptschule"; intermediate secondary school = "Realschule". ^a Coded as 0 = boys and 1 = girls. ^b Coded as 0 = no migration background and 1 = with migration background. ^c Coded as 1 = tested in schools and 0 = tested individually at home. ^d Dummy-coded with upper secondary school (= "Gymnasium") as the reference category
Fig. 3 Distribution of mathematical competence in grade 9 by dropout group
Logit regression analysis for nonresponse at the second measurement point
|   | Estimate | 95% CI |
|---|---|---|
| Intercept | – 1.941* | [– 2.451, – 1.431] |
| Reasoning | – 0.016 | [– 0.096, 0.064] |
| Self-concept | – 0.079* | [– 0.151, – 0.006] |
| Sex^a | – 0.185* | [– 0.322, – 0.049] |
| Migration^b | 0.039 | [– 0.120, 0.199] |
| Age | – 0.159 | [– 0.291, – 0.027] |
| Assessment mode^c | 5.037* | [4.738, 5.336] |
| Basic secondary school^d | 0.750* | [– 0.043, 1.544] |
| Intermediate secondary school^d | 1.290* | [0.314, 2.217] |
| Remaining school types^d | 0.606* | [– 0.207, 1.418] |
| Competence score | – 0.137* | [– 0.223, – 0.052] |
| Competence score squared | – 0.002 | [– 0.031, 0.027] |
| Random effect | 2.890 | [2.625, 3.182] |

N = 13,417. The dependent variable is dropout (coded as 1 = dropout and 0 = no dropout). ^a Coded as 0 = boys and 1 = girls. ^b Coded as 0 = no migration background and 1 = with migration background. ^c Coded as 0 = tested in school and 1 = tested individually at home. ^d Dummy-coded with upper secondary school (= "Gymnasium") as the reference category. Reasoning and self-concept were z-standardized. *p < .05
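A response model like the one in this table is the input to the WE approach: the fitted dropout probabilities are inverted into weights for the observed cases. The numpy sketch below uses synthetic data with a single covariate; the coefficients, the MAR dropout mechanism, and the Newton–Raphson fitting are our illustrative choices, not the article's specification.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000

# Illustrative MAR setup: dropout probability rises with covariate x,
# and the outcome y also depends on x.
x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)
p_drop = 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * x)))
dropout = rng.random(n) < p_drop

# Fit logit(P(dropout)) = b0 + b1 * x by Newton-Raphson.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (dropout - p)
    hess = X.T @ (X * (p * (1.0 - p))[:, None])
    beta += np.linalg.solve(hess, grad)

# Inverse probability weights for the respondents: 1 / P(respond | x).
p_respond = 1.0 - 1.0 / (1.0 + np.exp(-X @ beta))
resp = ~dropout
weights = 1.0 / p_respond[resp]

naive_mean = y[resp].mean()          # biased: respondents have low x
ipw_mean = np.average(y[resp], weights=weights)
print(f"respondents-only mean: {naive_mean:.3f}  IPW mean: {ipw_mean:.3f}")
```

The weighted mean moves back toward the full-population value because respondents with a high dropout propensity also stand in for the similar cases that were lost; this correction is only as good as the response model, which is one reason the article subjects WE to sensitivity analyses.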
Fig. 4 Estimated coefficients with 95% confidence intervals for the analysis model. LWD = listwise deletion; FIML = full-information maximum likelihood; MI = multivariate imputation via chained equations; WE = inverse probability weighting; DK = Diggle–Kenward selection model; WC = Wu–Carroll selection model; PM1/PM0 = pattern mixture model with two latent classes for all-time participants/dropout cases
Fig. 5 Estimated variance components, with 95% confidence intervals. LWD = listwise deletion; FIML = full-information maximum likelihood; MI = multivariate imputation via chained equations; WE = inverse probability weighting; DK = Diggle–Kenward selection model; WC = Wu–Carroll selection model; PM1/PM0 = pattern mixture model with two latent classes for all-time participants/dropout cases (for identification purposes, the variances are assumed to be equal for PM1 and PM0)