Melissa Middleton, Cattram Nguyen, Margarita Moreno-Betancur, John B Carlin, Katherine J Lee.
Abstract
BACKGROUND: In case-cohort studies, a random subcohort is selected from the inception cohort and acts as the sample of controls for several outcome investigations. Analysis is conducted using only the cases and the subcohort, with inverse probability weighting (IPW) used to account for the unequal sampling probabilities resulting from the study design. Like all epidemiological studies, case-cohort studies are susceptible to missing data. Multiple imputation (MI) has become increasingly popular for addressing missing data in epidemiological studies. It is currently unclear how best to incorporate the weights from a case-cohort analysis in MI procedures used to address missing covariate data.
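The weighting idea described in the background can be illustrated with a minimal sketch. It assumes one common scheme that the abstract does not spell out: every case is retained with probability 1, and non-cases enter the analysis only through the subcohort, which is drawn with a known sampling fraction. The function name `case_cohort_weights` and its arguments are hypothetical.

```python
def case_cohort_weights(is_case, in_subcohort, sampling_fraction):
    """Return one inverse probability weight per cohort member.

    Assumed scheme (not taken from the source): cases are always analysed,
    non-cases are analysed only if sampled into the subcohort.
    """
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    weights = []
    for case, sampled in zip(is_case, in_subcohort):
        if case:
            weights.append(1.0)                      # selected with probability 1
        elif sampled:
            weights.append(1.0 / sampling_fraction)  # stands in for unsampled non-cases
        else:
            weights.append(0.0)                      # not part of the analysis sample
    return weights


# Four cohort members, subcohort sampling fraction of 25%.
print(case_cohort_weights([True, False, False, True],
                          [False, True, False, True],
                          0.25))  # [1.0, 4.0, 0.0, 1.0]
```

Under this scheme each subcohort non-case represents `1 / sampling_fraction` non-cases from the full cohort, which is what a weighted analysis model would use.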
Keywords: Case-cohort study; Inverse probability weighting; Missing data; Multiple imputation; Simulation study; Unequal sampling probability
Year: 2022 PMID: 35369860 PMCID: PMC8978363 DOI: 10.1186/s12874-021-01495-4
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Detailed description of case study variables used during simulation and their distribution within the Barwon Infant Study
| Variable | Variable Type | Label | n (%)* | n (%) missing |
|---|---|---|---|---|
| Food Allergy at 1 year (present) | Binary; Present/Absent | | 61 (7.8) | 288 (26.8) |
| Vitamin D Insufficiency at Birth (present) | Binary; Present/Absent | | 149 (44.5) | 739 (68.8) |
| Ethnicity (Caucasian) | Binary; Caucasian/Not Caucasian | | 772 (72.1) | 3 (0.3) |
| Maternal Vitamin D Supplements Usage (present) | Binary; Present/Absent | | 564 (78.8) | 358 (33.3) |
| Family History of Allergy (present) | Binary; Present/Absent | | 911 (86.1) | 16 (1.5) |
| Number of Siblings | 3-Level Categorical | | | 0 (0.00) |
| None | | | 453 (42.2) | |
| One | | | 383 (35.7) | |
| Two or more | | | 238 (22.2) | |
| Family Pet Ownership (present) | Binary; Present/Absent | | 815 (80.5) | 62 (5.8) |
| Formula Feeding at 6 months# | 3-Level Categorical | | | 189 (17.6) |
| Exclusively Breast Fed | | | 429 (46.6) | |
| Exclusively Formula Fed | | | 320 (34.8) | |
| Mixed Feeding | | | 171 (18.6) | |
| Formula Feeding at 12 months# | 3-Level Categorical | | | 154 (14.3) |
| Exclusively Breast Fed | | | 271 (30.6) | |
| Exclusively Formula Fed | | | 354 (40.0) | |
| Mixed Feeding | | | 260 (29.4) | |
| Maternal Age at Birth | | | 32.1 (4.78) | 3 (0.3) |
| Family SEIFA Classification | 3-Level Categorical | | | 20 (1.9) |
| Low | | | 268 (25.4) | |
| Middle | | | 204 (19.4) | |
| High | | | 582 (55.2) | |
*Mean and standard deviation given for maternal age; percentage given is exclusive of missing data
#Formula feeding variables were not included in the simulation study
SEIFA Socioeconomic index for area
Description of multiple imputation approaches considered to handle missing covariate data
| Method* | Accommodation of Weighting in MI | MI Framework | Label |
|---|---|---|---|
| Complete case | No imputation completed. Analysis applied to observations with complete covariate data. | N/A | |
| Weight only | Imputation models include weights (through the outcome) as a predictor of missingness | FCS | |
| | | MVNI | |
| Weight interactions | Interaction between outcome (proxy for weight) and exposure/covariates included in imputation model through passive imputation (FCS) or ‘just another variable’ (MVNI), in addition to outcome as a predictor. | FCS | |
| | | MVNI | |
| Stratum specific imputation | Covariates imputed separately by weight status | FCS | |
| | | MVNI | |
| Weighted model | Imputation model weighted with inverse probability of selection, outcome included as a predictor. | FCS | |
*All methods involve using multiple imputation to address the missing covariates, excluding the complete case analysis, with a weighted analysis model to address the unequal probabilities and missing exposure
FCS Fully Conditional Specification, MVNI Multivariate Normal Imputation, MI Multiple Imputation
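The "stratum specific imputation" row can be illustrated with a toy sketch: missing covariate values are filled in separately within each weight stratum (here, case versus non-case), so that no information is borrowed across strata. The random hot-deck draw below is only a stand-in for the FCS or MVNI imputation models named in the table, whose specification is not given here, and the function name, field names, and `m` default are hypothetical.

```python
import random


def impute_within_strata(records, stratum_key, variable, m=5, seed=0):
    """Return m completed copies of `records`.

    Each missing value (None) of `variable` is replaced by a random draw
    from the observed values in the same stratum. This is a crude hot-deck
    stand-in, not a proper FCS or MVNI imputation model.
    """
    rng = random.Random(seed)

    # Pool the observed values separately for each stratum.
    donors = {}
    for rec in records:
        if rec[variable] is not None:
            donors.setdefault(rec[stratum_key], []).append(rec[variable])

    completed = []
    for _ in range(m):
        dataset = []
        for rec in records:
            new = dict(rec)  # copy so the input records stay untouched
            if new[variable] is None:
                # Raises KeyError if the stratum has no observed donors.
                new[variable] = rng.choice(donors[new[stratum_key]])
            dataset.append(new)
        completed.append(dataset)
    return completed


cohort = [
    {"case": 1, "pet": 1}, {"case": 1, "pet": None},
    {"case": 0, "pet": 0}, {"case": 0, "pet": None},
]
imputed = impute_within_strata(cohort, "case", "pet", m=3)
print(len(imputed))  # 3
```

In practice, each completed dataset would be analysed with the weighted analysis model and the results combined across the `m` datasets.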
Fig. 1 Missingness directed acyclic graph (m-DAG) depicting the assumed causal structure between simulated variables and missingness indicators under the dependent missing mechanisms. For the independent missing mechanism, the dashed lines are absent. For simplicity, associations between baseline covariates have not been shown
Fig. 2 Relative bias in the coefficient under the extreme scenarios with 30% missing covariate information. Error bars represent 1.96 × Monte Carlo standard errors
Fig. 3 Empirical standard error and model based standard error under the extreme scenarios with 30% missing covariate information. Error bars represent 1.96 × Monte Carlo standard errors
Fig. 4 Coverage probability across 2000 simulations under the extreme scenarios with 30% missing covariate information. Error bars represent 1.96 × Monte Carlo standard errors
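The performance measures plotted in Figs. 2 to 4 can be computed from the per-simulation estimates as in the sketch below. The formulas are the conventional simulation-study definitions and are an assumption here, since this record does not state how the authors calculated them. The function name `performance_summary` and the returned keys are hypothetical.

```python
import math
import statistics


def performance_summary(estimates, std_errors, true_value, z=1.96):
    """Summarise simulation results for one method and scenario.

    estimates:  point estimate from each simulated dataset
    std_errors: model-based standard error from each simulated dataset
    true_value: the (non-zero) parameter value used to generate the data
    """
    n = len(estimates)
    mean_est = statistics.fmean(estimates)
    emp_se = statistics.stdev(estimates)  # empirical standard error

    # Relative bias (%) and its Monte Carlo standard error.
    rel_bias = 100 * (mean_est - true_value) / true_value
    rel_bias_mcse = 100 * emp_se / (abs(true_value) * math.sqrt(n))

    # Model-based SE: root of the mean squared standard error.
    model_se = math.sqrt(statistics.fmean([se ** 2 for se in std_errors]))

    # Coverage: share of intervals est ± z * se that contain the truth.
    hits = sum(abs(est - true_value) <= z * se
               for est, se in zip(estimates, std_errors))
    coverage = hits / n
    coverage_mcse = math.sqrt(coverage * (1 - coverage) / n)

    return {
        "relative_bias_pct": rel_bias,
        "relative_bias_mcse": rel_bias_mcse,
        "empirical_se": emp_se,
        "model_se": model_se,
        "coverage": coverage,
        "coverage_mcse": coverage_mcse,
    }
```

An error bar of the kind described in the captions would then span the point value plus or minus 1.96 times the corresponding Monte Carlo standard error.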
Fig. 5 Estimated parameter value with 95% confidence interval in case study dataset