Mark Belger, Josep Maria Haro, Catherine Reed, Michael Happich, Kristin Kahle-Wrobleski, Josep Maria Argimon, Giuseppe Bruno, Richard Dodel, Roy W Jones, Bruno Vellas, Anders Wimo.
Abstract
BACKGROUND: Missing data are a common problem in prospective studies with a long follow-up, and the volume, pattern and reasons for missing data may be relevant when estimating the cost of illness. We aimed to evaluate the effects of different methods for dealing with missing longitudinal cost data and for costing caregiver time on total societal costs in Alzheimer's disease (AD).
Keywords: Alzheimer’s disease; Cost of illness; Missing data analysis; Missing data mechanisms; Multiple imputation
Year: 2016 PMID: 27430559 PMCID: PMC4950752 DOI: 10.1186/s12874-016-0188-1
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Patterns of missing data
| Missing data pattern | Description |
|---|---|
| Missing completely at random (MCAR) | • Data are missing for reasons not related to observed or unobserved variables |
| Missing at random (MAR) | • Probability of missingness is related to observed data but not to unobserved data |
| Missing not at random (MNAR) | • Missingness is related to unobserved data |
It is not possible to distinguish between MAR and MNAR based on observed data
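The three mechanisms in the table above can be illustrated with a small simulation. The sketch below is not the study's simulation: the cost model, the `severity` covariate, and every coefficient are assumptions chosen only to make the mechanisms visible. It draws skewed costs, deletes values under each mechanism, and reports the % bias of the complete-case mean.

```python
import math
import random
import statistics


def logistic(x):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def complete_case_bias(n=5000, seed=1):
    """Return the % bias of the complete-case mean cost under each mechanism."""
    rng = random.Random(seed)
    # Hypothetical observed covariate and right-skewed (log-normal) costs.
    severity = [rng.gauss(0.0, 1.0) for _ in range(n)]
    costs = [math.exp(7.5 + 0.4 * s + rng.gauss(0.0, 0.5)) for s in severity]
    actual = statistics.fmean(costs)

    # Probability that a cost is missing, given covariate s and cost c.
    p_missing = {
        "MCAR": lambda s, c: 0.3,                           # unrelated to any data
        "MAR": lambda s, c: logistic(-1.0 + 1.5 * s),       # depends on observed covariate
        "MNAR": lambda s, c: logistic(2.0 * (math.log(c) - 7.5) - 1.0),  # depends on the cost itself
    }

    result = {}
    for name, prob in p_missing.items():
        observed = [c for s, c in zip(severity, costs) if rng.random() >= prob(s, c)]
        result[name] = (statistics.fmean(observed) - actual) / actual * 100
    return result


bias = complete_case_bias()
```

Under these assumed settings the complete-case mean stays close to the full-sample mean for MCAR and is pulled downward for MAR and MNAR, because patients with higher costs are more likely to be missing.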
Fig. 1 Bias in mean costs for imputation methods for different mechanisms and levels of missing data. Abbreviations: MCAR: missing completely at random; MAR: missing at random; MNAR: missing not at random; MCMC: Markov Chain Monte Carlo; MI: multiple imputation
Summary statistics of simulationsᵃ with missing data patterns (MCAR, MAR, MNAR)
| Imputation method for missing data pattern | 10 % missing | | | | | 20 % missing | | | | | 30 % missing | | | | | 40 % missing | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Mean cost (€) | Bias, € (%) | SSE | SEE | CP | Mean cost (€) | Bias, € (%) | SSE | SEE | CP | Mean cost (€) | Bias, € (%) | SSE | SEE | CP | Mean cost (€) | Bias, € (%) | SSE | SEE | CP |
| Complete sample | 2102 | – | 59 | 61 | – | – | – | 59 | 61 | – | – | – | 59 | 61 | – | – | – | 59 | 61 | – |
| MCAR | | | | | | | | | | | | | | | | | | | | |
| Complete cases | 2102 | −0.7 (−0.03 %) | 63 | 64 | 1.00 | 2121 | 18 (0.9 %) | 68 | 69 | 1.00 | 2156 | 53 (3 %) | 75 | 76 | 1.00 | 2181 | 79 (4 %) | 87 | 87 | 0.99 |
| MI MCMC | 2150 | 48 (2 %) | 68 | 58 | 1.00 | 2218 | 115 (5 %) | 78 | 57 | 0.46 | 2300 | 198 (9 %) | 93 | 57 | 0.03 | 2410 | 308 (15 %) | 118 | 57 | 0.00 |
| MAR | | | | | | | | | | | | | | | | | | | | |
| Complete cases | 1871 | −231 (−11 %) | 56 | 57 | 0.00 | 1723 | −379 (−18 %) | 54 | 55 | 0.00 | 1624 | −478 (−23 %) | 59 | 59 | 0.00 | 1499 | −603 (−29 %) | 59 | 61 | 0.00 |
| MI MCMC | 2158 | 55 (3 %) | 112 | 57 | 0.76 | 2039 | −64 (−3 %) | 83 | 48 | 0.65 | 2218 | 116 (5 %) | 202 | 52 | 0.38 | 2683 | 581 (28 %) | 417 | 69 | 0.09 |
| MNAR | | | | | | | | | | | | | | | | | | | | |
| Complete cases | 1544 | −558 (−27 %) | 27 | 28 | 0.00 | 1291 | −812 (−39 %) | 22 | 22 | 0.00 | 1096 | −1007 (−48 %) | 20 | 19 | 0.00 | 929 | −1173 (−56 %) | 17 | 16 | 0.00 |
| MI MCMC | 1602 | −501 (−24 %) | 28 | 26 | 0.00 | 1383 | −720 (−34 %) | 22 | 19 | 0.00 | 1212 | −890 (−42 %) | 20 | 15 | 0.00 | 1043 | −1059 (−50 %) | 19 | 11 | 0.00 |
% bias was calculated as ((estimated−actual)/actual cost × 100), where actual cost was the mean cost for the complete sample
Abbreviations: CP coverage probability, MCMC Markov Chain Monte Carlo, MI multiple imputation, SEE standard error estimate, SSE sampling standard error
ᵃ 1000 simulations and sample size 1497 for different levels of missing data (10–40 %)
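The performance measures in the table can be computed from a set of simulation replicates as sketched below. The % bias formula is the one stated in the footnote. The other definitions are assumptions based on common usage, not taken from the source: SSE as the standard deviation of the estimates across simulations, SEE as the mean of the estimated standard errors, and CP as the share of normal-approximation 95 % confidence intervals that contain the actual value. The function and variable names are illustrative.

```python
import statistics


def percent_bias(estimated, actual):
    """% bias = (estimated - actual) / actual * 100, as in the table footnote."""
    return (estimated - actual) / actual * 100


def summarise_simulations(estimates, std_errors, actual, z=1.96):
    """Summarise per-simulation estimates and their standard errors.

    Assumed definitions: SSE = SD of the estimates, SEE = mean estimated SE,
    CP = proportion of intervals estimate +/- z * SE that cover `actual`.
    """
    mean_estimate = statistics.fmean(estimates)
    covered = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se <= actual <= est + z * se
    )
    return {
        "mean": mean_estimate,
        "bias": mean_estimate - actual,
        "pct_bias": percent_bias(mean_estimate, actual),
        "sse": statistics.stdev(estimates),
        "see": statistics.fmean(std_errors),
        "cp": covered / len(estimates),
    }


# Toy replicates (made-up numbers) and one row taken from the table.
example = summarise_simulations(
    [2100.0, 2110.0, 2090.0], [60.0, 60.0, 60.0], actual=2100.0
)
mar_cc_10 = percent_bias(1871, 2102)  # complete cases, MAR, 10 % missing
```

For the tabulated complete-case estimate under MAR with 10 % missing (€1871 against an actual mean of €2102), `percent_bias` gives roughly −11 %, in line with the reported value.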
Summary statistics of simulationsᵃ with missing data pattern reflecting GERAS study data at 18 monthsᵇ
| | GERAS-1ᶜ | | | | | | GERAS-2ᵈ | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Imputation method | Mean cost (€) | Bias, € (%)ᵉ | SSE | SEE | SEE/SSE | CP | Mean cost (€) | Bias, € (%)ᵉ | SSE | SEE | SEE/SSE | CP |
| Complete sample | 2101 | – | 64 | 62 | 0.97 | – | 2103 | – | 60 | 61 | 1.02 | – |
| Naïve imputation method | | | | | | | | | | | | |
| Complete cases | 1957 | −144 (−7 %) | 67 | 66 | 0.99 | 0.38 | 1689 | −414 (−20 %) | 52 | 54 | 1.04 | 0.00 |
| Multiple imputation method | | | | | | | | | | | | |
| MI MCMC | | | | | | | 1969 | −134 (−6 %) | 77 | 41 | 0.53 | 0.22 |
| Combination of imputation methods | | | | | | | | | | | | |
| Combination Scenario Aᶠ | | | | | | | 1947 | −157 (−7 %) | 69 | 41 | 0.59 | 0.12 |
| Combination Scenario Bᵍ | 2296 | 195 (9 %) | 62 | 49 | 0.79 | 0.02 | | | | | | |
Numbers in bold text show the imputation method(s) that perform the best (lowest bias) for each of the two datasets (GERAS-1 and GERAS-2)
Abbreviations: CP coverage probability, MAR missing at random, MCAR missing completely at random, MCMC Markov Chain Monte Carlo, MI multiple imputation, MNAR missing not at random, SEE standard error estimate, SSE sampling standard error
ᵃ 1000 simulations and sample size 1497
ᵇ Data missing for 33 % of patients at 18 months: 15 % institutionalised, 6 % died, 12 % lost to follow-up
ᶜ GERAS-1: patients assumed to be institutionalised were identified using a predictive equation (i.e. data MAR)
ᵈ GERAS-2: patients were assumed to be institutionalised if their caregiver time was >470 h/month (i.e. data MNAR)
ᵉ % bias was calculated as ((estimated−actual)/actual cost × 100), where actual cost was the mean cost for the complete sample
ᶠ Combination Scenario A: patients lost to follow-up (data MCAR) had costs imputed using the group means method, patients institutionalised (data MAR) had costs imputed using the MI MCMC method (including factors MMSE, ADCS-ADL and caregiver time), and patients who died (data MAR) had costs imputed using the MI MCMC method (including factors MMSE, patient age and ADCS-ADL)
ᵍ Combination Scenario B: same imputation methods as Combination Scenario A, but a fixed cost (€2940 per month) was used for patients who were institutionalised (data MNAR)
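The idea behind the combination scenarios, choosing an imputation rule according to the reason a cost is missing, can be sketched as follows for Combination Scenario B. This is a simplified illustration, not the study's implementation. The record layout and the `group` key used for the group-means step are hypothetical, and the MI MCMC step is represented by a caller-supplied `mi_impute` function that returns a single value, whereas real multiple imputation produces several completed datasets. Only the fixed cost of €2940 per month comes from the source.

```python
from statistics import fmean

FIXED_INSTITUTIONAL_COST = 2940.0  # EUR per month, as stated for Combination Scenario B


def impute_combination_b(records, mi_impute):
    """Fill missing monthly costs according to the reason for missingness.

    records: list of dicts with keys 'cost' (float or None), 'reason'
        (None, 'lost', 'institutionalised' or 'died') and 'group'
        (hypothetical grouping variable for the group-means step).
    mi_impute: stand-in for the MI MCMC step; called with one record.
    """
    # Group means are computed from observed costs only.
    observed = {}
    for rec in records:
        if rec["cost"] is not None:
            observed.setdefault(rec["group"], []).append(rec["cost"])
    group_mean = {group: fmean(values) for group, values in observed.items()}

    completed = []
    for rec in records:
        cost = rec["cost"]
        if cost is None:
            if rec["reason"] == "lost":                  # treated as MCAR
                cost = group_mean[rec["group"]]
            elif rec["reason"] == "institutionalised":   # treated as MNAR
                cost = FIXED_INSTITUTIONAL_COST
            elif rec["reason"] == "died":                # treated as MAR
                cost = mi_impute(rec)
        completed.append({**rec, "cost": cost})
    return completed


# Made-up records; the lambda is a placeholder for a model-based imputation.
records = [
    {"cost": 1000.0, "reason": None, "group": "mild"},
    {"cost": 2000.0, "reason": None, "group": "mild"},
    {"cost": None, "reason": "lost", "group": "mild"},
    {"cost": None, "reason": "institutionalised", "group": "mild"},
    {"cost": None, "reason": "died", "group": "mild"},
]
completed = impute_combination_b(records, mi_impute=lambda rec: 1800.0)
imputed_costs = [rec["cost"] for rec in completed]
```

Keeping the rule per missingness reason in one place makes it easy to swap the institutionalised branch between a model-based imputation (Scenario A) and a fixed cost (Scenario B).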