Catrin O Plumpton, Tim Morris, Dyfrig A Hughes, Ian R White.
Abstract
BACKGROUND: Missing data in a large-scale survey present major challenges. We focus on performing multiple imputation by chained equations when the data contain multiple incomplete multi-item scales. Recent authors have proposed imputing such data at the level of the individual item, but this can lead to infeasibly large imputation models.
Year: 2016 PMID: 26809812 PMCID: PMC4727289 DOI: 10.1186/s13104-016-1853-5
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Fig. 1 Response rate by question and country. Questions that were not applicable due to differences in healthcare systems appear as breaks in the plot. MARS Medication Adherence Rating Scale, LOTR Life Orientation Test Revised, BMQ Beliefs about Medicines Questionnaire, TPB Theory of Planned Behaviour, EUROPEP European Task Force on Patient Evaluations of General Practice, BRIGHT Building Research Initiative Group Illness Management and Adherence in Transplantation, BIPQ Brief Illness Perception Questionnaire. Reprinted from Value in Health, 18(2), Morrison VL, Holmes EAF, Parveen S, Plumpton CO, Clyne W, De Geest S, Dobbels F, Vrijens B, Kardas P, Hughes DA, Predictors of Self-Reported Adherence to Antihypertensive Medicines: A Multinational, Cross-Sectional Survey, 206–216, Copyright (2015), with permission from Elsevier [21]
Missing data by country, and how it was handled in imputation models. Country columns give % incomplete (for scale variables, % of missing responses to scale items, not scale totals)
| Variables | Austria | Belgium | England | Germany | Greece | Hungary | Netherlands | Poland | Wales | Imputed as | Used in imputation models as |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Total N | 323 | 180 | 323 | 274 | 289 | 323 | 237 | 323 | 323 | ||
| Gender | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 binary item | Single item |
| Age | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 continuous item | Single item |
| Education | 3 | 0 | 1 | 0 | 2 | 1 | 0 | 0 | 0 | 1 categorical item | Single item |
| Marital status | 2 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 binary item | Single item |
| Employment | 2 | 0 | 0 | 1 | 1 | 1 | 0 | 2 | 0 | 1 binary item | Single item |
| Number of | |||||||||||
| Medical conditions | 1 | 3 | 0 | 1 | 1 | 0 | 0 | 3 | 0 | 1 continuous item | Single item |
| Medicines | 1 | 1 | 0 | 2 | 1 | 1 | 2 | 6 | 1 | 1 continuous item | Single item |
| Tablets | 3 | 3 | 1 | 3 | 1 | 1 | 3 | 12 | 1 | 1 continuous item | Single item |
| Items prescribed | 9 | 6 | 3 | 3 | 1 | 8 | 7 | 26 | 6 | 1 continuous item | Single item |
| Dosage frequency | 1 | 2 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 ordered categorical item | Single item |
| Morisky adherence | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 binary items | Binary item |
| Medication adherence rating scale | 2 | 3 | 1 | 2 | 1 | 2 | 1 | 4 | 0 | 5 ordinal items | Scale |
| Prescription payment | 1 | 2 | 0 | 1 | 1 | 1 | * | 1 | * | 1 categorical item | Single item |
| Affordability problem | 1 | 3 | 2 | 1 | 1 | 3 | 1 | 0 | * | 1 binary item | Single item |
| Cost coping strategies | 2 | 7 | 8 | 3 | 2 | 1 | * | 8 | * | 6 ordered categorical items | 6 single items |
| Income | |||||||||||
| Source | 11 | 23 | 22 | 28 | 12 | 8 | 37 | 46 | 15 | 1 categorical item | Single item |
| Perception | 8 | 25 | 20 | 28 | 12 | 7 | 37 | 46 | 14 | 2 items: ordered categorical item conditional on binary item | 2 items |
| Ease of borrowing | 9 | 23 | 20 | 28 | 12 | 7 | 38 | 48 | 14 | 2 items: ordered categorical item conditional on binary item | 2 items |
| Total | 9 | 24 | 21 | 28 | 12 | 6 | 38 | 46 | 14 | 2 items: ordered categorical item conditional on binary item | 2 items |
| Health status | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 ordered categorical item | Single item |
| Practitioner | |||||||||||
| Type | 7 | 16 | 17 | 23 | 6 | 7 | 29 | 32 | 10 | 1 categorical item | Single item |
| Gender | 10 | 17 | 18 | 22 | 14 | 6 | 29 | 38 | 12 | 1 binary item | Single item |
| Satisfaction with | |||||||||||
| Practitioner | 11 | 18 | 18 | 28 | 2 | 3 | 30 | 35 | 7 | 17 ordered categorical items | Scale |
| Practice | 11 | 23 | 18 | 23 | 6 | 3 | 30 | 34 | 11 | 6 ordered categorical items | Scale |
| Optimism | 6 | 15 | 16 | 20 | 5 | 2 | 26 | 26 | 10 | 6 ordered categorical items | Scale |
| Illness perception questionnaire: analysed as 8 individual items | 9 | 19 | 19 | 25 | 7 | 4 | 31 | 37 | 11 | 8 ordered categorical items | Scale |
| Necessities | 6 | 15 | 16 | 20 | 5 | 2 | 27 | 27 | 10 | 5 ordered categorical items | Scale |
| Medicine concerns | 6 | 15 | 16 | 20 | 5 | 2 | 27 | 27 | 10 | 6 ordered categorical items | Scale |
| Attitude | 9 | 17 | 16 | 24 | 7 | 4 | 30 | 35 | 10 | 7 ordered categorical items | Scale |
| Barriers (theory of planned behaviour) | 10 | 17 | 16 | 23 | 7 | 3 | 30 | 33 | 11 | 1 ordered categorical item | Single item |
| Facilitators | 11 | 18 | 16 | 24 | 8 | 5 | 30 | 34 | 11 | 3 ordered categorical items | Scale |
| Intention | 10 | 18 | 17 | 24 | 8 | 4 | 30 | 33 | 10 | 2 ordered categorical items | Scale |
| Self efficacy | 7 | 1 | 16 | 23 | 6 | 3 | 30 | 31 | 10 | 2 ordered categorical items | Scale |
| Normative beliefs | 10 | 20 | 17 | 24 | 7 | 4 | 30 | 33 | 11 | 3 ordered categorical items | Scale |
| Barriers | 21 | 22 | 22 | 38 | 18 | 6 | 35 | 35 | 9 | 15 ordered categorical items | Scale |
| Social support | 10 | 19 | 19 | 23 | 20 | 4 | 31 | 43 | 11 | 7 ordered categorical items | Scale |
| Time preference | 14 | 25 | 23 | 23 | 7 | 21 | 32 | 34 | 19 | 4 ordered categorical items | 4 single items |
| Discrete choice experiment: not included in analysis | 2 | 5 | 7 | 6 | 2 | 1 | 7 | 12 | 2 | 9 binary items | 9 single items |
Unless stated otherwise, variables correspond to a single predictor during analysis. Due to differences in healthcare and prescription systems between countries, not all questions applied to each country. Additionally, in Wales, one question from the barriers scale was not applicable, so this scale has only 14 items there. Whilst illness perception questions were imputed as scale items, they were analysed individually.
* Variables not analysed due to differences in prescription policies
Summary of methods compared in simulation
| Method description | Assumptions | Comments |
|---|---|---|
| 1. Exclude observations with any missing data from the analysis | Missing values are independent of Morisky score given the other variables | Complete case analysis |
| 2. For partially observed scales, sum the observed values, weighted by (1/proportion of items observed). Exclude observations with completely missing scales | Partially observed items are MAR given other items in the scale and completely missing scales are MCAR | Effectively single imputation as the mean of observed items within a scale |
| 3. For partially observed scales, set the score to missing. Multiply impute the scale sums from a multivariate normal model with Morisky score and age as covariates | Missingness is MAR, and this process is the same for missing scales or missing items within scales | Wasteful of observed data |
| 4. For partially observed scales, sum the observed values, weighted by (1/proportion of items observed). For completely missing scales, multiply impute the scale sums from a multivariate normal model with Morisky score and age as covariates | Completely missing scales are MAR | Uses single imputation in the same way as approach 2 |
| 5. Multiply impute missing items based on the total of the other scale, and the other items within the scale for the item being imputed (with Morisky score and age as covariates). This requires the use of chained equations with linear regression imputation rather than a multivariate normal model | Missingness is MAR for both variables, and the regression of an item on the other scale's items is the same as its regression on the other scale's total | Proposed adaptation |
| 6. Multiply impute all items using all other items via a multivariate normal model, including Morisky score and age as covariates | Multivariate normality | It is in some senses the benchmark |
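
Methods 2 and 4 both score a partially observed scale by summing the observed items and weighting by 1/(proportion of items observed). A minimal Python sketch of that prorating step follows. The function name `prorated_scale_sum` and the use of `None` to mark a missing item are illustrative assumptions, not taken from the paper.

```python
from typing import Optional, Sequence


def prorated_scale_sum(items: Sequence[Optional[float]]) -> Optional[float]:
    """Sum the observed items, weighted by 1 / proportion of items observed.

    Returns None when every item is missing, so that the caller can either
    exclude the observation (method 2) or impute the scale total (method 4).
    """
    observed = [x for x in items if x is not None]
    if not observed:
        return None
    # sum / (n_observed / n_items), written to avoid dividing by a float
    return sum(observed) * len(items) / len(observed)


# Hypothetical 5-item scale with two items missing
score = prorated_scale_sum([4, None, 2, 3, None])
```

The weighted sum equals the mean of the observed items multiplied by the number of items in the scale, which is why the table describes the approach as effectively single imputation at the mean of the observed items.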
Fig. 2 Social-support scale for Poland, 70 % totals and 43 % individual items missing
Fig. 3 Individual item, time preference variable 2, for Poland, 36 % missing
Fig. 4 Forest plots illustrating odds ratios from the logistic regression
Summary of proportional decrease in standard error, between complete case and multiple imputation analyses
| | Mean (%) | Min (%) | Max (%) | Median (%) | Standard deviation (%) |
|---|---|---|---|---|---|
| Overall | 39 | 0 | 100 | 38 | 19 |
| Austria | 38 | 22 | 55 | 38 | 8 |
| Belgium | 5 | 0 | 13 | 4 | 5 |
| England | 58 | 33 | 100 | 45 | 27 |
| Germany | 21 | 12 | 26 | 22 | 5 |
| Greece | 50 | 43 | 58 | 50 | 4 |
| Hungary | 23 | 14 | 27 | 23 | 4 |
| Netherlands | 29 | 24 | 36 | 29 | 4 |
| Poland | 41 | 14 | 59 | 42 | 12 |
| Wales | 36 | 28 | 47 | 37 | 6 |
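
The excerpt does not give the formula behind these percentages. Assuming the proportional decrease is measured relative to the complete case standard error, it could be computed as in the sketch below. The function name and the example standard errors are hypothetical and are not values from the study.

```python
def proportional_decrease_se(se_complete_case: float, se_multiple_imputation: float) -> float:
    """Proportional decrease in standard error when moving from complete
    case analysis to multiple imputation, relative to the complete case
    value (assumed definition)."""
    if se_complete_case <= 0:
        raise ValueError("complete case standard error must be positive")
    return (se_complete_case - se_multiple_imputation) / se_complete_case


# Hypothetical standard errors for one coefficient
decrease = proportional_decrease_se(0.50, 0.31)
percent_decrease = 100 * decrease
```

Under this definition, a value of 0 % means multiple imputation gave no gain in precision, and larger values mean a smaller standard error after imputation.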
Disparity in variable selection between complete case (CC) and multiple imputation (MI) analyses, over 42 variables in 9 countries
| Multiple imputation | Complete case: included, n (%) | Complete case: excluded, n (%) | Total |
|---|---|---|---|
| Included, n (%) | 86 (23) | 3 (1) | 89 |
| Excluded, n (%) | 25 (7) | 259 (69) | 284 |
| Total | 111 | 262 | 373 |
χ² = 250, p < 0.001
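
The reported statistic can be checked from the four cell counts in the table. The sketch below assumes a Pearson chi-square test on the 2×2 table without continuity correction; the source does not state which variant was used, and the function name is illustrative.

```python
import math


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: MI included / MI excluded; columns: CC included / CC excluded
chi2 = pearson_chi_square_2x2(86, 3, 25, 259)

# Upper tail of a chi-square distribution with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2))

# Cell percentages are shares of all 373 variable-country pairs
cell_percentages = [round(100 * k / 373) for k in (86, 3, 25, 259)]
```

Under these assumptions the statistic rounds to 250 and the p-value is far below 0.001, consistent with the figures reported above.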
Fig. 5 Simulation results for the three scenarios. Brackets indicate confidence intervals. Base case: 35 % had all items missing for a scale; 8 % had one or two items missing
Fig. 6 Simulation results for the three scenarios. Brackets indicate confidence intervals. More incomplete observations with partial data: 18 % had all items missing for a scale; 25 % had one or two items missing
Fig. 7 Simulation results for the three scenarios. Brackets indicate confidence intervals. Fewer observations with complete data: 55 % had all items missing for a scale; 15 % had just one or two missing