Fiona M Shrive, Heather Stuart, Hude Quan, William A Ghali.
Abstract
BACKGROUND: Missing data present a challenge to many research projects. The problem is often pronounced in studies that use self-report scales, and literature addressing strategies for dealing with missing data in such circumstances is scarce. The objective of this study was to compare six different imputation techniques for dealing with missing data in the Zung Self-rating Depression Scale (SDS).
Year: 2006 PMID: 17166270 PMCID: PMC1716168 DOI: 10.1186/1471-2288-6-57
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
The Zung Self-rating Depression Scale (SDS)
| Item | None or a little of the time | Some of the time | Good part of the time | Most of the time |
| 1. I feel down-hearted, blue, and sad. | 1 | 2 | 3 | 4 |
| 2. Morning is when I feel the best. | 4 | 3 | 2 | 1 |
| 3. I have crying spells or feel like it. | 1 | 2 | 3 | 4 |
| 4. I have trouble sleeping through the night. | 1 | 2 | 3 | 4 |
| 5. I eat as much as I used to. | 4 | 3 | 2 | 1 |
| 6. I enjoy looking at, talking to, and being with attractive men/women. | 4 | 3 | 2 | 1 |
| 7. I notice that I am losing weight. | 1 | 2 | 3 | 4 |
| 8. I have trouble with constipation. | 1 | 2 | 3 | 4 |
| 9. My heart beats faster than usual. | 1 | 2 | 3 | 4 |
| 10. I get tired for no reason. | 1 | 2 | 3 | 4 |
| 11. My mind is as clear as it used to be. | 4 | 3 | 2 | 1 |
| 12. I find it easy to do the things I used to. | 4 | 3 | 2 | 1 |
| 13. I am restless and can't keep still. | 1 | 2 | 3 | 4 |
| 14. I feel hopeful about the future. | 4 | 3 | 2 | 1 |
| 15. I am more irritable than usual. | 1 | 2 | 3 | 4 |
| 16. I find it easy to make decisions. | 4 | 3 | 2 | 1 |
| 17. I feel that I am useful and needed. | 4 | 3 | 2 | 1 |
| 18. My life is pretty full. | 4 | 3 | 2 | 1 |
| 19. I feel that others would be better off if I were dead. | 1 | 2 | 3 | 4 |
| 20. I still enjoy the things I used to do. | 4 | 3 | 2 | 1 |
Scoring – The total score is calculated by adding the responses to the 20 questions. The maximum total is 80 (20 statements with a highest possible score of 4 each). The score is then converted into a total out of 100 by dividing the respondent's sum by 0.8. The cut points for the scale are:
(1) < 50 : normal range
(2) 50 – 59 : minimal or mild depression
(3) 60 – 69 : moderate to marked depression
(4) ≥ 70 : severe depression
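The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `sds_index` and `sds_category` are hypothetical, the input is assumed to be the 20 item values already scored 1 to 4 as in the table (so reverse-worded items are already reversed), and the bands are assumed to be half-open so that fractional indices such as 58.75 and an index of exactly 70 each fall into one category.

```python
def sds_index(item_scores):
    """Convert 20 already-scored SDS item values (1-4) to the 25-100 index."""
    if len(item_scores) != 20:
        raise ValueError("the SDS has 20 items")
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item score must be 1, 2, 3, or 4")
    raw_total = sum(item_scores)   # ranges from 20 to 80
    return raw_total * 1.25        # same as dividing by 0.8, but exact in floating point


def sds_category(index):
    """Map an SDS index to a cut-point band (assumed half-open: [50, 60), [60, 70), ...)."""
    if index < 50:
        return "normal range"
    if index < 60:
        return "minimal or mild depression"
    if index < 70:
        return "moderate to marked depression"
    return "severe depression"
```

For example, a respondent who scores 2 on every item has a raw total of 40 and an index of 50, which falls at the lower edge of the "minimal or mild depression" band.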
Distribution of randomly deleted missing responses (N = 1580)
| Total missing responses | N | % of total | N | % of total | N | % of total | N | % of total | N | % of total | N | % of total |
| 0 | 201 | 12.7 | 18 | 1.1 | 174 | 11.0 | 1 | 0.1 | 151 | 9.6 | 174 | 11.0 |
| 1 | 422 | 26.7 | 83 | 5.3 | 373 | 23.6 | 7 | 0.4 | 376 | 23.8 | 422 | 26.7 |
| 2 | 432 | 27.3 | 220 | 14.0 | 463 | 29.3 | 42 | 2.7 | 435 | 27.5 | 466 | 29.5 |
| 3 | 306 | 19.4 | 341 | 21.6 | 305 | 19.3 | 141 | 8.9 | 317 | 20.1 | 315 | 19.9 |
| 4 | 147 | 9.3 | 314 | 19.9 | 161 | 10.2 | 226 | 14.3 | 172 | 10.9 | 122 | 7.7 |
| 5 | 52 | 3.3 | 296 | 18.7 | 85 | 5.4 | 288 | 18.2 | 78 | 4.9 | 59 | 3.7 |
| 6 | 18 | 1.1 | 171 | 10.8 | 12 | 0.8 | 305 | 19.3 | 35 | 2.2 | 15 | 1.0 |
| 7 | 2 | 0.1 | 92 | 5.8 | 6 | 0.4 | 241 | 15.3 | 11 | 0.7 | 5 | 0.3 |
| 8 | 0 | 0 | 34 | 2.2 | 1 | 0.1 | 158 | 10.0 | 3 | 0.2 | 2 | 0.1 |
| 9 | 0 | 0 | 9 | 0.6 | 0 | 0 | 90 | 5.7 | 2 | 0.1 | 0 | 0 |
| 10 | 0 | 0 | 0 | 0.0 | 0 | 0 | 46 | 2.9 | 0 | 0 | 0 | 0 |
| 11 | 0 | 0 | 1 | 0.1 | 0 | 0 | 25 | 1.6 | 0 | 0 | 0 | 0 |
| 12 | 0 | 0 | 1 | 0.1 | 0 | 0 | 6 | 0.4 | 0 | 0 | 0 | 0 |
| 13 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.1 | 0 | 0 | 0 | 0 |
| 14 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.1 | 0 | 0 | 0 | 0 |
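A rough simulation can show how a table like this arises. The sketch below assumes that every response is deleted independently with the same probability `p`; the excerpt does not say that all six scenarios were generated this way, so treat it as an illustration for the simple random case only. The placeholder answers and all function names are hypothetical. Under this assumption the number of missing items per respondent is binomial, so with `p = 0.20` about 0.8^20 ≈ 1.2% of respondents keep all 20 answers, which is close to the 1.1% shown in one of the columns.

```python
import random
from collections import Counter
from math import comb

N_ITEMS = 20


def delete_at_random(responses, p, rng):
    """Replace each response with None independently with probability p."""
    return [None if rng.random() < p else r for r in responses]


def missing_count_distribution(n_respondents, p, seed=0):
    """Tally how many respondents end up with 0, 1, 2, ... missing items."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_respondents):
        # Placeholder answers; real data would be the observed SDS responses.
        answers = [rng.randint(1, 4) for _ in range(N_ITEMS)]
        deleted = delete_at_random(answers, p, rng)
        counts[sum(value is None for value in deleted)] += 1
    return counts


def expected_share(k, p, n_items=N_ITEMS):
    """Binomial probability that exactly k of n_items responses are deleted."""
    return comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
```

Calling `missing_count_distribution(1580, 0.20)` returns a `Counter` whose values sum to 1580, and `expected_share(k, 0.20)` gives the theoretical share to compare against each row.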
Diagnostic measures for imputation methods
| Missing Data Scenario | Method | Mean | SD | Spearman | % Misclassified | Kappa |
| | Random Selection | 45.99* | 10.65 | 0.906 | 15% (207) | 0.684 |
| | Preceding Question | 44.69* | 10.07 | 0.946 | 8.7% (120) | 0.807 |
| | Question Mean | 43.75 | 9.84* | 0.986 | 7.5% (104) | 0.823 |
| | Individual Mean | 43.74 | 11.11 | 0.986 | 5.4% (74) | 0.880 |
| | Single Regression | 44.03 | 10.71 | 0.981 | 5.6% (77) | 0.873 |
| | Multiple Imputation | 44.01 | 10.73 | 0.987 | 4.7% (65) | 0.893 |
| | Random Selection | 47.25* | 11.14 | 0.784 | 28.2% (440) | 0.452 |
| | Preceding Question | 46.41* | 9.79* | 0.898 | 14.4% (225) | 0.700 |
| | Question Mean | 43.59 | 8.88* | 0.974 | 12.1% (189) | 0.709 |
| | Individual Mean | 43.59 | 11.26 | 0.974 | 8.9% (139) | 0.802 |
| | Single Regression | 44.03 | 10.65 | 0.966 | 9.6% (150) | 0.781 |
| | Multiple Imputation | 44.06 | 10.49 | 0.976 | 7.0% (110) | 0.839 |
| | Random Selection | 49.09* | 11.92* | 0.610 | 41.0% (647) | 0.267 |
| | Preceding Question | 48.62* | 9.55* | 0.867 | 23.6% (373) | 0.549 |
| | Question Mean | 43.60 | 8.05* | 0.958 | 14.9% (235) | 0.629 |
| | Individual Mean | 43.66 | 11.33 | 0.955 | 10.8% (171) | 0.760 |
| | Single Regression | 44.39 | 10.33 | 0.937 | 11.4% (180) | 0.738 |
| | Multiple Imputation | 44.32 | 10.21 | 0.959 | 9.2% (145) | 0.789 |
| | Random Selection | 45.62 | 10.38 | 0.901 | 16.6% (233) | 0.649 |
| | Preceding Question | 41.66* | 10.73 | 0.970 | 10.2% (143) | 0.753 |
| | Question Mean | 43.43 | 9.67* | 0.987 | 8.4% (118) | 0.798 |
| | Individual Mean | 43.37 | 11.03 | 0.984 | 5.7% (80) | 0.870 |
| | Single Regression | 43.66 | 10.67 | 0.978 | 6.8% (95) | 0.842 |
| | Multiple Imputation | 43.67 | 10.61 | 0.986 | 5.8% (81) | 0.866 |
| | Random Selection | 45.85 | 10.48 | 0.885 | 18.1% (259) | 0.618 |
| | Preceding Question | 44.81 | 10.09 | 0.940 | 8.9% (127) | 0.804 |
| | Question Mean | 43.63 | 9.65* | 0.984 | 7.4% (106) | 0.825 |
| | Individual Mean | 43.65 | 11.05 | 0.982 | 5.7% (82) | 0.867 |
| | Single Regression | 43.89 | 10.67 | 0.978 | 7.1% (102) | 0.835 |
| | Multiple Imputation | 43.91 | 10.58 | 0.985 | 5.3% (77) | 0.877 |
| | Random Selection | 45.82* | 10.46 | 0.899 | 15.7% (221) | 0.741 |
| | Preceding Question | 44.44 | 10.06 | 0.947 | 9.7% (136) | 0.839 |
| | Question Mean | 43.51 | 9.66* | 0.987 | 8.4% (118) | 0.850 |
| | Individual Mean | 43.50 | 10.90 | 0.985 | 5.9% (83) | 0.902 |
| | Single Regression | 43.54 | 10.78 | 0.975 | 7.7% (108) | 0.871 |
| | Multiple Imputation | 43.54 | 10.65 | 0.986 | 6.1% (86) | 0.897 |
* significant difference from the population statistics at 95% confidence
** Participants for whom no observations were randomly deleted are excluded from the analysis. When there are no missing values to impute, the calculated score equals the known "true" score, so the scores correlate perfectly (Spearman = 1.0).
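The table names the imputation methods but this excerpt does not define them. As a hedged illustration, the sketch below shows one common reading of two of them: "Individual Mean" fills a respondent's missing items with the mean of that respondent's observed items, and "Question Mean" fills a missing item with the mean of that item across respondents who answered it. Both definitions, the function names, and the choice to leave the filled values unrounded are assumptions, not the authors' documented procedure. Missing responses are represented as `None`.

```python
def impute_individual_mean(rows):
    """Fill each respondent's missing items with that respondent's own observed mean."""
    imputed = []
    for row in rows:
        observed = [value for value in row if value is not None]
        if not observed:
            raise ValueError("respondent has no observed items")
        fill = sum(observed) / len(observed)
        imputed.append([fill if value is None else value for value in row])
    return imputed


def impute_question_mean(rows):
    """Fill each missing item with the mean of that item over respondents who answered it."""
    n_items = len(rows[0])
    fills = []
    for j in range(n_items):
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            raise ValueError(f"item {j + 1} has no observed responses")
        fills.append(sum(observed) / len(observed))
    return [
        [fills[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]
```

Under this reading, question-mean imputation pulls every respondent toward the same item averages, which would be consistent with the lower SD values flagged for that method in the table.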
Figure 1. Predicted versus observed scores for each imputation technique with a probability of missing of 20%.
Correlation with increased missing items in the P = 0.20 missing data scenario.
| # of Missing Items (P = 0.20) | Spearman | Kappa | Spearman | Kappa | Spearman | Kappa | Spearman | Kappa | Spearman | Kappa | Spearman | Kappa |
| 1 (N = 83) | 0.983 | 0.971 | 0.983 | 0.910 | 0.993 | 1.000 | 0.992 | 0.971 | 0.990 | 0.970 | 0.993 | 1.000 |
| 2 (N = 220) | 0.941 | 0.605 | 0.961 | 0.741 | 0.987 | 0.804 | 0.987 | 0.794 | 0.984 | 0.783 | 0.988 | 0.841 |
| 3 (N = 341) | 0.877 | 0.563 | 0.944 | 0.791 | 0.981 | 0.771 | 0.980 | 0.833 | 0.974 | 0.832 | 0.982 | 0.848 |
| 4 (N = 314) | 0.816 | 0.454 | 0.916 | 0.739 | 0.976 | 0.743 | 0.973 | 0.823 | 0.967 | 0.764 | 0.974 | 0.854 |
| 5 (N = 296) | 0.783 | 0.438 | 0.930 | 0.661 | 0.978 | 0.641 | 0.977 | 0.780 | 0.969 | 0.784 | 0.978 | 0.853 |
| 6 (N = 171) | 0.618 | 0.182 | 0.890 | 0.569 | 0.963 | 0.651 | 0.953 | 0.745 | 0.934 | 0.710 | 0.957 | 0.805 |
| 7 (N = 92) | 0.508 | 0.114 | 0.791 | 0.454 | 0.948 | 0.346 | 0.930 | 0.630 | 0.918 | 0.600 | 0.948 | 0.649 |
| 8 (N = 34) | 0.563 | 0.191 | 0.848 | 0.698 | 0.970 | 0.401 | 0.958 | 0.827 | 0.943 | 0.884 | 0.978 | 0.831 |
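The diagnostics reported in these tables (Spearman correlation, percentage misclassified, and kappa) can be computed along the following lines. This is a sketch under stated assumptions: kappa is taken to be unweighted Cohen's kappa on the depression categories, Spearman is computed as the Pearson correlation of average ranks, and all function names are hypothetical. The excerpt does not say which software or which kappa variant the authors used.

```python
from collections import Counter


def percent_misclassified(true_cats, imputed_cats):
    """Share of respondents whose imputed category differs from the true one."""
    wrong = sum(t != i for t, i in zip(true_cats, imputed_cats))
    return 100.0 * wrong / len(true_cats)


def cohens_kappa(true_cats, imputed_cats):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(true_cats)
    observed = sum(t == i for t, i in zip(true_cats, imputed_cats)) / n
    true_freq, imputed_freq = Counter(true_cats), Counter(imputed_cats)
    expected = sum(true_freq[c] * imputed_freq[c] for c in true_freq) / (n * n)
    if expected == 1.0:  # both ratings use one and the same category
        return 1.0
    return (observed - expected) / (1.0 - expected)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation; undefined (division by zero) if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Here `x` would hold the true SDS scores and `y` the scores recomputed after imputation, while the category lists would come from applying the cut points to each.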
Comparison of MI and single regression in the "capture" of the true values using 3 missing data scenarios
| Question | # cases missing N | MI coverage of true value by range N (%) | Single regression agreement with true value N (%) | # cases missing N | MI coverage of true value by range N (%) | Single regression agreement with true value N (%) | # cases missing N | MI coverage of true value by range N (%) | Single regression agreement with true value N (%) |
| 1 | 150 | 142 (95%) | 88 (59%) | 189 | 173 (92%) | 115 (61%) | 139 | 124 (89%) | 76 (55%) |
| 2 | 175 | 141 (81%) | 48 (27%) | 182 | 142 (78%) | 46 (25%) | 161 | 124 (77%) | 46 (29%) |
| 3 | 155 | 150 (97%) | 111 (72%) | 173 | 164 (95%) | 114 (66%) | 156 | 146 (94%) | 94 (60%) |
| 4 | 146 | 115 (79%) | 47 (32%) | 176 | 137 (78%) | 47 (27%) | 170 | 143 (84%) | 40 (24%) |
| 5 | 158 | 125 (79%) | 47 (30%) | 185 | 150 (81%) | 45 (24%) | 159 | 126 (79%) | 49 (31%) |
| 6 | 160 | 132 (83%) | 43 (27%) | 170 | 132 (78%) | 44 (26%) | 249 | 206 (83%) | 75 (30%) |
| 7 | 145 | 132 (91%) | 64 (44%) | 199 | 181 (91%) | 110 (55%) | 167 | 153 (92%) | 85 (51%) |
| 8 | 161 | 142 (88%) | 58 (36%) | 190 | 171 (90%) | 82 (43%) | 152 | 132 (87%) | 70 (46%) |
| 9 | 172 | 164 (95%) | 94 (55%) | 184 | 172 (93%) | 107 (58%) | 177 | 164 (93%) | 93 (53%) |
| 10 | 174 | 156 (90%) | 69 (40%) | 166 | 144 (87%) | 65 (39%) | 152 | 131 (86%) | 64 (42%) |
| 11 | 163 | 126 (77%) | 63 (39%) | 184 | 153 (83%) | 50 (27%) | 142 | 119 (84%) | 49 (35%) |
| 12 | 151 | 126 (83%) | 66 (44%) | 168 | 146 (87%) | 60 (36%) | 155 | 131 (85%) | 52 (34%) |
| 13 | 151 | 135 (89%) | 55 (36%) | 155 | 135 (87%) | 71 (46%) | 141 | 125 (89%) | 53 (38%) |
| 14 | 165 | 149 (90%) | 74 (45%) | 195 | 175 (90%) | 84 (43%) | 153 | 134 (88%) | 68 (44%) |
| 15 | 157 | 134 (85%) | 53 (34%) | 186 | 169 (91%) | 63 (34%) | 147 | 126 (86%) | 60 (41%) |
| 16 | 147 | 132 (90%) | 59 (40%) | 194 | 154 (79%) | 72 (37%) | 163 | 141 (87%) | 59 (36%) |
| 17 | 152 | 143 (94%) | 87 (57%) | 184 | 179 (97%) | 105 (57%) | 150 | 135 (90%) | 81 (54%) |
| 18 | 153 | 135 (88%) | 68 (44%) | 169 | 154 (91%) | 93 (55%) | 152 | 140 (92%) | 84 (55%) |
| 19 | 177 | 174 (98%) | 142 (80%) | 174 | 167 (96%) | 135 (78%) | 169 | 164 (97%) | 147 (87%) |
| 20 | 162 | 140 (86%) | 73 (45%) | 185 | 157 (85%) | 75 (41%) | 169 | 152 (90%) | 72 (43%) |
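One way to read the two "capture" columns is sketched below. It assumes that "coverage by range" means the true response lies between the smallest and largest of the multiply imputed values for that item, and that "agreement" means the single-regression prediction, rounded to the nearest whole response, equals the true response. Both readings, the rounding rule, and the function names are assumptions made for illustration; the excerpt does not spell out the definitions.

```python
def mi_covers_true_value(true_value, imputed_values):
    """True if the true response lies within the range of the imputed values."""
    return min(imputed_values) <= true_value <= max(imputed_values)


def single_regression_agrees(true_value, predicted_value):
    """True if the prediction, rounded half up, equals the true response."""
    return int(predicted_value + 0.5) == true_value


def capture_rates(cases):
    """Return (MI coverage %, single regression agreement %) for one item.

    Each case is a tuple: (true value, list of MI imputations, regression prediction).
    """
    n = len(cases)
    covered = sum(mi_covers_true_value(t, imps) for t, imps, _ in cases)
    agreed = sum(single_regression_agrees(t, pred) for t, _, pred in cases)
    return 100.0 * covered / n, 100.0 * agreed / n
```

Under these definitions the two percentages are not directly comparable: a range built from several imputations is a wider target than a single rounded prediction, so some of the gap between the columns would be expected by construction.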