Graeme Hawthorne (1), Peter Elliott. (1) Department of Psychiatry, Australian Centre for Posttraumatic Mental Health, The University of Melbourne, PO Box 5444, West Heidelberg, Melbourne, Victoria 3081, Australia.
Abstract
OBJECTIVE: Increasing awareness of how missing data affect the analysis of clinical and public health interventions has led to a growing number of missing data procedures. There is little advice regarding which procedures should be selected under different circumstances. This paper compares six popular procedures: listwise deletion, item mean substitution, person mean substitution at two levels, regression imputation and hot deck imputation. METHOD: Using a complete dataset, each was examined under a variety of sample sizes and differing levels of missing data. The criteria were the true t-values for the entire sample. RESULTS: The results suggest important differences. If missing data are from a scale where about half the items are present, hot deck imputation or person mean substitution are best. Because person mean substitution is computationally simpler, similar in its efficiency, advocated by other researchers and more likely to be an option on statistical software packages, it is the method of choice. If the missing data are from a scale where more than half the items are missing, or with single-item measures, then hot deck imputation is recommended. The findings also showed that listwise deletion and item mean substitution performed poorly. CONCLUSIONS: Person mean and hot deck imputation are preferred. Since listwise deletion and item mean substitution performed poorly, yet are the most widely reported methods, the findings have broad implications.
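To make the compared procedures concrete, the following is a minimal Python sketch of three of them (listwise deletion, item mean substitution and person mean substitution) applied to one multi-item scale. It is an illustration under stated assumptions, not the authors' implementation: the toy `responses` data, the function names and the `min_present` threshold are all hypothetical, and the abstract does not say which two levels of person mean substitution were tested. Regression and hot deck imputation are left out because the abstract does not describe how predictors or donors were chosen.

```python
from statistics import mean

# Hypothetical toy data: each row is a respondent, each column an item of
# one scale. None marks a missing response.
responses = [
    [4, 5, None, 4],
    [2, None, None, 1],
    [3, 3, 4, 2],
    [None, None, None, 5],
]


def listwise_deletion(rows):
    """Drop every respondent who has at least one missing item."""
    return [list(row) for row in rows if None not in row]


def item_mean_substitution(rows):
    """Replace a missing item with that item's mean across respondents.

    Assumes every item was answered by at least one respondent.
    """
    columns = list(zip(*rows))
    item_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        [item_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


def person_mean_substitution(rows, min_present=0.5):
    """Replace a missing item with the respondent's own mean.

    The mean is taken over the items the respondent did answer, and is used
    only when at least `min_present` of the items are observed (an assumed
    threshold). Rows below the threshold are returned unchanged.
    """
    result = []
    for row in rows:
        observed = [v for v in row if v is not None]
        if observed and len(observed) / len(row) >= min_present:
            person_mean = mean(observed)
            result.append([person_mean if v is None else v for v in row])
        else:
            result.append(list(row))
    return result
```

In this toy data, listwise deletion keeps only the one complete respondent, while person mean substitution fills the first two rows and leaves the fourth untouched because only one of its four items is present. That fourth row is the kind of case for which the abstract recommends hot deck imputation instead.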