Antonia J Henry (1), Nathanael D Hevelone, Stuart Lipsitz, Louis L Nguyen.
(1) Division of Vascular & Endovascular Surgery, Brigham & Women's Hospital, Harvard Medical School, Boston, Mass; Center for Surgery and Public Health, Brigham & Women's Hospital, Harvard Medical School, Boston, Mass.
Abstract
OBJECTIVE: Analysis of complex survey databases is an important tool for health services researchers. Missing data elements are challenging because the reasons for "missingness" are multifactorial, especially for categorical variables such as race. We simulated missing data for race and analyzed the bias from five methods used in predicting major amputation in patients with critical limb ischemia (CLI). METHODS: Patient discharges with fully observed data containing lower extremity revascularization or major amputation and CLI were selected from the 2003 to 2007 Nationwide Inpatient Sample, a complex survey database (weighted n = 684,057). Considering several random missing data schemes, we compared five missing data methods: complete case analysis, replacement with observed frequencies, missing indicator variable, multiple imputation, and reweighted estimating equations. We created 100 simulated data sets, with 5%, 15%, or 30% of subjects' race drawn to be missing from the full data set. Bias was estimated by comparing the estimated regression coefficients averaged over 100 simulated data sets (β(miss)) from each method vs estimates from the fully observed data set (β(full)), with relative bias calculated as [(β(full) - β(miss)) / β(full)] × 100%. RESULTS: Reweighted estimating equations produced the least biased coefficients and the missing indicator variable method produced the most biased. Complete case analysis, replacement with observed frequencies, and multiple imputation resulted in moderate bias. Sensitivity analysis demonstrated that the optimal method choice depends on the quantity and type of missing data encountered. CONCLUSIONS: Missing data are an important analytic topic in research with large databases. The commonly used missing indicator variable method introduces severe bias and should be used with caution. We present empiric evidence to guide method selection for handling missing data.
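The relative bias measure described in METHODS can be illustrated with a small sketch. This is not the authors' code or data: it assumes a synthetic cohort with one binary covariate standing in for race, an outcome standing in for major amputation, values deleted completely at random, and complete case analysis as the only method shown. Survey weights and the other four methods are omitted. The names `log_odds_ratio`, `relative_bias`, and `simulate`, and all probabilities used, are hypothetical. With a single binary covariate, the logistic regression coefficient equals the log odds ratio of the 2x2 table, which keeps the example free of external libraries.

```python
import math
import random


def log_odds_ratio(rows):
    """Log odds ratio of y for x=1 vs x=0 from (x, y) pairs.

    Equals the coefficient of a logistic regression of y on a single
    binary covariate x. Assumes no cell of the 2x2 table is empty.
    """
    counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for x, y in rows:
        counts[(x, y)] += 1
    return math.log(
        counts[(1, 1)] * counts[(0, 0)] / (counts[(1, 0)] * counts[(0, 1)])
    )


def relative_bias(beta_full, beta_miss):
    """Relative bias in percent: [(beta_full - beta_miss) / beta_full] * 100."""
    return (beta_full - beta_miss) / beta_full * 100.0


def simulate(n=5000, frac_missing=0.15, n_sims=100, seed=0):
    """Average the complete case estimate over n_sims simulated data sets."""
    rng = random.Random(seed)

    # Hypothetical fully observed cohort (illustrative probabilities only).
    full = []
    for _ in range(n):
        x = 1 if rng.random() < 0.30 else 0
        p_outcome = 0.35 if x else 0.20
        y = 1 if rng.random() < p_outcome else 0
        full.append((x, y))
    beta_full = log_odds_ratio(full)

    # Delete the covariate completely at random, then keep complete cases.
    estimates = []
    for _ in range(n_sims):
        observed = [row for row in full if rng.random() >= frac_missing]
        estimates.append(log_odds_ratio(observed))
    beta_miss = sum(estimates) / n_sims

    return beta_full, beta_miss, relative_bias(beta_full, beta_miss)


if __name__ == "__main__":
    for frac in (0.05, 0.15, 0.30):
        b_full, b_miss, rb = simulate(frac_missing=frac)
        print(f"missing={frac:.0%} beta_full={b_full:.3f} "
              f"beta_miss={b_miss:.3f} relative_bias={rb:.2f}%")
```

Because the deletion here is completely at random, complete case analysis should show little bias in this toy setting; the missingness schemes studied in the abstract are broader, so these numbers say nothing about the reported results.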
Authors: Muhammad Ali Chaudhary; Jeffrey K Lange; Linda M Pak; Justin A Blucher; Lauren B Barton; Daniel J Sturgeon; Tracey Koehlmoos; Adil H Haider; Andrew J Schoenfeld Journal: Clin Orthop Relat Res Date: 2018-08 Impact factor: 4.176
Authors: Hassanain Jassim; Johnathan T Seligman; Matthew Frelich; Matthew Goldblatt; Andrew Kastenmeier; James Wallace; Heather S Zhao; Aniko Szabo; Jon C Gould Journal: Surg Endosc Date: 2014-06-18 Impact factor: 4.584
Authors: Alisa Khan; Maitreya Coffey; Katherine P Litterer; Jennifer D Baird; Stephannie L Furtak; Briana M Garcia; Michele A Ashland; Sharon Calaman; Nicholas C Kuzma; Jennifer K O'Toole; Aarti Patel; Glenn Rosenbluth; Lauren A Destino; Jennifer L Everhart; Brian P Good; Jennifer H Hepps; Anuj K Dalal; Stuart R Lipsitz; Catherine S Yoon; Katherine R Zigmont; Rajendu Srivastava; Amy J Starmer; Theodore C Sectish; Nancy D Spector; Daniel C West; Christopher P Landrigan; Brenda K Allair; Claire Alminde; Wilma Alvarado-Little; Marisa Atsatt; Megan E Aylor; James F Bale; Dorene Balmer; Kevin T Barton; Carolyn Beck; Zia Bismilla; Rebecca L Blankenburg; Debra Chandler; Amanda Choudhary; Eileen Christensen; Sally Coghlan-McDonald; F Sessions Cole; Elizabeth Corless; Sharon Cray; Roxi Da Silva; Devesh Dahale; Benard Dreyer; Amanda S Growdon; LeAnn Gubler; Amy Guiot; Roben Harris; Helen Haskell; Irene Kocolas; Elizabeth Kruvand; Michele Marie Lane; Kathleen Langrish; Christy J W Ledford; Kheyandra Lewis; Joseph O Lopreiato; Christopher G Maloney; Amanda Mangan; Peggy Markle; Fernando Mendoza; Dale Ann Micalizzi; Vineeta Mittal; Maria Obermeyer; Katherine A O'Donnell; Mary Ottolini; Shilpa J Patel; Rita Pickler; Jayne Elizabeth Rogers; Lee M Sanders; Kimberly Sauder; Samir S Shah; Meesha Sharma; Arabella Simpkin; Anupama Subramony; E Douglas Thompson; Laura Trueman; Tanner Trujillo; Michael P Turmelle; Cindy Warnick; Chelsea Welch; Andrew J White; Matthew F Wien; Ariel S Winn; Stephanie Wintch; Michael Wolf; H Shonna Yin; Clifton E Yu Journal: JAMA Pediatr Date: 2017-04-01 Impact factor: 16.193
Authors: Roland A Hernandez; Nathanael D Hevelone; Lenny Lopez; Samuel R G Finlayson; Eva Chittenden; Zara Cooper Journal: Am J Surg Date: 2014-10-13 Impact factor: 2.565
Authors: Jessica A Chen; Joseph E Glass; Kara M K Bensley; Simon B Goldberg; Keren Lehavot; Emily C Williams Journal: J Subst Abuse Treat Date: 2020-07-15
Authors: John W Scott; John A Rose; Thomas C Tsai; Cheryl K Zogg; Mark G Shrime; Benjamin D Sommers; Ali Salim; Adil H Haider Journal: Med Care Date: 2016-09 Impact factor: 2.983
Authors: Chester K Yarbrough; Kerry M Bommarito; Paul G Gamble; Ammar H Hawasli; Ian G Dorward; Margaret A Olsen; Wilson Z Ray Journal: J Neurosurg Sci Date: 2016-03-03 Impact factor: 2.279
Authors: Patricia C Dykes; Ronen Rozenblum; Anuj Dalal; Anthony Massaro; Frank Chang; Marsha Clements; Sarah Collins; Jacques Donze; Maureen Fagan; Priscilla Gazarian; John Hanna; Lisa Lehmann; Kathleen Leone; Stuart Lipsitz; Kelly McNally; Conny Morrison; Lipika Samal; Eli Mlaver; Kumiko Schnock; Diana Stade; Deborah Williams; Catherine Yoon; David W Bates Journal: Crit Care Med Date: 2017-08 Impact factor: 9.296