Ada O Youk, Roslyn A Stone, Gary M Marsh. Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA. ayouk@pitt.edu
Abstract
PURPOSE: In a cohort in which racial data are unknown for some persons, race-specific persons and person-years are imputed using a model-based iterative allocation algorithm (IAA). METHODS: An EM algorithm-based approach to address misclassification in a censored data regression setting can be adapted to estimate the probability that a person of unknown race is white. The corresponding race-specific person-years are obtained as a by-product of the estimation procedure. Variance estimates are computed using the bootstrap. The proposed approach is compared with the proportional allocation method (PAM). RESULTS: In an occupational cohort where racial data were missing for 41% of the workers, the age-time-race-specific person-years were estimated within a relative variation of approximately 20%, using the IAA. The deaths were less reliably estimated. The standardized mortality ratios (SMRs) for all-cause mortality estimated using the IAA and the PAM were more similar for the non-white workers than for a smaller subgroup of white workers. CONCLUSIONS: The IAA provides a method to reliably estimate race-specific person-year denominators in cohort studies with missing racial data. This method is applicable to other incompletely observed non-time-dependent categorical covariates. Internal cohort rates or SMRs can be computed and modeled, with bootstrap confidence intervals that account for the uncertainty in the determination of race.
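As a rough illustration of the comparison method only, the sketch below shows one way proportional allocation and an SMR could be computed. It assumes that the PAM splits the unknown-race person-years in each age-time stratum in the same proportion as the known-race person-years; the abstract does not spell out this detail. The function names, the stratum layout, and the numbers in the usage note are hypothetical and are not taken from the study. The IAA itself is not reproduced here.

```python
def proportional_allocation(known_white, known_nonwhite, unknown):
    """Split one stratum's unknown-race person-years by the known-race split.

    Assumption (not from the abstract): allocation is done within each
    age-time stratum, using only that stratum's known-race person-years.
    Returns (white_person_years, nonwhite_person_years).
    """
    known_total = known_white + known_nonwhite
    if known_total == 0:
        raise ValueError("no known-race person-years in this stratum")
    share_white = known_white / known_total
    white = known_white + unknown * share_white
    nonwhite = known_nonwhite + unknown * (1 - share_white)
    return white, nonwhite


def smr(observed_deaths, person_years, reference_rates):
    """Standardized mortality ratio: observed deaths / expected deaths.

    Expected deaths are the stratum person-years multiplied by the
    matching reference (standard population) rates, summed over strata.
    """
    expected = sum(py * rate for py, rate in zip(person_years, reference_rates))
    return observed_deaths / expected
```

For example, with hypothetical values of 600 known white, 200 known non-white, and 400 unknown person-years in a stratum, `proportional_allocation(600.0, 200.0, 400.0)` assigns 900 person-years to white and 300 to non-white workers. Bootstrap confidence intervals of the kind mentioned in the abstract would repeat the whole allocation on resampled cohorts rather than on the allocated totals.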