Multiple Imputation of Missing Race and Ethnicity in CDC COVID-19 Case-Level Surveillance Data.

Guangyu Zhang, Charles E Rose, Yujia Zhang, Rui Li, Florence C Lee, Greta Massetti, Laura E Adams.

Abstract

The COVID-19 pandemic has resulted in a disproportionate burden on racial and ethnic minority groups, but incompleteness in surveillance data limits understanding of disparities. CDC's case-based surveillance system contains case-level information on most COVID-19 cases in the United States. Data analyzed in this paper contain COVID-19 cases with case-level information through September 25, 2020, which represent 70.9% of all COVID-19 cases reported to CDC during the period. Case-level surveillance data are used to investigate COVID-19 disparities by race/ethnicity, sex, and age. However, demographic information on race and ethnicity is missing for a substantial percentage of COVID-19 cases (e.g., 35.8% and 47.2% of cases analyzed were missing race and ethnicity information, respectively). Our goal in this study was to impute missing race and ethnicity to derive more accurate incidence and incidence rate ratio (IRR) estimates for different racial and ethnic groups, and evaluate the results from imputation compared to complete case analysis, which involves removing cases with missing race/ethnicity information from the analysis. Two multiple imputation (MI) models were developed. Model 1 imputes race using six binary race variables, and Model 2 imputes race as a composite multinomial variable. Our evaluation found that compared with complete case analysis, MI reduced biases and improved coverage on incidence and IRR estimates for all race/ethnicity groups, except for the Non-Hispanic Multiple/other group. Our research highlights the importance of supplementing complete case analysis with additional methods of analysis to better describe racial and ethnic disparities. When race and ethnicity data are missing, multiple imputation may provide more accurate incidence and IRR estimates to monitor these disparities in tandem with efforts to improve the collection of race and ethnicity information for pandemic surveillance.
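To make the incidence rate ratio (IRR) and the multiple-imputation step concrete, the sketch below pools log-IRR estimates from several imputed datasets. It is only an illustration under stated assumptions: the abstract does not say how estimates were combined, so Rubin's rules are assumed here as the conventional choice; the case counts, population sizes, and function names are hypothetical; and the interval uses a normal approximation rather than the t reference distribution usually paired with Rubin's rules.

```python
import math
from statistics import mean, variance


def log_irr(cases_a, pop_a, cases_b, pop_b):
    """Log incidence rate ratio of group A vs. reference group B.

    The variance uses the usual Poisson approximation 1/cases_a + 1/cases_b.
    """
    est = math.log((cases_a / pop_a) / (cases_b / pop_b))
    var = 1 / cases_a + 1 / cases_b
    return est, var


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules (assumed here)."""
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical case counts (group A, reference group B) in three imputed datasets.
imputed_counts = [(520, 480), (540, 470), (510, 495)]
POP_A, POP_B = 10_000, 20_000        # hypothetical population denominators

per_imputation = [log_irr(a, POP_A, b, POP_B) for a, b in imputed_counts]
ests = [e for e, _ in per_imputation]
vars_ = [v for _, v in per_imputation]

pooled_log_irr, total_var = pool_rubin(ests, vars_)
se = math.sqrt(total_var)

# Normal-approximation 95% interval, back-transformed to the IRR scale.
irr = math.exp(pooled_log_irr)
ci = (math.exp(pooled_log_irr - 1.96 * se), math.exp(pooled_log_irr + 1.96 * se))
print(f"Pooled IRR: {irr:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

A complete case analysis would instead compute a single IRR from the records with observed race and ethnicity only, which is the comparison the abstract evaluates.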

Keywords:  Health Equity; Missing Data; Multiple Imputation; Race and Ethnicity

Year:  2022        PMID: 35368775      PMCID: PMC8967240          DOI: 10.6000/1929-6029.2022.11.01

Source DB:  PubMed          Journal:  Int J Stat Med Res        ISSN: 1929-6029

