
Association between air pollution and COVID-19 disease severity via Bayesian multinomial logistic regression with partially missing outcomes.

Lauren Hoskovec1, Sheena Martenies2, Tori L Burket3, Sheryl Magzamen4, Ander Wilson1.   

Abstract

Recent ecological analyses suggest air pollution exposure may increase susceptibility to and severity of coronavirus disease 2019 (COVID-19). Individual-level studies are needed to clarify the relationship between air pollution exposure and COVID-19 outcomes. We conduct an individual-level analysis of long-term exposure to air pollution and weather on peak COVID-19 severity. We develop a Bayesian multinomial logistic regression model with a multiple imputation approach to impute partially missing health outcomes. Our approach is based on the stick-breaking representation of the multinomial distribution, which offers computational advantages, but presents challenges in interpreting regression coefficients. We propose a novel inferential approach to address these challenges. In a simulation study, we demonstrate our method's ability to impute missing outcome data and improve estimation of regression coefficients compared to a complete case analysis. In our analysis of 55,273 COVID-19 cases in Denver, Colorado, increased annual exposure to fine particulate matter in the year prior to the pandemic was associated with increased risk of severe COVID-19 outcomes. We also found COVID-19 disease severity to be associated with interactions between exposures. Our individual-level analysis fills a gap in the literature and helps to elucidate the association between long-term exposure to air pollution and COVID-19 outcomes.
© 2022 The Authors. Environmetrics published by John Wiley & Sons Ltd.

Keywords:  Pólya‐gamma; SARS‐CoV‐2; categorical regression; multiple imputation

Year:  2022        PMID: 35945947      PMCID: PMC9353392          DOI: 10.1002/env.2751

Source DB:  PubMed          Journal:  Environmetrics        ISSN: 1099-095X            Impact factor:   1.527


INTRODUCTION

Ambient air pollution exposure is a major global environmental health concern (Global Burden of Diseases 2019 Risk Factors Collaborators, 2020; Health Effects Institute, 2018). Long‐term exposure to air pollution is associated with increased rates and severity of chronic diseases including cardiovascular disease, diabetes, asthma, and chronic obstructive pulmonary disease, as well as with increased mortality (Alencar & Santos, 2014; Di, Dai, et al., 2017; Di, Wang, et al., 2017; Dockery & Pope, 1994; Dockery et al., 1993; Pan et al., 2018; Smith et al., 2000). In addition, poor air quality has a negative impact on infectious diseases, and has been linked to increased rates of influenza (Landguth et al., 2020) and increased fatalities from severe acute respiratory syndrome (SARS; Cui et al., 2003). Previous evidence indicates long‐term exposure to air pollution increases susceptibility to viral disease, leading to more severe outcomes (Ciencewicki & Jaspers, 2007). It is hypothesized that air pollution exposure may be linked to increased severity in the global pandemic of coronavirus disease 2019 (COVID‐19) caused by the novel coronavirus SARS‐CoV‐2 (Comunian et al., 2020; Domingo & Rovira, 2020; Frontera et al., 2020; Setti et al., 2020). Biological pathways similar to those observed with influenza and other respiratory viral infections may exist between exposure to particulate matter and SARS‐CoV‐2 infection, highlighting the possibility of increased COVID‐19 severity among individuals with higher exposure to air pollution (Frontera et al., 2020). The study of the effects of air pollution on COVID‐19 health endpoints has been identified as a critically important area of research for developing solutions to the global COVID‐19 pandemic (Bhaskar et al., 2020).
Studies investigating this relationship have considered exposures such as air quality index, fine and coarse particulate matter, nitrogen oxides, ozone, carbon monoxide, and sulfur dioxide, as well as meteorological factors including temperature and relative humidity. In two literature reviews of studies taking place world‐wide, a majority of articles identified significant associations between short‐ and long‐term exposure to air pollution and negative COVID‐19 endpoints (Bhaskar et al., 2020; Copat et al., 2020). The COVID‐19 endpoints varied among studies and included number of cases, number of deaths, case‐hospitalization rate, case‐fatality rate, percent of severe infection, basic reproduction number, intensive care unit (ICU) admissions, and epidemic escalation. In addition, emerging cohort studies suggest long‐term exposure to air pollution prior to the pandemic is associated with a higher risk of severe COVID‐19 in those infected with SARS‐CoV‐2 (Bozack et al., 2021; Kogevinas et al., 2021). The vast majority of existing studies used ecological designs with aggregated, most commonly county‐level, data. Ecological studies suffer from ecological fallacy; that is, characteristics of the group cannot be attributed to individuals. In their review, Brandt and Mersha (2021) emphasized the need for individual‐level air pollution exposure data and detailed clinical data to establish a causal relationship between air pollution exposure and COVID‐19 outcomes. Individual‐level exposure and risk factor data are needed to minimize bias and potential confounding that can occur at larger spatial resolutions. In addition, health endpoints for COVID‐19 that are measured at the individual level are more accurate than regional endpoints, which may be subject to variations among regions or error due to unmeasured asymptomatic cases and under‐reporting of cases and deaths. 
The current literature is sparse with regards to individual‐level studies on the association between air pollution exposure and COVID‐19 outcomes. We conduct an individual‐level analysis of the association between long‐term exposure to air pollution and weather and peak COVID‐19 severity in a Denver, Colorado, USA administrative cohort. We consider all cases of COVID‐19 that were reported to the Colorado Department of Public Health and Environment (CDPHE) between March 6, 2020 and February 28, 2021, resulting in a cohort size of 57,027 verified COVID‐19 infections. As the primary health outcome, we consider peak severity. Our peak severity outcome takes on one of six mutually exclusive categorical values: asymptomatic, symptomatic, hospitalized, admitted to the ICU, placed on a mechanical ventilator, or death. Our primary interest is estimating the association between long‐term exposure to ambient air pollution and weather and COVID‐19 peak severity. A key challenge when using individual‐level administrative data, especially in the rapidly evolving COVID‐19 pandemic, is the presence of missing health outcomes. Individual health outcomes may be missing due to nonresponse or logistical problems with data attainment. In the Denver, Colorado cohort, health outcomes are either observed or partially missing. For example, it may be known that an individual was not hospitalized or worse, but it is unknown whether the individual was symptomatic or asymptomatic. Observations with partially missing outcomes are often discarded prior to a complete case analysis; however, there is valuable information to gain from the partially missing observations. Hence, there is a need for statistical methods for regression analysis of data with partially missing categorical outcomes. 
In classical statistics, multiple imputation approaches for categorical outcome data include nearest‐neighbor based methods (Zhou et al., 2017), bootstrap hotdeck multiple imputation (Wang & Hsu, 2020), inverse probability weighting, and expected estimating equations. These methods generally require discrete covariates, though continuous covariates can be incorporated through discretization. In Bayesian statistics, missing data are handled naturally by sampling from the posterior predictive distribution of the missing data given the observed data. Currently, however, there are no fully Bayesian approaches for multinomial logistic regression with missing outcome data. In this article, we propose a Bayesian multinomial logistic regression model for data that contain observations with partially missing categorical outcomes. Fully Bayesian inference in categorical and multinomial regression has been historically challenging due to nonconjugate priors for the model's likelihood. In our analysis, we base inference on the odds ratio for each peak severity category; hence, we require a logit link function. Polson et al. (2013) proposed a Pólya‐gamma data augmentation approach for Bayesian logit models, and extended the Pólya‐gamma approach to multinomial models by combining it with the data augmentation approach from Holmes and Held (2006). This approach requires sampling the category‐specific regression coefficients one at a time, which can cause slow mixing and convergence in correlated models. To address this issue, Linderman et al. (2015) proposed modeling the multinomial distribution recursively with binomial distributions via the stick‐breaking representation. The stick‐breaking approach permits parallelized updates of the regression parameters, leading to more efficient mixing. Though the stick‐breaking approach offers computational improvements, it presents an inferential challenge because the odds ratio ceases to be a linear function of the exposures. 
We develop the first fully Bayesian multinomial logistic regression model for partially missing outcome data in which the primary goal is inference on the odds ratios. Our method builds on the approach of Linderman et al. (2015), and we address the inferential challenges induced by the stick‐breaking approach through post‐processing and visualization of the posterior distribution. Our model imputes partially missing health outcome data, where the number of missing outcome categories can vary by individual. Using the proposed model, we estimate the association between long‐term exposure to fine particulate matter, ozone, and temperature and peak COVID‐19 severity in the presence of missing outcome data, while controlling for individual‐ and neighborhood‐level risk factors. We find evidence of a positive association between exposure to fine particulate matter and increased risk of severe COVID‐19, as well as a suggestive interaction effect between fine particulate matter and ozone. Our individual‐level analysis supports existing research on air pollution and COVID‐19, and provides the additional contribution of beginning to draw a causal link.

DATA

Health data

We obtained health outcome data from Denver Public Health, a department of Denver Health and Hospital Authority (DHHA). Our study population includes 57,027 laboratory‐confirmed cases of COVID‐19 in the City and County of Denver, Colorado reported between March 6, 2020 and February 28, 2021. The data include information about the case status, including whether an individual was symptomatic, hospitalized, admitted to an ICU, placed on a mechanical ventilator, or died. The case outcome data had missing observations, primarily due to lack of staff capacity to follow up with cases regarding disease outcomes. Hence, the missingness mechanism was assumed to be missing at random. We made two assumptions to deterministically fill in some of the missing outcome data. First, since deaths were accurately reported to the City and County of Denver, we assumed that a case with missing death status did not die. Second, we assumed that a case that was not symptomatic was not hospitalized, a case that was not hospitalized was not admitted to the ICU, and a case that was not admitted to the ICU was not placed on a mechanical ventilator. After deterministically imputing missing outcome data using these basic assumptions, we assigned each case to its most severe outcome. When peak severity could not be determined for an individual due to missing data, all possible peak severity outcome categories were left as missing and imputed by our model. Table 1 shows the resulting pattern of missingness in the data.
TABLE 1

Missing data pattern for the peak severity outcome categories in our analysis of the Denver, Colorado cohort (n = 55,273)

| # missing outcomes | Missing categories | # cases | % cases |
|---|---|---|---|
| 0 | | 20,872 | 37.8 |
| 2 | (Asymptomatic, symptomatic) | 2,916 | 5.3 |
| 3 | (Symptomatic, hospitalized, ICU) | 59 | 0.1 |
| 4 | (Symptomatic, hospitalized, ICU, ventilator) | 8,725 | 15.8 |
| 5 | (Asymptomatic, symptomatic, hospitalized, ICU, ventilator) | 22,701 | 41.1 |

Note: Cases with partially missing outcomes were missing between 2 and 5 outcome categories. The table shows the number and percent of cases with each missing outcome category pattern.
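
The deterministic fill-in rules and the resulting sets of candidate peak severity categories can be sketched as follows. This is a minimal illustration, not the authors' code: the flag names, the use of `None` for a missing flag, and the rule used to decide which categories remain possible (own flag not a known "no", no more severe flag a known "yes") are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Severity flags ordered from least to most severe (names are hypothetical).
FLAGS = ["symptomatic", "hospitalized", "icu", "ventilator", "death"]
CATEGORIES = ["asymptomatic"] + FLAGS


def fill_deterministic(case: Dict[str, Optional[bool]]) -> Dict[str, Optional[bool]]:
    """Apply the two deterministic rules; None marks a missing flag."""
    filled = dict(case)
    # Rule 1: a missing death status is treated as "did not die".
    if filled["death"] is None:
        filled["death"] = False
    # Rule 2: a known "no" propagates upward from symptomatic to ventilator,
    # filling only flags that are missing.
    for lower, upper in zip(FLAGS[:3], FLAGS[1:4]):
        if filled[lower] is False and filled[upper] is None:
            filled[upper] = False
    return filled


def possible_peaks(case: Dict[str, Optional[bool]]) -> List[str]:
    """Return the peak severity categories still compatible with the flags."""
    filled = fill_deterministic(case)
    values = [filled[f] for f in FLAGS]  # values[j] belongs to CATEGORIES[j + 1]
    possible = []
    for c, name in enumerate(CATEGORIES):
        # Assumed rule: the category's own flag must not be a known "no" ...
        own_ok = True if c == 0 else values[c - 1] is not False
        # ... and no more severe flag may be a known "yes".
        no_worse_yes = all(v is not True for v in values[c:])
        if own_ok and no_worse_yes:
            possible.append(name)
    return possible


all_missing = dict.fromkeys(FLAGS, None)
print(possible_peaks(all_missing))
print(possible_peaks({**all_missing, "symptomatic": True}))
```

A case whose returned list has a single element is fully observed; lists of length 2 to 5 correspond to the partially missing patterns in Table 1.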

Exposure data

We obtained air pollutant and meteorological exposure data from the Colorado Department of Public Health and Environment (CDPHE) website (Department of Public Health and Environment, 2021). The exposure metric of interest was annual average exposure to fine particulate matter with an aerodynamic diameter less than 2.5 $\mu$m (PM$_{2.5}$; $\mu$g/m$^3$), ozone (ppb), and temperature (degrees Fahrenheit) in the year prior to the COVID‐19 pandemic in Denver, Colorado. We opted to use the annual average exposure metric because our hypothesis was that chronic exposure to air pollutants would put individuals at higher risk of severe COVID‐19 outcomes by priming immune cells for a large inflammatory response (Domingo & Rovira, 2020; Tripathy et al., 2021). The first officially documented case of COVID‐19 in Denver was March 6, 2020. We therefore define the year prior to the pandemic, our exposure period, as March 1, 2019 through February 29, 2020. During the exposure period, we calculated daily average exposure for PM$_{2.5}$ and temperature and 1‐h maximum daily average for ozone from hourly measurements recorded at ground monitoring stations. We excluded daily variables if more than 25% of hourly observations recorded at that monitoring site for that day were missing. We excluded monitors that were located in the Rocky Mountains west of Denver because that area experiences unique meteorological conditions not representative of the study area (Vedal et al., 2009). Using inverse‐distance weighting, we assigned daily exposures to individual residential locations using data from all monitors within 50 km of the individual's address. Finally, we calculated each individual's annual average exposure during the year prior to the pandemic by averaging the daily exposure values. For our analyses, exposure data were centered and divided by the interquartile range (IQR) prior to model fitting.
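
The inverse‐distance weighting and IQR scaling steps described above can be illustrated with a short sketch. Several details are assumptions, since the text does not specify them: great-circle (haversine) distance, a distance exponent `power` of 1, centering at the mean, and the inclusive quantile method for the IQR. All function names are hypothetical.

```python
import math
import statistics
from typing import List, Optional, Sequence, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def idw_exposure(
    home: Tuple[float, float],
    monitors: Sequence[Tuple[float, float, float]],
    max_km: float = 50.0,
    power: float = 1.0,  # assumed exponent; not stated in the text
) -> Optional[float]:
    """Inverse-distance-weighted daily value from monitors within max_km.

    monitors holds (latitude, longitude, daily value) triples.
    Returns None when no monitor lies within max_km.
    """
    num = den = 0.0
    for lat, lon, value in monitors:
        d = haversine_km(home[0], home[1], lat, lon)
        if d > max_km:
            continue
        if d == 0.0:
            return value  # residence coincides with a monitor
        weight = 1.0 / d ** power
        num += weight * value
        den += weight
    return num / den if den > 0 else None


def iqr_scale(values: Sequence[float]) -> List[float]:
    """Center at the mean (assumed) and divide by the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    center = statistics.fmean(values)
    return [(v - center) / (q3 - q1) for v in values]


home = (39.74, -104.99)
monitors = [(39.84, -104.99, 10.0), (39.64, -104.99, 20.0), (45.0, -104.99, 100.0)]
print(idw_exposure(home, monitors))  # the distant third monitor is ignored
```

The annual average for an individual would then be the mean of these daily values over the exposure period, computed before scaling.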

Covariate data

We included individual‐ and census tract‐level variables. We obtained individual‐level variables from Denver Public Health's COVID‐19 case investigation database. These variables included the continuous covariate age and the categorical covariates gender, race/ethnicity, and pregnancy status. We also included case report date, defined as the date the case was first reported to CDPHE. The individual‐level covariate data contained a small number of missing observations. To impute missing categorical covariate data, we first assumed that if the case was listed as male then the case was not pregnant. We then singly imputed the missing values for categorical covariates with 0 and added a dummy variable for each covariate with missing data that indicated which values of the covariate were missing. For gender, a value of “other” was combined with the missing group due to the small number in the “other” group (). We obtained census‐tract variables summarizing socioeconomic status from the 2015–2019 American Community Survey (United States Census Bureau, 2020) using the tidycensus package in R (Walker et al., 2021). These variables included median income, percent of the civilian workforce aged 16 and older that is unemployed (unemployed), percent of the population aged 25 and older with less than a high school diploma or equivalent education (low education), and percent of individuals in the census tract with past year's income below the poverty level (poverty). We obtained a final sample size of 55,273 individuals for which we were able to link the health outcome data with complete covariate and exposure data. We provide a summary of the demographic characteristics of the sample in the Supporting Information (Table S1). We show demographic information stratified by completely observed or partially missing health outcomes in Table S2. This study was approved by the Institutional Review Board of Colorado State University.
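
The single imputation with a missing‐value indicator described above for categorical covariates might look like the following sketch. The 0/1 coding, the use of `None` for a missing value, and the function names are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def fill_pregnancy(gender: Sequence[Optional[str]],
                   pregnant: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Treat cases listed as male as not pregnant before imputing."""
    return [0 if (p is None and g == "male") else p
            for g, p in zip(gender, pregnant)]


def missing_indicator(values: Sequence[Optional[int]]) -> Tuple[List[int], List[int]]:
    """Impute missing values with 0 and return a dummy marking which were missing."""
    imputed = [0 if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing


gender = ["male", "female", None, "female"]
pregnant = [None, 1, None, None]
pregnant = fill_pregnancy(gender, pregnant)
imputed, was_missing = missing_indicator(pregnant)
print(imputed, was_missing)
```

Both the imputed column and its indicator column would enter the covariate vector, so the regression coefficient on the indicator absorbs the difference between the missing group and the reference level.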

STATISTICAL MODEL

We first present the model for complete data. Then we describe our Markov chain Monte Carlo (MCMC) sampler, our multiple imputation approach, and our inferential approach. Software to fit our proposed method is available in the R package pgmultinomr (Hoskovec, 2021).

Complete data model

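For a sample of individuals $i = 1, \ldots, n$, let $\mathbf{y}_i = (y_{i1}, \ldots, y_{iK})^\top$ denote the $K$‐dimensional vector indicating to which of $K$ possible outcome categories individual $i$ belongs. Hence, $\mathbf{y}_i$ contains one 1 and the remaining observations are all 0. Let $\mathbf{x}_i$ denote the vector of exposures for individual $i$. In our analysis, $\mathbf{x}_i$ contains three exposures and all pairwise interactions. Let $\mathbf{w}_i$ denote the vector of covariates measured for individual $i$, including an intercept term. We model $\mathbf{y}_i$ with a multinomial distribution where the number of trials is 1. Using the stick‐breaking representation of the multinomial distribution, we model the complete data for individuals $i = 1, \ldots, n$ by

$$\mathbf{y}_i \sim \prod_{k=1}^{K-1} \text{Binomial}(y_{ik} \mid N_{ik}, \tilde{\pi}_{ik}), \tag{1}$$

where $N_{ik} = 1 - \sum_{j<k} y_{ij}$ and

$$\tilde{\pi}_{ik} = \frac{\pi_{ik}}{1 - \sum_{j<k} \pi_{ij}}. \tag{2}$$

In (1), $\tilde{\pi}_{ik}$ for $k = 1, \ldots, K-1$ are the stick‐specific weights for individual $i$, denoting the proportion of the remaining probability mass assigned to the $k$th category. The parameter $N_{ik}$ denotes the number of remaining trials for the $k$th category, which, in our case, will always be either 0 or 1. We model each $\tilde{\pi}_{ik}$ for $i = 1, \ldots, n$ and $k = 1, \ldots, K-1$ using a logit link function of exposures and covariates. The logit link for the stick‐specific weights is given by

$$\text{logit}(\tilde{\pi}_{ik}) = \eta_{ik} = \mathbf{x}_i^\top \boldsymbol{\beta}_k + \mathbf{w}_i^\top \boldsymbol{\gamma}_k, \tag{3}$$

where $\boldsymbol{\beta}_k$ and $\boldsymbol{\gamma}_k$ are category‐specific regression coefficients for the exposures and covariates, respectively. The model in (1) is equivalent to the standard multinomial model

$$\mathbf{y}_i \sim \text{Multinomial}(1, \boldsymbol{\pi}_i), \tag{4}$$

where $\boldsymbol{\pi}_i = (\pi_{i1}, \ldots, \pi_{iK})^\top$ and

$$\pi_{ik} = \tilde{\pi}_{ik} \prod_{j<k} (1 - \tilde{\pi}_{ij}) \ \text{for } k = 1, \ldots, K-1, \qquad \pi_{iK} = 1 - \sum_{k=1}^{K-1} \pi_{ik}. \tag{5}$$

To achieve efficient Gibbs sampling of the posterior distribution, we implement a Pólya‐gamma data augmentation scheme (Linderman et al., 2015; Polson et al., 2013). We introduce latent Pólya‐gamma random variables $\omega_{ik}$ for $i = 1, \ldots, n$ and $k = 1, \ldots, K-1$ such that $\omega_{ik} \sim \text{PG}(N_{ik}, 0)$, where $\text{PG}(b, c)$ denotes the Pólya‐gamma distribution. Using the stick‐breaking representation and Pólya‐gamma augmentation, the multinomial likelihood for individual $i$ can be written as

$$p(\mathbf{y}_i \mid \boldsymbol{\beta}, \boldsymbol{\gamma}) = \prod_{k=1}^{K-1} \binom{N_{ik}}{y_{ik}} \frac{(e^{\eta_{ik}})^{y_{ik}}}{(1 + e^{\eta_{ik}})^{N_{ik}}} \propto \prod_{k=1}^{K-1} \exp(\kappa_{ik} \eta_{ik}) \, E_{\omega_{ik}}\!\left[\exp(-\omega_{ik} \eta_{ik}^2 / 2)\right], \tag{6}$$

where $\kappa_{ik} = y_{ik} - N_{ik}/2$ and $E_{\omega_{ik}}$ denotes the expectation taken with respect to the Pólya‐gamma random variable $\omega_{ik}$. By conditioning (6) on $\boldsymbol{\omega}_i = (\omega_{i1}, \ldots, \omega_{i,K-1})^\top$, we obtain

$$p(\mathbf{y}_i \mid \boldsymbol{\omega}_i, \boldsymbol{\beta}, \boldsymbol{\gamma}) \propto \prod_{k=1}^{K-1} \exp\!\left(\kappa_{ik} \eta_{ik} - \omega_{ik} \eta_{ik}^2 / 2\right),$$

which is a Gaussian kernel with respect to the regression coefficients. The prior distributions on the regression coefficients are $\boldsymbol{\beta}_k \sim N(\mathbf{0}, \boldsymbol{\Sigma}_\beta)$ and $\boldsymbol{\gamma}_k \sim N(\mathbf{0}, \boldsymbol{\Sigma}_\gamma)$ for $k = 1, \ldots, K-1$, which allow for efficient Gibbs sampling of the posterior distribution.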
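
The mapping from stick‐specific linear predictors to category probabilities can be made concrete with a small sketch. It assumes the linear predictors for one individual have already been computed; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def expit(x: float) -> float:
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def stick_breaking_probs(eta: Sequence[float]) -> List[float]:
    """Convert K-1 stick-specific linear predictors into K category probabilities."""
    probs = []
    remaining = 1.0  # probability mass not yet assigned to a category
    for e in eta:
        stick = expit(e)                 # stick-specific weight
        probs.append(stick * remaining)  # share of the remaining mass
        remaining *= 1.0 - stick
    probs.append(remaining)              # last category takes what is left
    return probs


print(stick_breaking_probs([0.0, 0.0]))  # [0.5, 0.25, 0.25]
```

The example shows why stick‐breaking coefficients are order dependent: equal linear predictors do not give equal category probabilities, because each stick only divides the mass left over by the earlier ones.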

Posterior computation

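In our model, the number of trials is 1. We sample $\omega_{ik}$ only for those individuals and categories such that $N_{ik} = 1$. The full conditional for $\omega_{ik}$ is

$$\omega_{ik} \mid \boldsymbol{\beta}_k, \boldsymbol{\gamma}_k \sim \text{PG}(1, \mathbf{x}_i^\top \boldsymbol{\beta}_k + \mathbf{w}_i^\top \boldsymbol{\gamma}_k).$$

The regression coefficients for the $k$th category only depend on data from individuals where $N_{ik} = 1$. For $k = 1, \ldots, K-1$, let $\mathbf{X}_k$ be a matrix with rows $\mathbf{x}_i^\top$ and $\mathbf{W}_k$ be a matrix with rows $\mathbf{w}_i^\top$ for each $i$ such that $N_{ik} = 1$. We sample the exposure regression coefficients from

$$\boldsymbol{\beta}_k \mid \cdot \sim N(\mathbf{m}_k, \mathbf{V}_k), \qquad \mathbf{V}_k = \left(\mathbf{X}_k^\top \boldsymbol{\Omega}_k \mathbf{X}_k + \boldsymbol{\Sigma}_\beta^{-1}\right)^{-1}, \qquad \mathbf{m}_k = \mathbf{V}_k \mathbf{X}_k^\top \left(\boldsymbol{\kappa}_k - \boldsymbol{\Omega}_k \mathbf{W}_k \boldsymbol{\gamma}_k\right),$$

where $\boldsymbol{\Omega}_k$ is a diagonal matrix with elements $\omega_{ik}$ for each $i$ such that $N_{ik} = 1$, $\boldsymbol{\kappa}_k$ is a vector of $\kappa_{ik} = y_{ik} - 1/2$ for each $i$ such that $N_{ik} = 1$, and $\boldsymbol{\Sigma}_\beta$ is the covariance matrix of the zero‐mean Gaussian prior on $\boldsymbol{\beta}_k$. The regression coefficients for the covariates are similarly updated.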
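
To give a feel for the Pólya‐gamma update, the sketch below draws approximate PG(b, c) variates from the infinite weighted sum of gamma variables given by Polson et al. (2013), truncated at a fixed number of terms. This is a deliberately simple approximation for illustration and is not claimed to be the sampler used in the paper's software; exact samplers are preferable in practice. The function names and the truncation level `n_terms` are assumptions.

```python
import math
import random


def sample_pg_truncated(b: float, c: float, rng: random.Random, n_terms: int = 100) -> float:
    """Approximate draw from PG(b, c) via a truncated sum of Gamma(b, 1) variables."""
    total = 0.0
    for k in range(1, n_terms + 1):
        g = rng.gammavariate(b, 1.0)
        total += g / ((k - 0.5) ** 2 + c * c / (4.0 * math.pi ** 2))
    return total / (2.0 * math.pi ** 2)


def pg_mean(b: float, c: float) -> float:
    """Exact mean of PG(b, c), used here only as a sanity check."""
    return b / 4.0 if c == 0 else b / (2.0 * c) * math.tanh(c / 2.0)


rng = random.Random(0)
# With one trial, the full conditional is PG(1, eta) for the current linear predictor eta.
eta = 2.0
draws = [sample_pg_truncated(1.0, eta, rng) for _ in range(4000)]
print(sum(draws) / len(draws), pg_mean(1.0, eta))
```

Truncation biases the draws slightly downward because the omitted tail terms are all positive, which is one reason this is a sketch rather than a substitute for an exact sampler.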

Multiple imputation

We assume missing outcome data are missing at random (Little & Rubin, 2019), but may be conditional on partial outcome information. The complete data vector $\mathbf{y}_i$ may contain any combination of observed and missing values for the $K$ categories. If the vector $\mathbf{y}_i$ contains missing data, then any observed values must be 0, since the total number of trials is 1. To impute missing outcome data, we sample from the posterior predictive distribution of the missing data given the observed data. Consider an individual $i$ with missing outcome data. Let $\mathcal{M}_i$ denote the set of outcome categories with missing data and $\mathcal{O}_i$ the set of outcome categories that are observed for individual $i$. If $|\mathcal{M}_i| = K$ (i.e., the individual is missing outcome data for all categories), then the posterior predictive distribution is

$$\mathbf{y}_i \mid \cdot \sim \text{Multinomial}(1, \boldsymbol{\pi}_i),$$

where $\boldsymbol{\pi}_i$ is calculated by (2), (3), and (5). If some outcome categories for individual $i$ are observed and some are missing, then we leverage the partial information to improve imputations. Let $\mathcal{M}_i$ denote the set of categories with missing data for individual $i$ and let $\mathcal{O}_i$ denote the set of categories that are observed. Note that $\mathcal{M}_i \cup \mathcal{O}_i = \{1, \ldots, K\}$. For partially missing outcomes in our analysis, the number of missing outcome categories may range from 2 to 5. Since we know the observed outcome categories must all be 0, we can sample the missing outcome categories from a reduced dimensional multinomial distribution. First, we calculate the entire probability vector $\boldsymbol{\pi}_i$ by (2), (3), and (5). Then we calculate $\boldsymbol{\pi}_i^* = \{\pi_{ik}^* : k \in \mathcal{M}_i\}$, where

$$\pi_{ik}^* = \frac{\pi_{ik}}{\sum_{j \in \mathcal{M}_i} \pi_{ij}}.$$

Finally, we sample the missing outcome categories from

$$\mathbf{y}_{i, \mathcal{M}_i} \mid \cdot \sim \text{Multinomial}_{K_i}(1, \boldsymbol{\pi}_i^*),$$

where $K_i = |\mathcal{M}_i|$ is the number of missing outcome categories for individual $i$.
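
The reduced‐dimension imputation step amounts to renormalizing the full probability vector over the missing categories and drawing one category. A minimal sketch, assuming the full probability vector for the individual is already available at the current MCMC iteration and using zero‐based category indices (function name hypothetical):

```python
import random
from typing import List, Sequence, Tuple


def impute_outcome(pi: Sequence[float], missing: Sequence[int],
                   rng: random.Random) -> Tuple[List[int], List[float]]:
    """Draw one outcome among the missing categories.

    pi      : full length-K probability vector for the individual
    missing : indices of categories whose outcome is unknown
    Returns the imputed one-hot outcome vector and the renormalized probabilities.
    """
    total = sum(pi[k] for k in missing)
    renormalized = [pi[k] / total for k in missing]
    drawn = rng.choices(list(missing), weights=renormalized, k=1)[0]
    y = [0] * len(pi)  # observed categories are known to be 0
    y[drawn] = 1
    return y, renormalized


rng = random.Random(42)
pi = [0.15, 0.76, 0.06, 0.01, 0.01, 0.01]
# Known not to be hospitalized or worse; the first two categories are unknown.
y, probs = impute_outcome(pi, [0, 1], rng)
print(y, probs)
```

When all categories are missing, the renormalization leaves the probability vector unchanged, so the fully missing case is covered by the same function.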

Inference

The traditional multinomial model (4) exhibits straightforward inference on the regression coefficients, assuming the model only includes main effects. The exponentiated regression coefficient $\exp(\beta_{kj})$ is the odds ratio for category $k$ relative to the reference category that is associated with a one‐unit increase in exposure $j$, holding all other exposures constant. When interactions are included, the interpretation of regression coefficients in the traditional multinomial model is complicated. Unless all co‐exposures are set to zero, in which case inference simply ignores interactions, it is impossible to increase an exposure while holding constant an interaction term containing that exposure. The stick‐breaking representation of the multinomial distribution also presents challenges in interpreting the regression coefficients because $\text{logit}(\tilde{\pi}_{ik})$ for each $k$ is conditional on not being in any category $j < k$. In the stick‐breaking model, $\exp(\beta_{kj})$ is the odds ratio for category $k$ relative to a category greater than $k$, conditional on not being in a category less than $k$, that is associated with a one‐unit increase in exposure $j$, holding all other exposures constant. Not only is the stick‐breaking interpretation difficult to comprehend, it also heavily depends on the ordering of the categories since the reference is to a category greater than $k$. The stick‐breaking model presents the same problem with interpreting interactions as the traditional multinomial model. We propose a visualization approach for inference on the stick‐breaking multinomial model to address these problems. Due to the fully Bayesian nature of our model, we use the posterior distribution of the regression coefficients to recover the traditional odds ratio inference that is common in logistic regression. Let $\theta^{(m)}$ denote the sampled value for the parameter $\theta$ at iteration $m$ of the MCMC sampler.
For iterations $m = 1, \ldots, M$ post burn‐in, we calculate the posterior distribution of the assignment probabilities, $\pi_k^{(m)}$, for each category $k = 1, \ldots, K$, by

$$\tilde{\pi}_k^{(m)}(\mathbf{x}) = \text{logit}^{-1}\!\left(\mathbf{x}^\top \boldsymbol{\beta}_k^{(m)} + \mathbf{w}^\top \boldsymbol{\gamma}_k^{(m)}\right), \qquad \pi_k^{(m)}(\mathbf{x}) = \tilde{\pi}_k^{(m)}(\mathbf{x}) \prod_{j<k} \left(1 - \tilde{\pi}_j^{(m)}(\mathbf{x})\right),$$

with $\tilde{\pi}_K^{(m)}(\mathbf{x}) = 1$. With this method, any category may be selected as the reference category. Let $r$ denote the selected reference category. For fixed covariates $\mathbf{w}$, we calculate the posterior distribution of the odds ratio (OR) for specified exposure values $\mathbf{X}^*$ relative to baseline exposure values $\mathbf{X}_0$ by

$$\text{OR}_k^{(m)} = \frac{\pi_k^{(m)}(\mathbf{X}^*) \,/\, \pi_r^{(m)}(\mathbf{X}^*)}{\pi_k^{(m)}(\mathbf{X}_0) \,/\, \pi_r^{(m)}(\mathbf{X}_0)} \tag{14}$$

for all $k \neq r$, evaluated row by row. In our analysis, we consider three exposures and their pairwise interactions. To create the matrix $\mathbf{X}^*$ in (14), we generate a sequence of evenly‐spaced exposure values within the mean plus or minus two IQR for a primary exposure of interest, and set the two secondary exposures to a fixed percentile. The pairwise interactions are then calculated and all six exposure variables (3 main effects and 3 interactions) are included in $\mathbf{X}^*$. The baseline exposure matrix, $\mathbf{X}_0$, includes the primary exposure set to its mean value, the other two secondary exposures set to the specified percentiles, and the pairwise interactions. Hence, the OR will always be 1 at the mean value of the primary exposure. To visualize the posterior distribution of $\text{OR}_k$, we plot the posterior mean and 95% credible intervals as a function of the primary exposure, holding the secondary exposures at the same fixed percentile. We visualize interaction effects by plotting $\text{OR}_k$ for different percentiles of the secondary exposures. We repeat this procedure three times so each of the three exposures included in our analysis is used as the primary exposure. With our proposed visualization approach, interpreting interaction effects requires care. To determine if there is an interaction, the regression coefficients and 95% credible intervals can be directly interpreted as usual.
To understand the magnitude of an interaction, one can use our proposed visualization approach by setting one of the exposures in the interaction as the primary exposure and the other as a secondary exposure and visualizing how the exposure‐response function for the primary exposure changes as the secondary exposure is moved from low to high percentiles. Changes in the shape of the exposure‐response function for the primary exposure associated with changes in the percentiles of a secondary exposure suggest interaction effects between exposures. In some situations, the peak severity outcomes follow a logical order. Such is the case in our analysis with outcomes: asymptomatic, symptomatic, hospitalized, admitted to the ICU, placed on a mechanical ventilator, and death. In these situations, the ordinal regression model may seem appropriate. However, ordinal regression requires the strong assumption that the odds ratio for being in a category less than or equal to $k$, relative to being in a category greater than $k$, is the same for all categories. Ordinal regression also presents the same interpretation problems for interaction effects as discussed previously for traditional multinomial regression. Hence, we utilize the flexibility and Bayesian nature of our model to estimate the incremental odds ratio (IOR), which is interpreted as the odds ratio of being in category $k$ relative to being in any of the less severe categories. Following a similar approach as we did for the OR, we calculate the posterior distribution of the IOR by

$$\text{IOR}_k^{(m)} = \frac{\pi_k^{(m)}(\mathbf{X}^*) \,/\, \sum_{j<k} \pi_j^{(m)}(\mathbf{X}^*)}{\pi_k^{(m)}(\mathbf{X}_0) \,/\, \sum_{j<k} \pi_j^{(m)}(\mathbf{X}_0)}$$

for the ordered outcome categories $k = 2, \ldots, K$. The IOR thus accounts for the ordinal nature of the outcome by focusing inference on the odds of being in one greater severity outcome. In contrast to ordinal logistic regression, the IOR maintains flexibility by allowing the odds of being in category $k$ relative to all categories less than $k$ to differ for each category $k$.
In addition, the IOR may be a nonlinear function of exposures and permits similar interpretation of interaction effects as described above for the OR.
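
The post‐processing behind the OR and IOR can be sketched as below: for each posterior draw, category probabilities are computed at the specified and baseline exposure values and then combined into ratios of odds. The data layout (one dictionary of coefficient lists per draw), zero‐based category indices, and all function names are assumptions for illustration; the two "posterior draws" are made up.

```python
import math
from typing import List, Sequence, Tuple


def expit(x: float) -> float:
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def category_probs(x: Sequence[float], w: Sequence[float],
                   beta: Sequence[Sequence[float]],
                   gamma: Sequence[Sequence[float]]) -> List[float]:
    """Stick-breaking probabilities; beta[k] and gamma[k] belong to stick k."""
    probs, remaining = [], 1.0
    for b_k, g_k in zip(beta, gamma):
        eta = sum(a * b for a, b in zip(x, b_k)) + sum(a * g for a, g in zip(w, g_k))
        stick = expit(eta)
        probs.append(stick * remaining)
        remaining *= 1.0 - stick
    probs.append(remaining)
    return probs


def odds_ratio(p_new: Sequence[float], p_base: Sequence[float], k: int, ref: int) -> float:
    """Odds of category k versus reference category ref, new relative to baseline."""
    return (p_new[k] / p_new[ref]) / (p_base[k] / p_base[ref])


def incremental_odds_ratio(p_new: Sequence[float], p_base: Sequence[float], k: int) -> float:
    """Odds of category k versus all less severe categories (requires k >= 1)."""
    return (p_new[k] / sum(p_new[:k])) / (p_base[k] / sum(p_base[:k]))


def posterior_summary(draws: Sequence[float]) -> Tuple[float, float, float]:
    """Posterior mean and an approximate 95% interval from sorted draws."""
    s = sorted(draws)
    n = len(s)
    return sum(s) / n, s[math.floor(0.025 * (n - 1))], s[math.ceil(0.975 * (n - 1))]


# Two made-up posterior draws for K = 3 categories, one exposure, intercept only.
draws = [
    {"beta": [[1.0], [0.0]], "gamma": [[0.0], [0.0]]},
    {"beta": [[0.0], [0.0]], "gamma": [[0.0], [0.0]]},
]
w, x_new, x_base = [1.0], [1.0], [0.0]
or_draws, ior_draws = [], []
for d in draws:
    p_new = category_probs(x_new, w, d["beta"], d["gamma"])
    p_base = category_probs(x_base, w, d["beta"], d["gamma"])
    or_draws.append(odds_ratio(p_new, p_base, k=0, ref=2))
    ior_draws.append(incremental_odds_ratio(p_new, p_base, k=1))
print(posterior_summary(or_draws), posterior_summary(ior_draws))
```

Repeating the loop over a grid of primary exposure values, with the secondary exposures fixed at a chosen percentile, gives the curves that are plotted with their credible intervals.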

SIMULATION STUDY

Simulation study design and evaluation metrics

We conducted a simulation study to evaluate the proposed method's performance at imputing missing outcome data and estimating regression coefficients. We considered eight simulation scenarios that vary (1) the proportion of observations in each outcome category, (2) whether or not exposures and covariates are predictive of the outcome categories, and (3) whether there are partially missing outcomes or fully missing outcomes. For each scenario, we compared the proposed model to a complete case analysis estimated with a similar Pólya‐gamma augmented stick‐breaking model using the same priors on the regression coefficients. For all eight scenarios, we generated exposure data $\mathbf{x}_i$ for $i = 1, \ldots, n$ with $p$ components from a multivariate normal distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, where $\boldsymbol{\Sigma}$ is the correlation matrix of the real exposure data. We generated covariate data $\mathbf{w}_i$ for $i = 1, \ldots, n$ with $q$ components from independent standard normal distributions. We simulated outcome data with $K = 6$ categories and used a sample size of $n$. Scenarios 1–4 encompassed the “data probabilities” setting, in which we set category‐specific intercepts so the outcome category sizes mimicked the complete cases of the real data as much as possible. In scenarios 5–8, termed the “equal probabilities” setting, we set the intercepts so outcome categories were approximately equal‐sized. Outcome categories in the equal probabilities setting were not exactly equal‐sized due to the randomness in the data generating process. Rather, the equal probabilities setting provides a setting in which all six categories have a substantial amount of data, on average roughly equal amounts, and there are no very small or very large categories as is the case in the data probabilities setting. The intercepts were appended to the covariate matrix $\mathbf{W}$. The outcome category assignment proportions for each scenario, as well as the true proportions for the complete case data, are shown in Table 2.
TABLE 2

Classification probabilities into each of the six outcome categories

| Outcome category | Real data | Data probabilities: signal | Data probabilities: null | Equal probabilities: signal | Equal probabilities: null |
|---|---|---|---|---|---|
| Symptomatic | 0.76 | 0.71 (0.64, 0.81) | 0.77 | 0.19 (0.09, 0.29) | 0.14 |
| Asymptomatic | 0.15 | 0.16 (0.08, 0.25) | 0.16 | 0.18 (0.09, 0.27) | 0.16 |
| Hospitalized | 0.06 | 0.06 (0.01, 0.14) | 0.05 | 0.15 (0.06, 0.24) | 0.19 |
| ICU | 0.01 | 0.03 (0.01, 0.09) | 0.01 | 0.16 (0.07, 0.26) | 0.19 |
| Ventilator | 0.01 | 0.02 (0.01, 0.05) | 0.01 | 0.14 (0.07, 0.25) | 0.14 |
| Death | 0.01 | 0.01 (0.01, 0.07) | 0.01 | 0.18 (0.07, 0.28) | 0.18 |

Note: The table shows the outcome probabilities for the complete cases of the real data (“real data”) and for the complete data in our simulation scenarios. Measures for the simulated data were taken from 500 simulated datasets. The table shows the mean (minimum, maximum) classification probabilities for scenarios with a signal, and the fixed classification probabilities for null scenarios. Classification probabilities for null scenarios did not differ among the simulated datasets. Probabilities are shown for both the “data probabilities” and “equal probabilities” simulation design settings.
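
For a null scenario, where only the intercepts drive the outcome, target category probabilities can be converted into stick‐specific intercepts by inverting the stick‐breaking construction. The sketch below is one way this could be done and is not taken from the authors' simulation code; the function name is hypothetical, and the target vectors are illustrative.

```python
import math
from typing import List, Sequence


def intercepts_from_probs(pi: Sequence[float]) -> List[float]:
    """Return K-1 stick-specific intercepts that reproduce target probabilities pi."""
    intercepts = []
    remaining = 1.0  # mass not yet assigned to earlier categories
    for p in pi[:-1]:
        stick = p / remaining                       # stick-specific weight
        intercepts.append(math.log(stick / (1.0 - stick)))
        remaining -= p
    return intercepts


# Illustrative targets: the "real data" column of Table 2 and six equal categories.
print(intercepts_from_probs([0.76, 0.15, 0.06, 0.01, 0.01, 0.01]))
print(intercepts_from_probs([1.0 / 6.0] * 6))
```

In the equal‐probability case the intercepts are not all zero: the first stick must take one sixth of the mass, the second one fifth of what remains, and so on, which again reflects the order dependence of the stick‐breaking parameterization.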

We designed scenarios both with and without a signal from the data to determine the effect of a signal on imputations. In scenarios with a signal, exposure and covariate regression coefficients ($\boldsymbol{\beta}_k$ and $\boldsymbol{\gamma}_k$, respectively) for categories $k = 1, \ldots, K-1$ were simulated from independent standard normal distributions. In scenarios without a signal (null scenarios), all exposure and covariate regression coefficients were set to 0, with the exception of the intercepts, which were specified to dictate outcome category size. In all scenarios, we let $\text{logit}(\tilde{\pi}_{ik}) = \mathbf{x}_i^\top \boldsymbol{\beta}_k + \mathbf{w}_i^\top \boldsymbol{\gamma}_k$ and then generated outcome data according to the stick‐breaking representation of the multinomial distribution. We considered missing data levels of 0%, 20%, 50%, and 80%. Each missing data level reflects the percent of cases that have some level of uncertainty in the outcome. The cases with missing outcome data were randomly selected in each simulated dataset. We considered “partially missing” outcomes and “fully missing” outcomes. Under partially missing outcomes, a case with missing outcome data was missing anywhere between 2 and 5 outcome categories. The true outcome was always included as one of the missing outcome categories. We randomly selected the additional outcome categories, drawing the number of additional missing outcome categories (1–4) uniformly.
Under fully missing outcomes, all cases with missing outcome data were missing data for all six outcome categories. Performance was based on 500 simulated datasets for each scenario and missing data level. We evaluated estimation of the exposure and covariate regression coefficients through root mean squared error (RMSE), bias, coverage of the 95% posterior credible intervals (CI), and CI width, averaged over all regression coefficients. To evaluate imputations, we calculated precision (the proportion of outcomes assigned to a category that truly belong in that category) and recall (the proportion of outcomes that truly belong in a category that were assigned to that category) for each outcome category. We compared our method's estimation performance to a complete case analysis in each of the eight simulation scenarios.
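The data-generating steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' simulation code: the function names, the 0-based category coding, and the use of a boolean "candidate set" matrix to represent partially missing outcomes are all assumptions made for this example.

```python
import numpy as np

def stick_breaking_probs(eta):
    """Map (n, K-1) linear predictors to (n, K) category probabilities."""
    sig = 1.0 / (1.0 + np.exp(-eta))           # P(Y = k | Y >= k) for k < K
    n, k_minus_1 = sig.shape
    probs = np.empty((n, k_minus_1 + 1))
    remaining = np.ones(n)                      # stick length still unassigned
    for k in range(k_minus_1):
        probs[:, k] = sig[:, k] * remaining
        remaining = remaining * (1.0 - sig[:, k])
    probs[:, -1] = remaining                    # last category takes the remainder
    return probs

def draw_outcomes(probs, rng):
    """Draw one outcome category (0-based) per case."""
    n_cat = probs.shape[1]
    return np.array([rng.choice(n_cat, p=p) for p in probs])

def missing_outcome_sets(y, n_cat, frac_missing, rng, fully=False):
    """Return (candidates, miss_idx).

    candidates is a boolean (n, K) matrix: True marks categories the case
    could belong to. Observed cases have a single True entry (the truth).
    """
    n = y.shape[0]
    candidates = np.zeros((n, n_cat), dtype=bool)
    candidates[np.arange(n), y] = True
    n_miss = int(round(frac_missing * n))
    miss_idx = rng.choice(n, size=n_miss, replace=False)
    for i in miss_idx:
        if fully:
            candidates[i, :] = True             # nothing known about the outcome
            continue
        # Number of extra categories drawn uniformly from 1..K-2 (1-4 when K = 6).
        n_extra = rng.integers(1, n_cat - 1)
        others = np.delete(np.arange(n_cat), y[i])
        extra = rng.choice(others, size=n_extra, replace=False)
        candidates[i, extra] = True             # truth stays in the set
    return candidates, miss_idx
```

For example, `missing_outcome_sets(y, 6, 0.8, rng)` would correspond to the 80% partially missing setting, and passing `fully=True` to the fully missing setting.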

Simulation study results

We summarized simulation results for estimation of the exposure regression coefficients in the data probabilities setting in Table 3 and in the equal probabilities setting in Table 4. Results for covariate regression coefficients are available in the Supporting Information (Tables S3 and S4). We presented precision and recall for 80% missing data in the data probabilities setting in Table 5 and in the equal probabilities setting in Table 6. Precision and recall for 20% and 50% missing data were similar and are available in the Supporting Information (Tables S5 and S6). For each of Tables 3, 4, 5, 6, the four scenarios within each of the two simulation settings reflect the four combinations of the data (providing a signal or being null) and the missing mechanism of the outcomes (partially or fully missing). Hence, the scenarios were termed “partially missing, signal,” “fully missing, signal,” “partially missing, null,” and “fully missing, null.”
TABLE 3

Simulation study results for the data probabilities setting

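| Scenario | Missing | RMSE (proposed) | Bias (proposed) | Width (proposed) | Cov (proposed) | RMSE (complete case) | Bias (complete case) | Width (complete case) | Cov (complete case) |
|---|---|---|---|---|---|---|---|---|---|
| Partially missing, signal | 0% | 0.35 | 0.00 | 0.87 | 0.95 | 0.35 | 0.00 | 0.87 | 0.95 |
| Partially missing, signal | 20% | 0.38 | 0.00 | 0.92 | 0.94 | 0.39 | 0.00 | 0.96 | 0.95 |
| Partially missing, signal | 50% | 0.43 | 0.00 | 1.02 | 0.94 | 0.46 | 0.00 | 1.16 | 0.95 |
| Partially missing, signal | 80% | 0.51 | 0.00 | 1.20 | 0.93 | 0.63 | 0.00 | 1.62 | 0.96 |
| Fully missing, signal | 0% | 0.35 | 0.00 | 0.87 | 0.95 | 0.35 | 0.00 | 0.87 | 0.95 |
| Fully missing, signal | 20% | 0.38 | 0.00 | 0.93 | 0.94 | 0.39 | 0.00 | 0.96 | 0.95 |
| Fully missing, signal | 50% | 0.45 | 0.00 | 1.08 | 0.93 | 0.46 | 0.00 | 1.16 | 0.95 |
| Fully missing, signal | 80% | 0.59 | 0.00 | 1.41 | 0.91 | 0.63 | 0.00 | 1.62 | 0.96 |
| Partially missing, null | 0% | 0.34 | 0.00 | 0.83 | 0.95 | 0.34 | 0.00 | 0.83 | 0.95 |
| Partially missing, null | 20% | 0.38 | 0.00 | 0.90 | 0.95 | 0.38 | 0.00 | 0.93 | 0.95 |
| Partially missing, null | 50% | 0.44 | 0.00 | 1.07 | 0.94 | 0.47 | 0.00 | 1.16 | 0.95 |
| Partially missing, null | 80% | 0.53 | 0.00 | 1.35 | 0.95 | 0.62 | 0.01 | 1.68 | 0.96 |
| Fully missing, null | 0% | 0.34 | 0.00 | 0.83 | 0.95 | 0.34 | 0.00 | 0.83 | 0.95 |
| Fully missing, null | 20% | 0.38 | 0.00 | 0.91 | 0.95 | 0.38 | 0.00 | 0.93 | 0.95 |
| Fully missing, null | 50% | 0.45 | 0.00 | 1.11 | 0.94 | 0.47 | 0.00 | 1.16 | 0.95 |
| Fully missing, null | 80% | 0.56 | 0.01 | 1.49 | 0.94 | 0.62 | 0.01 | 1.68 | 0.96 |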

Note: The table shows mean across 500 datasets for each measure in four simulation scenarios (“partially missing, signal,” “fully missing, signal,” “partially missing, null,” and “fully missing, null”). The measures are root mean squared error (RMSE), bias, 95% credible interval width (width), and coverage (cov) for exposure regression coefficients. The table shows results from our proposed method and the complete case analysis for missing data levels of 0%, 20%, 50%, and 80%.

TABLE 4

Simulation study results for the equal probabilities setting

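| Scenario | Missing | RMSE (proposed) | Bias (proposed) | Width (proposed) | Cov (proposed) | RMSE (complete case) | Bias (complete case) | Width (complete case) | Cov (complete case) |
|---|---|---|---|---|---|---|---|---|---|
| Partially missing, signal | 0% | 0.15 | 0.00 | 0.40 | 0.94 | 0.15 | 0.00 | 0.40 | 0.94 |
| Partially missing, signal | 20% | 0.16 | 0.00 | 0.42 | 0.94 | 0.17 | 0.00 | 0.45 | 0.95 |
| Partially missing, signal | 50% | 0.18 | 0.00 | 0.47 | 0.93 | 0.21 | 0.00 | 0.56 | 0.95 |
| Partially missing, signal | 80% | 0.21 | 0.00 | 0.55 | 0.93 | 0.32 | 0.00 | 0.86 | 0.95 |
| Fully missing, signal | 0% | 0.15 | 0.00 | 0.40 | 0.94 | 0.15 | 0.00 | 0.40 | 0.94 |
| Fully missing, signal | 20% | 0.16 | 0.00 | 0.43 | 0.94 | 0.17 | 0.00 | 0.45 | 0.95 |
| Fully missing, signal | 50% | 0.20 | 0.00 | 0.52 | 0.93 | 0.21 | 0.00 | 0.56 | 0.95 |
| Fully missing, signal | 80% | 0.30 | 0.00 | 0.74 | 0.90 | 0.32 | 0.00 | 0.86 | 0.95 |
| Partially missing, null | 0% | 0.10 | 0.00 | 0.26 | 0.95 | 0.10 | 0.00 | 0.26 | 0.95 |
| Partially missing, null | 20% | 0.11 | 0.00 | 0.28 | 0.94 | 0.11 | 0.00 | 0.29 | 0.95 |
| Partially missing, null | 50% | 0.12 | 0.00 | 0.33 | 0.94 | 0.14 | 0.00 | 0.37 | 0.95 |
| Partially missing, null | 80% | 0.16 | 0.00 | 0.41 | 0.92 | 0.22 | 0.00 | 0.59 | 0.95 |
| Fully missing, null | 0% | 0.10 | 0.00 | 0.26 | 0.95 | 0.10 | 0.00 | 0.26 | 0.95 |
| Fully missing, null | 20% | 0.11 | 0.00 | 0.29 | 0.94 | 0.11 | 0.00 | 0.29 | 0.95 |
| Fully missing, null | 50% | 0.13 | 0.00 | 0.35 | 0.93 | 0.14 | 0.00 | 0.37 | 0.95 |
| Fully missing, null | 80% | 0.20 | 0.00 | 0.52 | 0.92 | 0.22 | 0.00 | 0.59 | 0.95 |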

Note: The table shows mean across 500 datasets for each measure in four simulation scenarios (“partially missing, signal,” “fully missing, signal,” “partially missing, null,” and “fully missing, null”). The measures are root mean squared error (RMSE), bias, 95% credible interval width (width), and coverage (cov) for exposure regression coefficients. The table shows results from our proposed method and the complete case analysis for missing data levels of 0%, 20%, 50%, and 80%.

TABLE 5

Summary of imputation performance in the data probabilities setting

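| Scenario | Measure | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 | Category 6 |
|---|---|---|---|---|---|---|---|
| Partially missing, signal | Precision | 0.92 | 0.69 | 0.54 | 0.43 | 0.31 | 0.31 |
| Partially missing, signal | Recall | 0.92 | 0.69 | 0.53 | 0.43 | 0.31 | 0.31 |
| Fully missing, signal | Precision | 0.85 | 0.47 | 0.30 | 0.21 | 0.13 | 0.13 |
| Fully missing, signal | Recall | 0.85 | 0.47 | 0.30 | 0.21 | 0.13 | 0.13 |
| Partially missing, null | Precision | 0.88 | 0.50 | 0.28 | 0.13 | 0.07 | 0.06 |
| Partially missing, null | Recall | 0.88 | 0.49 | 0.28 | 0.13 | 0.07 | 0.07 |
| Fully missing, null | Precision | 0.77 | 0.16 | 0.05 | 0.01 | 0.01 | 0.01 |
| Fully missing, null | Recall | 0.76 | 0.16 | 0.05 | 0.02 | 0.01 | 0.01 |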

Note: Results are shown for 80% missing data and four simulation scenarios (“partially missing, signal,” “fully missing, signal,” “partially missing, null,” and “fully missing, null”). The table shows mean across 500 datasets for precision and recall for each outcome category. Results for the other missing data levels (20% and 50%) were similar and are shown in the supplemental materials.

TABLE 6

Summary of imputation performance in the equal probabilities setting

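| Scenario | Measure | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 | Category 6 |
|---|---|---|---|---|---|---|---|
| Partially missing, signal | Precision | 0.71 | 0.67 | 0.63 | 0.63 | 0.59 | 0.62 |
| Partially missing, signal | Recall | 0.71 | 0.68 | 0.63 | 0.63 | 0.58 | 0.61 |
| Fully missing, signal | Precision | 0.55 | 0.50 | 0.44 | 0.44 | 0.38 | 0.42 |
| Fully missing, signal | Recall | 0.56 | 0.50 | 0.45 | 0.43 | 0.38 | 0.41 |
| Partially missing, null | Precision | 0.28 | 0.31 | 0.35 | 0.36 | 0.27 | 0.35 |
| Partially missing, null | Recall | 0.29 | 0.31 | 0.35 | 0.36 | 0.27 | 0.34 |
| Fully missing, null | Precision | 0.14 | 0.16 | 0.19 | 0.19 | 0.14 | 0.18 |
| Fully missing, null | Recall | 0.14 | 0.16 | 0.19 | 0.19 | 0.13 | 0.18 |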

Note: Results are shown for 80% missing data and four simulation scenarios (“partially missing, signal,” “fully missing, signal,” “partially missing, null,” and “fully missing, null”). The table shows mean across 500 datasets for precision and recall for each outcome category. Results for the other missing data levels (20% and 50%) were similar and are shown in the supplemental materials.

Both our proposed method and the complete case analysis produced unbiased estimates for the regression coefficients. However, our proposed method yielded lower‐variance estimates of the regression coefficients, as shown by lower RMSE and smaller CI width, with coverage close to the nominal level (0.95). Hence, by retaining the full dataset and imputing missing outcomes, we obtained more efficient inference than with a complete case analysis.

Further estimation gains were achieved through improvements in the imputations, which occurred when there was partial outcome information or larger category sizes. When outcomes were partially missing, as opposed to fully missing, our method leveraged the information available to improve imputations. In each case, partially missing outcomes resulted in higher precision and recall than fully missing outcomes, controlling for other scenario factors (Tables 5 and 6). For example, looking at the scenarios with a signal in the data probabilities setting (Table 5), precision and recall in category 1, the largest category, were 0.92 when outcomes were partially missing versus 0.85 when outcomes were fully missing. For category 6, the smallest category, precision and recall were 0.31 when outcomes were partially missing versus 0.13 when outcomes were fully missing. Hence, the partial outcome information was particularly valuable for small categories where little observed data were available. With improved imputations from partially missing outcomes, our proposed method resulted in even more efficient estimation of the regression coefficients. We saw the greatest estimation gains from the partial information at 80% missing data.
At 80% missing data in the data probabilities setting, the partially missing scenario with a signal resulted in RMSE of 0.51, CI width of 1.20, and coverage of 0.93, compared to RMSE of 0.59, CI width of 1.41, and coverage of 0.91 for the fully missing scenario with a signal (Table 3). In the respective complete case analysis, RMSE was 0.63, CI width was 1.62, and coverage was 0.96. Similar patterns for partially and fully missing outcomes existed in the null scenarios in the data probabilities setting (Table 3), and in all scenarios in the equal probabilities setting (Table 4). Hence, our imputation approach offers estimation gains over the complete case analysis for both partially and fully missing outcomes, and these gains are increased further by leveraging the information from partially missing outcomes. Keeping the scenario constant, regression coefficients were more efficiently estimated in the equal probabilities setting than in the data probabilities setting. This is because the data probabilities setting results in some large categories and some very small categories. The small categories have higher estimation uncertainty and worse imputation performance, as evidenced by lower precision and recall for categories 3–6, which contained less data in the data probabilities setting than in the equal probabilities setting. On the other hand, the largest category in the data probabilities setting (category 1) had higher precision and recall than any category in the equal probabilities setting. Hence, the differences in estimation and imputation performance between the data probabilities and equal probabilities settings are purely a result of differences in category size. In both settings, there remained substantial gains in estimation performance from our proposed method over the complete case analysis. A signal in the data also improved imputations. 
Controlling for other scenario factors, precision and recall were higher in scenarios with a signal than in scenarios with null effects. For small categories containing approximately 1/6 of the data or less (all categories in the equal probabilities setting and categories 3–6 in the data probabilities setting), a signal in the data improved imputations to a greater extent than did the partial outcome information (Tables 5 and 6). In many scenarios, regression coefficient estimation was more efficient in the null scenarios than in the scenarios with a signal. This is likely due to the prior distribution for the regression coefficients being centered on zero. Hence, even though the signal aided imputations, the prior distribution provided more information for estimating the null regression coefficients. Our simulation study demonstrates that our proposed method is able to impute missing outcomes and offers more efficient inference over a complete case analysis under a wide variety of scenarios. Imputation and estimation performance improved as more information became available, whether in the form of partially missing outcomes, larger categories, or a signal to inform outcomes.
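The evaluation measures reported in Tables 3–6 can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the authors' code: the function names are hypothetical, categories are coded 0-based, and the coefficient measures are computed within a single simulated dataset (the paper then averages over 500 datasets and over coefficients).

```python
import numpy as np

def coefficient_metrics(true_beta, post_mean, ci_lower, ci_upper):
    """RMSE, bias, mean 95% CI width, and coverage across coefficients."""
    err = post_mean - true_beta
    covered = (ci_lower <= true_beta) & (true_beta <= ci_upper)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "bias": float(np.mean(err)),
        "width": float(np.mean(ci_upper - ci_lower)),
        "coverage": float(np.mean(covered)),
    }

def precision_recall(true_cat, imputed_cat, n_cat):
    """Per-category precision and recall of imputed outcomes.

    precision[k]: share of cases assigned to k that truly belong to k.
    recall[k]:    share of cases truly in k that were assigned to k.
    Entries are NaN when the denominator is zero.
    """
    precision = np.full(n_cat, np.nan)
    recall = np.full(n_cat, np.nan)
    for k in range(n_cat):
        assigned = imputed_cat == k
        truly = true_cat == k
        hits = np.sum(assigned & truly)
        if assigned.any():
            precision[k] = hits / assigned.sum()
        if truly.any():
            recall[k] = hits / truly.sum()
    return precision, recall
```

Precision and recall nearly coincide in Tables 5 and 6, which is what one would expect when the number of cases imputed into a category is close to the number that truly belong there.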

DATA ANALYSIS

We applied our proposed method to an analysis of the Denver, Colorado cohort data. The dataset contained 55,273 cases, of which 62.2% (n = 34,401) had partially missing health outcomes. Cases with incomplete health outcomes were missing data for between 2 and 5 outcome categories (Table 1). We fit our proposed method to the full dataset and imputed missing outcomes. Due to constraints of the stick‐breaking representation of the multinomial distribution, the largest probability mass is most often assigned to the first outcome category (Zhang & Zhou, 2018). When fitting the model, we ordered the outcome categories so the largest category was first. Hence, the order was: symptomatic, asymptomatic, hospitalized, admitted to the ICU (ICU), placed on a mechanical ventilator (ventilator), and then death (Table 2).

For comparison, we conducted an analysis of the subset of complete cases (n = 20,872). We also conducted a sensitivity analysis using logistic regression. In the logistic analysis, we collapsed the multinomial categories to severe (hospitalized, ICU, ventilator, or death) and not severe (asymptomatic or symptomatic). We considered only complete cases.

In all models, we included main effects for average exposure to PM, ozone, and temperature in the year prior to the pandemic as well as all pairwise interactions. Although time is not a confounding variable because our exposure time period is the same for all cases, temporal changes in the pandemic such as lifestyle restrictions, access to healthcare, and the harvesting effect may alter the probabilities of each peak severity category. We account for these temporal trends by including a natural cubic spline function of the case report date with 3 degrees of freedom as covariates in our model. To account for potential nonlinearities in the effect of age, we included a natural cubic spline function of age with 3 degrees of freedom. We included all covariates described in the data section.
We based inference on 5000 MCMC iterations after a burn‐in of 5000 iterations.
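As an illustration of the exposure terms described above (three main effects plus all pairwise interactions), the following Python sketch builds the corresponding design-matrix columns. The function name, the column labels, and the absence of any centering or scaling are assumptions for this example; the spline terms and other covariates are omitted.

```python
import numpy as np
from itertools import combinations

def exposure_design(exposures, names):
    """Build main-effect and pairwise-interaction columns.

    exposures: (n, p) array, one column per exposure.
    names:     list of p labels, e.g. ["pm", "ozone", "temp"].
    Returns the (n, p + p*(p-1)/2) design matrix and its column labels.
    """
    p = exposures.shape[1]
    cols = [exposures[:, j] for j in range(p)]
    labels = list(names)
    for a, b in combinations(range(p), 2):
        cols.append(exposures[:, a] * exposures[:, b])   # interaction term
        labels.append(f"{names[a]}:{names[b]}")
    return np.column_stack(cols), labels
```

With three exposures this gives six exposure columns per outcome category, matching the three main effects and three pairwise interactions in the model description.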

Results

The estimated exponentiated regression coefficients from our proposed method and the complete case analysis are shown in Figure 1 and Tables S7 and S8. The posterior means for the regression coefficients were similar between the two methods. On average, the 95% CIs from our proposed method were 8.2% narrower than those from the complete case analysis, demonstrating the estimation gains from using our proposed method. Exponentiated regression coefficients with 95% credible intervals that exclude 1.0 occurred primarily for the main effect of PM and the interaction effect between PM and ozone. Posterior predictive checks of our imputed values are shown in Figure S5.
FIGURE 1

Results of the analysis of the Denver, Colorado COVID‐19 cohort from our proposed method (black circles) and the complete case analysis (blue triangles). The figure shows the posterior mean and 95% credible intervals for the estimated exponentiated category‐specific regression coefficients associated with main effects (top row) and pairwise interactions (bottom row). Exposures are PM, ozone, and temperature. Categories are symptomatic (sympt.), asymptomatic (asympt.), hospitalized (hosp.), admitted to the ICU (ICU), and placed on a mechanical ventilator (vent). There are no regression coefficients for the death category because it was the last category, and thus contains the remaining probability mass in the stick‐breaking representation.

As described in Section 3.4, interpreting the regression coefficients in the stick‐breaking multinomial approach is challenging. Instead, we based inference on the odds ratio (OR) and incremental odds ratio (IOR) defined in that section. We selected asymptomatic as the reference category for inference. We visualized the posterior distribution of the OR and IOR for each severity category as a function of a single exposure, holding the other two exposures at their 25th, 50th, and 75th percentiles. We treat the OR and IOR as functions of the exposures and set all covariates to 0 in our calculations.

Figure 2 shows the posterior distribution of the OR for each severity category (symptomatic, hospitalized, ICU, ventilator, and death) relative to asymptomatic, as a function of average annual PM exposure, holding ozone and temperature at their 25th, 50th, and 75th percentiles. At the 25th percentiles of ozone and temperature (Figure 2a), increased exposure to PM was associated with a decreased risk of being admitted to the ICU, relative to being asymptomatic. There was a suggestive positive association between PM exposure and risk of death, relative to being asymptomatic.
When ozone and temperature were at their 50th percentiles (Figure 2b), increased annual PM exposure was associated with a starkly increased risk of being hospitalized, relative to being asymptomatic. At these levels of ozone and temperature, exposure to PM was no longer associated with risk of death. A similar pattern continued at the 75th percentiles of ozone and temperature (Figure 2c). At these high levels of ozone and temperature, PM exposure was associated with an increased risk of being hospitalized and, to a lesser extent, being symptomatic and admitted to the ICU, relative to being asymptomatic.
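To make the kind of curve plotted in Figure 2 concrete, the sketch below computes an odds ratio for each category versus a reference category as PM varies, relative to a reference PM level, for one set of coefficients. The exact OR definition is given in Section 3.4 of the paper, which is not reproduced here, so the formula used (ratio of category-to-reference odds at a PM value to the same odds at the reference PM value) is an assumption, as are the function names and the ordering of the six exposure terms. In practice this would be evaluated for every posterior draw and summarized by the posterior mean and 95% interval.

```python
import numpy as np

def category_probs(eta):
    """Stick-breaking probabilities for one case; eta has length K-1."""
    sig = 1.0 / (1.0 + np.exp(-eta))
    # remaining[k] = stick length left before category k is broken off
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - sig)))
    return np.append(sig, 1.0) * remaining

def odds_ratio_curve(pm_grid, pm_ref, ozone, temp, intercepts, coefs, ref_cat=1):
    """OR of every category vs. ref_cat along pm_grid, relative to pm_ref.

    coefs: (K-1, 6) coefficients for [pm, ozone, temp, pm*ozone, pm*temp,
    ozone*temp]; covariates are held at 0, as in the text. ref_cat=1 assumes
    asymptomatic is the second category in the fitted ordering (0-based).
    """
    def probs(pm):
        x = np.array([pm, ozone, temp, pm * ozone, pm * temp, ozone * temp])
        return category_probs(intercepts + coefs @ x)

    p_ref = probs(pm_ref)
    base_odds = p_ref / p_ref[ref_cat]          # odds vs. reference at pm_ref
    rows = []
    for pm in pm_grid:
        p = probs(pm)
        rows.append((p / p[ref_cat]) / base_odds)
    return np.array(rows)                        # shape (len(pm_grid), K)
```

By construction the curve equals 1 at `pm_ref` and in the reference-category column, so departures from 1 elsewhere reflect the modeled exposure effect at the chosen ozone and temperature levels.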
FIGURE 2

Results from the analysis of the Denver, Colorado COVID‐19 cohort using our proposed method. The figure shows the posterior mean (black line) and 95% credible interval (gray shaded area) of the estimated odds ratio (OR) for categories symptomatic, hospitalized, admitted to the ICU (ICU), placed on a mechanical ventilator (ventilator) and death, relative to asymptomatic. The OR was calculated as a function of annual average PM exposure (μg/m³) relative to the mean exposure, holding ozone and temperature at their 25th (a), 50th (b), and 75th (c) percentiles.

The posterior mean effect of annual average exposure to PM on the risk of being hospitalized or admitted to the ICU, relative to being asymptomatic, switched from a negative trend to a positive trend when ozone and temperature moved from their 25th to 75th percentiles. Changes in the effect of PM as co‐exposures change indicate interactions among exposures. The PM–ozone interaction was associated with a decreased risk of being asymptomatic relative to being hospitalized, admitted to the ICU, placed on a mechanical ventilator, or dying, as evidenced by the 95% credible interval for the asymptomatic category's exponentiated regression coefficient being below 1.0 (Figure 1). Hence, we determined the PM–ozone interaction is driving the patterns seen in the effect of PM on risk of severe COVID‐19 as ozone and temperature move from low to high levels.

We obtained similar inferences from the IOR. Figure 3 shows the IOR associated with annual average PM exposure, holding ozone and temperature at their 25th, 50th, and 75th percentiles. At the 25th percentiles of ozone and temperature (Figure 3a), there was a protective effect of PM exposure on the risk of being admitted to the ICU, relative to not being admitted (e.g., being asymptomatic, symptomatic, or hospitalized only). Exposure to PM was also associated with an increased risk of death, relative to not dying.
These effects became null as ozone and temperature moved to their 50th percentiles (Figure 3b). At the 50th percentile of ozone and temperature, exposure to PM was associated with an increased risk of being hospitalized, relative to not hospitalized. Similar effects of PM occurred at the 75th percentiles of ozone and temperature (Figure 3c), with the addition of a suggestive protective effect on the risk of dying, relative to not dying. The complex interaction between PM and ozone was again revealed by the directional switches in the posterior mean trends of PM as ozone and temperature moved from the 25th to 75th percentiles.
FIGURE 3

Results from the analysis of the Denver, Colorado COVID‐19 cohort using our proposed method. The figure shows the posterior mean (black line) and 95% credible interval (gray shaded area) of the estimated incremental odds ratio (IOR) for categories symptomatic, hospitalized, admitted to the ICU (ICU), placed on a mechanical ventilator (ventilator) and death, relative to all less severe categories. The IOR was calculated as a function of annual average PM exposure (μg/m³) relative to the mean exposure, holding ozone and temperature at their 25th (a), 50th (b), and 75th (c) percentiles.

Similar plots for the effects of annual average exposure to ozone and temperature are available in the Supporting Information. Overall, there was weaker evidence for the effects of ozone and temperature on COVID‐19 severity. There were suggestive effects of increased ozone exposure associated with an increased risk of dying, relative to being asymptomatic and relative to not dying, when PM and temperature were at their 25th percentiles (Figures S1 and S2). These effects became null as PM and temperature moved to their 75th percentiles. Increases in temperature, combined with low levels of PM and ozone, were associated with a decreased risk of being symptomatic relative to asymptomatic (Figure S3). This effect was attenuated at higher levels of PM and ozone. At high levels of PM and ozone, there was a suggestive protective effect of temperature on risk of being hospitalized and placed on a mechanical ventilator, relative to being asymptomatic (Figure S3), but not relative to a less severe outcome (Figure S4). The estimated regression coefficients (Figure 1) indicate interaction effects between PM and temperature and between ozone and temperature may be driving these patterns.

We reported the estimated regression coefficients for covariates in Tables S9–S11.
Briefly, the individual‐level variables age, pregnancy status, and race/ethnicity, as well as the census‐tract level variables median income, percent unemployed, and percent poverty were associated with COVID‐19 disease severity. As described in Section 3.4, direct interpretation of the regression coefficient estimates in the stick‐breaking representation of the multinomial model is challenging. However, most of the significant regression coefficients pertained to the symptomatic or asymptomatic categories. Since symptomatic is the first category, interpretation for the symptomatic category is log‐odds of being symptomatic relative to all other categories. For the asymptomatic category, interpretation is log‐odds of being asymptomatic, conditional on not being symptomatic, relative to all more severe categories (hospitalized or worse). Hence, significant effects that exist in the same direction for both the asymptomatic and symptomatic categories allow us to dichotomize effects for severe COVID‐19 (hospitalized, admitted to ICU, placed on a ventilator, and death) versus not severe (asymptomatic or symptomatic). Considering these interpretation nuances, our results suggest that pregnancy increases risk of severe COVID‐19 relative to not severe (Table S9). In addition, belonging to Black or American Indian race/ethnicity groups increases risk of severe COVID‐19 relative to not severe (Table S10). Since we included spline functions of report date and age in our model, we plotted the estimated log‐odds for each severity category as a function of report date and age in Figures S6 and S7, respectively. There was a suggestive effect of report date on the ventilator category, meaning that later case report dates had a lower risk of being placed on a ventilator relative to death, conditional on at least being placed on a ventilator. This effect could be due to changes over time in treatment for severe cases. 
There were clear effects of age on nearly all severity categories with the general trend being that younger cases are more likely to have less severe COVID‐19 outcomes.
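The conditional interpretation of the stick-breaking coefficients used above can be written out explicitly. The notation below is an assumption introduced for illustration (it is not taken from this section): $\eta_{ik}$ denotes the linear predictor for case $i$ and category $k$, $\sigma(\eta) = 1/(1 + e^{-\eta})$ is the logistic function, and $K = 6$ categories are ordered as in the fitted model (symptomatic first, death last).

```latex
\begin{align*}
\pi_{i1} &= \sigma(\eta_{i1}), \\
\pi_{ik} &= \sigma(\eta_{ik}) \prod_{j<k} \bigl(1 - \sigma(\eta_{ij})\bigr),
            \quad k = 2, \dots, K-1, \\
\pi_{iK} &= \prod_{j<K} \bigl(1 - \sigma(\eta_{ij})\bigr), \\
\eta_{ik} &= \log \frac{P(Y_i = k \mid Y_i \ge k)}{P(Y_i > k \mid Y_i \ge k)},
            \quad k = 1, \dots, K-1 .
\end{align*}
```

Under this ordering, $\eta_{i1}$ is the log-odds of being symptomatic versus all other categories, $\eta_{i2}$ is the log-odds of being asymptomatic versus hospitalized or worse given that the case is not symptomatic, and $\eta_{i5}$ is the log-odds of ventilator versus death given that the case was at least placed on a ventilator. The last category has no coefficients of its own because it receives the remaining probability mass.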

Sensitivity analysis results

Results from our logistic regression analysis are shown in Table S12. A one IQR increase in exposure to PM was associated with a 9% increased risk of severe COVID‐19 (95% CI: 1.02, 1.18). These results mirror the results from our multinomial regression analysis. In both analyses, PM was associated with an increased risk of severe COVID‐19. In the logistic analysis, there was a positive estimated effect for the interaction between PM and ozone, and a negative estimated effect for the interaction between ozone and temperature. Notably, the negative interaction effect between ozone and temperature may be due to the fact that annual averages for ozone and temperature were highly negatively correlated.
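The arithmetic linking a logistic regression coefficient to a statement such as "a one IQR increase was associated with an X% increase" can be sketched as follows. The function names and inputs are hypothetical; strictly speaking the percentage refers to a change in odds rather than in risk.

```python
import math

def odds_ratio_per_iqr(beta_per_unit, iqr):
    """Odds ratio for a one-IQR increase, given a per-unit log-odds coefficient."""
    return math.exp(beta_per_unit * iqr)

def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percent change in odds."""
    return 100.0 * (odds_ratio - 1.0)
```

If the exposure is scaled by its IQR before fitting, the coefficient is already per IQR and `iqr` can be set to 1.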

DISCUSSION

In this article, we proposed a Bayesian multinomial logistic regression model for data with partially missing outcomes to estimate the relationship between long‐term exposure to air pollution and COVID‐19 peak severity outcomes. We implemented Pólya‐gamma data augmentation to achieve efficient computation of the posterior distribution. We developed a multiple imputation algorithm to impute missing outcomes, where the number of missing outcome categories for each case can vary from 2 to the total number of outcomes. Our model is based on the stick‐breaking representation of the multinomial distribution, which presents a challenge in interpreting regression coefficients. The stick‐breaking multinomial regression approach has historically been used for applications focused on clustering and prediction (Linderman et al., 2015). To our knowledge, we present the first application of this approach in which inference using the odds ratio is the primary goal. We proposed an inferential approach based on visualization of the posterior distribution to retain the familiar logistic regression interpretation of the odds ratios. In a simulation study, we demonstrated our method's ability to impute missing outcome data and improve estimation over complete case analyses. In eight different scenarios, our proposed method produced unbiased estimates for the regression coefficients that had smaller RMSE and CI width than estimates from respective complete case analyses. Our proposed method leveraged information from various sources to improve imputation. These sources include: partially, as opposed to fully, missing outcomes, a signal in the data, and larger outcome categories. Better imputations resulted in even more efficient inference on the regression coefficients using our method compared to the complete case analysis. 
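One way to picture the imputation step for partially missing outcomes is as a draw from the model's category probabilities restricted to the categories that remain possible for each case. The sketch below illustrates that idea only; it is an assumption about the general mechanism rather than the authors' algorithm, and the function name and the boolean candidate-set representation are hypothetical.

```python
import numpy as np

def impute_outcomes(probs, candidates, rng):
    """Draw one category per case from probabilities restricted to its candidate set.

    probs:      (n, K) current model probabilities for each category.
    candidates: (n, K) boolean matrix; True marks categories still possible.
                A fully observed case has a single True entry, a fully
                missing case has all K entries True.
    """
    restricted = np.where(candidates, probs, 0.0)
    restricted = restricted / restricted.sum(axis=1, keepdims=True)  # renormalize
    n_cat = probs.shape[1]
    return np.array([rng.choice(n_cat, p=p) for p in restricted])
```

Within an MCMC sampler, a step like this would be repeated at every iteration with the current probabilities, so that uncertainty in the imputed outcomes propagates to the regression coefficients. The smaller the candidate set, the more the draw is constrained, which is consistent with the better imputation performance reported for partially missing outcomes.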
Using our proposed method, we estimated the association between long‐term exposure to PM, ozone, and temperature and COVID‐19 peak severity in a Denver, Colorado cohort. Our model imputed outcomes for the 34,401 cases with partially missing outcome data. In our analysis, we found increased long‐term exposure to PM, combined with high levels of ozone and temperature, was associated with an increased risk of being hospitalized and admitted to the ICU, relative to being asymptomatic. These associations were null or reversed when ozone and temperature were low, indicating interaction effects between the exposures. Through visualization of the OR and IOR, combined with analysis of the estimated regression coefficients, we identified an interaction effect between PM and ozone. A complete case analysis produced similar results, but with more uncertainty, further exemplifying the estimation gains from our proposed method with imputation. Our results support recent studies that identified an association between increased PM exposure and a lagged effect on COVID‐19 mortality (Garcia et al., 2021; Shao et al., 2021). Our individual‐level analysis of the Denver, Colorado cohort fills a major gap in the literature. With individual‐level data, we controlled for known confounding variables and risk factors, and began to establish a driving association between air pollution exposure and COVID‐19 outcomes. Our primary focus was to understand how long‐term exposure to air pollution relates to severity of COVID‐19 outcomes. A natural extension is to study short‐term air pollution exposure, in which case our model could be adapted to a time series framework where the short‐term exposure window moves with case report date. In this case, the model could be used to predict future health outcomes based on forecasted exposure data. 
A notable complexity in the study of short‐term exposure to air pollution and COVID‐19 health outcomes is the confounding relationship among timing of the pandemic, lifestyle restrictions, and ambient air pollution levels (Zheng et al., 2021).
  24 in total

1.  The Denver Aerosol Sources and Health (DASH) Study: Overview and Early Findings.

Authors:  S Vedal; M P Hannigan; S J Dutton; S L Miller; J B Milford; N Rabinovitch; S-Y Kim; L Sheppard
Journal:  Atmos Environ (1994)       Date:  2008-12-24       Impact factor: 4.798

Review 2.  Air pollution and respiratory viral infection.

Authors:  Jonathan Ciencewicki; Ilona Jaspers
Journal:  Inhal Toxicol       Date:  2007-11       Impact factor: 2.724

3.  A nonparametric multiple imputation approach for missing categorical data.

Authors:  Muhan Zhou; Yulei He; Mandi Yu; Chiu-Hsieh Hsu
Journal:  BMC Med Res Methodol       Date:  2017-06-06       Impact factor: 4.615

Review 4.  Air Pollution and Covid-19: The Role of Particulate Matter in the Spread and Increase of Covid-19's Morbidity and Mortality.

Authors:  Silvia Comunian; Dario Dongo; Chiara Milani; Paola Palestini
Journal:  Int J Environ Res Public Health       Date:  2020-06-22       Impact factor: 3.390

5.  SARS-Cov-2RNA found on particulate matter of Bergamo in Northern Italy: First evidence.

Authors:  Leonardo Setti; Fabrizio Passarini; Gianluigi De Gennaro; Pierluigi Barbieri; Maria Grazia Perrone; Massimo Borelli; Jolanda Palmisani; Alessia Di Gilio; Valentina Torboli; Francesco Fontana; Libera Clemente; Alberto Pallavicini; Maurizio Ruscio; Prisco Piscitelli; Alessandro Miani
Journal:  Environ Res       Date:  2020-05-30       Impact factor: 6.498

6.  Severe air pollution links to higher mortality in COVID-19 patients: The "double-hit" hypothesis.

Authors:  Antonio Frontera; Lorenzo Cianfanelli; Konstantinos Vlachos; Giovanni Landoni; George Cremona
Journal:  J Infect       Date:  2020-05-21       Impact factor: 6.072

7.  Effects of corona virus disease-19 control measures on air quality in North China.

Authors:  Xiangyu Zheng; Bin Guo; Jing He; Song Xi Chen
Journal:  Environmetrics       Date:  2021-02-20       Impact factor: 1.900

8.  Association between air pollution and COVID-19 disease severity via Bayesian multinomial logistic regression with partially missing outcomes.

Authors:  Lauren Hoskovec; Sheena Martenies; Tori L Burket; Sheryl Magzamen; Ander Wilson
Journal:  Environmetrics       Date:  2022-07-31       Impact factor: 1.527

9.  Global burden of 87 risk factors in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019.

Authors: 
Journal:  Lancet       Date:  2020-10-17       Impact factor: 202.731
