
A Bayesian framework for estimating the risk ratio of hospitalization for people with comorbidity infected by SARS-CoV-2 virus.

Abstract

OBJECTIVE: Estimating the hospitalization risk for people with comorbidities infected by the SARS-CoV-2 virus is important for developing public health policies and guidance. Traditional biostatistical methods for risk estimations require: (i) the number of infected people who were not hospitalized, which may be severely undercounted since many infected people were not tested; (ii) comorbidity information for people not hospitalized, which may not always be readily available. We aim to overcome these limitations by developing a Bayesian approach to estimate the risk ratio of hospitalization for COVID-19 patients with comorbidities.
MATERIALS AND METHODS: We derived a Bayesian approach to estimate the posterior distribution of the risk ratio using the observed frequency of comorbidities in COVID-19 patients in hospitals and the prevalence of comorbidities in the general population. We applied our approach to 2 large-scale datasets in the United States: 2491 patients in the COVID-NET, and 5700 patients in New York hospitals.
RESULTS: Our results consistently indicated that cardiovascular diseases carried the highest hospitalization risk for COVID-19 patients, followed by diabetes, chronic respiratory disease, hypertension, and obesity, respectively.
DISCUSSION: Our approach only needs (i) the number of hospitalized COVID-19 patients and their comorbidity information, which can be reliably obtained using hospital records, and (ii) the prevalence of the comorbidity of interest in the general population, which is regularly documented by public health agencies for common medical conditions.
CONCLUSION: We developed a novel Bayesian approach to estimate the hospitalization risk for people with comorbidities infected with the SARS-CoV-2 virus.


Keywords: Bayesian; COVID-19; SARS-CoV-2; comorbidity; hospitalization; risk ratio


Year: 2021 PMID: 32986795 PMCID: PMC7543407 DOI: 10.1093/jamia/ocaa246

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497

INTRODUCTION

Recently published data from China, Italy, and the US showed that a large portion of hospitalized COVID-19 patients had at least 1 preexisting comorbidity or medical condition. Estimating the hospitalization risk for people with a certain comorbidity condition infected by the SARS-CoV-2 virus would be important for developing public health policies and guidance (eg, planning medical resources) based on risk stratification. To provide such an estimation, traditional biostatistical methods (eg, logistic regression) require knowing both the number of infected people who were hospitalized and the number of infected people who were not hospitalized. Although the number of infected people who were hospitalized might be tallied from hospital records, the number of infected people who were not hospitalized may be difficult to estimate. The reason is that the official count of infected people who were not hospitalized is limited to those tested for the virus. Since not everyone infected by the virus is tested, the number of infected people not hospitalized may be undercounted in the traditional analysis. In addition, comorbidity information for people not hospitalized may not always be readily available for traditional biostatistical analyses. For example, Petrilli et al recently applied multiple logistic regression to identify factors associated with hospital admission using a prospective cohort, but they had stated 2 main limitations in their study regarding the patients who were not hospitalized: (i) many confirmed patients might not have had their detailed medical history recorded for the study, and (ii) many potential patients might have been omitted from the study since they were not tested for the virus. To overcome these limitations, we developed a Bayesian approach to estimate the risk ratio of hospitalization for COVID-19 patients with comorbidities. 
The risk ratio is defined as a ratio of the probability of hospitalization for the infected people with a particular comorbidity (eg, diabetes) versus the probability of hospitalization for the infected people without any comorbidities. In other words, people without any comorbidities are used as reference for the risk ratio estimation. Our Bayesian approach only needs (i) the number of hospitalized COVID-19 patients and their comorbidity information, which can be reliably obtained using hospital records, and (ii) the prevalence of the comorbidity of interest in the general population, which is regularly documented by public health agencies for common medical conditions (eg, hypertension, obesity, and diabetes). By applying our approach to 2 different large-scale datasets in the US, we obtained consistent results showing that cardiovascular disease had the highest elevated risk of hospitalization, followed by diabetes, chronic respiratory diseases, hypertension, and obesity, respectively.

MATERIALS AND METHODS

Bayesian modeling

The overview of our Bayesian model is depicted in Figure 1. We classify people in the general population into 3 categories: (1) h: people without any known comorbidities; (2) c: people with a particular comorbidity of interest (eg, diabetes), who may or may not carry other types of comorbidities; and (3) o: people with some other types of comorbidities excluding c (eg, any other types of comorbidities excluding diabetes). Let $N_h$, $N_c$, and $N_o$ denote the number of hospitalized COVID-19 patients in these 3 categories, respectively. Let $N$ denote the total number of COVID-19 patients in the hospital, that is, $N = N_h + N_c + N_o$. Then the vector $(N_h, N_c, N_o)$ is a multinomial random variable:

$$(N_h, N_c, N_o) \sim \mathrm{Multinomial}(N, \boldsymbol{\theta}) \tag{1}$$

where $\boldsymbol{\theta} = (\theta_h, \theta_c, \theta_o)$ denotes a vector of unobserved multinomial probabilities corresponding to $N_h$, $N_c$, and $N_o$, respectively.

Figure 1.

Graphic representation of the Bayesian model. On the left, the boxes represent constants for either fixed parameter values ($\alpha_h$, $\beta_h$, $\alpha_c$, $\beta_c$, $a_h$, $a_c$, $a_o$) for priors or observed data ($N$, $N_h$, $N_c$, $N_o$). The grey ovals ($\kappa_h$, $\kappa_c$, $\boldsymbol{\theta}$) represent stochastic nodes. The white oval ($\tau_c/\tau_h$) represents a deterministic node. The solid arrows represent stochastic dependence. The dashed arrows represent logical dependence. The corresponding stochastic and deterministic expressions are depicted on the right.

Let $\tau_h$, $\tau_c$, and $\tau_o$ denote the unknown probabilities of hospitalization for people infected by SARS-CoV-2 virus in the categories of h, c, and o, respectively, in the general population. For example, if 50 000 out of a total of 200 000 infected people with a particular comorbidity of interest (ie, in the category of c) were eventually hospitalized, $\tau_c$ would be equal to 0.25 (ie, 50 000/200 000). We define $\tau_c/\tau_h$ to be the risk ratio of the probability of hospitalization for the infected people with a particular comorbidity of interest c (eg, diabetes) versus the probability of hospitalization for the infected people without any comorbidities. It is important to note that $\tau_h$, $\tau_c$, and $\tau_o$ are different from the components of $\boldsymbol{\theta}$. However, we derived an algebraic relationship between $\tau_c/\tau_h$ and $\theta_c/\theta_h$, as shown below.

Let $\kappa_h$ and $\kappa_c$ denote the proportion of people without any medical conditions and the proportion of people with a comorbidity of interest c (eg, diabetes), respectively, in the general population of size $N$. Let $\rho_h$, $\rho_c$, and $\rho_o$ denote the probabilities of being infected by SARS-CoV-2 virus for people in the categories of h, c, and o, respectively. Then, $\theta_h$ and $\theta_c$ can be expressed as follows:

$$\theta_h = \frac{\tau_h \rho_h \kappa_h N}{\tau_h \rho_h \kappa_h N + \tau_c \rho_c \kappa_c N + \tau_o \rho_o (1 - \kappa_h - \kappa_c) N} \tag{2}$$

$$\theta_c = \frac{\tau_c \rho_c \kappa_c N}{\tau_h \rho_h \kappa_h N + \tau_c \rho_c \kappa_c N + \tau_o \rho_o (1 - \kappa_h - \kappa_c) N} \tag{3}$$

In Eq. (2), the numerator ($\tau_h \rho_h \kappa_h N$) corresponds to the expected number of hospitalized COVID-19 patients without any comorbidities.
Specifically, out of the general population of size $N$, $\kappa_h N$ is the expected number of people without any comorbidities given the definition of $\kappa_h$; $\rho_h \kappa_h N$ is the expected number of people without any comorbidities infected by the virus given the definition of $\rho_h$; $\tau_h \rho_h \kappa_h N$ is the expected number of infected people without any comorbidities who were hospitalized given the definition of $\tau_h$. Similarly, in Eq. (3), the numerator ($\tau_c \rho_c \kappa_c N$) corresponds to the expected number of hospitalized COVID-19 patients with the comorbidity of interest c. In addition, the term $\tau_o \rho_o (1 - \kappa_h - \kappa_c) N$ in the denominator of both Eq. (2) and (3) corresponds to the expected number of hospitalized COVID-19 patients with some other types of comorbidities excluding c. Therefore, the denominator in both Eq. (2) and (3) corresponds to the expected total number of hospitalized COVID-19 patients, regardless of the status of their comorbidities.

If we assume that people in the general population, regardless of the status of their comorbidities, have an equal probability of being infected by SARS-CoV-2 virus (see Discussion), then $\rho_h = \rho_c = \rho_o$, and Eq. (2) and (3) can be simplified by canceling out $\rho_h$, $\rho_c$, $\rho_o$, and $N$ from both the numerator and denominator. The simplified Eq. (2) and (3) are shown as follows:

$$\theta_h = \frac{\tau_h \kappa_h}{\tau_h \kappa_h + \tau_c \kappa_c + \tau_o (1 - \kappa_h - \kappa_c)} \tag{4}$$

$$\theta_c = \frac{\tau_c \kappa_c}{\tau_h \kappa_h + \tau_c \kappa_c + \tau_o (1 - \kappa_h - \kappa_c)} \tag{5}$$

Dividing Eq. (5) by Eq. (4), the common denominator cancels, giving $\theta_c/\theta_h = \tau_c \kappa_c / (\tau_h \kappa_h)$. Rearranging, we obtain the expression of $\tau_c/\tau_h$ as follows:

$$\frac{\tau_c}{\tau_h} = \frac{\theta_c \, \kappa_h}{\theta_h \, \kappa_c} \tag{6}$$

To estimate the posterior probability of $\tau_c/\tau_h$, we need to sample from the following posterior distribution:

$$P(\boldsymbol{\theta}, \kappa_h, \kappa_c \mid N_h, N_c, N_o) \propto P(N_h, N_c, N_o \mid \boldsymbol{\theta}) \, P(\boldsymbol{\theta}) \, P(\kappa_h) \, P(\kappa_c) \tag{7}$$

To specify the prior distribution for $\boldsymbol{\theta}$, we chose a Dirichlet distribution, as it is commonly used as the conjugate prior of the multinomial likelihood described in Eq. (1):

$$\boldsymbol{\theta} \sim \mathrm{Dirichlet}(a_h, a_c, a_o)$$

where $a_h$, $a_c$, and $a_o$ correspond to the shape parameters of the Dirichlet distribution. For this study, we set $a_h = a_c = a_o = 1$ to obtain a uniform prior, although those values can be adjusted to more accurately reflect the probabilities of hospitalization for each category of patients if more data become available in the future.
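The ratio relationship in Eq. (6) can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: it assumes equal infection probability across categories and that the share of category o is 1 − κ_h − κ_c, and every numeric value in it is hypothetical and chosen only for illustration. The function name `hospital_shares` is likewise an illustrative choice.

```python
def hospital_shares(tau_h, tau_c, tau_o, kappa_h, kappa_c):
    """Expected shares (theta_h, theta_c, theta_o) of hospitalized patients.

    Assumes equal infection probability in all three categories, so the
    infection probabilities and the population size cancel out.
    """
    kappa_o = 1.0 - kappa_h - kappa_c  # assumed share of category o
    weights = (tau_h * kappa_h, tau_c * kappa_c, tau_o * kappa_o)
    total = sum(weights)
    return tuple(w / total for w in weights)


# Hypothetical hospitalization probabilities and population proportions.
tau_h, tau_c, tau_o = 0.05, 0.20, 0.10
kappa_h, kappa_c = 0.12, 0.13

theta_h, theta_c, theta_o = hospital_shares(tau_h, tau_c, tau_o, kappa_h, kappa_c)

# Risk ratio recovered from hospital shares and prevalences only.
recovered_ratio = (theta_c * kappa_h) / (theta_h * kappa_c)
true_ratio = tau_c / tau_h
print(recovered_ratio, true_ratio)
```

The point of the check is that the recovered ratio does not depend on τ_o or on the size of the population, which is why the approach needs only hospital shares and population prevalences.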
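To specify the prior distributions for $\kappa_h$ and $\kappa_c$, we chose beta distributions as they are commonly used to model proportions:

$$\kappa_h \sim \mathrm{Beta}(\alpha_h, \beta_h), \qquad \kappa_c \sim \mathrm{Beta}(\alpha_c, \beta_c)$$

However, $\kappa_h$ and $\kappa_c$ are not unconstrained beta variables since $\kappa_h + \kappa_c < 1$ (ie, the combined proportion of people without any medical conditions and people with a comorbidity of interest c is less than 100%). We implemented a constrained sampling strategy to enforce this constraint (see below). In addition, $\alpha_h$, $\beta_h$, $\alpha_c$, and $\beta_c$ denote shape parameters of the corresponding beta distributions. Using the method of moments, these parameters can be expressed as follows:

$$\alpha_h = \mu_h \left( \frac{\mu_h (1 - \mu_h)}{\sigma_h^2} - 1 \right), \qquad \beta_h = (1 - \mu_h) \left( \frac{\mu_h (1 - \mu_h)}{\sigma_h^2} - 1 \right)$$

$$\alpha_c = \mu_c \left( \frac{\mu_c (1 - \mu_c)}{\sigma_c^2} - 1 \right), \qquad \beta_c = (1 - \mu_c) \left( \frac{\mu_c (1 - \mu_c)}{\sigma_c^2} - 1 \right)$$

where $\mu_h$ and $\sigma_h^2$, and $\mu_c$ and $\sigma_c^2$ represent the mean and variance of the proportions of the healthy people and people with the comorbidity of interest c, respectively, in the general population. Then, by plugging in the priors in Eq. (7), the posterior distribution becomes the following:

$$P(\boldsymbol{\theta}, \kappa_h, \kappa_c \mid N_h, N_c, N_o) \propto \mathrm{Multinomial}(N_h, N_c, N_o \mid N, \boldsymbol{\theta}) \; \mathrm{Dirichlet}(\boldsymbol{\theta} \mid a_h, a_c, a_o) \; \mathrm{Beta}(\kappa_h \mid \alpha_h, \beta_h) \; \mathrm{Beta}(\kappa_c \mid \alpha_c, \beta_c), \qquad \kappa_h + \kappa_c < 1$$

In summary, the foundation of our approach is our derived algebraic relationship (Eq. 6) between the quantity $\tau_c/\tau_h$ (the risk ratio) and the quantities $\kappa_h$, $\kappa_c$, and $\boldsymbol{\theta}$. Using a uniform Dirichlet distribution, $\boldsymbol{\theta}$ is modeled by a noninformative prior; $\boldsymbol{\theta}$ is related to the observed data ($N$, $N_h$, $N_c$, and $N_o$) in hospitalized COVID-19 patients through the multinomial likelihood as described in Eq. (1). $\kappa_h$ and $\kappa_c$ are modeled by informative priors using beta distributions whose shape parameters were expressed using the published prevalence rates for comorbidities in the general population. Through sampling from the posterior distribution of $\kappa_h$, $\kappa_c$, and $\boldsymbol{\theta}$, we were able to estimate the posterior distribution of $\tau_c/\tau_h$ as a derived quantity, $\theta_c \kappa_h / (\theta_h \kappa_c)$, with the constraint that the sum of $\kappa_h$ and $\kappa_c$ is less than 1.

We used WinBUGS (version 1.4.3) to implement the above models. In particular, the imposed constraint on the sum of $\kappa_h$ and $\kappa_c$ was implemented using the "ones trick" of WinBUGS (see Supplementary File 1 for the implementation details).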
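The method-of-moments step can be illustrated with a short Python sketch. It uses the standard conversion from a mean and variance to beta shape parameters; the function name `beta_shape_from_moments` is an illustrative choice, and treating a reported standard error as the standard deviation of the prior is an assumption made here for the example (the asthma prevalence of 7.7% ± 0.22% quoted in this article is used as input).

```python
def beta_shape_from_moments(mean, variance):
    """Return (alpha, beta) of a beta distribution with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_variance = mean * (1.0 - mean)
    if not 0.0 < variance < max_variance:
        raise ValueError("variance must lie strictly between 0 and mean * (1 - mean)")
    common = max_variance / variance - 1.0
    return mean * common, (1.0 - mean) * common


# Assumption: the reported standard error is used as the prior standard deviation.
prior_mean, prior_sd = 0.077, 0.0022
alpha_c, beta_c = beta_shape_from_moments(prior_mean, prior_sd ** 2)

# Round-trip check: recompute the mean and variance implied by (alpha_c, beta_c).
implied_mean = alpha_c / (alpha_c + beta_c)
implied_variance = (alpha_c * beta_c) / ((alpha_c + beta_c) ** 2 * (alpha_c + beta_c + 1.0))
print(alpha_c, beta_c, implied_mean, implied_variance)
```

A small standard error relative to the mean yields large shape parameters, that is, a tightly concentrated informative prior.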
The posterior distributions of risk ratios for different comorbidities were estimated with Markov Chain Monte Carlo (MCMC) sampling in WinBUGS using the following settings: 4 chains, 100 000 total iterations, a burn-in of 10 000, and thinning of 4. Convergence and autocorrelation were evaluated with trace/history and autocorrelation plots. Multiple initial values were applied for MCMC sampling.
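For readers without WinBUGS, the model can be approximated by direct Monte Carlo sampling. The sketch below is not the authors' implementation (their WinBUGS code is in Supplementary File 1). It relies on two assumptions stated here: (i) with a uniform Dirichlet prior and a multinomial likelihood, the posterior of the hospital shares is Dirichlet with parameters 1 + counts, and (ii) because the hospital counts do not inform the prevalences, the two prevalences can be drawn from their beta priors and rejected whenever their sum reaches 1. The function names, the seed, the number of draws, and the standard deviation of 0.0069 for the healthy proportion (a rough value derived from the reported 95% CI) are illustrative choices. The counts are the New York asthma figures quoted in this article.

```python
import random
import statistics


def beta_shape_from_moments(mean, variance):
    """Method-of-moments shape parameters (alpha, beta) of a beta distribution."""
    common = mean * (1.0 - mean) / variance - 1.0
    if common <= 0.0:
        raise ValueError("variance too large for a beta distribution")
    return mean * common, (1.0 - mean) * common


def sample_risk_ratio(n_h, n_c, n_o, kappa_h_prior, kappa_c_prior,
                      draws=20000, seed=1):
    """Draw samples of theta_c * kappa_h / (theta_h * kappa_c).

    kappa_h_prior and kappa_c_prior are (mean, variance) pairs.
    """
    rng = random.Random(seed)
    alpha_h, beta_h = beta_shape_from_moments(*kappa_h_prior)
    alpha_c, beta_c = beta_shape_from_moments(*kappa_c_prior)
    ratios = []
    while len(ratios) < draws:
        kappa_h = rng.betavariate(alpha_h, beta_h)
        kappa_c = rng.betavariate(alpha_c, beta_c)
        if kappa_h + kappa_c >= 1.0:
            continue  # reject draws that violate kappa_h + kappa_c < 1
        # Dirichlet(1 + counts) draw built from independent gamma variates.
        gammas = [rng.gammavariate(1.0 + n, 1.0) for n in (n_h, n_c, n_o)]
        total = sum(gammas)
        theta_h, theta_c = gammas[0] / total, gammas[1] / total
        ratios.append((theta_c * kappa_h) / (theta_h * kappa_c))
    return ratios


# New York asthma counts from the text; prior standard deviations are assumptions.
ratios = sample_risk_ratio(
    n_h=350, n_c=479, n_o=5700 - 350 - 479,
    kappa_h_prior=(0.122, 0.0069 ** 2),
    kappa_c_prior=(0.077, 0.0022 ** 2),
)
cuts = statistics.quantiles(ratios, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
median, lower, upper = statistics.median(ratios), cuts[0], cuts[-1]
print(f"median={median:.2f}, central 95% interval=({lower:.2f}, {upper:.2f})")
```

The printed median and interval can be compared with the corresponding row of Table 1. Exact agreement should not be expected, since the sampler, the seed, and the prior standard deviations used here differ from the published analysis.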

Comorbidity data for hospitalized COVID-19 patients

For the above Bayesian approach, the following 2 types of data are required: (1) the frequency of the comorbidity of interest (eg, diabetes) in COVID-19 patients in hospitals, and (2) the prevalence of the comorbidity in the general population.

For the comorbidity frequency of hospitalized COVID-19 patients, we used a large-scale dataset, available at COVID-NET, collected from 154 acute care hospitals in 74 counties in 13 states in the US from March 1 to May 2, 2020. Among a total of 2491 hospitalized adult patients with laboratory-confirmed COVID-19 in this COVID-NET dataset, 314 had asthma, 266 had COPD, 859 had cardiovascular diseases, 819 had diabetes, 1154 were obese, 1428 had hypertension, and 336 had no known medical conditions. Besides the COVID-NET dataset, we also used a published dataset from the state of New York collected from 12 hospitals in New York City, Long Island, and Westchester County from March 1 to April 4, 2020. Among a total of 5700 hospitalized COVID-19 patients in this New York dataset, 479 had asthma, 287 had COPD, 966 had cardiovascular diseases, 1808 had diabetes, 1737 were obese, 3026 had hypertension, and 350 had no known medical conditions. In both the COVID-NET and New York datasets, cardiovascular disease referred to coronary artery disease and congestive heart failure.

For the prevalence of comorbidities in the general US adult population, the following estimates (mean ± standard error) by US government public health agencies were used: asthma (7.7%±0.22%), cardiovascular disease (5.6%±0.14%), COPD (5.9%±0.051%), diabetes (13%±5.6%), obesity (42.4%±1.8%), and hypertension (49.1%±1.5%). The proportion of healthy adults in the US who have no medical conditions was estimated to be 12.2% (95% CI: 10.9–13.6).
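Two small preparation steps are implied by these inputs: obtaining the count for category o by subtraction, and converting a 95% confidence interval into a standard error. The Python sketch below illustrates both. The conversion assumes the interval is a symmetric normal-approximation interval, which the source does not state, and the function names are illustrative.

```python
Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def se_from_ci95(lower, upper):
    """Approximate standard error from a 95% CI, assuming a normal approximation."""
    return (upper - lower) / (2.0 * Z_975)


def other_count(total, n_healthy, n_comorbidity):
    """Hospitalized patients in category o: total minus categories h and c."""
    n_other = total - n_healthy - n_comorbidity
    if n_other < 0:
        raise ValueError("category counts exceed the total")
    return n_other


# Healthy-adult proportion reported as 12.2% (95% CI: 10.9-13.6).
kappa_h_se = se_from_ci95(0.109, 0.136)

# New York dataset, diabetes as the comorbidity of interest.
n_o_ny_diabetes = other_count(total=5700, n_healthy=350, n_comorbidity=1808)
print(kappa_h_se, n_o_ny_diabetes)
```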

RESULTS

Estimation with the COVID-NET data

We applied our Bayesian approach to a dataset of hospitalized COVID-19 patients from the COVID-NET (n = 2491). Table 1 provides the summary statistics (median, central 95% Bayesian credible interval) for the estimated posterior distributions of the risk ratio of hospitalization for COVID-19 patients with cardiovascular disease (6.9, 5.1–9.3), diabetes (3.6, 2.9–4.4), COPD (2.6, 2.1–3.2), asthma (2.3, 1.9–2.9), hypertension (1.7, 1.4–2.0), and obesity (1.6, 1.3–1.9). Figure 2A depicts the posterior distributions for these estimated risk ratios.

Table 1.

Summary statistics of the posterior distributions of the hospitalization risk for COVID-19 patients with comorbidities

Comorbidity	COVID-NET, median risk ratio (central 95% Bayesian credible interval)	New York, median risk ratio (central 95% Bayesian credible interval)
Asthma	2.331 (1.878–2.894)	2.165 (1.793–2.610)
Cardiovascular disease	6.885 (5.139–9.317)	6.369 (4.831–8.483)
COPD	2.577 (2.079–3.188)	1.694 (1.393–2.054)
Diabetes	3.599 (2.926–4.428)	4.838 (4.031–5.811)
Hypertension	1.660 (1.368–2.017)	2.143 (1.806–2.538)
Obesity	1.555 (1.269–1.906)	1.425 (1.187–1.711)

Figure 2.

The posterior probability density of the risk ratio of hospitalization for each comorbidity estimated from the datasets of (A) COVID-NET and (B) New York.


Estimation with the New York data

A different dataset from the state of New York (n = 5700) was used as a comparison. The results from the New York dataset were similar to the ones from the COVID-NET dataset, showing that cardiovascular diseases (6.4, 4.8–8.5) and diabetes (4.8, 4.0–5.8) significantly increased the risk of hospitalization for COVID-19 patients, followed by asthma (2.2, 1.8–2.6), hypertension (2.1, 1.8–2.5), COPD (1.7, 1.4–2.1), and obesity (1.4, 1.2–1.7) (Table 1, Figure 2B). The only major difference in the ranking of the risk between the COVID-NET and the New York datasets was for COPD, with a lower risk estimated from the New York dataset than from the COVID-NET dataset.

DISCUSSION

Using 2 different large-scale datasets from COVID-NET and New York, our approach obtained similar results, which strongly indicated that COVID-19 patients with cardiovascular disease or diabetes had the highest elevated risk of hospitalization. For example, the risk ratio of hospitalization for COVID-19 patients with cardiovascular disease was estimated to have a median value greater than 6, indicating that the hospitalization risk of COVID-19 patients with cardiovascular disease was more than 6 times that of COVID-19 patients without any comorbidities. The hospitalization risk for COVID-19 patients also increased with chronic respiratory disease (asthma and COPD), hypertension, and obesity. These comorbidities were selected for this study since they were documented in both the COVID-NET and the New York datasets. Our preliminary exploration with the COVID-NET dataset also indicated elevated risks for people with autoimmune diseases, immune suppression conditions, and renal diseases (data not shown). We encourage other researchers to apply our Bayesian model to their own investigations.

One limitation of our current analysis is that our estimated hospitalization risk for patients with a comorbidity of interest (eg, diabetes) may be confounded by other comorbidities in the same patient (eg, hypertension). This limitation is attributable to our lack of access to the necessary data rather than to our Bayesian approach per se. In this study, we relied on 2 sets of published summary statistics (ie, COVID-NET and New York), which did not include the details of joint comorbidities. Researchers who can access the complete medical records (instead of just summary statistics) would be able to obtain the frequency of joint comorbidities (eg, the number of hospitalized COVID-19 patients with both diabetes and hypertension). Then, our Bayesian approach could be applied directly to estimate the hospitalization risk for such joint comorbidities.
Specifically, instead of using the frequency of a particular comorbidity (eg, diabetes), our model would use the frequency of the joint comorbidities (eg, diabetes and hypertension). In addition, the joint prevalence of comorbidities (eg, diabetes and hypertension) would be used as informative priors. The only biological assumption in our Bayesian model is that people in the general population, regardless of the status of their comorbidities, could be infected by the SARS-CoV-2 virus with similar probability (no assumptions on the severity of the symptoms after infection were made). This assumption is based on the rationale that SARS-CoV-2 is a newly emerged virus to the human population; thus, nobody is particularly immune to the virus. For example, it has been reported that viral loads were similar in asymptomatic and symptomatic patients, and young children and adults were both similarly infected by the virus. If future research shows that people with specified comorbidities do have a different chance of being infected, our Bayesian approach can be modified to accommodate that difference (eg, using informative prior probability distributions to reflect the differential degrees of the chance of infection).
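One way such a modification could look is sketched below as a short derivation; it is an illustration under stated assumptions rather than a result from the study. It assumes that the expected numbers of hospitalized patients in categories h and c are proportional to the product of the hospitalization probability, the infection probability, and the population proportion, with $\rho_h$ and $\rho_c$ denoting the infection probabilities for the two categories. Taking the ratio of the two expected shares, the common denominator cancels:

```latex
\frac{\theta_c}{\theta_h}
  = \frac{\tau_c \, \rho_c \, \kappa_c}{\tau_h \, \rho_h \, \kappa_h}
\quad\Longrightarrow\quad
\frac{\tau_c}{\tau_h}
  = \frac{\theta_c}{\theta_h} \cdot \frac{\kappa_h}{\kappa_c} \cdot \frac{\rho_h}{\rho_c}
```

Under this reading, the equal-infection case corresponds to $\rho_h/\rho_c = 1$, and a differential chance of infection would enter only through the extra factor $\rho_h/\rho_c$, which could itself be given an informative prior.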

CONCLUSION

We developed a novel Bayesian approach to estimate the hospitalization risk for people with comorbidities infected with the SARS-CoV-2 virus. Our results indicated that cardiovascular diseases carried the highest hospitalization risk for COVID-19 patients, followed by diabetes, chronic respiratory disease, hypertension, and obesity, respectively.

AUTHOR CONTRIBUTIONS

XG contributed to project conception and data analysis; QD contributed to Bayesian modeling and drafting the manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

REFERENCES

1. Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area.

Authors: Safiya Richardson; Jamie S Hirsch; Mangala Narasimhan; James M Crawford; Thomas McGinn; Karina W Davidson; Douglas P Barnaby; Lance B Becker; John D Chelico; Stuart L Cohen; Jennifer Cookingham; Kevin Coppa; Michael A Diefenbach; Andrew J Dominello; Joan Duer-Hefele; Louise Falzon; Jordan Gitlin; Negin Hajizadeh; Tiffany G Harvin; David A Hirschwerk; Eun Ji Kim; Zachary M Kozel; Lyndonna M Marrast; Jazmin N Mogavero; Gabrielle A Osorio; Michael Qiu; Theodoros P Zanos
Journal: JAMA Date: 2020-05-26 Impact factor: 56.272

2. Case-Fatality Rate and Characteristics of Patients Dying in Relation to COVID-19 in Italy.

Authors: Graziano Onder; Giovanni Rezza; Silvio Brusaferro
Journal: JAMA Date: 2020-05-12 Impact factor: 56.272

3. Prevalence of Optimal Metabolic Health in American Adults: National Health and Nutrition Examination Survey 2009-2016.

Authors: Joana Araújo; Jianwen Cai; June Stevens
Journal: Metab Syndr Relat Disord Date: 2018-11-27 Impact factor: 1.894

4. SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients.

Authors: Lirong Zou; Feng Ruan; Mingxing Huang; Lijun Liang; Huitao Huang; Zhongsi Hong; Jianxiang Yu; Min Kang; Yingchao Song; Jinyu Xia; Qianfang Guo; Tie Song; Jianfeng He; Hui-Ling Yen; Malik Peiris; Jie Wu
Journal: N Engl J Med Date: 2020-02-19 Impact factor: 91.245

5. Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study.

Authors: Christopher M Petrilli; Simon A Jones; Jie Yang; Harish Rajagopalan; Luke O'Donnell; Yelena Chernyak; Katie A Tobin; Robert J Cerfolio; Fritz Francois; Leora I Horwitz
Journal: BMJ Date: 2020-05-22

6. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study.

Authors: Fei Zhou; Ting Yu; Ronghui Du; Guohui Fan; Ying Liu; Zhibo Liu; Jie Xiang; Yeming Wang; Bin Song; Xiaoying Gu; Lulu Guan; Yuan Wei; Hui Li; Xudong Wu; Jiuyang Xu; Shengjin Tu; Yi Zhang; Hua Chen; Bin Cao
Journal: Lancet Date: 2020-03-11 Impact factor: 79.321

7. Risk Factors for Intensive Care Unit Admission and In-hospital Mortality Among Hospitalized Adults Identified through the US Coronavirus Disease 2019 (COVID-19)-Associated Hospitalization Surveillance Network (COVID-NET).

Authors: Lindsay Kim; Shikha Garg; Alissa O'Halloran; Michael Whitaker; Huong Pham; Evan J Anderson; Isaac Armistead; Nancy M Bennett; Laurie Billing; Kathryn Como-Sabetti; Mary Hill; Sue Kim; Maya L Monroe; Alison Muse; Arthur L Reingold; William Schaffner; Melissa Sutton; H Keipp Talbot; Salina M Torres; Kimberly Yousey-Hindes; Rachel Holstein; Charisse Cummings; Lynnette Brammer; Aron J Hall; Alicia M Fry; Gayle E Langley
Journal: Clin Infect Dis Date: 2021-05-04 Impact factor: 9.079
