An observational study to develop a scoring system and model to detect risk of hospital admission due to COVID-19.

Zhe Chen1, Nicholas W Russo1, Matthew M Miller1, Robert X Murphy2, David B Burmeister1.   

Abstract

BACKGROUND: COVID-19 has caused an unprecedented global health emergency. The strains of such a pandemic can overwhelm hospital capacity. Efficient clinical decision-making is crucial for proper healthcare resource utilization in this crisis. Using observational study data, we set out to create a predictive model that could anticipate which COVID-19 patients would likely be admitted and developed a scoring tool that could be used in the clinical setting and for population risk stratification.
METHODS: We retrospectively evaluated data from COVID-19 patients across a network of 6 hospitals in northeastern Pennsylvania. Analysis was limited to age, gender, and historical variables. After creating a variable importance plot, we chose a selection of the best predictors to train a logistic regression model. Variable selection was done using a lasso regularization technique. Using the coefficients in our logistic regression model, we then created a scoring tool and validated the score on test set data.
RESULTS: A total of 6485 COVID-19 patients were included in our analysis, of which 707 were hospitalized. The biggest predictors of patient hospitalization included age, a history of hypertension, diabetes, chronic heart disease, gender, tobacco use, and chronic kidney disease. The logistic regression model demonstrated an AUC of 0.81. The coefficients for our logistic regression model were used to develop a scoring tool. Low-, intermediate-, and high-risk patients were deemed to have a 3.5%, 26%, and 38% chance of hospitalization, respectively. The best predictors of hospitalization included age (odds ratio [OR] = 1.03, confidence interval [CI] = 1.02-1.03), diabetes (OR = 2.08, CI = 1.69-2.57), hypertension (OR = 2.36, CI = 1.90-2.94), chronic heart disease (OR = 1.53, CI = 1.22-1.91), and male gender (OR = 1.32, CI = 1.11-1.58).
CONCLUSIONS: Using retrospective observational data from a 6-hospital network, we determined risk factors for admission and developed a predictive model and scoring tool for use in the clinical and population setting that could anticipate admission for COVID-19 patients.
© 2021 The Authors. JACEP Open published by Wiley Periodicals LLC on behalf of American College of Emergency Physicians.

Keywords:  COVID; machine learning; predictive model; risk of admission

Year:  2021        PMID: 33817689      PMCID: PMC8011617          DOI: 10.1002/emp2.12406

Source DB:  PubMed          Journal:  J Am Coll Emerg Physicians Open        ISSN: 2688-1152


INTRODUCTION

Background

COVID‐19, declared a global pandemic by the World Health Organization (WHO) in March 2020, has caused an unprecedented global health emergency with more than 23.7 million confirmed cases globally as of August 25, 2020. In the United States alone, there have been 5.7 million confirmed cases and over 177,000 deaths, numbers that increase by the day. The virus displays person‐to‐person transmission, and to cope with the tremendous burden this places on the healthcare system, governments worldwide have instituted quarantine measures to slow the spread. Although the typical intensive care unit (ICU) occupancy is 60%–80%, the strains of a pandemic, such as that of COVID‐19, can overwhelm hospital capacity. Furthermore, COVID‐19 demonstrates widespread symptomology with varying degrees of illness and inconsistent radiologic findings. Although the majority of patients present with fever, cough, shortness of breath, and respiratory distress, gastrointestinal symptoms, such as nausea and vomiting, have also been reported, and some patients are asymptomatic. Although COVID‐19 has a diverse presentation, clinical variables such as age, race, or comorbidities can help inform disease course. Data show higher prevalence of COVID‐19 in patients with features of the metabolic syndrome, hypertension, cardiovascular disease, and positive smoking history, and these factors have also been correlated with a greater rate of ICU admission and mortality.

Importance

Predictive analytics that use algorithms to identify patterns in large amounts of data have been gaining popularity in healthcare research, giving researchers the ability to harness data to inform clinical decision making. COVID‐19 has proven no exception, as the global pandemic has spurred the research community to recommend using predictive modelling techniques with the data gathered to expedite clinical decision making. This call has been answered, as clinicians are using emerging patterns in disease presentation and prognosis in combination with predictive modelling to improve hospital and patient management. Modelling tools have been developed that predict ICU admission, detect COVID‐19 patients using routine blood tests, combine clinical history and computed tomography (CT) images for accurate diagnosis, and act as a COVID‐19 mortality predictor using clinical features. Although data have been used to inform decision making and provide a prediction of prognosis and hospitalization, these models rely on clinical data that have to be obtained on a hospital visit (ie, lab values, vital signs, and chest X‐ray findings). An accurate model has not yet been introduced that predicts whether a patient who tested positive for COVID‐19 will be admitted to the hospital using only age and historical variables. Such a model can be used at both the clinical and population level.

Goals of investigation

We created a logistic regression model using retrospective observational study data to predict which patients who test positive for COVID‐19 will likely be admitted to the hospital. We also recognize that, on a practical level, incorporating a predictive model into an electronic health record for decision support can be challenging; thus, we also used variables from our logistic regression model to develop a practical scoring tool.

METHODS

Study design/setting

This study is an observational retrospective study that includes description and analysis of COVID‐19 patients across our 6‐hospital network in northeastern Pennsylvania who had data collected in a COVID‐19 registry database. The registry was developed to be used as an analytical tool as a part of the organization's quality improvement COVID initiatives. It was built in the Epic electronic health record system (Epic Systems, Verona, WI) after being developed and maintained by the network's enterprise analytics team in the information support (IS) department. The data were extracted from the database with de‐identified patient data, and variables within the database were used in our analysis.

Selection of participants

A COVID‐19 patient was defined as a patient who had a positive SARS‐COV2 PCR test. Patients younger than 18 years old or older than 90 years old were excluded from the analysis based on our network institutional review board requirements to maintain a quality improvement initiative and preserve patient confidentiality. We limited our analysis to age, gender, and historical variables because our goal was to develop an ambulatory predictive tool to predict which patients will likely be hospitalized.

Measurements

The demographic and historical variables were defined using Epic electronic health record groupers that generally are used in our electronic health record to capture patient clinical history. The initial independent variables that were extracted include age, hypertension, diabetes, chronic heart disease, gender, smoking history, chronic kidney disease, whether the patient was taking an ACE inhibitor, a history of cancer, chronic obstructive pulmonary disease, asthma, chronic liver disease, chronic renal failure, corticosteroid use, whether the patient was taking an immunosuppressive, and history of chronic bronchitis and HIV status. The groupers aggregate together ICD‐10 codes on problem lists that fall into a particular category.

Analysis

We first split the data into a training set that consisted of 80% of the data. Exploratory data analyses and model development were conducted on the training set, and validation was done on a test set (20% of the data) not seen by the model. Holding out a separate 20% test set to analyze model performance is standard practice in model development to detect overfitting. We created a variable importance plot to examine which variables were most associated with admission. We also used a least absolute shrinkage and selection operator (lasso) regularization technique to select the variables to include in our final logistic regression model. Lasso performs variable selection to enhance the prediction accuracy and interpretability of the statistical model by penalizing model complexity, such as the number of variables used.
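For reference, the lasso penalty described above can be written for a logistic model as shown below. This is the standard textbook form and not necessarily the exact parameterization used by the authors' software; the symbols (n patients, p candidate variables, predictors x_i, admission indicator y_i, coefficients β, penalty weight λ) are notation introduced here for illustration.

```latex
\[
(\hat{\beta}_0, \hat{\beta})
  = \arg\min_{\beta_0,\,\beta}\;
    -\frac{1}{n}\sum_{i=1}^{n}
      \Bigl[\, y_i\,\eta_i - \log\bigl(1 + e^{\eta_i}\bigr) \Bigr]
    \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert,
\qquad
\eta_i = \beta_0 + x_i^{\top}\beta .
\]
```

Larger values of λ shrink more coefficients exactly to zero, which is how the penalty acts as a variable selection step.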

Outcomes

Hospitalization for COVID‐19 was defined as a positive SARS‐COV2 PCR test and a hospital admission, dated by the time of admission order placement, within 7 days of each other. We used the best predictors to train a logistic regression model to predict which patients will likely be hospitalized. We reported odds ratios (ORs) and confidence intervals (CIs) for those variables in our model. The performance of the model was validated on the 20% test set data that the model did not see, as mentioned above. We used the coefficient values from our logistic regression model with binned age to develop a scoring tool that allows a manual calculation of the risk of hospitalization. We then validated the score on the test set data. The R statistical software was used to conduct all statistical analysis, and the stats and randomForest packages were used for model development. This study proposal was reviewed by our institutional review board and deemed to be non‐human research.

The Bottom Line

This retrospective study used data from 6,485 patients with COVID‐19 at 6 hospitals in Pennsylvania to develop a logistic regression model and scoring tool to predict hospitalization using easily‐obtained input variables. This model has potential to support resource planning for patients with COVID‐19.

RESULTS

A total of 6485 patients were included in our analysis. Of these, 707 patients were defined as being hospitalized for COVID‐19. There was a clear difference in age between those who were hospitalized and those who were not, with means of 64 and 48 years of age, respectively. Table 1 shows the variable differences between hospitalized versus non‐hospitalized patients. The best predictors of hospitalization were age, a history of hypertension, diabetes, chronic heart disease, gender, tobacco use, and chronic kidney disease (see Figure 1, variable importance plot). The receiver operating characteristic (ROC) curve for the model tested on the 20% validation data is presented in Figure 2. The area under the curve (AUC) for the logistic regression model was 0.81. The AUC is a measure of how much better the model performs compared with a random guess. A perfect model would give an AUC of 1, and an uninformed model would give an AUC of 0.5. This indicated that the model performed well overall in predicting which patients will need hospitalization for COVID‐19. The model had a sensitivity of 80%, a specificity of 71%, a positive predictive value of 25%, and a negative predictive value of 97%. The ORs for our logistic regression model are shown in Table 2.
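The reported predictive values can be cross-checked against the reported sensitivity and specificity using Bayes' rule. The sketch below assumes that the prevalence of hospitalization in the test set equals that of the full cohort (707 of 6485), which the text does not state explicitly; the function name `predictive_values` is illustrative only.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumption: test-set prevalence matches the full cohort (707 / 6485).
prevalence = 707 / 6485
ppv, npv = predictive_values(0.80, 0.71, prevalence)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under that assumption the computed values round to the reported 25% and 97%. The modest positive predictive value largely reflects the roughly 11% admission rate rather than a weakness specific to the model.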
TABLE 1

Historical variable differences between hospitalized versus non‐hospitalized patients for COVID‐19 (%)

Comparison of history for non‐hospitalized versus hospitalized patients (%)

Variable                                 Hospitalized: No   Hospitalized: Yes
Male                                     44                 50
Has HIV                                  0.49               0.75
Tobacco smoker                           22                 37
On ACE inhibitor                         8.7                16
Inhaled steroids use                     1.9                3.6
Diabetes mellitus                        11                 39
Hypertension                             22                 64
Chronic kidney disease                   4.8                17
Has cancer                               3.4                8.9
Asthma                                   6.4                8
Chronic bronchitis                       0.49               1.5
Chronic heart disease                    11                 41
Chronic lung disease                     6.6                18
Chronic liver disease                    3                  6.6
On immunosuppression                     0.72               2.4
Chronic obstructive pulmonary disease    2.4                11

FIGURE 1

Variable importance plot for predicting hospitalization for COVID‐19

FIGURE 2

ROC curve for predicting hospitalization for COVID‐19. ROC, receiver operating characteristic
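To make the AUC interpretation concrete, the following minimal sketch computes it as the probability that a randomly chosen hospitalized patient receives a higher predicted risk than a randomly chosen non-hospitalized patient, with ties counted as one half. The function name `auc` and the toy scores are assumptions made for illustration; they are not study data and this is not the authors' R code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical predicted risks, 1 = hospitalized).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auc(toy_scores, toy_labels))
```

A model that assigns every patient the same risk scores 0.5 under this definition, and one that ranks every hospitalized patient above every non-hospitalized patient scores 1.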

TABLE 2

OR of logistic regression model coefficients

Variable                  OR      CI
Age                       1.026   1.021–1.032
Diabetes mellitus         2.083   1.685–2.572
Hypertension              2.357   1.892–2.937
Chronic heart disease     1.530   1.222–1.912
Male gender               1.324   1.107–1.583
Tobacco use               1.060   0.871–1.288
Chronic kidney disease    0.875   0.664–1.148

CI, confidence interval; OR, odds ratio.
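Because a logistic regression coefficient is the natural logarithm of its odds ratio, the ORs in Table 2 can be converted back to the log-odds scale on which a points-based score is typically built. The sketch below does only that conversion; it assumes the age OR is expressed per year of age (the text does not state the unit), and the names `odds_ratios`, `coefficients`, and `odds_multiplier_for_age_gap` are illustrative. It does not attempt to reproduce how the authors assigned points.

```python
import math

# Odds ratios as reported in Table 2.
odds_ratios = {
    "Age": 1.026,
    "Diabetes mellitus": 2.083,
    "Hypertension": 2.357,
    "Chronic heart disease": 1.530,
    "Male gender": 1.324,
    "Tobacco use": 1.060,
    "Chronic kidney disease": 0.875,
}

# Logistic regression coefficient = ln(odds ratio).
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}


def odds_multiplier_for_age_gap(years, or_per_year=odds_ratios["Age"]):
    """Multiplicative change in odds for an age difference of `years`.

    Assumption: the age odds ratio is per one year of age.
    """
    return or_per_year ** years


for name, beta in coefficients.items():
    print(f"{name}: beta = {beta:.3f}")
print(f"Odds multiplier for a 10-year age gap: {odds_multiplier_for_age_gap(10):.2f}")
```

On this scale a small per-year OR for age still compounds over decades, which is consistent with age bins carrying several points in a score of this kind.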

The distribution of those admitted versus not admitted was evaluated and ranged from a score ≤3 (37 hospitalized vs 1019 not hospitalized) to a score ≥8 (78 hospitalized vs 123 not hospitalized). Based on the way the model weighted each variable, we developed the score shown in Table 3. Patients with a score ≤3 were considered low risk, with a 3.5% chance of hospitalization. Patients were considered intermediate risk if they had a score of 4–7, with a 26% chance of hospitalization. Patients with scores ≥8 were considered high risk, with a 38% chance of hospitalization.
TABLE 3

COVID‐19 risk of hospitalization score

Risk factor                  Points
Diabetes mellitus            +2
Hypertension                 +3
Male                         +1
Has chronic heart disease    +2
Age 55–65 y                  +2
Age 66–75 y                  +3
Age >75 y                    +4

Total score    Risk category
≥8             High risk, 38% chance of hospitalization
4–7            Intermediate risk, 26% chance of hospitalization
≤3             Low risk, 3.5% chance of hospitalization
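The manual score in Table 3 is simple enough to express as a short function. The following is a minimal sketch, assuming age is given in whole years and that patients younger than 55 receive no age points (Table 3 lists no points for them); the function names `hospitalization_score` and `risk_category` are hypothetical and not part of the published tool.

```python
def hospitalization_score(age, male, diabetes, hypertension, chronic_heart_disease):
    """Sum the Table 3 points for one patient (age in whole years)."""
    points = 0
    if diabetes:
        points += 2
    if hypertension:
        points += 3
    if male:
        points += 1
    if chronic_heart_disease:
        points += 2
    # Age bins from Table 3; ages under 55 are assumed to add 0 points.
    if age > 75:
        points += 4
    elif age >= 66:
        points += 3
    elif age >= 55:
        points += 2
    return points


def risk_category(score):
    """Map a total score to the Table 3 risk category."""
    if score >= 8:
        return "high"
    if score >= 4:
        return "intermediate"
    return "low"


# Example: a 70-year-old man with hypertension and diabetes.
example = hospitalization_score(age=70, male=True, diabetes=True,
                                hypertension=True, chronic_heart_disease=False)
print(example, risk_category(example))
```

The maximum possible total is 12 points (all four risk factors plus age over 75). The categories map to the observed hospitalization rates reported in Table 3, not to individually calibrated probabilities.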

LIMITATIONS

This study was performed at a single hospital network with a convenience sample limited by retrospective data. We did not study those under 18 or over 90 years old, and we did not study the severity of illness at admission, either of which may be an area of interest for future research. Our calculations were based on a 7‐day interval between when the test is positive and admission date; other institutions might prefer 10 or 14 days—for this and other reasons, our scoring needs validation at other sites to be generalizable. Additionally, this scoring predictor has not been compared to hospitalizations in patients without COVID‐19. Next steps should involve validating our model in other institutions across a range of settings and gathering information on hospitalization rates. Furthermore, this study is unique in that it synthesizes these data in the COVID‐19 population to provide a predictor for hospitalization using age and clinical history only.

DISCUSSION

In this study, we used observational data to develop a predictive model to calculate the probability of hospital admission of a COVID‐19 patient. Predictive model algorithms were applied to 6485 patients with a 10.9% hospital admission rate, which is comparable to other published data. We then used differences in historical variable data to inform our variable importance plot and used the top features to train a logistic regression model. Our model was internally validated on a test set of data. We then created an easy‐to‐use risk stratification score based on our statistical analysis.

There are other models that predict the risk of hospitalization and prognosis. However, one of the more recently published relies heavily on data that have to be collected on a hospital or office visit, including vital signs, lab data, and clinical imaging. Our model relies only on easy‐to‐collect clinical history and age. This suggests that the model can be used at both the clinical and population level. Incorporating a predictive model into the electronic health record for decision support is challenging and not immediately feasible. Therefore, we also developed a manual scoring tool using coefficients from our logistic regression model.

Our study demonstrated that the greatest predictors of hospitalization included age, hypertension, chronic heart disease, diabetes, gender, tobacco use, and chronic kidney disease. This is consistent with other studies demonstrating that these variables have a high impact on ICU transfer, poor prognosis, and mortality. Given the heterogeneity in disease presentation, having a model that uses historical clinical characteristics such as these allows for use in a variety of clinical settings. Some examples include selecting individuals to undergo more aggressive preventative measures and even vaccination prioritization. Unsurprisingly, the model found age to be the strongest predictor, which is consistent with previous data. In addition to age, the other variables included in the model are consistent with the literature. Several published studies have found that patients with chronic heart disease, hypertension, and diabetes have worse prognosis and higher mortality and thus have a higher likelihood of hospitalization. Given the high prevalence of these chronic diseases and the potential role they play in disease progression, it is important that our model and future models incorporate these diagnoses.

We are currently using this model to inform referral to our network's remote home monitoring program to allow early remote intervention. Our model can be calculated without an office visit to collect clinical data, which other models would require. Therefore, referrals can be made once patients’ SARS‐COV2 test results are back. This could allow us to potentially reduce ER visits and hospitalizations, and we hope to publish the results of our program.

In summary, we have described the predictors of hospital admission from our observational data and created a tool that predicts hospitalization rates in COVID‐19 patients using common clinical variables and comorbidities without collecting vital signs or laboratory values. This information can be used to guide clinical decision making and increase efficiency and prioritization of patient care in an era where hospital resources are being pushed to their limits.

DECLARATIONS OF INTEREST

None.

AUTHOR CONTRIBUTIONS

ZC, MM, RM developed the study concept and design and participated in acquisition of the data. ZC performed the analysis and all authors (ZC, NR, MM, RM, DB) participated in the described interpretation of the data; NR and ZC drafted their portion of the manuscript, and all participated in the critical revision of the manuscript for important intellectual content. All authors take final responsibility for the manuscript as a whole.