Rapid prediction of in-hospital mortality among adults with COVID-19 disease.

Kyoung Min Kim1,2, Daniel S Evans1, Jessica Jacobson3, Xiaqing Jiang4, Warren Browner1, Steven R Cummings1,5.   

Abstract

BACKGROUND: We developed a simple tool to estimate the probability of dying from acute COVID-19 illness using only assessments that are readily available at initial admission.
METHODS: This retrospective study included 13,190 racially and ethnically diverse adults admitted to one of the hospitals in the New York City Health + Hospitals (NYC H+H) system for COVID-19 illness between March 1 and June 30, 2020. Demographic characteristics, simple vital signs, and routine clinical laboratory tests were collected from the electronic medical records. A clinical prediction model to estimate the risk of dying during the hospitalization was developed.
RESULTS: Mean age (interquartile range) was 58 (45-72) years; 5421 (41%) were women, 5258 were Latinx (40%), 3805 Black (29%), 1168 White (9%), and 2959 Other (22%). During hospitalization, 2,875 (22%) died. Using separate training and test samples, machine learning (Gradient Boosted Decision Trees) identified eight variables (oxygen saturation, respiratory rate, systolic and diastolic blood pressures, pulse rate, blood urea nitrogen level, age, and creatinine) that predicted mortality with an area under the ROC curve (AUC) of 94%. A score based on these variables classified 5,677 (46%) as low risk (a score of 0), who had a 0.8% (95% confidence interval, 0.5-1.0%) risk of dying, and 674 (5.4%) as high risk (score ≥ 12 points), who had a 97.6% (96.5-98.8%) risk of dying; the remainder had intermediate risks. A risk calculator is available online at https://danielevanslab.shinyapps.io/Covid_mortality/.
CONCLUSIONS: In a diverse population of hospitalized patients with COVID-19 illness, a clinical prediction model using a few readily available vital signs that reflect the severity of disease may precisely predict in-hospital mortality and can rapidly assist decisions to prioritize admissions and intensive care.

Year:  2022        PMID: 35905072      PMCID: PMC9337639          DOI: 10.1371/journal.pone.0269813

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


Introduction

Hospitals in many countries have again been overwhelmed by admissions of patients with COVID-19 illness during the second wave of infections with the delta variant or other variants. Accurate prediction of the probability of death using vital signs that are rapidly available on arrival or immediately after admission to hospital, without further laboratory testing or chest x-ray, might help prioritize patients for hospitalization, intensive care, and intubation, or for limited treatments in places that have very limited resources. Several models for predicting death from COVID-19 disease have been proposed [1], but they have several limitations when applied immediately at the initial admission of patients. Previous prediction algorithms have been derived from small numbers of deaths [2-10]; used comorbid conditions, diagnoses, and severity indices from electronic medical records assessed after the patient is admitted [5, 11–18]; or included tests (such as levels of C-reactive protein, troponin, or D-dimers) that may not be readily available for urgent triage of patients for hospital admission or intensive care [18, 19]. Some studies were done in ethnically homogeneous populations such as Wuhan [20], China [2], or Italy [21], or in specific populations such as nursing home residents [22] or a community-based registry [23]. Some studies have been done in patients already admitted to the hospital with clinical or laboratory results obtained after admission, in patients already in an intensive care unit (ICU) [7, 11, 12], or after admission to the hospital [19], and are therefore not applicable to features of the infection when patients first present to emergency care. Some imposed arbitrary durations of follow-up, such as 7 or 30 days. Some studies applied machine learning methods to develop predictive models [2, 6, 24, 25]. No previous model has been based on measurements immediately available at the time of triage in a large, racially diverse population.
No model has been translated into a calculator that can be used on mobile devices in clinical settings. The model and online calculator are likely to apply to all variants of COVID-19 infection because, although the delta variant has greater viral loads and risk of transmission, there is no evidence that the physiologic manifestations, such as hypoxia, or the clinical manifestations that predict mortality differ between the variants [26] (https://www.cdc.gov/coronavirus/2019-ncov/variants/variant.htm). We developed a predictive algorithm based on readily available data from the initial evaluation before admission to a hospital, in a diverse patient population, to predict mortality at any time after admission. We studied the large and diverse population of patients admitted to the New York City Health + Hospitals (NYC H+H) public hospital system. We used machine learning to select strong predictors of mortality, developed and validated a multivariable model and score to estimate the risk of dying, and translated the model into an online calculator to estimate the risk of in-hospital mortality.

Methods

Setting and data sources

We used data extracted from the electronic medical records of all patients at least 18 years old who were admitted to any of the 11 hospitals of the New York City Health + Hospitals (NYC H+H) system with a diagnosis of COVID-19 infection verified with a positive polymerase chain reaction (PCR) test between March 1 and June 30, 2020. NYC H+H is the largest public health system in the United States, providing health services to more than one million New Yorkers across the city’s five boroughs. These hospitals account for approximately one-fifth of all general hospital discharges and more than one-third of emergency department and hospital-based clinic visits in New York City.

Variables

We abstracted demographic characteristics (sex, age, race and ethnicity), weight, body mass index, vital signs, oxygen saturation (SpO2) from peripheral monitors, routine clinical laboratory tests (serum chemistry panel, complete blood counts), and D-dimer levels from electronic medical records. When there was more than one value, we selected the first. Missing values were not imputed. Non-transformed values were used. Sex and race/ethnicity were coded as categorical variables; all others were recorded as continuous variables. (Results did not change when continuous features were centered to their mean and scaled to a standard deviation of one.) The outcome was death from any cause during hospitalization with COVID-19 infection; length of hospitalization was also noted.

Statistical analysis

The assumption of normality for the collected variables was tested with the Kolmogorov-Smirnov test, which is known to be sensitive for comparing two samples. Baseline characteristics are presented as median (IQR) for non-normally distributed continuous variables or N (%) for categorical variables. Because most predictors were not normally distributed, baseline characteristics were compared by in-hospital mortality using the Mann-Whitney U test for continuous variables and the chi-squared test for categorical variables. To develop a clinical prediction model to estimate the risk of dying in the hospital, we adopted a multistep approach: variable selection using Extreme Gradient Boosted Decision Trees (XGBoost), followed by identification of cut points for the selected variables using classification and regression trees (CART), followed by development of a score that was used to predict in-hospital mortality among COVID-19 positive patients in the study population. Train and test data partitions were created using an 80%/20% random split stratified by death status to ensure an even proportion of mortality in the train and test partitions. Gradient Boosted Decision Trees, implemented in the XGBoost R package v1.2.0.1 with R v4.0.2, were used to generate an ensemble of multiple decision trees to minimize errors in the classification of mortality in patients. The XGBoost model was developed in the train partition, using four boosting rounds, a maximum depth of three for each decision tree, a learning rate of 0.3, a binary:logistic learning objective with error rate used as the evaluation metric, and a minimum child weight of 75. Variable importance was evaluated using the information gain metric of a split on a variable. XGBoost model performance was evaluated in the test partition using accuracy and area under the curve (AUC) from a receiver operating characteristic (ROC) curve.
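The stratified 80%/20% partition described above can be illustrated with a minimal sketch. It is written in Python with only the standard library rather than the R tooling used in the study; the function name `stratified_split`, the seed, and the synthetic outcome vector are assumptions made for illustration, and only the cohort counts (10,315 survivors and 2,875 deaths) come from the text.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), splitting each outcome class separately.

    Hypothetical helper: shuffling within each class keeps the proportion of
    deaths nearly identical in the two partitions.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic outcome vector with the cohort's counts: 0 = survived, 1 = died.
labels = [0] * 10315 + [1] * 2875
train_idx, test_idx = stratified_split(labels)

train_mortality = sum(labels[i] for i in train_idx) / len(train_idx)
test_mortality = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx))
print(round(train_mortality, 3), round(test_mortality, 3))
```

Because the split is done within each outcome class, the mortality proportion in the two partitions can differ only by rounding, which is the property the stratification is meant to guarantee.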
Selected features and model performance did not change with 10-fold cross-validation. To develop a clinical prediction score, we used Classification and Regression Tree (CART) analyses in the original training set to identify optimum cut-points for each variable selected by XGBoost (S1 Fig). There was no clear cut-point for creatinine level and it had low importance in the XGBoost model; therefore, it was not included in the final calculation of the clinical risk score. We entered the selected variables and cut-points into a logistic regression model to estimate the multivariable odds ratios. To assign risk scores, the odds ratio for each of these categorical variables was divided by 2.6 (the lowest odds ratio) and rounded, and the resulting points were summed for each patient to calculate a risk score. The risk score calculation did not change after including categories for missing values for all selected variables. The predicted probability of mortality from the risk score was also compared with the observed mortality in the test set. After excluding 703 patients with missing values for one or more variables, the proportions of patients who died were calculated for each 1-point interval in risk score; the highest-risk categories, which had similar scores and small numbers of patients, were combined. Because the predicted mortality by risk score category was very similar in the training and test sets (AUC = 0.94 for both), these sets were combined to estimate the probabilities and 95% confidence intervals for the entire population. An online calculator reports the probability of in-hospital mortality from the risk score (danielevanslab.shinyapps.io/Covid_mortality). To report the probability of dying, all variables must have non-missing values except for the blood urea nitrogen (BUN) test, for which the model includes a term for missing results. All statistical analyses were performed using R Statistical Software (version 4.0.3 and version 4.0.2; R Foundation for Statistical Computing, Vienna, Austria).
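The point assignment described above (divide each odds ratio by the lowest odds ratio, 2.6, then round) can be sketched as follows. The odds ratios are those reported in Table 2; the dictionary keys, the function name `points_from_odds_ratio`, and the half-up rounding rule are assumptions, since the text says only that the values were rounded.

```python
import math

# Multivariable odds ratios as reported in Table 2 (reference categories omitted).
# The key names are illustrative labels, not identifiers from the study.
ODDS_RATIOS = {
    "age_70_to_84": 2.6,
    "age_85_or_older": 5.4,
    "spo2_below_91": 10.7,
    "rr_22_or_higher": 7.8,
    "rr_below_14": 9.2,
    "pulse_below_51": 3.5,
    "pulse_109_to_118": 9.7,
    "pulse_119_or_higher": 12.5,
    "sbp_below_95": 7.9,
    "dbp_below_54": 4.7,
    "bun_20_to_43": 2.6,
    "bun_44_or_higher": 5.9,
}

LOWEST_ODDS_RATIO = 2.6


def points_from_odds_ratio(odds_ratio, divisor=LOWEST_ODDS_RATIO):
    """Scale an odds ratio by the lowest odds ratio and round half up (assumed rule)."""
    return math.floor(odds_ratio / divisor + 0.5)


POINTS = {name: points_from_odds_ratio(value) for name, value in ODDS_RATIOS.items()}

for name, value in POINTS.items():
    print(f"{name}: {value}")
```

Under this assumed rounding rule, the computed points match the Points column of Table 2 for every category.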

Patient and public involvement

Patients and members of the public were not included in the analysis owing to restrictions on the use of the data included in the study and a lack of training in the use of these data.

Results

Between March 1 and June 30, 2020, 13,190 patients with confirmed COVID-19 infection were admitted to a NYC H+H hospital. Among them, 2,227 (16.9%) were cared for in an ICU during hospitalization. The cohort included 5421 [41.1%] women; mean age was 58 years [interquartile range 45–72 years]; 5258 were Hispanic [39.9%], 3805 Black [28.8%], 1168 White [8.9%], 716 Asian [5.4%], and 2243 individuals of other races/ethnicities [17.0%] (Table 1). During hospitalization, 2,875 (21.8%) died a mean of 10.6 days after admission (interquartile range: 3 to 13 days), and 2279 (17.3%) were treated with mechanical ventilation.
Table 1

Baseline characteristics of the patient population and of those who survived and died.

| Characteristic | Total (N = 13,190) | Survived (N = 10,315) | Died (N = 2,875) | p-value |
| --- | --- | --- | --- | --- |
| Demographics and diagnoses | | | | |
| Age (years) | 59.0 (45.0, 72.0) | 55.0 (42.0, 67.0) | 72.0 (61.0, 81.0) | <0.001 |
| BMI (kg/m^2) | 28.1 (24.4, 32.6) | 28.1 (24.4, 32.5) | 28.4 (24.5, 33.0) | 0.032 |
| Female (%) | 5421 (41.1) | 4320 (41.9) | 1101 (38.3) | 0.001 |
| Hypertension (%) | 6552 (49.7) | 4804 (46.6) | 1748 (60.8) | <0.001 |
| Diabetes (%) | 4635 (35.1) | 3339 (32.4) | 1296 (45.1) | <0.001 |
| Race/Ethnicity (%) | | | | <0.001 |
| Race/Ethnicity: American Indian or Alaskan | 22 (0.2) | 17 (0.2) | 5 (0.2) | |
| Race/Ethnicity: Asian | 716 (5.4) | 547 (5.3) | 169 (5.9) | |
| Race/Ethnicity: Black | 3805 (28.8) | 2962 (28.7) | 843 (29.3) | |
| Race/Ethnicity: White | 1168 (8.9) | 808 (7.8) | 360 (12.5) | |
| Race/Ethnicity: Pacific Islander | 8 (0.1) | 6 (0.1) | 2 (0.1) | |
| Race/Ethnicity: Hispanic | 5258 (39.9) | 4289 (41.6) | 969 (33.7) | |
| Race/Ethnicity: Other | 1601 (12.1) | 1246 (12.1) | 355 (12.3) | |
| Race/Ethnicity: Declined/Unknown | 612 (4.6) | 440 (4.3) | 172 (6.0) | |
| Vital signs | | | | |
| O2 Saturation (%) | 97.0 (95.0, 98.0) | 97.0 (95.0, 99.0) | 91.5 (80.0, 96.0) | <0.001 |
| Body temperature (℉) | 98.5 (97.9, 99.0) | 98.4 (98.0, 98.9) | 91.5 (80.0, 96.0) | 0.015 |
| Pulse rate (/min) | 85.0 (74.0, 97.0) | 85.0 (75.0, 94.0) | 89.0 (60.0, 109.0) | <0.001 |
| Respiratory rate (/min) | 18.0 (18.0, 20.0) | 18.0 (18.0, 20.0) | 20.0 (18.0, 26.3) | <0.001 |
| Systolic BP (mmHg) | 121.0 (108.0, 134.0) | 124.0 (112.0, 136.0) | 102.0 (76.0, 126.0) | <0.001 |
| Diastolic BP (mmHg) | 72.0 (63.0, 80.0) | 74.0 (67.0, 81.0) | 56.0 (41.0, 70.0) | <0.001 |
| Laboratory parameters | | | | |
| Calcium (mg/dL) | 8.4 (4.9, 9.0) | 8.5 (5.2, 9.1) | 8.1 (4.7, 8.7) | <0.001 |
| Glucose (mg/dL) | 125.0 (104.0, 178.0) | 119.0 (101.0, 159.8) | 153.0 (118.0, 232.0) | <0.001 |
| BUN (mg/dL) | 16.0 (11.0, 29.0) | 14.0 (10.0, 22.5) | 29.0 (17.0, 53.0) | <0.001 |
| Creatinine (mg/dL) | 1.0 (0.8, 1.5) | 1.0 (0.8, 1.3) | 1.4 (1.0, 2.5) | <0.001 |
| Albumin (g/dL) | 3.7 (3.1, 4.1) | 3.8 (3.2, 4.2) | 3.4 (2.7, 3.8) | <0.001 |
| Magnesium (mg/dL) | 2.1 (1.9, 2.4) | 2.1 (1.8, 2.3) | 2.2 (1.9, 2.5) | <0.001 |
| Sodium (mmol/L) | 137.0 (134.0, 140.0) | 137.0 (134.0, 140.0) | 137.0 (133.0, 142.0) | <0.001 |
| Potassium (mmol/L) | 4.2 (3.8, 4.6) | 4.2 (3.8, 4.6) | 4.3 (3.9, 4.9) | <0.001 |
| Chloride (mmol/L) | 100.0 (96.0, 104.0) | 100.0 (96.0, 103.0) | 100.0 (95.0, 106.0) | <0.001 |
| CO2 (mmol/L) | 23.0 (20.0, 25.0) | 23.0 (21.0, 25.7) | 21.0 (18.0, 24.0) | <0.001 |
| Anion Gap | 15.6 (13.0, 18.0) | 15.0 (13.0, 17.0) | 18.0 (15.0, 21.0) | <0.001 |
| White blood cells (x 10^9/L) | 7.6 (5.6, 10.5) | 7.2 (5.4, 9.9) | 8.9 (6.4, 12.3) | <0.001 |
| Red cell distribution width (%) | 13.6 (12.8, 14.8) | 13.4 (12.7, 14.6) | 14.2 (13.2, 15.7) | <0.001 |
| Red blood cell count (x 10^12/L) | 4.6 (4.1, 5.0) | 4.6 (4.1, 5.0) | 4.4 (3.9, 5.0) | <0.001 |
| Hemoglobin (g/dL) | 13.0 (11.5, 14.3) | 13.1 (11.7, 14.4) | 12.7 (10.8, 14.2) | <0.001 |
| Hematocrit (%) | 39.9 (35.8, 43.6) | 40.10 (36.0, 43.0) | 39.30 (34.0, 43.0) | <0.001 |
| Mean corpuscular volume (fL) | 88.2 (84.3, 92.0) | 87.90 (84.2, 91.5) | 89.20 (84.9, 93.4) | <0.001 |
| Mean corpuscular hemoglobin (pg) | 28.90 (27.2, 30.2) | 28.90 (27.3, 30.2) | 28.80 (27.1, 30.2) | 0.226 |
| MCHC (g/dL) | 32.50 (31.5, 33.5) | 32.70 (31.7, 33.60) | 32.10 (30.9, 33.2) | <0.001 |
| Platelets (x 10^9/L) | 216.0 (167.0, 278.0) | 218.00 (171.0, 279.0) | 207.0 (153.0, 274.0) | <0.001 |
| Mean platelet volume (fL) | 10.7 (10.0, 11.5) | 10.6 (9.9, 11.4) | 10.9 (10.2, 11.7) | <0.001 |
| Basophil (%) | 0.2 (0.1, 0.3) | 0.2 (0.1, 0.3) | 0.2 (0.1, 0.3) | <0.001 |
| Immature granulocyte (%) | 0.19 (0.04, 0.50) | 0.15 (0.03, 0.50) | 0.30 (0.06, 0.73) | <0.001 |
| Neutrophils (x 10^9/L) | 5.65 (3.82, 8.41) | 5.24 (3.62, 7.73) | 7.10 (4.87, 10.33) | <0.001 |
| Lymphocytes (x 10^9/L) | 1.05 (0.74, 1.48) | 1.12 (0.80, 1.55) | 0.87 (0.61, 1.23) | 0.003 |
| Monocytes (x 10^9/L) | 0.50 (0.35, 0.71) | 0.51 (0.36, 0.72) | 0.47 (0.31, 0.70) | 0.445 |
| Eosinophils (x 10^9/L) | 0.01 (0.00, 0.04) | 0.01 (0.00, 0.05) | 0.00 (0.00, 0.02) | <0.001 |
| Nucleated red blood cell (/µL) | 0.00 (0.00, 0.00) | 0.00 (0.00, 0.00) | 0.00 (0.00, 0.02) | <0.001 |
| International normalized ratio (INR) | 1.2 (1.1, 1.3) | 1.1 (1.1, 1.2) | 1.2 (1.1, 1.4) | <0.001 |
| D-Dimer (ng/mL) | 594 (324, 1,644) | 500 (287, 1,062) | 925 (463, 3,224) | <0.001 |

There were statistically significant differences between those who died and those who survived for almost all variables (Table 1). The XGBoost algorithm identified eight variables (Fig 1) that, together, generated predictions of mortality with an AUC of 94% and an accuracy of 91% (Fig 2). Of the variables that the XGBoost model selected, SpO2 was the strongest predictor; respiratory rate and blood pressure were also major contributors; body temperature was not. Although race and ethnicity were associated with mortality in univariable analyses, they were not selected in the predictive model.
Fig 1

Features, or variables, identified by the XGBoost model and ranked by importance, based on the gain in the accuracy of classification when the variable was used in decision trees that generated the model.

Fig 2

Receiver Operating Characteristics (ROC) Curves of mortality predicted by the machine learning XGBoost model and the clinical prediction model based on total point score in the test set of data.

A. ROC for In-hospital Mortality Predicted by the Machine Learning XGBoost Model. B. ROC Curve for the Clinical Prediction Model Point Score.

CART analysis identified cut-points for each of the XGBoost-selected variables. A multivariable logistic model showed that the selected cut-points were all significant predictors of mortality (Table 2). The risk score based on the odds ratios for these variables ranged from 0 to 22 points and had an AUC of 0.94 for predicting mortality, the same as the XGBoost algorithm (Fig 2). The calibration curve of the risk score on the test set also showed excellent predictive performance over the full range of mortality probabilities (slope = 1, Brier score 0.061, Fig 3).
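The two performance measures reported above, the area under the ROC curve and the Brier score, can be computed directly from predicted probabilities and observed outcomes. The sketch below is a plain-Python illustration, not the study's R code; the function names and the four-patient toy data are assumptions and are unrelated to the study's results.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen death is ranked above a
    randomly chosen survivor, counting ties as one half."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data for illustration only: 1 = died, 0 = survived.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(outcomes, predicted)
brier = brier_score(outcomes, predicted)
print(auc, round(brier, 4))
```

An AUC of 1.0 means every death was ranked above every survivor, while a Brier score closer to 0 means the predicted probabilities sit closer to the observed outcomes.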
Table 2

Predictors from the multivariable model and points indicating an increased risk of death.

| Predictor | Odds ratio (95% CI) | Points |
| --- | --- | --- |
| Age | | |
| Age < 70 years old | Reference | 0 |
| 70 ≤ Age < 85 years old | 2.6 (2.2–3.1) | 1 |
| Age ≥ 85 years old | 5.4 (4.2–7.0) | 2 |
| O2 Saturation (SpO2) | | |
| SpO2 ≥ 91% | Reference | 0 |
| SpO2 < 91% | 10.7 (8.3–19.9) | 4 |
| Respiratory Rate (RR) | | |
| 14 ≤ RR < 22/min | Reference | 0 |
| RR ≥ 22/min | 7.8 (4.2–14.4) | 3 |
| RR < 14/min | 9.2 (7.6–11.1) | 4 |
| Pulse Rate (PR) | | |
| 51 ≤ PR < 109/min | Reference | 0 |
| PR < 51/min | 3.5 (2.6–4.7) | 1 |
| 109 ≤ PR < 119/min | 9.7 (4.8–20.0) | 4 |
| PR ≥ 119/min | 12.5 (9.1–17.3) | 5 |
| Systolic BP (SBP) | | |
| SBP ≥ 95 mmHg | Reference | 0 |
| SBP < 95 mmHg | 7.9 (5.8–10.7) | 3 |
| Diastolic BP (DBP) | | |
| DBP ≥ 54 mmHg | Reference | 0 |
| DBP < 54 mmHg | 4.7 (3.6–6.1) | 2 |
| BUN | | |
| BUN < 20 mg/dL | Reference | 0 |
| 20 ≤ BUN < 44 mg/dL | 2.6 (2.1–3.2) | 1 |
| BUN ≥ 44 mg/dL | 5.9 (4.8–7.4) | 2 |

Range of risk score: 0–22

BP, blood pressure; BUN, blood urea nitrogen.
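A minimal sketch of how the cut-points and points in Table 2 combine into a total score is given below. The function name `covid_risk_score` and its argument names are hypothetical, and treating a missing BUN as contributing 0 points is an assumption based on the statement that a missing BUN value did not influence the score. This is not the code behind the online calculator and is not intended for clinical use.

```python
def covid_risk_score(age, spo2, resp_rate, pulse, sbp, dbp, bun=None):
    """Sum the Table 2 points for one patient (hypothetical helper).

    Units follow Table 2: age in years, SpO2 in %, rates per minute,
    blood pressures in mmHg, BUN in mg/dL (None if not measured).
    """
    score = 0

    # Age
    if age >= 85:
        score += 2
    elif age >= 70:
        score += 1

    # Oxygen saturation
    if spo2 < 91:
        score += 4

    # Respiratory rate: both slow and fast rates add points
    if resp_rate < 14:
        score += 4
    elif resp_rate >= 22:
        score += 3

    # Pulse rate: both slow and fast rates add points
    if pulse >= 119:
        score += 5
    elif pulse >= 109:
        score += 4
    elif pulse < 51:
        score += 1

    # Blood pressures
    if sbp < 95:
        score += 3
    if dbp < 54:
        score += 2

    # BUN is optional; a missing value adds no points (assumption)
    if bun is not None:
        if bun >= 44:
            score += 2
        elif bun >= 20:
            score += 1

    return score


low = covid_risk_score(age=50, spo2=97, resp_rate=18, pulse=85, sbp=124, dbp=74, bun=14)
high = covid_risk_score(age=90, spo2=85, resp_rate=10, pulse=130, sbp=80, dbp=40, bun=60)
print(low, high)
```

The two example patients sit at the extremes of the scale: the first falls in every reference category, and the second falls in the highest-point category of every variable, reaching the maximum of 22 points.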

Fig 3

Calibration curve comparing the probability of mortality predicted by the score and the probability of mortality observed in the patient population in the test set of data (slope = 1 and Brier score = 0.061).

Among the total study subjects, there were 5,677 (45.5%) patients with a score of 0 and 674 (5.4%) with a score ≥ 12 points (Table 3). In-hospital mortality increased continuously with higher risk scores, ranging from 0.8% (95% confidence interval, 0.5–1.0%) for those with a score of 0 to 97.6% (96.5–98.8%) for patients with a score ≥ 12 points (Table 3). The mean time between admission and death was 18 days (IQR 6–27 days) for those with a risk score of 0, compared with 9 days (IQR 3–11 days) for those with a risk score of 12 or greater. We translated the models into an online calculator to report the probability of mortality and the corresponding 95% confidence interval: danielevanslab.shinyapps.io/COVID_mortality/.
Table 3

Total point score and risk of in-hospital death.

| Total Score | Risk of Death % (95% C.I.) | N (%) |
| --- | --- | --- |
| 0 | 0.8 (0.5–1.0) | 5677 (45.5) |
| 1 | 4.5 (3.5–5.5) | 1636 (13.1) |
| 3 | 9.7 (8.0–11.4) | 1137 (9.1) |
| 4 | 20.7 (17.6–23.8) | 936 (7.5) |
| 5 | 39.4 (34.6–44.1) | 404 (3.2) |
| 6 | 57.8 (51.9–63.8) | 268 (2.1) |
| 7 | 64.3 (58.6–69.9) | 277 (2.2) |
| 8 | 74.4 (69.2–79.7) | 266 (2.1) |
| 9 | 87.6 (82.8–92.4) | 185 (1.5) |
| 10 | 92.3 (88.6–95.9) | 207 (1.7) |
| 11 | 92.4 (88.2–96.6) | 157 (1.3) |
| ≥12 | 97.6 (96.5–98.8) | 674 (5.4) |
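Table 3 amounts to a lookup from total score to observed mortality, which the following sketch makes explicit. The names `RISK_BY_SCORE`, `lookup_risk`, and `wald_interval` are hypothetical. The text does not state how the 95% confidence intervals were computed; the normal-approximation (Wald) interval used here is an assumption, included only because it lands close to the tabulated limits for the score-5 row.

```python
import math

# (risk of death in %, number of patients) by total score, copied from Table 3.
# Key 12 stands for the combined ">= 12" category. Scores that do not appear
# in the table as given have no entry here.
RISK_BY_SCORE = {
    0: (0.8, 5677),
    1: (4.5, 1636),
    3: (9.7, 1137),
    4: (20.7, 936),
    5: (39.4, 404),
    6: (57.8, 268),
    7: (64.3, 277),
    8: (74.4, 266),
    9: (87.6, 185),
    10: (92.3, 207),
    11: (92.4, 157),
    12: (97.6, 674),
}


def lookup_risk(score):
    """Return (risk %, N) for a total score, or None if the table has no row."""
    return RISK_BY_SCORE.get(min(score, 12))


def wald_interval(risk_percent, n, z=1.96):
    """Normal-approximation 95% interval for a proportion, in percent (assumed method)."""
    p = risk_percent / 100.0
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return 100.0 * (p - half_width), 100.0 * (p + half_width)


risk, n = lookup_risk(5)
lower, upper = wald_interval(risk, n)
print(risk, n, round(lower, 1), round(upper, 1))
```

Scores of 12 or more share one category because the table combines the highest-risk groups, so `lookup_risk(15)` returns the same row as `lookup_risk(12)`.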

Discussion

A few clinical observations readily available in the initial assessment of patients with COVID-19 infection can estimate the probability of dying during hospitalization across a full spectrum of outcomes. The model is available online for convenient use in acute care settings. Not surprisingly, physiologic variables, such as SpO2, respiratory rate, and low blood pressures, were important predictors, indicating that the pulmonary and systemic effects of the infection are its most important prognostic features. Both slow and rapid respiratory rates and both slow and fast pulse rates indicated an increased risk of in-hospital mortality. As expected, mortality also increased with age and with higher BUN levels [6, 27, 28]. Notably, after considering other variables, race and ethnicity were not significant predictors of mortality, as has been seen in other studies [9, 14, 29]. Previous studies have had important limitations, particularly studying patients who were already admitted and including assessments that are generally not available at the time the decision is made whether to admit a patient to the hospital [5, 11–18]. Most studies have developed models for predicting mortality from COVID-19 infections that are less accurate than the one presented here. For example, one study applied XGBoost to select variables from admissions to UK hospitals, from which the validated 4C Mortality model generated an AUC of 0.74; this was better than the 17 other models with which it was compared, but not as accurate as the model we developed [30]. It included laboratory tests (C-reactive protein and urea), number of comorbidities, and Glasgow Coma Scale score, which may require the medical record and a neurological exam [30]. Other studies have identified other laboratory values, such as red cell distribution width and D-dimer levels, as significant predictors, but they did not contribute to this algorithm [4, 31].
BUN was the only laboratory value in our algorithm; a missing value did not influence the score, and BUN is optional for estimating the risk using the website. This suggests that clinicians do not need to order or wait for laboratory test results to estimate a patient’s probability of dying. These data were collected before effective treatments, such as corticosteroids, were commonly used in the treatment of COVID-19 disease [30]. Improvements in the care of patients have reduced inpatient mortality from the infection [32]. Although our risk model and algorithm are not calibrated to the current mortality risk, they do reflect the probability of dying without current in-hospital treatments and thus may be useful to identify patients who are most, or least, likely to benefit from hospital care. However, infection in people who have been fully vaccinated may be less severe, and our model may therefore overestimate mortality in those uncommon cases. Studies suggest that the delta variant carries a greater risk of hospitalization [33]. However, there is no evidence that the physiologic or clinical manifestations that relate to the risk of mortality differ between the variants and wild type. Therefore, our model of the risk of in-hospital mortality is likely to apply to all variants. Ideally, prognostic models developed for the alpha variant would be recalibrated for the delta variant. The estimates from the model may have the most value when triage decisions need to be made about which patients to admit to a hospital or ICU bed, especially when the number of patients exceeds capacity. The model may be most useful for prioritizing patients at the extremes of prognosis. Notably, the 46% of patients in this cohort who had scores of 0 had a very low probability of dying and likely could have been cared for in outpatient settings, especially if periodic assessments of SpO2 and vital signs could be obtained.
At the other extreme, over 90% of those with scores of 10 or more died, indicating a need to decide whether to implement or withhold aggressive treatment. Our model may currently be useful in places outside of the US. COVID-19 disease, hospitalization, and death continue to burden health systems in many countries, while abating in the US. Although our model was derived from the first wave of the pandemic in New York City, the results and the model are likely to apply to other populations around the world. Our patient population is very racially diverse (Hispanic, Asian, Black, and Caucasian); many patients are recent immigrants, and the population is largely low income. We found that physiologic measurements of COVID-19 infection, such as low SpO2 and vital signs, were very strong predictors of mortality, while race and ethnicity did not influence outcome. New variants of the virus influence its transmission; they might cause more severe infection but are less likely to change the relationship between physiologic severity of the infection and risk of death. Any such influence might mean that the model underestimates the probability of death. This analysis has several strengths. The algorithm was derived from a very diverse population of patients in New York City using data from 11 hospitals. The study population and number of deaths were large enough to produce estimates of mortality with narrow confidence intervals and high AUC values; it is unlikely that adding variables to the model would substantially improve its already high accuracy. Multivariable regression analysis of the variables selected by machine learning confirmed that they were strong and independent predictors of mortality. An easy-to-use version of the model is also universally available online for use in acute care settings (danielevanslab.shinyapps.io/Covid_mortality/). The analysis also has limitations.
The model represents the natural history of COVID-19 disease before hospital care improved and mortality rates declined, so it could not be calibrated to predict mortality with current standards of care. A large proportion of the patients who were admitted had low risk scores, which reflects admission practices in NYC hospitals during the first wave of the pandemic. Although the study subjects included diverse races and ethnicities, we did not test the performance of this model in other study populations. Further studies testing the performance of our model in other countries would be warranted. By design, the data did not include measurements, such as markers of inflammation and coagulation, or indices of comorbidity and severity of illness, including the presence of patients’ symptoms, that predict mortality but that may not be readily available in the initial assessment of a patient. Thus, we did not calculate PSI, NEWS, or CURB-65 scores for comparison, because our model used only data immediately available without referring to medical records.

Conclusions

Mortality from COVID-19 illness can be rapidly and accurately predicted from a few vital signs that are readily available in acute care settings. When resources, such as hospital beds, are scarce, estimates of the probability of dying might aid decisions about prioritizing patients to receive intensive care or other scarce resources. The prediction model, based on racially and ethnically diverse patients, is available online for use in clinical settings around the world.

The sequence of boosted decision trees.

The first (top) figure (Tree 0) is the first boosted decision tree from the XGBoost model. The next (Tree 1) is the second boosted decision tree from the XGBoost model. The next tree (Tree 2) is the third boosted decision tree from the XGBoost model. The bottom tree (Tree 3) is the fourth boosted decision tree from the XGBoost model. (TIF)

7 Dec 2021
PONE-D-21-32900
Rapid Prediction of In-hospital Mortality among Adults with COVID-19 disease
PLOS ONE

Dear Dr. Cummings,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jan 21 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

- A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
- A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
- An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Chiara Lazzeri
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. In your revised cover letter, please address the following prompts:
a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.
b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.
We will update your Data Availability statement on your behalf to reflect the information you provide.

3. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript.

Additional Editor Comments:

The topic is interesting and the study well designed and written.
The Authors should hypothesize whether their model could be extended to non-ICU settings.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes

5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above.
You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear authors. Thank you so much for submitting this manuscript. Below, I am including some thoughts that hopefully would be useful to further strengthen your work.

-Line 62: When you mention rapidly available vital signs, could you clarify if you are referring exclusively to vital signs upon arrival to the hospital, or vital signs taken in any setting (i.e. at home, for example). It appears that Line 88 would suggest the former, but please clarify this point in Line 62 to satisfy the picky reader.

-Line 80: I appreciate very much the discussion of the motive behind your work. It is clear and it explicitly outlines the limitations of previous literature. Well done

-Introduction: Overall concise, well-motivated, and with a clearly stated objective. Again, well done

-Line 101: If relevant, could you clarify why you chose these dates? Just curious!

-Line 119: Please briefly explain why the Kolmogorov-Smirnov test is the best statistical test for this context (this should be explained as a short clause only). Same with the Mann-Whitney U test. I think that it is not necessary to do the same for the Chi-squared test given that most readers should be familiar with it

-Line 134: Why a learning rate of 0.3? Why is this the best rate?

-Line 172: Please confirm that the 13,190 patients were COVID positive, or COVID presumed patients, or all sorts of patients, just in the spirit of clarity

-Line 177: Do you have figures about the number of patients that were admitted for ICU-level of care, out of curiosity?

-Discussion: is there any data that you found on number of symptomatic days?
For example, folks presenting with low BP and low SpO2 - I assume - likely had been symptomatic for more days than, say, someone who has been experiencing symptoms for only one day. If relevant, please add this perspective to your discussion.

-You discuss the "easy to use" approach of your model very well. I am wondering: Could you describe how your model could be introduced to non-US settings? Are there further studies that would be needed for a low-income country, for example, to fully adopt your model? Or could any hospital abroad simply "start using" your model? If so, why do you think that is the case? I believe that you hint at all of this in your final paragraphs, but a richer Discussion on this end could be interesting.

-If you think that it would be relevant, I would encourage you to create a visual algorithm that could help a clinician triage their patients based on your model (similar to UpToDate models) - this would be a time-intensive case, but I strongly feel that visual algorithms could ultimately make it easier for others to better understand how to apply your model.

Some interesting papers that you may want to cite in your text, where you see fit:
-https://bmjopen.bmj.com/content/11/2/e045442.long
-https://onlinelibrary.wiley.com/doi/full/10.1002/rmv.2146

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site.
Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

16 May 2022

Reviewer #1: Dear authors. Thank you so much for submitting this manuscript. Below, I am including some thoughts that hopefully would be useful to further strengthen your work.

-Line 62: When you mention rapidly available vital signs, could you clarify if you are referring exclusively to vital signs upon arrival to the hospital, or vital signs taken in any setting (i.e. at home, for example). It appears that Line 88 would suggest the former, but please clarify this point in Line 62 to satisfy the picky reader.

Answer: We appreciate the insightful suggestions. In the present study, we aimed to develop a machine learning-based algorithm to rapidly predict, on or immediately after admission, the severity of COVID-19 infection that is likely to lead to death. Therefore, we used vital signs and parameters that were assessed upon arrival at the hospital. We have clarified the time point at which these predictors were collected, as follows.
Revised manuscript> Line 63: Accurate prediction of the probability of death using rapidly available vital signs on arrival or immediately after admission to hospital, without further testing of laboratory parameters or chest x-ray, might help prioritize patients for hospitalization, intensive care and intubation, or to receive limited treatments in places that have very limited resources.

-Line 80: I appreciate very much the discussion of the motive behind your work. It is clear and it explicitly outlines the limitations of previous literature. Well done

Answer: We appreciate the reviewer’s comments. Because the world is still suffering from COVID-19, several prediction models have been proposed for diverse clinical settings and situations, in different study populations, and using different predictors. Thus, prior to model development, we thoroughly reviewed the performance, limitations, and strengths of other reported models. The strength of our model is that we used only readily available predictors in patients of diverse ethnicity. Therefore, we expect that our predictive model would outperform others, especially in terms of generalizability and utility. We emphasized this point as a strength of our model in the Discussion.

-Introduction: Overall concise, well-motivated, and with a clearly stated objective. Again, well done

Answer: Thank you for your positive response.

-Line 101: If relevant, could you clarify why you chose these dates? Just curious!

Answer: In March 2020, the number of patients infected with COVID-19 in the United States started to increase dramatically, posing a serious health threat, and hospitals faced a shortage of beds for seriously ill patients. Given this, we aimed to develop a model to predict the severity of COVID-19 infection for early classification of patients. So, in June 2020, we submitted an IRB application and started analyses for this study.
That is why these dates were set as the study period.

-Line 119: Please briefly explain why the Kolmogorov-Smirnov test is the best statistical test for this context (this should be explained as a short clause only). Same with the Mann-Whitney U test. I think that it is not necessary to do the same for the Chi-squared test given that most readers should be familiar with it

Answer: There are several statistical methods to test normality, including the Shapiro-Wilk test (SW), the Anderson-Darling test (AD), and the Kolmogorov-Smirnov test (KS). The Shapiro-Wilk test is known to have the best statistical power at low sample sizes, but the statistical power becomes comparable among these methods at high sample sizes. Furthermore, the SW test cannot be used with sample sizes greater than 5000. The KS statistic measures the distance from the reference distribution, and the test is known to be relatively conservative. It is also known to be sensitive to differences between two samples and can be applied to larger sample sizes. Therefore, we chose the KS test as an alternative to test for normality. The normality test showed that most of the predictors were not normally distributed, so we used the Mann-Whitney U test to compare baseline values between deceased and surviving subjects. We briefly added the reasons for using these methods in the statistical analyses, as follows.

Revised manuscript> Line 120: The assumptions of normality for collected variables were tested with the Kolmogorov-Smirnov test, which is known to be sensitive in two samples. Baseline characteristics are presented as median (IQR) for non-normally distributed continuous variables or N (%) for categorical variables. Because most of the predictors were not distributed normally, the descriptive statistics for baseline characteristics were compared by in-hospital mortality using the Mann-Whitney U test for continuous variables and the Chi-squared test for categorical variables.

-Line 134: Why a learning rate of 0.3?
Why is this the best rate?

Answer: We used the default value for the learning rate, which was 0.3. The learning rate (values range from 0 to 1) is a factor that shrinks the feature weights after each boosting step. Higher values apply very little shrinkage and result in models being fit with fewer boosting steps. Lower values apply more shrinkage and require more boosting steps and more computing time. While we could have experimented with this parameter to minimize the computing time required to fit a well-performing model, we were quite satisfied with the default learning rate, which achieved a short computing time (<10 seconds) to fit a model with such good performance in the test dataset (AUC 94%). We didn’t want to search out a model with unusual parameter settings just to improve performance by a few percent. Thus, we left the learning rate at the default value.

-Line 172: Please confirm that the 13,190 patients were COVID positive, or COVID presumed patients, or all sorts of patients, just in the spirit of clarity.

Answer: All 13,190 patients had confirmed COVID-19 infection. We edited the sentence as follows.

Revised manuscript> Line 175: Between March 1 and June 30, 2020, 13,190 patients with confirmed COVID-19 infection were admitted to a NYC H+H hospital.

-Line 177: Do you have figures about the number of patients that were admitted for ICU-level of care, out of curiosity?

Answer: Among the 13,190 subjects, 2,227 (16.9%) were cared for in the ICU during hospitalization. As expected, the subjects who were cared for in the ICU had a higher mortality rate than those who were not. We added this number to the manuscript because it helps convey the general severity of illness among the patients included in this study.

Revised manuscript> Line 176: Among them, 2,227 (16.9%) patients were cared for in the ICU during hospitalization.

-Discussion: is there any data that you found on number of symptomatic days?
For example, folks presenting with low BP and low SpO2 - I assume - likely had been symptomatic for more days than, say, someone who has been experiencing symptoms for only one day. If relevant, please add this perspective to your discussion.

Answer: We appreciate the reviewer’s valuable comment. Unfortunately, we did not assess any parameters relating to subjective symptoms (presence, severity, or duration) in this study. Although the presence, duration, or severity of symptoms is likely associated with a poorer prognosis, the purpose of this study was to establish a predictive model using only objective and readily measurable parameters. Therefore, we did not include parameters related to the patient's symptoms in this analysis. We added this point as a limitation of the study, as follows.

Revised manuscript> Line 279: By design, the data did not include measurements, such as markers of inflammation and coagulation, or indices of comorbidity and severity of illness including presence of patients’ symptoms, that predict mortality but that may not be readily available in the initial assessment of a patient.

-You discuss the "easy to use" approach of your model very well. I am wondering: Could you describe how your model could be introduced to non-US settings? Are there further studies that would be needed for a low-income country, for example, to fully adopt your model? Or could any hospital abroad simply "start using" your model? If so, why do you think that is the case? I believe that you hint at all of this in your final paragraphs, but a richer Discussion on this end could be interesting.

Answer: As we described in the Discussion, the subjects in the present study were of diverse ethnicity. Therefore, we cautiously expect that this model would work in other ethnic groups as well. However, we did not test the performance of this model in other populations or in non-US settings, so we cannot yet draw a conclusion.
Therefore, further studies testing the performance of our model in other countries are warranted. We added this point to the Discussion as follows.

Revised manuscript> Line 277: Although the study subjects included diverse races and ethnicities, we did not test the performance of this model in other study populations. Further studies testing the performance of our model in other countries will be warranted.

-If you think that it would be relevant, I encourage you to create a visual algorithm that could help a clinician triage their patients based on your model (similar to UpToDate models) - this would be a time-intensive case, but I strongly feel that visual algorithms could ultimately make it easier for others to better understand how to apply your model.

Answer: Thank you for the suggestion. We have additionally provided the four decision trees of our model as a supplemental figure in the revised manuscript. However, assessing an individual with four decision trees is not very easy. Thus, we also provide our web application, which visualizes the risk category a person would fall into for different values of the variables.

Revised manuscript> Supplemental Figure 1. First boosted decision tree from XGBoost model.

Some interesting papers that you may want to cite in your text, where you see fit:
-https://bmjopen.bmj.com/content/11/2/e045442.long
-https://onlinelibrary.wiley.com/doi/full/10.1002/rmv.2146

Answer: We appreciate the reviewer’s suggestion and have cited these references in the manuscript as new references 1 and 23.

Submitted filename: ResponseLetter_PLOSONE_20220117.docx

30 May 2022

Rapid Prediction of In-hospital Mortality among Adults with COVID-19 disease
PONE-D-21-32900R1

Dear Dr. Cummings,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Chiara Lazzeri
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

7 Jul 2022

PONE-D-21-32900R1
Rapid Prediction of In-hospital Mortality among Adults with COVID-19 disease

Dear Dr. Cummings:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Chiara Lazzeri Academic Editor PLOS ONE
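
The learning-rate answer in the response to reviewers above (more shrinkage means more boosting steps) can be illustrated with a small sketch. This is not the authors' XGBoost model or their data: it is a toy gradient-boosting loop with squared-error loss and depth-1 stumps, and the function names (`fit_stump`, `rounds_to_fit`), the eight-point dataset, and the stopping tolerance are all assumptions made up for illustration.

```python
# Toy gradient boosting (squared-error loss, depth-1 stumps), NOT XGBoost.
# Hypothetical helper names and data; used only to illustrate shrinkage.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def rounds_to_fit(xs, ys, learning_rate, tol=1e-3, max_rounds=10_000):
    """Count boosting rounds until the mean squared training error falls below tol."""
    preds = [0.0] * len(xs)
    for n in range(1, max_rounds + 1):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        # Shrink the new stump's contribution by the learning rate.
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
        mse = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(xs)
        if mse < tol:
            return n
    return max_rounds


# Made-up data: a single step in the outcome at x = 4.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

fast = rounds_to_fit(xs, ys, learning_rate=0.3)   # the default value cited above
slow = rounds_to_fit(xs, ys, learning_rate=0.05)  # heavier shrinkage
print(fast, slow)
```

On this toy data each round shrinks the remaining residual by a factor of (1 - learning rate), so the run at 0.3 stops after 9 rounds and the run at 0.05 needs 61. Real boosted-tree libraries add regularization and subsampling, so the exact counts would differ, but the direction of the trade-off is the one described in the response.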
  32 in total

Review 1.  Predictors of COVID-19 severity: A literature review.

Authors:  Benjamin Gallo Marin; Ghazal Aghagoli; Katya Lavine; Lanbo Yang; Emily J Siff; Silvia S Chiang; Thais P Salazar-Mather; Luba Dumenco; Michael C Savaria; Su N Aung; Timothy Flanigan; Ian C Michelow
Journal:  Rev Med Virol       Date:  2020-07-30       Impact factor: 6.989

2.  Predictive Modeling of Morbidity and Mortality in Patients Hospitalized With COVID-19 and its Clinical Implications: Algorithm Development and Interpretation.

Authors:  Joshua M Wang; Wenke Liu; Xiaoshan Chen; Michael P McRae; John T McDevitt; David Fenyö
Journal:  J Med Internet Res       Date:  2021-07-09       Impact factor: 5.428

3.  Comorbidity and clinical factors associated with COVID-19 critical illness and mortality at a large public hospital in New York City in the early phase of the pandemic (March-April 2020).

Authors:  Thomas D Filardo; Maria R Khan; Noa Krawczyk; Hayley Galitzer; Savannah Karmen-Tuohy; Megan Coffee; Verity E Schaye; Benjamin J Eckhardt; Gabriel M Cohen
Journal:  PLoS One       Date:  2020-11-23       Impact factor: 3.240

4.  Variation in US Hospital Mortality Rates for Patients Admitted With COVID-19 During the First 6 Months of the Pandemic.

Authors:  David A Asch; Natalie E Sheils; Md Nazmul Islam; Yong Chen; Rachel M Werner; John Buresh; Jalpa A Doshi
Journal:  JAMA Intern Med       Date:  2021-04-01       Impact factor: 21.873

5.  Association between biomarkers and COVID-19 severity and mortality: a nationwide Danish cohort study.

Authors:  Gethin Hodges; Jannik Pallisgaard; Anne-Marie Schjerning Olsen; Patricia McGettigan; Mikkel Andersen; Maria Krogager; Kristian Kragholm; Lars Køber; Gunnar Hilmar Gislason; Christian Torp-Pedersen; Casper N Bang
Journal:  BMJ Open       Date:  2020-12-02       Impact factor: 2.692

6.  Risk Factors Associated With All-Cause 30-Day Mortality in Nursing Home Residents With COVID-19.

Authors:  Orestis A Panagiotou; Cyrus M Kosar; Elizabeth M White; Leonidas E Bantis; Xiaofei Yang; Christopher M Santostefano; Richard A Feifer; Carolyn Blackman; James L Rudolph; Stefan Gravenstein; Vincent Mor
Journal:  JAMA Intern Med       Date:  2021-04-01       Impact factor: 21.873

7.  Multivariable mortality risk prediction using machine learning for COVID-19 patients at admission (AICOVID).

Authors:  Sujoy Kar; Rajesh Chawla; Sai Praveen Haranath; Suresh Ramasubban; Nagarajan Ramakrishnan; Raju Vaishya; Anupam Sibal; Sangita Reddy
Journal:  Sci Rep       Date:  2021-06-17       Impact factor: 4.379

8.  Risk Factors for Hospitalization, Mechanical Ventilation, or Death Among 10 131 US Veterans With SARS-CoV-2 Infection.

Authors:  George N Ioannou; Emily Locke; Pamela Green; Kristin Berry; Ann M O'Hare; Javeed A Shah; Kristina Crothers; McKenna C Eastment; Jason A Dominitz; Vincent S Fan
Journal:  JAMA Netw Open       Date:  2020-09-01

9.  Deep-learning artificial intelligence analysis of clinical variables predicts mortality in COVID-19 patients.

Authors:  Jocelyn S Zhu; Peilin Ge; Chunguo Jiang; Yong Zhang; Xiaoran Li; Zirun Zhao; Liming Zhang; Tim Q Duong
Journal:  J Am Coll Emerg Physicians Open       Date:  2020-08-25

10.  Association of Race With Mortality Among Patients Hospitalized With Coronavirus Disease 2019 (COVID-19) at 92 US Hospitals.

Authors:  Baligh R Yehia; Angela Winegar; Richard Fogel; Mohamad Fakih; Allison Ottenbacher; Christine Jesser; Angelo Bufalino; Ren-Huai Huang; Joseph Cacchione
Journal:  JAMA Netw Open       Date:  2020-08-03
  1 in total

Review 1.  Data capture and sharing in the COVID-19 pandemic: a cause for concern.

Authors:  Louis Dron; Vinusha Kalatharan; Alind Gupta; Jonas Haggstrom; Nevine Zariffa; Andrew D Morris; Paul Arora; Jay Park
Journal:  Lancet Digit Health       Date:  2022-10
