Anna Stachel1, Kwesi Daniel2, Dan Ding2, Fritz Francois3, Michael Phillips4, Jennifer Lighter5.
Abstract
New York City quickly became an epicentre of the COVID-19 pandemic. The sudden and massive increase in patients caused an exponential increase in workload for healthcare providers, which strained staff and limited resources and created a need to triage patients. Further, methods to better understand and characterise the predictors of morbidity and mortality were needed.
Keywords: information science; medical informatics; patient care
Year: 2021 PMID: 33962987 PMCID: PMC8108129 DOI: 10.1136/bmjhci-2020-100235
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
Features extracted for three training datasets: values from the first calendar day of admission, the last available value, and 1 day selected at random from the patient’s stay
| Dataset sample | Feature engineering | Variable |
| Data from admission | Quintile binning on training set for continuous variables | Demographic and hospital characteristics: previous positive COVID-19 PCR test during an outpatient or inpatient visit within 60 days, race, age, sex, body mass index (BMI) and days in hospital (current day minus admission date). |
| Data from first calendar day at admission, last available value, and 1 day selected at random from patient’s stay | Quintile binning on training set variables: current value, first value, minimum value, maximum value, mean value, median value, difference in current value from mean, difference in current value from median, difference in first value from mean, difference in first value from median, difference in max value from mean, difference in max value from median, difference in minimum value from mean and difference in minimum value from median | Laboratory values: albumin, alkaline phosphatase (ALKPHOS), alanine aminotransferase (ALT), anion gap (ANIONGAP), activated partial thromboplastin time (APTT), aspartate aminotransferase (AST), atypical lymphocytes per cent (ATYLYMREL), bands per cent (BANDSPCT), conjugated bilirubin (BILIDB), bilirubin direct (BILIDIRECT), bilirubin total, natriuretic peptide B (BNPEPTIDE), blood urea nitrogen (BUN), calcium, CKTOTAL, chloride, carbon dioxide (CO2), creatinine, C reactive protein (CRP), d-dimer, glomerular filtration rate – African American (EGGRAA), glomerular filtration rate – non-African American (EGFRNONAA), erythrocyte sedimentation rate (ESR), ferritin, fibrinogen, fraction of inspired oxygen arterial blood gas (FIO2ABG), glucose, HCT, haemoglobin, haemoglobin (HA1C), immunoglobulin A (IGA), immunoglobulin G (IGG), glomerular basement membrane (IGBM), absolute immature granulocytes (IMMGRANABS), per cent immature granulocytes (IMMGRANPCT), interleukin-1 beta (INTERL1B), interleukin 6 (INTRLKN6), potassium (K), potassium plasma (KPLA), lactate arterial blood gas (LACTATEABG), lactate venous blood gas (LACTATEVBG), lactate dehydrogenase (LDH), lipase, lymphocyte absolute calculated (LYMPABSCAL), lymphocyte per cent (LYMPHPCT), lymphocyte absolute (LYMPHSABS), magnesium (MG), sodium (NA), NEUTABSCAL, neutrophil absolute (NEUTSABS), neutrophils per cent (NEUTSPCT), carbon dioxide in arterial blood (PCO2ART), carbon dioxide in venous blood (PCO2VEN), pH of arterial blood (PHART), phosphorous, pH of venous blood (PHVBG), platelet, P02ABG, P02VB, procalcitonin (PROCAL), total protein (PROTTOTAL), prothrombin time (PT), platelet poor plasma (PTT), red blood cell (RBC), troponin (TROPONINI), troponin point of care (TRPNONPOC) and white blood cell count (WBC). |
| Data from first calendar day at admission, last available value, and 1 day selected at random from patient’s stay | Quintile binning on training set: current value, first value, minimum value, maximum value, mean value, median value, difference in current value from mean, difference in current value from median, difference in first value from mean, difference in first value from median, difference in max value from mean, difference in max value from median, difference in minimum value from mean and difference in minimum value from median | Vitals: systolic blood pressure, diastolic blood pressure, pulse pressure, oximetry, respiratory rate, pulse and temperature. |
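The summary features listed in the table can be sketched for a single laboratory or vital-sign series as follows. This is a minimal illustration and not the authors' code: the function names (`summarise`, `fit_quintile_edges`, `to_quintile`), the sign convention for the differences (value minus mean or median) and the treatment of values outside the training range are all assumptions.

```python
from bisect import bisect_right
from statistics import mean, median, quantiles


def summarise(values):
    """Summary features for one measurement series, ordered oldest to newest.

    `values` holds every result available up to the day being scored,
    so the last element is the "current" value.
    """
    current, first = values[-1], values[0]
    lowest, highest = min(values), max(values)
    avg, med = mean(values), median(values)
    features = {
        "current": current, "first": first,
        "min": lowest, "max": highest,
        "mean": avg, "median": med,
    }
    # Assumed convention: difference = value minus the summary statistic.
    for name, value in (("current", current), ("first", first),
                        ("max", highest), ("min", lowest)):
        features[f"{name}_minus_mean"] = value - avg
        features[f"{name}_minus_median"] = value - med
    return features


def fit_quintile_edges(train_values):
    """Learn the four quintile cut points from training data only."""
    return quantiles(train_values, n=5)


def to_quintile(value, edges):
    """Map a value to a bin 0..4 using cut points learned on the training set.

    Values below or above the training range fall into the first or last bin.
    """
    return bisect_right(edges, value)


# Hypothetical creatinine series for one patient, and a toy training sample.
patient_features = summarise([1.0, 3.0, 2.0])
edges = fit_quintile_edges(list(range(1, 11)))
binned = {name: to_quintile(v, edges) for name, v in patient_features.items()}
```

Fitting the cut points on the training set and reusing them for the test set, as the table describes, keeps information from test patients out of the feature definitions.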
AUC, accuracy (acc), sensitivity (sens), specificity (spec), NPV and PPV of LR, DT, GB, SVM and NN models on admission benchmark, last-value and time-varying models in test sets
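| Model | N | Admission AUC | Admission Acc | Admission Sens | Admission Spec | Admission PPV | Admission NPV | Last-value AUC | Last-value Acc | Last-value Sens | Last-value Spec | Last-value PPV | Last-value NPV | Time-vary AUC | Time-vary Acc | Time-vary Sens | Time-vary Spec | Time-vary PPV | Time-vary NPV |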
| LR | 864 | 0.79 | 0.80 | 0.34 | 0.94 | 0.66 | 0.82 | 0.98 | 0.95 | 0.23 | 0.97 | 0.90 | 0.97 | 0.88 | 0.83 | 0.66 | 0.89 | 0.65 | 0.89 |
| DT | 864 | 0.69 | 0.76 | 0.47 | 0.85 | 0.49 | 0.83 | 0.93 | 0.92 | 0.21 | 0.96 | 0.85 | 0.94 | 0.81 | 0.78 | 0.61 | 0.83 | 0.53 | 0.87 |
| GB | 864 | 0.83 | 0.82 | 0.53 | 0.91 | 0.64 | 0.86 | 0.99 | 0.96 | 0.24 | 0.97 | 0.90 | 0.98 | 0.93 | 0.88 | 0.81 | 0.90 | 0.72 | 0.94 |
| SVM | 864 | 0.77 | 0.74 | 0.56 | 0.80 | 0.47 | 0.85 | 0.99 | 0.93 | 0.23 | 0.94 | 0.83 | 0.97 | 0.85 | 0.80 | 0.68 | 0.84 | 0.57 | 0.89 |
| NN | 864 | 0.82 | 0.81 | 0.49 | 0.92 | 0.65 | 0.85 | 0.97 | 0.95 | 0.24 | 0.95 | 0.87 | 0.87 | 0.90 | 0.84 | 0.77 | 0.86 | 0.63 | 0.92 |
AUC, area under the receiver operating characteristic curve; DT, decision tree; GB, gradient boosting decision trees; LR, logistic regression; NN, neural network; NPV, negative predictive value; PPV, positive predictive value; SVM, support vector machine.
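For readers who want to check how the six reported metrics relate to one another, the sketch below computes them from true outcomes and predicted scores. It is an illustration under stated assumptions: the 0.5 classification threshold and the function names are hypothetical, since the source does not state the cut-off used, and the AUC is computed with the pairwise (Mann-Whitney) definition rather than any particular library.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, PPV and NPV at a given threshold.

    y_true: 1 for the outcome (for example, death), 0 otherwise.
    threshold: assumed cut-off; not taken from the source.
    """
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1
    return {
        "acc": _ratio(tp + tn, tp + tn + fp + fn),
        "sens": _ratio(tp, tp + fn),   # true positive rate
        "spec": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Quadratic in the number of patients, which is
    acceptable for a small illustration.
    """
    positives = [s for t, s in zip(y_true, y_score) if t]
    negatives = [s for t, s in zip(y_true, y_score) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return _ratio(wins, len(positives) * len(negatives))


# Toy example with five hypothetical patients.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]
metrics = confusion_metrics(labels, scores)
metrics["auc"] = auc(labels, scores)
```

The AUC does not depend on the threshold, whereas the other five metrics do, which is why a model can show a high AUC alongside low sensitivity at a given cut-off.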
Daily AUC, accuracy (acc), sensitivity (sens), specificity (spec), NPV and PPV of the GB admission, last-value and time-vary models on patients’ daily predictions in the daily prediction test set
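| Day | N | Admission AUC | Admission Acc | Admission Sens | Admission Spec | Admission PPV | Admission NPV | Last-value AUC | Last-value Acc | Last-value Sens | Last-value Spec | Last-value PPV | Last-value NPV | Time-vary AUC | Time-vary Acc | Time-vary Sens | Time-vary Spec | Time-vary PPV | Time-vary NPV |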
| Days After Admission 1 | 864 | 0.84 | 0.82 | 0.53 | 0.91 | 0.64 | 0.86 | 0.82 | 0.79 | 0.40 | 0.92 | 0.61 | 0.83 | 0.83 | 0.77 | 0.58 | 0.82 | 0.51 | 0.86 |
| Days After Admission 2 | 859 | 0.84 | 0.79 | 0.60 | 0.86 | 0.57 | 0.87 | 0.86 | 0.82 | 0.49 | 0.93 | 0.70 | 0.85 | 0.87 | 0.80 | 0.60 | 0.86 | 0.59 | 0.87 |
| Days After Admission 3 | 822 | 0.86 | 0.79 | 0.69 | 0.83 | 0.57 | 0.89 | 0.85 | 0.81 | 0.46 | 0.93 | 0.68 | 0.84 | 0.89 | 0.83 | 0.64 | 0.90 | 0.67 | 0.88 |
| Days After Admission 4 | 759 | 0.87 | 0.78 | 0.77 | 0.79 | 0.56 | 0.91 | 0.91 | 0.84 | 0.57 | 0.94 | 0.76 | 0.86 | 0.89 | 0.84 | 0.73 | 0.88 | 0.69 | 0.90 |
| Days After Admission 5 | 680 | 0.86 | 0.75 | 0.77 | 0.74 | 0.53 | 0.89 | 0.91 | 0.83 | 0.57 | 0.93 | 0.77 | 0.85 | 0.91 | 0.84 | 0.76 | 0.87 | 0.69 | 0.90 |
| Days After Admission 6 | 603 | 0.87 | 0.75 | 0.80 | 0.73 | 0.55 | 0.90 | 0.94 | 0.83 | 0.59 | 0.92 | 0.77 | 0.85 | 0.91 | 0.83 | 0.73 | 0.87 | 0.70 | 0.89 |
| Days After Admission 7 | 539 | 0.88 | 0.74 | 0.83 | 0.70 | 0.56 | 0.90 | 0.94 | 0.82 | 0.62 | 0.91 | 0.76 | 0.84 | 0.94 | 0.86 | 0.81 | 0.89 | 0.77 | 0.91 |
| Days Before Discharge 1 | 864 | 0.87 | 0.77 | 0.83 | 0.76 | 0.52 | 0.93 | 0.95 | 0.94 | 0.85 | 0.96 | 0.88 | 0.95 | 0.93 | 0.95 | 0.93 | 0.95 | 0.86 | 0.98 |
| Days Before Discharge 2 | 859 | 0.92 | 0.81 | 0.93 | 0.78 | 0.57 | 0.97 | 0.98 | 0.94 | 0.81 | 0.98 | 0.91 | 0.94 | 0.98 | 0.93 | 0.90 | 0.94 | 0.84 | 0.97 |
| Days Before Discharge 3 | 822 | 0.92 | 0.82 | 0.91 | 0.79 | 0.58 | 0.96 | 0.97 | 0.91 | 0.74 | 0.97 | 0.88 | 0.92 | 0.96 | 0.92 | 0.87 | 0.93 | 0.81 | 0.96 |
| Days Before Discharge 4 | 759 | 0.94 | 0.80 | 0.90 | 0.77 | 0.58 | 0.96 | 0.97 | 0.89 | 0.71 | 0.96 | 0.86 | 0.90 | 0.97 | 0.88 | 0.86 | 0.89 | 0.74 | 0.95 |
| Days Before Discharge 5 | 680 | 0.93 | 0.77 | 0.86 | 0.73 | 0.55 | 0.93 | 0.96 | 0.88 | 0.72 | 0.94 | 0.82 | 0.89 | 0.96 | 0.86 | 0.83 | 0.87 | 0.71 | 0.93 |
| Days Before Discharge 6 | 603 | 0.92 | 0.77 | 0.85 | 0.73 | 0.57 | 0.92 | 0.93 | 0.84 | 0.67 | 0.92 | 0.77 | 0.87 | 0.93 | 0.85 | 0.78 | 0.88 | 0.73 | 0.91 |
| Days Before Discharge 7 | 539 | 0.90 | 0.76 | 0.82 | 0.73 | 0.58 | 0.90 | 0.91 | 0.82 | 0.64 | 0.91 | 0.76 | 0.84 | 0.91 | 0.81 | 0.75 | 0.83 | 0.68 | 0.88 |
AUC, area under the receiver operating characteristic curve; GB, gradient boosting decision trees; NPV, negative predictive value; PPV, positive predictive value.
Figure 1 Daily AUCs from the three final models (admission, last-value and time-vary) and their performance over time (7 days after admission and prior to discharge) on the test set and ‘imputed’ test set (N=864). (A) Compares the AUCs on each day after admission during patients’ stays. (B) Compares the AUCs on each day before discharge during patients’ stays. AUC, area under the receiver operating characteristic curve.
Figure 2 Calibration plots using the time-vary model on the test set (A) on admission, (B) 7 days after admission and (C) 3 days before discharge (N=864). The plots show a slight propensity for the model to overpredict at various points of patients’ stays.
Figure 3 Proportion of actual mortality by predicted mortality score decile ranking in the imputed test set. (A) On admission, (B) 7 days after admission and (C) 3 days before discharge (N=864). The model shows an increase in actual mortality among decile groups with higher predicted mortality.
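The decile comparison described in Figure 3 can be reproduced in outline as follows. This is a sketch under assumptions and not the authors' procedure: the function name `mortality_by_decile`, the equal-sized grouping by rank and the toy data are all hypothetical.

```python
def mortality_by_decile(scores, died, n_groups=10):
    """Observed mortality within groups ranked by predicted score.

    scores: predicted mortality scores, one per patient.
    died:   1 if the patient died, 0 otherwise, in the same order.
    Patients are sorted by score and split into n_groups groups of
    (nearly) equal size; group 1 holds the lowest predicted scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total = len(order)
    rows = []
    for group in range(n_groups):
        members = order[group * total // n_groups:(group + 1) * total // n_groups]
        if not members:  # fewer patients than groups
            continue
        rows.append({
            "decile": group + 1,
            "n": len(members),
            "mean_predicted": sum(scores[i] for i in members) / len(members),
            "observed": sum(died[i] for i in members) / len(members),
        })
    return rows


# Toy data: 20 hypothetical patients, deaths concentrated at high scores.
toy_scores = [i / 20 for i in range(20)]
toy_died = [0] * 16 + [1] * 4
for row in mortality_by_decile(toy_scores, toy_died):
    print(row["decile"], row["n"], round(row["mean_predicted"], 3), row["observed"])
```

Comparing `mean_predicted` with `observed` in each group also gives a coarse view of calibration: a group whose mean predicted score exceeds its observed mortality corresponds to the overprediction noted for the calibration plots.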
Figure 4 Ranking of the 30 most important of 142 features of the final selected model, based on per cent relative importance of the last lab value available in the test set. The purple graph in the rightmost column of the figure displays the variable importance value. The map also lists the average influence of a feature’s level on a patient’s overall prediction score, with darker red boxes and darker blue boxes indicating an increase and decrease in the prediction, respectively (N=864). Full map in supplemental material.