Eva L H Tsui, Carrie S M Lui, Pauline P S Woo, Alan T L Cheung, Peggo K W Lam, Van T W Tang, C F Yiu, C H Wan, Libby H Y Lee.
Abstract
BACKGROUND: This is the first study on prognostication in an entire cohort of laboratory-confirmed COVID-19 patients in the city of Hong Kong. A prognostic tool is essential in the contingency response for the next wave of the outbreak. This study aims to develop prognostic models to predict COVID-19 patients' clinical outcome on day 1 and day 5 of hospital admission.
Keywords: COVID-19; Clinical outcome; Disease severity; Prediction; Prognostic; Step-down care; Triage
Year: 2020 PMID: 33287804 PMCID: PMC7719738 DOI: 10.1186/s12911-020-01338-0
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 3 Relationship between each key feature on Day 1 and Day 5 of admission and the SHAP value for each outcome group (Red: Critical/Serious, Yellow: Stable, Green: Satisfactory)
Patient profile of 1037 COVID-19 confirmed cases as of 30 April 2020 (with data during hospitalisation updated till 10 May 2020)
| | Overall (n = 1037) | Worst condition upon discharge or till 10 May 2020: Critical/serious (n = 50) | Worst condition: Stable (n = 485) | Worst condition: Satisfactory (n = 502) | p value* |
|---|---|---|---|---|---|
| Demographics | | | | | |
| Age (years) | | | | | |
| Mean ± SD | 37.8 ± 17.8 | 60.6 ± 14.0 | 37.6 ± 17.5 | 35.6 ± 16.8 | < 0.0001 |
| Median | 35 | 62 | 34 | 32 | |
| Range | 0–96 | 25–96 | 0–93 | 0–89 | |
| Gender | | | | | |
| Male | 558 (53.8%) | 32 (64.0%) | 232 (47.8%) | 294 (58.6%) | 0.0011 |
| Female | 479 (46.2%) | 18 (36.0%) | 253 (52.2%) | 208 (41.4%) | |
| Chronic diseases† | | | | | |
| Nil (Without any of 25 selected diseases) | 915 (88.2%) | 29 (58.0%) | 425 (87.6%) | 461 (91.8%) | < 0.0001 |
| With any of 25 selected diseases | 122 (11.8%) | 21 (42.0%) | 60 (12.4%) | 41 (8.2%) | |
| Hypertension | 87 (8.4%) | 16 (32.0%) | 46 (9.5%) | 25 (5.0%) | < 0.0001 |
| Hyperlipidemia | 62 (6.0%) | 12 (24.0%) | 30 (6.2%) | 20 (4.0%) | < 0.0001 |
| Diabetes | 33 (3.2%) | 6 (12.0%) | 15 (3.1%) | 12 (2.4%) | 0.0011 |
| Source of infection | | | | | |
| Local cases | 422 (40.7%) | 30 (60.0%) | 219 (45.2%) | 173 (34.5%) | < 0.0001 |
| Imported cases | 615 (59.3%) | 20 (40.0%) | 266 (54.8%) | 329 (65.5%) | |
| Symptoms | | | | | |
| With presenting symptoms upon COVID-19 confirmation | 844 (81.4%) | 48 (96.0%) | 406 (83.7%) | 390 (77.7%) | 0.0013 |
| Duration between symptom onset and admission | | | | | |
| Mean ± SD (Days) | 5.8 ± 5.4 | 5.5 ± 3.7 | 5.2 ± 4.7 | 6.4 ± 6.2 | 0.1650 |
| Less than 5 days | 447 (53.0%) | 22 (45.8%) | 225 (55.4%) | 200 (51.3%) | 0.0514 |
| 5–9 days | 246 (29.1%) | 17 (35.4%) | 124 (30.5%) | 105 (26.9%) | |
| 10 days or above | 151 (17.9%) | 9 (18.8%) | 57 (14.0%) | 85 (21.8%) | |
| Symptoms (incl. data captured during hospitalisation) | | | | | |
| Cough | 514 (49.5%) | 41 (82.0%) | 246 (50.7%) | 227 (45.2%) | < 0.0001 |
| Fever | 495 (47.7%) | 40 (80.0%) | 242 (49.9%) | 213 (42.4%) | < 0.0001 |
| Sore Throat | 272 (26.2%) | 9 (18.0%) | 142 (29.3%) | 121 (24.1%) | 0.0724 |
| Headache | 159 (15.3%) | 7 (14.0%) | 74 (15.3%) | 78 (15.5%) | 0.9575 |
| Diarrhea | 117 (11.3%) | 9 (18.0%) | 61 (12.6%) | 47 (9.4%) | 0.0856 |
| Fatigue | 110 (10.6%) | 9 (18.0%) | 53 (10.9%) | 48 (9.6%) | 0.1726 |
| Myalgia | 109 (10.5%) | 9 (18.0%) | 51 (10.5%) | 49 (9.8%) | 0.1938 |
| Dyspnoea | 88 (8.5%) | 23 (46.0%) | 37 (7.6%) | 28 (5.6%) | < 0.0001 |
| Chills | 77 (7.4%) | 9 (18.0%) | 42 (8.7%) | 26 (5.2%) | 0.0016 |
| Pneumonia | 36 (3.5%) | 13 (26.0%) | 14 (2.9%) | 9 (1.8%) | < 0.0001 |
| Vomiting | 13 (1.3%) | 2 (4.0%) | 5 (1.0%) | 6 (1.2%) | 0.1965 |
| Others not listed above | 268 (25.8%) | 9 (18.0%) | 124 (25.6%) | 135 (26.9%) | 0.3843 |
*p value based on chi-square test for categorical variables and Kruskal–Wallis test for continuous variables
†Refer to 25 chronic diseases (Including Diabetes, Hypertension, Hyperlipidemia, Chronic obstructive pulmonary disease, Coronary Heart Disease, Chronic Heart Failure, Chronic Kidney Disease, Stroke, Glaucoma, Hip fracture, Hepatitis B, Dementia, Depression, Parkinsonism, Non-Hodgkin Lymphoma, Cancer of lung, colorectum, breast, liver, prostate, stomach, cervix, corpus, ovary and nasopharynx) in the HA chronic diseases virtual registry up to 2018, and the latest information of Diabetes, Hypertension and Hyperlipidemia
‡For those discharged before Day 5, the worst condition up to day of discharge is tabulated
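The footnote says categorical p values come from a chi-square test. As a rough illustration, the sketch below recomputes the Gender row from the counts in the table. It assumes a Pearson chi-square test of independence on the 2 × 3 gender-by-outcome table with no continuity correction; the source does not spell out these details, and the function name `chi_square_independence` is made up for this example. With 2 degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Gender counts from the patient profile table.
# Columns: Critical/serious, Stable, Satisfactory.
gender_counts = [
    [32, 232, 294],  # Male
    [18, 253, 208],  # Female
]

stat, dof = chi_square_independence(gender_counts)
assert dof == 2  # the closed form below is valid only for 2 degrees of freedom
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.4f}")
```

Under these assumptions the result rounds to the 0.0011 reported for Gender, which suggests (but does not prove) that the table used the same uncorrected test.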
Laboratory readings on Day 1 and Day 5 of admission among 1037 COVID-19 confirmed cases as of 30 April 2020
| Laboratory readings on Day 1 of admission | Critical/serious (n = 50): n | Critical/serious: Mean ± SD | Stable (n = 485): n | Stable: Mean ± SD | Satisfactory (n = 502): n | Satisfactory: Mean ± SD | p value |
|---|---|---|---|---|---|---|---|
| CT value# | 47 | 24.29 ± 6.05 | 436 | 25.10 ± 7.05 | 474 | 27.17 ± 7.79 | 0.0001 |
| C-reactive protein (mg/L) | 45 | 101.71 ± 93.66 | 403 | 10.81 ± 23.42 | 440 | 7.98 ± 19.84 | < 0.0001 |
| Albumin (g/L) | 50 | 34.34 ± 6.72 | 474 | 41.96 ± 4.58 | 483 | 41.57 ± 4.87 | < 0.0001 |
| Globulin (g/L) | 48 | 35.71 ± 6.76 | 446 | 32.65 ± 5.11 | 416 | 32.71 ± 4.97 | 0.0054 |
| Albumin-globulin ratio | 48 | 1.01 ± 0.30 | 446 | 1.33 ± 0.30 | 416 | 1.33 ± 0.30 | < 0.0001 |
| Total protein (g/L) | 50 | 70.40 ± 6.86 | 474 | 74.98 ± 5.37 | 482 | 74.97 ± 5.36 | < 0.0001 |
| Neutrophil count (10⁹/L) | 48 | 5.12 ± 3.61 | 466 | 3.46 ± 1.60 | 470 | 3.57 ± 1.69 | 0.0004 |
| Lymphocyte count (10⁹/L) | 48 | 0.93 ± 0.36 | 466 | 1.53 ± 0.75 | 470 | 1.53 ± 0.59 | < 0.0001 |
| Neutrophil–lymphocyte ratio | 48 | 6.43 ± 4.75 | 466 | 2.72 ± 1.90 | 470 | 2.65 ± 1.62 | < 0.0001 |
| White blood cell count (10⁹/L) | 50 | 6.54 ± 3.62 | 473 | 5.61 ± 1.93 | 484 | 5.70 ± 1.98 | 0.1906 |
| Bilirubin (μmol/L) | 50 | 10.72 ± 5.13 | 474 | 9.20 ± 5.96 | 482 | 9.18 ± 5.83 | 0.0130 |
| Potassium (mmol/L) | 50 | 3.78 ± 0.48 | 474 | 3.88 ± 0.38 | 480 | 3.89 ± 0.37 | 0.1846 |
| Creatinine (as times of upper normal reference) | 50 | 0.83 ± 0.35 | 474 | 0.77 ± 0.14 | 481 | 0.75 ± 0.12 | 0.1474 |
| LDH | 44 | 1.59 ± 0.68 | 435 | 0.84 ± 0.25 | 464 | 0.85 ± 0.23 | < 0.0001 |
| ALP | 50 | 0.58 ± 0.29 | 474 | 0.57 ± 0.23 | 483 | 0.58 ± 0.19 | 0.1511 |
| ALT | 50 | 0.92 ± 0.64 | 474 | 0.61 ± 0.48 | 482 | 0.63 ± 0.51 | < 0.0001 |
| Platelet | 50 | 0.54 ± 0.19 | 470 | 0.58 ± 0.20 | 483 | 0.60 ± 0.19 | 0.0094 |
| MPV | 48 | 0.88 ± 0.10 | 393 | 0.87 ± 0.15 | 429 | 0.88 ± 0.12 | 0.9139 |
LDH lactate dehydrogenase, ALP alkaline phosphatase, ALT alanine aminotransferase, MPV mean platelet volume
#CT value is set to 40 if PCR test is "not detected"
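Several rows in this table are derived rather than measured directly: the CT value is capped at 40 for "not detected" results, two rows are ratios of other rows, and creatinine is expressed as a multiple of the upper normal reference. A minimal sketch of those derivations follows. The function names, the use of `None` to encode "not detected", the patient readings, and the creatinine reference limit are all hypothetical and not taken from the source.

```python
from typing import Optional

CT_NOT_DETECTED = 40.0  # value the footnote assigns when PCR is "not detected"


def ct_value(raw_ct: Optional[float]) -> float:
    """Map a missing CT reading (None is this sketch's stand-in for
    "not detected") to 40; pass measured values through unchanged."""
    return CT_NOT_DETECTED if raw_ct is None else raw_ct


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio, as used for neutrophil-lymphocyte and albumin-globulin ratios."""
    return numerator / denominator


def times_upper_normal(value: float, upper_reference: float) -> float:
    """Express a reading as a multiple of the upper normal reference limit."""
    return value / upper_reference


# Hypothetical single-patient readings, for illustration only.
patient = {"ct": None, "neutrophil": 5.2, "lymphocyte": 1.3,
           "albumin": 40.0, "globulin": 32.0, "creatinine": 88.0}
CREATININE_UPPER = 110.0  # hypothetical reference limit, not from the source

features = {
    "ct_value": ct_value(patient["ct"]),
    "nlr": ratio(patient["neutrophil"], patient["lymphocyte"]),
    "ag_ratio": ratio(patient["albumin"], patient["globulin"]),
    "creatinine_x_uln": times_upper_normal(patient["creatinine"], CREATININE_UPPER),
}
print(features)
```

Note that a group mean of per-patient ratios is generally not the ratio of the group means, so the ratio rows in the table cannot be recovered from the other rows.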
Fig. 1 Top 20 features* ranked according to importance in the XGBoost model. *Feature importance of total protein, gender, pneumonia, chronic disease, sore throat, fatigue, myalgia, chills, headache, and vomiting under the Day-1 model, and that of total protein, myalgia, pneumonia, fever, vomiting, chronic disease, headache, fatigue, sore throat, and chills under the Day-5 model, were excluded from this figure
Predictive performance of the full model and the simplified model
| Full model* (based on 30 features) | Day 1 of admission: predicted Critical/serious | Day 1: predicted Stable | Day 1: predicted Satisfactory | Day 5 of admission#: predicted Critical/serious | Day 5#: predicted Stable | Day 5#: predicted Satisfactory |
|---|---|---|---|---|---|---|
| Actual class | | | | | | |
| Critical/serious | 6 | 0 | 4 | 10 | 0 | 0 |
| Stable | 6 | 86 | 5 | 0 | 96 | 1 |
| Satisfactory | 1 | 0 | 100 | 0 | 0 | 101 |
| Sensitivity | | | | | | |
| By class | 60.0% | 88.7% | 99.0% | 100.0% | 99.0% | 100.0% |
| Macro averaged | 82.6% | | | 99.7% | | |
| Micro averaged | 92.3% | | | 99.5% | | |
| Specificity | | | | | | |
| By class | 96.5% | 100.0% | 91.6% | 100.0% | 100.0% | 99.1% |
| Macro averaged | 96.0% | | | 99.5% | | |
| Micro averaged | 96.1% | | | 99.5% | | |
| Accuracy | 92.3% | | | 99.5% | | |
*Model performance based on testing dataset (n = 208)
#Upon discharge if discharged from hospital before Day 5
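The per-class, macro-averaged and micro-averaged figures can be derived from the confusion matrix alone. The sketch below does this for the Day-1 full model, assuming one-vs-rest definitions of sensitivity and specificity, macro averaging as the unweighted mean of per-class values, and micro averaging as pooling the one-vs-rest counts across classes. The source does not state its formulas, so these are assumptions, and the helper names are made up for this example.

```python
def one_vs_rest_counts(matrix, k):
    """Return (TP, FN, FP, TN) for class k; rows are actual, columns predicted."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    return tp, fn, fp, tn


def summarize(matrix):
    """Per-class, macro- and micro-averaged sensitivity and specificity, plus accuracy."""
    classes = range(len(matrix))
    counts = [one_vs_rest_counts(matrix, k) for k in classes]
    sens = [tp / (tp + fn) for tp, fn, fp, tn in counts]
    spec = [tn / (tn + fp) for tp, fn, fp, tn in counts]
    total = sum(sum(row) for row in matrix)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "macro_sensitivity": sum(sens) / len(sens),
        "macro_specificity": sum(spec) / len(spec),
        # Micro averages pool the one-vs-rest counts over all classes.
        "micro_sensitivity": sum(c[0] for c in counts) / sum(c[0] + c[1] for c in counts),
        "micro_specificity": sum(c[3] for c in counts) / sum(c[3] + c[2] for c in counts),
        "accuracy": sum(matrix[k][k] for k in classes) / total,
    }


# Day-1 full-model confusion matrix from the table (testing dataset, n = 208).
# Class order: Critical/serious, Stable, Satisfactory.
day1_full = [
    [6, 0, 4],
    [6, 86, 5],
    [1, 0, 100],
]

metrics = summarize(day1_full)
for name, value in metrics.items():
    print(name, value)
```

Under these assumptions the Day-1 results agree with the tabulated values to within about 0.1 percentage point. For a single-label multiclass problem, micro-averaged sensitivity equals overall accuracy, which is why those two rows coincide.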
Comparison on predictive performance of three alternative machine learning classification algorithms using the same features under the full model
| Algorithm (full model*, based on 30 features, on Day 1 of admission) | Sensitivity, macro averaged (%) | Sensitivity, micro averaged (%) | Specificity, macro averaged (%) | Specificity, micro averaged (%) | Accuracy (%) |
|---|---|---|---|---|---|
| Decision tree# | 76.1 | 90.4 | 78.3 | 90.4 | 90.4 |
| Random forest# | 77.8 | 90.9 | 69.6 | 90.9 | 90.9 |
| As compared against the study’s chosen model | | | | | |
| XGBoost | 82.6 | 92.3 | 96.0 | 96.1 | 92.3 |
*Model performance based on testing dataset (n = 208)
#median imputation method was adopted to handle missing data values in study subjects
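The footnote notes that missing values were filled by median imputation for the decision tree and random forest runs. A minimal sketch of that idea is shown below, assuming missing readings are encoded as `None` and that each feature is imputed with the median of its observed values. The function name `impute_median` and the toy rows are hypothetical; the study's actual preprocessing is not described in this excerpt.

```python
from statistics import median


def impute_median(rows, columns):
    """Fill None entries in the listed columns with each column's observed median.

    Returns the imputed rows (as new dicts) and the medians that were used.
    """
    medians = {}
    for col in columns:
        observed = [row[col] for row in rows if row[col] is not None]
        medians[col] = median(observed)
    imputed = [
        {**row, **{col: medians[col] for col in columns if row[col] is None}}
        for row in rows
    ]
    return imputed, medians


# Toy rows with made-up readings, for illustration only.
rows = [
    {"crp": 5.0, "ldh": 0.8},
    {"crp": None, "ldh": 1.2},
    {"crp": 12.0, "ldh": None},
    {"crp": 101.0, "ldh": 0.9},
]

imputed_rows, used_medians = impute_median(rows, ["crp", "ldh"])
print(used_medians)
print(imputed_rows)
```

One design consideration: to avoid leaking information from the testing dataset, the medians would normally be computed on training rows only and then reused to fill gaps in the test rows. Whether the study did so is not stated here.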
Fig. 2 Decision rules using the key features under the Day-1 simplified model and their thresholds
Predictive performance of the simplified model on an extended testing dataset (COVID-19 confirmed cases during 1 May–9 Aug 2020), which was added to this study when the manuscript was revised in mid-August 2020
| Simplified model* (based on 7 features) | Day 1 of admission: predicted Critical/serious | Day 1: predicted Stable | Day 1: predicted Satisfactory | Day 5 of admission#: predicted Critical/serious | Day 5#: predicted Stable | Day 5#: predicted Satisfactory |
|---|---|---|---|---|---|---|
| Actual class | | | | | | |
| Critical/serious | 187 | 13 | 23 | 207 | 7 | 9 |
| Stable | 227 | 1389 | 62 | 187 | 1480 | 11 |
| Satisfactory | 7 | 0 | 1076 | 18 | 0 | 1065 |
| Sensitivity | | | | | | |
| By outcome class | 83.9% | 82.8% | 99.4% | 92.8% | 88.2% | 98.3% |
| Macro averaged | 88.9% | | | 92.2% | | |
| Micro averaged | 88.9% | | | 92.2% | | |
| Specificity | | | | | | |
| By outcome class | 91.5% | 99.0% | 95.5% | 92.6% | 99.5% | 98.9% |
| Macro averaged | 97.2% | | | 98.8% | | |
| Micro averaged | 94.4% | | | 96.1% | | |
| Accuracy | 88.9% | | | 92.2% | | |
*Model performance based on testing dataset (n = 2984)