Mostafa Shanbehzadeh1, Raoof Nopour2, Hadi Kazemi-Arpanahi3,4.
Abstract
Introduction: The coronavirus disease 2019 (COVID-19) outbreak has overwhelmed many healthcare systems worldwide and pushed them to the edge of collapse. Because intensive care unit (ICU) capacity is limited, allocating the required resources properly is crucial. This study aimed to develop and compare models for the early prediction of ICU admission in COVID-19 patients at the point of hospital admission.

Materials and methods: Using a single-center registry, we studied the records of 512 COVID-19 patients. First, the most important variables were identified using the Chi-square test (at P < 0.01) and logistic regression (with odds ratio at P < 0.05). Second, we trained seven decision tree (DT) algorithms (decision stump (DS), Hoeffding tree (HT), logistic model tree (LMT), J-48, random forest (RF), random tree (RT), and REP-Tree) using the selected variables. Finally, the models' performance was evaluated. Furthermore, we used an external dataset to validate the prediction models.
Keywords: COVID-19; Coronavirus; Decision tree; Intensive care unit; Machine learning
Year: 2022 PMID: 35317245 PMCID: PMC8930186 DOI: 10.1016/j.imu.2022.100919
Source DB: PubMed Journal: Inform Med Unlocked ISSN: 2352-9148
Fig. 1. Flow chart describing patient selection.
Table 1. The characteristics of laboratory variables.
| No. | Variable (units) | Ranges | Description |
|---|---|---|---|
| 1 | Blood creatinine (mg/dL)¹ | Reference: 0.7–1.3 (men), 0.6–1.1 (women); Low: <0.7 (men), <0.6 (women); High: >1.3 (men), >1.1 (women) | Creatinine level in the blood |
| 2 | Red cell count (million cells/μL)² | Reference: 4.35–5.65 (men), 3.92–5.13 (women); Low: <4.35 (men), <3.92 (women); High: >5.65 (men), >5.13 (women) | Red cell count in the blood |
| 3 | Hematocrit (L/L)³ | Reference: 0.40–0.54 (men), 0.37–0.47 (women); Low: <0.40 (men), <0.37 (women); High: >0.54 (men), >0.47 (women) | Proportion of the blood volume made up of red cells |
| 4 | Hemoglobin (g/dL)⁴ | Reference: 14.0–17.5 (men), 12.3–15.3 (women); Low: <14.0 (men), <12.3 (women); High: >17.5 (men), >15.3 (women) | Level of hemoglobin, the iron-containing protein in red blood cells |
| 5 | Platelet count (cells/μL)⁵ | Reference: 150,000–400,000; Low: <150,000; High: >400,000 | Number of platelets in the blood |
| 6 | Absolute lymphocyte count (10³ cells/μL)⁵ | Reference: 1–4.8; Low: <1; High: >4.8 | Absolute number of lymphocytes in the blood, obtained by multiplying the white cell count by the lymphocyte percentage |
| 7 | Absolute neutrophil count (10³ cells/μL)⁵ | Reference: 2.5–6; Low: <2.5; High: >6 | Absolute number of neutrophils in the blood, obtained by multiplying the white cell count by the neutrophil percentage |
| 8 | Blood calcium (mg/dL)¹ | Reference: 8.6–10.3; Low: <8.6; High: >10.3 | Calcium level in the blood |
| 9 | Blood sodium (mEq/L)⁶ | Reference: 135–145; Low: <135; High: >145 | Sodium level in the blood |
| 10 | Blood magnesium (mEq/L)⁶ | Reference: 1.3–2.1; Low: <1.3; High: >2.1 | Magnesium level in the blood |
| 11 | Blood phosphorus (mg/dL)¹ | Reference: 3.4–4.5; Low: <3.4; High: >4.5 | Phosphorus level in the blood |
| 12 | Blood potassium (mEq/L)⁶ | Reference: 3.5–5.2; Low: <3.5; High: >5.2 | Potassium level in the blood |
| 13 | Blood urea nitrogen (mg/dL)¹ | Reference: 6–24; Low: <6; High: >24 | Amount of urea nitrogen in the blood |
| 14 | Total bilirubin (mg/dL)¹ | Reference: 1.2; Low: <1.2; High: >1.2 | Amount of bilirubin in the blood |
| 15 | Aspartate aminotransferase (units/L)⁷ | Reference: 8–33; Low: <8; High: >33 | Amount of the enzyme aspartate aminotransferase in the blood |
| 16 | Alanine aminotransferase (units/L)⁷ | Reference: 29–33 (men), 19–25 (women); Low: <29 (men), <19 (women); High: >33 (men), >25 (women) | Amount of the enzyme alanine aminotransferase in the blood |
| 17 | Serum albumin (g/dL)⁸ | Reference: 3.4–5.4; Low: <3.4; High: >5.4 | Amount of albumin in the blood |
| 18 | Blood glucose (mg/dL)¹ | Reference: <140; Prediabetes: 140–199; Diabetes: >200 | Glucose level in the blood |
| 19 | Lactate dehydrogenase (units/L)⁷ | Reference: 140–280; Low: <140; High: >280 | Amount of lactate dehydrogenase in the blood |
| 20 | Activated partial thromboplastin time (s)⁹ | Reference: 30–40; Fast: <30; Slow: >40 | Time taken for a clot to form in a blood specimen |
| 21 | Prothrombin time (s)⁹ | Reference: 11–13.5; Fast: <11; Slow: >13.5 | Time taken for the liquid portion of the blood to clot |
| 22 | Alkaline phosphatase (units/L)⁷ | Reference: 44–147; Low: <44; High: >147 | Amount of the enzyme alkaline phosphatase in the blood |
| 23 | C-reactive protein (mg/L)¹⁰ | Reference: <10; High: ≥10 | Amount of this protein in the blood, which increases in inflammatory conditions |
| 24 | Erythrocyte sedimentation rate (mm/hr)¹¹ | Reference: 0–22 (men), 0–29 (women); Abnormal: >22 (men), >29 (women) | Rate at which red blood cells settle at the bottom of a test tube containing a blood specimen |
| 25 | White cell count (cells/μL)¹² | Reference: 4,500–11,000; Low: <4,500; High: >11,000 | White cell count in the blood |
| 26 | Hypersensitive troponin (ng/L)¹³ | Normal: ≤14; Abnormal: >14 | Used to detect heart attack and cardiac insufficiency; a level >14 in the bloodstream indicates heart attack |
1- Milligrams per deciliter. 2- Million cells per microliter. 3- Liters of red cells per liter of blood. 4- Grams per deciliter. 5- Number of cells per microliter. 6- Milliequivalents per liter. 7- Units per liter. 8- Grams per deciliter. 9- Seconds. 10- Milligrams per liter. 11- Millimeters per hour. 12- Cells per microliter. 13- Nanograms per liter.
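The reference ranges above amount to a simple discretization rule: each numeric result is mapped to Low, Reference, or High, with sex-specific limits where the table gives them. The following is a minimal sketch of that mapping for a handful of variables. The dictionary keys, the function name `categorize`, and the overall structure are illustrative assumptions and are not taken from the study's own preprocessing.

```python
from typing import Optional

# A subset of the reference ranges listed in the table above.
# Key names and layout are hypothetical, chosen only for this sketch.
REFERENCE_RANGES = {
    "creatinine_mg_dl": {"men": (0.7, 1.3), "women": (0.6, 1.1)},
    "hemoglobin_g_dl": {"men": (14.0, 17.5), "women": (12.3, 15.3)},
    "sodium_meq_l": {"any": (135.0, 145.0)},
    "potassium_meq_l": {"any": (3.5, 5.2)},
}


def categorize(variable: str, value: float, sex: Optional[str] = None) -> str:
    """Return 'Low', 'Reference', or 'High' for one laboratory value."""
    ranges = REFERENCE_RANGES[variable]
    if "any" in ranges:
        # The same limits apply to all patients.
        low, high = ranges["any"]
    elif sex in ranges:
        low, high = ranges[sex]
    else:
        raise ValueError(f"{variable} needs sex to be 'men' or 'women'")
    if value < low:
        return "Low"
    if value > high:
        return "High"
    return "Reference"


# The same creatinine result falls in different categories by sex.
creatinine_woman = categorize("creatinine_mg_dl", 1.2, "women")
creatinine_man = categorize("creatinine_mg_dl", 1.2, "men")
sodium_result = categorize("sodium_meq_l", 130.0)
```

Keeping the limits in one lookup table rather than in branching code makes it easier to check them against the published ranges.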
Table 2. The most important variables at P < 0.01 using the Chi-squared test.
| No. | Variable name | Variable type | Frequency or mean | Chi-squared statistic | P (level) |
|---|---|---|---|---|---|
| 1 | Length of hospitalization | Numeric | 5.03 | 28.71 | <0.001 |
| 2 | Contusion | Nominal | Have (180); Haven't (302) | 7.97 | <0.01 |
| 3 | Oxygen therapy | Nominal | Have (437); Haven't [ | 7.99 | <0.01 |
| 4 | Dyspnea | Nominal | Have (442); Haven't [ | 7.023 | <0.01 |
| 5 | Loss of taste | Nominal | Have (124); Haven't (358) | 8.722 | <0.01 |
| 6 | Loss of smell | Nominal | Have (137); Haven't (345) | 13.372 | <0.001 |
| 7 | Runny nose | Nominal | Have (202); Haven't (280) | 10.239 | <0.01 |
| 8 | Other underlying diseases | Nominal | Have (339); Haven't (143) | 23.277 | <0.001 |
| 9 | Cardiac diseases | Nominal | Have (157); Haven't (325) | 12.491 | <0.001 |
| 10 | Blood pressure | Nominal | Have (189); Haven't (293) | 13.281 | <0.001 |
| 11 | Diabetes | Nominal | Have (124); Haven't (358) | 10.026 | <0.01 |
| 12 | White cell count | Numeric | 9684 | 196.616 | <0.01 |
| 13 | Absolute lymphocyte count | Numeric | 21.702 | 83.41 | <0.01 |
| 14 | Absolute neutrophil count | Numeric | 76.71 | 97.661 | <0.01 |
| 15 | Blood sodium | Numeric | 138.27 | 40.667 | <0.01 |
| 16 | Blood glucose | Numeric | 148.4 | 12.884 | <0.01 |
| 17 | Activated partial thromboplastin time | Numeric | 35.453 | 117.458 | <0.001 |
| 18 | Hypersensitive troponin | Nominal | Abnormal [; Normal (444) | 14.588 | <0.01 |
| 19 | Age | Numeric | 57.25 | 35.292 | <0.001 |
| 20 | Pleural fluid | Nominal | Have (275); Haven't (78) | 30.583 | <0.001 |
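For a nominal variable, the screening step above can be pictured as a Pearson Chi-squared test on a 2×2 table of variable status against ICU admission. The sketch below computes that statistic without a continuity correction and compares it with the critical value for one degree of freedom at the 0.01 level (6.635). The per-group counts are invented for illustration, since the source reports only the overall frequencies and not the split by ICU status, and the function name is hypothetical.

```python
def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    Rows: variable present / absent. Columns: ICU / non-ICU.
    No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-squared distribution, df = 1, alpha = 0.01.
CRITICAL_VALUE_P01_DF1 = 6.635

# Hypothetical counts, NOT taken from the study.
statistic = chi_squared_2x2(a=40, b=84, c=60, d=298)
is_selected = statistic > CRITICAL_VALUE_P01_DF1
```

A variable would pass this screen when its statistic exceeds the critical value, which is equivalent to P < 0.01 for a table with one degree of freedom.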
Table 3. The most important determinants in predicting ICU hospitalization using odds ratios.
| No | Variable | Wald | df | P-value | Odds ratio | 95% CI for odds ratio (lower) | 95% CI for odds ratio (upper) |
|---|---|---|---|---|---|---|---|
| 1 | Oxygen therapy | 4.007 | 1 | 0.031 | 1.375 | 1.055 | 2.545 |
| 2 | Dyspnea | 3.830 | 1 | 0.036 | 1.335 | 2.032 | 4.523 |
| 3 | Loss of taste | 4.565 | 1 | 0.033 | 1.489 | 1.254 | 1.943 |
| 4 | Loss of smell | 4.726 | 1 | 0.030 | 1.474 | 1.242 | 1.929 |
| 5 | Runny nose | 3.473 | 1 | 0.042 | 1.570 | 1.315 | 2.030 |
| 6 | Other underlying disease | 2.690 | 1 | 0.010 | 1.499 | 1.002 | 1.945 |
| 7 | Cardiac disease | 3.137 | 1 | 0.028 | 2.671 | 1.323 | 3.396 |
| 8 | Blood pressure | 0.179 | 1 | 0.673 | 0.853 | 0.408 | 1.784 |
| 9 | Diabetes | 3.356 | 1 | 0.031 | 2.776 | 1.437 | 3.285 |
| 10 | White-cell count | 0.000 | 1 | 0.092 | 1.000 | 1.000 | 1.000 |
| 11 | Absolute lymphocyte count | 0.075 | 1 | 0.784 | 0.987 | 0.899 | 1.084 |
| 12 | Absolute neutrophil count | 0.878 | 1 | 0.349 | 1.042 | 0.956 | 1.135 |
| 13 | Sodium | 0.816 | 1 | 0.366 | 1.039 | 0.956 | 1.129 |
| 14 | Glucose | 0.885 | 1 | 0.347 | 1.002 | 0.998 | 1.007 |
| 15 | Activated partial thromboplastin time | 4.072 | 1 | 0.017 | 3.004 | 1.977 | 5.031 |
| 16 | Hypersensitive troponin | 5.741 | 1 | 0.117 | 0.016 | 0.001 | 0.471 |
| 17 | Age | 6.380 | 1 | 0.012 | 3.565 | 2.227 | 5.708 |
| 18 | Pleural fluid | 2.285 | 1 | 0.025 | 1.222 | 0.89 | 2.999 |
| 19 | Length of hospitalization | 3.101 | 1 | 0.019 | 2.022 | 1.225 | 3.166 |
| 20 | Contusion | 2.277 | 1 | 0.131 | 0.622 | 0.336 | 1.152 |
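The columns of the table above are linked by the standard logistic regression relations: for a coefficient β with standard error SE, the Wald statistic is (β/SE)², the odds ratio is exp(β), and the 95% confidence interval is exp(β ± 1.96·SE). The sketch below applies those textbook formulas to a made-up coefficient. The source does not report β or SE, so the inputs and the function name are assumptions for illustration only.

```python
import math


def odds_ratio_summary(beta: float, se: float, z: float = 1.96):
    """Return (wald, odds_ratio, ci_lower, ci_upper) for one coefficient.

    beta: logistic regression coefficient (log odds ratio)
    se:   standard error of beta
    z:    normal quantile for the interval (1.96 gives about 95%)
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    wald = (beta / se) ** 2
    odds_ratio = math.exp(beta)
    ci_lower = math.exp(beta - z * se)
    ci_upper = math.exp(beta + z * se)
    return wald, odds_ratio, ci_lower, ci_upper


# Hypothetical coefficient and standard error, NOT taken from the study.
wald, odds_ratio, ci_lower, ci_upper = odds_ratio_summary(beta=0.5, se=0.2)

# An interval that excludes 1 points to an association at roughly the 5% level.
excludes_one = ci_lower > 1.0 or ci_upper < 1.0
```

Under these formulas the odds ratio always lies inside its own interval, and an interval that contains 1 goes together with a P-value above 0.05, which gives a quick consistency check on any row of such a table.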
Fig. 2. Different evaluation criteria of decision tree algorithms.
Fig. 3. AUC of different decision tree algorithms.
Fig. 4. Pruned J-48 decision tree algorithm.
Table 4. Confusion matrix for the external dataset.
| | Predicted ICU admitted | Predicted non-ICU admitted | Total |
|---|---|---|---|
| Real ICU admitted | 53 | 8 | 61 |
| Real non-ICU admitted | 17 | 30 | 47 |
| Total | 70 | 38 | 108 |
Based on Table 4, the predictive model's performance criteria on the external dataset were PPV = 75.7%, NPV = 78.9%, sensitivity = 86.9%, specificity = 63.8%, accuracy = 76.8%, and F-score = 80.9%. The ROC of J-48 for the external dataset is depicted in Fig. 5.
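Each of these criteria follows directly from the four cells of the confusion matrix. As a minimal sketch using the standard definitions (the function name and dictionary keys are illustrative), the counts for the external dataset can be plugged in as follows:

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard binary classification criteria from confusion matrix counts."""
    ppv = tp / (tp + fp)            # positive predictive value (precision)
    npv = tn / (tn + fn)            # negative predictive value
    sensitivity = tp / (tp + fn)    # recall for ICU-admitted patients
    specificity = tn / (tn + fp)    # recall for non-ICU patients
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "ppv": ppv,
        "npv": npv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Counts from the external-dataset confusion matrix:
# 53 real ICU predicted ICU, 8 real ICU predicted non-ICU,
# 17 real non-ICU predicted ICU, 30 real non-ICU predicted non-ICU.
metrics = classification_metrics(tp=53, fn=8, fp=17, tn=30)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Recomputing the criteria from the raw counts in this way is a quick check that the reported percentages and the confusion matrix agree.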
Fig. 5. The ROC of J-48 for the external dataset.