Elaheh Zadeh Hosseingholi, Saeede Maddahi, Sajjad Jabbari, Ghader Molavi.
Abstract
Background: The coronavirus disease (COVID-19) pandemic has had a great impact on health-care services. Prognosis of disease severity helps reduce mortality by prioritizing the allocation of hospital resources. The main aim of this study is early mortality prediction for this disease using the most important biomarkers.
Keywords: Aspartate aminotransferases; blood urea nitrogen; coronavirus disease-19; machine learning; prognosis
Year: 2022 PMID: 36124024 PMCID: PMC9482375 DOI: 10.4103/abr.abr_178_21
Source DB: PubMed Journal: Adv Biomed Res ISSN: 2277-9175
Blood test features with the measurement units, and their related minimum, maximum, and median values in survived or deceased groups of patients
| Feature (measurement) | Deceased: Minimum | Deceased: Maximum | Deceased: Median | Survived: Minimum | Survived: Maximum | Survived: Median |
|---|---|---|---|---|---|---|
| WBC (cell per microliter) | 3200 | 55,200 | 9700 | 1600 | 30,400 | 6300 |
| HGB (grams per deciliter) | 5.8 | 18.7 | 13.45 | 5.3 | 18.9 | 13.4 |
| MCV (femtoliters) | 77.7 | 99.6 | 87.8 | 59.6 | 133.3 | 85.4 |
| MCH (picograms per cell) | 23.4 | 33.3 | 28.3 | 16.7 | 39.2 | 28.5 |
| PLT (cell per microliter) | 31,600 | 337,000 | 153,500 | 37,000 | 535,000 | 192,000 |
| LYM (cell count percent) | 3.7 | 68.7 | 12.4 | 3.8 | 58 | 16.06 |
| NEUT (cell count percent) | 54.2 | 94.7 | 81.34 | 38.4 | 95.3 | 74.3 |
| RDW-SD (femtoliters) | 17 | 64.7 | 46.63 | 16.09 | 81.2 | 44.5 |
| RDW-CV (percent) | 12.5 | 20.4 | 14.25 | 11.08 | 22.7 | 13.1 |
| ESR (millimeter per hour) | 2 | 78 | 34 | 1 | 112 | 37 |
| BUN (milligrams per deciliter) | 8.41 | 211.4 | 45.5 | 5.13 | 110.28 | 18.22 |
| Cr (milligrams per deciliter) | 0.9 | 6.74 | 2.09 | 0.7 | 6 | 1.3 |
| CRP (milligrams per deciliter) | 0.3 | 30 | 16 | 0.4 | 28 | 15 |
| PTT (s) | 28 | 65 | 39.5 | 28 | 54 | 37.5 |
| PT (s) | 13 | 32.5 | 14 | 13 | 24.7 | 13 |
| Na (millimoles per liter) | 125 | 168 | 133.7 | 123.7 | 149.6 | 135 |
| K (millimoles per liter) | 3.2 | 7.1 | 4.32 | 3 | 5.2 | 3.87 |
| AST (units per liter) | 18 | 281 | 54.1 | 4 | 188 | 26 |
| ALT (units per liter) | 12 | 384 | 30.4 | 7 | 126 | 23 |
| ALP (units per liter) | 138 | 445 | 247 | 122 | 912 | 237 |
WBC: White blood cell, HGB: Hemoglobin, MCV: Mean corpuscular volume, MCH: Mean corpuscular hemoglobin, PLT: Platelets, LYM: Lymphocyte’s count, NEUT: Neutrophil’s count, RDW-SD: Red cell distribution width-standard deviation, RDW-CV: Red cell distribution width-coefficient of variation, ESR: Erythrocyte sedimentation rate, BUN: Blood urea nitrogen, Cr: Creatinine, CRP: C-reactive protein, PTT: Partial thromboplastin time, PT: Prothrombin time, Na: Sodium, K: Potassium, AST: Aspartate aminotransferase, ALT: Alanine aminotransferase, ALP: Alkaline phosphatase
Performance results of machine learning algorithms using all features
| Method | MCC | Accuracy | Specificity | Sensitivity | F1-score |
|---|---|---|---|---|---|
| RF | 0.514* | 0.887* | 1* | 0.3 | 0.461 |
| LDA | 0.449 | 0.8709 | 0.96 | 0.4* | 0.500* |
| SVM | 0.416 | 0.8709 | 1* | 0.2 | 0.330 |
*The top results for each score. MCC: Matthews correlation coefficient, RF: Random forests, LDA: Linear discriminant analysis, SVM: Support vector machines
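The five scores in the table above are all functions of a binary confusion matrix. The sketch below shows how they relate to one another. The confusion-matrix counts are not reported in this record; the values used (62 test patients, 10 of them deceased) are an assumption, chosen only because they are consistent with the RF row to within rounding. The function name `binary_metrics` is hypothetical.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Compute common binary-classification scores from confusion-matrix counts.

    "Positive" here means the deceased class.
    """
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and sensitivity.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return {
        "mcc": mcc,
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# ASSUMED counts (not given in the source): 3 deceased patients caught,
# 7 missed, all 52 survivors classified correctly.
rf = binary_metrics(tp=3, fn=7, tn=52, fp=0)
print({name: round(value, 3) for name, value in rf.items()})
```

With these assumed counts, a specificity of 1 and an accuracy near 0.89 coexist with a sensitivity of only 0.3, which is why MCC and F1-score are more informative than accuracy when the deceased class is small.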
Figure 1. Random forests feature selection. The mean square error decrease (left) and Gini impurity decrease (right) for each feature removal
Aggregate ranking of all features
| Final rank | Feature | MSE decrease in accuracy (%) | MSE decrease rank | Gini impurity | Gini impurity rank | Statistics result | Statistics rank |
|---|---|---|---|---|---|---|---|
| 1 | AST | 6.84 | 1 | 3.12 | 2 | 0.00000044 | 1 |
| 2 | BUN | 5.98 | 2 | 3.92 | 1 | 0.0000016 | 2 |
| 3 | Cr | 3.57 | 3 | 2.16 | 3 | 0.000018 | 3 |
| 4 | K | 2.67 | 4 | 1.78 | 5 | 0.000316 | 4 |
| 5 | PTT | 2.62 | 5 | 1.97 | 4 | 0.11 | 15 |
| 6 | RDW-SD | −0.93 | 17 | 1.77 | 6 | 0.00071 | 6 |
| 7 | PT | 1.67 | 7 | 1.33 | 9 | 0.01 | 10 |
| 8 | PLT | 1.07 | 9 | 1.38 | 8 | 0.03 | 12 |
| 9 | RDW-CV | 0.9 | 11 | 0.99 | 17 | 0.00046 | 5 |
| 10 | NEUT | −1.37 | 21 | 1.61 | 7 | 0.0044 | 8 |
| 11 | LYM | 0.21 | 13 | 1.29 | 10 | 0.079 | 13 |
| 12 | HGB | 1.59 | 8 | 1.28 | 11 | 0.44 | 20 |
| 13 | WBC | −0.12 | 15 | 1.19 | 15 | 0.0047 | 9 |
| 14 | ESR | 1.06 | 10 | 1.28 | 12 | 0.14 | 17 |
| 15 | CRP | 2.38 | 6 | 0.82 | 19 | 0.374 | 19 |
| 16 | Age | −1.3 | 20 | 1.18 | 16 | 0.0034 | 7 |
| 17 | ALT | −0.63 | 16 | 1.22 | 14 | 0.022 | 11 |
| 18 | Na | 0.04 | 14 | 1.24 | 13 | 0.1171 | 16 |
| 19 | MCV | −1.11 | 18 | 0.82 | 18 | 0.098 | 14 |
| 20 | ALP | 0.44 | 12 | 0.8 | 21 | 0.25 | 18 |
| 21 | MCH | −1.2 | 19 | 0.8 | 20 | 0.7317 | 21 |
| 22 | Gender | −1.79 | 22 | 0.13 | 22 | 0.84 | 22 |
MSE: Mean square error, AST: Aspartate aminotransferase, BUN: Blood urea nitrogen, Cr: Creatinine, K: Potassium, PTT: Partial thromboplastin time, RDW-CV: Red cell distribution width-coefficient of variation, PT: Prothrombin time, PLT: Platelets, RDW-SD: Red cell distribution width-standard deviation, NEUT: Neutrophil’s count, LYM: Lymphocyte’s count, HGB: Hemoglobin, WBC: White blood cell, ESR: Erythrocyte sedimentation rate, CRP: C-reactive protein, ALT: Alanine aminotransferase, Na: Sodium, MCV: Mean corpuscular volume, ALP: Alkaline phosphatase, MCH: Mean corpuscular hemoglobin
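The per-criterion rank columns in the table above can be recovered by sorting each score column. The following is a minimal sketch for the "MSE decrease rank" column, using the values copied from the table and ranking the largest decrease first. How the three per-criterion ranks are combined into the final rank is not stated in this record, so that step is not reproduced; the variable names are illustrative.

```python
# MSE decrease in accuracy (%) per feature, copied from the table above.
mse_decrease = {
    "AST": 6.84, "BUN": 5.98, "Cr": 3.57, "K": 2.67, "PTT": 2.62,
    "RDW-SD": -0.93, "PT": 1.67, "PLT": 1.07, "RDW-CV": 0.9,
    "NEUT": -1.37, "LYM": 0.21, "HGB": 1.59, "WBC": -0.12, "ESR": 1.06,
    "CRP": 2.38, "Age": -1.3, "ALT": -0.63, "Na": 0.04, "MCV": -1.11,
    "ALP": 0.44, "MCH": -1.2, "Gender": -1.79,
}

# Sort features from the largest to the smallest decrease, then number them.
ordered = sorted(mse_decrease, key=mse_decrease.get, reverse=True)
mse_rank = {feature: position for position, feature in enumerate(ordered, start=1)}

for feature in ordered[:5]:
    print(feature, mse_rank[feature])
```

Negative values mean that removing the feature did not hurt accuracy in this measurement, so those features fall to the bottom of this ranking even when another criterion rates them highly (RDW-SD and NEUT, for example).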
Figure 2. Distribution of age group and survival outcome
Figure 3. The distribution of the significant blood biomarker data among healthy survived and deceased groups
Performance of machine learning algorithms
| Method | MCC | Accuracy | Specificity | Sensitivity | F1-score |
|---|---|---|---|---|---|
| Decision tree | 0.53 | 0.88 | 0.96 | 0.5 | 0.58 |
| One rule | 0.44 | 0.87 | 0.96 | 0.4 | 0.5 |
MCC: Matthews correlation coefficient