Gang Wu, Shuchang Zhou, Yujin Wang, Wenzhi Lv, Shili Wang, Ting Wang, Xiaoming Li.
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide. Prognostic prediction models for SARS-CoV-2 infection are scarce. We applied machine learning to the laboratory findings of 110 patients with SARS-CoV-2 pneumonia (51 non-survivors and 59 discharged patients). The maximum relevance minimum redundancy (mRMR) algorithm and a least absolute shrinkage and selection operator (LASSO) logistic regression model were used to select laboratory features. Seven laboratory features were selected for the model: prothrombin activity, urea, white blood cell count, interleukin-2 receptor, indirect bilirubin, myoglobin, and fibrinogen degradation products. The signature constructed from these seven features had 98% [93%, 100%] sensitivity and 91% [84%, 99%] specificity in predicting the outcome of SARS-CoV-2 pneumonia. It is therefore feasible to establish an accurate outcome prediction model for SARS-CoV-2 pneumonia based on laboratory findings.
Year: 2020 PMID: 32820210 PMCID: PMC7441177 DOI: 10.1038/s41598-020-71114-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
The fifteen features with the highest mRMR scores were carried forward to the LASSO logistic regression step.
| Rank | Features | mRMR score | Coefficient after LASSO |
|---|---|---|---|
| 1 | PTA | 0.3531 | −0.4148 |
| 2 | WBC | 0.1921 | 0.3214 |
| 3 | Urea | 0.1867 | 0.3954 |
| 4 | IL-2r | 0.1773 | 0.2297 |
| 5 | IB | 0.1249 | 0.0951 |
| 6 | Myoglobin | 0.1119 | 0.0526 |
| 7 | TB | 0.1100 | 0 |
| 8 | FgDP | 0.1073 | 0.0243 |
| 9 | hs-CRP | 0.1025 | 0 |
| 10 | Ferritin | 0.0952 | 0 |
| 11 | LDH | 0.0870 | 0 |
| 12 | D-dimer | 0.0860 | 0 |
| 13 | eGFR | 0.0820 | 0 |
| 14 | Neutrophils | 0.0638 | 0 |
| 15 | Sodium | 0.0626 | 0 |
The coefficients of some candidate features were shrunk to zero; the remaining variables with non-zero coefficients were selected.
mRMR: maximum relevance minimum redundancy; LASSO: least absolute shrinkage and selection operator; PTA: prothrombin activity; WBC: white blood cell; IL-2r: interleukin-2 receptor; IB: indirect bilirubin; TB: total bilirubin; FgDP: fibrinogen degradation products; hs-CRP: hypersensitive C-reactive protein; LDH: lactate dehydrogenase; eGFR: estimated glomerular filtration rate.
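The selection rule described above (keep every feature whose LASSO coefficient is non-zero) can be illustrated with the coefficients from the table. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the intercept is not reported here and is assumed to be 0, and the inputs are assumed to be standardized values, which the source does not state.

```python
# LASSO coefficients copied from the table above, in rank order.
LASSO_COEFFICIENTS = {
    "PTA": -0.4148,
    "WBC": 0.3214,
    "Urea": 0.3954,
    "IL-2r": 0.2297,
    "IB": 0.0951,
    "Myoglobin": 0.0526,
    "TB": 0.0,
    "FgDP": 0.0243,
    "hs-CRP": 0.0,
    "Ferritin": 0.0,
    "LDH": 0.0,
    "D-dimer": 0.0,
    "eGFR": 0.0,
    "Neutrophils": 0.0,
    "Sodium": 0.0,
}


def selected_features(coefficients):
    """Keep only the features whose LASSO coefficient is non-zero."""
    return {name: coef for name, coef in coefficients.items() if coef != 0.0}


def linear_score(standardized_values, coefficients, intercept=0.0):
    """Weighted sum of feature values.

    ASSUMPTIONS: the inputs are already standardized, and the intercept
    (not reported in the source) defaults to 0.0.
    """
    return intercept + sum(
        coef * standardized_values[name] for name, coef in coefficients.items()
    )


selected = selected_features(LASSO_COEFFICIENTS)
# Selected features, ordered by absolute coefficient size.
print(sorted(selected, key=lambda name: abs(selected[name]), reverse=True))

# Hypothetical patient: every selected feature one standard deviation above the mean.
hypothetical_patient = {name: 1.0 for name in selected}
print(round(linear_score(hypothetical_patient, selected), 4))
```

Seven features survive the filter, matching the seven named in the abstract. The score printed for the hypothetical patient is only the sum of the seven coefficients and has no clinical meaning.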
Figure 1. Correlation matrix heatmap of 38 significant features. Spearman’s correlation coefficient was used to compute the relevance and redundancy of the features.
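To make the relevance and redundancy idea concrete, the sketch below ranks features with a greedy mRMR scheme based on Spearman's correlation. It is an illustration under stated assumptions rather than the authors' procedure: the source does not say which mRMR variant was used, so the common "difference" form (relevance minus mean redundancy) is assumed, and the feature names and toy values are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mrmr(features, outcome, k):
    """Greedy mRMR (ASSUMED difference form): relevance minus mean redundancy."""
    relevance = {n: abs(spearman(v, outcome)) for n, v in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = mean(
                abs(spearman(features[name], features[s])) for s in selected
            )
            return relevance[name] - redundancy
        # Sorting makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = non-survivor, 0 = discharged.
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "marker_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "marker_a_scaled": [2, 4, 6, 8, 10, 12, 14, 16],  # fully redundant with marker_a
    "marker_b": [2, 1, 6, 3, 4, 8, 5, 7],
}
print(mrmr(toy_features, outcome, k=2))
```

In this toy example the rescaled copy of the first marker is just as relevant as the original, but it is passed over in the second round because it is completely redundant with the feature already chosen.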
Figure 2. Fivefold cross-validation (A) of the least absolute shrinkage and selection operator algorithm in the feature selection process. A vertical line was drawn at the optimal value. The coefficients of some candidate features were shrunk to zero (B), and the remaining seven variables with non-zero coefficients were finally selected.
Figure 3. Contribution of the features to the model. The histogram shows the contribution of the seven features with non-zero coefficients. The features are plotted on the y-axis, and their coefficients are plotted on the x-axis.
Figure 4. Bar charts of the signature for patients. The red bars indicate the signatures of discharged patients, while the light green bars indicate the signatures of non-survivors. The AUC was 0.997 for the signature.
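As a reminder of what an AUC value summarizes, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher signature than a randomly chosen negative case, with ties counted as one half. The signature values are hypothetical and not taken from the study. Treating non-survivors as the positive class is an assumption, although it agrees with the signs of the coefficients in the table above.

```python
def auc(positive_scores, negative_scores):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical signature values, not taken from the study.
non_survivors = [0.9, 0.8, 0.75, 0.4]
discharged = [0.1, 0.2, 0.3, 0.5]
print(auc(non_survivors, discharged))
```

Here 15 of the 16 possible pairs are ordered correctly; the single misordered pair is the non-survivor with signature 0.4 and the discharged patient with signature 0.5.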
Figure 5. The precision-recall curve for the model. The area under the precision-recall curve was 0.996.
Medians [interquartile range] of laboratory findings in patients with SARS-CoV-2 pneumonia.
| Laboratory finding | Non-survivors | Discharged patients | P value |
|---|---|---|---|
| Leucocyte (10⁹/L) | 11.64 [9.37, 15.61] | 5.22 [4.1, 8.79] | < 0.0001 |
| Platelet (10⁹/L) | 118 [63, 179] | 144.5 [113.5, 227.75] | 0.004 |
| Erythrocyte (10¹²/L) | 3.48 [2.71, 3.89] | 3.76 [3.59, 4.17] | 0.001 |
| Neutrophils (10⁹/L) | 9.44 [7.36, 12.71] | 4.8 [2.45, 7.37] | < 0.0001 |
| Lymphocyte (10⁹/L) | 0.5 [0.32, 0.74] | 0.70 [0.47, 0.93] | < 0.0001 |
| Hemoglobin (g/L) | 115.5 [91, 127] | 120 [112.5, 129] | 0.16 |
| Potassium (mmol/L) | 4.39 [4.11, 5.19] | 4.11 [3.70, 4.76] | 0.032 |
| Calcium (mmol/L) | 2.01 [1.94, 2.10] | 2.06 [2.00, 2.13] | 0.024 |
| Chlorine (mmol/L) | 99.4 [97.2, 105.3] | 98 [95.83, 99.4] | 0.041 |
| Sodium (mmol/L) | 139.55 [135.2, 145.1] | 135.55 [133.4, 137.85] | 0.017 |
| Glucose (mmol/L) | 9.11 [7.62, 13.66] | 7.44 [6.46, 9.48] | 0.019 |
| Total protein (g/L) | 59.9 [56.8, 65.4] | 63.9 [59.8, 67.9] | 0.030 |
| Globulin (g/L) | 37.9 [33.3, 40.7] | 32.2 [29.25, 33.85] | < 0.0001 |
| Albumin (g/L) | 28 [25.08, 30.95] | 32.3 [28.2, 37.2] | 0.009 |
| Creatinine (μmol/L) | 86 [66, 179.5] | 72 [59, 103.5] | 0.008 |
| Uric acid (μmol/L) | 190.5 [114.5, 309] | 188 [148, 362.4] | 0.54 |
| Total bilirubin (μmol/L) | 17.4 [11.5, 20.4] | 11.4 [8.6, 13.4] | < 0.0001 |
| Direct bilirubin (μmol/L) | 9.75 [6.4, 14.4] | 6.8 [5.35, 9.93] | < 0.0001 |
| Indirect bilirubin (μmol/L) | 9.15 [5.45, 11] | 7.6 [5.8, 9.35] | 0.043 |
| Urea (mmol/L) | 14.55 [10, 20.08] | 7.85 [6.98, 10.17] | < 0.0001 |
| Estimated glomerular filtration rate (ml/min/1.73 m²) | 69.3 [41.35, 89.4] | 82 [73.4, 88.6] | 0.008 |
| Glutamic oxaloacetic transaminase (U/L) | 43 [24, 104] | 37 [21, 57] | 0.044 |
| Glutamic-pyruvic transaminase (U/L) | 39 [17, 77.25] | 38 [24.25, 61.75] | 0.70 |
| Myoglobin (ng/mL) | 280.6 [152.15, 736.8] | 67 [24.45, 129.55] | 0.002 |
| High sensitive cardiac troponin I (pg/mL) | 202.2 [68.95, 460.18] | 30.55 [14.5, 47.6] | < 0.0001 |
| MB isoenzyme of creatine kinase (ng/mL) | 5.85 [2.63, 11.93] | 1.2 [0.7, 1.9] | 0.014 |
| Lactate dehydrogenase (U/L) | 490 [358.5, 591] | 306.5 [281.5, 368.25] | < 0.0001 |
| Glutamate dehydrogenase (U/L) | 16.6 [9, 44] | 13.6 [8.23, 25.13] | 0.18 |
| Creatine kinase (U/L) | 180 [43, 503] | 178 [84, 230.5] | 0.24 |
| Prothrombin time (s) | 16.5 [15.3, 19.4] | 14.3 [13.25, 15.83] | < 0.0001 |
| Fibrinogen (g/L) | 5.92 [4.82, 6.3] | 5.22 [4.59, 5.78] | 0.039 |
| Activated partial thromboplastin time (s) | 46.4 [42.5, 56] | 43.5 [40.65, 46.8] | 0.0021 |
| Thrombin time (s) | 17.5 [15.7, 20.6] | 15.95 [15.13, 17.68] | 0.026 |
| D-dimer (μg/mL) | 5.47 [2.73, 12.52] | 2.22 [1.82, 2.92] | < 0.0001 |
| Prothrombin activity | 62% [55%, 75%] | 79% [64%, 91%] | < 0.0001 |
| International standardized ratio | 1.3 [1.22, 1.56] | 1.08 [0.99, 1.21] | < 0.0001 |
| Fibrinogen degradation products (μg/mL) | 29.65 [17.13, 62.65] | 6.9 [3.9, 11.5] | < 0.0001 |
| Procalcitonin (ng/mL) | 0.97 [0.27, 2.58] | 0.16 [0.11, 0.21] | < 0.0001 |
| N-terminal pro-brain natriuretic peptide (pg/mL) | 3375.5 [1491.75, 8102.75] | 963 [522, 1483.5] | < 0.0001 |
| Ferritin (μg/L) | 1064.5 [814.25, 2658.5] | 826.8 [616.75, 1481.5] | 0.037 |
| Hypersensitive C-reactive protein (mg/L) | 142.7 [80.9, 209] | 43.1 [22.3, 127.1] | < 0.0001 |
| Interleukin-1β (pg/mL) | 5.95 [3.65, 10.28] | 5 [3.5, 9.5] | 0.17 |
| Interleukin-2 receptor (U/mL) | 1280.5 [1059.25, 1486.25] | 482 [238, 901] | < 0.0001 |
| Interleukin-6 (pg/mL) | 157.3 [51.62, 227.3] | 75.99 [42.96, 148.80] | < 0.0001 |
| Interleukin-10 (pg/mL) | 11.6 [10.25, 15.6] | 9.9 [5.3, 14.4] | 0.056 |
Features were compared between non-survivors and discharged patients using the Mann–Whitney U test for non-normally distributed features or the independent t test for normally distributed features.
SARS-CoV-2: severe acute respiratory syndrome coronavirus 2.