Louis Ehwerhemuepha, Sidy Danioko, Shiva Verma, Rachel Marano, William Feaster, Sharief Taraman, Tatiana Moreno, Jianwei Zheng, Ehsan Yaghmaei, Anthony Chang.
Abstract
BACKGROUND: Cardiovascular and other circulatory system diseases have been implicated in the severity of COVID-19 in adults. This study provides a super learner ensemble of models for predicting COVID-19 severity among these patients.
Keywords: COVID-19; COVID-19 severity; Cardiovascular conditions; Ensemble learning; Predicting COVID-19 severity; Super learning
Year: 2021 PMID: 33748802 PMCID: PMC7963518 DOI: 10.1016/j.ibmed.2021.100030
Source DB: PubMed Journal: Intell Based Med ISSN: 2666-5212
Summary statistics on all variables.
| Variable | Level | Mild COVID-19, n (%) | Severe COVID-19, n (%) | Unadjusted p value |
|---|---|---|---|---|
| Sex | Female | 9668 (50.30) | 1625 (41.59) | < 0.001 |
| | Male | 8342 (43.40) | 1865 (47.73) | |
| | Unknown | 1212 (6.31) | 417 (10.67) | |
| Age | Young Adults (18 to 35 yrs) | 1344 (6.99) | 95 (2.43) | |
| | Middle-Aged Adults (36 to 55 yrs) | 4776 (24.85) | 529 (13.54) | |
| | Older Adults (>55 yrs) | 13102 (68.16) | 3283 (84.03) | |
| Race | White | 12341 (64.20) | 2254 (57.69) | |
| | Black or African American | 4052 (21.08) | 956 (24.47) | |
| | Asian or Pacific Islander | 446 (2.32) | 111 (2.84) | |
| | American Indian or Alaska Native | 264 (1.37) | 51 (1.31) | |
| | Other racial group | 1621 (8.43) | 420 (10.75) | |
| | Unknown racial group | 498 (2.59) | 115 (2.94) | |
| Payer | Governmental Insurance | 9445 (49.14) | 2289 (58.59) | |
| | Private Insurance | 6218 (32.35) | 716 (18.33) | |
| | Self-pay | 626 (3.26) | 41 (1.05) | |
| | Unknown | 2933 (15.26) | 861 (22.04) | |
| Temperature | Normal | 11898 (61.90) | 2392 (61.22) | < 0.001 |
| | High | 2440 (12.69) | 886 (22.68) | |
| | Low | 117 (0.61) | 126 (3.22) | |
| | Unknown | 4767 (24.80) | 503 (12.87) | |
| Heart rate | Normal | 8396 (43.68) | 2000 (51.19) | |
| | High | 2743 (14.27) | 1246 (31.89) | |
| | Low | 470 (2.45) | 121 (3.10) | |
| | Unknown | 7613 (39.61) | 540 (13.82) | |
| Respiratory rate | Normal | 12864 (66.92) | 1556 (39.83) | |
| | High | 3332 (17.33) | 1903 (48.71) | |
| | Low | 27 (0.14) | 37 (0.95) | |
| | Unknown | 2999 (15.60) | 411 (10.52) | |
| Systolic blood pressure | Normal | 4456 (23.18) | 954 (24.42) | |
| | High | 9926 (51.64) | 1753 (44.87) | |
| | Low | 1910 (9.94) | 768 (19.66) | |
| | Unknown | 2930 (15.24) | 432 (11.06) | |
| Diastolic blood pressure | Normal | 4642 (24.15) | 770 (19.71) | |
| | High | 4927 (25.63) | 808 (20.68) | |
| | Low | 6722 (34.97) | 1897 (48.55) | |
| | Unknown | 2931 (15.25) | 432 (11.06) | |
| Oxygen saturation | 95–100% | 13219 (68.77) | 1819 (46.56) | |
| | 90–94% | 2516 (13.09) | 864 (22.11) | |
| | < 90% | 699 (3.64) | 820 (20.99) | |
| | Unknown | 2788 (14.50) | 404 (10.34) | |
| Hypertensive heart diseases (I10–I16) | No | 2774 (14.43) | 426 (10.90) | < 0.001 |
| | Yes | 16448 (85.57) | 3481 (89.10) | |
| Ischemic heart diseases (I20–I25) | No | 13742 (71.49) | 2438 (62.40) | |
| | Yes | 5480 (28.51) | 1469 (37.60) | |
| Pulmonary heart diseases (I26–I27) | No | 17676 (91.96) | 3448 (88.25) | |
| | Yes | 1546 (8.04) | 459 (11.75) | |
| Pericarditis (I30–I32) | No | 18851 (98.07) | 3833 (98.11) | 0.932 |
| | Yes | 371 (1.93) | 74 (1.89) | |
| Endocarditis and heart valve disorders (I33–I39) | No | 17343 (90.22) | 3412 (87.33) | < 0.001 |
| | Yes | 1879 (9.78) | 495 (12.67) | |
| Cardiomyopathy (I42–I43) | No | 18043 (93.87) | 3536 (90.50) | |
| | Yes | 1179 (6.13) | 371 (9.50) | |
| Atrioventricular and other conduction disorders (I44–I45) | No | 17593 (91.53) | 3478 (89.02) | |
| | Yes | 1629 (8.47) | 429 (10.98) | |
| Cardiac arrest (I46) | No | 19143 (99.59) | 3880 (99.31) | 0.026 |
| | Yes | 79 (0.41) | 27 (0.69) | |
| Arrhythmias (I47–I49) | No | 15167 (78.90) | 2776 (71.05) | < 0.001 |
| | Yes | 4055 (21.10) | 1131 (28.95) | |
| Heart failure (I50) | No | 15599 (81.15) | 2624 (67.16) | |
| | Yes | 3623 (18.85) | 1283 (32.84) | |
| Cerebrovascular disorders (I60–I69) | No | 16782 (87.31) | 3234 (82.77) | |
| | Yes | 2440 (12.69) | 673 (17.23) | |
| Disorders of the arteries, arterioles, and capillaries (I70) | No | 16244 (84.51) | 3093 (79.17) | |
| | Yes | 2978 (15.49) | 814 (20.83) | |
| Disorders of the veins and lymphatic vessels/nodes (I80) | No | 16675 (86.75) | 3316 (84.87) | 0.002 |
| | Yes | 2547 (13.25) | 591 (15.13) | |
| Hypotension (I95) | No | 17085 (88.88) | 3359 (85.97) | < 0.001 |
| | Yes | 2137 (11.12) | 548 (14.03) | |
| Infectious and parasitic diseases (A00–B99) | No | 12836 (66.78) | 2428 (62.14) | < 0.001 |
| | Yes | 6386 (33.22) | 1479 (37.86) | |
| Malignant neoplasms (C00–C96) | No | 17086 (88.89) | 3390 (86.77) | |
| | Yes | 2136 (11.11) | 517 (13.23) | |
| Endocrine, nutritional, and metabolic diseases (E00–E89) | No | 3308 (17.21) | 440 (11.26) | |
| | Yes | 15914 (82.79) | 3467 (88.74) | |
| Mental, behavioral, and neurodevelopmental disorders (F01–F99) | No | 9645 (50.18) | 1833 (46.92) | |
| | Yes | 9577 (49.82) | 2074 (53.08) | |
| Diseases of the nervous system (G00–G99) | No | 10163 (52.87) | 1821 (46.61) | |
| | Yes | 9059 (47.13) | 2086 (53.39) | |
| Diseases of the respiratory system (J00–J99) | No | 8405 (43.73) | 1607 (41.13) | 0.003 |
| | Yes | 10817 (56.27) | 2300 (58.87) | |
| Diseases of the digestive system (K00–K95) | No | 8202 (42.67) | 1669 (42.72) | 0.970 |
| | Yes | 11020 (57.33) | 2238 (57.28) | |
| Diseases of the skin and subcutaneous tissue (L00–L99) | No | 13573 (70.61) | 2683 (68.67) | 0.016 |
| | Yes | 5649 (29.39) | 1224 (31.33) | |
| Diseases of the musculoskeletal system and connective tissue (M00–M99) | No | 6390 (33.24) | 1408 (36.04) | < 0.001 |
| | Yes | 12832 (66.76) | 2499 (63.96) | |
| Diseases of the genitourinary system (N00–N99) | No | 8091 (42.09) | 1350 (34.55) | |
| | Yes | 11131 (57.91) | 2557 (65.45) | |
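
The record does not say which test produced the unadjusted p values. As a minimal sketch, assuming a Pearson chi-square test with Yates continuity correction on each variable's contingency table, the pericarditis row can be recomputed from its four counts. The function name `yates_chi_square_2x2` is hypothetical, and the closed-form p value used here holds only for a 2x2 table (one degree of freedom).

```python
import math

def yates_chi_square_2x2(a, b, c, d):
    """Return (chi2, p) for the 2x2 table [[a, b], [c, d]] with Yates correction.

    Assumption: this is an illustrative choice of test, not one stated in the source.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Yates correction: shrink |observed - expected| by 0.5, floored at 0
            diff = max(abs(observed[i][j] - expected) - 0.5, 0.0)
            chi2 += diff * diff / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Pericarditis (I30–I32) counts from the table: rows = No/Yes, columns = Mild/Severe
chi2, p_value = yates_chi_square_2x2(18851, 3833, 371, 74)
print(f"chi2 = {chi2:.4f}, p = {p_value:.3f}")
```

Under this assumption the result lands close to the tabulated 0.932. Variables with more than two levels (for example Race) would need the general chi-square distribution rather than this one-degree-of-freedom shortcut.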
Fig. 1. Visual description of super learning by van der Laan and Rose (2011, 2018).
Cross-validated (training) AUROC.
| Algorithm | Average AUROC | Minimum AUROC | Maximum AUROC |
|---|---|---|---|
| Super Learner | 0.8006 | 0.7814 | 0.8163 |
| LASSO regression | 0.7964 | 0.7759 | 0.8143 |
| Extreme gradient boosting, max. tree depth of 2 (all variables) | 0.7961 | 0.7774 | 0.8136 |
| Logistic regression (all variables) | 0.7958 | 0.7755 | 0.8137 |
| Logistic regression (forward variable selection) | 0.7957 | 0.7764 | 0.8147 |
| Extreme gradient boosting, max. tree depth of 2 (LASSO variable selection) | 0.7956 | 0.7746 | 0.8131 |
| Linear discriminant analysis (LASSO variable selection) | 0.7948 | 0.7718 | 0.8110 |
| Linear discriminant analysis (all variables) | 0.7947 | 0.7713 | 0.8107 |
| Multivariate adaptive regression splines | 0.7906 | 0.7733 | 0.8105 |
| Random forest (all variables) | 0.7869 | 0.7761 | 0.7981 |
| Random forest (LASSO variable selection) | 0.7845 | 0.7709 | 0.7974 |
| Extreme gradient boosting, max. tree depth of 4 (LASSO variable selection) | 0.7817 | 0.7680 | 0.7963 |
| Extreme gradient boosting, max. tree depth of 4 (all variables) | 0.7804 | 0.7708 | 0.7963 |
| Extreme gradient boosting, max. tree depth of 6 (all variables) | 0.7668 | 0.7488 | 0.7820 |
| Extreme gradient boosting, max. tree depth of 6 (LASSO variable selection) | 0.7663 | 0.7581 | 0.7787 |
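
As an illustration of how the summary columns above could be produced, the sketch below computes AUROC as the probability that a randomly chosen severe case receives a higher score than a randomly chosen mild case, then reports the average, minimum, and maximum across folds. It assumes the three columns summarize per-fold values, which the record does not state. The folds and scores are hypothetical toy data, not study data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical held-out folds: (true labels, predicted probabilities of severe disease)
folds = [
    ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    ([0, 1, 0, 1], [0.20, 0.70, 0.30, 0.90]),
    ([1, 0, 0, 1], [0.60, 0.50, 0.65, 0.55]),
]
fold_aurocs = [auroc(y, s) for y, s in folds]
summary = {
    "average": sum(fold_aurocs) / len(fold_aurocs),
    "minimum": min(fold_aurocs),
    "maximum": max(fold_aurocs),
}
print(summary)
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example. A rank-based formulation would be preferable for a cohort of the size reported here.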
Super learner weights (on base learners).
| Base learner | Mean super learner weight | SD |
|---|---|---|
| Multivariate adaptive regression splines | 0.203 | 0.048 |
| Extreme gradient boosting, max. tree depth of 2 (all variables) | 0.145 | 0.036 |
| Extreme gradient boosting, max. tree depth of 2 (LASSO variable selection) | 0.131 | 0.030 |
| Linear discriminant analysis (LASSO variable selection) | 0.110 | 0.020 |
| Linear discriminant analysis (all variables) | 0.106 | 0.018 |
| Random forest (LASSO variable selection) | 0.070 | 0.056 |
| Random forest (all variables) | 0.065 | 0.050 |
| Logistic regression (forward variable selection) | 0.054 | 0.035 |
| LASSO regression | 0.031 | 0.022 |
| Extreme gradient boosting, max. tree depth of 6 (all variables) | 0.024 | 0.018 |
| Extreme gradient boosting, max. tree depth of 4 (LASSO variable selection) | 0.023 | 0.030 |
| Logistic regression (all variables) | 0.019 | 0.017 |
| Extreme gradient boosting, max. tree depth of 6 (LASSO variable selection) | 0.012 | 0.020 |
| Extreme gradient boosting, max. tree depth of 4 (LASSO variable selection) | 0.007 | 0.015 |
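
The mean weights in the table sum to 1.000, which is consistent with a meta-learner that forms a convex combination of base-learner predictions. The sketch below shows that combination for a single patient, assuming this form of meta-learner (the record does not state it). The weights are copied from the table in table order with abbreviated labels, including the depth-4 label that the table lists twice. The base-learner probabilities are hypothetical.

```python
# Mean weights from the table, in table order (labels abbreviated)
mean_weights = [
    ("MARS", 0.203),
    ("XGB depth 2 (all variables)", 0.145),
    ("XGB depth 2 (LASSO selection)", 0.131),
    ("LDA (LASSO selection)", 0.110),
    ("LDA (all variables)", 0.106),
    ("Random forest (LASSO selection)", 0.070),
    ("Random forest (all variables)", 0.065),
    ("Logistic regression (forward selection)", 0.054),
    ("LASSO regression", 0.031),
    ("XGB depth 6 (all variables)", 0.024),
    ("XGB depth 4 (LASSO selection)", 0.023),
    ("Logistic regression (all variables)", 0.019),
    ("XGB depth 6 (LASSO selection)", 0.012),
    ("XGB depth 4 (LASSO selection)", 0.007),  # label repeated as in the table
]

# Hypothetical predicted probabilities of severe COVID-19 for one patient,
# one value per base learner, in the same order as mean_weights
base_predictions = [0.31, 0.28, 0.27, 0.35, 0.34, 0.22, 0.24,
                    0.30, 0.29, 0.18, 0.21, 0.30, 0.17, 0.20]

total_weight = sum(w for _, w in mean_weights)
# Weighted average; dividing by the total guards against rounding in the table
ensemble_prediction = sum(
    w * p for (_, w), p in zip(mean_weights, base_predictions)
) / total_weight

print(f"total weight = {total_weight:.3f}")
print(f"ensemble prediction = {ensemble_prediction:.4f}")
```

Because the weights are non-negative and sum to one, the combined prediction always lies between the smallest and largest base-learner prediction.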
Fig. 2. The precision-recall curve for the Super Learner model.
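
For readers unfamiliar with precision-recall curves, the sketch below computes the points of such a curve by sweeping a threshold over predicted probabilities. The labels and scores are hypothetical toy values. The only study-derived quantity is the share of severe cases, taken from the column totals of the summary table (3907 severe, 19222 mild), which is the precision a no-skill classifier would achieve.

```python
def precision_recall_points(labels, scores):
    """Return (threshold, precision, recall) for each distinct score, highest first."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("recall is undefined without positive cases")
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive
        flagged = [y for y, s in zip(labels, scores) if s >= threshold]
        true_positives = sum(flagged)
        precision = true_positives / len(flagged)
        recall = true_positives / total_positives
        points.append((threshold, precision, recall))
    return points

# Hypothetical labels (1 = severe) and predicted probabilities
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
points = precision_recall_points(labels, scores)
for threshold, precision, recall in points:
    print(f"threshold={threshold:.2f} precision={precision:.3f} recall={recall:.3f}")

# No-skill baseline: prevalence of severe cases in the cohort (counts from the table)
baseline_precision = 3907 / (19222 + 3907)
print(f"baseline precision = {baseline_precision:.4f}")
```

With roughly one severe case in six, a precision-recall curve is more sensitive to performance on the minority class than AUROC alone.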