Angier Allen, Samson Mataraso, Anna Siefkas, Hoyt Burdick, Gregory Braden, R Phillip Dellinger, Andrea McCoy, Emily Pellegrini, Jana Hoffman, Abigail Green-Saxena, Gina Barnes, Jacob Calvert, Ritankar Das.
Abstract
BACKGROUND: Racial disparities in health care are well documented in the United States. As machine learning methods become more common in health care settings, it is important to ensure that these methods do not contribute to racial disparities through biased predictions or differential accuracy across racial groups.
Keywords: health disparities; machine learning; mortality; prediction; racial disparities
Year: 2020 PMID: 33090117 PMCID: PMC7644374 DOI: 10.2196/22400
Source DB: PubMed Journal: JMIR Public Health Surveill ISSN: 2369-2960
Figure 1. Attrition diagram for patient inclusion.
Demographic and medical information history for the Medical Information Mart for Intensive Care–III study sample by discharge status.
| Characteristic | Full sample, living (n=19,269) | Full sample, deceased (n=9191) | White subset, living (n=15,394) | White subset, deceased (n=7869) | Nonwhite subset, living (n=3875) | Nonwhite subset, deceased (n=1322) |
| Female, n (%) | 8129 (42.19) | 4269 (46.45) | 6313 (41.01) | 3672 (46.66) | 1816 (46.86) | 597 (45.16) |
| Age, mean (SD) | 60.11 (17.4) | 71.31 (14.7) | 61.4 (17.1) | 71.91 (14.4) | 54.99 (17.7) | 67.73 (15.5) |
| Cardiovascular, n (%) | 15,869 (82.36) | 8085 (87.97) | 12,790 (83.08) | 6928 (88.04) | 3079 (79.46) | 1157 (87.52) |
| Renal, n (%) | 5778 (29.99) | 3867 (42.07) | 4376 (28.43) | 3243 (41.21) | 1402 (36.18) | 624 (47.20) |
| Diabetes, types 1 and 2, n (%) | 3843 (19.94) | 1854 (20.17) | 2903 (18.86) | 1517 (19.28) | 940 (24.26) | 337 (25.49) |
| COPDa, n (%) | 1626 (8.44) | 1139 (12.39) | 1428 (9.28) | 1032 (13.11) | 198 (5.11) | 107 (8.09) |
| Sepsis, n (%) | 729 (3.78) | 321 (3.49) | 534 (3.47) | 269 (3.42) | 195 (5.03) | 52 (3.93) |
| Severe sepsis, n (%) | 3877 (20.12) | 2517 (27.39) | 2944 (19.12) | 2123 (26.98) | 933 (24.08) | 394 (29.80) |
| Septic shock, n (%) | 1823 (9.46) | 1271 (13.83) | 1432 (9.30) | 1070 (13.60) | 391 (10.09) | 201 (15.20) |
| Mental health disorder, n (%) | 7351 (38.15) | 2994 (32.58) | 5882 (38.21) | 2563 (32.57) | 1469 (37.91) | 431 (32.60) |
| Pneumonia, n (%) | 3265 (16.94) | 2186 (23.78) | 2553 (16.58) | 1864 (23.69) | 712 (18.37) | 322 (24.36) |
| Liverb, n (%) | 1602 (8.31) | 1020 (11.10) | 1201 (7.80) | 836 (10.62) | 401 (10.35) | 184 (13.92) |
| Cancer, n (%) | 2941 (15.26) | 2766 (30.09) | 2297 (14.92) | 2335 (29.67) | 644 (16.62) | 431 (32.60) |
| HIV/AIDS, n (%) | 201 (1.04) | 102 (1.11) | 115 (0.75) | 62 (0.79) | 86 (2.22) | 40 (3.03) |
aCOPD: chronic obstructive pulmonary disease.
bAcute and subacute necrosis of liver, chronic liver disease and cirrhosis, liver abscess and sequelae of chronic liver disease, and other disorders of liver.
Performance metrics for the machine learning algorithm and all comparator scores for mortality prediction on the total study population.
| Statistics | MLAa | MEWSb | APACHEc | SAPS-IId |
| AUROCe | 0.780 | 0.580 | 0.700 | 0.660 |
| Sensitivity | 0.751 | 0.523 | 0.678 | 0.674 |
| Specificity | 0.656 | 0.577 | 0.596 | 0.511 |
| DORf | 5.739 | 1.499 | 3.106 | 2.157 |
| LR+g | 2.181 | 1.238 | 1.678 | 1.378 |
| LR–h | 0.380 | 0.826 | 0.540 | 0.639 |
aMLA: machine learning algorithm.
bMEWS: Modified Early Warning Score.
cAPACHE: Acute Physiologic Assessment and Chronic Health Evaluation.
dSAPS-II: Simplified Acute Physiology Score II.
eAUROC: area under the receiver operating characteristic curve.
fDOR: diagnostic odds ratio.
gLR+: positive likelihood ratio.
hLR–: negative likelihood ratio.
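The likelihood ratios and diagnostic odds ratio in the table can be related back to the sensitivity and specificity rows using their standard textbook definitions. The following is a minimal sketch, assuming the tabulated values were derived with these standard formulas (the record itself does not state how they were computed); the function name `likelihood_metrics` is hypothetical and used only for illustration.

```python
def likelihood_metrics(sensitivity: float, specificity: float) -> dict:
    """Derive LR+, LR- and DOR from sensitivity and specificity.

    Standard definitions (assumed here, not confirmed by the source):
      LR+ = sensitivity / (1 - specificity)
      LR- = (1 - sensitivity) / specificity
      DOR = LR+ / LR-
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg
    return {"LR+": lr_pos, "LR-": lr_neg, "DOR": dor}


# Sensitivity and specificity taken from the MLA column of the table.
mla = likelihood_metrics(sensitivity=0.751, specificity=0.656)
for name, value in mla.items():
    print(f"{name}: {value:.3f}")
```

Because the inputs are themselves rounded to three decimals, the recomputed values agree with the tabulated LR+, LR– and DOR only to within rounding error.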
Figure 2. Comparison of feature importance between (A) models trained with and without preprocessing of the training data and (B) white and nonwhite subgroups on the model trained with preprocessing of the training data.