Andreas Mitterecker, Axel Hofmann, Kevin M Trentino, Adam Lloyd, Michael F Leahy, Karin Schwarzbauer, Thomas Tschoellitsch, Carl Böck, Sepp Hochreiter, Jens Meier.
Abstract
BACKGROUND: The ability to predict transfusions arising during hospital admission might enable economized blood supply management and might furthermore increase patient safety by ensuring a sufficient stock of red blood cells (RBCs) for a specific patient. We therefore investigated the precision of four different machine learning-based prediction algorithms to predict transfusion, massive transfusion, and the number of transfusions in patients admitted to a hospital.
STUDY DESIGN AND METHODS: This was a retrospective, observational study in three adult tertiary care hospitals in Western Australia between January 2008 and June 2017. Primary outcome measures for the classification tasks were the area under the curve for the receiver operating characteristics curve, the F1 score, and the average precision of the four machine learning algorithms used: neural networks (NNs), logistic regression (LR), random forests (RFs), and gradient boosting (GB) trees.
Year: 2020 PMID: 32596877 PMCID: PMC7540018 DOI: 10.1111/trf.15935
Source DB: PubMed Journal: Transfusion ISSN: 0041-1132 Impact factor: 3.157
Demographic data
| Variable | All patients N = 206 271 (100%) | Had no RBC transfusion n = 180 615 (87.6%) | Had RBC transfusion n = 25 656 (12.4%) | Had transfusion but no massive transfusion n = 24 688 (12.0%) | Had massive transfusion n = 968 (0.4%) |
|---|---|---|---|---|---|
| Patients, n (%) | |||||
| Hospital 1 | 60 246 (29.2) | 53 085 (29.4) | 7161 (27.9) | 6943 (28.1) | 218 (22.5) |
| Hospital 2 | 29 832 (14.5) | 25 986 (14.4) | 3846 (15.0) | 3752 (15.2) | 94 (9.7) |
| Hospital 3 | 116 193 (56.3) | 101 544 (56.2) | 14 649 (57.1) | 13 993 (56.7) | 656 (67.8) |
| Specialty, n (%) | |||||
| General surgery | 47 471 (23.0) | 43 411 (24.0) | 4060 (15.8) | 3737 (15.1) | 323 (33.4) |
| General medicine | 37 510 (18.2) | 32 286 (17.9) | 5224 (20.4) | 5109 (20.7) | 115 (11.9) |
| Orthopedics | 29 449 (14.3) | 25 348 (14.0) | 4101 (16.0) | 4054 (16.4) | 47 (4.9) |
| Cardiology | 27 699 (13.4) | 26 159 (14.5) | 1540 (6.0) | 1445 (5.9) | 95 (9.8) |
| Other | 64 142 (31.1) | 53 411 (29.6) | 10 731 (41.8) | 10 343 (41.9) | 388 (40.1) |
| Age, y, median (range) | 65 (48‐78) | 65 (47‐78) | 70 (56‐81) | 71 (56‐81) | 60 (43‐76) |
| Sex, n (%) | |||||
| Female | 90 213 (43.7) | 78 625 (43.5) | 11 588 (45.2) | 11 317 (45.8) | 271 (28.0) |
| Male | 116 058 (56.3) | 101 990 (56.5) | 14 068 (54.8) | 13 371 (54.2) | 697 (72.0) |
| Charlson comorbidity index | 0 (0/1) | 0 (0/1) | 1 (0/3) | 1 (0/3) | 1 (0/3) |
| Length of stay, d, median (range) | 4 (3‐8) | 4 (2‐7) | 9 (2‐19) | 9 (5‐19) | 19 (9‐35) |
| Length of stay ICU, h, median (range) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 90 (24‐216) |
| Anemia at admission | |||||
| None | 120 425 | 116 193 | 4232 | 3963 | 269 |
| Mild | 39 871 | 36 066 | 3805 | 3598 | 339 |
| Severe | 7354 | 862 | 6492 | 6339 | 153 |
| Hemoglobin concentration admission, g/dL, median (range) | 12.9 (11.2‐14.3) | 13.2 (11.7‐14.5) | 9.5 (7.9‐11.5) | 9.4 (7.9‐11.5) | 10.9 (8.9‐12.9) |
| RBC transfusion, median (range) | 0 (0‐0) | 0 (0‐0) | 2 (2‐4) | 2 (2‐4) | 11 (5‐17) |
| Cryo transfusion, median (range) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 4 (0‐9) |
| FFP transfusion, median (range) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 6 (2‐10) |
| Platelet transfusion, median (range) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 0 (0‐0) | 1 (0‐3) |
| Complications | |||||
| Postprocedural | 14 353 | 9220 | 5133 | 4655 | 478 |
| Infections | 2704 | 1278 | 1426 | 1318 | 108 |
| Cardiovascular | 14 401 | 9529 | 4872 | 4507 | 365 |
| Respiratory | 7358 | 4564 | 2794 | 2556 | 238 |
| Gastrointestinal | 7815 | 5157 | 2658 | 2484 | 174 |
| Genitourinary | 7237 | 4672 | 2565 | 2374 | 191 |
| Hematological | 4350 | 866 | 3484 | 3253 | 231 |
| Mortality (%) | 2.2 | 1.6 | 6.4 | 5.9 | 18.1 |
Note: Demographic parameters of patients included.
Prediction of transfusion
| Method | AUC | AP | BA | Sens | Spec | Prec | NPV | F1 |
|---|---|---|---|---|---|---|---|---|
| Neural network | 0.966 (± 0.004) | 0.828 (± 0.012) | | | 0.958 (± 0.009) | 0.719 (± 0.022) | | 0.749 (± 0.006) |
| Logistic regression | 0.965 (± 0.005) | 0.820 (± 0.011) | 0.856 (± 0.006) | 0.894 (± 0.012) | | | 0.966 (± 0.004) | 0.748 (± 0.010) |
| Random forest | 0.963 (± 0.004) | 0.821 (± 0.011) | 0.858 (± 0.004) | 0.584 (± 0.006) | 0.964 (± 0.006) | 0.737 (± 0.011) | 0.966 (± 0.006) | 0.743 (± 0.006) |
| Gradient boosting | | | 0.864 (± 0.008) | 0.872 (± 0.006) | 0.965 (± 0.005) | 0.747 (± 0.025) | 0.968 (± 0.007) | |
Note: Statistical parameters of the prediction of transfusion vs no transfusion by different models.
Abbreviations: AP, average precision; AUC, area under the receiver operating characteristic curve; BA, balanced accuracy; F1, harmonic mean of precision and recall; NPV, negative predictive value; Prec, precision or positive predictive value; Sens, sensitivity; Spec, specificity.
The bold values are the highest (best) for each method, respectively.
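The threshold-dependent columns in the table above (Sens, Spec, Prec, NPV, BA, F1) all derive from a single confusion matrix. The following is a minimal sketch of those standard definitions; the `Confusion` class, the function name, and the counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary classifier at one decision threshold."""
    tp: int  # transfused, predicted transfused
    fp: int  # not transfused, predicted transfused
    tn: int  # not transfused, predicted not transfused
    fn: int  # transfused, predicted not transfused


def threshold_metrics(c: Confusion) -> dict:
    """Standard definitions of the metrics abbreviated in the table."""
    sens = c.tp / (c.tp + c.fn)   # sensitivity (recall)
    spec = c.tn / (c.tn + c.fp)   # specificity
    prec = c.tp / (c.tp + c.fp)   # precision (positive predictive value)
    npv = c.tn / (c.tn + c.fn)    # negative predictive value
    return {
        "Sens": sens,
        "Spec": spec,
        "Prec": prec,
        "NPV": npv,
        "BA": (sens + spec) / 2,                  # balanced accuracy
        "F1": 2 * prec * sens / (prec + sens),    # harmonic mean of Prec and Sens
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = Confusion(tp=80, fp=20, tn=880, fn=20)
metrics = threshold_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every one of these values depends on the chosen threshold, they can shift together when the threshold moves, whereas AUC and AP summarize performance across all thresholds.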
FIGURE 1 Transfusion of at least 1 RBC unit. A, ROC curves for the different methods. B, Precision‐recall curves for the different methods
Feature importance for transfusion
| Rank | Random forest: feature | Random forest: importance | Gradient boosting: feature | Gradient boosting: importance | Logistic regression: feature | Logistic regression: importance |
|---|---|---|---|---|---|---|
| 1 | Hb at admission | 137.95 | Hb at admission | 157.49 | Hb at admission | 32.43 |
| 2 | Secondary diagnosis code D64.9: Anemia, unspecified | 49.09 | Age | 101.28 | Secondary diagnosis code D64.9: Anemia, unspecified | 14.50 |
| 3 | Age | 36.53 | CCI | 33.03 | DRG F10B: Interventional coronary procedures | 10.34 |
| 4 | Secondary diagnosis code D50.0: Iron deficiency | 26.04 | Secondary diagnosis code D64.9: Anemia, unspecified | 15.54 | Secondary diagnosis code D50.0: Iron deficiency | 9.32 |
| 5 | CCI | 18.31 | Hb at admission grouped | 12.82 | Secondary diagnosis code D62: Acute posthemorrhagic anaemia | 6.90 |
| 6 | Secondary diagnosis code D62: Acute posthemorrhagic anaemia | 16.62 | Sex | 11.53 | DRG minor class | 6.52 |
| 7 | Sex | 10.16 | Secondary diagnosis code Y92.22: Health service area | 10.88 | DRG F41B: Circulatory disorders, Adm | 5.69 |
| 8 | Secondary diagnosis code D63.0: Anemia in neoplastic disease | 7.36 | Admission year 8 | 8.03 | DRG I68B: Nonsurgical spinal disorders, minor complexity | 5.67 |
Note: Feature importance for transfusion of at least 1 RBC unit for random forest, gradient boosting, and logistic regression. Details of how the importance of features was calculated can be found in Appendix S1 C, available as supporting information in the online version of this paper.
Abbreviations: CCI, Charlson Comorbidity Index; DRG, diagnosis‐related group; Hb, hemoglobin.
Prediction of massive transfusion
| Method | AUC | AP | BA | Sens | Spec | Prec | NPV | F1 |
|---|---|---|---|---|---|---|---|---|
| Neural network | 0.945 (± 0.014) | 0.162 (± 0.028) | 0.656 (± 0.034) | 0.780 (± 0.049) | 0.994 (± 0.002) | 0.206 (± 0.060) | 0.997 (± 0.001) | 0.245 (± 0.046) |
| Logistic regression | | 0.176 (± 0.021) | 0.645 (± 0.042) | 0.721 (± 0.055) | 0.995 (± 0.002) | 0.211 (± 0.032) | 0.997 (± 0.000) | 0.241 (± 0.031) |
| Random forest | 0.932 (± 0.018) | 0.174 (± 0.031) | | 0.002 (± 0.003) | 0.993 (± 0.002) | 0.191 (± 0.025) | | 0.244 (± 0.026) |
| Gradient boosting | 0.947 (± 0.013) | | 0.661 (± 0.038) | | | | 0.997 (± 0.000) | |
Note: Statistical parameters of the prediction of massive transfusion vs no massive transfusion by different models.
Abbreviations: AP, average precision; AUC, area under the receiver operating characteristic curve; BA, balanced accuracy; F1, harmonic mean of precision and recall; NPV, negative predictive value; Prec, precision or positive predictive value; Sens, sensitivity; Spec, specificity.
The bold values are the highest (best) for each method, respectively.
FIGURE 2 Prediction of massive transfusion. A, ROC curves for the different methods. B, Precision‐recall curves for the different methods
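Massive transfusion occurred in only 0.4% of admissions. The tables report AUC values of roughly 0.93 to 0.95 for it, alongside AP values below 0.2. The sketch below shows how the two summaries can diverge as the positive class becomes rarer. It uses the textbook definitions (AUC as the probability that a positive outranks a negative, AP as the mean precision at the rank of each positive), which may differ in detail from the implementation used in the study. The scores and the helper `build` are hypothetical toy data.

```python
def roc_auc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical model scores: two positives and six negatives.
pos_scores = [0.9, 0.7]
neg_scores = [0.8, 0.6, 0.5, 0.4, 0.3, 0.2]


def build(neg_copies):
    """Repeat the negatives to lower prevalence without changing the score distribution."""
    scores = pos_scores + neg_scores * neg_copies
    labels = [1] * len(pos_scores) + [0] * (len(neg_scores) * neg_copies)
    return labels, scores


common = build(1)   # prevalence 2/8
rare = build(10)    # prevalence 2/62

for name, (labels, scores) in (("common", common), ("rare", rare)):
    print(name, round(roc_auc(labels, scores), 3), round(average_precision(labels, scores), 3))
```

In this toy setup the AUC is identical for both data sets, because the ranking of positives against negatives is unchanged, while AP falls once the same share of high-scoring negatives becomes a much larger absolute number of false positives.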
Feature importance for massive transfusion
| Rank | Random forest: feature | Random forest: importance | Gradient boosting: feature | Gradient boosting: importance | Logistic regression: feature | Logistic regression: importance |
|---|---|---|---|---|---|---|
| 1 | Hb at admission | 73.38 | Hb at admission | 111.55 | DRG: Uncommon group | 7.79 |
| 2 | Age | 37.01 | Age | 77.77 | Primary diagnosis code: Uncommon diagnosis | 5.94 |
| 3 | Secondary diagnosis code D62: Acute posthemorrhagic anemia | 26.09 | CCI | 30.64 | Secondary diagnosis Z72.0: Tobacco use | 5.79 |
| 4 | CCI | 23.62 | DRG minor class | 22.00 | Secondary diagnosis code U73.9: Unspecified activity | 5.28 |
| 5 | Secondary diagnosis code T81.0: Hemorrhage | 22.22 | Secondary diagnosis T81.0: Hemorrhage | 18.07 | Hb at admission | 5.27 |
| 6 | DRG: Uncommon group | 20.68 | Secondary diagnosis D62: Acute posthemorrhagic anemia | 17.28 | DRG F62B: Heart failure and shock … | 5.19 |
| 7 | DRG A06B: Tracheostomy … | 18.25 | Primary diagnosis code: Uncommon diagnosis | 15.71 | Age | 4.99 |
| 8 | DRG: Uncommon group | 17.47 | Sex | 14.93 | DRG F62A: Heart failure and shock … | 4.67 |
Note: Feature importance for massive transfusion for random forest, gradient boosting, and logistic regression. Details of how the importance of features was calculated can be found in Appendix S1 C, available as supporting information in the online version of this paper.
Abbreviations: CCI, Charlson Comorbidity Index; DRG, diagnosis‐related group; Hb, hemoglobin.
Prediction of number of RBCs transfused
| Method | RMSE | R² |
|---|---|---|
| Baseline: Mean | 19.533 (± 1.488) | 0.000 (± 0.000) |
| Baseline: Median | 22.236 (± 1.273) | −0.140 (± 0.022) |
| Neural network | 16.549 (± 1.200) | 0.152 (± 0.008) |
| Huber regression | 17.140 (± 1.379) | 0.122 (± 0.009) |
| Random forest | 16.890 (± 1.295) | 0.135 (± 0.004) |
| Gradient boosting | | |
Note: Statistical parameters of the prediction of the number of RBCs transfused by different models.
Abbreviations: MAE, mean absolute error; RMSE, root mean square error; R², coefficient of determination.
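In the table above, the mean baseline scores 0.000 and the median baseline scores below zero. That pattern follows from the definition of the coefficient of determination: predicting the mean reproduces the reference model exactly, and any other constant prediction does worse. A minimal sketch using the standard formulas and hypothetical unit counts (not study data), evaluated in-sample for simplicity:

```python
import math
from statistics import mean, median


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical, right-skewed RBC unit counts: most patients receive none.
units = [0, 0, 0, 0, 0, 0, 2, 2, 4, 12]

mean_pred = [mean(units)] * len(units)      # constant prediction: the mean
median_pred = [median(units)] * len(units)  # constant prediction: the median

print("mean baseline:  ", round(rmse(units, mean_pred), 3), round(r2(units, mean_pred), 3))
print("median baseline:", round(rmse(units, median_pred), 3), round(r2(units, median_pred), 3))
```

With skewed counts the median sits below the mean, so the median baseline has a larger squared error and a negative coefficient of determination, which is the same direction as the two baseline rows in the table.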