Ira S Hofer, Marina Kupina, Lori Laddaran, Eran Halperin.
Abstract
Manuscripts that have successfully used machine learning (ML) to predict a variety of perioperative outcomes often use only a limited number of features selected by a clinician. We hypothesized that techniques leveraging a broad set of features for patient laboratory results, medications, and the surgical procedure name would improve performance as compared to a more limited set of features chosen by clinicians. Feature vectors for laboratory results included 702 features in total, derived from 39 laboratory tests; medications were encoded as a binary flag for each of 126 commonly used medications; and the procedure name was converted to a vector of length 100 using the Word2Vec package. Nine models were trained: one with baseline features, one for each of the three types of data, three combining baseline with each data type, one with all features, and one with all features plus a feature reduction algorithm. Across both outcomes, the model that contained all features (model 8) (mortality ROC-AUC 94.32 ± 1.01, PR-AUC 36.80 ± 5.10; acute kidney injury (AKI) ROC-AUC 92.45 ± 0.64, PR-AUC 76.22 ± 1.95) was superior to models with only subsets of features. Featurization techniques leveraging a broad array of clinical data can improve performance of perioperative prediction models.
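
As a rough illustration of the featurization and model configurations described in the abstract, the following minimal sketch encodes medications as binary flags and concatenates feature groups per model. It is not the authors' code: the function names, the three-drug vocabulary, and the dictionary layout are hypothetical. The group sizes (702, 126, 100) come from the abstract, and the model numbering follows the results tables. The feature reduction step of model 9 is not shown.

```python
from itertools import chain

# Feature-group sizes stated in the abstract. The number of baseline
# features is not given there, so it is left out of this table.
GROUP_SIZES = {"labs": 702, "meds": 126, "proc_name": 100}

# Feature groups per model, following the numbering in the results tables.
# Model 9 uses the same groups as model 8; its feature reduction step is
# not specified in the abstract and is not implemented here.
MODEL_GROUPS = {
    1: ["baseline"],
    2: ["labs"],
    3: ["proc_name"],
    4: ["meds"],
    5: ["baseline", "labs"],
    6: ["baseline", "proc_name"],
    7: ["baseline", "meds"],
    8: ["baseline", "labs", "proc_name", "meds"],
    9: ["baseline", "labs", "proc_name", "meds"],
}


def medication_flags(patient_meds, vocabulary):
    """Return one 0/1 flag per vocabulary entry (multi-hot encoding)."""
    taken = set(patient_meds)
    return [1 if med in taken else 0 for med in vocabulary]


def assemble(features_by_group, model_id):
    """Concatenate the per-group vectors used by one model configuration."""
    groups = MODEL_GROUPS[model_id]
    return list(chain.from_iterable(features_by_group[g] for g in groups))


# Hypothetical three-drug vocabulary; the study used 126 medications.
vocab = ["heparin", "insulin", "metoprolol"]
flags = medication_flags(["insulin"], vocab)  # [0, 1, 0]
```

Keeping the group-to-model mapping in one table makes it easy to train every configuration with the same pipeline and compare them on equal footing.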
Year: 2022 PMID: 35715454 PMCID: PMC9205878 DOI: 10.1038/s41598-022-13879-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Study design.
Patient characteristics.
| Property | Population |
|---|---|
| Patients, n | 79,662 |
| Admissions, n | 93,335 |
| Surgeries, n | 101,070 |
| Mortalities, n (%) | 2312 (2.29) |
| Kidney failure, n (%) | 15,985 (15.8) |
| Mean age, years (range) | 55.79 (18–89) |
| Female patients, n (%) | 41,062 (51.55) |
| 1 | 5629 (5.57) |
| 2 | 34,468 (34.10) |
| 3 | 47,596 (47.09) |
| 4 | 11,294 (11.17) |
| 5 | 713 (0.71) |
| General surgery | 20,097 (19.88) |
| Orthopaedics | 15,346 (15.18) |
| Urology | 12,600 (12.47) |
| Neurosurgery | 10,971 (10.85) |
| Other | 41,639 (41.2) |
Patient characteristics for the cohort used for training and testing models. Number of patients and percent of the cohort are shown. The selected surgical services represent the top four most frequent surgical services.
Performance metrics for XGBoost model. ROC-AUC and PR-AUC for predicting in-hospital mortality and AKI using different sets of features.
| Model | Mortality ROC-AUC | Mortality PR-AUC | AKI ROC-AUC | AKI PR-AUC |
|---|---|---|---|---|
| Model 1 (baseline) | 92.13 ± 0.23 | 22.93 ± 1.13 | 91.01 ± 0.54 | 72.13 ± 1.65 |
| Model 2 (labs) | 86.53 ± 0.38 | 20.03 ± 1.06 | 86.49 ± 0.48 | 68.78 ± 0.84 |
| Model 3 (proc_name) | 50.04 ± 0.30 | 3.16 ± 3.11 | 50.05 ± 0.11 | 18.12 ± 5.05 |
| Model 4 (medications) | 72.26 ± 0.83 | 9.14 ± 0.58 | 70.75 ± 0.29 | 40.06 ± 0.52 |
| Model 5 (baseline + labs) | 92.95 ± 0.25 | 23.87 ± 1.32 | 92.10 ± 0.48 | 75.44 ± 1.87 |
| Model 6 (baseline + proc_name) | 92.89 ± 0.82 | 27.08 ± 4.12 | 91.40 ± 0.45 | 72.79 ± 1.51 |
| Model 7 (baseline + meds) | 93.09 ± 0.22 | 24.24 ± 1.25 | 91.37 ± 0.55 | 73.13 ± 1.62 |
| Model 8 (all sets) | 94.32 ± 1.01 | 36.80 ± 5.10 | 92.45 ± 0.64 | 76.22 ± 1.95 |
| Model 9 (all_sets + feature selection) | 93.76 ± 0.95 | 32.33 ± 5.23 | 92.32 ± 0.47 | 75.22 ± 1.71 |
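
For readers who want to see how the two metrics in this table are defined, here is a minimal pure-Python sketch. It assumes ROC-AUC is computed as the probability that a positive case outscores a negative one, and PR-AUC as average precision. The source does not say which PR-AUC estimator was used, so `average_precision` is an assumption, and both function names are illustrative.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the sample size, which is
    fine for an illustration but not for a cohort of 100,000 cases.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Tied scores are taken in sort order; no tie correction is applied.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)            # 0.75
ap = average_precision(labels, scores)   # (1/1 + 2/3) / 2, about 0.833
```

Under these definitions, uninformative scores give a ROC-AUC of about 50% and a PR-AUC near the outcome prevalence (2.29% for mortality and 15.8% for kidney failure in the patient characteristics table). This is why PR-AUC values for the two outcomes are not directly comparable with each other.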
Figure 2. AUC curves.
Performance metrics for XGBoost model using different sets of features.
| Model | Mortality F1 score | Mortality accuracy | Mortality recall | Mortality precision | Mortality specificity | Mortality NPV | AKI F1 score | AKI accuracy | AKI recall | AKI precision | AKI specificity | AKI NPV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model 1 (baseline) | 36.3 ± 2.2 | 96.6 ± 0.3 | 45.2 ± 3.6 | 33.5 ± 2.9 | 97.7 ± 0.4 | 98.8 ± 0.1 | 66.1 ± 1.0 | 89.5 ± 0.5 | 67.0 ± 1.4 | 66.1 ± 1.9 | 93.5 ± 0.7 | 94.1 ± 0.2 |
| Model 2 (labs) | 48.3 ± 2.9 | 92.8 ± 0.8 | 49.1 ± 2.7 | 50.7 ± 3.5 | 96.1 ± 0.5 | 95.7 ± 0.5 | 63.0 ± 0.8 | 89.0 ± 0.4 | 60.7 ± 1.2 | 66.4 ± 1.8 | 94.1 ± 0.5 | 92.9 ± 0.2 |
| Model 3 (proc_name) | 4.2 ± 0.4 | 5.9 ± 5.0 | 96.3 ± 4.8 | 3.3 ± 1.9 | 4.1 ± 5.2 | | 27.1 ± 0.4 | 15.7 ± 0.3 | 100.0 ± | 15.7 ± 0.3 | 0.1 ± 0.0 | |
| Model 4 (medications) | 22.7 ± 1.7 | 95.3 ± 0.6 | 32.1 ± 2.9 | 21.8 ± 4.0 | 96.6 ± 0.6 | 98.6 ± 0.1 | 46.2 ± 0.9 | 82.0 ± 0.5 | 49.8 ± 1.5 | 43.8 ± 1.5 | 87.9 ± 0.8 | 90.5 ± 0.3 |
| Model 5 (baseline + labs) | 40.6 ± 2.7 | 97.1 ± 0.4 | 50.9 ± 4.8 | 40.3 ± 4.0 | 98.0 ± 0.4 | 99.0 ± 0.1 | 69.2 ± 0.8 | 90.6 ± 0.4 | 67.6 ± 1.1 | 71.4 ± 1.4 | 94.8 ± 0.5 | 94.1 ± 0.2 |
| Model 6 (baseline + proc_name) | 37.8 ± 2.9 | 96.8 ± 0.3 | 47.7 ± 3.2 | 34.5 ± 3.9 | 97.8 ± 0.3 | 98.9 ± 0.1 | 66.3 ± 0.8 | 89.7 ± 0.4 | 66.9 ± 1.5 | 66.4 ± 1.4 | 93.7 ± 0.5 | 94.1 ± 0.2 |
| Model 7 (baseline + meds) | 38.3 ± 2.7 | 96.8 ± 0.4 | 46.3 ± 3.8 | 37.3 ± 3.8 | 97.9 ± 0.5 | 98.9 ± 0.1 | 67.2 ± 0.9 | 90.1 ± 0.3 | 66.0 ± 1.3 | 69.1 ± 1.7 | 94.5 ± 0.5 | 93.9 ± 0.2 |
| Model 8 (all sets) | 43.8 ± 2.5 | 97.5 ± 0.2 | 50.2 ± 3.1 | 42.3 ± 3.9 | 98.4 ± 0.2 | 99.0 ± 0.1 | 69.4 ± 0.8 | 90.8 ± 0.2 | 69.5 ± 1.4 | 69.9 ± 1.4 | 94.6 ± 0.4 | 94.6 ± 0.2 |
| Model 9 (all_sets + feature selection) | 41.7 ± 2.9 | 97.1 ± 0.2 | 51.2 ± 3.3 | 37.7 ± 3.7 | 98.0 ± 0.3 | 99.0 ± 0.1 | 68.6 ± 1.0 | 90.4 ± 0.4 | 68.6 ± 1.7 | 69.6 ± 1.9 | 94.3 ± 0.6 | 94.4 ± 0.2 |
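
The six threshold-based metrics in this table all derive from the four cells of a confusion matrix. The sketch below uses the standard definitions; the function name and the returned dictionary layout are illustrative. The worked example uses the whole-cohort counts from the patient characteristics table (15,985 kidney failure cases among 101,070 surgeries) for a hypothetical classifier that flags every case as positive. It is an illustration of the formulas and does not reproduce any test split from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    A metric whose denominator is zero is undefined and returned as None.
    """
    def ratio(num, den):
        return num / den if den else None

    return {
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "recall": ratio(tp, tp + fn),        # sensitivity
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),           # negative predictive value
    }


# Hypothetical "predict positive for everyone" classifier on cohort counts.
surgeries, cases = 101_070, 15_985
all_positive = confusion_metrics(tp=cases, fp=surgeries - cases, tn=0, fn=0)
# recall is 1.0, specificity is 0.0, precision equals accuracy equals the
# prevalence (about 0.158), F1 is about 0.273, and NPV is undefined (None).
```

A classifier of this kind reaches perfect recall while carrying no information, so recall should be read together with precision and specificity. NPV has no value when nothing is predicted negative.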