Ben J Marafino, R Adams Dudley, Nigam H Shah, Jonathan H Chen.
Abstract
Risk adjustment models for intensive care outcomes have yet to realize the full potential of data unlocked by the increasing adoption of EHRs. In particular, they fail to fully leverage the information present in longitudinal, structured clinical data - including laboratory test results and vital signs - nor can they infer patient state from unstructured clinical narratives without lengthy manual abstraction. A fully electronic ICU risk model fusing these two types of data sources may yield improved accuracy and more personalized risk estimates, and in obviating manual abstraction, could also be used for real-time decision-making. As a first step towards fully "electronic" ICU models based on fused data, we present results of generalized additive modeling applied to a sample of over 36,000 ICU patients. Our approach outperforms those based on the SAPS and OASIS systems (AUC: 0.908 vs. 0.794 and 0.874), and appears to yield more granular and easily visualized risk estimates.
Year: 2018 PMID: 29888065 PMCID: PMC5961794
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
List of structured data sources and the types of derived features engineered from each source. All derived features are with respect to a window of maximum length 24 hours following ICU admission.
| Laboratory tests | Vital signs | Derived feature types |
|---|---|---|
| Blood urea nitrogen | Heart rate | Mean |
| Bilirubin | Respiratory rate | Standard deviation |
| Creatinine | Temperature | Maximum |
| Lactate | Mean arterial pressure | Minimum |
| Glucose | SaO2 (oxygen saturation) | Last value minus first value (∆) |
| Sodium | FiO2 | Absolute value of difference between last and first values |
| Potassium | Glasgow Coma Score (GCS) – total | Slope of linear trend fit to data using least squares |
| Bicarbonate | GCS – eye response | |
| Hematocrit | GCS – motor response | |
| White blood cell count | GCS – verbal response | |
| Platelet count | | |
| Arterial PaCO2 | | |
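
The derived feature types in the table above can be illustrated with a short sketch. This is not the authors' code: the function name `derived_features`, the input layout (hours since ICU admission paired with measurements), and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import stdev


def derived_features(times_h, values, window_h=24.0):
    """Summarize one lab test or vital sign over the first `window_h` hours.

    `times_h` holds hours since ICU admission and `values` the matching
    measurements. Both names and this layout are hypothetical.
    Returns None when no measurement falls inside the window.
    """
    # Keep only measurements inside the window, ordered by time.
    pairs = sorted((t, v) for t, v in zip(times_h, values) if 0.0 <= t <= window_h)
    if not pairs:
        return None
    ts = [t for t, _ in pairs]
    vs = [v for _, v in pairs]
    n = len(vs)

    t_bar = sum(ts) / n
    v_bar = sum(vs) / n

    # Ordinary least-squares slope of value against time.
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(ts, vs))
    slope = sxy / sxx if sxx > 0 else 0.0  # assumption: 0 when slope is undefined

    delta = vs[-1] - vs[0]
    return {
        "mean": v_bar,
        "sd": stdev(vs) if n > 1 else 0.0,  # assumption: sample SD, 0 for one point
        "max": max(vs),
        "min": min(vs),
        "delta": delta,           # last value minus first value
        "abs_delta": abs(delta),  # absolute difference between last and first
        "slope": slope,
    }


# Heart rate readings; the one at hour 30 falls outside the 24-hour window.
features = derived_features([0, 6, 12, 18, 30], [80, 90, 100, 110, 200])
```

How the original work handled a single measurement, missing data, or irregular sampling is not stated in the source; the fallbacks above are placeholders.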
Figure 2. Examples of bivariate risk surfaces estimated for pairs of features derived from unstructured data. Risk estimates are represented by a red (low risk) to white (high) spectrum; the green lines denote contours joining areas of the plot having equal risk.
Figure 3. Bivariate risk surfaces for pairs of features, where one derives from unstructured data (x-axis) and the other from structured data sources (y-axis).
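
Bivariate risk surfaces of this kind are what a generalized additive model with pairwise smooth terms would produce. The source does not give the model specification, so the following is only a generic form under that assumption; every symbol is introduced here for illustration: $y_i$ is the binary outcome for patient $i$, $x_{ij}$ the $j$-th feature, $f_j$ and $f_{jk}$ smooth functions, and $\mathcal{P}$ the set of feature pairs given an interaction surface.

```latex
\operatorname{logit} P(y_i = 1 \mid \mathbf{x}_i)
  = \beta_0
  + \sum_{j} f_j(x_{ij})
  + \sum_{(j,k) \in \mathcal{P}} f_{jk}(x_{ij}, x_{ik})
```

Under this form, plotting an estimated $f_{jk}$ over a grid of its two features, holding the rest of the model fixed, gives one risk surface with its equal-risk contours.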
Characteristics of the dataset.
| Characteristic | Value |
|---|---|
| Patients, total number | 36,043 |
| Deaths (%) | 3,895 (10.8%) |
| Age, mean (IQR) | 61.9 (51-76) |
| Male (%) | 20,836 (57.8%) |
| ICU type, n (%) | |
| Coronary care | 5,255 (14.6%) |
| Cardiac surgery recovery unit | 7,394 (20.5%) |
| Medical (including Neuro ICU) | 12,549 (34.8%) |
| Surgical | 5,963 (16.5%) |
| Trauma/Surgical | 4,882 (13.6%) |
Model performance comparison.
| Model | AUC (95% CI) |
|---|---|
| Logistic regression on SAPS score only | 0.794 (0.790-0.798) |
| Logistic regression on OASIS score only | 0.874 (0.864-0.881) |
| Logistic regression on fused dataset (FUSED) | 0.857 (0.841-0.872) |
| Gradient boosting machine on FUSED | 0.910 (0.901-0.920) |
| Support vector machine on FUSED | 0.873 (0.855-0.893) |
| Generalized additive model on structured features | 0.853 (0.840-0.866) |
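
For readers who want to reproduce this style of comparison, the sketch below computes an AUC and a percentile-bootstrap 95% confidence interval. The source does not state how its intervals were obtained, so the bootstrap is an assumption, and the function names `auc` and `bootstrap_ci` are hypothetical. Labels are assumed to be coded 0 (survived) and 1 (died).

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count one half. This pairwise form is quadratic in the sample size,
    which is fine for a sketch but slow for tens of thousands of patients.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data for illustration only, not taken from the study.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
point = auc(y, p)
low, high = bootstrap_ci(y, p, n_boot=200)
```

On a cohort of the size reported here, a rank-based AUC implementation would be a more practical replacement for the pairwise loop.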