Sidney Le, Angier Allen, Jacob Calvert, Paul M Palevsky, Gregory Braden, Sharad Patel, Emily Pellegrini, Abigail Green-Saxena, Jana Hoffman, Ritankar Das.
Abstract
INTRODUCTION: Acute kidney injury (AKI) is common among hospitalized patients and has a significant impact on morbidity and mortality. Although early prediction of AKI has the potential to reduce adverse patient outcomes, it remains a difficult condition to predict and diagnose. The purpose of this study was to evaluate the ability of a machine learning algorithm to predict AKI, defined as Kidney Disease: Improving Global Outcomes (KDIGO) stage 2 or 3, up to 48 hours in advance of onset using convolutional neural networks (CNNs) and patient electronic health record (EHR) data.
Keywords: acute kidney injury; convolutional neural net; electronic health record data; machine learning; prediction; serum creatinine
Year: 2021 PMID: 34013107 PMCID: PMC8116756 DOI: 10.1016/j.ekir.2021.02.031
Source DB: PubMed Journal: Kidney Int Rep ISSN: 2468-0249
Figure 1. Inclusion diagram. Patients were required to be at least 18 years of age and to have at least 1 measurement of at least 1 of the input features. MIMIC-III, Medical Information Mart for Intensive Care III.
Demographic characteristics of MIMIC-III ICU encounters found in the 48-hour data set and meeting the inclusion criteria of Figure 1
| Characteristic | Category | Count | % |
|---|---|---|---|
| Gender | Female | 3186 | 46.71 |
| | Male | 3635 | 53.29 |
| Age (yr) | 18–29 | 317 | 4.65 |
| | 30–39 | 307 | 4.50 |
| | 40–49 | 665 | 9.75 |
| | 50–59 | 1246 | 18.27 |
| | 60–69 | 1599 | 23.44 |
| | 70+ | 2687 | 39.39 |
| Length of stay (d) | <3 | 43 | 0.63 |
| | 3–5 | 4282 | 62.78 |
| | 6–8 | 1200 | 17.59 |
| | 9–11 | 528 | 7.74 |
| | ≥12 | 768 | 11.26 |
| In-hospital death | Yes | 1747 | 25.61 |
| | No | 5074 | 74.39 |
| KDIGO stage 2 or 3 | Positive | 520 | 7.62 |
| | Negative | 6301 | 92.38 |
| KDIGO stage 1, 2, or 3 | Positive | 1410 | 20.67 |
| | Negative | 5411 | 79.33 |
ICU, intensive care unit; IQR, interquartile range; KDIGO, Kidney Disease: Improving Global Outcomes; MIMIC-III, Medical Information Mart for Intensive Care III.
We note that the determination of KDIGO positive or negative was made after the data preprocessing steps described in the Methods section.
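The positive class in the table above is KDIGO stage 2 or 3. As a rough illustration of how such a label could be derived from serum creatinine alone, the sketch below applies the published KDIGO creatinine thresholds. It is not the study's preprocessing code: the function names, the `rise_48h` argument, and the way the baseline is chosen are assumptions, and the urine-output and renal-replacement-therapy criteria are omitted.

```python
def kdigo_stage_from_creatinine(scr, baseline, rise_48h=0.0):
    """Return a KDIGO stage (0-3) using creatinine criteria only.

    scr      -- current serum creatinine in mg/dL
    baseline -- baseline serum creatinine in mg/dL (how the baseline is
                chosen is an assumption left to the caller)
    rise_48h -- absolute increase in mg/dL within the prior 48 hours
                (hypothetical argument for the >=0.3 mg/dL criterion)
    """
    if baseline <= 0:
        raise ValueError("baseline creatinine must be positive")
    ratio = scr / baseline

    # AKI is present if creatinine reaches 1.5x baseline or rises by 0.3 mg/dL.
    if not (ratio >= 1.5 or rise_48h >= 0.3):
        return 0
    # Stage 3: 3.0x baseline, or creatinine at or above 4.0 mg/dL once AKI is met.
    if ratio >= 3.0 or scr >= 4.0:
        return 3
    # Stage 2: 2.0x to 2.9x baseline.
    if ratio >= 2.0:
        return 2
    return 1


def is_positive(stage, include_stage_1=False):
    """Binary label: stage 2 or 3 by default, any stage if include_stage_1."""
    return stage >= (1 if include_stage_1 else 2)


print(kdigo_stage_from_creatinine(2.4, 1.0))  # 2.4x baseline
```

The `include_stage_1` flag mirrors the two label definitions tabulated above (stage 2 or 3 versus stage 1, 2, or 3).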
Results from 10-fold cross-validation of predictions 48 hours before onset on the MIMIC-III data set
| Performance metric | CNN | XGBoost | SOFA | No Doc2Vec | Stage 1 included | Stage 3 only |
|---|---|---|---|---|---|---|
| AUROC mean (SD) | 0.856 (0.034) | 0.654 (0.011) | 0.701 | 0.763 (0.035) | 0.778 (0.037) | 0.819 (0.036) |
| Sensitivity mean (SD) | 0.804 (0.000) | 0.798 (0.000) | 0.798 | 0.805 (0.006) | 0.806 (0.008) | 0.806 (0.000) |
| Specificity mean (SD) | 0.763 (0.057) | 0.380 (0.006) | 0.441 | 0.623 (0.064) | 0.649 (0.074) | 0.679 (0.079) |
| PPV mean (SD) | 0.236 (0.039) | 0.095 (0.001) | 0.127 | 0.163 (0.022) | 0.310 (0.044) | 0.105 (0.023) |
| NPV mean (SD) | 0.975 (0.002) | 0.956 (0.001) | 0.960 | 0.970 (0.003) | 0.940 (0.006) | 0.985 (0.002) |
| Accuracy mean (SD) | 0.765 (0.052) | 0.411 (0.005) | 0.612 | 0.638 (0.056) | 0.672 (0.062) | 0.683 (0.076) |
| DOR mean (SD) | 14.076 (3.779) | 2.421 (0.059) | 3.123 | 7.123 (1.899) | 8.167 (2.425) | 9.566 (3.410) |
| LR+ mean (SD) | 3.558 (0.739) | 1.287 (0.012) | 1.429 | 2.191 (0.362) | 2.389 (0.478) | 2.658 (0.660) |
| LR− mean (SD) | 0.258 (0.021) | 0.532 (0.008) | 0.458 | 0.316 (0.035) | 0.301 (0.035) | 0.288 (0.035) |
| F1 mean (SD) | 0.361 (0.047) | 0.169 (0.001) | 0.214 | 0.270 (0.030) | 0.444 (0.045) | 0.184 (0.036) |
AUROC, area under the receiver operating characteristic curve; CNN, convolutional neural network; DOR, diagnostic odds ratio; KDIGO, Kidney Disease: Improving Global Outcomes; LR+, positive likelihood ratio; LR−, negative likelihood ratio; MIMIC-III, Medical Information Mart for Intensive Care III; NPV, negative predictive value; PPV, positive predictive value; SD, standard deviation; SOFA, sequential organ failure assessment.
The CNN model is compared with an XGBoost classifier and the SOFA score. SOFA required no training and thus could be applied to the entire test set at once; hence, no SD is reported. Additional comparison is made to the CNN model without the use of the Doc2Vec network (i.e., without unstructured text data) and for the prediction of KDIGO criteria of any stage.
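For readers less familiar with the metrics in the table, the following minimal sketch shows how each one follows from a single confusion matrix using the standard definitions. The counts are hypothetical and are not taken from the study; the class and function names are illustrative only. Degenerate cases (for example, specificity of exactly 1) are not handled.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostic_metrics(c: Confusion) -> dict:
    """Compute the tabulated metrics from one confusion matrix."""
    sens = c.tp / (c.tp + c.fn)          # sensitivity (recall)
    spec = c.tn / (c.tn + c.fp)          # specificity
    ppv = c.tp / (c.tp + c.fp)           # positive predictive value
    npv = c.tn / (c.tn + c.fn)           # negative predictive value
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    lr_pos = sens / (1 - spec)           # positive likelihood ratio
    lr_neg = (1 - sens) / spec           # negative likelihood ratio
    dor = lr_pos / lr_neg                # diagnostic odds ratio
    f1 = 2 * ppv * sens / (ppv + sens)   # harmonic mean of PPV and sensitivity
    return {
        "sensitivity": sens, "specificity": spec, "ppv": ppv, "npv": npv,
        "accuracy": acc, "lr_pos": lr_pos, "lr_neg": lr_neg,
        "dor": dor, "f1": f1,
    }


# Hypothetical counts with few positives, as in an imbalanced cohort.
example = Confusion(tp=40, fp=150, tn=450, fn=10)
for name, value in diagnostic_metrics(example).items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical example PPV is low even though sensitivity is 0.80, because the negatives greatly outnumber the positives. Note also that a mean of per-fold DOR values need not equal the ratio of the mean LR+ to the mean LR−.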
Results from 10-fold cross-validation of predictions 24 hours before onset on the MIMIC-III data set
| Performance metric | CNN | XGBoost | SOFA | No Doc2Vec | Stage 1 included | Stage 3 only |
|---|---|---|---|---|---|---|
| AUROC mean (SD) | 0.863 (0.009) | 0.729 (0.009) | 0.727 | 0.769 (0.028) | 0.834 (0.004) | 0.867 (0.009) |
| Sensitivity mean (SD) | 0.803 (0.000) | 0.801 (0.000) | 0.784 | 0.801 (0.003) | 0.798 (0.005) | 0.795 (0.000) |
| Specificity mean (SD) | 0.772 (0.021) | 0.463 (0.026) | 0.537 | 0.585 (0.066) | 0.716 (0.018) | 0.785 (0.024) |
| PPV mean (SD) | 0.221 (0.016) | 0.111 (0.005) | 0.151 | 0.153 (0.019) | 0.359 (0.014) | 0.131 (0.014) |
| NPV mean (SD) | 0.978 (0.001) | 0.964 (0.002) | 0.961 | 0.968 (0.003) | 0.944 (0.001) | 0.988 (0.000) |
| Accuracy mean (SD) | 0.773 (0.020) | 0.489 (0.024) | 0.684 | 0.602 (0.060) | 0.728 (0.014) | 0.784 (0.023) |
| DOR mean (SD) | 13.905 (1.617) | 3.484 (0.367) | 4.200 | 5.861 (1.440) | 10.030 (0.822) | 14.396 (2.212) |
| LR+ mean (SD) | 3.545 (0.319) | 1.494 (0.073) | 1.692 | 1.970 (0.292) | 2.821 (0.178) | 3.740 (0.452) |
| LR− mean (SD) | 0.256 (0.007) | 0.431 (0.024) | 0.403 | 0.344 (0.038) | 0.282 (0.007) | 0.261 (0.008) |
| F1 mean (SD) | 0.345 (0.019) | 0.194 (0.007) | 0.247 | 0.256 (0.027) | 0.494 (0.013) | 0.224 (0.020) |
AUROC, area under the receiver operating characteristic curve; CNN, convolutional neural network; DOR, diagnostic odds ratio; KDIGO, Kidney Disease: Improving Global Outcomes; LR+, positive likelihood ratio; LR−, negative likelihood ratio; MIMIC-III, Medical Information Mart for Intensive Care III; NPV, negative predictive value; PPV, positive predictive value; SD, standard deviation; SOFA, sequential organ failure assessment.
The CNN model is compared with an XGBoost classifier and the SOFA score. SOFA required no training and thus could be applied to the entire test set at once; hence, no SD is reported. Additional comparison is made to the CNN model without the use of the Doc2Vec network (i.e., without unstructured text data) and for the prediction of KDIGO criteria of any stage.
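The "mean (SD)" entries summarize one value per cross-validation fold, whereas a score applied once to the whole test set, such as SOFA here, yields a single value with no SD. The sketch below illustrates that formatting. The per-fold values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the source does not state which form was used.

```python
import statistics


def summarize_folds(values, digits=3):
    """Format per-fold metric values as 'mean (SD)'.

    A single value (a score evaluated once) is returned without an SD.
    """
    if len(values) == 1:
        return f"{values[0]:.{digits}f}"
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption, see lead-in
    return f"{mean:.{digits}f} ({sd:.{digits}f})"


# Hypothetical AUROC values from 10 folds (not the study's data).
fold_auroc = [0.85, 0.87, 0.86, 0.88, 0.84, 0.86, 0.87, 0.85, 0.88, 0.86]
print(summarize_folds(fold_auroc))
print(summarize_folds([0.727]))  # single evaluation, no SD
```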
Figure 2. ROC curve comparison of prediction performance using a CNN classifier, an XGB classifier, and the SOFA score, 48 hours before AKI onset on the MIMIC-III ICU hold-out data set. AKI, acute kidney injury; AUROC, area under the receiver operating characteristic curve; CNN, convolutional neural network; ICU, intensive care unit; MIMIC-III, Medical Information Mart for Intensive Care III; ROC, receiver operating characteristic; SOFA, Sequential Organ Failure Assessment score; XGB, XGBoost.
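The area under an ROC curve can be read as the probability that a randomly chosen positive encounter receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes AUROC directly from that definition. It is an illustration on hypothetical scores, not the evaluation code used in the study, and its quadratic pairwise loop is only suitable for small examples.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    labels -- iterable of 0/1 ground-truth labels
    scores -- iterable of model scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```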