Lucas M Fleuren, Thomas L T Klausch, Charlotte L Zwager, Linda J Schoonmade, Tingjie Guo, Luca F Roggeveen, Eleonora L Swart, Armand R J Girbes, Patrick Thoral, Ari Ercole, Mark Hoogendoorn, Paul W G Elbers.
Abstract
PURPOSE: Early clinical recognition of sepsis can be challenging. With the advancement of machine learning, promising real-time models to predict sepsis have emerged. We assessed their performance by carrying out a systematic review and meta-analysis.
Keywords: Machine learning; Meta-analysis; Prediction; Sepsis; Septic shock; Systematic review
Year: 2020 PMID: 31965266 PMCID: PMC7067741 DOI: 10.1007/s00134-019-05872-y
Source DB: PubMed Journal: Intensive Care Med ISSN: 0342-4642 Impact factor: 17.440
Fig. 1 Left versus right alignment. Left alignment (top) versus right alignment (bottom). Cases are aligned at the alignment point; data are collected in the feature window; the prediction window is the time by which the prediction precedes sepsis onset. Red: sepsis cases; green: non-septic cases
Fig. 2 Flow diagram. Papers identified in databases, screened by title/abstract, read in full text, and included in the synthesis. Reasons for exclusion are listed
Fig. 3 Prospective versus retrospective models. Percentages specified per paper and for all models
Fig. 4 Overview of retrospective diagnostic test accuracy studies. Papers are binned per hospital setting; data are sorted in ascending order of AUROC values. AUROC ranges are displayed per paper. AUROC area under the curve of the receiver operating characteristic, SVM support vector machines, GLM generalized linear model, NB Naive Bayes, EM ensemble methods, NNM neural network model, DT decision trees, PHM proportional hazards model, LSTM long short-term memory, Hrs bef. onset hours before onset. * DT, EM, GLM, LSTM, NB, NNM, SVM
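The windowing described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It assumes, since the caption does not spell this out, that left alignment anchors the feature window at a fixed start point such as admission, while right alignment anchors it relative to sepsis onset. The `Observation` type, the function names, and the hour-based time axis are hypothetical and are not taken from any of the reviewed papers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    hour: float   # hours since admission (assumed time axis)
    value: float  # a single measurement, e.g. heart rate


def right_aligned_window(observations: List[Observation], onset_hour: float,
                         prediction_window: float,
                         feature_window: float) -> List[Observation]:
    """Select data from a feature window that ends `prediction_window`
    hours before sepsis onset (assumed meaning of right alignment)."""
    end = onset_hour - prediction_window
    start = end - feature_window
    return [o for o in observations if start <= o.hour < end]


def left_aligned_window(observations: List[Observation], feature_window: float,
                        start_hour: float = 0.0) -> List[Observation]:
    """Select data from a feature window that begins at a fixed alignment
    point such as admission (assumed meaning of left alignment)."""
    end = start_hour + feature_window
    return [o for o in observations if start_hour <= o.hour < end]
```

With hourly observations and an onset at hour 10, a prediction window of 3 h and a feature window of 4 h would select the measurements taken at hours 3 to 6 under right alignment, and hours 0 to 3 under left alignment.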
Prospective models
| Setting | Paper | Design | Target condition | Patient encounters | Machine learning model | Comparators |
|---|---|---|---|---|---|---|
| ED | Brown et al. | Prospective validation | Severe sepsis and septic shock | 93,773 (15 months) | Cut05. Primary outcome: sensitivity 0.764, false positive rate 0.47. Secondary outcome: AUC 0.859 | Nurse triage. Primary outcome: sensitivity 0.543, false positive rate 0.31. Secondary outcome: AUC 0.756 |
| | | | | | | SIRS. Primary outcome: sensitivity 0.216, false positive rate 0.004. Secondary outcome: AUC 0.606 |
| In-hospital | Thiel et al. | Prospective validation | Septic shock | 27,674 (24 months) | RPART^a 2006. Primary outcome: misclassification rate 8.4% | None |
| | | | | | RPART^a 2007. Primary outcome: misclassification rate 8.8% | |
^a Recursive partitioning and regression tree (RPART) analysis
^b Only baseline and steady state are reported
^c Nurses scored patients twice daily to see if they met the SIRS criteria
^d Elastic net regularization (generalized linear model)
^e Significant results
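The outcome measures reported in the table above (sensitivity, false positive rate, misclassification rate) all derive from a 2×2 confusion matrix. The following is a minimal sketch of those standard definitions; the function names are illustrative, and the counts in the usage note are made up rather than taken from Brown et al. or Thiel et al.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true sepsis cases that the model flags."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of non-septic cases that the model flags incorrectly."""
    return fp / (fp + tn)


def misclassification_rate(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases assigned to the wrong class."""
    return (fp + fn) / (tp + fp + tn + fn)
```

For example, with hypothetical counts of 76 true positives, 24 false negatives, 470 false positives, and 530 true negatives, `sensitivity(76, 24)` gives 0.76 and `false_positive_rate(470, 530)` gives 0.47.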
Target condition definitions per paper per setting
Group listed when more than one paper used definitions. Combined outcomes are not shown, sorted alphabetically
^a Organ dysfunction: initiation of vasopressors or mechanical ventilation, elevated lactate level, or significant changes in baseline creatinine level, bilirubin level, or platelet count
^b Undefined
Description of the data per paper and per model
| Characteristic | ED^a: absolute | ED^a: proportion | In-hospital^a: absolute | In-hospital^a: proportion | ICU^a: absolute | ICU^a: proportion |
|---|---|---|---|---|---|---|
| Prospective design | 1 | 0.25 | 2 | 0.29 | 1 | 0.07 |
| Privacy statement | 0 | 0.00 | 3 | 0.43 | 5 | 0.33 |
| MIMIC^b | – | – | – | – | 9 | 0.60 |
| Description of patients | 4 | 1.00 | 2 | 0.29 | 5 | 0.33 |
| Inclusion criteria | 3 | 0.75 | 4 | 0.57 | 12 | 0.80 |
| Country—USA | 4 | 1.00 | 7 | 1.00 | 15 | 1.00 |
| Target condition | ||||||
| Sepsis | 20 | 0.91 | 10 | 0.23 | 37 | 0.71 |
| Severe sepsis | 0 | 0.00 | 1 | 0.02 | 12 | 0.23 |
| Severe sepsis & septic shock | 2 | 0.09 | 1 | 0.02 | 0 | 0.00 |
| Septic shock | 0 | 0.00 | 31 | 0.72 | 3 | 0.06 |
| Components of target condition definition | ||||||
| ICD | 20 | 0.91 | 32 | 0.74 | 17 | 0.33 |
| SIRS | 0 | 0.00 | 4 | 0.09 | 19 | 0.37 |
| SOFA | 0 | 0.00 | 1 | 0.02 | 21 | 0.40 |
| Data split design | ||||||
| Train-(validate)-test | 20 | 0.91 | 15 | 0.35 | 21 | 0.40 |
| Cross-validation | 0 | 0.00 | 25 | 0.58 | 28 | 0.54 |
| Data granularity | ||||||
| 1-hourly values | – | – | 6 | 0.14 | 30 | 0.58 |
| > 1/hourly values | – | – | 0 | 0.00 | 18 | 0.35 |
| Not described | – | – | 31 | 0.72 | 4 | 0.08 |
| Missing values strategies | ||||||
| Feedforward | 0 | 0.00 | 8 | 0.19 | 14 | 0.27 |
| Mean imputation | 0 | 0.00 | 9 | 0.21 | 12 | 0.23 |
| Zero imputation | 0 | 0.00 | 16 | 0.37 | 0 | 0.00 |
| Nearest neighbor | 0 | 0.00 | 0 | 0.00 | 16 | 0.31 |
| Physiological imputation | 13 | 0.59 | 0 | 0.00 | 0 | 0.00 |
| Other^c | 7 | 0.32 | 3 | 0.07 | 2 | 0.04 |
| Not described | 2 | 0.09 | 7 | 0.16 | 8 | 0.15 |
| Model | ||||||
| Generalized linear model | 3 | 0.14 | 6 | 0.14 | 15 | 0.29 |
| Naïve Bayes | 11 | 0.50 | 3 | 0.07 | 0 | 0.00 |
| Ensemble methods | 4 | 0.18 | 9 | 0.21 | 7 | 0.13 |
| Proportional hazard | 0 | 0.00 | 0 | 0.00 | 9 | 0.17 |
| Decision tree | 0 | 0.00 | 9 | 0.21 | 0 | 0.00 |
| Support vector machines | 4 | 0.18 | 3 | 0.07 | 11 | 0.21 |
| Neural network | 0 | 0.00 | 8 | 0.19 | 6 | 0.12 |
| Long short-term memory (LSTM) | 0 | 0.00 | 5 | 0.12 | 4 | 0.08 |
ICD International Statistical Classification of Diseases and Related Health Problems, SIRS systemic inflammatory response syndrome, SOFA sequential organ failure assessment
^a Study by Mao et al. (2017) with an ED, In-hospital, ICU setting has been omitted for brevity
^b Studies that included MIMIC in at least one of their reported models
^c Others: proxy variable, removal of variable, and predictive mean matching
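Three of the missing-value strategies listed in the table above can be sketched in a few lines of Python. This is illustrative only: it assumes "feedforward" means carrying the last observed value forward, represents a missing measurement as `None`, and uses hypothetical function names that do not come from any of the reviewed papers.

```python
from typing import List, Optional

Series = List[Optional[float]]


def feed_forward(values: Series) -> Series:
    """Carry the last observed value forward; leading gaps stay missing."""
    filled: Series = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mean_impute(values: Series) -> Series:
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to impute from
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def zero_impute(values: Series) -> Series:
    """Replace each missing value with zero."""
    return [0.0 if v is None else v for v in values]
```

Forward filling leaves a gap at the start of a series untouched, so in practice it would often be combined with a second strategy for values that are missing before the first observation.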
Fig. 5 Features used in the papers. Features are grouped by type. ESR erythrocyte sedimentation rate, HR heart rate, MAP mean arterial pressure
QUADAS-2 risk of bias assessment per setting
| Setting | Paper | Target condition | Risk of bias (surviving marker) |
|---|---|---|---|
| ED | Horng et al. | Sepsis | ? |
| ED | Haug et al. | Sepsis | ? |
| ED | Delahanty et al. | Sepsis | ? |
| ED | Brown et al. | Severe sepsis and septic shock | ? |
| In-hospital | Khojandi et al. | Sepsis | ? |
| In-hospital | Futoma et al. | Sepsis | ? |
| In-hospital | McCoy et al. | Severe sepsis, Sepsis | ? |
| In-hospital | Lin et al. | Septic shock | ? |
| In-hospital | Khoshnevisan et al. | Septic shock | ? |
| In-hospital | Thiel et al. | Septic shock | ? |
| In-hospital | Giannini et al. | Severe sepsis and septic shock | ? |
| ICU | Wang et al. | Sepsis | ? |
| ICU | Shashikumar II et al. | Sepsis | ? |
| ICU | Shashikumar I et al. | Sepsis | ? |
| ICU | Scherpf et al. | Sepsis | ? |
| ICU | Desautels et al. | Sepsis | ? |
| ICU | Nemati et al. | Sepsis | ? |
| ICU | Calvert II et al. | Sepsis | ? |
| ICU | Kam et al. | Sepsis | ? |
| ICU | Van Wyk I et al. | Sepsis | ? |
| ICU | Van Wyk II et al. | Sepsis | ? |
| ICU | Moss et al. | Severe sepsis | ? |
| ICU | Guillén et al. | Severe sepsis | ? |
| ICU | Shimabukuro et al. | Severe sepsis | ? |
| ICU | Henry et al. | Septic shock | ? |
| ICU | Calvert I et al. | Septic shock | ? |
| ED/In-hospital/ICU | Barton et al. | Sepsis | ? |
| ED/In-hospital/ICU | Mao et al. | Sepsis, severe sepsis, septic shock | ? |

Risk of bias domains assessed: patient selection, index test, reference standard, flow and timing, data management
low risk, high risk, ? unclear risk
GRADE evidence profile for area under the receiver operating characteristic curve (AUROC)
| No of studies | Design | Limitations (unclear risk of bias studies/total) | Indirectness of patients, setting^b | Indirectness of outcome | Inconsistency^c | Imprecision | AUROC high risk of bias/unclear risk of bias | Quality of evidence |
|---|---|---|---|---|---|---|---|---|
| ED | | | | | | | | |
| 3 studies (3,270,608 patients) | Cohort studies | High risk of bias (2/3) | None | Serious indirectness: differences in outcome definition | Not available | None | 0.95–0.97/0.65–0.97 | ⊕ ⊕ ⊙ ⊙ Low |
| In-hospital | | | | | | | | |
| 2 studies (51,540 patients) | Cohort studies | High risk of bias (0/2) | None | Serious indirectness: differences in outcome definition | Not available | None | 0.86–0.94 | ⊕ ⊕ ⊙ ⊙ Low |
| ICU | | | | | | | | |
| 8 studies (125,162 patients) | Cohort studies | High risk of bias (2/8) | None | Serious indirectness: differences in outcome definition | Not available | None | 0.70–0.99/0.81–0.88 | ⊕ ⊕ ⊙ ⊙ Low |
| 3 studies (6,647 patients) | Cohort studies | High risk of bias (0/3) | None | Serious indirectness: differences in outcome definition | Not available | None | 0.68–0.95 | ⊕ ⊕ ⊙ ⊙ Low |
| 2 studies (16,234 patients^a) | Cohort studies | High risk of bias (1/2) | None | Serious indirectness: differences in outcome definition | Not available | None | 0.89–0.96/0.83–0.83 | ⊕ ⊕ ⊙ ⊙ Low |
Only settings with at least two studies are reported
^a Calvert et al. (2016) had no information on total number of patients studied
^b Evidence profile is binned per setting
^c Confidence intervals were inconsistently reported, and therefore no heterogeneity assessment was performed
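The AUROC values summarized in the evidence profile have a simple probabilistic reading: the chance that a randomly chosen septic case receives a higher model score than a randomly chosen non-septic case, with ties counted as half. The sketch below computes it by direct pairwise comparison. It is a generic illustration of the metric, not the evaluation code of any reviewed study, and the function name is an assumption.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: share of (positive, negative) pairs in which the
    positive case scores higher, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based implementations give the same value far more efficiently on cohorts of the sizes listed above.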
Univariate and multivariate outcomes
| Variables | Univariate: Coeff | Univariate: SE | Univariate: p value | Multivariate: Coeff | Multivariate: SE | Multivariate: p value |
|---|---|---|---|---|---|---|
| Temperature as feature | 0.788 | 0.239 | 0.002 | 0.812 | 0.218 | 0.000 |
| Lab values as feature | 0.835 | 0.311 | 0.008 | 0.842 | 0.291 | 0.003 |
| Type of model (ref. = EM) | | | 0.018 | | | 0.020 |
| Generalized linear model | −0.211 | 0.251 | | −0.211 | 0.231 | |
| Naïve Bayes | −0.651 | 0.312 | | −0.682 | 0.291 | |
| Neural network | 0.344 | 0.300 | | 0.172 | 0.278 | |
| Proportional hazard | −0.464 | 0.851 | | −0.506 | 0.673 | |
| Support vector machines | −0.168 | 0.256 | | −0.161 | 0.241 | |
| Decision trees | −1.013 | 0.419 | | −1.088 | 0.399 | |
| Target condition defined as Seymour (Sepsis-3) | −1.039 | 0.459 | 0.025 | | | |
| Target condition definition contains SOFA | −0.935 | 0.438 | 0.033 | | | |
| Respiratory rate as feature | 0.672 | 0.250 | 0.008 | | | |
| Heart rate as feature | 0.680 | 0.327 | 0.037 | | | |
| Arterial blood gas as feature | 0.802 | 0.313 | 0.011 | | | |
Coeff coefficient, SE standard error, ref. reference model, EM ensemble methods, SOFA sequential organ failure assessment
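A coefficient and its standard error, as reported in the table above, can be turned into an approximate interval with a short calculation. The sketch below uses the normal (Wald) approximation with a critical value of 1.96, which is an assumption: this record does not state the scale of the coefficients or the test the authors used, so the results need not reproduce the reported p values. The function names are illustrative.

```python
from typing import Tuple


def wald_z(coeff: float, se: float) -> float:
    """Ratio of a coefficient to its standard error."""
    return coeff / se


def wald_interval(coeff: float, se: float, z_crit: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% interval under a normal approximation (assumed)."""
    half_width = z_crit * se
    return (coeff - half_width, coeff + half_width)
```

For the univariate temperature row (coefficient 0.788, SE 0.239), this gives a ratio of about 3.30 and an approximate interval of 0.32 to 1.26, which excludes zero.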
Fig. 6 Relative effect of hours before sepsis onset on AUROC for different models. Expected change in AUROC for three models at different prediction windows (hours before sepsis onset)
Retrospective studies demonstrate that machine learning models can accurately predict sepsis and septic shock onset. Prospective clinical studies at the bedside are needed to assess their effect on patient-relevant outcomes.