Franz Ratzinger, Helmuth Haslacher, Thomas Perkmann, Matilde Pinzan, Philip Anner, Athanasios Makristathis, Heinz Burgmann, Georg Heinze, Georg Dorffner.
Abstract
Bacteraemia is a life-threatening condition requiring immediate diagnostic and therapeutic action. Blood culture (BC) analyses often yield a low true-positive rate, indicating that they are used improperly. A predictive model might help clinicians decide in which patients with a relevant bacteraemia risk BC analysis should be conducted or avoided. Predictive models were established using linear and non-linear machine learning methods. To obtain suitable data, a unique data set was collected prior to model estimation in a prospective cohort study that screened 3,370 standard-care patients with suspected bacteraemia. Data from 466 patients fulfilling two or more systemic inflammatory response syndrome criteria (bacteraemia rate: 28.8%) were finally used. A 29-parameter panel of clinical data, cytokine expression levels and standard laboratory markers was used for model training. Model tuning was performed with ten-fold cross-validation, and the tuned models were validated in a test set (80:20 random split). The random forest strategy presented the best result in the test set validation (ROC-AUC: 0.729, 95% CI: 0.679–0.779). However, procalcitonin (PCT), as the best individual variable, yielded a similar ROC-AUC (0.729, 95% CI: 0.679–0.779). Thus, machine learning methods failed to improve on the moderate diagnostic accuracy of PCT.
Year: 2018 PMID: 30111827 PMCID: PMC6093921 DOI: 10.1038/s41598-018-30236-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
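The validation design described in the abstract (an 80:20 random split followed by ten-fold cross-validation on the training portion) can be sketched as index bookkeeping in plain Python. This is only an illustrative sketch: the function names and seeds are hypothetical, and the split below is a simple unstratified shuffle, since the abstract does not say whether the authors stratified by outcome.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Randomly partition range(n) into sorted train and test index lists."""
    rng = random.Random(seed)          # hypothetical seed, for reproducibility only
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])

def k_fold_indices(indices, k=10, seed=0):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i in range(k):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, folds[i]

# 466 patients as in the abstract: 373 for tuning, 93 held out for testing.
train, test = train_test_split_indices(466)
print(len(train), len(test))  # prints: 373 93

for fit, validation in k_fold_indices(train, k=10):
    pass  # tune a model on `fit`, score it on `validation`
```

With only about 93 patients in the held-out set, confidence intervals for test-set ROC-AUC values are expected to be wide.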
Clinical data of study participants.
| Feature | Missing | No bacteraemia (n = 332) | Bacteraemia (n = 134) | p–value |
|---|---|---|---|---|
| Age | 0.0% | 56.7 (41.9–69.0) | 60.1 (45.3–69.2) | 0.206 |
| BMI | 0.0% | 24.8 (21.6–28.8) | 24.8 (20.8–27.6) | 0.183 |
| Sex | 0.0% | 192: 140 (57.8%: 42.2%) | 71: 63 (53.0%: 47.0%) | 0.354 |
| AB naïvety | 0.0% | 264 (79.5%) | 116 (86.6%) | 0.087 |
| Catheter | 0.0% | 85 (25.6%) | 45 (33.6%) | 0.088 |
| Post–surgical | 0.0% | 22 (6.6%) | 7 (5.2%) | 0.777 |
| Neoplasm | 0.0% | 129 (38.9%) | 60 (44.8%) | 0.252 |
| HBR | 0.0% | 99 (92–107) | 100 (91–110) | 0.195 |
| RR | 0.0% | 20 (16–24) | 21 (16.5–24) | 0.094 |
| BT | 0.0% | 38.4 (38–38.9) | 38.5 (38.1–39.1) | 0.005* |
| SIRS–No# | 0.0% | 130:155:47 | 52:58:24 | 0.558 |
| COPD | 0.0% | 42 (12.8%) | 12 (9.0%) | 0.268 |
| Asplenia | 0.0% | 8 (2.4%) | 2 (1.5%) | 0.731 |
| Dysuria | 0.0% | 28 (8.5%) | 15 (11.2%) | 0.379 |
| Dialysis | 0.0% | 15 (4.5%) | 7 (5.2%) | 0.810 |
| Diabetes | 0.0% | 57 (17.2%) | 25 (18.7%) | 0.689 |
*significant after applying the Bonferroni-Holm correction; #number of patients fulfilling 2:3:4 SIRS criteria; BMI = body mass index, AB naïvety = no antibiotics prior to blood culture sampling, HBR = heart beat rate, RR = respiration rate, BT = body temperature, SIRS = systemic inflammatory response syndrome, COPD = chronic obstructive pulmonary disease.
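The Bonferroni-Holm correction named in the footnote is a step-down procedure: p-values are sorted in ascending order, and the i-th smallest (counting from zero) is compared with alpha / (m - i), stopping at the first failure. A minimal sketch follows; the p-values are hypothetical and are not taken from the table, because the record does not state which family of tests the authors corrected over.

```python
def holm_reject(p_values, alpha=0.05):
    """Return one boolean per p-value: True where Holm's procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # step-down: all larger p-values are retained as well
    return reject

p = [0.001, 0.04, 0.03, 0.005]   # hypothetical p-values
print(holm_reject(p))            # prints: [True, False, False, True]
```

Here 0.03 fails against its threshold of 0.05 / 2 = 0.025, so 0.04 is not tested even though it is below 0.05.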
Laboratory data analysed in the study.
| Feature | Unit | Missing | No bacteraemia | Bacteraemia | p–value | ROC-AUC (95% CI) |
|---|---|---|---|---|---|---|
| PCT | ng/ml | 1.5% | 0.3 (0.1–1.0) | 1.6 (0.4–5.4) | <0.001* | 0.729 (0.679–0.779) |
| CRP | mg/dl | 0.4% | 12.9 (7.9–20.4) | 15.0 (9.6–22.8) | 0.020 | 0.569 (0.512–0.626) |
| LBP | µg/ml | 1.1% | 23.1 (15.6–35.4) | 29.7 (19.7–44.25) | <0.001* | 0.610 (0.553–0.667) |
| IL-6 | pg/ml | 1.7% | 42.8 (19.2–99.9) | 49.6 (28.9–130.0) | 0.028 | 0.566 (0.508–0.623) |
| Fib | mg/dl | 6.0% | 607 (446–752) | 613 (490–714) | 0.875 | 0.505 (0.447–0.563) |
| SI | µg/dl | 1.7% | 26.0 (14.8–57.0) | 21.0 (12.3–46.8) | 0.042 | 0.561 (0.502–0.619) |
| TP | g/l | 2.4% | 61.6 (55.8–67.5) | 60.5 (53.2–65.8) | 0.062 | 0.556 (0.498–0.614) |
| ALAT | U/L | 1.9% | 25.0 (15.0–45.0) | 33.0 (18.0–64.0) | 0.005 | 0.585 (0.526–0.643) |
| Alb | g/l | 2.4% | 31.3 (27.9–35.1) | 29.5 (25.6–33.1) | <0.001* | 0.603 (0.546–0.673) |
| Bili | mg/dl | 3.9% | 0.6 (0.5–1.0) | 0.8 (0.6–1.4) | <0.001* | 0.616 (0.558–0.673) |
| γ–GT | U/L | 2.4% | 63 (30–130) | 102 (44–260) | <0.001* | 0.617 (0.559–0.675) |
| Crea | mg/dl | 0.4% | 0.9 (0.8–1.3) | 0.9 (0.8–1.3) | 0.707 | 0.511 (0.452–0.570) |
| LDH | U/L | 4.3% | 220 (168–312) | 200 (157–286) | 0.161 | 0.543 (0.485–0.602) |
| Hb | g/dl | 2.4% | 10.0 (9.0–11.7) | 10.0 (9.1–10.9) | 0.340 | 0.529 (0.473–0.584) |
| Plt | G/l | 2.4% | 208 (116–326) | 190 (134–279) | 0.426 | 0.524 (0.457–0.580) |
| WBC | G/l | 2.4% | 8.7 (5.4–13.6) | 8.7 (5.5–12.6) | 0.629 | 0.486 (0.428–0.543) |
| NeuR | % | 7.1% | 75.5 (63.9–82.8) | 79.5 (67.2–86.6) | 0.021 | 0.570 (0.510–0.631) |
| EosR | % | 4.3% | 0.9 (0.2–2.5) | 0.8 (0.2–1.9) | 0.308 | 0.531 (0.472–0.589) |
| IL-10** | pg/ml | 0.0% | 2.2 (1.4–4.6) | 3.2 (1.7–7.3) | 0.002 | 0.589 (0.532–0.645) |
| IL-17a** | pg/ml | 0.0% | 0.8 (0.0–3.1) | 2.7 (0–7.5) | <0.001* | 0.601 (0.542–0.660) |
| MIP-1b** | pg/ml | 0.0% | 52.1 (29.7–82.5) | 72.05 (43.6–134.7) | <0.001* | 0.615 (0.557–0.673) |
*significant after applying the Bonferroni-Holm correction; CRP = C-reactive protein, LBP = lipopolysaccharide binding protein, PCT = procalcitonin; IL-6 = interleukin-6, Fib = fibrinogen according to Clauss, SI = serum iron, TP = total protein, ALAT = alanine transaminase, Alb = albumin, Bili = bilirubin, γ-GT = gamma-glutamyl transpeptidase, Crea = creatinine, LDH = lactate dehydrogenase, Hb = haemoglobin, Plt = platelets, WBC = white blood cell counts, NeuR = relative proportion of neutrophils, EosR = relative proportion of eosinophils, IL-10 = interleukin-10, IL-17a = interleukin-17a, MIP-1b = macrophage inflammatory protein-1β, **not in routine use.
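For a single marker, the ROC-AUC reported in the last column equals the probability that a randomly chosen bacteraemia patient has a higher value than a randomly chosen patient without bacteraemia, counting ties as one half. The sketch below computes this directly and adds an approximate confidence interval. The Hanley-McNeil formula used for the interval is an assumption chosen for illustration; the record does not state which method the authors used, so the result need not reproduce the tabulated intervals. The marker values are hypothetical.

```python
import math

def roc_auc(pos, neg):
    """AUC = P(pos > neg) + 0.5 * P(pos == neg) over all case/control pairs."""
    wins = 0.0
    for x in pos:
        for y in neg:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC (Hanley-McNeil standard error; an assumption here)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2)) / (n_pos * n_neg)
    half = z * math.sqrt(var)
    return max(0.0, auc - half), min(1.0, auc + half)

# Hypothetical marker values (not patient data).
bacteraemia = [1.6, 0.4, 5.4, 0.9]
no_bacteraemia = [0.3, 0.1, 1.0, 0.5, 0.2]

auc = roc_auc(bacteraemia, no_bacteraemia)
print(auc)  # prints: 0.85
print(hanley_mcneil_ci(auc, len(bacteraemia), len(no_bacteraemia)))
```

The pairwise loop is quadratic in the sample size, which is unproblematic for a cohort of a few hundred patients.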
Figure 1. Correlogram of the features with the highest correlation to PCT. The axis labels for x and y are given on the diagonal. The following parameters are displayed: PCT = procalcitonin, CRP = C-reactive protein, TP = total protein, LBP = lipopolysaccharide binding protein, Alb = albumin, Crea = creatinine, IL-6 = interleukin-6, NeuR = relative proportion of neutrophils, Plt = platelets, Bili = bilirubin. Spearman correlation coefficients are presented in the lower left part of the correlogram, with p-values denoted as follows: *** < 0.0001, ** < 0.001, * < 0.05. Scatterplots of the presented features are shown in the upper right part.
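The Spearman coefficient shown in the correlogram is the Pearson correlation of the ranks of two variables, with tied values sharing their average rank. A minimal sketch in plain Python, using hypothetical input values:

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend the run of tied values
        avg = (i + j) / 2 + 1           # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical marker values for five patients.
marker_a = [0.3, 1.6, 0.4, 5.4, 0.9]
marker_b = [12.9, 15.0, 9.6, 22.8, 7.9]
print(round(spearman(marker_a, marker_b), 3))
```

Because only ranks enter the calculation, the coefficient is unaffected by the strongly skewed distributions typical of inflammation markers.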
Figure 2. Missing data aggregation plot. Left: distribution of missing data, shown as percentages. Right: missing pattern analysis (aggregation missingness plot, VIM package); the percentage of each missing pattern is displayed on the right side. 81% of the total study population had no missing values.
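The missing pattern analysis in the right panel amounts to counting how often each combination of missing features occurs. The figure itself was produced with the VIM package; the following is only a rough Python analogue of that counting step, with a hypothetical function name and made-up rows.

```python
from collections import Counter

def missing_patterns(rows, features):
    """Count how often each combination of missing features occurs.

    rows: list of dicts mapping feature name -> value (None means missing).
    Returns (pattern, share) pairs, most frequent first, where pattern is
    the tuple of features missing in a row (empty tuple = complete row).
    """
    counts = Counter(
        tuple(f for f in features if row.get(f) is None) for row in rows
    )
    return [(pattern, count / len(rows)) for pattern, count in counts.most_common()]

# Hypothetical rows, not patient data.
rows = [
    {"PCT": 0.3, "Fib": 607, "NeuR": 75.5},
    {"PCT": 1.6, "Fib": None, "NeuR": 79.5},
    {"PCT": 0.4, "Fib": 613, "NeuR": None},
    {"PCT": None, "Fib": None, "NeuR": 63.9},
    {"PCT": 5.4, "Fib": 490, "NeuR": 82.8},
]

for pattern, share in missing_patterns(rows, ["PCT", "Fib", "NeuR"]):
    print(pattern or "complete", f"{share:.0%}")
```

In this toy input the complete-row pattern has a share of 40%; in the study population the corresponding share is the 81% given in the caption.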
Comparison of the ROC-AUC of the used ML strategies in different patient groups.
| | All patients | AB naïvety | Patients with 2 SIRS criteria | Patients with 3 SIRS criteria | Patients with 4 SIRS criteria |
|---|---|---|---|---|---|
| n | 466 | 380 | 182 | 213 | 71 |
| Bacteraemia rate | 28.8% | 30.5% | 28.6% | 27.2% | 33.8% |
| PCT | 0.729 (0.679–0.779) | 0.734 (0.680–0.787) | 0.679 (0.598–0.762) | 0.756 (0.678–0.833) | 0.751 (0.633–0.869) |
| rf | 0.738 (0.606–0.870) | 0.727 (0.548–0.905) | 0.698 (0.349–0.999) | 0.781 (0.573–0.988) | 0.585 (0.188–0.981) |
| nn | 0.698 (0.549–0.857) | 0.688 (0.499–0.876) | 0.640 (0.355–0.925) | 0.714 (0.497–0.930) | 0.583 (0.181–0.985) |
| en | 0.654 (0.493–0.815) | 0.627 (0.396–0.858) | 0.594 (0.334–0.854) | 0.690 (0.466–0.914) | 0.612 (0.214–0.999) |
PCT = procalcitonin, rf = random forest, nn = neural network, en = elastic net.