Oscar Garnica, Diego Gómez, Víctor Ramos, J Ignacio Hidalgo, José M Ruiz-Giardín.
Abstract
Background: Predicting bacteraemia is relevant because sepsis is one of the most important causes of morbidity and mortality, and bacteraemia prognosis depends primarily on a rapid diagnosis. Such a prediction could shorten the diagnosis by up to 6 days. In conjunction with individual patient variables, it should be considered when deciding on the early administration of personalised antibiotic treatment and medical services, the selection of specific diagnostic techniques, and the determination of additional treatments, such as surgery, that would prevent subsequent complications. Machine learning techniques could help physicians make these informed decisions by predicting bacteraemia from the data already available in electronic hospital records. Objective: This study presents the application of machine learning techniques to these records to predict the blood culture's outcome, which would reduce both the lag in starting a personalised antibiotic treatment and the medical costs associated with erroneous treatments due to conservative assumptions about blood culture outcomes.
Keywords: Bacteraemia diagnosis; Bacteraemia prediction; Blood culture’s outcome prediction; COVID-19; Health policy; Healthcare economy; Individualised electronic patient record analysis; K-Nearest neighbours; Machine learning; Modelling; Personalised antibiotic treatment; Predictive; Preventive and personalised medicine (PPPM/3PM); Random forest; Support vector machine
Year: 2021 PMID: 34484472 PMCID: PMC8405861 DOI: 10.1007/s13167-021-00252-3
Source DB: PubMed Journal: EPMA J ISSN: 1878-5077 Impact factor: 6.543
Fig. 1 Accuracy of the individual features when only two classes (missing and non-missing) are used to predict bacteraemia
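The idea behind Fig. 1 can be illustrated with a small sketch: each feature is reduced to a binary indicator (value missing or present), and that indicator alone is used to predict the outcome. The code below is a minimal illustration under the assumption that each of the two groups predicts its majority label; the function name `missingness_accuracy` and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter

def missingness_accuracy(values, labels):
    """Accuracy when predicting `labels` only from whether each value is missing (None)."""
    groups = {True: Counter(), False: Counter()}
    for v, y in zip(values, labels):
        groups[v is None][y] += 1
    # Each group predicts its majority label, so the correct
    # predictions are the majority counts of the non-empty groups.
    correct = sum(max(c.values()) for c in groups.values() if c)
    return correct / len(labels)

# Toy data (hypothetical): 1 = bacteraemia, 0 = no bacteraemia.
values = [5.1, None, 7.3, None, None, 6.0]
labels = [1, 0, 1, 0, 1, 1]
print(missingness_accuracy(values, labels))
```

This measures accuracy on the same data used to pick the majority labels, so it is an optimistic estimate; the study may have evaluated the indicator differently.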
Fig. 2 Percentage of missing values for all the features in the dataset. The features are sorted on the x-axis as in Table 5. The annotations in the graph mark the inflection points and facilitate cross-searching in Table 5
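A per-feature missing-value percentage of the kind plotted in Fig. 2 could be computed along the following lines. This is a minimal sketch, assuming records are stored as dictionaries with `None` marking a missing value and that features are ordered from most to least missing; the function name and the feature keys (`age`, `crp`, `heart_rate`) are hypothetical.

```python
def missing_percentages(records, features):
    """Return (feature, % missing) pairs sorted from most to least missing."""
    n = len(records)
    pct = {
        f: 100.0 * sum(r.get(f) is None for r in records) / n
        for f in features
    }
    return sorted(pct.items(), key=lambda kv: kv[1], reverse=True)

# Toy records (hypothetical); None marks a missing value.
records = [
    {"age": 70, "crp": None, "heart_rate": 88},
    {"age": 54, "crp": 12.5, "heart_rate": None},
    {"age": 81, "crp": None, "heart_rate": 95},
    {"age": 63, "crp": None, "heart_rate": 76},
]
ranking = missing_percentages(records, ["age", "crp", "heart_rate"])
print(ranking)
```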
Table 5 Features in the study sorted according to the number of missing values
| # | Description |
|---|---|
| 1 | Days in Intensive Care Unit before blood culture extraction |
| 2 | Suspected source of bacteraemia previous to blood culture |
| 3 | C-reactive protein level |
| 4 | Days after last catheter was placed |
| 5 | Altered coagulation values |
| 6 | Heart rate |
| 7 | Catheter type |
| 8 | Urea (mg dl−1) |
| 9 | Diastolic blood pressure |
| 10 | Systolic blood pressure |
| 11 | Hypotension |
| 12 | Fever. Armpit temperature > 38 °C at the time of blood extraction |
| 13 | Armpit temperature at blood extraction in Emergency Room |
| 14 | Consciousness level at the moment of bacteraemia |
| 15 | Use of vasopressor agents at the time of bacteraemia |
| 16 | Cardiorespiratory resuscitation at the moment of bacteraemia |
| 17 | Days to CO2 detection |
| 18 | Days with fever before blood culture is obtained |
| 19 | First blood culture vial with growth |
| 20 | Genitourinary manipulations |
| 21 | Vascular manipulations |
| 22 | Thrombocytopenia |
| 23 | Leukocytosis |
| 24 | Respiratory manipulations |
| 25 | Digestive manipulations |
| 26 | Symptoms related to the source of fever |
| 27 | Glycemia |
| 28 | Neutropenia |
| 29 | Previous surgery |
| 30 | Steroids |
| 31 | Immunosuppressants |
| 32 | Drug addiction |
| 33 | Urine sediment |
| 34 | Blood creatinine (mg dl−1) |
| 35 | Comorbidities by Weinstein classification |
| 36 | Alcoholism |
| 37 | Renal insufficiency |
| 38 | Intravenous drug addiction |
| 39 | Cardiopathy |
| 40 | Diabetes |
| 41 | Chronic respiratory disease |
| 42 | Hepatopathy |
| 43 | Active neoplasia |
| 44 | Hospitalization longer than 48h in last 12 months |
| 45 | Leukocytes ( |
| 46 | Polymorphonuclear leukocytes (%) |
| 47 | Syndromes related to the source of fever |
| 48 | Platelets ( |
| 49 | Hospitalization in the last 30 days |
| 50 | Hemoglobin (g dl−1) |
| 51 | Specialty where bacteraemia is suspected |
| 52 | Days in Hospital before blood extraction |
| 53 | Antibiotics |
| 54 | Systematic urine analysis |
| 55 | Comorbidities |
| 56 | Number of blood culture vials obtained |
| 57 | Growth at least in anaerobic environments |
| 58 | Growth at least in aerobic environments |
| 59 | Polymicrobial bacteraemia microorganisms |
| 60 | Growth medium of true bacteraemias |
| 61 | Day of blood extraction |
| 62 | Month of blood extraction |
| 63 | Anaerobic bacteraemias versus other bacteraemias |
| 64 | Fungal bacteraemias versus other bacteraemias |
| 65 | Anaerobic microorganisms |
| 66 | Polymicrobial origin bacteraemia |
| 67 | Age |
| 68 | Gender |
| 69 | Final classification of blood culture |
Fig. 3 Number of features versus number of non-missing values in the dataset
Accuracy, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV) and area under the curve (AUC) of the models
| ML | Model | Training accuracy (%) | Testing accuracy (%) | Sensitivity (%) | Specificity (%) | PPV (%) | NPV (%) | AUC |
|---|---|---|---|---|---|---|---|---|
| SVM | pre_culture | 76.9 ± 1.7 | 75.9 | 80.7 | 71.4 | 72.8 | 79.6 | 0.85 |
| SVM | mid_culture | 83.0 ± 1.4 | 80.5 | 81.3 | 79.7 | 80.5 | 80.5 | 0.88 |
| RF | pre_culture | 79.5 ± 1.4 | 78.2 | 86.1 | 70.7 | 73.6 | 84.3 | 0.86 |
| RF | mid_culture | 85.6 ± 1.4 | 85.9 | 87.4 | 84.4 | 85.2 | 86.6 | 0.93 |
| KNN | pre_culture | 72.8 ± 2.3 | 76.5 | 89.6 | 65.2 | 69.0 | 87.9 | 0.85 |
| KNN | mid_culture | 78.0 ± 2.7 | 78.4 | 87.4 | 69.9 | 73.6 | 85.2 | 0.88 |
For the sake of saving space, the standard deviation is presented in compact notation
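For readers less familiar with the metrics in the table above, the sketch below shows how accuracy, sensitivity, specificity, PPV and NPV follow from the four cells of a binary confusion matrix, using the standard definitions. The function name and the counts are hypothetical and do not come from the study; AUC is omitted because it requires ranked scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics, in percent."""
    total = tp + fp + tn + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),  # true positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true negative rate
        "ppv": 100.0 * tp / (tp + fp),          # positive predictive value
        "npv": 100.0 * tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts: 100 positive and 100 negative blood cultures.
metrics = binary_metrics(tp=80, fp=30, tn=70, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}")
```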
Feature importance for SVM
The left-hand side of the table ranks the top 10 features for the pre_culture model, whereas the right-hand side ranks the top 10 features for the mid_culture model. In blue, the new features included in the mid-culture model. For the sake of saving space, the standard deviation is presented in compact notation, that is, 0.4514(540) ≡ 0.4514 ± 0.0540. The number close to the feature name refers to the Id. in Table 5 that describes the feature
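The compact notation can be expanded mechanically: the digits in parentheses are the standard deviation expressed in units of the last decimal places of the value. The following is a minimal sketch, assuming values are plain decimals such as `0.4514(540)`; the function name `expand_compact` is hypothetical.

```python
import re

def expand_compact(text):
    """Expand compact notation such as '0.4514(540)' into (value, std)."""
    m = re.fullmatch(r"(\d+)\.(\d+)\((\d+)\)", text.strip())
    if m is None:
        raise ValueError(f"not in compact notation: {text!r}")
    whole, frac, unc = m.groups()
    value = float(f"{whole}.{frac}")
    # The parenthesised digits are scaled to the last decimal place of the value.
    std = int(unc) * 10 ** (-len(frac))
    return value, std

value, std = expand_compact("0.4514(540)")
print(f"{value:.4f} ± {std:.4f}")
```

With the example from the note, this yields 0.4514 ± 0.0540.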
Fig. 4 ROC for the best SVM, RF and KNN for models
Feature importance for RF
The left-hand side of the table ranks the top 10 features for the pre_culture model whereas the right-hand side ranks the top 10 features for the mid_culture model. In blue, the new features included in the mid-culture model. The number close to the feature name refers to the Id. in Table 5 that describes the feature
Feature importance for KNN
The left-hand side of the table ranks the top 10 features for the pre_culture model whereas the right-hand side ranks the top 10 features for the mid_culture model. In blue, the new features included in the mid-culture model