Bongjin Lee1, Hyun Jung Chung2, Hyun Mi Kang3, Do Kyun Kim4, Young Ho Kwak4.
Abstract
Serious bacterial infection (SBI) in children, such as bacterial meningitis or sepsis, can lead to fatal outcomes, so accurate diagnosis is essential. SBI prediction tools such as the 'Refined Lab-score' and the 'clinical prediction rule' have been developed and used for this purpose. However, these tools can predict SBI only when values are available for every factor they use; if even one value is missing, they cannot be applied. The purpose of this study was therefore to develop and validate a machine learning-driven model that predicts SBI among febrile children even when values are missing. This was a multicenter, retrospective, observational study of febrile children <6 years of age who visited the emergency departments (EDs) of three tertiary hospitals from 2016 to 2018. The SBI prediction model was trained on a derivation cohort (data from two hospitals) and externally tested on a validation cohort (data from a third hospital). A total of 11,973 and 2,858 patient records were included in the derivation and validation cohorts, respectively. In the derivation cohort, the random forest (RF) model had an area under the receiver operating characteristic curve (AUROC) of 0.964 (95% confidence interval [CI], 0.943-0.986) and an area under the precision-recall curve (AUPRC) of 0.753 (95% CI, 0.681-0.824). The conventional logistic regression (CLR) model showed corresponding values of 0.902 (95% CI, 0.894-0.910) and 0.573 (95% CI, 0.560-0.586). In the validation cohort, the RF model had an AUROC of 0.950 (95% CI, 0.945-0.956) and an AUPRC of 0.605 (95% CI, 0.593-0.616), and the CLR model had corresponding values of 0.815 (95% CI, 0.789-0.841) and 0.586 (95% CI, 0.553-0.619). We developed a machine learning-driven prediction model for SBI among febrile children that works robustly despite missing values. It showed superior performance compared with CLR in both internal and external validation.
Year: 2022 PMID: 35333881 PMCID: PMC8956167 DOI: 10.1371/journal.pone.0265500
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Flow chart of study subjects and the process of five-fold cross-validation.
Baseline characteristics of datasets.
| Category | Variable | Derivation cohort (n = 11,973) | Validation cohort (n = 2,858) |
|---|---|---|---|
| | Age, months | 20.0 (11.0–37.0) | 21.0 (12.0–35.0) |
| | Female | 5,467 (45.7) | 1,312 (45.9) |
| Clinical findings | Fever duration, hours | 24.0 (12.0–48.0) | 24.0 (7.0–48.0) |
| | Immunizations administered as recommended schedule | 9,822 (95.9) | 1,258 (91.8) |
| | Attends day care center | 3,995 (40.8) | 900 (70.1) |
| Physical examination findings | Heart rate, beats/minute | 150.0 (133.0–167.0) | 140.0 (129.0–153.0) |
| | Respiratory rate, breaths/minute | 30.0 (26.0–36.0) | 22.0 (20.0–28.0) |
| | Body temperature, ℃ | 38.3 (37.7–39.0) | 38.3 (37.7–39.0) |
| | Rash | 703 (6.5) | 119 (4.3) |
| Laboratory examination findings | Leukocyte, cells/mm³ | 7,365.0 (1,550.0–12,355.0) | 10,350.0 (6,515.0–14,425.0) |
| | C-reactive protein, mg/dL | 1.0 (0.3–3.0) | 0.9 (0.3–2.3) |
| | Procalcitonin, μg/L | 1.8 (1.1–3.7) | 0.1 (0.1–0.1) |
| | pH | 7.4 (7.4–7.4) | 7.4 (7.4–7.4) |
| Urinalysis | Bacteriuria | 786 (14.0) | 126 (14.3) |
| | Leukocyte esterase, negative | 4,429 (79.0) | 652 (74.0) |
| | Leukocyte esterase, trace | 250 (4.5) | 68 (7.7) |
| | Leukocyte esterase, 1+ | 373 (6.7) | 70 (7.9) |
| | Leukocyte esterase, 2+ | 264 (4.7) | 45 (5.1) |
| | Leukocyte esterase, 3+ | 288 (5.1) | 46 (5.2) |
| | Urine culture performed | 2,756 (23.0) | 764 (26.7) |
| | Blood culture performed | 3,470 (29.0) | 952 (33.3) |
| | Cerebrospinal fluid examination performed | 208 (1.7) | 2 (0.1) |
| Serious bacterial infectionᵃ | Bacteremia | 26 (5.6) | 0 (0.0) |
| | Urinary tract infection | 434 (93.1) | 93 (98.9) |
| | Lobar pneumonia | 4 (0.9) | 1 (1.1) |
| | Bacterial CNS infection | 1 (0.2) | 0 (0.0) |
| | Septic arthritis | 1 (0.2) | 0 (0.0) |

Continuous data are presented as median (interquartile range) and categorical data as number (%).
ᵃ Percentage of each item among all serious bacterial infection cases.
CNS, central nervous system.
Fig 2. Internal and external validation of predictive models.
Receiver operating characteristic curves for the derivation cohort (A) and the validation cohort (B), and precision-recall curves for the derivation cohort (C) and the validation cohort (D), are shown together with the area under each curve. AUC = the area under the curve, CI = confidence interval.
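As a rough illustration of the two summary metrics reported for these curves, the sketch below computes AUROC and a step-wise AUPRC (average precision) in plain Python. It is not the study's code: the function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, and the precision-recall function assumes no tied scores.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive case.
    Assumes scores are distinct (tie handling is omitted in this sketch).
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    if true_positives == 0:
        raise ValueError("need at least one positive case")
    return precision_sum / true_positives


# Hypothetical toy data: 2 positives among 8 cases.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
```

When the outcome is rare, as SBI is in these cohorts, the AUPRC is typically much lower than the AUROC for the same model, which is one reason to report both.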
Fig 3. Feature importance of the RF model based on the reduction in Gini impurity.
Factors important for SBI prediction are listed in order of importance; feature importance was obtained using the scikit-learn library [20]. CSF = cerebrospinal fluid.
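To make "reduction in Gini impurity" concrete, here is a minimal sketch of the quantity for a single binary split, written in plain Python rather than with scikit-learn. The helper names `gini` and `impurity_decrease` and the toy labels are hypothetical; impurity-based importance in a random forest aggregates weighted decreases of this kind over all splits on a feature and all trees, which this sketch does not do.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    # For two classes, 1 - p^2 - (1 - p)^2 simplifies to 2p(1 - p).
    return 2.0 * p * (1.0 - p)


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Hypothetical node with 3 positives among 8 cases, split into two halves.
parent_labels = [1, 1, 1, 0, 0, 0, 0, 0]
left_labels = [1, 1, 1, 0]   # e.g. cases where a binary feature is "Yes"
right_labels = [0, 0, 0, 0]  # e.g. cases where it is "No"
decrease = impurity_decrease(parent_labels, left_labels, right_labels)
```

A feature that separates positives from negatives cleanly, as in the toy split, yields a large decrease and therefore ranks high in an importance plot of this type.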
Conventional logistic regression analysis.
| Variable | Level | Univariable OR | Univariable 95% CI | Univariable P value | Multivariable OR | Multivariable 95% CI | Multivariable P value |
|---|---|---|---|---|---|---|---|
| Age, months | | 0.902 | 0.892–0.910 | <0.001 | | | |
| Sex | Female | Reference | | | Reference | | |
| | Male | 1.687 | 1.388–2.050 | <0.001 | 4.268 | 0.893–20.388 | 0.069 |
| Fever duration, hours | | 0.996 | 0.993–0.998 | 0.001 | | | |
| Immunizations administered as recommended schedule | | 0.547 | 0.354–0.845 | 0.007 | | | |
| | | 0.321 | 0.264–0.390 | <0.001 | | | |
| | | 0.322 | 0.265–0.391 | <0.001 | | | |
| Attends day care center | | 0.172 | 0.119–0.248 | <0.001 | | | |
| Rash | | 0.282 | 0.145–0.548 | <0.001 | | | |
| Heart rate, beats/minute | | 1.016 | 1.012–1.020 | <0.001 | | | |
| Respiratory rate, breaths/minute | | 1.056 | 1.046–1.065 | <0.001 | | | |
| Body temperature, ℃ | | 0.876 | 0.795–0.966 | 0.008 | 2.167 | 1.198–3.921 | 0.011 |
| Leukocyte, cells/mm³ | | 1.000 | 1.000–1.000 | 0.075 | | | |
| C-reactive protein, mg/dL | | 1.089 | 1.068–1.110 | <0.001 | | | |
| pH | | 8.598 | 1.272–58.101 | 0.027 | | | |
| Procalcitonin, μg/L | | 1.001 | 1.000–1.001 | 0.003 | | | |
| Bacteriuria | | 36.992 | 29.150–46.945 | <0.001 | | | |
| CSF examination performed | | 5.478 | 3.777–7.945 | <0.001 | | | |
| Blood culture performed | | 39.479 | 27.362–56.962 | <0.001 | | | |
| Urine culture performed | | 152.482 | 85.804–270.977 | <0.001 | 10.906 | 1.044–113.932 | 0.046 |
| Leukocyte esterase | Negative | Reference | | | Reference | | |
| | Trace | NA | NA | NA | NA | NA | NA |
| | 1+ | 8.133 | 5.839–11.328 | <0.001 | 12.455 | 2.372–65.393 | 0.003 |
| | 2+ | 24.402 | 17.834–33.387 | <0.001 | 9.266 | 0.644–133.354 | 0.102 |
| | 3+ | 80.037 | 58.667–109.191 | <0.001 | 102.649 | 15.377–685.242 | <0.001 |

ᵃ For each categorical (yes/no) variable, ‘Yes’ was analyzed with ‘No’ as the reference.
OR, odds ratio; CI, confidence interval; NA, not applicable; CSF, cerebrospinal fluid.
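The odds ratios and confidence intervals in this table can be related to logistic regression coefficients as sketched below. This assumes Wald-type intervals that are symmetric on the log-odds scale, which the source does not state; the function names `odds_ratio_ci` and `coef_from_reported` are hypothetical. The worked values use the univariable row for male sex (OR 1.687; 95% CI, 1.388–2.050).

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(odds_ratio, lower, upper, z=1.96):
    """Back out the coefficient and standard error implied by a reported OR and CI.

    Assumes the interval is exp(beta +/- z * se), i.e. a Wald interval.
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Univariable row for male sex, as reported in the table.
beta_male, se_male = coef_from_reported(1.687, 1.388, 2.050)
or_male, lower_male, upper_male = odds_ratio_ci(beta_male, se_male)
```

Under this assumption the reported interval is reproduced to within rounding, which suggests the limits are roughly symmetric around the estimate on the log scale. Very wide intervals, such as those in several multivariable rows, correspond to large standard errors.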
Values of categorical variables used in the analysis.
| Variable | Value | Derivation cohort (n = 11,973) | Validation cohort (n = 2,858) |
|---|---|---|---|
| Sex | Female | 5,467 (45.7) | 1,312 (45.9) |
| | Male | 6,506 (54.3) | 1,546 (54.1) |
| Immunizations administered as recommended schedule | Yes | 9,822 (82.0) | 1,258 (44.0) |
| | No | 422 (3.5) | 112 (3.9) |
| | Missing | 1,729 (14.4) | 1,488 (52.1) |
| Attends day care center | Yes | 3,995 (33.4) | 900 (31.5) |
| | No | 5,789 (48.4) | 384 (13.4) |
| | Missing | 2,189 (18.3) | 1,574 (55.1) |
| | Yes | 10,059 (84.0) | 2,858 (100.0) |
| | No | 1,914 (16.0) | 0 (0.0) |
| | Yes | 10,054 (84.0) | 2,858 (100.0) |
| | No | 1,919 (16.0) | 0 (0.0) |
| Rash | Yes | 703 (5.9) | 119 (4.2) |
| | No | 10,151 (84.8) | 2,660 (93.1) |
| | Missing | 1,119 (9.3) | 79 (2.8) |
| Bacteriuria | Yes | 786 (6.6) | 126 (4.4) |
| | No | 4,818 (40.2) | 755 (26.4) |
| | Missing | 6,369 (53.2) | 1,977 (69.2) |
| CSF examination performed | Yes | 208 (1.7) | 2 (0.1) |
| | No | 11,765 (98.3) | 2,856 (99.9) |
| Blood culture performed | Yes | 3,470 (29.0) | 952 (33.3) |
| | No | 8,503 (71.0) | 1,906 (66.7) |
| Urine culture performed | Yes | 2,756 (23.0) | 764 (26.7) |
| | No | 9,217 (77.0) | 2,094 (73.3) |
| Leukocyte esterase | Negative | 4,430 (37.0) | 652 (22.8) |
| | Trace | 249 (2.1) | 68 (2.4) |
| | 1+ | 373 (3.1) | 70 (2.4) |
| | 2+ | 264 (2.2) | 45 (1.6) |
| | 3+ | 288 (2.4) | 46 (1.6) |
| | Missing | 6,369 (53.2) | 1,977 (69.2) |
Data are presented as number (%).
CSF, cerebrospinal fluid.
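The table lists "Missing" alongside the observed values of several categorical variables. One common way to let a model use such records is to encode missingness as its own category, as in the minimal sketch below. Whether the study encoded its inputs this way is an assumption; the function name `one_hot_with_missing` and the sample records are hypothetical.

```python
def one_hot_with_missing(value, levels=("Yes", "No")):
    """One-hot encode a categorical value, adding an explicit 'Missing' level.

    Any value not found in `levels` (including None) is treated as missing.
    """
    key = value if value in levels else "Missing"
    return {level: int(level == key) for level in (*levels, "Missing")}


# Hypothetical records for a yes/no variable such as rash.
rash_records = ["Yes", None, "No"]
encoded = [one_hot_with_missing(v) for v in rash_records]

# The same helper works for ordered levels such as leukocyte esterase.
esterase_levels = ("Negative", "Trace", "1+", "2+", "3+")
encoded_esterase = one_hot_with_missing(None, levels=esterase_levels)
```

With this encoding no record has to be dropped or imputed before training, and the model can learn from the fact that a test was not performed or not recorded.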