Alicia E Genisca, Kelsey Butler, Monique Gainey, Tzu-Chun Chu, Lawrence Huang, Eta N Mbong, Stephen B Kennedy, Razia Laghari, Fiston Nganga, Rigobert F Muhayangabo, Himanshu Vaishnav, Shiromi M Perera, Moyinoluwa Adeniji, Adam C Levine, Ian C Michelow, Andrés Colubri.
Abstract
BACKGROUND: Ebola Virus Disease (EVD) causes high case fatality rates (CFRs) in young children, yet there are limited data focusing on predicting mortality in pediatric patients. Here we present machine learning-derived prognostic models to predict clinical outcomes in children infected with Ebola virus.
Year: 2022 PMID: 36223331 PMCID: PMC9555640 DOI: 10.1371/journal.pntd.0010789
Source DB: PubMed Journal: PLoS Negl Trop Dis ISSN: 1935-2727
Demographic and clinical characteristics of patients in the West Africa derivation cohort.
| Characteristic | Survived, n = 345 | Died, n = 234 | OR (95% CI) | p-value |
|---|---|---|---|---|
| Age (years), median (IQR) | 11 (7, 14) | 6 (3, 13) | 0.55 (0.46–0.65) | |
| Male sex, n (%) | 159 (46) | 112 (48) | 1.07 (0.77–1.5) | 0.67 |
| Asthenia | 297 (88) | 171 (74) | 0.39 (0.25–0.6) | |
| Headache | 203 (60) | 93 (42) | 0.48 (0.34–0.68) | |
| Abdominal pain | 165 (50) | 78 (36) | 0.57 (0.4–0.81) | |
| Bleeding | 43 (14) | 45 (22) | 1.68 (1.06–2.67) | |
| Joint pain | 113 (38) | 48 (29) | 0.68 (0.45–1.01) | 0.060 |
| Bone or muscle pain | 121 (37) | 61 (29) | 0.71 (0.48–1.02) | 0.067 |
| Respiratory distress | 26 (7.9) | 27 (13) | 1.69 (0.95–2.99) | 0.071 |
| Vomiting | 206 (61) | 121 (54) | 0.76 (0.54–1.07) | 0.12 |
| Nausea | 172 (60) | 104 (54) | 0.78 (0.54–1.13) | 0.19 |
| Conjunctivitis | 47 (18) | 20 (15) | 0.8 (0.45–1.4) | 0.45 |
| Diarrhea | 187 (56) | 125 (59) | 1.13 (0.8–1.61) | 0.48 |
| Hiccups | 24 (7.3) | 12 (5.7) | 0.77 (0.37–1.55) | 0.48 |
| Fever | 298 (88) | 199 (87) | 0.88 (0.54–1.46) | 0.63 |
| Anorexia | 225 (74) | 128 (75) | 1.10 (0.72–1.70) | 0.67 |
| Swallowing problems | 60 (18) | 40 (19) | 1.01 (0.64–1.57) | 0.97 |
| Ct value, median (IQR) | 26.8 (23.6, 30.8) | 21.7 (18.9, 26.5) | 0.35 (0.23–0.51) | |
a OR is for each 5-year increase in age
b Bold values denote statistical significance
c n (%)
Abbreviations: IQR: interquartile range; Ct: cycle threshold; OR: odds ratio; CI: confidence intervals
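As an illustration of how an unadjusted odds ratio and its confidence interval can be obtained for a binary characteristic, the following minimal sketch applies the standard Wald (log-odds) interval to the male sex row of the table above. The source does not state which estimation method was used, so the Wald approach is an assumption, as is the absence of missing values for sex (the column totals are used as denominators). The function name `odds_ratio_wald` is hypothetical.

```python
import math

def odds_ratio_wald(died_with, died_without, survived_with, survived_without,
                    z=1.959964):
    """Unadjusted odds ratio of death for a binary characteristic,
    with a Wald confidence interval computed on the log-odds scale."""
    odds_ratio = (died_with / died_without) / (survived_with / survived_without)
    # Standard error of log(OR) from the four cell counts of the 2x2 table.
    se = math.sqrt(1 / died_with + 1 / died_without
                   + 1 / survived_with + 1 / survived_without)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Male sex: 112 of 234 children who died, 159 of 345 who survived.
# Assumption: no missing values, so column totals serve as denominators.
or_, lo, hi = odds_ratio_wald(112, 234 - 112, 159, 345 - 159)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # OR = 1.07 (95% CI 0.77-1.50)
```

Under these assumptions the result agrees with the tabulated value of 1.07 (0.77–1.5). Rows whose percentages imply missing data cannot be reproduced this way without the per-variable denominators.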
Fig 1. Map of children with Ebola Virus Disease (EVD).
The map shows the geographical distribution of children with EVD included in triage data from the Ebola Data Platform, collected during the West African EVD outbreak from 2014 to 2016. Bubble size corresponds to the number of cases reported, and color corresponds to the observed case fatality rate. Plotted with the R package tmap [52], using base layer maps in the public domain from the Natural Earth project (https://www.naturalearthdata.com/about/terms-of-use/).
Comparison of baseline characteristics in West Africa derivation and DRC validation cohorts.
| Characteristic | Derivation Cohort | Validation Cohort |
|---|---|---|
| Case-fatality rate (n, %) | 234 (40.4) | 22 (29.7) |
| Age | 10 (5–14) | 5 (1.5–14) |
| Ct value | 25.1 (20.9–29.5) | 19.3 (17.6–26.1) |
| Bleeding | 88 (15.1) | 17 (22.9) |
| Diarrhea | 57 (9.8) | 40 (54.1) |
| Respiratory distress | 9.7 (1.7) | 16 (21.6) |
| Dysphagia | 19 (3.3) | 16 (21.6) |
a Covariates presented are those included in the EPiC model.
Abbreviations: IQR: interquartile range; Ct: cycle threshold
We sought to improve model performance by recalibrating the intercept and slope of the calibration plot and by adding a biomarker that was available only in the DRC data. An analysis of peak laboratory test results measured within the first 48 hours after admission identified three variables, each significantly (p < 0.01) correlated with mortality: ALT (r = 0.57), AST (r = 0.56), and CK (r = 0.51). We omitted ALT because it is highly collinear with AST (Pearson correlation = 0.83). Despite the limited availability of test results in the validation data (AST: n = 29; CK: n = 33), we used these new variables to build additional models. Models that incorporated an additional predictor outperformed the original EPiC model on the validation data: adding CK as a predictor produced an AUC of 0.87 (95% CI: 0.74–1), while adding AST gave an AUC of 0.90 (95% CI: 0.77–1). We also considered a third model with both AST and CK added as predictors, since the association between these two biomarkers was moderate (Pearson correlation = 0.52), suggesting that each carries some information independent of the other that could be combined to improve the predictions. Indeed, the model with AST and CK yielded a higher AUC of 0.95 (95% CI: 0.86–1). The confusion matrix for this model shows near-perfect discrimination, with only 1 misclassification in each outcome category (S5A and S5B Table). However, the sample size for this model was reduced further to n = 23, since it requires patients to have data for both biomarkers. The ROC curves and calibration plots for these three models are shown in Fig 3.
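Recalibrating the intercept and slope is commonly done by regressing the observed outcome on the logit of the original predicted risk. The source does not describe the exact procedure, so the following is only a minimal sketch of that standard approach, fitted by Newton-Raphson in plain Python. The function name `recalibrate` and the toy predictions and outcomes are hypothetical and are not study data.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def recalibrate(pred_probs, outcomes, iters=25):
    """Fit outcome ~ a + b * logit(p) by Newton-Raphson.
    Returns the calibration intercept a and calibration slope b."""
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start from "already perfectly calibrated"
    for _ in range(iters):
        mu = [sigmoid(a + b * xi) for xi in x]
        # Score (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(outcomes, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(outcomes, mu, x))
        # Information matrix entries (X^T W X for the 2-parameter model).
        w = [m * (1 - m) for m in mu]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical original predictions (risk of death) and observed outcomes.
pred_probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 0.2, 0.8]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1, 0, 1]

a, b = recalibrate(pred_probs, outcomes)
recalibrated = [sigmoid(a + b * logit(p)) for p in pred_probs]
print(f"intercept = {a:.3f}, slope = {b:.3f}")
```

A slope near 1 and an intercept near 0 indicate that the original predictions were already well calibrated. With samples as small as those available for the biomarker models, the fitted values would be expected to be unstable, and the fit does not converge if the outcomes are perfectly separated by the predictions.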
Fig 2. Performance characteristics of the prediction model.
Discrimination (A) and calibration (B) plots of the Ebola Virus Disease Prognosis in Children (EPiC) model are shown for the Democratic Republic of the Congo validation dataset. In the discrimination plot, the receiver operating characteristic (ROC) curve is plotted (central black line) together with the 95% confidence interval band (blue shaded area). In the calibration plot, the dots represent the mean estimate of the observed probability for each 10% bin of predicted probability (with probability being risk of death), the vertical lines passing through each dot are the corresponding confidence intervals for the observed probability, the dashed line is the best linear fit passing through the mean values, and the red line is the LOESS curve fitting all the individual observed/predicted pairs in the data.
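The dots in a calibration plot of this kind can be obtained by grouping patients into 10% bins of predicted probability and computing the observed death rate in each bin. The sketch below shows one way to do this under simple assumptions (equal-width bins, empty bins skipped). The function name `calibration_bins` and the toy inputs are hypothetical, and the confidence intervals and LOESS curve described in the caption are not reproduced here.

```python
def calibration_bins(pred_probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins.
    Returns one (bin_lower_edge, mean_predicted, observed_rate, n) tuple
    per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(pred_probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # nothing to plot for an empty bin
        n = len(members)
        mean_pred = sum(p for p, _ in members) / n
        observed = sum(y for _, y in members) / n
        rows.append((i / n_bins, mean_pred, observed, n))
    return rows

# Hypothetical predicted risks of death and observed outcomes (1 = died).
toy_probs = [0.05, 0.12, 0.18, 0.55, 0.52, 0.91, 0.95, 0.97]
toy_outcomes = [0, 0, 1, 1, 0, 1, 1, 1]

rows = calibration_bins(toy_probs, toy_outcomes)
for lower, mean_pred, observed, n in rows:
    print(f"bin {lower:.1f}: predicted {mean_pred:.2f}, observed {observed:.2f}, n = {n}")
```

Points lying close to the diagonal (observed rate equal to mean predicted risk) indicate good calibration. Bins with very few patients give noisy observed rates, which is why the plotted intervals matter.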
Fig 3. Discrimination and calibration curves.
Area under the receiver operating characteristic curves (AUC) (A, C, E) and calibration curves (B, D, F) of the Ebola Virus Disease Prognosis in Children (EPiC) model are shown with aspartate aminotransferase (AST) (A, B), creatine kinase (CK) (C, D), or both (E, F) as additional predictors for the Democratic Republic of the Congo validation dataset. The interpretation of the plots is the same as in Fig 2.
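For readers less familiar with the AUC, it can be read as the probability that a randomly chosen child who died received a higher predicted risk than a randomly chosen child who survived. The minimal sketch below computes it directly from that definition, with ties counted as one half. The function name `auc` and the toy scores are hypothetical, and the confidence intervals reported in the text are not reproduced.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison:
    the fraction of (death, survivor) pairs in which the death
    has the higher predicted risk; ties count as 1/2."""
    deaths = [s for s, y in zip(scores, labels) if y == 1]
    survivors = [s for s, y in zip(scores, labels) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = sum((d > s) + 0.5 * (d == s) for d in deaths for s in survivors)
    return wins / (len(deaths) * len(survivors))

# Hypothetical predicted risks and outcomes (1 = died, 0 = survived).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

example_auc = auc(toy_scores, toy_labels)
print(f"AUC = {example_auc:.3f}")  # 8 of 9 pairs correctly ordered -> 0.889
```

The pairwise form is quadratic in the number of patients, which is not a concern for validation sets of a few dozen patients such as those described here.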