Faiza Khurshid, Helen Coo, Amal Khalil, Jonathan Messiha, Joseph Y Ting, Jonathan Wong, Prakesh S Shah.
Abstract
Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing BPD, most existing BPD prediction models do not allow risk to be evaluated at different time points, and/or are not suitable for use in ethno-diverse populations. A comprehensive approach to developing clinical prediction models avoids assumptions as to which method will yield the optimal results by testing multiple algorithms/models. We compared the performance of machine learning and logistic regression models in predicting BPD/death. Our main cohort included infants <33 weeks' gestational age (GA) admitted to a Canadian Neonatal Network site from 2016 to 2018 (n = 9,006), with all analyses repeated for the <29 weeks' GA subcohort (n = 4,246). Models were developed to predict, on days 1, 7, and 14 of admission to neonatal intensive care, the composite outcome of BPD/death prior to discharge. Ten-fold cross-validation and a 20% hold-out sample were used to measure area under the curve (AUC). Calibration intercepts and slopes were estimated by regressing the outcome on the log-odds of the predicted probabilities. The model AUCs ranged from 0.811 to 0.886. Model discrimination was lower in the <29 weeks' GA subcohort (AUCs 0.699-0.790). Several machine learning models had a suboptimal calibration intercept and/or slope (k-nearest neighbor, random forest, artificial neural network, stacking neural network ensemble). The top-performing algorithms will be used to develop multinomial models and an online risk estimator for predicting BPD severity and death that does not require information on ethnicity.
Keywords: bronchopulmonary dysplasia; calibration; chronic lung disease; discrimination; machine learning; prediction
Year: 2021 PMID: 34950616 PMCID: PMC8688959 DOI: 10.3389/fped.2021.759776
Source DB: PubMed Journal: Front Pediatr ISSN: 2296-2360 Impact factor: 3.418
Variables entered in models to predict bronchopulmonary dysplasia or death prior to NICU discharge among very preterm infants on Days 1, 7, and 14 of admission to Canadian NICUs, 2016–2018.
| Variable | Category or range | Day 1 models | Day 7 models | Day 14 models |
|---|---|---|---|---|
| Inborn | Yes (born in hospital where NICU located) | Y | Y | Y |
| Sex | Boy | Y | Y | Y |
| Gestational age (weeks and days) | <33.0 weeks | Y | Y | Y |
| Small for gestational age (birthweight <10th percentile for gestational age and sex) | Yes | Y | Y | Y |
| SNAPPE-II score | 0–162 | Y | Y | Y |
| Hypertension | Pre-existing | Y | Y | Y |
| Complete course of antenatal steroids in week preceding delivery | Yes | Y | Y | Y |
| Preterm premature rupture of membranes | Yes (≥24 h between rupture of membranes and birth) | Y | Y | Y |
| Mode of delivery | Caesarean | Y | Y | Y |
| Delivery room resuscitation requiring intubation | Yes (intubation, chest compression, and/or epinephrine administered in delivery room) | Y | Y | Y |
| Surfactant administered on or before day of prediction (e.g., for Day 1 models, “Yes” if surfactant administered on day admitted to NICU; for Day 7 models, “Yes” if administered on or before day 7 of NICU stay) | Yes | Y | Y | Y |
| Nitric oxide administered on first day of NICU stay | Yes | Y | N | N |
| Number of days on nitric oxide up to and including day of prediction | 0–7 (Day 7 models) | N | Y | Y |
| Inotropes administered on first day of NICU stay | Yes | Y | N | N |
| Number of days on inotropes up to and including day of prediction | 0–7 (Day 7 models) | N | Y | Y |
| HFV or IPPV on first day of NICU stay | Yes | Y | N | N |
| Number of days of HFV or IPPV up to and including day of prediction | 0–7 (Day 7 models) | N | Y | Y |
| NIV or CPAP on first day of NICU stay | Yes | Y | N | N |
| Number of days of NIV or CPAP up to and including day of prediction | 0–7 (Day 7 models) | N | Y | Y |
| Culture-confirmed sepsis on or before Day 7 of NICU stay | Yes | N | Y | N |
| Culture-confirmed sepsis on or before Day 14 of NICU stay | Yes | N | N | Y |
CPAP, Continuous positive airway pressure.
HFV, High-frequency ventilation.
IPPV, Intermittent positive pressure ventilation.
NICU, Neonatal intensive care unit.
NIV, Non-invasive ventilation.
Prior to one-hot encoding, categorizing continuous variables where violations of assumption of linearity with respect to logit of outcome detected, and imputing missing values.
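The three preparation steps named in the footnote above (categorizing continuous variables, imputing missing values, one-hot encoding) can be illustrated with a minimal standard-library sketch. The source does not state which imputation method, cut points, or reference categories were used, so median imputation, the SNAPPE-II cut points, and the "None" hypertension level below are assumptions made purely for illustration.

```python
from statistics import median


def categorize(value, cut_points):
    """Return the bin index of a continuous value given ascending cut points."""
    return sum(value >= c for c in cut_points)


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(value, categories):
    """Encode one categorical value as a list of 0/1 indicators."""
    return [1 if value == c else 0 for c in categories]


# Hypothetical levels: "None" is an assumed reference category.
HYPERTENSION_LEVELS = ["None", "Pre-existing", "Gestational", "Yes, timing unknown"]

# Toy SNAPPE-II scores with one missing value (not study data).
snappe = impute_median([16, None, 6, 40])
# Assumed cut points, chosen only to show the binning step.
snappe_bins = [categorize(s, [10, 20, 30]) for s in snappe]
hypertension = one_hot("Gestational", HYPERTENSION_LEVELS)
```

In practice these steps would likely be fitted on the training data only and then applied unchanged to the hold-out sample, so that no information from the test set leaks into the model.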
Figure 1. Selection of study cohort.
Characteristics of infants born at <33 weeks of gestation who were admitted to a Canadian tertiary-care NICU from 2016 to 2018 and whose data were included in models to predict bronchopulmonary dysplasia (BPD) or death prior to NICU discharge.
| Characteristic | Missing, n (%) | BPD or death | Survival without BPD |
|---|---|---|---|
| Inborn | <5 (<0.05) | 2,757 (86) | 5,180 (89) |
| Male | 8 (0.09) | 1,785 (56) | 3,143 (54) |
| Gestational age, completed weeks | 0 | 26.5 (2.4) | 29.5 (2.1) |
| Small for gestational age | 9 (0.10) | 441 (14) | 487 (8.4) |
| SNAPPE-II score | 115 (1.3) | 16.6 (14.1) | 6.1 (8.3) |
| Hypertension | 217 (2.4) | | |
| Pre-existing | | 116 (3.7) | 198 (3.5) |
| Gestational | | 430 (14) | 935 (16) |
| Yes, timing unknown | | 10 (0.32) | 16 (0.28) |
| Complete course of antenatal steroids in week preceding delivery | 97 (1.1) | 1,233 (39) | 2,219 (39) |
| Preterm premature rupture of membranes | 399 (4.4) | 768 (25) | 1,271 (23) |
| Caesarean delivery | 18 (0.20) | 2,007 (63) | 3,639 (62) |
| Delivery room resuscitation requiring intubation | 79 (0.88) | 1,556 (49) | 853 (15) |
| Surfactant | 0 | | |
| First day of NICU stay | | 1,961 (62) | 1,485 (26) |
| On or before day 7 of NICU stay | | 2,384 (75) | 1,940 (33) |
| On or before day 14 of NICU stay | | 2,393 (75) | 1,943 (33) |
| Nitric oxide | 0 | | |
| First day of NICU stay | | 192 (6.0) | 43 (0.74) |
| Frequency up to and including day 7 of NICU stay, days | | 0.33 (1.1) | 0.07 (0.50) |
| Frequency up to and including day 14 of NICU stay, days | | 0.51 (1.7) | 0.07 (0.53) |
| Inotropes | 0 | | |
| First day of NICU stay | | 325 (10) | 108 (1.9) |
| Frequency up to and including day 7 of NICU stay, days | | 0.6 (1.4) | 0.18 (0.73) |
| Frequency up to and including day 14 of NICU stay, days | | 0.9 (2.1) | 0.22 (0.94) |
| High-frequency ventilation or intermittent positive pressure ventilation | 0 | | |
| First day of NICU stay | | 2,060 (65) | 1,441 (25) |
| Frequency up to and including day 7 of NICU stay, days | | 4.4 (2.8) | 2.0 (2.4) |
| Frequency up to and including day 14 of NICU stay, days | | 8.0 (5.6) | 2.9 (4.1) |
| Non-invasive ventilation or continuous positive airway pressure | 0 | | |
| First day of NICU stay | | 984 (31) | 3,496 (60) |
| Frequency up to and including day 7 of NICU stay, days | | 2.5 (2.8) | 4.6 (2.5) |
| Frequency up to and including day 14 of NICU stay, days | | 5.8 (5.5) | 9.3 (4.6) |
| Culture-confirmed sepsis | 0 | | |
| On or before day 7 of NICU stay | | 228 (7.2) | 132 (2.3) |
| On or before day 14 of NICU stay | | 483 (15) | 255 (4.4) |
The data in this table describe the cohort used to develop the Day 1 prediction models.
IQR, Interquartile range.
NICU, Neonatal intensive care unit.
SD, Standard deviation.
Calculated using denominator of n = 9,006.
Denominator excludes missing values.
IQR could not be calculated.
Area under the curve (AUC) for models predicting bronchopulmonary dysplasia or death prior to NICU discharge at three time points (Days 1, 7, and 14 of NICU stay) among infants born at <33 weeks and <29 weeks of gestation who were admitted to Canadian tertiary-care NICUs, 2016–2018.
| Model | <33 weeks: Day 1 | | <33 weeks: Day 7 | | <33 weeks: Day 14 | | <29 weeks: Day 1 | | <29 weeks: Day 7 | | <29 weeks: Day 14 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Standard Logistic Regression (LR) | 0.861 | 0.860 | 0.884 | 0.884 | 0.877 | 0.878 | 0.779 | 0.782 | 0.776 | 0.783 | 0.776 | 0.790 |
| Penalized LR | 0.861 | 0.861 | 0.884 | 0.884 | 0.878 | 0.879 | 0.781 | 0.780 | 0.777 | 0.782 | 0.775 | 0.790 |
| Support Vector Machine | 0.830 | 0.837 | 0.859 | 0.861 | 0.853 | 0.858 | 0.758 | 0.750 | 0.737 | 0.756 | 0.768 | 0.772 |
| K-Nearest Neighbor | 0.814 | 0.811 | 0.812 | 0.822 | 0.815 | 0.817 | 0.724 | 0.719 | 0.707 | 0.706 | 0.708 | 0.716 |
| Artificial Neural Network | 0.862 | 0.859 | 0.871 | 0.881 | 0.877 | 0.872 | 0.772 | 0.780 | 0.758 | 0.774 | 0.752 | 0.769 |
| Random Forest | 0.817 | 0.819 | 0.851 | 0.854 | 0.863 | 0.857 | 0.721 | 0.699 | 0.737 | 0.725 | 0.753 | 0.760 |
| Soft Voting Ensemble | 0.861 | 0.860 | 0.880 | 0.882 | 0.881 | 0.879 | 0.777 | 0.774 | 0.772 | 0.775 | 0.775 | 0.787 |
| Stacking Neural Network Ensemble | 0.862 | 0.862 | 0.886 | 0.885 | 0.884 | 0.878 | 0.758 | 0.775 | 0.770 | 0.772 | 0.766 | 0.772 |
NICU, Neonatal intensive care unit.
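The AUC values in the table above come from ten-fold cross-validation and a 20% hold-out sample. As a rough illustration of both ideas, the sketch below computes AUC from its rank interpretation (the probability that a randomly chosen infant with the outcome receives a higher predicted risk than one without it) and splits row indices into a hold-out set and ten folds. The function names, the random seed, and the unstratified split are assumptions for illustration and are not taken from the study.

```python
import random


def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def holdout_split(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_folds(indices, k=10):
    """Partition indices into k roughly equal folds for cross-validation."""
    return [indices[i::k] for i in range(k)]


# Toy outcome labels and predicted risks (not study data).
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Split sizes for a cohort of 9,006 infants, as in the main cohort.
train_idx, test_idx = holdout_split(9006)
folds = k_folds(train_idx, k=10)
```

The pairwise loop is quadratic in cohort size, which is fine for a demonstration. A sorting-based or library implementation would be the more usual choice for thousands of infants.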
Calibration of models predicting bronchopulmonary dysplasia or death prior to NICU discharge at three time points (Days 1, 7, and 14 of NICU stay) among infants born at <33 weeks of gestation who were admitted to Canadian tertiary-care NICUs in 2016–2018, as assessed in test dataset.
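| Model | Day 1 observed | Day 1 predicted | Day 1 intercept | Day 1 slope | Day 7 observed | Day 7 predicted | Day 7 intercept | Day 7 slope | Day 14 observed | Day 14 predicted | Day 14 intercept | Day 14 slope |
|---|---|---|---|---|---|---|---|---|---|---|---|---|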
| Standard Logistic Regression (LR) | 0.354 | 0.343 | 0.08 | 1.05 | 0.332 | 0.331 | 0.006 | 1.08 | 0.324 | 0.315 | 0.07 | 1.04 |
| Penalized LR | 0.354 | 0.343 | 0.08 | 1.07 | 0.332 | 0.331 | 0.007 | 1.10 | 0.324 | 0.315 | 0.07 | 1.06 |
| Support Vector Machine | 0.354 | 0.347 | 0.04 | 1.07 | 0.332 | 0.331 | 0.008 | 1.10 | 0.324 | 0.316 | 0.05 | 1.04 |
| K-Nearest Neighbor | 0.354 | 0.337 | 7.61 | 0.36 | 0.332 | 0.329 | — | — | 0.324 | 0.309 | — | — |
| Artificial Neural Network | 0.354 | 0.346 | 0.05 | 1.16 | 0.332 | 0.338 | −0.04 | 1.25 | 0.324 | 0.319 | 0.04 | 1.08 |
| Random Forest | 0.354 | 0.348 | 0.05 | 0.35 | 0.332 | 0.333 | −0.006 | 0.46 | 0.324 | 0.319 | 0.04 | 0.60 |
| Soft Voting Ensemble | 0.354 | 0.346 | 0.05 | 1.10 | 0.332 | 0.327 | 0.04 | 1.15 | 0.324 | 0.319 | 0.04 | 1.11 |
| Stacking Neural Network Ensemble | 0.354 | 0.365 | −0.07 | 1.10 | 0.332 | 0.333 | −0.003 | 1.16 | 0.324 | 0.327 | −0.02 | 1.17 |
NICU, Neonatal intensive care unit.
Model did not converge.
Calibration slope could not be estimated (see preceding footnote).
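The abstract describes the calibration intercept and slope as coming from a regression of the outcome on the log-odds of the predicted probabilities. A minimal sketch of that idea is shown below: a two-parameter logistic regression fitted by Newton-Raphson, where an intercept near 0 and a slope near 1 indicate good calibration. Clipping probabilities away from 0 and 1, the starting values, and the stopping rule are assumptions for illustration. The study's actual software and settings are not given in the source.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of p, clipped so that probabilities of exactly 0 or 1 stay finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    return math.exp(z) / (1 + math.exp(z))


def calibration_intercept_slope(y_true, p_pred, max_iter=50, tol=1e-10):
    """Fit outcome ~ a + b * logit(p_pred) by Newton-Raphson and return (a, b)."""
    x = [logit(p) for p in p_pred]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y_true):
            mu = sigmoid(a + b * xi)
            w = mu * (1 - mu)
            g_a += yi - mu            # score for the intercept
            g_b += (yi - mu) * xi     # score for the slope
            h_aa += w                 # information matrix entries
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0:
            raise ValueError("information matrix is singular")
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            return a, b
    raise RuntimeError("calibration model did not converge")


# Toy data (not study data): event rates of 20%, 50%, and 80% in three groups of 10.
y = ([1] * 2 + [0] * 8) + ([1] * 5 + [0] * 5) + ([1] * 8 + [0] * 2)
well_calibrated = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
overconfident = [1 / 17] * 10 + [0.5] * 10 + [16 / 17] * 10  # log-odds twice too extreme

a_good, b_good = calibration_intercept_slope(y, well_calibrated)
a_over, b_over = calibration_intercept_slope(y, overconfident)
```

In the toy data the overconfident predictions have log-odds exactly twice as extreme as the observed event rates, so the fitted slope comes out at about 0.5. A slope well below 1, as reported for some models in the table, has the same interpretation.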
Calibration of models predicting bronchopulmonary dysplasia or death prior to NICU discharge at three time points (Days 1, 7, and 14 of NICU stay) among infants born at <29 weeks of gestation who were admitted to Canadian tertiary-care NICUs in 2016–2018, as assessed in test dataset.
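| Model | Day 1 observed | Day 1 predicted | Day 1 intercept | Day 1 slope | Day 7 observed | Day 7 predicted | Day 7 intercept | Day 7 slope | Day 14 observed | Day 14 predicted | Day 14 intercept | Day 14 slope |
|---|---|---|---|---|---|---|---|---|---|---|---|---|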
| Standard Logistic Regression (LR) | 0.591 | 0.588 | 0.01 | 1.08 | 0.566 | 0.564 | 0.01 | 1.05 | 0.555 | 0.546 | 0.05 | 1.00 |
| Penalized LR | 0.591 | 0.588 | 0.01 | 1.11 | 0.566 | 0.564 | 0.01 | 1.07 | 0.555 | 0.547 | 0.05 | 1.05 |
| Support Vector Machine | 0.591 | 0.594 | −0.02 | 1.02 | 0.566 | 0.563 | 0.01 | 0.99 | 0.555 | 0.550 | 0.02 | 1.00 |
| K-Nearest Neighbor | 0.591 | 0.582 | 0.05 | 0.07 | 0.566 | 0.567 | −0.003 | 0.07 | 0.555 | 0.552 | 0.02 | 0.09 |
| Artificial Neural Network | 0.591 | 0.678 | −0.46 | 0.98 | 0.566 | 0.621 | −0.28 | 1.00 | 0.555 | 0.569 | −0.07 | 1.20 |
| Random Forest | 0.591 | 0.584 | 0.04 | 0.39 | 0.566 | 0.558 | 0.05 | 0.53 | 0.555 | 0.544 | 0.06 | 0.69 |
| Soft Voting Ensemble | 0.591 | 0.599 | −0.04 | 1.06 | 0.566 | 0.564 | 0.01 | 1.06 | 0.555 | 0.554 | 0.003 | 1.09 |
| Stacking Neural Network Ensemble | 0.591 | 0.569 | 0.10 | 1.49 | 0.566 | 0.630 | −0.29 | 1.40 | 0.555 | 0.594 | −0.19 | 1.21 |
NICU, Neonatal intensive care unit.
Figure 2. Feature importance plots illustrating relative importance of each variable in predicting the outcome of bronchopulmonary dysplasia or death prior to NICU discharge among infants born at <33 and <29 weeks of gestation in Canada, 2016–2018. HFV, High-frequency ventilation; IPPV, Intermittent positive pressure ventilation; NIV, Non-invasive ventilation; CPAP, Continuous positive airway pressure.