Minjie Duan, Tingting Shu, Binyi Zhao, Tianyu Xiang, Jinkui Wang, Haodong Huang, Yang Zhang, Peilin Xiao, Bei Zhou, Zulong Xie, Xiaozhu Liu.
Abstract
Background: Short-term readmission for pediatric pulmonary hypertension (PH) is associated with a substantial social and personal burden. However, tools to predict individualized readmission risk are lacking. This study aimed to develop machine learning models to predict 30-day unplanned readmission in children with PH.
Keywords: machine learning; pediatric pulmonary hypertension; prediction; readmission; risk factors
Year: 2022; PMID: 35958416; PMCID: PMC9360407; DOI: 10.3389/fcvm.2022.919224
Source DB: PubMed; Journal: Front Cardiovasc Med; ISSN: 2297-055X
Figure 1. Flowchart of patient selection.
Baseline characteristics of pediatric patients with pulmonary hypertension (PH).
| Characteristic | Readmission | Non-readmission | P-value |
|---|---|---|---|
| Age (years), median [Q1, Q3] | 0.27 [0.09, 0.62] | 0.10 [0.00, 0.60] | < 0.001 |
| Male, n (%) | 185 (57.8) | 3,070 (54.9) | 0.307 |
| IPAH, n (%) | 1 (0.3) | 56 (1.0) | 0.220 |
| Connective tissue disease, n (%) | 0 (0) | 5 (0.1) | 0.593 |
| Dilated cardiomyopathy, n (%) | 0 (0) | 9 (0.2) | 0.473 |
| CHD, n (%) | 316 (98.8) | 5,464 (97.7) | 0.215 |
| BPD, n (%) | 12 (3.8) | 125 (2.2) | 0.080 |
| Interstitial lung disease, n (%) | 3 (0.9) | 31 (0.6) | 0.378 |
| Obstructive sleep apnea, n (%) | 0 (0) | 5 (0.1) | 0.593 |
| Asthma, n (%) | 5 (1.6) | 30 (0.5) | 0.020 |
| Hypothyroidism, n (%) | 1 (0.3) | 24 (0.4) | 0.755 |
| Persistent PH in newborn, n (%) | 2 (0.6) | 50 (0.9) | 0.616 |
| Congenital diaphragmatic hernia, n (%) | 1 (0.3) | 27 (0.5) | 0.666 |
| Chromosomal abnormalities, n (%) | 13 (4.1) | 336 (6.0) | 0.151 |
| Preterm birth, n (%) | 24 (7.5) | 912 (16.3) | < 0.001 |
| Low-birth-weight infants, n (%) | 14 (4.4) | 459 (8.2) | 0.014 |
| Very-low-birth-weight infants, n (%) | 3 (0.9) | 96 (1.7) | 0.291 |
| Sepsis, n (%) | 26 (8.1) | 1,035 (18.5) | < 0.001 |
| Intracranial hemorrhage, n (%) | 27 (8.4) | 1,341 (24.0) | < 0.001 |
| Arrhythmia, n (%) | 1 (0.3) | 67 (1.2) | 0.149 |
| Multi-organ dysfunction syndrome, n (%) | 0 (0) | 7 (0.1) | 0.527 |
| Respiratory failure, n (%) | 99 (30.9) | 2,042 (36.5) | 0.044 |
| Heart failure, n (%) | 13 (4.1) | 178 (3.2) | 0.387 |
| Severe pneumonia, n (%) | 81 (25.3) | 845 (15.1) | < 0.001 |
| Prostacyclin, n (%) | 3 (0.9) | 90 (1.6) | 0.348 |
| PDE-5i, n (%) | 21 (6.6) | 716 (12.8) | 0.001 |
| Endothelin receptor antagonists, n (%) | 4 (1.3) | 31 (0.6) | 0.115 |
| Combination therapy, n (%) | 0 (0) | 3 (0.1) | 0.679 |
| Congenital heart surgery, n (%) | 8 (2.5) | 689 (12.3) | < 0.001 |
| Mechanical ventilation, n (%) | 19 (5.9) | 1,509 (27.0) | < 0.001 |
| Nonmedical order discharge, n (%) | 71 (22.2) | 1,805 (32.3) | < 0.001 |
| LOS (days), median [Q1, Q3] | 10 [7, 13] | 12 [7, 21] | < 0.001 |
n, number; Q1, the first quartile; Q3, the third quartile; IPAH, idiopathic pulmonary arterial hypertension; CHD, congenital heart disease; BPD, bronchopulmonary dysplasia; PDE-5i, phosphodiesterase 5 inhibitors; LOS, length of stay.
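
The P-values for the categorical rows can be approximated with a Pearson chi-square test on a 2x2 table. The sketch below is only an illustration: the page does not say which test was used, and the group sizes (320 and 5,593) are assumptions back-calculated from the reported counts and percentages, not values stated in the table. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction).

    The 2x2 table is [[a, b], [c, d]]: rows are the two patient groups,
    columns are "has the characteristic" / "does not".
    Returns (chi-square statistic, two-sided P-value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# ASSUMPTION: group sizes inferred from the percentages, not given on the page.
N_SMALL_GROUP = 320
N_LARGE_GROUP = 5593

# "Male" row: 185 in the smaller group, 3,070 in the larger group.
males_small, males_large = 185, 3070
chi2, p = chi_square_2x2(
    males_small, N_SMALL_GROUP - males_small,
    males_large, N_LARGE_GROUP - males_large,
)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")
```

Under these assumptions the Male row gives a P-value close to the tabulated 0.307. Rows with very small counts (for example, 0 or 1 patients in a cell) would normally call for an exact test instead.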
Performance of different models.
| No. | Model | | AUC |
|---|---|---|---|
| 1 | eXtreme Gradient Boosting | 0.9063 | 0.7474 |
| 2 | Random Forest Classifier | 0.9050 | 0.7284 |
| 3 | Light Gradient Boosting Machine | 0.9021 | 0.7571 |
| 4 | CatBoost Classifier | 0.9000 | 0.7521 |
| 5 | Extra Trees Classifier | 0.8973 | 0.6879 |
| 6 | Decision Tree Classifier | 0.8761 | 0.5802 |
| 7 | Gradient Boosting Classifier | 0.8524 | 0.7640 |
| 8 | K Neighbors Classifier | 0.8154 | 0.6455 |
| 9 | Ada Boost Classifier | 0.8084 | 0.7584 |
| 10 | SVM-Linear Kernel | 0.7047 | 0.0000 |
| 11 | Logistic Regression | 0.5963 | 0.7085 |
| 12 | Ridge Classifier | 0.5825 | 0.0000 |
| 13 | Linear Discriminant Analysis | 0.5820 | 0.7059 |
| 14 | Naive Bayes | 0.1819 | 0.5829 |
| 15 | Quadratic Discriminant Analysis | 0.0916 | 0.5072 |
AUC, area under the curve.
Figure 2. Twelve features with non-zero regression coefficients. LOS, length of stay; PDE-5i, phosphodiesterase 5 inhibitors; CHD, congenital heart disease.
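
A caption that counts features with non-zero coefficients suggests an L1-penalized (LASSO-type) selection step, although this page does not name the method. As a minimal sketch under that assumption, the soft-thresholding operator below shows how an L1 penalty shrinks coefficients and sets small ones exactly to zero. For an orthonormal design and the objective (1/2)·||y − Xb||² + λ·||b||₁, applying this operator to the least-squares coefficients gives the LASSO solution. The feature names, coefficient values, and penalty are invented for illustration.

```python
def soft_threshold(coef, penalty):
    """Shrink `coef` toward zero by `penalty`; return 0.0 if it is within the penalty."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    if coef > penalty:
        return coef - penalty
    if coef < -penalty:
        return coef + penalty
    return 0.0


# HYPOTHETICAL unpenalized coefficients; none of these numbers come from the study.
ols_coefs = {
    "los_days": 0.42,
    "age_years": -0.31,
    "asthma": 0.08,
    "hypothyroidism": -0.03,
}
PENALTY = 0.10  # hypothetical L1 penalty (lambda)

lasso_coefs = {name: soft_threshold(b, PENALTY) for name, b in ols_coefs.items()}
selected = [name for name, b in lasso_coefs.items() if b != 0.0]
print("kept features:", selected)
```

Raising the penalty removes more features; in practice it is usually chosen by cross-validation.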
Figure 3. Receiver operating characteristic (ROC) curves for five machine learning-based prediction models. LightGBM, Light Gradient Boosting Machine; XGBoost, eXtreme Gradient Boosting; LR, logistic regression.
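
The area under a ROC curve has a direct probabilistic reading: it is the chance that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient. The sketch below computes AUC that way on toy data. The function name `roc_auc`, the labels, and the scores are all hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small illustrations.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = readmitted within 30 days, 0 = not readmitted (made-up values).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.4]
print(f"AUC = {roc_auc(toy_labels, toy_scores):.4f}")
```

A model that ranks every positive above every negative scores 1.0, and a model that assigns the same score to everyone scores 0.5.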
Performance evaluation of the 5 prediction models.
| Model | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|
| CatBoost | 0.8114 | 0.7401 | 0.7813 | 0.7378 |
| XGBoost | 0.8067 | 0.7458 | 0.7188 | 0.7473 |
| LightGBM | 0.7992 | 0.6855 | 0.7396 | 0.6824 |
| Random Forest | 0.7817 | 0.7012 | 0.7396 | 0.6990 |
| Logistic Regression | 0.7248 | 0.5186 | 0.8125 | 0.5018 |
AUC, area under the curve; CI, confidence interval; LightGBM, light gradient boosting machine; XGBoost, eXtreme gradient boosting.
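
In the CatBoost row, the last three values (0.7401, 0.7813, 0.7378) behave like accuracy, sensitivity, and specificity: the first is a prevalence-weighted mean of the other two. The sketch below makes that relationship explicit. The confusion-matrix counts are hypothetical, chosen only because they are consistent with that row; the page does not report them, and the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    positives = tp + fn  # patients who were actually readmitted
    negatives = tn + fp  # patients who were not readmitted
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,
        "specificity": tn / negatives,
    }


# HYPOTHETICAL counts (not reported on the page), consistent with the CatBoost row.
m = classification_metrics(tp=75, fn=21, tn=1238, fp=440)

# Accuracy is the prevalence-weighted average of sensitivity and specificity.
prevalence = (75 + 21) / (75 + 21 + 1238 + 440)
weighted = prevalence * m["sensitivity"] + (1 - prevalence) * m["specificity"]

for name, value in m.items():
    print(f"{name}: {value:.5f}")
print(f"weighted mean: {weighted:.5f}")
```

Because readmissions are rare, accuracy tracks specificity closely, which is why a model such as logistic regression can pair high sensitivity with low accuracy in the same table.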
Performance evaluation of the CatBoost model using the validation subset.
| Subgroup | N | AUC | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Male | 945 | 0.8192 | 0.7566 | 0.7857 | 0.7548 |
| Female | 829 | 0.8015 | 0.7214 | 0.7750 | 0.7186 |
| <1 year | 1464 | 0.8102 | 0.7575 | 0.7901 | 0.7556 |
| 1–6 years | 247 | 0.7963 | 0.6559 | 0.7143 | 0.6524 |
| 6–18 years | 63 | 0.9839 | 0.6667 | 1.000 | 0.6613 |
N, number; AUC, area under the curve; CI, confidence interval; NA, not applicable.
Figure 4. Importance score ranking of features in four readmission-predicting algorithms. (A) CatBoost. (B) Light Gradient Boosting Machine. (C) eXtreme Gradient Boosting. (D) Random forest.
Figure 5. SHapley Additive exPlanations (SHAP) for the CatBoost model. (A) Features ranked from most to least impactful on the prediction. (B) Distribution of each feature's impact on the model output. Within each row, each dot represents a patient, and dot color encodes the feature value: red for higher values and blue for lower. (C, D) Individualized predictions for two patients. Red and blue bars represent risk factors and protective factors, respectively; longer bars indicate greater feature importance. LOS, length of stay; PDE-5i, phosphodiesterase 5 inhibitors; CHD, congenital heart disease.
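
The per-patient bars in panels (C, D) rest on the Shapley value: each feature's contribution is its average marginal effect over all orders in which features could be revealed, and the contributions sum to the gap between the patient's prediction and a baseline prediction. The sketch below computes exact Shapley values by brute force for a two-feature toy model. It is not the SHAP library's algorithm (which uses faster tree-specific methods and a background dataset), and the risk function, feature values, and baseline are invented.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to a single `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Weight of coalitions of size k: k! (n - k - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(v):
    """HYPOTHETICAL risk score: v[0] = LOS in days, v[1] = 1 if surgery, else 0."""
    los, surgery = v
    return 0.05 + 0.01 * los + 0.2 * surgery + 0.02 * los * surgery


patient = [10, 1]   # made-up patient
reference = [0, 0]  # made-up baseline
phi = shapley_values(toy_risk, patient, reference)
print("contributions:", phi)
print("sum:", sum(phi), "gap:", toy_risk(patient) - toy_risk(reference))
```

The interaction term is split evenly between the two features, which is how Shapley attribution handles joint effects.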