Ali Jalali, Hannah Lonsdale, Nhue Do, Shelby Kutty, Mohamed Rehman, Luis M Ahumada, Jacquelin Peck, Monesha Gupta, Sharon R Ghazarian, Jeffrey P Jacobs.
Abstract
The Norwood surgical procedure restores functional systemic circulation in neonatal patients with single ventricle congenital heart defects, but this complex procedure carries a high mortality rate. In this study we address the need to provide an accurate patient-specific risk prediction for one-year postoperative mortality or cardiac transplantation and prolonged length of hospital stay, with the purpose of assisting clinicians and patients' families in the preoperative decision-making process. Currently available risk prediction models either do not provide patient-specific risk factors or only predict in-hospital mortality rates. We apply machine learning models to predict and calculate individual patient risk for mortality and prolonged length of stay using the Pediatric Heart Network Single Ventricle Reconstruction trial dataset. We applied a Markov Chain Monte-Carlo simulation method to impute missing data and then fed the selected variables to multiple machine learning models. The individual risk of mortality or cardiac transplantation calculation produced by our deep neural network model demonstrated 89 ± 4% accuracy and 0.95 ± 0.02 area under the receiver operating characteristic curve (AUROC). The C-statistics results for prediction of prolonged length of stay were 85 ± 3% accuracy and AUROC 0.94 ± 0.04. These predictive models and calculator may help to inform clinical and organizational decision making.
Year: 2020 PMID: 32518246 PMCID: PMC7283236 DOI: 10.1038/s41598-020-62971-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Demographics of patients in the PHN SVR dataset. Each patient’s sex, race and age on the day of their Norwood surgical procedure are shown. Gestational age is the gestational age at birth reported in weeks, indicating presence and degree of prematurity, with full term ≥ 37 weeks. % below federal poverty level is an indication of socioeconomic status. The turquoise color represents patients who survived to one year; red represents those who died.
Selected variables from the dataset for machine learning modeling.
| Category | Variables | Description | Scaling |
|---|---|---|---|
| Preoperative | Age | Days | softmax |
| | Sex | Male/Female | None |
| | Low birth weight | Birth weight < 2500 g | None |
| | Poverty score | Percentage income below federal poverty level | softmax |
| | Race | Race of the patient | softmax |
| | Surgery Volume | Number of Norwood surgeries at the hospital/year | min-max |
| | Gestational age at birth | Weeks | softmax |
| | Pre-term | Yes/No | None |
| | Prenatal diagnosis of congenital heart disease | Yes/No | None |
| | Fetal age at prenatal diagnosis | Weeks | softmax |
| | Any associated anatomic diagnoses? | Yes/No | None |
| | Apgar 1 | Apgar score at 1 minute | min-max |
| | Apgar 5 | Apgar score at 5 minutes | min-max |
| | Highest serum lactate | mmol/L | softmax |
| | Inhaled | Yes/No | None |
| | Inhaled | Yes/No | None |
| | Anatomic diagnosis | HLHS, Transposition of great arteries (TGA), etc. | min-max |
| | Pre-surgery cardiac catheterization? | Yes/No | None |
| | Fetal interventions | Yes/No | None |
| | Aortic Atresia | Yes/No | None |
| | Obstructed pulmonary venous return? | Yes/No | None |
| | Presence of HLHS? | Yes/No | None |
| | Number of significant pre-operative complications | 0-5 | min-max |
| | Number of pre-Norwood surgical interventions | 0-4 | min-max |
| | Type of shunt | Blalock Taussig or RV-to-PA | None |
| Intraoperative | Treatment | MBTS or RV-to-PA | None |
| | Cross-Clamp time | Minutes | softmax |
| | Bypass time | Minutes | softmax |
| | DHCA time | Deep Hypothermic Circulatory Arrest time in minutes | softmax |
| | RCP time | Retrograde Cerebral Perfusion time in minutes | softmax |
| | RCP flow | Retrograde Cerebral Perfusion cc/Kg/min | softmax |
| | Low Temp | Lowest temperature °C | softmax |
| | Lowest Hematocrit | % | softmax |
| | Ultrafiltration used during Cardiopulmonary Bypass (CPB)? | Yes/No | None |
| | Ultrafiltration used post CPB? | Yes/No | None |
| | Steroids given intraoperatively | Yes/No | None |
| | Trasylol (Aprotinin) given intra-operatively | Yes/No | None |
| | Alpha adrenergic receptor blockade? | Yes/No | None |
| | Was patient placed on extracorporeal membrane oxygenation? | Yes/No | None |
| | Exterior diameter ascending aorta | mm | softmax |
| | Type of arch reconstruction | Classic or direct | None |
| | Coarctectomy | Yes/No | None |
| | MBTS diameter | mm | softmax |
| | MBTS length | mm | softmax |
| | RV-to-PA diameter | mm | softmax |
| | RV-to-PA length | mm | softmax |
| | Was patient extubated in the operating room? | Yes/No | None |
| | Did patient require cardiopulmonary resuscitation? | Yes/No | None |
| | Oxygen saturation at the end of surgery | % | softmax |
Preoperative data were used for the mortality prediction model; the combination of preoperative and intraoperative data was used only for the prolonged LOS prediction. The Scaling column shows the method used to scale each variable.
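The two scaling methods named in the table can be illustrated with a minimal sketch. The source does not define "softmax" scaling; the version below assumes the common softmax (logistic) normalization, which passes each z-score through a logistic function. The function names and sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, pstdev


def min_max_scale(values):
    """Linearly rescale values to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def softmax_scale(values):
    """Assumed definition: logistic function applied to z-scores.

    Outputs lie in (0, 1); extreme values are compressed rather than
    stretching the range, unlike min-max scaling.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.5 for _ in values]
    return [1.0 / (1.0 + math.exp(-(v - mu) / sigma)) for v in values]


# Hypothetical example values, not data from the trial.
apgar_5 = [3, 7, 8, 9, 10]
bypass_minutes = [95, 120, 140, 160, 310]

print(min_max_scale(apgar_5))
print(softmax_scale(bypass_minutes))
```

Under this assumption, a single long bypass time (310 minutes here) is pulled toward 1 without compressing the remaining values toward 0, which is one plausible reason to prefer it for skewed continuous variables.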
Figure 2. Histogram distribution and box plot of the LOS data for patients who survived the Norwood procedure.
Optimal parameters of the developed models based on the Bayesian optimization technique.
| Model | Parameter | Range | Mortality | LOS |
|---|---|---|---|---|
| DNN | First hidden layer size | [100–200] | 120 | 110 |
| | Second hidden layer size | [80–180] | 100 | 100 |
| | Third hidden layer size | [20–70] | 30 | 40 |
| | Dropout ratio | [0.2–0.6] | 0.5, 0.5, 0.2 | 0.5, 0.5, 0.2 |
| | | [0.1–0.4] | 0.2 | 0.1 |
| | | [0.5–0.8] | 0.5 | 0.6 |
| | | [0.85–0.95] | 0.9 | 0.9 |
| | | [0.001–0.005] | 0.001 | 0.001 |
| | | [0.001–0.005] | 0.001 | 0.001 |
| GB | No of trees | [100–200] | 160 | 130 |
| | Learning rate | [0.1–0.3] | 0.09 | 0.15 |
| | Maximum depth | [3–7] | 5 | 4 |
| | Stochastic? | [Yes/No] | Yes | Yes |
| RF | No of trees | [100–200] | 120 | 150 |
| | Criterion | [Gini/Entropy] | Gini | Entropy |
| | Maximum depth | [3–7] | 6 | 6 |
| | Maximum features | [10–22] | 13 | 21 |
| DT | Criterion | [Gini/Entropy] | Gini | Gini |
| | Maximum depth | [3–7] | 7 | 5 |
| | Maximum features | [10–22] | 11 | 22 |
| LR | | | 0.17 | 0.19 |
| | Solver | [Stochastic Average Gradient (SAG)/Newton] | SAG | SAG |
The network size is the number of neurons in each layer. The dropout technique applies only to the hidden layers.
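As a rough illustration of dropout applied per hidden layer, the sketch below uses the layer sizes and dropout ratios listed for the mortality model. It assumes the widely used "inverted dropout" formulation; the source does not state which variant or library was used, and the function name and placeholder activations are hypothetical.

```python
import random


def inverted_dropout(activations, drop_ratio, rng, training=True):
    """Zero each activation with probability drop_ratio during training.

    Surviving activations are divided by the keep probability so the
    expected value is unchanged; at inference the input passes through.
    """
    if not training or drop_ratio == 0.0:
        return list(activations)
    keep = 1.0 - drop_ratio
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden_sizes = [120, 100, 30]   # mortality model, from the table above
drop_ratios = [0.5, 0.5, 0.2]   # one ratio per hidden layer

for size, ratio in zip(hidden_sizes, drop_ratios):
    layer = [1.0] * size        # placeholder activations, not real data
    dropped = inverted_dropout(layer, ratio, rng)
    kept = sum(1 for a in dropped if a != 0.0)
    print(f"layer size {size}, drop ratio {ratio}: kept {kept} units")
```

No dropout is applied to the input or output layer in this sketch, matching the table note.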
Detailed results of machine learning classifiers for prediction of mortality and post-surgery prolonged LOS.
| Outcome | Model | Precision | Recall | F-Score | Accuracy | AUROC |
|---|---|---|---|---|---|---|
| Mortality | Deep Neural Network | 0.94 ± 0.03 | 0.86 ± 0.04 | 0.89 ± 0.03 | 0.89 ± 0.04 | 0.95 ± 0.02 |
| | Gradient Boosting | 0.87 ± 0.03 | 0.78 ± 0.04 | 0.83 ± 0.04 | 0.84 ± 0.04 | 0.90 ± 0.04 |
| | Random Forest | 0.71 ± 0.04 | 0.27 ± 0.03 | 0.43 ± 0.03 | 0.75 ± 0.05 | 0.84 ± 0.03 |
| | Decision Tree | 0.43 ± 0.04 | 0.14 ± 0.05 | 0.29 ± 0.06 | 0.65 ± 0.04 | 0.58 ± 0.04 |
| | Ridge Regression | 0.43 ± 0.04 | 0.10 ± 0.04 | 0.28 ± 0.03 | 0.61 ± 0.04 | 0.55 ± 0.03 |
| Prolonged LOS | Deep Neural Network | 0.85 ± 0.04 | 0.91 ± 0.04 | 0.89 ± 0.04 | 0.85 ± 0.03 | 0.94 ± 0.04 |
| | Gradient Boosting | 0.87 ± 0.04 | 0.82 ± 0.05 | 0.83 ± 0.03 | 0.82 ± 0.03 | 0.88 ± 0.03 |
| | Random Forest | 0.62 ± 0.03 | 0.51 ± 0.05 | 0.55 ± 0.04 | 0.61 ± 0.03 | 0.67 ± 0.03 |
| | Decision Tree | 0.56 ± 0.04 | 0.49 ± 0.04 | 0.52 ± 0.05 | 0.53 ± 0.04 | 0.59 ± 0.05 |
| | Ridge Regression | 0.59 ± 0.05 | 0.32 ± 0.04 | 0.35 ± 0.04 | 0.63 ± 0.06 | 0.54 ± 0.07 |
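
For readers less familiar with the column headings, the following minimal sketch shows the standard definitions of precision, recall, F-score and accuracy from binary confusion counts. The counts are hypothetical, and the source does not describe its evaluation code, so this is an illustration of the definitions rather than the authors' pipeline.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        # F-score here is F1: the harmonic mean of precision and recall.
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical counts for one test split, not values from the study.
print(classification_metrics(tp=40, fp=10, fn=10, tn=40))
```

When metrics are averaged over several test sets, the mean F-score generally differs from the harmonic mean of the mean precision and mean recall, so the tabulated columns need not satisfy the F1 formula exactly.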
Figure 3. ROC curves for each machine learning model after testing on all 50 MCMC MI datasets. (Left) Risk of mortality prediction. (Right) Prolonged LOS prediction.
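The area under an ROC curve can be computed without plotting, using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a plain pairwise implementation with hypothetical labels and scores; it is not the authors' evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event) and predicted risks.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(auroc(labels, scores))  # 8 of 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is acceptable for illustration; sorting-based formulations are preferable for large datasets.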
Detailed results of the clustering F-statistics.
| Number of Clusters | F-Statistics |
|---|---|
| 2 | 369.5 |
| 3 | 228.1 |
| 4 | 163.7 |
| 5 | 127.3 |
| 6 | 102.6 |
| 7 | 85.5 |
| 8 | 74.77 |
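
The source does not define the F-statistic used to compare cluster counts. One common choice, assumed here, is the Calinski–Harabasz pseudo-F: the between-cluster variance divided by the within-cluster variance, each scaled by its degrees of freedom. The sketch below computes it for one-dimensional risk scores; the function name and example values are hypothetical.

```python
from statistics import mean


def pseudo_f(clusters):
    """Calinski-Harabasz pseudo-F for 1-D data grouped into clusters.

    F = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster
    sum of squares and W is the within-cluster sum of squares.
    """
    k = len(clusters)
    n = sum(len(c) for c in clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    grand_mean = mean([v for c in clusters for v in c])
    between = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)
    within = sum((v - mean(c)) ** 2 for c in clusters for v in c)
    if within == 0:
        raise ValueError("within-cluster variance is zero")
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical risk scores split into two clusters.
low_risk = [0.1, 0.2]
high_risk = [0.8, 0.9]
print(pseudo_f([low_risk, high_risk]))
```

Under this reading, larger values indicate clusters that are compact and well separated relative to each other.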
Figure 4. The calculator display of the mortality or cardiac transplantation risk and cluster for an example patient. The section labeled 1 contains two columns that allow the user to input a new patient’s data values such as age, sex, race and anatomic diagnosis. These values are used by the DNN model to calculate the patient-specific risk score and provide a prediction for one-year transplant-free survival (section 2). Section 3, a stacked bar graph, depicts cluster segments for the registry population’s risk scores and allows clinicians to evaluate a patient’s risk as low, medium or high.