Yuanda Zhu, Janani Venugopalan, Zhenyu Zhang, Nikhil K Chanani, Kevin O Maher, May D Wang.
Abstract
More than 5 million patients are admitted annually to intensive care units (ICUs) in the United States. The leading causes of mortality are cardiovascular failure, multi-organ failure, and sepsis. Data-driven techniques have been used to analyze patient data and predict adverse events such as ICU mortality and ICU readmission. These models often use temporal or static features from a single ICU database to predict subsequent adverse events. To explore the potential of domain adaptation, we propose a data analysis method that uses gradient boosting and a convolutional autoencoder (CAE) to predict significant adverse events in the ICU, such as ICU mortality and ICU readmission. We present results from a retrospective analysis of patient records from a publicly available database, Multi-parameter Intelligent Monitoring in Intensive Care-II (MIMIC-II), and a local database from Children's Healthcare of Atlanta (CHOA). We demonstrate that, after applying novel data imputation to patient ICU data, gradient boosting is effective in both the mortality prediction task and the ICU readmission prediction task. In addition, we use gradient boosting to identify the top-ranking temporal and non-temporal features in both prediction tasks, and we discuss the relationship between these features and each task. Lastly, we show that CAE might not be effective for feature extraction on a single dataset, but domain adaptation with CAE feature extraction across two datasets shows promising results.
Keywords: clinical decision support; convolutional autoencoder; domain adaptation (DA); gradient boosting; intensive care units; mortality prediction
Year: 2022 PMID: 35481281 PMCID: PMC9036368 DOI: 10.3389/frai.2022.640926
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Summary of Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) and Children's Healthcare of Atlanta (CHOA) data.
| Category | MIMIC | CHOA | Feature type |
|---|---|---|---|
| Sample size | 40,416 patient records | 5,739 patient records | |
| Demographics | Gender, age, height, weight, ethnicity, comorbidity | DOB, gender, age, height, weight, ethnicity, religion, date of death, co morbidity with other diseases | Non-temporal |
| Lab data | Urea, albumin, bilirubin, creatinine, sodium, potassium, calcium | Urea, albumin, bilirubin, creatinine, sodium, potassium, calcium | Temporal |
| Chart data | Heart rate, blood pressure, NBP, CVP, SaO2, arterial PH, arterial PaCO2, arterial PaO2 | Temporal | |
| Microbiology | Types of microbes, amount of microbes, dilution | Non-temporal | |
| Medication data | Medication and IV administered, dosage, duration time, concentrations and rate of administration, composition of IV imposed | Non-temporal |
Figure 1. Structure of the proposed convolutional autoencoder.
Figure 2. Diagrams of the three experimental settings. The top one applies classifiers directly to the Children's Healthcare of Atlanta (CHOA) data. The middle one applies non-negative matrix factorization (NMF) and a convolutional autoencoder (CAE) to CHOA without domain adaptation. The bottom one is domain adaptation using CAE.
Average and standard deviation (SD) of AUC-ROC score for mortality prediction using shallow classifiers with temporal features only, non-temporal features only, and both types of features.
| Classifier | | | |
|---|---|---|---|
| Gradient boosting | | 0.91020 (0.03810) | 0.90132 (0.01824) |
| Random forest | 0.94583 (0.01982) | 0.92798 (0.03218) | 0.87685 (0.02455) |
| Linear regression | 0.71654 (0.08129) | 0.80372 (0.05604) | 0.74497 (0.06087) |
| SVM | 0.62956 (0.02939) | 0.62049 (0.03315) | 0.78573 (0.04997) |
The highest average score is highlighted in bold and underline.
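The "average (SD)" entries in these tables summarize AUC-ROC over repeated evaluations. A minimal sketch of that summary in plain Python follows. The fold data are made up for illustration, the helper names `auc_roc` and `summarize` are hypothetical, and the use of the sample standard deviation is an assumption, since the record does not say which estimator the authors used.

```python
from statistics import mean, stdev


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def summarize(fold_aucs):
    """Return (mean, sample SD) of per-fold AUC values (SD choice is assumed)."""
    return mean(fold_aucs), stdev(fold_aucs)


# Hypothetical (labels, predicted scores) for three evaluation folds.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    ([1, 0, 1, 0], [0.6, 0.7, 0.8, 0.1]),
    ([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.2]),
]

fold_aucs = [auc_roc(labels, scores) for labels, scores in folds]
avg, sd = summarize(fold_aucs)

# Same "average (SD)" layout as the table cells.
print(f"{avg:.5f} ({sd:.5f})")
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based implementation would be preferable for datasets of the size reported here.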
Average and standard deviation (SD) of AUC-ROC score for ICU readmission prediction using shallow classifiers with temporal features only, non-temporal features only, and both types of features.
| Classifier | | | |
|---|---|---|---|
| Gradient boosting | | 0.60214 (0.03058) | 0.75414 (0.03747) |
| Random forest | 0.74103 (0.03934) | 0.60882 (0.01912) | 0.74421 (0.04731) |
| Linear regression | 0.55374 (0.01909) | 0.60293 (0.02486) | 0.70807 (0.03570) |
| SVM | 0.54261 (0.03597) | 0.46318 (0.01806) | 0.66418 (0.04506) |
The highest average score is highlighted in bold and underline.
Mortality prediction results on Children's Healthcare of Atlanta (CHOA) data (both temporal and non-temporal features) under three different experimental settings.
| Classifier | | | |
|---|---|---|---|
| Gradient boosting | | 0.74864 (0.04162) | 0.86375 (0.02267) |
| Random forest | 0.94583 (0.01982) | 0.74855 (0.03668) | 0.86375 (0.02267) |
| Linear regression | 0.71654 (0.08129) | 0.69893 (0.05303) | 0.76208 (0.03429) |
| SVM | 0.62956 (0.02939) | 0.50896 (0.03349) | 0.44725 (0.12380) |
The highest average score is highlighted in bold and underline.
Figure 3. Boxplot of mortality prediction results using four different classifiers under the three experimental settings.
Top 10 features in mortality prediction using gradient boosting on Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) and Children's Healthcare of Atlanta (CHOA) data.
| Rank | MIMIC | CHOA |
|---|---|---|
| 1 | sofa_max | component_name_art base deficit |
| 2 | sofa_min | dx_code_v49.86 |
| 3 | hospital_los | component_name_nrbc |
| 4 | icustay_los | component_name_patient fi02 |
| 5 | sapsi_max | component_name_plasma free hgb |
| 6 | sapsi_min | ventilator_days |
| 7 | subject_icustay_total_num | dx_rank |
| 8 | cost_weight | component_name_ast (sgot) |
| 9 | sofa_first | dx_present_on_admit_yn |
| 10 | peptic_ulcer | value_in_range_yn |
ICU readmission prediction results on Children's Healthcare of Atlanta (CHOA) data (both temporal and non-temporal features) under three different experimental settings.
| Classifier | | | |
|---|---|---|---|
| Gradient boosting | | 0.57404 (0.01665) | 0.62448 (0.03845) |
| Random forest | 0.74103 (0.03934) | 0.59993 (0.01988) | 0.63117 (0.03904) |
| Linear regression | 0.55374 (0.01909) | 0.57664 (0.00749) | 0.58234 (0.01914) |
| SVM | 0.54261 (0.03597) | 0.51617 (0.01912) | 0.44870 (0.03285) |
The highest average score is highlighted in bold and underline.
Figure 4. Boxplot of intensive care unit (ICU) readmission prediction results using four different classifiers under the three experimental settings.
Top 10 features in ICU readmission prediction using gradient boosting on Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) and Children's Healthcare of Atlanta (CHOA) data.
| Rank | MIMIC | CHOA |
|---|---|---|
| 1 | Gender | dx_code_v44.1 |
| 2 | hospital_los | dx_rank |
| 3 | tidal volume (obser) | apgar_score_5_minutes |
| 4 | temperature f | gestation_age_weeks |
| 5 | icustay_los | dx_type_coded final |
| 6 | spo2 | nicu_yn |
| 7 | carbon dioxide | birth_weight |
| 8 | renal_failure | discharge_destination |
| 9 | congestive_heart_failure | dx_present_on_admit_exempt_yn |
| 10 | fingerstick glucose | birth_length |
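The top-10 feature tables come from ranking features with gradient boosting. One common way to produce such a ranking is sketched below with scikit-learn's `GradientBoostingClassifier` and its `feature_importances_` attribute. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the feature names and the `rank_features` helper are hypothetical, the hyperparameters are library defaults, and the record does not state which importance measure was actually used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def rank_features(X, y, names, top_k=10, seed=0):
    """Fit gradient boosting and return the top_k (name, importance) pairs.

    Importances are scikit-learn's impurity-based values, which sum to 1.
    """
    model = GradientBoostingClassifier(random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1][:top_k]  # indices, most important first
    return [(names[i], float(importances[i])) for i in order]


# Synthetic stand-in data: only the third column drives the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 2] + 0.1 * rng.normal(size=400) > 0).astype(int)
names = ["feat_a", "feat_b", "feat_c", "feat_d", "feat_e"]  # hypothetical names

ranking = rank_features(X, y, names, top_k=3)
for rank, (name, importance) in enumerate(ranking, start=1):
    print(rank, name, round(importance, 4))
```

Impurity-based importances can favor high-cardinality or correlated features, so a ranking like this is best read as a starting point for clinical interpretation rather than as evidence of causal relevance.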