Behrooz Davazdahemami, Hamed M Zolbanin, Dursun Delen.
Abstract
One of the major challenges that confront medical experts during a pandemic is the time required to identify and validate the risk factors of the novel disease and to develop an effective treatment protocol. Traditionally, this process involves numerous clinical trials that may take several years, during which strict preventive measures must be in place to control the outbreak and reduce deaths. Advanced data analytics techniques, however, can be leveraged to guide and speed up this process. In this study, we combine evolutionary search algorithms, deep learning, and advanced model interpretation methods to develop a holistic exploratory-predictive-explanatory machine learning framework that can assist clinical decision-makers in reacting to the challenges of a pandemic in a timely manner. The proposed framework is showcased by studying emergency department (ED) readmissions of COVID-19 patients using ED visits from a real-world electronic health records database. After an exploratory feature selection phase using a genetic algorithm, we develop and train a deep artificial neural network to predict early (i.e., 7-day) readmissions (AUC = 0.883). Lastly, a SHAP model is formulated to estimate additive Shapley values (i.e., importance scores) of the features and to interpret the magnitude and direction of their effects. The findings are mostly in line with those reported by lengthy and expensive clinical trial studies.
Keywords: COVID-19; Deep learning; Genetic algorithm; Machine learning; Pandemic; SHAP
Year: 2022 PMID: 35068629 PMCID: PMC8763415 DOI: 10.1016/j.dss.2022.113730
Source DB: PubMed Journal: Decis Support Syst ISSN: 0167-9236 Impact factor: 6.969
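The exploratory feature selection phase mentioned in the abstract can be illustrated with a minimal genetic algorithm over binary feature masks. This is a sketch under stated assumptions, not the authors' implementation: the function `ga_select`, the truncation selection, one-point crossover, elitism, and the toy fitness function are all hypothetical choices made for illustration. In a real pipeline the fitness would typically be a cross-validated model score on the selected columns.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Toy genetic algorithm over binary feature masks (illustrative only)."""
    rng = random.Random(seed)
    # Each individual is a 0/1 mask: 1 means "keep this feature".
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: carry over the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Hypothetical fitness: reward three "informative" features and
# penalize mask size, standing in for a validation score.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = ga_select(n_features=10, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because the best mask of each generation is copied unchanged into the next, the best fitness never decreases across generations.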
Predictive studies on hospital and ED readmissions.
| Study | Readmission facility | Disease(s) | Sample size (patients) | Period | Best performing method | Metric | Value |
|---|---|---|---|---|---|---|---|
| [ | Hospital | COPD | 106 | 30 days | Random Forest | AUC | 0.720 |
| [ | Hospital | Multiple | 92,530 | 30 days | LASSO & SVM | AUC | 0.680 |
| [ | Hospital | PN | 40,442 | 30 days | Deep Neural Network (DNN) | AUC | 0.734 |
| [ | Hospital | HF | 4210 | 30 days | SVM | AUC | 0.660 |
| [ | Hospital | CHF | 1641 | 30 days | PSO-SVM | Accuracy | 0.784 |
| [ | Hospital | Multiple | 64,912 | 3 days | Ensemble | AUC | 0.666 |
| [ | Hospital | CHF | 4840 | 30 days | CHAID Decision Tree | AUC | 0.707 |
| [ | Hospital | Diabetes | Not Given | 30 days | Recurrent NN | AUC | 0.800 |
| [ | Hospital | Multiple | 32,718 | 30 days | DNN | AUC | 0.780 |
| [ | Hospital | Multiple | 304,888 | 30 days | Ensemble | AUC | 0.771 |
| [ | Hospital | Lupus | 9457 | 30 days | DNN | AUC | 0.700 |
| [ | Hospital | Multiple | 700 | 30 days | DNN | AUC | 0.730 |
| [ | Hospital | CHF | 32,350 | 30 days | Random Forest | AUC | 0.742 |
| [ | Hospital | COPD | 111,992 | 30 days | Gradient Boosting Tree | AUC | 0.653 |
| [ | Hospital | Multiple | 38,597 | 30 days | DNN | AUC | 0.714 |
| [ | ED | Multiple | 279,611 | 30 days | Decision Tree | Accuracy | 0.772 |
| [ | ED | Multiple | 330,631 | 3 days | Gradient Boosting | Sensitivity | 0.16 |
| [ | ED | Multiple | 120,000 | 3 days | DNN | Accuracy | 0.680 |
| [ | ED | Multiple | 290,000 | 3 days | Ensemble | Accuracy | 0.957 |
*AUC stands for Area Under the (Receiver Operating Characteristic) Curve.
COPD: Chronic Obstructive Pulmonary Disease; PN: Pneumonia; HF: Heart Failure; CHF: Congestive Heart Failure; AMI: Acute Myocardial Infarction; THA/TKA: Total Hip/Knee Arthroplasty.
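Since most studies in the table report AUC, a small worked example may help when reading the values. The sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name `auc` and the sample labels and scores are illustrative and do not come from any of the listed studies.

```python
def auc(labels, scores):
    """Pairwise (rank-based) AUC for binary labels, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data: two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75: three of four positive/negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for the values between 0.653 and 0.800 reported above.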
Risk factors for readmission among COVID-19 patients.
| Study | Location | Sample size | Readmission rate | Risk factors |
|---|---|---|---|---|
| [ | USA | 279 | 6.8% | Comorbidity (hypertension, diabetes, COPD, liver disease, cancer, substance abuse) |
| [ | South Korea | 7590 | 4.5% | Gender (men), age, medical aid subscription, comorbidity, chest radiographs, computed tomography (CT) scans, HIV antivirals |
| [ | USA | 106,543 | 9% | Discharge disposition, age, comorbidity (COPD, CHF, diabetes, chronic kidney disease, obesity) |
| [ | Spain | 1368 | 4.4% | Weakened immune system, having fever within 48 h prior to discharge |
| [ | Turkey | 154 | 7.1% | Malignant tumor, Hypertension |
| [ | USA | 1775 | 19.9% | Age |
Summary statistics of the patients.
| Variable | Average (std dev) / proportion |
|---|---|
| Age | 45.81 (17.03) |
| Race | White 39.8% |
| Gender | Male 53.1% |
Fig. 1. Deep fully-connected MLP network architecture.
Classification cost function weights.
| | Actual negative | Actual positive |
|---|---|---|
| Predict negative | 0 | 1 |
| Predict positive | 0.226 | 0 |
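To make the cost weights concrete, the sketch below totals the misclassification cost of a set of predictions and derives the probability threshold that minimizes expected cost. It assumes the weights are read as a false negative costing 1 and a false positive costing 0.226, and that predicted probabilities are calibrated. Whether the weights were applied inside the training loss or at the decision threshold is not stated here, so this shows only one possible use. The names `COST`, `total_cost`, and `min_cost_decision` are hypothetical.

```python
# Cost weights keyed by (predicted, actual), taken from the table above.
COST = {
    ("neg", "neg"): 0.0,
    ("neg", "pos"): 1.0,    # false negative: a readmission is missed
    ("pos", "neg"): 0.226,  # false positive: a non-readmission is flagged
    ("pos", "pos"): 0.0,
}


def total_cost(predicted, actual):
    """Sum the cost of every (predicted, actual) pair."""
    return sum(COST[(p, a)] for p, a in zip(predicted, actual))


def min_cost_decision(p_pos):
    """Pick the label with the lower expected cost, given P(actual = pos)."""
    cost_if_neg = p_pos * COST[("neg", "pos")]
    cost_if_pos = (1.0 - p_pos) * COST[("pos", "neg")]
    return "pos" if cost_if_pos < cost_if_neg else "neg"


# Break-even probability: p * 1 = (1 - p) * 0.226, so p = 0.226 / 1.226.
threshold = COST[("pos", "neg")] / (COST[("pos", "neg")] + COST[("neg", "pos")])
print(round(threshold, 4))

print(total_cost(["neg", "pos", "pos", "neg"], ["pos", "neg", "pos", "neg"]))
```

Under these assumptions the break-even probability is about 0.18, so a case would be flagged as a likely readmission well below the conventional 0.5 cutoff.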
Fig. 2. Summary of the proposed framework.
Grid search settings for hyperparameter optimization.
| Hyperparameter | Range of values | Number of values tested |
|---|---|---|
| Optimizer | [Adam, Nadam, RMSProp] | 3 |
| Learning rate | [0.005, 0.05] | 10 |
| Decay rate | [0.05, 0.30] | 6 |
| Batch size | [4, 8, 16, 32] | 4 |
| Epochs | [100, 200, 300] | 3 |
| Regularization weight | [0.0001, 0.001, 0.01] | 3 |
| Total number of permutations | | 6480 |
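The total of 6480 is the product of the per-hyperparameter counts (3 × 10 × 6 × 4 × 3 × 3). The sketch below enumerates such a grid with `itertools.product`. The table gives only the endpoints and the number of values for the learning rate and decay rate, so the evenly spaced values used here (steps of 0.005 and 0.05) are an assumption, as are the dictionary keys.

```python
from itertools import product

grid = {
    "optimizer": ["Adam", "Nadam", "RMSProp"],
    # Assumed spacing: 10 values from 0.005 to 0.05 in steps of 0.005.
    "learning_rate": [round(0.005 * k, 3) for k in range(1, 11)],
    # Assumed spacing: 6 values from 0.05 to 0.30 in steps of 0.05.
    "decay_rate": [round(0.05 * k, 2) for k in range(1, 7)],
    "batch_size": [4, 8, 16, 32],
    "epochs": [100, 200, 300],
    "regularization_weight": [0.0001, 0.001, 0.01],
}

# One dict per hyperparameter combination in the full grid.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 6480
```

Each entry of `configs` could then be passed to a training routine and scored on a validation set, which is the usual shape of an exhaustive grid search.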
Optimal set of hyperparameters.
| Hyperparameter | Optimal value |
|---|---|
| Optimizer | Adam |
| Learning rate | 0.01 |
| Decay rate | 0.25 |
| Batch size | 8 |
| Epochs | 300 |
| Regularization weight | 0.0001 |
Fig. 3. The Receiver Operating Characteristic (ROC) curve of the best ANN model.
Prediction model results comparison with similar studies.
| Article | Context | Readmission window | Approach | Model metrics |
|---|---|---|---|---|
| Present study | COVID-19 | 7-day | GA + DNN | Acc = 0.874 |
| [ | General | 30-day | Decision Tree | Acc = 0.772 |
| [ | General | 3-day | Gradient Boosting | Sens = 0.16 |
| [ | General | 9-day | Gradient Boosting | Sens = 0.23 |
| [ | General | 3-day | DNN | Acc = 0.680 |
| [ | General | 3-day | Ensemble | Acc = 0.957 |
Fig. 4. SHAP importance scores of top features overall (top), medications (middle), and comorbidities (bottom). Green (patterned) = decreasing readmission odds; Red (solid) = increasing readmission odds.
Fig. 5. SHAP feature importance scores of a sample patient (individual level).
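The additive Shapley values behind these figures can be illustrated with an exact computation on a tiny model. The sketch below enumerates every coalition of features, replacing absent features with a baseline value, which is feasible only for a handful of features. The SHAP library relies on approximations for deep networks, and the specific explainer used in the study is not stated here. The function `shapley_values`, the toy model, and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance x, relative to a baseline input."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their real value; the rest use the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical risk score with one interaction term.
    return 0.5 * z[0] - 0.2 * z[1] + 0.1 * z[0] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The values sum to the difference between the model output for the instance and for the baseline, which is the additivity property. A positive value pushes the prediction up (the red, solid bars in Fig. 4) and a negative value pushes it down (the green, patterned bars). The interaction term's contribution is split evenly between the two features involved.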