Yu-Ching Chen1,2, Jo-Hsuan Chung1, Yu-Jo Yeh1, Shi-Jer Lou1,3, Hsiu-Fen Lin4,5, Ching-Huang Lin6, Hong-Hsi Hsien7, Kuo-Wei Hung8, Shu-Chuan Jennifer Yeh1,9, Hon-Yi Shi1,3,9,10,11.
Abstract
Background: Machine learning algorithms for predicting 30-day readmission after stroke are rarely discussed. The aims of this study were to identify significant predictors of 30-day readmission after stroke and to compare prediction accuracy and area under the receiver operating characteristic curve (AUROC) across five machine learning models, namely artificial neural network (ANN), K nearest neighbor (KNN), random forest (RF), support vector machine (SVM), and naive Bayes classifier (NBC), and a Cox regression (COX) model.
Keywords: 30-day readmission; artificial neural network; feature importance analysis; post-acute care; stroke
Year: 2022 PMID: 35860493 PMCID: PMC9289395 DOI: 10.3389/fneur.2022.875491
Source DB: PubMed Journal: Front Neurol ISSN: 1664-2295 Impact factor: 4.086
Studies that used machine learning to predict 30-day readmission.
| Authors (country) | Study population | Models | Main findings |
|---|---|---|---|
| Lineback et al. (USA) ( | 2,855 patients with stroke | 1. Logistic regression (LR) | Advanced machine learning (ML) methods along with natural language processing (NLP) features outperformed logistic regression for all-cause readmission [area under the curve (AUC), 0.64 vs. 0.58; |
| Darabi et al. (USA) ( | 3,184 patients with ischemic stroke | 1. Logistic regression (LR) | 1. GBM provided the highest AUC (0.68), specificity (0.95), and positive predictive value (PPV) (0.33) when compared to the other models 2. In terms of AUC, specificity, and PPV, the LR had poor performance compared to XGBoost and GBM models |
| Xu et al. | 6,070 patients with ischemic stroke | 1. Extreme gradient boosting (XGBoost) | The AUC values of the XGBoost model and logistic model for predicting readmission were 0.782 (0.729–0.834) and 0.771 (0.714–0.828), respectively |
| Sarajlic et al. (Sweden) ( | 149,447 patients with acute myocardial infarction | 1. Random forests (RF) | The full logistic regression model with 25 predictors had a C-index of 0.67 as compared with the best-performing ML model (Random Forest) with only 10 predictors and a C-index of 0.73 |
| Sharma et al. (Canada) ( | 9,845 patients with heart failure | 1. Extreme gradient boosting (XGBoost) | 1. The boosted tree-based ML algorithms had the highest AUC with XGBoost compared to the L1 logistic regression (0.685 vs. 0.591) in predicting 30-day readmission 2. Calibration plots for XGBoost showed that predicted readmission was aligned with observed risks and that low predicted risks were associated with fewer actual outcomes highlighting higher negative predicted values at lower predicted risks |
| Wang et al. | 47,498 eligible patients with heart failure with reduced ejection fraction | 1. Logistic regression (LR) | 1. The best AUCs of deep learning (DL) models without a buffer window in predicting heart failure hospitalizations and worsening heart failure events in the total patient cohort were 0.977 and 0.972, respectively 2. The best AUCs in predicting 30-day readmission in all adult patients were 0.597 and 0.614, respectively 3. For all outcomes assessed, the DL approach outperformed traditional machine learning (ML) models |
| Amritphale et al. (USA) ( | 16,745 patients with carotid artery stenting | 1. Logistic regression (LR) | 1. The machine learning DNN prediction model had a C-statistic of 0.79 for predicting all-cause unplanned readmission within 30 days of discharge after the index carotid artery stenting 2. The DNN model showed significantly higher receiver operating characteristic (ROC; 0.802 vs. 0.680, 0.670, 0.607, and 0.586, respectively) and precision-recall (0.383 vs. 0.140, 0.140, 0.380, and 0.269, respectively) values than the LR, SVM, RF, and DT models in predicting 30-day readmission among patients with carotid artery stenting |
Figure 1. Flowchart of the study.
Figure 2. Conceptual framework of the proposed method for predicting readmission within 30 days after stroke.
Baseline characteristics of the study population (N = 1,476).
| Variable | n (%) or mean ± SD |
|---|---|
| | |
| No | 193 (13.1) |
| Yes | 1,283 (86.9) |
| | |
| Age (years) | 65.5 ± 13.0 |
| Gender | |
| Female | 554 (37.5) |
| Male | 922 (62.5) |
| Education (years) | 8.9 ± 2.1 |
| Body mass index (kg/m2) | 24.0 ± 2.6 |
| | |
| Stroke type | |
| Ischemic | 1,224 (82.9) |
| Hemorrhagic | 252 (17.1) |
| Nasogastric tube | |
| No | 1,187 (80.4) |
| Yes | 289 (19.6) |
| Foley catheter | |
| No | 1,342 (90.9) |
| Yes | 134 (9.1) |
| Hypertension | |
| No | 449 (30.4) |
| Yes | 1,027 (69.6) |
| Diabetes mellitus | |
| No | 906 (61.4) |
| Yes | 570 (38.6) |
| Hyperlipidemia | |
| No | 967 (65.5) |
| Yes | 509 (34.5) |
| Atrial fibrillation | |
| No | 1,354 (91.7) |
| Yes | 122 (8.3) |
| Previous stroke | |
| No | 1,250 (84.7) |
| Yes | 226 (15.3) |
| Acute care length of stay (days) | 15.2 ± 9.0 |
| Rehabilitation length of stay (days) | 44.9 ± 21.2 |
| Readmission in 30 days | |
| No | 1,356 (91.9) |
| Yes | 120 (8.1) |
| | |
| BI score | 39.0 ± 23.7 |
| FOIS score | 5.5 ± 2.1 |
| EQ-5D score | 10.4 ± 1.9 |
| IADL score | 1.2 ± 1.1 |
| BBS score | 15.6 ± 15.8 |
| MMSE score | 19.4 ± 8.9 |
Data are presented as frequencies (percentages) for categorical variables and as mean ± standard deviation for continuous variables.
SD, standard deviation; BI, Barthel Index; FOIS, Functional Oral Intake Scale; EQ-5D, EuroQoL Quality of Life Scale; IADL, Instrumental Activities of Daily Living Scale; BBS, Berg Balance Scale; MMSE, Mini-Mental State Examination.
Univariate analysis of selected risk factors for 30-day readmission in patients with stroke (N = 1,476).
| Variable | Test statistic | P-value |
|---|---|---|
| | 52.074 | <0.001 |
| | | |
| Age (years) | 7.890 | 0.005 |
| Gender (female vs. male) | 23.657 | <0.001 |
| Education (years) | 10.870 | <0.001 |
| Body mass index (kg/m2) | 7.944 | 0.005 |
| | | |
| Stroke type (ischemic vs. hemorrhagic) | 32.053 | <0.001 |
| Nasogastric tube (yes vs. no) | 49.361 | <0.001 |
| Foley catheter (yes vs. no) | 5.590 | 0.018 |
| Hypertension (yes vs. no) | 4.564 | 0.033 |
| Diabetes mellitus (yes vs. no) | 7.324 | 0.007 |
| Hyperlipidemia (yes vs. no) | 5.777 | 0.016 |
| Atrial fibrillation (yes vs. no) | 6.114 | 0.013 |
| Previous stroke (yes vs. no) | 6.899 | 0.009 |
| Acute care length of stay, days | 30.008 | <0.001 |
| Rehabilitation length of stay, days | 26.508 | <0.001 |
| | | |
| BI score | 37.494 | <0.001 |
| FOIS score | 26.508 | <0.001 |
| EQ-5D score | 16.712 | <0.001 |
| IADL score | 22.726 | <0.001 |
| BBS score | 14.903 | <0.001 |
| MMSE score | 34.665 | <0.001 |
One-way analysis of variance and Fisher's exact test were used to assess associations between the variables and 30-day readmission.
BI, Barthel Index; FOIS, Functional Oral Intake Scale; EQ-5D, EuroQoL Quality of Life Scale; IADL, Instrumental Activities of Daily Living Scale; BBS, Berg Balance Scale; MMSE, Mini-Mental State Examination.
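The univariate screening above relies in part on Fisher's exact test for categorical predictors. As a minimal sketch of that test for a 2×2 table, the snippet below computes a two-sided p-value from the hypergeometric distribution using only the Python standard library. The function name `fisher_exact_two_sided` and the cell counts in the usage line are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts: rows = risk factor yes/no, columns = readmitted yes/no
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

For tables with large counts, a statistics library is usually a better choice than this direct enumeration.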
Hyperparameters and final settings for all machine learning algorithms.
| Algorithm | Hyperparameter | Setting |
|---|---|---|
| Artificial neural network (ANN) | Hidden layers | 6 |
| | Hidden neurons | 512-256-128-64-32-1 |
| | Learning rate | 0.001 |
| K nearest neighbor (KNN) | Neighbors | 5 |
| Support vector machine (SVM) | C (penalty) | 1.0 |
| | Gamma | 1/[ |
| Naive Bayes classifier (NBC) | Alpha | 1.0 |
| Random forest (RF) | Estimators | 100 |
| | Split min | 2 |
| | Leaf min | 1 |
| Cox regression (COX) | – | – |
The optimizer algorithm was Adam.
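To give a sense of the size of the ANN described above, the following sketch counts the weights and biases of a fully connected network with layer widths 512-256-128-64-32-1. The input dimension (`n_inputs = 20`) is an assumption chosen purely for illustration, since the table does not state it, and `count_dense_parameters` is a hypothetical helper rather than part of the study's code.

```python
def count_dense_parameters(n_inputs: int, widths: list[int]) -> int:
    """Weights plus biases of a fully connected network with the given layer widths."""
    total = 0
    fan_in = n_inputs
    for width in widths:
        total += fan_in * width + width  # weight matrix plus one bias per neuron
        fan_in = width
    return total

ANN_WIDTHS = [512, 256, 128, 64, 32, 1]  # layer widths listed in the table
n_params = count_dense_parameters(20, ANN_WIDTHS)  # 20 inputs is an assumption
```

Under that assumption the network has roughly 185,000 trainable parameters, which is large relative to a cohort of 1,476 patients.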
Comparison of 1,000 pairs of forecasting models for predicting 30-day readmission in patients with stroke (N = 1,476).
| Model | | | | | | |
|---|---|---|---|---|---|---|
| ANN (95% CI) | 0.73 (0.65, 0.82) | 0.98 (0.96, 0.99) | 0.88 (0.84, 0.92) | 0.77 (0.70, 0.84) | 0.92 (0.89, 0.95) | 0.94 (0.91, 0.97) |
| KNN (95% CI) | 0.59 (0.50, 0.68) | 0.86 (0.82, 0.90) | 0.56 (0.47, 0.65) | 0.64 (0.56, 0.72) | 0.83 (0.78, 0.88) | 0.76 (0.68, 0.84) |
| RF (95% CI) | 0.70 (0.64, 0.76) | 0.92 (0.87, 0.97) | 0.79 (0.75, 0.83) | 0.71 (0.64, 0.78) | 0.88 (0.84, 0.92) | 0.85 (0.80, 0.90) |
| SVM (95% CI) | 0.49 (0.39, 0.59) | 0.96 (0.93, 0.99) | 0.76 (0.68, 0.84) | 0.62 (0.54, 0.70) | 0.89 (0.85, 0.93) | 0.74 (0.66, 0.82) |
| NBC (95% CI) | 0.48 (0.38, 0.59) | 0.96 (0.93, 0.99) | 0.50 (0.40, 0.60) | 0.69 (0.61, 0.77) | 0.81 (0.75, 0.87) | 0.73 (0.65, 0.81) |
| COX (95% CI) | 0.51 (0.42, 0.61) | 0.97 (0.95, 0.99) | 0.77 (0.69, 0.85) | 0.71 (0.63, 0.79) | 0.85 (0.80, 0.90) | 0.88 (0.83, 0.93) |
| P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
| ANN (95% CI) | 0.70 (0.62, 0.78) | 0.97 (0.95, 0.99) | 0.89 (0.85, 0.93) | 0.82 (0.76, 0.88) | 0.93 (0.90, 0.96) | 0.89 (0.85, 0.93) |
| KNN (95% CI) | 0.53 (0.44, 0.62) | 0.88 (0.84, 0.92) | 0.60 (0.51, 0.69) | 0.71 (0.63, 0.79) | 0.71 (0.63, 0.79) | 0.81 (0.75, 0.87) |
| RF (95% CI) | 0.69 (0.62, 0.76) | 0.94 (0.92, 0.96) | 0.85 (0.82, 0.88) | 0.79 (0.76, 0.82) | 0.88 (0.84, 0.92) | 0.87 (0.83, 0.91) |
| SVM (95% CI) | 0.53 (0.44, 0.62) | 0.93 (0.90, 0.96) | 0.75 (0.67, 0.82) | 0.78 (0.71, 0.85) | 0.82 (0.73, 0.89) | 0.80 (0.74, 0.86) |
| NBC (95% CI) | 0.50 (0.40, 0.60) | 0.93 (0.90, 0.96) | 0.63 (0.54, 0.72) | 0.79 (0.72, 0.86) | 0.83 (0.76, 0.90) | 0.84 (0.78, 0.90) |
| COX (95% CI) | 0.54 (0.45, 0.64) | 0.96 (0.94, 0.98) | 0.88 (0.83, 0.93) | 0.61 (0.53, 0.69) | 0.87 (0.82, 0.92) | 0.87 (0.82, 0.92) |
| P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
ANN, artificial neural network; KNN, K nearest neighbor; RF, random forest; SVM, support vector machine; NBC, naive Bayes classifier; COX, Cox regression; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve; CI, confidence interval.
The P-value is the statistical significance of the forecasting models and performance indices calculated using a Chi-squared test.
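PPV and NPV, abbreviated in the notes above, are derived from a confusion matrix, as are related indices such as sensitivity, specificity, and accuracy. A minimal sketch of those definitions follows; the function name and the counts are hypothetical and do not come from the study data.

```python
def performance_indices(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # readmitted patients correctly flagged
        "specificity": tn / (tn + fp),  # non-readmitted patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who were readmitted
        "npv": tn / (tn + fn),          # cleared patients who were not readmitted
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for 200 patients
indices = performance_indices(tp=30, fp=10, tn=150, fn=10)
```

Because PPV and NPV depend on how common readmission is in the sample, they are not directly comparable across cohorts with different readmission rates.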
Figure 3. Performance indices of forecasting models used to predict 30-day readmission in patients with stroke when using the (A) training dataset and (B) testing dataset. The box plots show the median (center line) and interquartile range (box borders). In analyses of accuracy and AUROC, the ANN model had significantly higher values than the other forecasting models (P < 0.001). AUROC, area under the receiver operating characteristic curve; ANN, artificial neural network.
Figure 4. Permutation importance analysis of the artificial neural network model in predicting 30-day readmission in patients with stroke. BI, Barthel Index; IADL, Instrumental Activities of Daily Living; MMSE, Mini-Mental State Examination; BBS, Berg Balance Scale; FOIS, Functional Oral Intake Scale; EQ-5D, EuroQoL Quality of Life Scale.
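Permutation importance, the technique named in the Figure 4 caption, scores a feature by how much model performance drops when that feature's values are shuffled across patients. The sketch below illustrates the general idea in plain Python with accuracy as the metric. It is not the authors' implementation: `toy_predict` is a hypothetical stand-in for a trained model, and the data are invented.

```python
import random
from typing import Callable, Sequence

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(
    predict: Callable[[list[list[float]]], list[int]],
    X: list[list[float]],
    y: list[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> list[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_predict(X: list[list[float]]) -> list[int]:
    # Hypothetical model that looks at feature 0 only
    return [1 if row[0] > 0.5 else 0 for row in X]

X_toy = [[0.1, 5.0], [0.9, 3.0], [0.2, 4.0], [0.8, 1.0], [0.3, 2.0], [0.7, 6.0]]
y_toy = [0, 1, 0, 1, 0, 1]
scores = permutation_importance(toy_predict, X_toy, y_toy, n_repeats=20, seed=1)
```

In this toy case the second feature receives an importance of exactly zero because the stand-in model never reads it, while shuffling the first feature degrades accuracy.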
Comparative performance indices of forecasting models when using 167 new validating datasets to predict 30-day readmission in patients with stroke.
| Model | | | | | | |
|---|---|---|---|---|---|---|
| ANN (95% CI) | 0.74 (0.66, 0.82) | 0.97 (0.95, 0.99) | 0.89 (0.85, 0.94) | 0.87 (0.82, 0.92) | 0.93 (0.90, 0.96) | 0.94 (0.91, 0.97) |
| KNN (95% CI) | 0.50 (0.40, 0.49) | 0.87 (0.83, 0.91) | 0.61 (0.52, 0.70) | 0.70 (0.62, 0.78) | 0.80 (0.74, 0.86) | 0.83 (0.78, 0.88) |
| RF (95% CI) | 0.70 (0.66, 0.74) | 0.95 (0.91, 0.98) | 0.84 (0.80, 0.88) | 0.85 (0.81, 0.89) | 0.90 (0.87, 0.93) | 0.90 (0.86, 0.94) |
| SVM (95% CI) | 0.51 (0.41, 0.61) | 0.96 (0.94, 0.98) | 0.76 (0.69, 0.83) | 0.79 (0.72, 0.87) | 0.88 (0.84, 0.92) | 0.81 (0.76, 0.86) |
| NBC (95% CI) | 0.50 (0.40, 0.60) | 0.93 (0.90, 0.96) | 0.61 (0.52, 0.70) | 0.80 (0.73, 0.87) | 0.84 (0.79, 0.89) | 0.80 (0.75, 0.85) |
| COX (95% CI) | 0.58 (0.49, 0.67) | 0.92 (0.89, 0.95) | 0.84 (0.78, 0.90) | 0.69 (0.61, 0.77) | 0.88 (0.84, 0.92) | 0.88 (0.84, 0.92) |
| P-value | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 | <0.001 |
ANN, artificial neural network; KNN, K nearest neighbor; RF, random forest; SVM, support vector machine; NBC, naive Bayes classifier; COX, Cox regression; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve; CI, confidence interval.
The P-value is the statistical significance of the forecasting models and the performance indices calculated using a Chi-squared test.
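The AUC reported in these comparisons can be read as the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient. The following sketch computes it directly from that pairwise definition; the function name and the six predicted risks are hypothetical.

```python
def auroc(y_true: list[int], scores: list[float]) -> float:
    """AUROC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, t in zip(scores, y_true) if t == 1]
    negatives = [s for s, t in zip(scores, y_true) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted risks for six patients (1 = readmitted within 30 days)
auc = auroc([0, 0, 1, 0, 1, 1], [0.1, 0.4, 0.35, 0.2, 0.8, 0.7])
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.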
Reported associations between post-acute care (PAC) for stroke and 30-day readmission.
| Authors (country) | Sample size | Age (years) | Data source | Findings |
|---|---|---|---|---|
| Present study (Taiwan) | 1,476 | 65.5 | Prospective cohort study from six hospitals | The post-acute care (PAC) program was the best predictor of 30-day readmission |
| Kim et al. (U.S.) ( | 51,863 | 80.4 | Medicare provider analysis and review files | Using Instrumental Variable analysis to control for endogeneity bias, an increase in institutional PAC use was associated with a decrease in 30-day readmission rate by 0.19 percentage points |
| Kosar et al. (U.S.) ( | 2,044,231 | 80.2 | Medicare provider analysis and review database | In most rural counties, 30-day readmission rates were 0.3 (95% CI, −0.6 to −0.1) percentage points lower in a non-PAC group compared to a PAC group |
| Raman et al. (U.S.) ( | 1,613 | 74.4 | State inpatient database, California | Clinical predictors of 30-day readmission included comorbidities (e.g., liver disease, hypertension) and discharge to a PAC facility |
| Li et al. (U.S.) ( | 7,851,430 | 65~100 | Medicare beneficiaries | An increase in quarterly PAC use was significantly ( |
| Ramchand et al. (U.S.) ( | 4,850 | 53.1 | National readmissions database | Discharge to an inpatient post-acute care facility (adjusted odds ratio 1.61, 95% CI 1.07–2.41) was significantly associated with a higher likelihood of 30-day readmission |
| Hsieh et al. (Taiwan) ( | 6,839 | 69.4 | National Health Insurance claims datasets | The 30-day readmission rates were 15.5% for the PAC group vs. 30.4% in the non-PAC group |
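Several rows above report effects in percentage points, which differ from relative changes. As a small worked example using the two rates reported by Hsieh et al. (15.5% with PAC vs. 30.4% without), the sketch below computes the absolute difference and the ratio of the rates. Group sizes are not given in the table, so no confidence interval is attempted, and the variable names are illustrative.

```python
pac_rate = 0.155      # 30-day readmission rate in the PAC group (from the table)
non_pac_rate = 0.304  # 30-day readmission rate in the non-PAC group (from the table)

# Absolute difference between the two rates, expressed in percentage points
difference_pp = (pac_rate - non_pac_rate) * 100

# Ratio of the two rates (PAC relative to non-PAC)
rate_ratio = pac_rate / non_pac_rate
```

Here the difference is about −14.9 percentage points and the ratio about 0.51, so the PAC group's rate was roughly half that of the non-PAC group.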