Wenshuo Liu, Cooper Stansbury, Karandeep Singh, Andrew M Ryan, Devraj Sukul, Elham Mahmoudi, Akbar Waljee, Ji Zhu, Brahmajee K Nallamothu.
Abstract
Reducing unplanned readmissions is a major focus of current hospital quality efforts. To avoid unfair penalization, administrators and policymakers use prediction models to adjust for the performance of hospitals from healthcare claims data. Regression-based models are commonly used for such risk-standardization across hospitals; however, these models often have limited accuracy. In this study, we compare four prediction models for unplanned readmission among patients hospitalized with acute myocardial infarction (AMI), congestive heart failure (HF), and pneumonia (PNA) within the Nationwide Readmissions Database in 2014. We evaluated hierarchical logistic regression and compared its performance with gradient boosting and two models that use artificial neural networks. We show that unsupervised Global Vectors for Word Representation (GloVe) embeddings of administrative claims data, combined with artificial neural network classification models, improve prediction of 30-day readmission. Our best models increased the AUC for prediction of 30-day readmissions from 0.68 to 0.72 for AMI, 0.60 to 0.64 for HF, and 0.63 to 0.68 for PNA compared to hierarchical logistic regression. Furthermore, risk-standardized hospital readmission rates calculated from our artificial neural network model that employed embeddings led to reclassification of approximately 10% of hospitals across categories of hospital performance. This finding suggests that prediction models that incorporate new methods classify hospitals differently than traditional regression-based approaches and that their role in assessing hospital performance warrants further investigation.
Year: 2020 PMID: 32294087 PMCID: PMC7159221 DOI: 10.1371/journal.pone.0221606
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary statistics of the predictors for each cohort assessed in this study population.
| Predictor | Acute Myocardial Infarction, No Readmission | Acute Myocardial Infarction, Readmission | Heart Failure, No Readmission | Heart Failure, Readmission | Pneumonia, No Readmission | Pneumonia, Readmission |
|---|---|---|---|---|---|---|
| N | 177,892 | 24,146 | 249,584 | 53,649 | 257,135 | 46,508 |
| Age, mean (std) | 66.3 (13.7) | 70.5 (13.3) | 72.5 (14.3) | 72.5 (13.9) | 68.6 (17.2) | 70.3 (15.8) |
| Female, % | 36.60% | 45.00% | 48.80% | 49.30% | 52.60% | 50.20% |
| No. of diagnosis codes, mean (std) | 12.4 (6.1) | 15.7 (6.4) | 15.1 (5.5) | 16.2 (5.7) | 12.7 (5.8) | 14.7 (5.8) |
| No. of procedure codes, mean (std) | 5.6 (3.3) | 5.2 (3.9) | 1.1 (1.9) | 1.3 (2.1) | 0.7 (1.5) | 1.0 (1.8) |
Summary statistics of ICD-9-CM diagnosis and procedure codes for each cohort.
| Methods | Acute Myocardial Infarction | Heart Failure | Pneumonia |
|---|---|---|---|
| Hierarchical Logistic Regression | 0.639 (0.635, 0.642) | 0.580 (0.578, 0.583) | 0.605 (0.601, 0.609) |
| XGBoost | 0.666 (0.664, 0.668) | 0.602 (0.599, 0.605) | 0.635 (0.632, 0.638) |
| Feed-Forward Neural Networks | 0.667 (0.664, 0.670) | 0.604 (0.602, 0.606) | 0.639 (0.636, 0.641) |
| Medical Code Embedding Deep Set Architecture | 0.683 (0.680, 0.686) | 0.618 (0.616, 0.621) | 0.656 (0.653, 0.658) |
Prediction accuracy was assessed by the area under the receiver operating characteristic curve (AUC) on the three cohorts. We compared four models: hierarchical logistic regression, XGBoost, feed-forward neural networks, and the medical code embedding Deep Set architecture model.
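As a reminder of what the AUC values in the table measure, the following is a minimal sketch in Python, not the authors' evaluation code. It computes AUC as the fraction of (readmitted, not readmitted) patient pairs in which the readmitted patient receives the higher predicted risk, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = readmitted).
    scores: predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
```

The quadratic pairwise loop is used here for clarity; on cohorts of the size reported above, a rank-based implementation would be the practical choice.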
Fig 1. Distribution of risk-standardized hospital readmission rates.
This figure shows differences in the distribution of risk-standardized hospital readmission rates for acute myocardial infarction (AMI), congestive heart failure (HF), and pneumonia (PNA) generated by the hierarchical logistic regression (HLR) model and the medical code embedding Deep Set architecture ANN (ME-DS) model. Standardized readmission rates are generated by comparing model predictions to expected readmission rates for each hospital. This figure illustrates that, despite having comparable predictive accuracy, HLR and ME-DS lead to differences in hospital risk stratification.
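The caption says only that standardized rates come from comparing model predictions with expected readmission rates for each hospital. One common way to do this, used in CMS-style readmission measures, is to scale an overall readmission rate by the hospital's predicted-to-expected ratio. The Python sketch below assumes that convention; the function name, argument names, and numbers are hypothetical and are not taken from the paper.

```python
def risk_standardized_rate(predicted, expected, overall_rate):
    """Predicted-to-expected ratio times the overall readmission rate.

    predicted: per-patient risks for one hospital, including that
               hospital's own effect (assumption).
    expected:  per-patient risks for the same patients as if treated
               at an average hospital (assumption).
    overall_rate: observed readmission rate across all hospitals.
    """
    total_expected = sum(expected)
    if total_expected <= 0:
        raise ValueError("expected readmissions must be positive")
    return sum(predicted) / total_expected * overall_rate


# Hypothetical hospital with three patients and a 16% overall rate.
toy_predicted = [0.20, 0.30, 0.10]
toy_expected = [0.15, 0.25, 0.10]
toy_rsrr = risk_standardized_rate(toy_predicted, toy_expected, 0.16)
print(round(toy_rsrr, 3))  # 0.192
```

A ratio above 1 (here 0.6 / 0.5 = 1.2) yields a standardized rate above the overall rate, which would indicate more readmissions than expected for the hospital's case mix.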
Cross tabulation of divided groups between the Hierarchical Logistic Regression (HLR) and the medical code embedding deep set architecture (ME-DS) model for each cohort.
| Rank in ME-DS model (rows) vs. rank in HLR model (columns) | AMI: Top 20% | AMI: Middle 60% | AMI: Bottom 20% | AMI: All | HF: Top 20% | HF: Middle 60% | HF: Bottom 20% | HF: All | PNA: Top 20% | PNA: Middle 60% | PNA: Bottom 20% | PNA: All |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Top 20% | 151 | 72 | 0 | 223 | 235 | 106 | 0 | 341 | 261 | 122 | 0 | 383 |
| Middle 60% | 72 | 563 | 37 | 672 | 106 | 854 | 66 | 1026 | 122 | 949 | 82 | 1153 |
| Bottom 20% | 0 | 37 | 186 | 223 | 0 | 66 | 275 | 341 | 0 | 82 | 301 | 383 |
| All | 223 | 672 | 223 | 1118 | 341 | 1026 | 341 | 1708 | 383 | 1153 | 383 | 1919 |
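To read a cross tabulation like this one, it helps to separate hospitals on the diagonal (same category under both models) from those off it. The Python sketch below does this for the acute myocardial infarction counts in the table, with rows as the ME-DS category and columns as the HLR category. The function name `reclassification_summary` is hypothetical, and the sketch is only arithmetic on the published counts, not the authors' analysis code.

```python
# AMI counts from the table: rows = ME-DS rank, columns = HLR rank,
# ordered Top 20%, Middle 60%, Bottom 20%.
AMI_COUNTS = [
    [151, 72, 0],
    [72, 563, 37],
    [0, 37, 186],
]


def reclassification_summary(table):
    """Count hospitals on, above, and below the diagonal of a square table."""
    n = len(table)
    total = sum(sum(row) for row in table)
    unchanged = sum(table[i][i] for i in range(n))
    above = sum(table[i][j] for i in range(n) for j in range(n) if j > i)
    below = sum(table[i][j] for i in range(n) for j in range(n) if j < i)
    return {
        "total": total,
        "unchanged": unchanged,
        "above_diagonal": above,
        "below_diagonal": below,
        "share_changed": (above + below) / total,
    }


ami_summary = reclassification_summary(AMI_COUNTS)
print(ami_summary)
```

For the AMI counts, 900 of 1118 hospitals stay in the same category and 218 fall off the diagonal, 109 on each side. No hospital moves between the Top 20% and Bottom 20% groups.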