David Rolls¹, Justin Boyle², Aida Brankovic³, Philippa Niven¹, Sankalp Khanna².
Abstract
Preventing unplanned hospitalisations, including readmissions and re-presentations to the emergency department, is an important strategy for addressing the growing demand for hospital care. Significant successes have been reported from interventions put in place by hospitals to reduce their incidence. However, there is limited use of data-driven algorithms in hospital services to identify patients for enrolment into these intervention programs. Here we present the results of a study aiming to develop algorithms deployable at scale as part of a state government's initiative to address rehospitalisations, and which fills several gaps identified in the state-of-the-art literature. To the best of our knowledge, our study involves the largest-ever sample size for developing risk models. Logistic regression, random forests and gradient boosted techniques were explored as model candidates and validated retrospectively on five years of data from 27 hospitals in Queensland, Australia. The models used a range of predictor variables sourced from state-wide Emergency Department (ED), inpatient, hospital-dispensed medication and hospital-requested pathology databases. The investigation leads to several findings: (i) there is an advantage in looking at a longer patient data history; (ii) ED and inpatient datasets alone provide useful information for predicting hospitalisation risk, and adding medications and pathology test results yields only trivial performance improvements; (iii) predicting readmissions to hospital was slightly easier than predicting re-presentations to ED after an inpatient stay, which in turn was slightly easier than predicting re-presentations to ED after an ED stay; (iv) a gradient boosted approach (XGBoost) was consistently the most powerful modelling approach across the various tests.
Year: 2022 PMID: 36198757 PMCID: PMC9534931 DOI: 10.1038/s41598-022-20907-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
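The three model families the study compares can be sketched on synthetic data. This is an illustrative reconstruction, not the paper's pipeline: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example needs only scikit-learn, and the data, class balance and hyperparameters are assumptions.

```python
# Hedged sketch of the abstract's model comparison (L1-regularised logistic
# regression, random forest, gradient boosting) evaluated by AUC on an
# imbalanced synthetic classification task.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced data, loosely mimicking a 30-day readmission outcome.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.85], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

models = {
    "L1":  LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "RF":  RandomForestClassifier(n_estimators=200, random_state=0),
    "XGB": GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(aucs)
```

On real administrative data the study additionally validated models per cohort and outcome; this sketch shows only the shared fit/score loop.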
Figure 1: Distribution per outcome variable on training and test data. Binary outcome 1 denotes that the identified admission, presentation or ED presentation occurred within 30 days, for RA30, RP30 and RP30E respectively.
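The 30-day binary outcome described in Figure 1 can be illustrated for the readmission case (RA30). This is a minimal sketch on made-up episodes; the field names, cohort rules and exclusions used in the study are not reproduced here.

```python
# Illustrative construction of a 30-day readmission label: for each index
# admission, outcome 1 if the same patient's next admission starts within
# 30 days of discharge. Episode tuples are hypothetical.
from datetime import date, timedelta

def ra30_labels(episodes):
    """episodes: iterable of (patient_id, admit_date, discharge_date)."""
    by_patient = {}
    for pid, admit, disch in sorted(episodes, key=lambda e: (e[0], e[1])):
        by_patient.setdefault(pid, []).append((admit, disch))
    labels = []
    for pid, stays in by_patient.items():
        for i, (admit, disch) in enumerate(stays):
            readmitted = (i + 1 < len(stays)
                          and stays[i + 1][0] - disch <= timedelta(days=30))
            labels.append((pid, admit, int(readmitted)))
    return labels

episodes = [
    ("A", date(2020, 1, 1), date(2020, 1, 5)),
    ("A", date(2020, 1, 20), date(2020, 1, 22)),  # 15 days after discharge
    ("B", date(2020, 3, 1), date(2020, 3, 2)),    # no further admission
]
labels = ra30_labels(episodes)
print(labels)
```

The RP30 and RP30E outcomes follow the same pattern but count ED re-presentations after an inpatient stay or an ED stay, respectively.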
Best AUC values across the cohorts and outcome metrics, obtained with different data groups using the L1 (L1-regularised logistic regression), RF (random forest) and XGB (XGBoost) models, for a historical window length of 180 days.
| Model | Outcome | Cohort | All+DRG | All | Meds | Patho | Basic | Expert | Pruned final model AUC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| L1 | RA30 | Children’s hospital | 0.842 | 0.836 | 0.842 | 0.835 | 0.811 | – | |
| L1 | RA30 | Principal referral | 0.746 | 0.742 | 0.741 | 0.726 | – | | |
| L1 | RA30 | Public acute | 0.719 | 0.718 | 0.706 | – | | | |
| L1 | RP30 | Children’s hospital | 0.711 | 0.715 | 0.703 | 0.693 | – | | |
| L1 | RP30 | Principal referral | 0.709 | 0.707 | 0.709 | 0.705 | 0.685 | – | |
| L1 | RP30 | Public acute | 0.706 | 0.704 | 0.706 | 0.703 | 0.684 | – | |
| L1 | RP30E | Children’s hospital | 0.640 | 0.626 | – | | | | |
| L1 | RP30E | Principal referral | 0.691 | 0.691 | 0.687 | – | | | |
| L1 | RP30E | Public acute | 0.672 | 0.671 | – | | | | |
| RF | RA30 | Children’s hospital | 0.850 | 0.845 | 0.843 | 0.823 | | | |
| RF | RA30 | Principal referral | 0.766 | 0.761 | 0.759 | 0.733 | | | |
| RF | RA30 | Public acute | 0.739 | 0.731 | 0.729 | 0.710 | | | |
| RF | RP30 | Children’s hospital | 0.718 | 0.712 | 0.705 | 0.700 | | | |
| RF | RP30 | Principal referral | 0.714 | 0.711 | 0.709 | 0.692 | | | |
| RF | RP30 | Public acute | 0.710 | 0.709 | 0.707 | 0.691 | | | |
| RF | RP30E | Children’s hospital | 0.643 | 0.642 | 0.644 | 0.638 | | | |
| RF | RP30E | Principal referral | 0.698 | 0.697 | 0.697 | 0.694 | | | |
| RF | RP30E | Public acute | 0.678 | 0.677 | 0.677 | 0.675 | | | |
| XGB | RA30 | Children’s hospital | 0.843 | 0.849 | 0.834 | 0.805 | – | | |
| XGB | RA30 | Principal referral | 0.754 | 0.745 | 0.753 | 0.745 | 0.702 | – | |
| XGB | RA30 | Public acute | 0.726 | 0.720 | 0.723 | 0.720 | 0.672 | – | |
| XGB | RP30 | Children’s hospital | 0.720 | 0.713 | 0.720 | 0.709 | 0.661 | – | |
| XGB | RP30 | Principal referral | 0.683 | 0.706 | 0.643 | – | | | |
| XGB | RP30 | Public acute | 0.707 | 0.707 | 0.707 | 0.706 | 0.640 | – | |
| XGB | RP30E | Children’s hospital | 0.642 | 0.643 | 0.641 | 0.643 | 0.597 | – | |
| XGB | RP30E | Principal referral | 0.692 | 0.673 | 0.693 | 0.657 | – | | |
| XGB | RP30E | Public acute | 0.672 | 0.693 | 0.674 | 0.674 | 0.635 | – | |

All tabulated values are AUCs; some cells were lost in extraction and are left blank.
The last column reports the average AUCs of the pruned final models, the corresponding 95% confidence intervals and the number of features used in each model (). The best results are shown in italics. Results obtained with the models using the selected data group, and with the final models, are shown in bold.
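The paper does not state how the 95% confidence intervals for the final-model AUCs were computed; one common approach is a percentile bootstrap over the test set, sketched here on synthetic scores as an assumption for illustration.

```python
# Percentile-bootstrap 95% CI for AUC (a hedged sketch, not the paper's
# stated method): resample the test set with replacement, recompute AUC,
# and take the 2.5th and 97.5th percentiles.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:  # AUC needs both classes
            continue
        stats.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y_true, y_score), (lo, hi)

# Synthetic labels and mildly informative scores for demonstration.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 400)
scores = 0.6 * y + rng.normal(0.0, 0.5, 400)
point, (lo, hi) = bootstrap_auc_ci(y, scores)
print(round(point, 3), round(lo, 3), round(hi, 3))
```

Stratified or patient-level (cluster) bootstrapping would be more appropriate when patients contribute multiple episodes, as they do in this study.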
Figure 2: Calibration (top), precision-recall (PRC, middle) and ROC/AUC (bottom) plots obtained for the final models across the cohorts for the three outcome metrics RA30, RP30 and RP30E.
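The quantities behind Figure 2's three panel types can be computed with scikit-learn. The predictions here are synthetic stand-ins, not the study's model outputs.

```python
# Sketch of the Figure 2 evaluation quantities: reliability (calibration)
# points, the precision-recall curve and its area, and ROC AUC.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import auc, precision_recall_curve, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
# Noisy probabilities correlated with the label (overlapping classes).
y_prob = np.clip(0.3 * y_true + 0.7 * rng.random(1000), 0.0, 1.0)

frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=10)
precision, recall, _ = precision_recall_curve(y_true, y_prob)
pr_auc = auc(recall, precision)
roc = roc_auc_score(y_true, y_prob)
print(round(pr_auc, 3), round(roc, 3))
```

Plotting `mean_pred` against `frac_pos` gives the calibration panel; a well-calibrated model tracks the diagonal.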
Figure 3: Children’s hospital: summary plot of Shapley values computed for each patient individually in the test partition. Features are sorted top-down by their global contribution. The distance of a dot from the vertical line indicates the magnitude of that sample's contribution, and the colour of a dot indicates the feature value for that sample, with blue and pink representing the extreme values of the feature. Shapley values to the right of the vertical axis ‘push’ predictions towards class 1 and those to the left towards class 0.
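The per-patient Shapley values summarised in Figure 3 distribute the gap between a prediction and a baseline across features. A minimal exact computation on a hypothetical three-feature toy model (the study itself used a SHAP-style explainer over its fitted model):

```python
# Exact Shapley values for a toy model: average each feature's marginal
# contribution over all feature orderings, toggling features from a
# baseline to the instance's values. Model and values are hypothetical.
from itertools import permutations
from math import factorial

def model(x):
    # Toy risk score with an interaction between features 1 and 2.
    return 0.3 * x[0] + 0.5 * x[1] * x[2] + 0.1

def shapley_values(x, baseline):
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        for i in order:
            before = model(current)
            current[i] = x[i]          # switch feature i on
            phi[i] += model(current) - before
    return [p / factorial(n) for p in phi]

x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
phi = shapley_values(x, baseline)
# Efficiency: the values sum to model(x) - model(baseline).
print(phi, model(x) - model(baseline))
```

The interaction term is split evenly between features 1 and 2, which is exactly the symmetric attribution the beeswarm dots in a summary plot aggregate over patients.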
Figure 4: Two examples of visualisations explaining the relative importance of predictors. Each visualisation lists features in descending order of how much they contributed to a prediction.
Figure 5: RA30 cohort selection procedure.
Figure 6: RP30 cohort selection procedure.
Figure 7: RP30E cohort selection procedure.