C Beau Hilton, Alex Milinovich, Christina Felix, Nirav Vakharia, Timothy Crone, Chris Donovan, Andrew Proctor, Aziz Nazha.
Abstract
Hospital systems, payers, and regulators have focused on reducing length of stay (LOS) and early readmission, with uncertain benefit. Interpretable machine learning (ML) may assist in transparently identifying the risk of important outcomes. We conducted a retrospective cohort study of hospitalizations at a tertiary academic medical center and its branches from January 2011 to May 2018. A consecutive sample of all hospitalizations in the study period was included. Algorithms were trained on medical, sociodemographic, and institutional variables to predict readmission, LOS, and death within 48–72 h. Prediction performance was measured by area under the receiver operating characteristic curve (AUC), Brier score loss (BSL), which measures how well predicted probability matches observed probability, and other metrics. Interpretations were generated using multiple feature extraction algorithms. The study cohort included 1,485,880 hospitalizations for 708,089 unique patients (median age of 59 years, first and third quartiles (QI) [39, 73]; 55.6% female; 71% white). There were 211,022 30-day readmissions for an overall readmission rate of 14% (for patients ≥65 years: 16%). Median LOS, including observation and labor and delivery patients, was 2.94 days (QI [1.67, 5.34]), or, if these patients are excluded, 3.71 days (QI [2.15, 6.51]). Predictive performance was as follows: 30-day readmission (AUC 0.76/BSL 0.11); LOS > 5 days (AUC 0.84/BSL 0.15); death within 48–72 h (AUC 0.91/BSL 0.001). Explanatory diagrams showed factors that impacted each prediction.
Keywords: Health care economics; Outcomes research; Risk factors
Year: 2020 PMID: 32285012 PMCID: PMC7125114 DOI: 10.1038/s41746-020-0249-z
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
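The two headline metrics in the abstract can be made concrete with a short sketch. The code below is illustrative only and is not the authors' evaluation code: it defines the Brier score loss and a pairwise-comparison AUC in plain Python, applies them to a made-up toy sample (`toy_outcomes` and `toy_probs` are hypothetical), and computes the Brier score that a constant prediction at the observed 30-day readmission prevalence would earn. That reference value is simply p(1 − p).

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Chance that a random positive case is scored above a random negative case.

    Ties count as half a win. This is the O(n_pos * n_neg) definition,
    fine for a toy sample but too slow for a large cohort.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Reference point: always predict the observed readmission prevalence.
# Counts are taken from the abstract (211,022 readmissions / 1,485,880 stays).
prevalence = 211_022 / 1_485_880
baseline_brier = prevalence * (1 - prevalence)

# Hypothetical toy data, NOT from the study.
toy_outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
toy_probs = [0.1, 0.2, 0.7, 0.3, 0.4, 0.1, 0.5, 0.8]
toy_brier = brier_score(toy_outcomes, toy_probs)
toy_auc = roc_auc(toy_outcomes, toy_probs)

print(f"constant-prevalence Brier baseline: {baseline_brier:.3f}")
print(f"toy Brier: {toy_brier:.3f}, toy AUC: {toy_auc:.3f}")
```

Under these assumptions the constant-prevalence baseline is about 0.122, so the reported BSL of 0.11 for 30-day readmission is a modest improvement over an uninformative predictor. Because the Brier score depends strongly on prevalence, a very small BSL for a rare outcome is expected even from a weak model.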
Characteristics of hospital encounters in the study sample, overall and according to readmission and extended length of stay.
| Characteristic | Overall | Not readmitted within 30 days | Readmitted within 30 days | Hospital stay less than 5 days | Hospital stay over 5 days |
|---|---|---|---|---|---|
| Number of hospitalizations | 1,485,880 | 1,274,858 | 211,022 | 1,234,148 | 251,732 |
| Age, median [Q1, Q3] | 59.0 [39.0, 73.0] | 59.0 [38.0, 73.0] | 62.0 [48.0, 76.0] | 58.0 [36.0, 72.0] | 66.0 [54.0, 78.0] |
| Female, n (%) | 826,025 (55.6) | 713,391 (56.0) | 112,634 (53.4) | 698,382 (56.6) | 127,643 (50.7) |
| Race/ethnicity, n (%) | | | | | |
| African American | 333,212 (22.4) | 276,208 (21.7) | 57,004 (27.0) | 276,476 (22.4) | 56,736 (22.5) |
| White | 1,055,180 (71.1) | 913,085 (71.7) | 142,095 (67.4) | 873,453 (70.8) | 181,727 (72.2) |
| Other | 96,592 (6.5) | 84,755 (6.7) | 11,837 (5.6) | 83,453 (6.8) | 13,139 (5.2) |
| Marital status, n (%) | | | | | |
| Divorced or separated | 134,841 (9.1) | 111,680 (8.8) | 23,161 (11.0) | 108,779 (8.8) | 26,062 (10.4) |
| Married or partnered | 594,375 (40.0) | 515,620 (40.5) | 78,755 (37.3) | 494,338 (40.1) | 100,037 (39.7) |
| Single | 554,116 (37.3) | 477,592 (37.5) | 76,524 (36.3) | 472,301 (38.3) | 81,815 (32.5) |
| Widowed | 175,822 (11.8) | 146,611 (11.5) | 29,211 (13.8) | 136,888 (11.1) | 38,934 (15.5) |
| Other | 26,200 (1.8) | 22,847 (1.8) | 3353 (1.6) | 21,347 (1.7) | 4853 (1.9) |
| Payer class, n (%) | | | | | |
| Medicaid | 221,969 (16.4) | 188,630 (16.3) | 33,339 (17.0) | 193,978 (17.2) | 27,991 (12.1) |
| Medicare | 725,125 (53.5) | 601,752 (51.9) | 123,373 (63.0) | 567,435 (50.5) | 157,690 (68.4) |
| Private health insurance | 329,842 (24.3) | 298,444 (25.7) | 31,398 (16.0) | 293,292 (26.1) | 36,550 (15.9) |
| Other | 78,269 (5.8) | 70,553 (6.1) | 7716 (3.9) | 69,940 (6.2) | 8329 (3.6) |
| Comorbidities, n (%) | | | | | |
| Cancer | 183,367 (12.3) | 142,205 (11.2) | 41,162 (19.5) | 140,188 (11.4) | 43,179 (17.2) |
| Metastatic solid tumor | 55,906 (3.8) | 41,867 (3.3) | 14,039 (6.7) | 42,339 (3.4) | 13,567 (5.4) |
| Solid organ transplant | 33,780 (2.3) | 24,928 (2.0) | 8852 (4.2) | 22,837 (1.9) | 10,943 (4.3) |
| AIDS/HIV | 4552 (0.3) | 3310 (0.3) | 1242 (0.6) | 3703 (0.3) | 849 (0.3) |
| Renal disease | 177,544 (11.9) | 133,099 (10.4) | 44,445 (21.1) | 129,114 (10.5) | 48,430 (19.2) |
| Mild liver disease | 93,947 (6.3) | 71,396 (5.6) | 22,551 (10.7) | 73,362 (5.9) | 20,585 (8.2) |
| Moderate or severe liver disease | 22,816 (1.5) | 15,542 (1.2) | 7274 (3.4) | 15,971 (1.3) | 6845 (2.7) |
| Diabetes with chronic complication | 125,118 (8.4) | 95,619 (7.5) | 29,499 (14.0) | 95,561 (7.7) | 29,557 (11.7) |
| Diabetes without chronic complication | 293,379 (19.7) | 232,187 (18.2) | 61,192 (29.0) | 226,901 (18.4) | 66,478 (26.4) |
| Hypertension | 939,048 (63.2) | 779,460 (61.1) | 159,588 (75.6) | 744,603 (60.3) | 194,445 (77.2) |
| Myocardial infarction | 69,914 (4.7) | 53,267 (4.2) | 16,647 (7.9) | 52,835 (4.3) | 17,079 (6.8) |
| Congestive heart failure | 215,510 (14.5) | 164,879 (12.9) | 50,631 (24.0) | 155,898 (12.6) | 59,612 (23.7) |
| Cerebrovascular disease | 193,243 (13.0) | 154,368 (12.1) | 38,875 (18.4) | 148,158 (12.0) | 45,085 (17.9) |
| Chronic obstructive pulmonary disease | 302,548 (20.4) | 240,195 (18.8) | 62,353 (29.5) | 238,907 (19.4) | 63,641 (25.3) |
| Pneumonia | 188,684 (12.7) | 142,066 (11.1) | 46,618 (22.1) | 142,437 (11.5) | 46,247 (18.4) |
| Dementia | 56,876 (3.8) | 45,461 (3.6) | 11,415 (5.4) | 41,554 (3.4) | 15,322 (6.1) |
| Anxiety | 181,440 (12.2) | 146,263 (11.5) | 35,177 (16.7) | 150,668 (12.2) | 30,772 (12.2) |
| Depression | 259,323 (17.5) | 207,914 (16.3) | 51,409 (24.4) | 212,806 (17.2) | 46,517 (18.5) |
| Psychosis | 52,085 (3.5) | 39,086 (3.1) | 12,999 (6.2) | 38,544 (3.1) | 13,541 (5.4) |
| Receiving dialysis | 17,791 (1.2) | 12,604 (1.0) | 5187 (2.5) | 10,658 (0.9) | 7133 (2.8) |
| Selected discharge laboratory results, n (%) | | | | | |
| Low hemoglobin level (<12 g/dL) | 248,387 (16.7) | 204,139 (16.0) | 44,248 (21.0) | 200,374 (16.2) | 48,013 (19.1) |
| Low sodium level (<135 mEq/L) | 38,847 (2.6) | 31,439 (2.5) | 7408 (3.5) | 29,467 (2.4) | 9380 (3.7) |
| Hospital encounter information, median [Q1, Q3] or n (%) | | | | | |
| Previous hospitalizations | 1.0 [0.0, 2.0] | 0.0 [0.0, 2.0] | 2.0 [0.0, 6.0] | 1.0 [0.0, 2.0] | 1.0 [0.0, 3.0] |
| Emergency department (ED) admission | 725,843 (48.8) | 603,317 (47.3) | 122,526 (58.1) | 618,055 (50.1) | 107,788 (42.8) |
| Any ED visits in the past 6 months | 644,102 (43.3) | 511,323 (40.1) | 132,779 (62.9) | 521,248 (42.2) | 122,854 (48.8) |
| Total ED visits in the past 6 months | 0.0 [0.0, 1.0] | 0.0 [0.0, 1.0] | 1.0 [0.0, 3.0] | 0.0 [0.0, 1.0] | 0.0 [0.0, 2.0] |
| Admission class, n (%) | | | | | |
| Ambulatory surgical procedures | 8081 (0.5) | 7464 (0.6) | 617 (0.3) | 8060 (0.7) | 21 (0.0) |
| Emergency | 7058 (0.5) | 6417 (0.5) | 641 (0.3) | 7055 (0.6) | 3 (0.0) |
| Hospice | 1486 (0.1) | 1463 (0.1) | 23 (0.0) | 1357 (0.1) | 129 (0.1) |
| Inpatient | 1,185,985 (80.0) | 1,011,772 (79.6) | 174,213 (82.7) | 937,614 (76.2) | 248,371 (98.7) |
| Observation | 261,942 (17.7) | 228,559 (18.0) | 33,383 (15.8) | 260,955 (21.2) | 987 (0.4) |
| Outpatient | 10,559 (0.7) | 9415 (0.7) | 1144 (0.5) | 10,513 (0.9) | 46 (0.0) |
| Psychiatric inpatient | 3381 (0.2) | 2936 (0.2) | 445 (0.2) | 2198 (0.2) | 1183 (0.5) |
| Other | 4074 (0.3) | 3799 (0.3) | 275 (0.1) | 3082 (0.3) | 992 (0.4) |
| Discharge location, n (%) | | | | | |
| Expired | 18,615 (1.4) | 18,615 (1.6) | 0 (0.0) | 10,907 (1.0) | 7708 (3.3) |
| General acute care hospital | 19,855 (1.5) | 17,490 (1.5) | 2365 (1.2) | 16,105 (1.4) | 3750 (1.6) |
| Home | 959,559 (71.1) | 833,797 (72.2) | 125,762 (64.8) | 862,810 (77.1) | 96,749 (42.0) |
| Home care services | 134,970 (10.0) | 109,327 (9.5) | 25,643 (13.2) | 93,833 (8.4) | 41,137 (17.9) |
| Hospice | 14,318 (1.1) | 13,765 (1.2) | 553 (0.3) | 8879 (0.8) | 5439 (2.4) |
| Intermediate care facility | 9046 (0.7) | 7451 (0.6) | 1595 (0.8) | 5215 (0.5) | 3831 (1.7) |
| Left against medical advice | 13,864 (1.0) | 10,599 (0.9) | 3265 (1.7) | 13,374 (1.2) | 490 (0.2) |
| Long-term care facility | 14,592 (1.1) | 12,210 (1.1) | 2382 (1.2) | 5403 (0.5) | 9189 (4.0) |
| Skilled nursing facility | 145,882 (10.8) | 115,106 (10.0) | 30,776 (15.8) | 87,530 (7.8) | 58,352 (25.3) |
| Transfer to a psychiatric hospital | 6828 (0.5) | 6276 (0.5) | 552 (0.3) | 6197 (0.6) | 631 (0.3) |
| Transfer to another hospital | 4482 (0.3) | 4032 (0.3) | 450 (0.2) | 3797 (0.3) | 685 (0.3) |
| Other | 7109 (0.5) | 6240 (0.5) | 869 (0.4) | 4740 (0.4) | 2369 (1.0) |
| Outcomes of interest | | | | | |
| 30-day readmissions, n (%) | 211,022 (14.2) | 0 (0.0) | 211,022 (100.0) | 158,577 (12.8) | 52,445 (20.8) |
| Length of stay in days, median [Q1, Q3] | 2.9 [1.7, 5.3] | 2.8 [1.6, 5.1] | 3.9 [2.0, 7.0] | 2.4 [1.4, 3.9] | 10.6 [8.3, 15.0] |
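The outcome rows of the table can be cross-checked with simple arithmetic. The sketch below uses only counts printed in the table; the variable names are illustrative. It recomputes the overall readmission rate, the readmission rate in each length-of-stay column, and their unadjusted ratio. The ratio is a crude descriptive comparison, not an adjusted effect estimate and not a figure reported by the authors.

```python
# Counts copied from the table above.
total_stays = 1_485_880
total_readmitted = 211_022
shorter_stay = {"n": 1_234_148, "readmitted": 158_577}
longer_stay = {"n": 251_732, "readmitted": 52_445}

# The two length-of-stay columns should partition the cohort.
assert shorter_stay["n"] + longer_stay["n"] == total_stays
assert shorter_stay["readmitted"] + longer_stay["readmitted"] == total_readmitted

overall_rate = total_readmitted / total_stays
shorter_rate = shorter_stay["readmitted"] / shorter_stay["n"]
longer_rate = longer_stay["readmitted"] / longer_stay["n"]

# Unadjusted ratio of readmission rates (longer vs. shorter stays).
crude_rate_ratio = longer_rate / shorter_rate

print(f"overall 30-day readmission rate: {overall_rate:.1%}")
print(f"shorter stays: {shorter_rate:.1%}, longer stays: {longer_rate:.1%}")
print(f"crude rate ratio: {crude_rate_ratio:.2f}")
```

The recomputed percentages agree with the printed 14.2%, 12.8%, and 20.8%, and the crude ratio comes out at roughly 1.6.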
Fig. 1 30-Day readmission.
a Shows the most impactful features on prediction (ranked from most to least important). b Shows the distribution of the impacts of each feature on the model output. The colors represent the feature values for numeric features: red for larger values and blue for smaller. The line is made of individual dots representing each admission, and the thickness of the line is determined by the number of examples at a given value (for example, most patients have a low number of past admissions). A negative SHAP value (extending to the left) indicates a reduced probability, while a positive one (extending to the right) indicates an increased probability. For non-numeric features, such as primary diagnosis, the gray points represent specific possible values, with certain diagnoses greatly increasing or reducing the model’s output, while the majority of diagnoses have relatively mild impact on prediction. c, d Show the composition of individualized predictions for two patients. The patient in c was admitted from the emergency outpatient unit with a headache and stayed for >7 days. In addition, this patient had been hospitalized 3 times prior to this admission and had been discharged from the last admission only 8 days prior. The predicted probability of 30-day readmission (~0.30) was three times the baseline value predicted by the model (~0.1). All of the listed features increased the model’s prediction of risk by the relative amounts shown by the size of the red bars. Conversely, the patient in d was admitted for a complete uterovaginal prolapse, stayed less than a full day, and had no reported comorbidities, such as hypertension, depression, or a history of cancer. The model predicted their probability of 30-day readmission at 0.03 or roughly one-third of the baseline prediction. The top variables that contribute and will fit on the chart are shown, but the others can be queried in the live system. 
The model considers all variables, and SHAP reports on all variables internally, but the images are understandably truncated for visibility.
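The individualized panels (c, d) rest on the additive property of SHAP values: a prediction equals a baseline plus the sum of per-feature contributions. The sketch below illustrates that bookkeeping for the patient in panel c. It assumes that contributions add on the log-odds scale, which is common when SHAP is applied to classifiers but is not stated in the caption, and the three contribution values are hypothetical numbers chosen only so that the total moves a baseline of about 0.10 to about 0.30.

```python
import math


def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1 - p))


def sigmoid(z):
    """Log-odds -> probability."""
    return 1 / (1 + math.exp(-z))


baseline_prob = 0.10  # approximate baseline quoted in the caption

# HYPOTHETICAL contributions in log-odds units; not values from the paper.
contributions = {
    "stay longer than 7 days": 0.60,
    "3 prior hospitalizations": 0.40,
    "last discharge 8 days earlier": 0.35,
}

# Additivity: model output = baseline + sum of per-feature contributions.
total_log_odds = logit(baseline_prob) + sum(contributions.values())
predicted_prob = sigmoid(total_log_odds)

print(f"baseline log-odds: {logit(baseline_prob):.3f}")
print(f"predicted probability: {predicted_prob:.2f}")
```

On this scale, tripling the probability from 0.10 to 0.30 corresponds to a total contribution of about 1.35 log-odds units, which is why bar lengths in such plots do not translate linearly into probability.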
Performance of predictive models.
| Target | ROC AUC | Average precision | Precision | Recall | Accuracy | F1 Score | Matthews correlation coefficient | Brier score loss | RMSE |
|---|---|---|---|---|---|---|---|---|---|
| Readmitted within 30 days | 0.758 [0.755 to 0.762] | 0.383 [0.377 to 0.388] | 0.632 [0.620 to 0.647] | 0.102 [0.098 to 0.106] | 0.861 [0.860 to 0.861] | 0.176 [0.169 to 0.182] | 0.214 [0.208 to 0.220] | 0.108 [0.108 to 0.109] | — |
| Readmitted within 7 days | 0.701 [0.696 to 0.707] | 0.127 [0.122 to 0.133] | 0.586 [0.455 to 0.722] | 0.003 [0.002 to 0.004] | 0.949 [0.949 to 0.949] | 0.006 [0.004 to 0.008] | 0.040 [0.030 to 0.051] | 0.047 [0.047 to 0.047] | — |
| Readmitted within 5 days | 0.691 [0.684 to 0.698] | 0.091 [0.086 to 0.095] | 0.456 [0.000 to 1.000] | 0.000 [0.000 to 0.001] | 0.963 [0.963 to 0.963] | 0.001 [0.000 to 0.002] | 0.013 [−0.001 to 0.029] | 0.035 [0.035 to 0.035] | — |
| Readmitted within 3 days | 0.681 [0.674 to 0.689] | 0.057 [0.053 to 0.062] | 0.000 [0.000 to 0.000] | 0.000 [0.000 to 0.000] | 0.978 [0.978 to 0.978] | 0.000 [0.000 to 0.000] | 0.000 [0.000 to 0.000] | 0.021 [0.021 to 0.021] | — |
| Days to readmissionᵃ | — | — | — | — | — | — | — | — | 8.98 |
| Death within 48–72 hᵃ | 0.91 | — | — | — | — | — | — | 0.001 | — |
| Hospital stay >7 days | 0.830 [0.827 to 0.833] | 0.567 [0.561 to 0.572] | 0.653 [0.646 to 0.659] | 0.331 [0.325 to 0.337] | 0.827 [0.825 to 0.828] | 0.439 [0.434 to 0.445] | 0.378 [0.371 to 0.384] | 0.122 [0.121 to 0.123] | — |
| Hospital stay >5 days | 0.829 [0.827 to 0.832] | 0.705 [0.701 to 0.710] | 0.690 [0.685 to 0.695] | 0.546 [0.541 to 0.552] | 0.767 [0.765 to 0.770] | 0.609 [0.605 to 0.614] | 0.453 [0.447 to 0.459] | 0.155 [0.154 to 0.157] | — |
| Hospital stay >3 days | 0.824 [0.822 to 0.827] | 0.861 [0.859 to 0.864] | 0.760 [0.758 to 0.762] | 0.842 [0.839 to 0.845] | 0.752 [0.749 to 0.754] | 0.799 [0.797 to 0.801] | 0.480 [0.475 to 0.485] | 0.166 [0.165 to 0.167] | — |
| Length of stay (days)ᵃ | — | — | — | — | — | — | — | — | 3.94 |
ᵃ Performance on these predictive tasks was poor to the extent that rigorous cross-validation was not performed.
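The F1 column can be sanity-checked against the precision and recall columns, since F1 is their harmonic mean. The sketch below recomputes it from the point estimates printed in the table; the dictionary layout and names are illustrative. Small gaps are expected because the table values are rounded and, presumably, averaged across evaluation runs, so this is a consistency check and not a reproduction of the authors' computation.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) point estimates copied from the table.
reported = {
    "Readmitted within 30 days": (0.632, 0.102, 0.176),
    "Readmitted within 3 days": (0.000, 0.000, 0.000),
    "Hospital stay >7 days": (0.653, 0.331, 0.439),
    "Hospital stay >5 days": (0.690, 0.546, 0.609),
    "Hospital stay >3 days": (0.760, 0.842, 0.799),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
max_gap = max(abs(recomputed[name] - f1) for name, (_, _, f1) in reported.items())

for name, value in recomputed.items():
    print(f"{name}: recomputed F1 = {value:.3f}, reported = {reported[name][2]:.3f}")
print(f"largest gap: {max_gap:.4f}")
```

The readmission rows show why accuracy alone is uninformative here: at the default decision threshold the 30-day model flags few patients (recall about 0.10), yet accuracy stays near 0.86 because most patients are not readmitted.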
Fig. 2 Length of stay >5 days.
a shows the most impactful features on prediction (ranked from most to least important). b shows the distribution of the impacts of each feature on the model output. The colors represent the feature values for numeric features: red for larger values and blue for smaller. The line is made of individual dots representing each admission, and the thickness of the line is determined by the number of examples at a given value (for example, many of our patients are elderly). A negative SHAP value (extending to the left) indicates a reduced probability, while a positive one (extending to the right) indicates an increased probability. For example, advanced age increases the probability of extended length of stay (SHAP value between zero and one), while young age tends toward a SHAP value between roughly −1 and zero, corresponding to reduced probability. For non-numeric features, such as primary diagnosis, the gray points represent specific possible values, with certain diagnoses greatly increasing or reducing the model’s output, while the majority of diagnoses have relatively mild impact on prediction. c, d show the composition of individualized predictions for two patients. The 75-year-old patient in c was admitted to the inpatient service directly from a physician’s office with leakage of a heart valve graft. The patient received 32 medications in the first 24 h and has Medicare Part A insurance coverage. The model predicted that the patient’s probability of staying >5 days was 0.80, nearly four times the baseline prediction of ~0.2. The majority of the model’s prediction was based on the diagnosis, followed by the number of initial medications, and then the other variables as shown. The patient in d, on the other hand, had a predicted probability of length of stay of 0.06 or roughly one-fourth of the baseline, despite being admitted to the ICU within 24 h of admission. 
The major contributor to this low probability was the diagnosis of antidepressant poisoning, followed by a private insurance provider, and finally by a lack of BMI recorded in the chart for this encounter. The reasoning behind the importance of a missing value for BMI is unclear but is repeatedly apparent in several analyses and may have to do with systematic recording practices within the hospital system (see Agniel et al.[19] for an exploration of this phenomenon).