Ying P Tabak, Xiaowu Sun, Carlos M Nunez, Vikas Gupta, Richard S Johannes
*Medical Informatics, Becton, Dickinson and Company; †The Biomedical Informatics Research Center at San Diego State University, San Diego, CA; ‡Harvard Medical School and Brigham and Women's Hospital, Boston, MA.
Abstract
BACKGROUND: Identifying patients at high risk for readmission early during hospitalization may aid efforts in reducing readmissions. We sought to develop an early readmission risk predictive model using automated clinical data available at hospital admission. METHODS: We developed an early readmission risk model using a derivation cohort and validated the model with a validation cohort. We used a published Acute Laboratory Risk of Mortality Score as an aggregated measure of clinical severity at admission and the number of hospital discharges in the previous 90 days as a measure of disease progression. We then evaluated the administrative data-enhanced model by adding principal and secondary diagnoses and other variables. We examined the c-statistic change when additional variables were added to the model. RESULTS: There were 1,195,640 adult discharges from 70 hospitals with 39.8% male and the median age of 63 years (first and third quartile: 43, 78). The 30-day readmission rate was 11.9% (n=142,211). The early readmission model yielded a graded relationship of readmission and the Acute Laboratory Risk of Mortality Score and the number of previous discharges within 90 days. The model c-statistic was 0.697 with good calibration. When administrative variables were added to the model, the c-statistic increased to 0.722. CONCLUSIONS: Automated clinical data can generate a readmission risk score early at hospitalization with fair discrimination. It may have applied value to aid early care transition. Adding administrative data increases predictive accuracy. The administrative data-enhanced model may be used for hospital comparison and outcome research.
Readmission shortly after discharge is associated with clinical and financial burden to patients and society.1 It is used as a publicly reported metric with reimbursement implications to hospitals by the Centers for Medicare and Medicaid Services (CMS).2 For hospitals with excess risk-standardized readmission rates, CMS started lowering the reimbursement in FY2013. As of FY2015, the penalty is up to 3% lower reimbursement for certain clinical conditions [acute myocardial infarction (AMI), heart failure (HF), pneumonia, chronic obstructive pulmonary disease, total hip arthroplasty, and total knee arthroplasty].
Many factors are associated with readmissions. Some are intrinsic, attributable to the reduced patient reserve due to disease progression and severity at each admission.1,3–5 Some may relate to the clinical planning and care coordination while patients are still in the hospital. Others may relate to postdischarge care and other social factors.6,7
Current readmission risk adjustment models mostly rely on administrative data submitted to the payers after patients are discharged.8–10 Models relying on administrative data cannot be applied to the real-time patient care environment because the discharge diagnoses are not available until after patients are discharged. Most current models incorporated variables for comorbidity based on secondary diagnoses and use of prior medical services, but lack clinical data to assess the impact of physiological function or acute illness severity.1,8–10 The limited number of studies that attempted to incorporate physiological variables focused on a predefined clinical condition such as congestive HF.11 Such disease-specific models have applied value when the electronic health record (EHR) system is highly advanced and able to correctly classify patients into specific disease groups at admission. 
However, unlike discharge diagnoses that are standardized using codes from the International Classification of Diseases (ICD), admission diagnoses are not standardized across US hospitals. Often at the time of admission, patients display multiple signs and symptoms suggesting more than one clinical condition, which makes it difficult to unambiguously classify them into a single clinical category.
From the hospital comparison perspective, while it is helpful to assess readmission rates for patient groups with specific conditions, these conditions account for only a small proportion of total readmissions. According to the nationally representative data reported by the Agency for Healthcare Research and Quality (AHRQ), the 30-day all-cause readmission volume of AMI, HF, and pneumonia in 2012 accounted for just 9.8% of total readmission volume.12 In fact, the CMS has begun exploring a hospital-wide (all-condition) 30-day risk-standardized readmission measure and risk adjustment models.13 Hence, understanding the full picture of hospital readmission may have practical implications for hospitals, all payers, and governmental agencies.
The objective of this study was 2-fold. First, we sought to develop an early readmission risk score (RRS) for all hospitalized adult patients, using clinical data at the time of admission that were widely available in the EHR systems. Second, we expanded the early RRS model to include administrative data that are available after patients are discharged. We hypothesized that (1) clinical severity and frequency of recent hospital stays, captured in the real-time EHR system, increase the risk of readmission; and (2) type of diseases and comorbidities captured in the administrative data may enhance the predictive accuracy. For applications, the early RRS may serve as a potential real-time decision support tool for clinicians to target high risk patients for discharge planning and other readmission preventive interventions. 
The administrative data enhanced postdischarge model may be used for retrospective interhospital comparison of risk adjusted readmission rates.
METHODS
Data
We used one of the Clinical Research Databases from Becton Dickinson & Company (Franklin Lakes, NJ). This database has been used for research for the past 2 decades and the data collection system has been fully described elsewhere.14–17 We used data from the EHR systems of 70 acute care hospitals for all consecutively hospitalized adult patients (age 18 y or older) from 2006 through 2008. The laboratory data included numeric laboratory test results and collection time. A total of 96% of patients had laboratory data on the day of admission. For patients with multiple laboratory assessments on the admission day, we used the first reported value. For patients who did not have laboratory data on admission day, the laboratory data on the next day were used. Missing laboratory test results were treated as not clinically indicated and therefore were grouped into the reference group, an approach used by APACHE18 and other researchers.
The database also included hospital administrative data comprising demographics, admission and discharge date, discharge disposition, principal diagnosis, and secondary diagnosis codes.
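The laboratory value selection rule described above can be sketched in a few lines. This is an illustrative sketch only, not the study's code: the function name `select_admission_lab` and the input layout are hypothetical, and "first reported value" is assumed here to mean the earliest collection time.

```python
from datetime import date, datetime, timedelta

def select_admission_lab(results, admission_date):
    """Pick one value for a single laboratory test.

    results: list of (collection_time, value) tuples for one patient and test.
    Returns the earliest value collected on the admission day; if there is
    none, the earliest value collected on the next day; otherwise None, which
    the text treats as "not clinically indicated" (reference group).
    """
    for day in (admission_date, admission_date + timedelta(days=1)):
        same_day = [(t, v) for t, v in results if t.date() == day]
        if same_day:
            # Assumption: "first reported" = earliest collection time.
            return min(same_day, key=lambda tv: tv[0])[1]
    return None

# Hypothetical creatinine results for one admission on 2007-03-01.
admission = date(2007, 3, 1)
creatinine = [
    (datetime(2007, 3, 1, 14, 0), 1.4),
    (datetime(2007, 3, 1, 8, 30), 1.1),
    (datetime(2007, 3, 2, 6, 0), 0.9),
]
chosen = select_admission_lab(creatinine, admission)
```

Returning `None` for a missing test keeps the "missing means reference group" decision in the modeling step, where the categories are defined.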
Outcome
The outcome variable was all-cause readmission within 30 days from index hospital discharge. Each live index hospital discharge date served as the starting date for the 30-day readmission window. A patient could have multiple admissions during the study period. The analytic unit was an inpatient admission episode. This floating hospital discharge index approach is used by the Healthcare Cost and Utilization Project,12 a widely used nation-wide database from the AHRQ, as well as other researchers.19
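A minimal sketch of the floating-index outcome definition follows. The record layout and the function name `flag_readmissions` are hypothetical, and counting a same-day return (0 days) as a readmission is an assumption; the text does not say how transfers or same-day returns were handled.

```python
from datetime import date

def flag_readmissions(stays, window_days=30):
    """Flag each live index discharge that is followed by a readmission.

    stays: list of dicts with keys 'patient', 'admit', 'discharge' (dates)
    and 'alive' (bool, discharged alive). Every live discharge is its own
    index event, so one patient can contribute several index discharges.
    Returns {(patient, admit): readmitted_within_window}.
    """
    by_patient = {}
    for stay in stays:
        by_patient.setdefault(stay["patient"], []).append(stay)

    flags = {}
    for patient, patient_stays in by_patient.items():
        patient_stays.sort(key=lambda s: s["admit"])
        for i, stay in enumerate(patient_stays):
            if not stay["alive"]:
                continue  # only live discharges start a 30-day window
            flags[(patient, stay["admit"])] = any(
                0 <= (later["admit"] - stay["discharge"]).days <= window_days
                for later in patient_stays[i + 1:]
            )
    return flags

# Hypothetical patient with three admissions.
stays = [
    {"patient": "A", "admit": date(2007, 1, 1), "discharge": date(2007, 1, 5), "alive": True},
    {"patient": "A", "admit": date(2007, 1, 20), "discharge": date(2007, 1, 25), "alive": True},
    {"patient": "A", "admit": date(2007, 3, 15), "discharge": date(2007, 3, 20), "alive": True},
]
flags = flag_readmissions(stays)
```

In this example the second admission is both a readmission for the first stay and an index discharge in its own right, which is what makes the index "floating".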
Readmission Risk Candidate Variables
For the first model (RRS), we restricted the candidate variables to information in the EHR system that was available at or soon after the hospital admission to generate a predictive model applicable shortly after patients were hospitalized. We opted to use a published Acute Laboratory Risk of Mortality Score (ALaRMS)17 as an aggregated measure of clinical severity. ALaRMS assesses the clinical severity of hospitalized patients using demographics and admission laboratory test results. It included weighted age, sex, and a total of 23 numeric laboratory test results: (a) serum chemistry: albumin, aspartate transaminase, alkaline phosphatase, blood urea nitrogen, calcium, creatinine, glucose, potassium, sodium, and total bilirubin; (b) hematology and coagulation parameters: bands, hemoglobin, partial thromboplastin time, prothrombin time international normalized ratio, platelets, and white blood cell count; (c) arterial blood gas: partial pressure of carbon dioxide, partial pressure of oxygen, and pH value; (d) cardiac markers: brain natriuretic peptide, creatine phosphokinase MB, probrain natriuretic peptide, and troponin I.
We included the number of admissions in the previous 90 days as the second dimension of readmission risk. The rationale is that both progression of disease and disease severity reduce patients’ intrinsic reserve. The more frequently patients are hospitalized, the more likely they are to be readmitted.1,3–5 This variable also takes into account the nature of many chronic illnesses that can be managed but not necessarily cured. Hence, a certain rate of readmission may be expected.
For the administrative data–enhanced model, in addition to the RRS, we tested the Medicaid status as a surrogate for low socioeconomic status, length of stay, admission source, discharge disposition, and discharge diagnoses of index hospital discharge. 
We used the AHRQ’s principal diagnosis-based Clinical Classifications Software (CCS)20 and secondary diagnosis-based Comorbidity Software (CS)21 as standard disease classifications. The CCS collapses over 14,000 ICD-9 diagnosis codes into 285 clinically meaningful categories. The CS grouped selected secondary diagnosis codes into 30 comorbidity categories.
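The second risk dimension in this section, the count of hospital discharges in the previous 90 days, could be derived along the following lines. This is a hedged sketch: the function name `count_prior_discharges` is hypothetical, and the reference date is left to the caller because the text anchors the look-back window to the index admission in one place and to the index discharge in another.

```python
from datetime import date, timedelta

def count_prior_discharges(reference_date, discharge_dates, lookback_days=90):
    """Count earlier discharges in the look-back window before reference_date.

    discharge_dates: discharge dates of the same patient's other stays.
    A discharge counts if it falls within lookback_days before the reference
    date; the reference date itself is excluded so that the index stay is
    never counted against itself.
    """
    window_start = reference_date - timedelta(days=lookback_days)
    return sum(1 for d in discharge_dates if window_start <= d < reference_date)

# Hypothetical history: discharges 12, 83, and 120 days before the reference.
history = [date(2007, 5, 20), date(2007, 3, 10), date(2007, 2, 1)]
n_prior = count_prior_discharges(date(2007, 6, 1), history)
```

The resulting count would then be grouped into the categories used in the model; those cut points are reported in Table 2 and are not reproduced here.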
Model Development and Validation
Early Readmission Risk Model
We randomly split the study population into a derivation cohort (70%) and a validation cohort (30%). To derive the early readmission risk model, we fit a multivariable logistic regression model. We first conducted univariable analysis on the relationship between the ALaRMS score (the aggregated clinical severity) and the readmission rate. We examined the distribution of the ALaRMS scores and the corresponding readmission rates and categorized the continuous variable accordingly for easier application. We fit logistic regression models to compare the model fit. Similarly, we examined and categorized the number of inpatient discharges during the 90 days before the index discharge and its corresponding readmission rate. Then, we added the categorized number of prior 90-day admissions into the logistic regression model. We converted the final multivariable logistic regression model to an integer score system (early RRS), using methods described by Sullivan et al.22 Specifically, we identified the variable with the smallest coefficient in the final multivariable model and applied it as the denominator. Then we divided each of the remaining regression coefficients in the model by this denominator and rounded the resulting quotient to the nearest whole number (integer), which formed the score weight for that variable. We then calculated each person’s overall RRS by summing the points across all variables present. Converting model coefficients into a score system makes the risk adjustment model easy to understand and implement. We validated the RRS model using the validation cohort. We assessed model discrimination using the c-statistic.23
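The coefficient-to-points conversion described above can be illustrated with a short sketch. The function and variable names are hypothetical. The two coefficients are the natural logs of the two odds ratios quoted in the Results (1.53 and 2.08), so the resulting points are illustrative only and are not the published weights, which depend on the smallest coefficient in the full model (Table 2).

```python
import math

def coefficients_to_points(coefficients):
    """Convert logistic regression coefficients to integer score weights.

    coefficients: {variable_name: coefficient on the log-odds scale}.
    Each coefficient is divided by the smallest one (the denominator) and
    rounded to the nearest integer. Assumes all coefficients are positive,
    as for risk-increasing categories measured against a reference group.
    """
    denominator = min(coefficients.values())
    return {
        name: math.floor(beta / denominator + 0.5)  # round half up
        for name, beta in coefficients.items()
    }

def risk_score(points, variables_present):
    """Sum the points of every variable category present for one patient."""
    return sum(points[name] for name in variables_present)

# Illustrative coefficients backed out of the odds ratios quoted in the text.
coefficients = {
    "alarms_11_to_20": math.log(1.53),
    "one_prior_discharge": math.log(2.08),
}
points = coefficients_to_points(coefficients)
score = risk_score(points, ["alarms_11_to_20", "one_prior_discharge"])
```

`math.floor(x + 0.5)` is used instead of the built-in `round` because Python rounds exact halves to the nearest even integer, which may not match the intended "nearest whole number" rule.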
Administrative Data–Enhanced Model
For the administrative data–enhanced model, we added candidate variables available from standard administrative data, including admission source, discharge disposition, payor, index hospital length of stay, principal diagnosis-based CCS, and secondary diagnosis-based CS. We used P<0.05 as the model variable retention criterion. We examined the c-statistic change.
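For readers less familiar with the c-statistic used to compare the two models, the sketch below computes it directly from its definition: the probability that a randomly chosen readmitted discharge has a higher predicted risk than a randomly chosen non-readmitted one, with ties counted as one half. The function name is hypothetical, and the pairwise loop is for exposition only; with over a million discharges a rank-based calculation would be used instead.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for a binary outcome.

    scores: predicted risks or integer risk scores, one per discharge.
    outcomes: 1 if readmitted within 30 days, 0 otherwise.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))

# Hypothetical scores for four discharges, two of them readmitted.
example_c = c_statistic([0, 2, 1, 3], [0, 0, 1, 1])
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so the change examined here is the difference between the c-statistics of the early model and the administrative data–enhanced model.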
Sensitivity Analysis
We conducted sensitivity analyses to assess the predictive accuracy (c-statistic) of the RRS when applied to patients grouped by hospital characteristics (teaching status, number of beds, rural/urban) and by patient categories (medical vs. surgical and discharge principal diagnosis-based disease categories).
RESULTS
Patient Characteristics
Table 1 shows that the derivation cohort consisted of 836,992 discharges and 99,386 (11.9%) readmissions. Overall, 39.8% were male, 81.7% white, and 36.6% covered by Medicare. The overall median age was 63 years (interquartile range: 43, 78). The admission laboratory test results showed that patients with abnormal results had higher readmission rates. Patients with readmissions had higher ALaRMS scores than those without readmissions. The mean (SD) of the ALaRMS score was 43.9 (18.9) versus 33.8 (19.8), P<0.0001, and the median (interquartile range) was 43 (32, 55) versus 34 (20, 46), P<0.0001, respectively. The number of discharges during the previous 90 days was associated with the readmission rate in a graded fashion. The patient characteristics for the validation cohort were similar.
TABLE 1
Patient Characteristics by Derivation and Validation Cohorts
Early Readmission Model and Risk Score
The early readmission model yielded a graded relationship between the ALaRMS score and the risk of readmission (Table 2). Compared with those with ALaRMS≤10, those with ALaRMS between 11 and 20 had 53% higher odds of readmission (odds ratio, 1.53; 95% confidence interval, 1.47–1.60). In general, the higher the ALaRMS score, the higher the readmission risk. The number of discharges during the previous 90 days was a significant predictor of readmission. Compared with those with no hospital discharges in the previous 90 days, those with one previous discharge had approximately twice the odds of readmission (odds ratio, 2.08; 95% confidence interval, 2.05–2.12). This early readmission risk model had a c-statistic of 0.697 (Table 3). When administrative data were added to the RRS model, the c-statistic improved to 0.722. The Hosmer-Lemeshow goodness-of-fit test for the RRS model indicated good calibration for both derivation and validation cohorts (Appendix A, Supplemental Digital Content 1, http://links.lww.com/MLR/B299).
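The odds ratios and confidence intervals reported above follow from the logistic regression coefficients by exponentiation. The sketch below shows that relationship. The function names are hypothetical, and the standard error is backed out of the reported interval under the assumption of a Wald-type 95% interval, which the text does not state explicitly.

```python
import math

def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )

def standard_error_from_ci(lower, upper, z=1.96):
    """Approximate standard error implied by reported confidence limits."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported values for ALaRMS 11-20 versus ALaRMS <= 10.
beta = math.log(1.53)
se = standard_error_from_ci(1.47, 1.60)
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta, se)
```

Because the interval is symmetric on the log scale and not on the odds ratio scale, the round trip reproduces the reported limits only up to rounding.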
TABLE 2
Early Readmission Model and Risk Score
TABLE 3
Cumulative c-Statistics of Readmission Models
The RRS prevalence distribution and corresponding readmission rates by derivation and validation cohorts are presented in Figure 1. A total of 12.9% of discharges fell into the lowest RRS group (RRS=0), whose readmission rate was 3.2%. In contrast, for those in the 2 highest RRS groups (RRS=9 or 10+), over 50% were readmitted. With each point increase of RRS, there was a nearly linear increase of readmission risk. The Cochran-Armitage trend test was significant (P<0.0001).
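The Cochran-Armitage trend test mentioned above can be sketched as follows, using the standard large-sample form of the statistic with a two-sided normal P value. The function name and the counts in the example are hypothetical and are not study data; equally spaced group scores (0, 1, 2, ...) are an assumption.

```python
import math

def cochran_armitage_trend(events, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    events: number of readmissions in each ordered score group.
    totals: number of discharges in each ordered score group.
    scores: numeric group scores; defaults to 0, 1, 2, ...
    Returns (z, two_sided_p).
    """
    if scores is None:
        scores = list(range(len(totals)))
    n_all = sum(totals)
    p_bar = sum(events) / n_all  # pooled readmission rate

    # Score-weighted sum of observed minus expected events per group.
    t_stat = sum(t * (r - n * p_bar) for t, r, n in zip(scores, events, totals))
    sum_nt = sum(n * t for t, n in zip(scores, totals))
    sum_nt2 = sum(n * t * t for t, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_nt2 - sum_nt ** 2 / n_all)

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical counts: readmission rate rising from 10% to 30% over 3 groups.
z_trend, p_trend = cochran_armitage_trend([10, 20, 30], [100, 100, 100])
```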
FIGURE 1
Early readmission risk score distribution. The dotted line represents the proportion of discharges in each readmission risk score group for the derivation cohort. The proportion of discharges in the validation cohort was nearly identical to that in the derivation cohort; those data are not shown because the two lines overlap almost completely.
The sensitivity analysis revealed that the RRS model had similar predictive power for the patient populations in teaching versus nonteaching, small versus large, and rural versus urban hospitals, with a c-statistic ranging from 0.690 to 0.706 (Table 4). The RRS model also displayed a similar c-statistic for medical versus surgical patients (0.692 vs. 0.690). When patients were segregated into subgroups by major disease categories, 14 of 17 (82%) categories had a c-statistic above 0.65, 2 (12%) were between 0.60 and 0.64, and only 1 category (complications of pregnancy, childbirth, and the puerperium) was below 0.60, owing to its very low readmission rate of 2.9%. The Hosmer-Lemeshow model calibration charts for all sensitivity analyses are presented in Appendix B1–B3 (Supplemental Digital Content 2, http://links.lww.com/MLR/B300; Supplemental Digital Content 3, http://links.lww.com/MLR/B301; Supplemental Digital Content 4, http://links.lww.com/MLR/B302).
TABLE 4
Sensitivity Analysis
DISCUSSION
We developed a 30-day RRS for adult patients admitted to acute care hospitals based on objective clinical parameters available at the time of hospital admission. To our knowledge, this is the largest (over 1 million discharges) clinical database consisting of laboratory test results used for readmission model development and validation. The large sample size resulted in a precise estimate as evidenced by the tight confidence intervals of risk factors. We found a graded relationship between the RRS and readmission. Although the validity of this RRS needs to be tested prospectively in an environment with electronic health care data automation, this scoring system could have potential use for prioritizing discharge planning and coordinating care transition and management early in the hospital stay. Identifying high risk cases at the time of admission could enable clinicians to make connections with the patient and start to build a relationship during the patient’s hospital stay, which could make postdischarge engagements more effective when undertaken by somebody the patient knew and felt comfortable with. Given that the average hospital length of stay is approximately 3–4 days, identifying patients with high readmission risk early in the hospital stay has pragmatic value.
Our RRS model using objective data alone had a c-statistic of 0.697, compared with 0.693,5 reported for readmission models that used both clinical and administrative data. When we added discharge diagnosis-based administrative variables, the model c-statistic increased from 0.697 to 0.722. When we applied the RRS model to the 3 disease groups being reported by the CMS, the c-statistics of our models were higher than those of the CMS readmission models: 0.63 versus 0.60 (HF),8 0.65 versus 0.63 (AMI),9 0.68 versus 0.63 (pneumonia),10 respectively, although the definitions of disease groups and patient populations may not be directly comparable. 
In more recent CMS efforts using administrative data alone to predict 30-day all-cause readmissions hospital-wide, the reported model c-statistics ranged from 0.604 to 0.676,13 which are lower than that of our RRS model as well as those of other reported models3,5 that used both clinical and administrative data. Perhaps more importantly, incorporating objective clinical data increases the clinical plausibility.1,3,5,15–17
Many factors affect readmission. Some are intrinsic. With disease progression, a patient’s physiological reserve weakens and the clinical condition deteriorates to the point where more frequent acute care becomes necessary. The more physiological reserves are depleted, the more likely the readmission. This depletion can be objectively assessed by the laboratory test results and other clinical parameters. Our finding that the ALaRMS score, which consists of age, sex, and abnormal laboratory test results, was correlated in a graded fashion with readmission risk demonstrates that patients’ physiological reserve and clinical severity play important roles in rehospitalization. The 23 numeric laboratory results assessed the functions of major organ systems. The further the deviation from the reference range, the more severe the clinical conditions. Incorporating laboratory data in the readmission model has advantages. They are objective and less prone to variations in subjective judgment. They are quantitative, allowing more accurate depiction of the graded relationship between severity and the outcome. They are widely automated in EHR systems, allowing potential automated real-time application of algorithms.
The finding that admission frequency during the previous three months is a strong predictor of readmission is plausible, given that more frequent hospitalization may reflect disease progression, continued physiological decline, or proximity to the end of life. 
This is consistent with previous studies.3,4 It is also corroborated by the findings that the health care expenditure for the final year of life accounts for approximately $40,000 per Medicare decedent,24 and the 5% of Medicare patients who die each year account for nearly 30% of payments.25
Our study found that laboratory values, a measure of physiological status and clinical stability, predict readmission. This suggests the possibility that rapid improvement in laboratory values during hospitalization might lead to clinical and physiological stability. Those patients whose care process results in the largest improvements in the shortest time might have lower readmission risk. Consequently, improved clinical stability may be translated into fewer readmissions. Hence, clinical severity as indicated by deranged laboratory test results and disease progression could potentially be modifiable by better care, which could lead to lower readmission rates.
For retrospectively collected postdischarge administrative data, we found that principal diagnosis-based disease group variables enhanced the model discrimination. Certain clinical conditions are more likely to require rehospitalization than others. For example, severe systemic or chronic diseases affecting major organ systems such as infections, neoplasms, circulatory, respiratory, endocrine, or other systems might be more likely to be associated with readmissions. Hence, adding the principal diagnosis-based disease groups improves model discrimination. Secondary diagnosis-based comorbidities added a small improvement (from 0.715 to 0.722) above and beyond what was already in the model. The small impact of comorbidity may be due to the objective assessment of physiological function by the clinical laboratory data that have already accounted for comorbidity to a large extent. 
This finding is consistent with a previous study that found limited contribution of comorbidity when clinical data are used in the model.3 As administrative variables are available after patients are discharged as standard billing data, the enhanced model may be used for retrospective hospital comparison and outcome research.
Limitations
We used all-cause readmission within 30 days as the outcome. We did not attempt to differentiate avoidable versus nonavoidable readmission. “Avoidable” is difficult to define. SQLape, an ICD-10 code-based system that attempts to identify avoidable readmissions, was developed in Switzerland, but it has not been validated in the United States.26 Studies on “avoidable” readmission reported a wide range: 5.0%–78.9%.27 The large range may indicate the variation in the definitions of “avoidable” readmission. Our conceptualization builds on the assumption that with disease progression, patients’ physiological reserve weakens, which increases the probability of readmission. Hence, we “expect” certain readmission rates based on a given patient population. Adjusting for physiological severity and primary diseases of major organ systems may be more practical and fair than trying to define “avoidable” consistently in terms of implementing a hospital comparison metric. Likewise, we used existing data which did not contain information allowing us to identify “planned” readmissions. Continued effort in differentiating planned versus unplanned readmissions may further enhance the study in this field. For example, some “planned” readmissions may become identifiable if EHR systems become more integrated, so that the scheduling systems can specify planned readmissions.
Our patient population comprised consecutive patients from 70 hospitals. Although this is an improvement in patient diversity, it may not represent the US patient population because our hospitals were primarily from the northeastern region. Some studies found socioeconomic status (SES) or admission or discharge to a skilled nursing facility (SNF) or long-term care (LTC) facility to be associated with readmission.1,28–30 We assessed this using Medicaid as a surrogate for low SES. Together, the 3 variables added 0.001 to the c-statistic (from 0.697 to 0.698) above the RRS model. 
This small improvement in model discrimination was in line with other studies.28–30 In our model, the small increase of cumulative c-statistic was also supported by a logistic model including just these three variables (Medicaid, SNF/LTC admission source, and SNF/LTC discharge disposition) without any other covariates or RRS, which yielded a c-statistic of 0.54, indicating low discrimination of these 3 variables in our study population. In addition, there is controversy regarding the use of SES as a risk adjuster for hospital comparison purposes because of the concern that the higher readmission rate of low SES patients may be associated with lower quality of care provided to the disadvantaged patient population.31
Unlike predicting the mortality risk, which showed highly accurate predictive power when objective clinical data are carefully crafted,15–17 predicting readmission remains complex. In addition to the intrinsic clinical risk of the patients, which can be measured objectively, extrinsic risk factors, including care transition planning and other social and environmental factors, may play important roles.1 These external risk factors can be difficult to collect electronically without integrated health care systems but can be incorporated for quality improvement purposes as they become available electronically.
For the purpose of risk adjusted public reporting metrics, it may not even be desirable to adjust for process variables that are under the control of care providers. If substandard care processes, such as poor care coordination, are associated with higher readmission rates, then adding these variables in the model would give more credit (expecting more readmissions) to those hospitals/systems doing a lesser job than those who do a better job in coordinating care. 
Nevertheless, with increasing automation and connectivity of health care systems, factors that influence readmissions, such as discharge transition planning, postdischarge care, and social support, might be captured electronically and studied to enhance the care system and reduce readmissions from the perspective of quality improvement.
CONCLUSIONS
Automated clinical laboratory data can generate an RRS early at hospital admission with fair discrimination. It may be implemented and tested in an electronic health care system to aid early care transition planning. The administrative data–enhanced model, created by adding discharge diagnosis data, improves readmission predictive accuracy. It may be used for retrospective risk adjusted hospital comparison and outcome research.
Supplemental Digital Content is available for this article. Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal's Website, www.lww-medicalcare.com.