Dylan J Peterson1,2, Nicolai P Ostberg3, Douglas W Blayney4, James D Brooks5, Tina Hernandez-Boussard2,6. 1. Stanford University School of Medicine, Stanford, CA. 2. Department of Medicine (Biomedical Informatics), Stanford University School of Medicine, Stanford, CA. 3. Grossman School of Medicine, New York University, New York, NY. 4. Division of Medical Oncology, Department of Medicine, Stanford University School of Medicine, Stanford, CA. 5. Department of Urology, Stanford University School of Medicine, Stanford, CA. 6. Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA.
Abstract
PURPOSE: Acute care use (ACU) is a major driver of oncologic costs and is penalized by a Centers for Medicare & Medicaid Services quality measure, OP-35. Targeted interventions reduce preventable ACU; however, identifying which patients might benefit remains challenging. Prior predictive models have made use of a limited subset of the data in the electronic health record (EHR). We aimed to predict risk of preventable ACU after starting chemotherapy using machine learning (ML) algorithms trained on comprehensive EHR data. METHODS: Chemotherapy patients treated at an academic institution and affiliated community care sites between January 2013 and July 2019 who met inclusion criteria for OP-35 were identified. Preventable ACU was defined using OP-35 criteria. Structured EHR data generated before chemotherapy treatment were obtained. ML models were trained to predict risk for ACU after starting chemotherapy using 80% of the cohort. The remaining 20% were used to test model performance by the area under the receiver operator curve. RESULTS: Eight thousand four hundred thirty-nine patients were included, of whom 35% had preventable ACU within 180 days of starting chemotherapy. Our primary model classified patients at risk for preventable ACU with an area under the receiver operator curve of 0.783 (95% CI, 0.761 to 0.806). Performance was better for identifying admissions than emergency department visits. Key variables included prior hospitalizations, cancer stage, race, laboratory values, and a diagnosis of depression. Analyses showed limited benefit from including patient-reported outcome data and indicated inequities in outcomes and risk modeling for Black and Medicaid patients. CONCLUSION: Dense EHR data can identify patients at risk for ACU using ML with promising accuracy. These models have potential to improve cancer care outcomes, patient experience, and costs by allowing for targeted, preventative interventions.
PURPOSE: Acute care use (ACU) is a major driver of oncologic costs and is penalized by a Centers for Medicare & Medicaid Services quality measure, OP-35. Targeted interventions reduce preventable ACU; however, identifying which patients might benefit remains challenging. Prior predictive models have made use of a limited subset of the data in the electronic health record (EHR). We aimed to predict risk of preventable ACU after starting chemotherapy using machine learning (ML) algorithms trained on comprehensive EHR data. METHODS: Chemotherapy patients treated at an academic institution and affiliated community care sites between January 2013 and July 2019 who met inclusion criteria for OP-35 were identified. Preventable ACU was defined using OP-35 criteria. Structured EHR data generated before chemotherapy treatment were obtained. ML models were trained to predict risk for ACU after starting chemotherapy using 80% of the cohort. The remaining 20% were used to test model performance by the area under the receiver operator curve. RESULTS: Eight thousand four hundred thirty-nine patients were included, of whom 35% had preventable ACU within 180 days of starting chemotherapy. Our primary model classified patients at risk for preventable ACU with an area under the receiver operator curve of 0.783 (95% CI, 0.761 to 0.806). Performance was better for identifying admissions than emergency department visits. Key variables included prior hospitalizations, cancer stage, race, laboratory values, and a diagnosis of depression. Analyses showed limited benefit from including patient-reported outcome data and indicated inequities in outcomes and risk modeling for Black and Medicaid patients. CONCLUSION: Dense EHR data can identify patients at risk for ACU using ML with promising accuracy. These models have potential to improve cancer care outcomes, patient experience, and costs by allowing for targeted, preventative interventions.
Acute care use (ACU) including emergency department (ED) visits and inpatient (IP) admissions account for nearly half of the cost associated with oncologic care in the United States.[1,2] As many as 50% of these visits are potentially preventable with early outpatient interventions.[3-6] Furthermore, substantial regional variation in the costs and frequency of ACU suggest opportunities for reduction of ACU.[1,7] Not only is ACU costly, unplanned ACU negatively affects patient quality of life and is poor-quality care.[8,9] In an effort to improve quality of care, increase transparency, and reduce costs, the Centers for Medicare & Medicaid Services (CMS) implemented a new quality measure, Chemotherapy Measure (OP-35), which tracks IP admissions or ED visits for patients age ≥ 18 years for potentially preventable diagnoses within 30 days of outpatient chemotherapy administration.[10]
CONTEXT
Key ObjectiveTo predict risk for preventable acute care use (ACU) after starting chemotherapy using comprehensive structured electronic health record (EHR) data.Knowledge GeneratedWe evaluated several machine learning models to predict ACU following chemotherapy using a comprehensive capture of pretreatment structured variables from the EHR. The top-performing model achieved strong performance (area under the receiver operator curve = 0.783; 95% CI, 0.761 to 0.806) and identified known and previously underutilized features associated with ACU, including prior hospitalization and diagnosis of depression. We found inequities in outcomes and risk predictions for Black and Medicaid patients, suggesting closer monitoring could help achieve equitable outcomes for these groups.RelevancePatients at risk for preventable ACU after starting chemotherapy can be identified using machine learning. Models could risk-stratify patients using systematically captured EHR data, providing opportunities for clinical decision support tools to help health systems deliver targeted, preventative interventions to improve patient outcomes.Although many of these admissions or ED visits are avoidable with adequate preventative care, resource constraints necessitate preventative interventions to be targeted. Therefore, developing a robust, data-driven method to identify patients who are most at risk of ED visits or acute admissions at the time of chemotherapy initiation would be valuable for health systems seeking to improve performance on OP-35 metrics. In addition, risk-stratifying patients according to their likelihood of ACU would prioritize preventative care to patients most in need, improving patient outcomes, comfort, and satisfaction during a chemotherapy regimen. Artificial intelligence (AI), including machine learning (ML) has potential to provide this type of risk assessment to physicians.AI-driven oncologic care includes deriving novel insights from complex health data to predict patient outcomes, including cancer survival.[11,12] Regulatory changes have also demanded the use of real-world data, such as that derived from electronic health records (EHRs), to guide clinical assertions and practice guidelines.[13] ML applied to EHRs has been used specifically to tackle the issue of identifying patients with cancer at risk for ACU.[14-17] Although these studies advance our knowledge in the field, they were limited to a small number of available variables in the EHR, used logistic regression instead of more robust AI models, and/or did not use OP-35 criteria to determine preventable ACU.As AI enables better use of real-world data, there is an opportunity to develop clinical decision support tools that could help identify patients at higher risk of ACU following chemotherapy and therefore improve clinical and patient outcomes among these patients. We hypothesized that we could accurately identify patients with cancer undergoing chemotherapy who are at risk of preventable ACU, as defined by CMS's OP-35 using prechemotherapy EHR data. We developed, validated, and compared nine ML models to predict ACU at 3, 6, and 12 months following the start of chemotherapy among patients seen in oncology clinics affiliated with a large academic cancer center (Fig 1A). We also incorporated patient-reported outcomes (PROs) to evaluate the impact of these data on predicting risk for preventable ACU.
FIG 1.
Study design and flow diagram. (A) Comprehensive EHR data on patients who met our inclusion criteria were obtained from our database. Data generated 180 days before the start of chemotherapy were processed and used to train machine learning models to predict risk of preventable ACU. Data generated after chemotherapy initiation were used to determine patient outcomes. (B) Inclusion criteria included OP-35 denominator inclusion criteria, follow-up 180 days after the episode of care for patients who did not have ACU, and adequate data to make predictions. ACU, acute care use; EHR, electronic health record.
Study design and flow diagram. (A) Comprehensive EHR data on patients who met our inclusion criteria were obtained from our database. Data generated 180 days before the start of chemotherapy were processed and used to train machine learning models to predict risk of preventable ACU. Data generated after chemotherapy initiation were used to determine patient outcomes. (B) Inclusion criteria included OP-35 denominator inclusion criteria, follow-up 180 days after the episode of care for patients who did not have ACU, and adequate data to make predictions. ACU, acute care use; EHR, electronic health record.
METHODS
Setting
This retrospective, prognostic cohort study was performed at a Comprehensive Cancer Center (CCC). The CCC includes a large tertiary practice, as well as a community hospital and community practices. This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis and Minimum Information for Medical AI Reporting guidelines.[18,19] This study was approved by the university's institutional review board with a waiver of informed consent. The study was approved by the local ethics committee.
Study Population
Adult patients with cancer were eligible for this study if they underwent chemotherapy at the CCC between January 1, 2013, and July 10, 2019 (Fig 1B). Patients were excluded if they failed to meet inclusion criteria for OP-35 (n = 2,272), on the basis of CMS's 2019 Chemotherapy Measure Updates and Specifications Report.[20] To limit effects because of loss to follow-up, patients who did not have an encounter after their episode of care, defined by the 180 days after the start of chemotherapy, or with a recorded date of death within the episode of care without record of ACU were excluded (n = 1,217). Finally, patients were excluded if they had no vitals, laboratory, medication, diagnosis, and/or procedural data from the 180 days before chemotherapy (n = 1,237).
Study Variables
All clinical data were obtained from the EHR. The CCC has a fully implemented EPIC EHR system, installed in 2008, that includes demographic, social, vital sign, procedure, diagnosis, medication, laboratory, health care utilization, and cancer-specific data generated before the first date of chemotherapy.[21] Patient demographics and clinical data were captured at the time of diagnosis and first date of treatment, including age at treatment, primary cancer type, sex, insurance payor at treatment, ethnicity, race, and stage at diagnosis. Charlson comorbidity score was calculated for each patient on the basis of pretreatment diagnoses. Deaths were captured from the internal Cancer Registry or from the patient health records. To limit data leakage allowing the models to learn from future events, data generated after patients' first date of chemotherapy treatment were only used for cohort building and determining patient outcomes and otherwise were not included in any training or testing data provided to the models. Additionally, training data were limited to information generated in the 180 days before chemotherapy initiation. Further details on data cleaning and preprocessing are available in the Appendix 1.
Outcome Measures
Although CMS uses a 30-day time frame for OP-35 numerator inclusion, our primary outcome was a hospitalization or ED visit that met OP-35 criteria (OP-35 events) within the episode of care (180 days) following the initiation of chemotherapy. However, to test model sensitivities, we analyzed performance at 30, 180, and 365 days after chemotherapy initiation and additionally analyzed for ED-only and IP-only events. ED visits resulting in admission were treated as IP events.
Predictive Models
Patients were randomly assigned to be in the training (80%) or testing cohort (20%). To predict hospitalization or ED visits following chemotherapy initiation, nine models were developed: logistic regression with least absolute shrinkage and selection operator (LASSO), Ridge, and Elastic Net penalties; random forests; gradient-boosted trees; multilayer perceptron neural networks; support vector machines; K Nearest Neighbors classifiers; and ensemble voting models that used equal voting on the predictions from other models. Additional details of model building, training, and selection are included in the Appendix 1.
Evaluation Metrics
To evaluate model performance, the models were validated on the 20% test set, not used previously for model development, on the basis of the area under the receiver operator curve (AUROC) with 1,000-fold bootstrap to determine CIs.
Patient-Reported Outcome Subanalysis
A subanalysis was performed on patients with at least one 12-item Patient-Reported Outcomes Measurement Information System (PROMIS) survey completed within 180 days before starting chemotherapy.[22,23] Two new models were trained on this cohort using same preprocessing, model training, and evaluation steps described above: one with the original features, and a second with the inclusion of 14 features derived from the PROMIS data (12-item survey responses, and the global mental and physical health scores).[24]
Evaluation of Disparities in the Model Output
To determine any effect of inherent bias in our models and data, patients were stratified in the testing cohort by their race, ethnicity, and insurance status, and then, their predicted risk-score percentiles were compared with their true rates of OP-35 events. Empiric cumulative distributions of predicted risk-score percentiles for subgroups were plotted against each other to assess how the models predicted each subgroup's risk for OP-35 events.
Statistical Analysis
Patient clinical characteristics were compared using univariable odds ratios (ORs). The results are presented as mean ± standard deviation, unless otherwise noted.
RESULTS
Cohort Characteristics
A total of 8,439 patients were included in the study cohort who were, on average, age 60.42 (14.48) years, 50% female, and 45% non-White (Table 1). A total of 2,939 patients (35%) met our primary outcome of having at least one OP-35 event within the first 180 days after starting chemotherapy (Appendix Fig A1A). OP-35 events decreased in frequency the further from chemotherapy initiation (Appendix Fig A1B). The most common diagnoses associated with these events included pain, complications of bone marrow suppression, such as neutropenia and sepsis, and gastrointestinal side effects, such as emesis (Appendix Fig A1C).
TABLE 1.
Patient Demographics
FIG A1.
OP-35 events in cohort. (A) Cumulative proportion of the cohort with an OP-35 event after time chemotherapy imitation. (B) Distribution of time from chemotherapy initiation to first OP-35 event in the cohort. (C) Distribution of diagnoses associated with the first OP-35 event for patients in the cohort. ACU, acute care use; ED, emergency department.
Patient DemographicsPatients with OP-35 events by 180 days significantly varied in multiple clinical characteristics than those without. For instance, these patients were on average 2.5 years younger at diagnosis (OR 0.988 per year; 95% CI, 0.985 to 0.991; P < .0001), more likely to be male (OR 1.113; 95% CI, 1.102 to 1.123; P < .0001), non-White (OR 1.280; 95% CI, 1.252 to 1.309; P < .0001), have stage IV disease (OR 1.873; 95% CI, 1.762 to 1.993; P < .0001), have a smoking history (OR 1.056; 95% CI, 1.062 to 1.051; P < .0001), and have a diagnosis of depression (OR 1.422; 95% CI, 1.369 to 1.478; P < .0001).
Model Performance for Predicting Risk of OP-35 Events
Overall, the various ML models performed reasonably well for determining patient risk for OP-35 events using the 759 EHR-based variables on which they were trained (AUROC range: 0.740-0.806, Appendix Table A2). Although the ensemble voting model had the best performance for predicting events by 180 days (AUROC 0.806; 95% CI, 0.794 to 0.816), the LASSO model performed comparably (AUROC 0.783; 95% CI, 0.761 to 0.806; AUROC range during cross-validation = 0.746-0.825, Appendix Figs 2B and 2C). As LASSO is a regularized form of logistic regression, it is easier to interpret how it is making predictions than with more complicated models and was therefore chosen to be our primary model. Although thresholds to label if a patient is likely to have an event could vary depending on the clinical use case, at Youden's index (probability of event = .278), this model had the following performance metrics: accuracy = 0.700; F1 score = 0.644; precision (event) = 0.567; precision (nonevent) = 0.821; recall (event) = 0.745; recall (nonevent) = 0.821; and area under the precision-recall curve = 0.702. This model selected 125 of 759 possible features to use in its predictions. These features included clinical variables used in prior risk models, such as the number of pretreatment hospitalizations, advanced stage disease, and white blood cell count, as well as underutilized features, including a diagnosis of depression, race, and prior brainstem magnetic resonance imaging (see Table 2 for the top model features and Appendix Table A3 for the full feature set).
TABLE A2.
Model AUROCs
FIG A2.
LASSO Model Performance. (A) Performance of primary LASSO model for predicting any OP-35 event, admissions-only OP-35 events, and ED visits-only OP-35 events at 30, 180, and 365 days. Vertical bars indicate 95% CIs. (B) ROC for the LASSO model when predicting any OP-35 event at 180 days. Shaded area indicates 95% CI. (C) Calibration curve for the LASSO model when predicting any OP-35 event at 180 days (blue) and distribution of predicted probabilities of having an OP-35 event by 180 days (red). AUROC, area under the receiver operator curve; ED, emergency department; LASSO, least absolute shrinkage and selection operator; ROC, receiver operator curve.
TABLE 2.
Key Features of the Primary Model
TABLE A3.
All Least Absolute Shrinkage and Selection Operator Coefficients
Key Features of the Primary ModelTo test the model's discriminative power in a setting similar to how it might be implemented at the point of care, the testing cohort was stratified into high-, intermediate-, and low-risk groups on the basis of their predicted risk score tertile. Kaplan-Meier survival curves for OP-35 events showed good separation between risk groups (Fig 2, P < .00001 for each group by log-rank test). By 180 days after starting chemotherapy, 357 (64%) of the 563 high-risk patients had an OP-35 event, whereas 75 (13%) of the 563 low-risk patients had an event.
FIG 2.
Risk-stratified Kaplan-Meier survival curves for ACU. Kaplan-Meier curves for preventable ACU events for patients in the test cohort stratified by predicted risk tertile. Shaded area represents 95% CIs. ACU, acute care use.
Risk-stratified Kaplan-Meier survival curves for ACU. Kaplan-Meier curves for preventable ACU events for patients in the test cohort stratified by predicted risk tertile. Shaded area represents 95% CIs. ACU, acute care use.To assess the relative importance of each type of feature (eg, medications) on the model's predictions, performance was evaluated after retraining when withholding each feature type (Appendix Table A4). All 95% CIs for the AUROCs of the models with the withheld data overlapped with that of the model trained with the full data set; however, predictive performance generally declined when withholding feature types and experienced the greatest decline when withholding demographic and cancer-specific data (AUROC 0.779; 95% CI, 0.755 to 0.802).
TABLE A4.
Ablation Analysis AUROCs
Model Performance for Alternative Outcomes
The sensitivity of the primary LASSO model for predicting risk for OP-35 events was tested at alternative time points (30 and 365 days) and specific ACU setting (ED-only v IP-only). The performance decreased slightly compared with the 180-day outcome for both the 30-day (AUROC 0.774; 95% CI, 0.760 to 0.788) and 365-day outcomes (AUROC 0.772; 95% CI, 0.761 to 0.784; Appendix Fig A2A). The model had improved performance predicting IP-only events (AUROC 0.798; 95% CI, 0.787 to 0.809), but substantially worse performance for ED-only events (AUROC 0.615; 95% CI, 0.595 to 0.635) compared with the primary outcome; however, a model trained specifically to predict ED events performed comparably (AUROC 0.781; 95% CI, 0.762 to 0.796).
Effect of Race, Ethnicity, and Insurance Type
Notably, Black, Hispanic or Latino, and Medicaid patients were predicted to be disproportionately higher risk than their counterparts (Figs 3A-3C). For instance, 50% of the Black patients in our cohort were predicted to be at the 67.5th percentile of risk or higher. In the testing cohort, there was no significant difference in events between Black and White patients (41.3% v 33.5% with an event, respectively, P = .275); however, the predicted risk percentile for Black patients was significantly higher (64.15 ± 26.41 v 47.25 ± 28.59, P = .0001). Concordantly, the model also exhibited poor calibration for Black patients (Appendix Fig A3A). Similarly, 50% of Medicaid patients were predicted to be at the 64th percentile of risk or higher, and these patients had significantly higher predicted risk percentiles than those with private insurance (61.97 ± 26.20 v 51.65 ± 29.38, P = .002). When examining the interaction between race and insurance type, there was an additive effect between being Black and having Medicaid insurance (Fig 3D).
FIG 3.
Algorithmic risk scores stratified by race, ethnicity, and insurance status. Empiric cumulative distribution plots for the percentile of predicted risk for the test cohort stratified by (A) race, (B) ethnicity, (C) insurance status, and (D) the interaction between Black race and Medicaid.
FIG A3.
LASSO model calibration for demographic subgroups. Calibration curves and histograms of predicted probabilities for ACU for the primary LASSO model, stratified by demographic subgroups: (A) White versus Black patients, (B) Hispanic or Latino versus non-Hispanic or non-Latino patients, and (C) Medicaid versus non-Medicaid patients. ACU, acute care use; LASSO, least absolute shrinkage and selection operator.
Algorithmic risk scores stratified by race, ethnicity, and insurance status. Empiric cumulative distribution plots for the percentile of predicted risk for the test cohort stratified by (A) race, (B) ethnicity, (C) insurance status, and (D) the interaction between Black race and Medicaid.
Effect of PROs on Model Performance
Finally, to evaluate the impact PROs have on predictive performance, new LASSO models were trained to predict the risk for OP-35 events at 180 days on the subset of patients who had PRO data (n = 1,808). The baseline model on this data set performed with an AUROC of 0.735 (95% CI, 0.673 to 0.799). With the inclusion of PRO data, performance did not significantly change (AUROC 0.736; 95% CI, 0.673 to 0.798), but the model required only 41 features for comparable performance as compared to the 125 in the original model. PROMIS features selected by the model included the global physical health score, the self-reported pain score, and the self-reported quality of life score.
DISCUSSION
Improving the care and outcomes of patients with cancer undergoing chemotherapy must incorporate insights learned from AI-driven decision support. In this study, we identify patients at high risk for preventable acute care, the target of CMS's OP-35 measure, using ML models trained on routinely collected clinical data that demonstrated strong predictive performance. Our primary model was able to accurately discriminate patients at high risk for ACU versus low risk at an actionable level of accuracy. As payment models move to incorporate more value-based care measures and regulatory agencies demand the incorporation of real-world data to guide clinical practice, the open-source, ML-based tool we present has strong clinical utility and the potential to improve the identification and mitigation of these high-risk patients.Our work presents a model focused specifically on OP-35 eligible admissions and ED visits, which can be implemented by health care systems that routinely capture prechemotherapy data on their patients by following our methodology. The regression coefficients that were selected by the LASSO model are generally consistent with previously reported literature about risk factors for ACU for patients with cancer; for example, previous ED admissions,[25] neutropenia,[26] and depression[27] have all been shown to be associated with unplanned admissions in previous studies. In addition, the model selected novel features not typically included in risk models, such as depression and prior brain imaging, indicating that there is likely loss of predictive information when using a small subset of clinical variables. The limited declines in predictive performance when withholding entire feature categories from the model further suggest that missing predictive information can be partially recovered from other parts of the patient record and there are likely many plausible explanatory models when using dense data. Compared to previous work predicting ACU in patients with cancer, we used OP-35 inclusion criteria to identify preventable ACU, limited data input to only include data up to the initiation of chemotherapy, and took advantage of the richness of data offered by the EHR, incorporating more information about the patient's overall health to aid with prediction by using more than 750 EHR-derived features, whereas prior studies have made use of a much smaller set of clinical features.[14,15,28,29]The very modest gains in predictive performance when including PROs expands upon recent work integrating PROs in ML models. Seow et al[30] developed an ML model to predict cancer patient survival that integrated clinical characteristics and PROs to achieve strong predictive performance and calibration; however, it is unclear what benefit PRO inclusion provided to their predictions, as they do not report a model when withholding PROs.[30] Although PROs are a critical component of evaluating modern oncology care, similar to our study, Grant et al recently showed that PROs did not improve performance when predicting ACU, suggesting their use may be somewhat limited in this particular prediction task.[14,31] Importantly, this study shows that PROs provide some predictive power to determine risk for ACU, but they do not substantially add to model performance when sufficient structured EHR data are available for predictions.Given the prior evidence of inequitable outcomes on the basis of demographic subgroup and insurer, we felt it was important to provide race, ethnicity, and insurance data to our models to study their respective effects.[32-36] Our results further indicate that Black race and Medicaid payor are predictive of increased risk for ACU. In particular, our findings that patients with cancer with Medicaid insurance have higher risk scores for ACU are aligned with previous findings and suggest that Medicaid is correlated with poor patient outcomes. Although our analyses are not designed to address causation, our results support further study of these discrepancies for evaluation and the identification of methods to mitigate identified inequalities. Our health care system strives to equalize resource allocation; however, these results suggest that Medicaid and Black patients need closer monitoring than others to achieve equitable outcomes. This phenomenon is not localized to our health care system, as others have shown similar results across diverse settings.[37] There is a growing shift in care delivery that emphasizes equitable outcomes over equal access to resources. A potential strategy to address this lack of equitable outcomes would be to ensure Medicaid and Black patients undergoing chemotherapy have more frequent follow-up and closer monitoring of symptoms than non-Hispanic Whites and patients insured by Medicare or private companies. Such strategies could potentially help decrease unethical gaps in outcomes. Additionally, the poor calibration of our models for certain demographic groups highlights the critical importance of making clinicians aware of potential biases in risk models when using these to make care decisions to limit perpetuation of bias and exacerbation of inequities.Clinical implementation of this model can help provide outpatient physicians a data-driven tool to identify patients who are at highest risk of unplanned ED or IP admission and preemptively intervene. We envision a sliding scale of interventions on the basis of the risk cohort that the patient falls into, determined by their ML risk score.[38] For example, high-risk patients could be prioritized for advanced home-based health care,[39] frequent nursing follow-up phone calls to manage outpatient symptoms and answer questions,[40] or home telemonitoring services.[41] Targeting these interventions to high-risk cohorts is a cost-effective way for health care systems to manage large populations of oncology patients in a data-driven manner and is already being implemented in some health systems. A recent report that prospectively validated a similar ACU prediction model in radiotherapy patients demonstrated a 10% reduction in ACU for patients who received more frequent evaluation after being identified by an algorithm as high risk.[29,42] From the patient's perspective, ML-guided interventions to reduce ACU would be beneficial, as increasing the number of days at home during chemotherapy and end-of-life oncology care is an important metric for quality of life.[43]There are several limitations that are important to consider in this study. First, this study was performed at a large, academic medical center with multiple specialty oncology practices. In particular, the small number of Black and Black Medicaid patients in our analytic cohort, which reflects the current catchment area of our institution, may limit the generalizability of our analysis. Our cancer center is expanding its catchment area to include treatment centers serving more currently under-represented minorities in our common EHR information platform, which we will use to confirm our analyses and test interventions. External validation of this model at both academic and nonacademic hospitals is needed to ensure generalizability; however, this may prove difficult because of variations in EHR implementations resulting in nonparsimonious models. We plan to develop models on the basis of a common data model, readily allowing for exportation to other systems. In the interim, we encourage other health systems to follow our methodology and expand upon our results. Second, our model relies upon a large number of clinical variables, and although many of these values are routinely collected in the EHR, developing an information technology pipeline that reliably and accurately retrieves these data from the EHR for real-time predictions at the point of care could prove challenging. Third, our model is focused on a binary prediction of ACU within specific follow-up windows; however, there are many patients who will often have repeat admissions within the same follow-up window, significantly driving up utilization and costs.[26] Future prediction models could be used to identify the number of ACU events per patient within prediction windows, giving health care systems more granular data to determine which patients would most likely benefit from preventative admission interventions. Fourth, our utilization and mortality data are limited to ED visits and admissions in our health care system and deaths captured by our cancer registry. Therefore, the data may be missing deaths and acute care services rendered elsewhere. Finally, although we developed a model for all patients receiving chemotherapy, regardless of the underlying cancer etiology, it is likely that models for specific cancer types would have improved performance at the expense of generalizability.In conclusion, in this study, we present a data-driven model to identify chemotherapy patients at high risk for preventable ACU, as defined by CMS's OP-35 quality measure. We found that an ML model trained on a large number of routinely collected clinical variables can accurately identify patients at risk for ACU while on chemotherapy before starting treatment. Our model selected clinical features used in prior risk models and other less commonly used features to produce predictions with an actionable level of accuracy. Additionally, the inherent bias in our models and data demonstrate inequity in both health care systems and risk models and suggest Medicaid and Black patients would benefit from closer monitoring than others to achieve equitable outcomes. Further work will be needed to validate our models in other health care systems and assess the clinical impact of pretreatment risk predictions. Nonetheless, ML models can risk-stratify patients, allowing for differential intensity of monitoring and targeted, preventative interventions, and have potential to improve cancer care outcomes, costs, and patient experience.
Authors: Gabriel A Brooks; Ankit J Kansagra; Sowmya R Rao; James I Weitzman; Erica A Linden; Joseph O Jacobson Journal: JAMA Oncol Date: 2015-07 Impact factor: 31.777
Authors: Sasha Shepperd; Helen Doll; Robert M Angus; Mike J Clarke; Steve Iliffe; Lalit Kalra; Nicoletta Aimonino Ricauda; Andrew D Wilson Journal: Cochrane Database Syst Rev Date: 2008-10-08
Authors: Robert J Huang; Nicole Sung-Eun Kwon; Yutaka Tomizawa; Alyssa Y Choi; Tina Hernandez-Boussard; Joo Ha Hwang Journal: JCO Clin Cancer Inform Date: 2022-06