Ravi B Parikh (1,2,3,4), Kristin A Linn (5), Jiali Yan (3,4), Matthew L Maciejewski (6), Ann-Marie Rosland (2), Kevin G Volpp (1,2,3,4), Peter W Groeneveld (2,4), Amol S Navathe (1,2,3,4).
1. Corporal Michael J. Crescenz Veterans Affairs Medical Center, Philadelphia, Pennsylvania, United States of America.
2. VA Center for Health Equity Research and Promotion, Pittsburgh, Pennsylvania, United States of America.
3. Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
4. Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
5. Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
6. Durham Center of Innovation to Accelerate Discovery and Practice Transformation, Durham Veterans Affairs Health Care System, Durham, North Carolina, United States of America.
Abstract
BACKGROUND: Identifying individuals at risk for future hospitalization or death has been a major priority of population health management strategies. High-risk individuals are a heterogeneous group, and existing studies describing heterogeneity in high-risk individuals have been limited by data focused on clinical comorbidities and not socioeconomic or behavioral factors. We used machine learning clustering methods and linked comorbidity-based, sociodemographic, and psychobehavioral data to identify subgroups of high-risk Veterans and study long-term outcomes, hypothesizing that factors other than comorbidities would characterize several subgroups.
METHODS AND FINDINGS: In this cross-sectional study, we used data from the VA Corporate Data Warehouse, a national repository of VA administrative claims and electronic health data. To identify high-risk Veterans, we used the Care Assessment Needs (CAN) score, a routinely used VA model that predicts a patient's percentile risk of hospitalization or death at one year. Our study population consisted of 110,000 Veterans who were randomly sampled from 1,920,436 Veterans with a CAN score ≥ 75th percentile in 2014. We categorized patient-level data into 119 independent variables based on demographics, comorbidities, pharmacy, vital signs, laboratories, and prior utilization. We used a previously validated density-based clustering algorithm to identify 30 subgroups of high-risk Veterans ranging in size from 50 to 2,446 patients. Mean CAN score ranged from 72.4 to 90.3 among subgroups. Two-year mortality ranged from 0.9% to 45.6% and was highest in the home-based care and metastatic cancer subgroups. Mean inpatient days ranged from 1.4 to 30.5 and were highest in the post-surgery and blood loss anemia subgroups. Mean emergency room visits ranged from 1.0 to 4.3 and were highest in the chronic sedative use and polysubstance use with amphetamine predominance subgroups. Five subgroups were distinguished by psychobehavioral factors and four subgroups were distinguished by sociodemographic factors.
CONCLUSIONS: High-risk Veterans are a heterogeneous population consisting of multiple distinct subgroups, many of which are not defined by clinical comorbidities, with distinct utilization and outcome patterns. To our knowledge, this represents the largest application of machine learning (ML) clustering methods to subgroup a high-risk population. Further study is needed to determine whether distinct subgroups may benefit from individualized interventions.
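As an illustration of the general approach described in the methods (density-based clustering followed by per-subgroup outcome summaries), the following is a minimal sketch only. The abstract does not name the specific algorithm, so DBSCAN is used here as a stand-in; the two-feature toy data, the `died` flags, and the `eps` and `min_pts` values are all hypothetical and are not taken from the study.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (a stand-in, not the study's algorithm)."""
    labels = [None] * len(points)  # None means not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def subgroup_rates(labels, outcome):
    """Mean of a 0/1 outcome within each cluster, ignoring noise points."""
    totals = {}
    for label, y in zip(labels, outcome):
        if label == NOISE:
            continue
        n, s = totals.get(label, (0, 0))
        totals[label] = (n + 1, s + y)
    return {label: s / n for label, (n, s) in totals.items()}


# Hypothetical standardized features for nine patients (not study data).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
died = [0, 0, 1, 0, 1, 1, 1, 0, 0]  # hypothetical two-year mortality flags

labels = dbscan(points, eps=0.5, min_pts=3)
rates = subgroup_rates(labels, died)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
print(rates)   # {0: 0.25, 1: 0.75}
```

In this toy run the isolated ninth point is left unassigned as noise, which is one reason density-based methods can yield subgroups that together cover only part of a sampled population.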