April M Jorge1, Dylan Smith2, Zhiyao Wu2, Tashrif Chowdhury2, Karen Costenbader3, Yuqing Zhang1, Hyon K Choi1, Candace H Feldman3, Yijun Zhao2. 1. Division of Rheumatology, Allergy, and Immunology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA. 2. Department of Computer and Information Sciences, Fordham University, New York, NY, USA. 3. Division of Rheumatology, Inflammation, and Immunity, Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA.
Abstract
OBJECTIVES: Systemic lupus erythematosus (SLE) is a heterogeneous disease characterized by flares that can require hospitalization. Our objective was to apply machine learning methods to predict hospitalizations for SLE from electronic health record (EHR) data.
METHODS: We identified patients with SLE in a longitudinal EHR-based cohort with ≥2 outpatient rheumatology visits between 2012 and 2019. We applied multiple machine learning methods, including decision tree, random forest, naive Bayes, logistic regression, and an ensemble method, to predict hospitalizations with a primary diagnosis code for SLE. Candidate predictors were derived from structured EHR features, including demographics, laboratory tests, medications, ICD-9/10 codes for SLE manifestations, and healthcare utilization. We used two approaches to assess these variables over longitudinal follow-up, including the incorporation of lagged features to capture changes in clinical data over time. The performance of each model was evaluated by overall accuracy, the F statistic, and the area under the receiver operating characteristic curve (AUC).
RESULTS: We identified 1996 patients with SLE, of whom 4.6% were hospitalized for SLE in their most recent year of follow-up. Random forest models had the highest performance in predicting SLE hospitalizations, with AUC 0.751 and AUC 0.772 for the two approaches (averaging and progressive), respectively. The leading predictors of SLE hospitalizations included dsDNA positivity, C3 level, blood cell counts, and inflammatory markers, as well as age and albumin.
CONCLUSION: We have demonstrated that machine learning methods can predict SLE hospitalizations. We identified key predictors of these events, including known markers of SLE disease activity; further validation in external cohorts is warranted.
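The three evaluation measures named in the methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the toy labels and scores are invented for illustration, the function names are hypothetical, and it assumes that the "F statistic" refers to the F1 score (the harmonic mean of precision and recall), which the abstract does not confirm.

```python
# Illustrative sketch only: toy data, hypothetical helper names.
# Assumption: "F statistic" in the abstract means the F1 score.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision/recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = hospitalized for SLE, 0 = not hospitalized.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.60, 0.25, 0.40, 0.70, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is arbitrary

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

On this toy data, a classifier that predicts "not hospitalized" for everyone would match the accuracy of the thresholded scores while having an F1 of zero. With an outcome as uncommon as the 4.6% reported here, that is one reason a threshold-free measure such as AUC is usually read alongside accuracy.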