Liqin Wang (1), Eli Miloslavsky (2), John H Stone (2), Hyon K Choi (3), Li Zhou (1), Zachary S Wallace (4)
1. Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, United States; Harvard Medical School, Boston, MA, United States.
2. Harvard Medical School, Boston, MA, United States; Rheumatology Unit, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston, MA, United States.
3. Harvard Medical School, Boston, MA, United States; Rheumatology Unit, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston, MA, United States; Clinical Epidemiology Program, Mongan Institute, Massachusetts General Hospital, Boston, MA, United States.
4. Harvard Medical School, Boston, MA, United States; Rheumatology Unit, Division of Rheumatology, Allergy, and Immunology, Massachusetts General Hospital, Boston, MA, United States; Clinical Epidemiology Program, Mongan Institute, Massachusetts General Hospital, Boston, MA, United States. Electronic address: zswallace@mgh.harvard.edu.
Abstract
OBJECTIVES: Clinical notes from electronic health records (EHRs) are important for characterizing the natural history, comorbidities, and complications of ANCA-associated vasculitis (AAV) because these details may not be captured by claims and structured data. However, extracting information from notes often requires labor-intensive chart review. We hypothesized that machine learning can automatically discover clinically relevant themes across longitudinal notes to study AAV.
METHODS: This retrospective study included prevalent PR3- or MPO-ANCA+ AAV cases managed within the Mass General Brigham integrated health care system with providers' notes available between March 1, 1990, and August 23, 2018. We generated clinically relevant topics mentioned in notes using latent Dirichlet allocation-based topic modeling and conducted trend analyses of those topics over the 2 years before and 5 years after the initiation of AAV-specific treatment.
RESULTS: The study cohort included 660 patients with AAV. We generated 90 topics from 113,048 available notes. Topics were related to AAV diagnosis, treatment, symptoms and manifestations (e.g., glomerulonephritis), and complications (e.g., end-stage renal disease, infection). AAV-related symptoms and psychiatric symptoms were mentioned months before treatment initiation. Topics related to pulmonary and renal diseases, diabetes, and infections were common during the disease course but followed distinct temporal patterns.
CONCLUSIONS: Automated topic modeling can be used to discover clinically relevant themes and temporal patterns related to the diagnosis, treatment, comorbidities, and complications of AAV from EHR notes. Future research might compare the temporal patterns in a non-AAV cohort and leverage clinical notes to identify possible AAV cases prospectively.
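The two steps named in METHODS, latent Dirichlet allocation (LDA) topic modeling followed by a trend analysis relative to treatment initiation, can be illustrated with a minimal sketch. This is not the study's pipeline: the abstract does not state which LDA implementation, hyperparameters, preprocessing, or trend statistic were used. The collapsed Gibbs sampler, the `alpha` and `beta` values, the function names `fit_lda` and `topic_trend`, the mean-weight-per-month summary, and the toy notes below are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0, top_n=3):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper, toy scale only)."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (note, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for word in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token, then resample its topic from the conditional.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-note topic proportions; each row sums to 1.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    # Most frequent words assigned to each topic.
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[t][v])[:top_n]]
        for t in range(num_topics)
    ]
    return theta, top_words


def topic_trend(note_months, theta, topic, start=-24, end=60):
    """Mean weight of one topic per month relative to treatment initiation.

    The default window mirrors the 2 years before and 5 years after treatment
    initiation; averaging topic weights per month is an assumed summary.
    """
    sums, counts = Counter(), Counter()
    for month, weights in zip(note_months, theta):
        if start <= month <= end:
            sums[month] += weights[topic]
            counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(counts)}


# Toy, made-up tokenized notes (not study data) and their months from treatment start.
notes = [
    ["cough", "hemoptysis", "nodule", "cough"],
    ["creatinine", "hematuria", "dialysis", "creatinine"],
    ["cough", "nodule", "hemoptysis"],
    ["dialysis", "creatinine", "hematuria"],
]
note_months = [-3, 2, -3, 100]  # the last note falls outside the window

theta, top_words = fit_lda(notes, num_topics=2)
trend = topic_trend(note_months, theta, topic=0)
```

Here `theta` holds one topic-proportion row per note, `top_words` lists the leading words per topic for manual labeling, and `trend` maps each in-window month to the mean weight of the chosen topic. At the scale reported in the abstract (113,048 notes, 90 topics), an optimized library implementation would be the practical choice over a pure-Python sampler.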