April Jorge (1), Victor M Castro (2), April Barnado (3), Vivian Gainer (2), Chuan Hong (4), Tianxi Cai (5), Tianrun Cai (6), Robert Carroll (7), Joshua C Denny (7), Leslie Crofford (3), Karen H Costenbader (8), Katherine P Liao (6), Elizabeth W Karlson (8), Candace H Feldman (8).
1. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, 55 Fruit Street, Bulfinch 165, Boston, MA 02114, United States. Electronic address: AMJorge@mgh.harvard.edu.
2. Research Information Systems and Computing, Partners Healthcare, United States.
3. Division of Rheumatology and Immunology, Vanderbilt University Medical Center, United States.
4. Harvard T.H. Chan School of Public Health, United States.
5. Research Information Systems and Computing, Partners Healthcare, United States; Department of Biomedical Informatics, Harvard Medical School, United States.
6. Division of Rheumatology, Immunology, and Allergy, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, United States; Department of Biomedical Informatics, Harvard Medical School, United States.
7. Department of Biomedical Informatics, Vanderbilt University Medical Center, United States.
8. Division of Rheumatology, Immunology, and Allergy, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, United States.
Abstract
OBJECTIVE: To use electronic health records (EHRs) to study systemic lupus erythematosus (SLE), algorithms are needed that accurately identify patients with SLE. We used machine learning to generate data-driven SLE EHR algorithms and assessed the performance of existing rule-based algorithms.

METHODS: We randomly selected subjects with ≥1 SLE ICD-9/10 code from our EHR and identified gold standard definite and probable SLE cases by chart review, based on the 1997 ACR or 2012 SLICC Classification Criteria. From a training set, we extracted coded concepts and, using natural language processing (NLP), narrative concepts, and generated algorithms using penalized logistic regression to classify definite or definite/probable SLE. We assessed predictive characteristics in internal and external cohort validations. We also tested the performance characteristics of published rule-based algorithms with pre-specified permutations of ICD-9 codes, laboratory tests and medications in our EHR.

RESULTS: At a specificity of 97%, our machine learning coded algorithm had 90% positive predictive value (PPV) and 64% sensitivity for definite SLE, and 92% PPV and 47% sensitivity for definite/probable SLE. In the external validation, at 97% specificity, the definite/probable algorithm had 94% PPV and 60% sensitivity. Adding NLP concepts did not improve performance metrics. The PPVs of published rule-based algorithms ranged from 45% to 79% in our EHR.

CONCLUSION: Our machine learning SLE algorithms performed well in internal and external validation. Rule-based SLE algorithms did not transport as well to our EHR. Unique EHR characteristics, clinical practices and research goals regarding the desired sensitivity and specificity of the case definition must be considered when applying algorithms to identify SLE patients.
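The results above report PPV and sensitivity at a fixed specificity of 97%. As a minimal sketch of how such operating-point metrics can be computed (not the authors' pipeline), the Python below picks the lowest score cutoff that reaches a target specificity and then reports sensitivity, specificity and PPV at that cutoff. The scores and chart-review labels are hypothetical toy values, and the function names `threshold_for_specificity` and `metrics_at` are illustrative assumptions; in practice the scores would come from a fitted classifier such as a penalized logistic regression.

```python
from math import inf, nan

# Hypothetical toy data: predicted scores with chart-review labels
# (1 = SLE case, 0 = non-case). These values are invented for illustration.
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.45, 0.55, 0.70,
          0.25, 0.50, 0.60, 0.65, 0.75, 0.80, 0.85, 0.90, 0.95, 0.99]
labels = [0] * 10 + [1] * 10


def metrics_at(scores, labels, threshold):
    """Sensitivity, specificity and PPV when score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
    }


def threshold_for_specificity(scores, labels, target):
    """Lowest candidate cutoff whose specificity meets the target.

    Specificity never decreases as the cutoff rises, so the first cutoff
    that meets the target is also the one with the highest sensitivity.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        raise ValueError("at least one non-case is required")
    # An infinite cutoff classifies everyone as negative (specificity 1.0).
    for cutoff in sorted(set(scores)) + [inf]:
        specificity = sum(1 for s in negatives if s < cutoff) / len(negatives)
        if specificity >= target:
            return cutoff
    return inf


cutoff = threshold_for_specificity(scores, labels, target=0.97)
result = metrics_at(scores, labels, cutoff)
print(cutoff, result)
```

Fixing specificity first and reading off sensitivity and PPV mirrors how the abstract reports its results. PPV also depends on the proportion of true cases among the screened subjects, so a value estimated in one cohort need not carry over to another EHR.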