Aleena Banerji (1), Kenneth H Lai (2), Yu Li (3), Rebecca R Saff (4), Carlos A Camargo (5), Kimberly G Blumenthal (6), Li Zhou (7).
1. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Harvard Medical School, Boston, Mass. Electronic address: abanerji@partners.org.
2. Department of Computer Science, Brandeis University, Waltham, Mass; Clinical and Quality Analysis, Partners HealthCare System, Boston, Mass.
3. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Mongan Institute, Department of Medicine, Massachusetts General Hospital, Boston, Mass.
4. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Harvard Medical School, Boston, Mass.
5. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Harvard Medical School, Boston, Mass; Mongan Institute, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Department of Emergency Medicine, Massachusetts General Hospital, Boston, Mass.
6. Division of Rheumatology, Allergy, and Immunology, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Harvard Medical School, Boston, Mass; Mongan Institute, Department of Medicine, Massachusetts General Hospital, Boston, Mass; Edward P. Lawrence Center for Quality and Safety, Massachusetts General Hospital, Boston, Mass.
7. Harvard Medical School, Boston, Mass; Division of General Internal Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, Mass.
Abstract
BACKGROUND: Allergic drug reaction epidemiologic data are sparse because it remains difficult to identify true cases in large data sets using manual chart review.
OBJECTIVE: To develop and validate a novel informatics method based on natural language processing (NLP) in combination with International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes that identifies allergic drug reactions in the electronic health record.
METHODS: Previously studied and high-yield ICD-9-CM codes were used to screen for possible allergic drug reactions among all inpatients admitted in 2007 and 2008. A random sample was selected for manual chart review to identify true cases of allergic drug reactions. A rule-based NLP algorithm was then developed to identify allergic drug reactions using free-text clinical notes and discharge summaries from the filtered cases. The performance of using manual chart review of ICD-9-CM codes alone was compared with ICD-9-CM codes in combination with NLP.
RESULTS: Of 3907 cases identified by ICD-9-CM codes, 725 (19%) were randomly selected for manual chart review; 335 were confirmed as allergic drug reactions, resulting in a positive predictive value (PPV) of 46% (range: 18%-79%) when using ICD-9-CM codes alone. Our NLP algorithm in combination with ICD-9-CM codes achieved a PPV of 86% (range: 69%-100%). Among the 335 confirmed positive cases, NLP identified 259 true cases, resulting in a recall/sensitivity of 77% (range: 26%-100%). Among the 390 negative cases, NLP achieved a specificity of 89% (range: 69%-100%).
CONCLUSION: Using NLP with ICD-9-CM codes improved identification of allergic drug reactions. The resulting decrease in manual chart review effort will facilitate large epidemiology studies of this understudied area.
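The reported percentages can be checked against the counts given in RESULTS. The short Python sketch below is illustrative only and is not the authors' code. The abstract states the number of reviewed, confirmed, and NLP-identified cases, but not the number of true negatives; that value is an assumption here, inferred by applying the reported 89% specificity to the 390 negative cases. All names (`Confusion`, `nlp`, `icd_only_ppv`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 confusion counts for a binary classifier."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def ppv(self) -> float:
        # Positive predictive value (precision): TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def sensitivity(self) -> float:
        # Recall: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # TN / (TN + FP)
        return self.tn / (self.tn + self.fp)


# Counts stated in the abstract.
reviewed = 725       # ICD-9-CM-flagged cases sampled for chart review
confirmed = 335      # confirmed allergic drug reactions
nlp_tp = 259         # confirmed cases that NLP also identified
negatives = reviewed - confirmed  # 390 cases not confirmed

# PPV of ICD-9-CM codes alone: confirmed cases / reviewed cases.
icd_only_ppv = confirmed / reviewed

# ASSUMPTION: true negatives are not reported; they are inferred here
# from the stated 89% specificity on the 390 negative cases.
inferred_tn = round(0.89 * negatives)

nlp = Confusion(
    tp=nlp_tp,
    fp=negatives - inferred_tn,
    fn=confirmed - nlp_tp,
    tn=inferred_tn,
)

print(f"ICD-9-CM alone PPV: {icd_only_ppv:.0%}")
print(f"ICD-9-CM + NLP PPV: {nlp.ppv():.0%}")
print(f"NLP sensitivity:    {nlp.sensitivity():.0%}")
print(f"NLP specificity:    {nlp.specificity():.0%}")
```

Under that assumption, the inferred false-positive count reproduces the reported PPV for ICD-9-CM codes combined with NLP, so the published figures are mutually consistent.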