Phuong Pham (1,2), Carmen Cheng (3), Eileen Wu (2), Ivone Kim (2), Rongmei Zhang (4), Yong Ma (4), Cindy M Kortepeter (2), Monica A Muñoz (1,2)

1. Department of Pharmaceutical Outcomes and Policy, College of Pharmacy, University of Florida, Gainesville, FL, USA.
2. Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA.
3. Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA. Carmen.Cheng@fda.hhs.gov.
4. Office of Translational Sciences, Center for Drug Evaluation and Research, US Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA.
Abstract
INTRODUCTION: Missing age presents a significant challenge when evaluating individual case safety reports (ICSRs) in the FDA Adverse Event Reporting System (FAERS). When age is missing from an ICSR's structured field, it may still appear in the report's free-text narrative.

OBJECTIVES: This study aimed to evaluate the performance, and assess the potential impact, of a rule-based natural language processing (NLP) tool that uses a text string search to identify patients' numerical age in unstructured narratives.

METHODS: Using FAERS ICSRs from 2002 to 2018, we evaluated the annual proportion of ICSRs with age missing in the structured field before and after NLP application. For a random sample of 1500 ICSRs, reviewers manually identified patients' age from the ICSR narratives (gold standard). The NLP-identified age was then compared with the gold standard.

RESULTS: During the study period, the percentage of ICSRs missing age in the structured field increased from 21.9% to 43.8%. The NLP tool performed well in the random sample: sensitivity 98.5%, specificity 92.9%, positive predictive value (PPV) 94.9%, and F-measure 96.7%. It also performed well for the subset of ICSRs missing age in the structured field; when applied to these cases, NLP identified age for an additional one million ICSRs (10% of the total number of ICSRs from 2002 to 2018) and decreased the percentage of ICSRs missing age to 27% overall.

CONCLUSIONS: NLP has potential utility for extracting patients' age from ICSR narratives. Use of this tool would enhance pharmacovigilance and research using FAERS data.
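To make the approach concrete, the following is a minimal Python sketch of a rule-based text string search for numerical age. It is not the tool evaluated in the study: the abstract does not give the actual search strings, so the patterns in `AGE_PATTERNS`, the `extract_age` function, and the `max_age` plausibility cutoff are all assumptions chosen for illustration. The helper `f_measure` assumes the reported F-measure is the harmonic mean of sensitivity and PPV, which is consistent with the figures in the abstract.

```python
import re

# Hypothetical patterns for illustration only; the study's real search
# strings are not described in the abstract.
AGE_PATTERNS = [
    # "45-year-old", "45 year old", "45 yr old", "45 y/o", "45 yo"
    re.compile(r"\b(\d{1,3})[\s-]*(?:years?|yrs?|y)[\s/-]*(?:old|o)\b",
               re.IGNORECASE),
    # "aged 45", "age: 45", "age of 45"
    re.compile(r"\bage(?:d)?\s*(?:of|:|=)?\s*(\d{1,3})\b", re.IGNORECASE),
]


def extract_age(narrative, max_age=120):
    """Return the first plausible age (in years) found in the text, or None.

    max_age is an assumed plausibility cutoff, not a value from the study.
    """
    for pattern in AGE_PATTERNS:
        for match in pattern.finditer(narrative):
            age = int(match.group(1))
            if 0 <= age <= max_age:
                return age
    return None


def f_measure(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


if __name__ == "__main__":
    print(extract_age("A 45-year-old female experienced rash."))
    print(extract_age("Therapy started 3 years ago."))
    # Consistency check against the sensitivity and PPV reported above.
    print(round(100 * f_measure(0.985, 0.949), 1))
```

A production tool would need many more rules than this sketch, for example ages expressed in months or weeks, ages written out in words, and numbers that refer to someone other than the patient.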