Robert Ball (1), Sengwee Toh (2), Jamie Nolan (2), Kevin Haynes (3), Richard Forshee (4), Taxiarchis Botsis (4).
(1) Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, FDA, Silver Spring, MD, USA.
(2) Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA.
(3) Translational Research for Affordability and Quality, HealthCore, Inc., Wilmington, DE, USA.
(4) Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research, FDA, Silver Spring, MD, USA.
Abstract
INTRODUCTION: In May 2008, the Food and Drug Administration launched the Sentinel Initiative, a multi-year program for the establishment of a national electronic monitoring system for medical product safety that led, in 2016, to the launch of the full Sentinel System. Under the Mini-Sentinel pilot, several algorithms for identifying health outcomes of interest, including one for anaphylaxis, were developed and evaluated using data available from the Sentinel common data model.

PURPOSE: To evaluate whether features extracted from unstructured narrative data using natural language processing (NLP) could be used to classify anaphylaxis cases.

METHODS: Using previously developed methods, we extracted features from unstructured narrative data using NLP and applied rule-based and similarity-based algorithms to identify anaphylaxis among 62 potential cases previously classified by human experts as anaphylaxis (N = 33), not anaphylaxis (N = 27), and unknown (N = 2).

RESULTS: The rule-based and similarity-based approaches demonstrated almost equal performance (recall: 100% vs 100%; precision: 60.3% vs 57.4%; F-measure: 0.753 vs 0.729). Reasons for misclassification included the inability of the algorithms to make the same clinical judgments as human experts about the timing, severity, or presence of alternative explanations, and the identification of terms consistent with anaphylaxis but present in conditions other than anaphylaxis.

CONCLUSIONS: Although precision needs to be improved before these algorithms could be used without human review, we demonstrated that rule-based and similarity-based algorithms applied to unstructured narrative information from clinical records can be used to classify anaphylaxis in the Sentinel System. Further development and assessment of these methods in the Sentinel System are warranted.
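The F-measure values in the results can be related to the reported recall and precision with a short calculation. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is illustrative rather than taken from the study.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall; the abstract
    does not say which weighting was used.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (rounded percentages).
rule_based = f_measure(precision=0.603, recall=1.0)
similarity_based = f_measure(precision=0.574, recall=1.0)

print(f"rule-based F-measure:       {rule_based:.3f}")
print(f"similarity-based F-measure: {similarity_based:.3f}")
```

Computed from the rounded percentages, the values come out at roughly 0.752 and 0.729. The small gap from the reported 0.753 is plausibly due to rounding of the published precision, since the underlying case counts are not given here.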
Authors: Shirley V Wang; Olga V Patterson; Joshua J Gagne; Jeffrey S Brown; Robert Ball; Pall Jonsson; Adam Wright; Li Zhou; Wim Goettsch; Andrew Bate Journal: Drug Saf Date: 2019-11 Impact factor: 5.606
Authors: Elke Anklam; Martin Iain Bahl; Robert Ball; Richard D Beger; Jonathan Cohen; Suzanne Fitzpatrick; Philippe Girard; Blanka Halamoda-Kenzaoui; Denise Hinton; Akihiko Hirose; Arnd Hoeveler; Masamitsu Honma; Marta Hugas; Seichi Ishida; George En Kass; Hajime Kojima; Ira Krefting; Serguei Liachenko; Yan Liu; Shane Masters; Uwe Marx; Timothy McCarthy; Tim Mercer; Anil Patri; Carmen Pelaez; Munir Pirmohamed; Stefan Platz; Alexandre Js Ribeiro; Joseph V Rodricks; Ivan Rusyn; Reza M Salek; Reinhilde Schoonjans; Primal Silva; Clive N Svendsen; Susan Sumner; Kyung Sung; Danilo Tagle; Li Tong; Weida Tong; Janny van den Eijnden-van-Raaij; Neil Vary; Tao Wang; John Waterton; May Wang; Hairuo Wen; David Wishart; Yinyin Yuan; William Slikker Journal: Exp Biol Med (Maywood) Date: 2021-11-16
Authors: Rishi J Desai; Michael E Matheny; Kevin Johnson; Keith Marsolo; Lesley H Curtis; Jennifer C Nelson; Patrick J Heagerty; Judith Maro; Jeffery Brown; Sengwee Toh; Michael Nguyen; Robert Ball; Gerald Dal Pan; Shirley V Wang; Joshua J Gagne; Sebastian Schneeweiss Journal: NPJ Digit Med Date: 2021-12-20
Authors: Wei Yu; Chengyi Zheng; Fagen Xie; Wansu Chen; Cheryl Mercado; Lina S Sy; Lei Qian; Sungching Glenn; Hung F Tseng; Gina Lee; Jonathan Duffy; Michael M McNeil; Matthew F Daley; Brad Crane; Huong Q McLean; Lisa A Jackson; Steven J Jacobsen Journal: Pharmacoepidemiol Drug Saf Date: 2019-12-03 Impact factor: 2.732