Barbara M Decker (1), Alexandra Turco (2), Jian Xu (3), Samuel W Terman (4), Nikitha Kosaraju (2), Alisha Jamil (2), Kathryn A Davis (2), Brian Litt (2), Colin A Ellis (2), Pouya Khankhanian (5), Chloe E Hill (4)

1. Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States; Department of Neurological Sciences, University of Vermont Medical Center, Burlington, VT, United States. Electronic address: Barbara.decker@uvmhealth.org.
2. Department of Neurology, University of Pennsylvania, Philadelphia, PA, United States.
3. Department of Neurology, Henry Ford Health System, Detroit, MI, United States.
4. Department of Neurology, University of Michigan, Ann Arbor, MI, United States.
5. Kaiser Permanente, Oakland, CA, United States.
Abstract
OBJECTIVE: To develop a natural language processing (NLP) algorithm to abstract seizure types and frequencies from electronic health records (EHR).

BACKGROUND: Seizure frequency measurement is an epilepsy quality metric. Yet, abstraction of seizure frequency from the EHR is laborious. We present an NLP algorithm to extract seizure data from unstructured text of clinic notes. Algorithm performance was assessed at two epilepsy centers.

METHODS: We developed a rules-based NLP algorithm to recognize terms related to seizures and frequency within the text of an outpatient encounter. Algorithm output (e.g. number of seizures of a particular type within a time interval) was compared to seizure data manually annotated by two expert reviewers ("gold standard"). The algorithm was developed from 150 clinic notes from institution #1 (development set), then tested on a separate set of 219 notes from institution #1 (internal test set) with 248 unique seizure frequency elements. The algorithm was separately applied to 100 notes from institution #2 (external test set) with 124 unique seizure frequency elements. Algorithm performance was measured by recall (sensitivity), precision (positive predictive value), and F1 score (geometric mean of precision and recall).

RESULTS: In the internal test set, the algorithm demonstrated 70% recall (173/248), 95% precision (173/182), and 0.82 F1 score compared to manual review. Algorithm performance in the external test set was lower with 22% recall (27/124), 73% precision (27/37), and 0.40 F1 score.

CONCLUSIONS: These results suggest NLP extraction of seizure types and frequencies is feasible, though not without challenges in generalizability for large-scale implementation.
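As a worked check of the reported metrics, the following minimal sketch recomputes recall and precision from the counts given in the abstract (173/248 and 173/182 internally, 27/124 and 27/37 externally). It also computes two ways of combining them: the harmonic mean, which is the conventional definition of F1, and the geometric mean, which is how the abstract describes its F1 score. The function name `summarize` and its parameter names are illustrative assumptions, not part of the authors' algorithm. The two combinations give noticeably different values on the external test set, so it is worth knowing which one a reported score refers to.

```python
import math


def summarize(true_pos: int, n_gold: int, n_predicted: int) -> dict:
    """Recompute metrics from raw counts (hypothetical helper, not the authors' code).

    true_pos     -- algorithm outputs that matched the manual annotation
    n_gold       -- seizure frequency elements in the manual "gold standard"
    n_predicted  -- seizure frequency elements the algorithm output
    """
    recall = true_pos / n_gold          # sensitivity
    precision = true_pos / n_predicted  # positive predictive value
    # Conventional F1: harmonic mean of precision and recall.
    harmonic = 2 * precision * recall / (precision + recall)
    # Geometric mean, the combination named in the abstract.
    geometric = math.sqrt(precision * recall)
    return {
        "recall": recall,
        "precision": precision,
        "harmonic_mean": harmonic,
        "geometric_mean": geometric,
    }


# Counts taken directly from the RESULTS section of the abstract.
internal = summarize(true_pos=173, n_gold=248, n_predicted=182)
external = summarize(true_pos=27, n_gold=124, n_predicted=37)

for name, result in (("internal", internal), ("external", external)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

On the external test set the geometric mean rounds to the reported 0.40, while the harmonic mean comes out lower, which suggests the reported scores follow the geometric-mean definition stated in the abstract rather than the conventional harmonic-mean F1.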