Tracy Edinger¹, Dina Demner-Fushman², Aaron M Cohen¹, Steven Bedrick¹, William Hersh¹
¹ Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA.
² National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
Abstract
Objective: Secondary use of electronic health record (EHR) data depends on accurate and complete retrieval of the relevant patient cohort, which requires searching both structured and unstructured data. Clinical text is difficult to search, but chart notes have an internal structure that may facilitate accurate retrieval.
Methods: We developed rules to identify clinical document sections, which can be indexed in search engines that support faceted search, such as Lucene or Essie, an NLM search engine. We defined 22 clinical cohorts and wrote two queries for each cohort, one restricted to specific sections by their headings and the other searching the whole document. We manually evaluated a subset of the retrieved documents to compare query performance.
Results: Querying by section had lower recall than whole-document queries (0.83 vs 0.95), higher precision (0.73 vs 0.54), and higher F1 (0.78 vs 0.69).
Conclusion: This evaluation suggests that searching specific sections may improve precision under certain conditions, often at the cost of recall.
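As a quick consistency check on the reported figures, the short Python sketch below recomputes F1 as the harmonic mean of the precision and recall values given in the Results. It assumes the F1 values can be derived directly from the aggregate precision and recall; the abstract does not say whether F1 was instead averaged per cohort, so this is an illustration rather than the authors' procedure. The names `f1_score`, `reported`, and `f1_by_query` are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Aggregate precision and recall as reported in the abstract.
reported = {
    "section": {"precision": 0.73, "recall": 0.83},
    "whole_document": {"precision": 0.54, "recall": 0.95},
}

# Assumption: F1 is computed from the aggregate values, not averaged per cohort.
f1_by_query = {
    name: round(f1_score(m["precision"], m["recall"]), 2)
    for name, m in reported.items()
}

for name, f1 in f1_by_query.items():
    print(f"{name}: F1 = {f1:.2f}")
```

Under this assumption the recomputed values agree with the reported F1 of 0.78 for section queries and 0.69 for whole-document queries.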