Efficient Active Learning for Electronic Medical Record De-identification.

Muqun Li, Martin Scaiano, Khaled El Emam, Bradley A Malin.

Abstract

Electronic medical records are often de-identified before being disseminated for secondary uses. However, unstructured natural language records are challenging to de-identify, and doing so typically requires a considerable amount of expensive human annotation. In this investigation, we incorporate active learning into the de-identification workflow to reduce annotation requirements. We apply this approach to a real clinical trials dataset and a publicly available i2b2 dataset to illustrate that, when the machine learning de-identification system can actively request information from beyond the system (e.g., from a knowledgeable human assistant) to help create a better model, less training data is needed to maintain or improve the performance of trained models than in the typical passive learning framework. Specifically, with a batch size of 10 documents, an active learning approach requires only 40 documents to reach an F-measure of 0.9, while passive learning needs at least 25% more data to train a comparable model.
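
The batch-wise querying described in the abstract can be illustrated with a minimal sketch of pool-based active learning. Everything in it is an assumption made for illustration and is not taken from the paper: the abstract does not state the query strategy, so uncertainty sampling is swapped in, each "document" is reduced to a single number, the model is a toy decision threshold, and the names `fit_threshold`, `uncertainty`, and `active_learning` are hypothetical. Only the batch size of 10 and the 40-document total come from the abstract.

```python
import random

def fit_threshold(labeled):
    """Toy model (assumption): a threshold halfway between the two classes."""
    neg = [x for x, y in labeled if not y]
    pos = [x for x, y in labeled if y]
    lo = max(neg) if neg else 0.0   # largest item labeled negative so far
    hi = min(pos) if pos else 1.0   # smallest item labeled positive so far
    return (lo + hi) / 2

def uncertainty(threshold, x):
    """Items closest to the current threshold are the most uncertain."""
    return -abs(x - threshold)

def active_learning(pool, oracle, batch_size=10, rounds=4, seed=0):
    """Label `rounds` batches; `oracle` stands in for the human annotator."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    # Round 1: no model exists yet, so the seed batch is chosen at random.
    labeled = [(x, oracle(x)) for x in pool[:batch_size]]
    pool = pool[batch_size:]
    model = fit_threshold(labeled)
    for _ in range(rounds - 1):
        # Rank the unlabeled pool by uncertainty under the current model.
        pool.sort(key=lambda x: uncertainty(model, x), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        # Ask the annotator for labels on the selected batch, then retrain.
        labeled += [(x, oracle(x)) for x in batch]
        model = fit_threshold(labeled)
    return model, labeled

# Usage: 200 toy "documents"; the hidden labeling rule is x > 0.6.
pool = [i / 200 for i in range(200)]
model, labeled = active_learning(pool, lambda x: x > 0.6)
print(len(labeled), "items labeled; learned threshold:", model)
```

With these settings the loop labels 4 batches of 10, i.e. 40 items, matching the document count quoted in the abstract. For comparison, "at least 25% more data" on top of 40 documents corresponds to at least 50 documents for passive learning.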

Year: 2019    PMID: 31259000    PMCID: PMC6568071

Source DB: PubMed    Journal: AMIA Jt Summits Transl Sci Proc


  5 in total

1.  Building a best-in-class automated de-identification tool for electronic health records through ensemble learning.

Authors:  Karthik Murugadoss; Ajit Rajasekharan; Bradley Malin; Vineet Agarwal; Sairam Bade; Jeff R Anderson; Jason L Ross; William A Faubion; John D Halamka; Venky Soundararajan; Sankar Ardhanari
Journal:  Patterns (N Y)       Date:  2021-05-12

2.  Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction.

Authors:  Kevin Lybarger; Mari Ostendorf; Meliha Yetisgen
Journal:  J Biomed Inform       Date:  2020-12-05       Impact factor: 6.317

3.  Evaluating the re-identification risk of a clinical study report anonymized under EMA Policy 0070 and Health Canada Regulations.

Authors:  Janice Branson; Nathan Good; Jung-Wei Chen; Will Monge; Christian Probst; Khaled El Emam
Journal:  Trials       Date:  2020-02-18       Impact factor: 2.279

4.  The OpenDeID corpus for patient de-identification.

Authors:  Jitendra Jonnagaddala; Aipeng Chen; Sean Batongbacal; Chandini Nekkantti
Journal:  Sci Rep       Date:  2021-10-07       Impact factor: 4.379

5.  An Efficient Method for Deidentifying Protected Health Information in Chinese Electronic Health Records: Algorithm Development and Validation.

Authors:  Peng Wang; Yong Li; Liang Yang; Simin Li; Linfeng Li; Zehan Zhao; Shaopei Long; Fei Wang; Hongqian Wang; Ying Li; Chengliang Wang
Journal:  JMIR Med Inform       Date:  2022-08-30