Ensemble-based Methods to Improve De-identification of Electronic Health Record Narratives.

Youngjun Kim, Paul Heider, Stéphane Meystre.

Abstract

Text de-identification is an application of clinical natural language processing that offers significant efficiency and scalability advantages. Hence, various learning algorithms have been applied to this task to yield better performance. Instead of choosing the best individual learning algorithm, we aim to improve de-identification by constructing ensembles that lead to more accurate classification. We present three different ensemble methods that combine multiple de-identification models built with deep learning, shallow learning, and rule-based approaches. Each model is capable of automated de-identification without manual medical expertise. Our experimental results show that the stacked learning ensemble is more effective than the other ensemble methods, producing the highest recall, the most important metric for de-identification. The stacked ensemble achieved state-of-the-art performance on the 2014 i2b2 dataset with 97.04% precision, 94.45% recall, and 95.73% F1 score.
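As a rough illustration of the stacking idea described in the abstract, the sketch below learns a tiny meta-classifier over the per-token labels emitted by three base models, then checks that the reported F1 score follows from the reported precision and recall. It is not the authors' implementation: the function names (`train_stacker`, `stack_predict`, `f1_score`), the lookup-table meta-learner, the majority-vote fallback, and the toy labels are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_stacker(base_preds, gold):
    """Learn a lookup-table meta-classifier (a hypothetical, minimal stacker).

    Each tuple of base-model labels is mapped to the gold label that most
    often accompanied it in the held-out training data.
    """
    counts = defaultdict(Counter)
    for preds, label in zip(base_preds, gold):
        counts[tuple(preds)][label] += 1
    return {combo: c.most_common(1)[0][0] for combo, c in counts.items()}


def stack_predict(table, preds):
    """Predict with the learned table; fall back to a majority vote
    when the combination of base labels was never seen in training."""
    combo = tuple(preds)
    if combo in table:
        return table[combo]
    return Counter(combo).most_common(1)[0][0]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy held-out data: labels per token from three hypothetical base models
# (ordered as: deep learning, shallow learning, rule-based).
train_preds = [
    ("NAME", "NAME", "O"),
    ("O", "O", "DATE"),
    ("O", "O", "DATE"),
    ("O", "O", "O"),
    ("NAME", "O", "O"),
]
train_gold = ["NAME", "DATE", "DATE", "O", "NAME"]

table = train_stacker(train_preds, train_gold)

# Seen combination: the stacker learned to trust the rule-based model here.
print(stack_predict(table, ("O", "O", "DATE")))
# Unseen combination: falls back to a majority vote.
print(stack_predict(table, ("O", "NAME", "NAME")))
# F1 from the precision and recall reported in the abstract.
print(round(f1_score(97.04, 94.45), 2))
```

The point of the toy data is that a stacker can keep a label that a plain majority vote would discard (a date flagged only by the rule-based model), which is one plausible way an ensemble can raise recall. The final line reproduces the 95.73% F1 stated in the abstract.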

Year: 2018   PMID: 30815108   PMCID: PMC6371277

Source DB: PubMed   Journal: AMIA Annu Symp Proc   ISSN: 1559-4076


  6 in total

1.  Comparative Study of Various Approaches for Ensemble-based De-identification of Electronic Health Record Narratives.

Authors:  Youngjun Kim; Paul M Heider; Stéphane M Meystre
Journal:  AMIA Annu Symp Proc       Date:  2021-01-25

2.  De-identification of Clinical Text via Bi-LSTM-CRF with Neural Language Models.

Authors:  Buzhou Tang; Dehuan Jiang; Qingcai Chen; Xiaolong Wang; Jun Yan; Ying Shen
Journal:  AMIA Annu Symp Proc       Date:  2020-03-04

3.  Ensuring a safe(r) harbor: Excising personally identifiable information from structured electronic health record data.

Authors:  Emily R Pfaff; Melissa A Haendel; Kristin Kostka; Adam Lee; Emily Niehaus; Matvey B Palchuk; Kellie Walters; Christopher G Chute
Journal:  J Clin Transl Sci       Date:  2021-12-09

4.  Building a best-in-class automated de-identification tool for electronic health records through ensemble learning.

Authors:  Karthik Murugadoss; Ajit Rajasekharan; Bradley Malin; Vineet Agarwal; Sairam Bade; Jeff R Anderson; Jason L Ross; William A Faubion; John D Halamka; Venky Soundararajan; Sankar Ardhanari
Journal:  Patterns (N Y)       Date:  2021-05-12

5.  Integrating Eye Tracking and Speech Recognition Accurately Annotates MR Brain Images for Deep Learning: Proof of Principle.

Authors:  Joseph N Stember; Haydar Celik; David Gutman; Nathaniel Swinburne; Robert Young; Sarah Eskreis-Winkler; Andrei Holodny; Sachin Jambawalikar; Bradford J Wood; Peter D Chang; Elizabeth Krupinski; Ulas Bagci
Journal:  Radiol Artif Intell       Date:  2020-11-11

6.  Impact of De-Identification on Clinical Text Classification Using Traditional and Deep Learning Classifiers.

Authors:  Jihad S Obeid; Paul M Heider; Erin R Weeda; Andrew J Matuskowitz; Christine M Carr; Kevin Gagnon; Tami Crawford; Stephane M Meystre
Journal:  Stud Health Technol Inform       Date:  2019-08-21
