
A Comparative Analysis of Speed and Accuracy for Three Off-the-Shelf De-Identification Tools.

Paul M Heider, Jihad S Obeid, Stéphane M Meystre.

Abstract

A growing quantity of health data is being stored in Electronic Health Records (EHRs). The free-text clinical notes in these records contain patient and treatment information that is valuable for research, but they also contain Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient confidentiality and privacy rights. Considerable effort has gone into automated approaches to text de-identification, the process of removing or redacting PII. However, few studies have examined the performance of existing de-identification pipelines in a controlled comparative analysis. In this study, we use publicly available corpora to analyze speed and accuracy differences between three de-identification systems that can be run off-the-shelf: Amazon Comprehend Medical PHId, Clinacuity's CliniDeID, and the National Library of Medicine's Scrubber. No single system dominated all the compared metrics. NLM Scrubber was the fastest while CliniDeID generally had the highest accuracy. ©2020 AMIA - All rights reserved.
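As an illustration of how accuracy might be scored in a comparison of this kind, the sketch below computes precision, recall, and F1 for PII spans found by a tool against gold-standard annotations. The abstract does not state which metrics or matching rules the study used, so exact span matching, the `Span` layout, and the `score_spans` helper are all assumptions made for this example, not the authors' method.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical span layout: (start offset, end offset, PII category).
Span = Tuple[int, int, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_spans(gold: Set[Span], predicted: Set[Span]) -> Scores:
    """Score predicted PII spans against gold spans using exact matching.

    A prediction counts as a true positive only if its offsets and
    category are identical to a gold span (an assumption of this sketch).
    """
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores(precision, recall, f1)


# Toy data, invented for illustration only.
gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 62, "PHONE")}
predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (70, 80, "NAME")}
scores = score_spans(gold, predicted)
```

In de-identification, recall is usually the more critical figure, since a missed span means PII left in the text. Speed could be compared separately by timing each tool over the same corpus.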


Year:  2020        PMID: 32477643      PMCID: PMC7233098     

Source DB:  PubMed          Journal:  AMIA Jt Summits Transl Sci Proc


References: 8 in total

1.  Rapidly retargetable approaches to de-identification in medical records.

Authors:  Ben Wellner; Matt Huyck; Scott Mardis; John Aberdeen; Alex Morgan; Leonid Peshkin; Alex Yeh; Janet Hitzeman; Lynette Hirschman
Journal:  J Am Med Inform Assoc       Date:  2007-06-28       Impact factor: 4.497

Review 2.  Extracting information from textual documents in the electronic health record: a review of recent research.

Authors:  S M Meystre; G K Savova; K C Kipper-Schuler; J F Hurdle
Journal:  Yearb Med Inform       Date:  2008

Review 3.  De-identification of psychiatric intake records: Overview of 2016 CEGS N-GRID shared tasks Track 1.

Authors:  Amber Stubbs; Michele Filannino; Özlem Uzuner
Journal:  J Biomed Inform       Date:  2017-06-11       Impact factor: 6.317

Review 4.  Automated systems for the de-identification of longitudinal clinical narratives: Overview of 2014 i2b2/UTHealth shared task Track 1.

Authors:  Amber Stubbs; Christopher Kotfila; Özlem Uzuner
Journal:  J Biomed Inform       Date:  2015-07-28       Impact factor: 6.317

Review 5.  Automatic de-identification of textual documents in the electronic health record: a review of recent research.

Authors:  Stephane M Meystre; F Jeffrey Friedlin; Brett R South; Shuying Shen; Matthew H Samore
Journal:  BMC Med Res Methodol       Date:  2010-08-02       Impact factor: 4.615

Review 6.  A review of approaches to identifying patient phenotype cohorts using electronic health records.

Authors:  Chaitanya Shivade; Preethi Raghavan; Eric Fosler-Lussier; Peter J Embi; Noemie Elhadad; Stephen B Johnson; Albert M Lai
Journal:  J Am Med Inform Assoc       Date:  2013-11-07       Impact factor: 4.497

7.  A survey of practices for the use of electronic health records to support research recruitment.

Authors:  Jihad S Obeid; Laura M Beskow; Marie Rape; Ramkiran Gouripeddi; R Anthony Black; James J Cimino; Peter J Embi; Chunhua Weng; Rebecca Marnocha; John B Buse
Journal:  J Clin Transl Sci       Date:  2017-08

8.  Evaluating current automatic de-identification methods with Veteran's health administration clinical documents.

Authors:  Oscar Ferrández; Brett R South; Shuying Shen; F Jeffrey Friedlin; Matthew H Samore; Stéphane M Meystre
Journal:  BMC Med Res Methodol       Date:  2012-07-27       Impact factor: 4.615

Cited by: 4 in total

1.  Ensuring a safe(r) harbor: Excising personally identifiable information from structured electronic health record data.

Authors:  Emily R Pfaff; Melissa A Haendel; Kristin Kostka; Adam Lee; Emily Niehaus; Matvey B Palchuk; Kellie Walters; Christopher G Chute
Journal:  J Clin Transl Sci       Date:  2021-12-09

2.  Building a best-in-class automated de-identification tool for electronic health records through ensemble learning.

Authors:  Karthik Murugadoss; Ajit Rajasekharan; Bradley Malin; Vineet Agarwal; Sairam Bade; Jeff R Anderson; Jason L Ross; William A Faubion; John D Halamka; Venky Soundararajan; Sankar Ardhanari
Journal:  Patterns (N Y)       Date:  2021-05-12

3.  Investigation of the Utility of Features in a Clinical De-identification Model: A Demonstration Using EHR Pathology Reports for Advanced NSCLC Patients.

Authors:  Tanmoy Paul; Md Kamruz Zaman Rana; Preethi Aishwarya Tautam; Teja Venkat Pavan Kotapati; Yaswitha Jampani; Nitesh Singh; Humayera Islam; Vasanthi Mandhadi; Vishakha Sharma; Michael Barnes; Richard D Hammer; Abu Saleh Mohammad Mosa
Journal:  Front Digit Health       Date:  2022-02-16

4.  An evaluation of two commercial deep learning-based information retrieval systems for COVID-19 literature.

Authors:  Sarvesh Soni; Kirk Roberts
Journal:  J Am Med Inform Assoc       Date:  2021-01-15       Impact factor: 4.497

