
Evaluation of a deidentification (De-Id) software engine to share pathology reports and clinical documents for research.

Dilip Gupta, Melissa Saul, John Gilbertson.

Abstract

We evaluated a comprehensive deidentification engine at the University of Pittsburgh Medical Center (UPMC), Pittsburgh, PA, that uses a complex set of rules, dictionaries, pattern-matching algorithms, and the Unified Medical Language System to identify and replace identifying text in clinical reports while preserving medical information for sharing in research. In our initial data set of 967 surgical pathology reports, the software did not suppress outside (103), UPMC (47), and non-UPMC (56) accession numbers; dates (7); names (9) or initials (25) of case pathologists; or hospital or laboratory names (46). In 150 reports, some clinical information was suppressed inadvertently (overmarking). The engine retained eponymic patient names, eg, Barrett and Gleason. In the second evaluation (1,000 reports), the software did not suppress outside (90) or UPMC (6) accession numbers or names (4) or initials (2) of case pathologists. In the third evaluation, the software removed names of patients, hospitals (297/300), pathologists (297/300), transcriptionists, residents and physicians, dates of procedures, and accession numbers (298/300). By the end of the evaluation, the system was reliably and specifically removing safe-harbor identifiers and producing highly readable deidentified text without removing important clinical information. Collaboration between pathology domain experts and system developers and continuous quality assurance are needed to optimize ongoing deidentification processes.
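As a rough illustration of the rule-and-dictionary approach the abstract describes, the sketch below replaces accession numbers, dates, and listed names with placeholders while leaving eponymic terms such as Barrett and Gleason in place. It is not the UPMC engine: the regular expressions, the `EPONYMS` allowlist, and the function names `deidentify` and `removal_rate` are all assumptions made for this example, and the real system also draws on the Unified Medical Language System, which is not modeled here. A plain allowlist like this would also retain a patient whose surname happens to be an eponym, so it should be read as a simplification.

```python
import re

# Hypothetical patterns; the abstract does not give the engine's actual rules.
ACCESSION_RE = re.compile(r"\b[A-Z]{1,3}\d{2}-\d{3,6}\b")  # e.g. "S03-12345" (assumed format)
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")       # e.g. "01/02/2003" (assumed format)

# Surname-like terms that carry clinical meaning and should be kept (assumed list).
EPONYMS = {"barrett", "gleason"}


def deidentify(text, names):
    """Replace accession numbers, dates, and the given names with placeholders."""
    text = ACCESSION_RE.sub("[ACCESSION]", text)
    text = DATE_RE.sub("[DATE]", text)
    for name in names:
        if name.lower() in EPONYMS:
            continue  # keep eponymic terms such as "Gleason score"
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, "[NAME]", text, flags=re.IGNORECASE)
    return text


def removal_rate(removed, total):
    """Percentage of reports in which an identifier class was removed."""
    return 100.0 * removed / total


report = "Case S03-12345 signed 01/02/2003 by Dr. Smith. Gleason score 7."
scrubbed = deidentify(report, ["Smith", "Gleason"])
```

Applied to the third-evaluation counts, `removal_rate(297, 300)` gives 99.0% for hospital and pathologist names, and `removal_rate(298, 300)` gives about 99.3% for accession numbers.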

Year:  2004        PMID: 14983930     DOI: 10.1309/E6K3-3GBP-E5C2-7FYU

Source DB:  PubMed          Journal:  Am J Clin Pathol        ISSN: 0002-9173            Impact factor:   2.493


