Literature DB >> 16685332

Assessing the difficulty and time cost of de-identification in clinical narratives.

D A Dorr1, W F Phillips, S Phansalkar, S A Sims, J F Hurdle.   

Abstract

OBJECTIVE: To characterize the difficulty confronting investigators in removing protected health information (PHI) from cross-discipline, free-text clinical notes, an important challenge to clinical informatics research as recalibrated by the introduction of the US Health Insurance Portability and Accountability Act (HIPAA) and similar regulations.
METHODS: Randomized selection of clinical narratives from complete admissions written by diverse providers, reviewed using a two-tiered rater system and simple automated regular expression tools. For manual review, two independent reviewers used simple search and replace algorithms and visual scanning to find PHI as defined by HIPAA, followed by an independent second review to detect any missed PHI. Simple automated review was also performed for the "easy" PHI that are number- or date-based.
RESULTS: From 262 notes, 2074 PHI, or 7.9 +/- 6.1 per note, were found. The average recall (or sensitivity) was 95.9% while precision was 99.6% for single reviewers. Agreement between individual reviewers was strong (ICC = 0.99), although some asymmetry in errors was seen between reviewers (p = 0.001). The automated technique had better recall (98.5%) but worse precision (88.4%) for its subset of identifiers. Manually de-identifying a note took 87.3 +/- 61 seconds on average.
CONCLUSIONS: Manual de-identification of free-text notes is tedious and time-consuming, but even simple PHI is difficult to automatically identify with the exactitude required under HIPAA.

Mesh:

Year:  2006        PMID: 16685332

Source DB:  PubMed          Journal:  Methods Inf Med        ISSN: 0026-1270            Impact factor:   2.176


  21 in total

Review 1.  Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies.

Authors:  Clete A Kushida; Deborah A Nichols; Rik Jadrnicek; Ric Miller; James K Walsh; Kara Griffin
Journal:  Med Care       Date:  2012-07       Impact factor: 2.983

2.  Hiding in plain sight: use of realistic surrogates to reduce exposure of protected health information in clinical text.

Authors:  David Carrell; Bradley Malin; John Aberdeen; Samuel Bayer; Cheryl Clark; Ben Wellner; Lynette Hirschman
Journal:  J Am Med Inform Assoc       Date:  2012-07-06       Impact factor: 4.497

3.  Inductive creation of an annotation schema and a reference standard for de-identification of VA electronic clinical notes.

Authors:  Jeanmarie Mayer; Shuying Shen; Brett R South; Stephane Meystre; F Jeff Friedlin; William R Ray; Matthew Samore
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

4.  Embedding a hiding function in a portable electronic health record for privacy preservation.

Authors:  Lu-Chou Huang; Huei-Chung Chu; Chung-Yueh Lien; Chia-Hung Hsiao; Tsair Kao
Journal:  J Med Syst       Date:  2010-06       Impact factor: 4.460

5.  The machine giveth and the machine taketh away: a parrot attack on clinical text deidentified with hiding in plain sight.

Authors:  David S Carrell; David J Cronkite; Muqun Rachel Li; Steve Nyemba; Bradley A Malin; John S Aberdeen; Lynette Hirschman
Journal:  J Am Med Inform Assoc       Date:  2019-12-01       Impact factor: 4.497

6.  Efficient Active Learning for Electronic Medical Record De-identification.

Authors:  Muqun Li; Martin Scaiano; Khaled El Emam; Bradley A Malin
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2019-05-06

Review 7.  Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress.

Authors:  S M Meystre; C Lovis; T Bürkle; G Tognola; A Budrionis; C U Lehmann
Journal:  Yearb Med Inform       Date:  2017-09-11

8.  Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification.

Authors:  David S Carrell; David J Cronkite; Bradley A Malin; John S Aberdeen; Lynette Hirschman
Journal:  Methods Inf Med       Date:  2016-07-13       Impact factor: 2.176

9.  Learning regular expressions for clinical text classification.

Authors:  Duy Duc An Bui; Qing Zeng-Treitler
Journal:  J Am Med Inform Assoc       Date:  2014-02-27       Impact factor: 4.497

10.  Resilience of clinical text de-identified with "hiding in plain sight" to hostile reidentification attacks by human readers.

Authors:  David S Carrell; Bradley A Malin; David J Cronkite; John S Aberdeen; Cheryl Clark; Muqun Rachel Li; Dikshya Bastakoty; Steve Nyemba; Lynette Hirschman
Journal:  J Am Med Inform Assoc       Date:  2020-07-01       Impact factor: 4.497

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.