Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies.

Clete A Kushida, Deborah A Nichols, Rik Jadrnicek, Ric Miller, James K Walsh, Kara Griffin.

Abstract

BACKGROUND: De-identification and anonymization are strategies used to remove patient identifiers from electronic health record data. Their use in multicenter research studies is of paramount importance, given the need to share electronic health record data across multiple environments and institutions while safeguarding patient privacy.
METHODS: A systematic literature search was conducted using the keywords de-identify, deidentify, de-identification, deidentification, anonymize, anonymization, data scrubbing, and text scrubbing. The search covered publications up to June 30, 2011 and involved 6 common literature databases. A total of 1798 prospective citations were identified, of which 94 met the criteria for review, and the corresponding full-text articles were obtained. Search results were supplemented by review of 26 additional full-text articles, for a total of 120 full-text articles reviewed.
RESULTS: A final sample of 45 articles met inclusion criteria for review and discussion. Articles were grouped into text, images, and biological sample categories. For text-based strategies, the approaches were segregated into heuristic, lexical, and pattern-based systems versus statistical learning-based systems. For images, approaches that de-identified photographic facial images and magnetic resonance image data were described. For biological samples, approaches that managed the identifiers linked with these samples were discussed, particularly with respect to meeting the anonymization requirements needed for Institutional Review Board exemption under the Common Rule.
CONCLUSIONS: Current de-identification strategies have their limitations, and statistical learning-based systems have distinct advantages over other approaches for the de-identification of free text. True anonymization is challenging, and further work is needed in the areas of de-identification of datasets and protection of genetic information.
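
ILLUSTRATION: The pattern-based category of text de-identification described in the results can be sketched in a few lines. This is a minimal sketch only, not the method of any system covered by the review; the patterns, the placeholder labels, the `scrub` function, and the sample note are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical identifier patterns, for illustration only.
# They are not taken from any system discussed in the review.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)),
]


def scrub(text: str) -> str:
    """Replace every match of each pattern with a bracketed category label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up sample note containing a date, a record number, and a phone number.
note = "Seen on 03/14/2011, MRN: 448812. Call 650-555-0199."
print(scrub(note))
```

A scrubber of this kind only catches identifiers with a regular surface form. Patient names and other free-form identifiers would pass through unchanged, which is one reason the text-based approaches also include lexical and statistical learning-based systems.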

Year:  2012        PMID: 22692265      PMCID: PMC6502465          DOI: 10.1097/MLR.0b013e3182585355

Source DB:  PubMed          Journal:  Med Care        ISSN: 0025-7079            Impact factor:   2.983


