
A database de-identification framework to enable direct queries on medical data for secondary use.

B S Erdal, J Liu, J Ding, J Chen, C B Marsh, J Kamal, B D Clymer.

Abstract

OBJECTIVE: To qualify the use of patient clinical records as non-human-subject research, electronic medical record data must be de-identified so that the risk of exposing protected health information is minimal. This study demonstrated a robust framework for structured data de-identification that can be applied to any relational data source that needs to be de-identified.
METHODS: Using a real-world clinical data warehouse, a pilot implementation covering a limited set of subject areas was used to demonstrate and evaluate this new de-identification process. Query results and performance were compared between the source and target systems to validate data accuracy and usability.
RESULTS: The combination of hashing, pseudonyms, and a session-dependent randomizer provides a rigorous de-identification framework that guards against (1) exposure of source identifiers; (2) internal data analysts manually linking records to source identifiers; and (3) cross-linking of identifiers among different researchers or across multiple query sessions by the same researcher. In addition, a query rejection option refuses queries that return fewer than preset numbers of subjects and total records, to prevent accidental subject identification from low data volumes. This framework does not prevent subject re-identification based on prior knowledge and sequence of events. It also does not address de-identification of medical free text, although text de-identification using natural language processing can be included owing to the framework's modular design.
CONCLUSION: We demonstrated a framework that produces HIPAA-compliant databases that researchers can query directly. This technique can be augmented to facilitate inter-institutional research data sharing through existing middleware such as caGrid.
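
The abstract does not specify algorithms, so the following is only a minimal sketch of how hashing, per-session pseudonyms, and query rejection might fit together. The use of HMAC-SHA256, the function names, the `(source_id, value)` row shape, the threshold values, and the choice to reject when either count falls below its threshold are all assumptions made for illustration, not details of the authors' implementation.

```python
import hashlib
import hmac
import secrets


def hash_identifier(source_id: str, site_key: bytes) -> str:
    """Keyed hash of a source identifier (assumed: HMAC-SHA256)."""
    return hmac.new(site_key, source_id.encode("utf-8"), hashlib.sha256).hexdigest()


def new_session_salt() -> bytes:
    """Fresh random value per query session (stands in for the 'session-dependent randomizer')."""
    return secrets.token_bytes(32)


def session_pseudonym(hashed_id: str, session_salt: bytes) -> str:
    """Pseudonym that is stable within one session but differs across sessions."""
    digest = hmac.new(session_salt, hashed_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def run_query(rows, site_key, session_salt, min_subjects=10, min_records=20):
    """De-identify (source_id, value) rows; return None if the result is too small.

    Hypothetical thresholds: the query is rejected when it yields fewer than
    min_subjects distinct subjects or fewer than min_records rows.
    """
    result = [
        (session_pseudonym(hash_identifier(source_id, site_key), session_salt), value)
        for source_id, value in rows
    ]
    subjects = {pseudonym for pseudonym, _ in result}
    if len(subjects) < min_subjects or len(result) < min_records:
        return None  # query rejected: low-volume result
    return result
```

In this sketch, the same patient receives the same pseudonym throughout one session, so rows can still be grouped by subject, while a different session salt yields unrelated pseudonyms, which is one way to limit cross-linking between researchers or sessions.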

Year:  2012        PMID: 22311158     DOI: 10.3414/ME11-01-0048

Source DB:  PubMed          Journal:  Methods Inf Med        ISSN: 0026-1270            Impact factor:   2.176

