

Proposal and Assessment of a De-Identification Strategy to Enhance Anonymity of the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM) in a Public Cloud-Computing Environment: Anonymization of Medical Data Using Privacy Models.

Jongsub Moon¹, Hyung Joon Joo², Seungho Jeon¹, Jeongeun Seo¹, Sukyoung Kim¹, Jeongmoon Lee³, Jong-Ho Kim⁴, Jang Wook Sohn⁵.

Abstract

BACKGROUND: De-identifying personal information is critical when using personal health data for secondary research. The Observational Medical Outcomes Partnership Common Data Model (CDM), defined by the nonprofit organization Observational Health Data Sciences and Informatics, has been gaining attention for its use in the analysis of patient-level clinical data obtained from various medical institutions. When analyzing such data in a public environment such as a cloud-computing system, an appropriate de-identification strategy is required to protect patient privacy.
OBJECTIVE: This study proposes and evaluates a de-identification strategy that comprises several rules along with privacy models such as k-anonymity, l-diversity, and t-closeness. The proposed strategy was evaluated using an actual CDM database.
METHODS: The CDM database used in this study was constructed by the Anam Hospital of Korea University. Analysis and evaluation were performed using the ARX anonymizing framework in combination with the k-anonymity, l-diversity, and t-closeness privacy models.
RESULTS: The CDM database, which was constructed according to the rules established by Observational Health Data Sciences and Informatics, showed a low risk of re-identification: the DRUG_EXPOSURE table had the highest re-identifiable record rate in the dataset (11.3%), with a re-identification success rate of 0.03%. However, because all tables include at least one "highest risk" value of 100%, suitable anonymizing techniques are required; moreover, the CDM database preserves the "source values" (raw data), a combination of which could increase the risk of re-identification. Therefore, this study proposes an enhanced strategy that de-identifies the source values, significantly reducing not only the highest risk under the k-anonymity, l-diversity, and t-closeness privacy models but also the overall possibility of re-identification.
CONCLUSIONS: Our proposed de-identification strategy effectively enhanced the privacy of the CDM database, thereby encouraging clinical research involving multiple centers.

©Seungho Jeon, Jeongeun Seo, Sukyoung Kim, Jeongmoon Lee, Jong-Ho Kim, Jang Wook Sohn, Jongsub Moon, Hyung Joon Joo. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 26.11.2020.
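The privacy models named in the abstract can be illustrated with a minimal Python sketch. This is not the ARX framework's API or the authors' procedure; it is a simplified illustration under stated assumptions: the records are invented toy data, the column names are chosen only for readability, "highest risk" is taken to be 1 divided by the size of the smallest equivalence class, and t-closeness is measured with the total variation distance (which coincides with the earth mover's distance only when all categorical values are treated as equally distant).

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return dict(classes)


def k_anonymity(classes):
    """k is the size of the smallest equivalence class."""
    return min(len(group) for group in classes.values())


def highest_risk(classes):
    """Assumed definition: worst-case risk is 1 / smallest class size."""
    return 1.0 / k_anonymity(classes)


def l_diversity(classes, sensitive):
    """Distinct l-diversity: fewest distinct sensitive values in any class."""
    return min(len({r[sensitive] for r in group}) for group in classes.values())


def t_closeness(classes, sensitive):
    """Largest total variation distance between a class and the whole table."""
    everyone = [r for group in classes.values() for r in group]
    overall = Counter(r[sensitive] for r in everyone)
    total = len(everyone)
    worst = 0.0
    for group in classes.values():
        local = Counter(r[sensitive] for r in group)
        dist = 0.5 * sum(
            abs(local[v] / len(group) - overall[v] / total) for v in overall
        )
        worst = max(worst, dist)
    return worst


# Toy records (invented for illustration; not taken from the study's database).
records = [
    {"birth_decade": "195x", "gender": "F", "drug": "aspirin"},
    {"birth_decade": "195x", "gender": "F", "drug": "statin"},
    {"birth_decade": "196x", "gender": "M", "drug": "aspirin"},
    {"birth_decade": "196x", "gender": "M", "drug": "aspirin"},
    {"birth_decade": "197x", "gender": "F", "drug": "insulin"},
]

classes = equivalence_classes(records, ["birth_decade", "gender"])
print("k =", k_anonymity(classes))
print("highest risk =", highest_risk(classes))
print("l =", l_diversity(classes, "drug"))
print("t =", t_closeness(classes, "drug"))

# Suppressing the single-record class raises k and lowers the highest risk.
suppressed = {key: grp for key, grp in classes.items() if len(grp) > 1}
print("k after suppression =", k_anonymity(suppressed))
```

In this toy table a single record with a unique combination of quasi-identifiers is enough to push the highest risk to 100%, even though most records sit in larger classes. That mirrors the abstract's point that a low overall re-identification rate does not rule out a worst-case value of 100%.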


Keywords: Observational Health Data Sciences and Informatics; anonymization; common data model; de-identification; privacy

Year: 2020 PMID: 33177037 DOI: 10.2196/19597

Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428

2 in total

1. Big Data in Nephrology. (Review)

Authors: Navchetan Kaur; Sanchita Bhattacharya; Atul J Butte
Journal: Nat Rev Nephrol Date: 2021-06-30 Impact factor: 28.314

2. Perceived Risk of Re-Identification in OMOP-CDM Database: A Cross-Sectional Survey.

Authors: Yae Won Tak; Seng Chan You; Jeong Hyun Han; Soon-Seok Kim; Gi-Tae Kim; Yura Lee
Journal: J Korean Med Sci Date: 2022-07-04 Impact factor: 5.354
