
On Anonymizing Medical Microdata with Large-Scale Missing Values - A Case Study with the FAERS Dataset.

Mei-Hui Hsiao, Wen-Yang Lin, Kuang-Yung Hsu, Zih-Xun Shen

Abstract

As big data analysis becomes one of the main driving forces for productivity and economic growth, concern about the disclosure of individual privacy grows as well, especially for applications that access medical or health data containing personal information. Most contemporary techniques for privacy-preserving data publishing rest on a simple assumption: the data of concern are complete, i.e., contain no missing values. This is rarely the case in the real world. This paper presents our investigation of the effect of missing values on medical data privacy. In particular, we examined the FDA Adverse Event Reporting System (FAERS) dataset, a public dataset of adverse drug events released by the US FDA. Following the presumption of the current anonymization paradigm that the data should contain no missing values, we investigated three intuitive strategies for anonymizing the FAERS dataset: including the missing values, excluding them, or imputing them. Our results show that these intuitive strategies handle data with a massive amount of missing values poorly. Accordingly, we propose a new strategy, consolidation, together with a corresponding privacy protection model and anonymization algorithm. Experimental results show that our method can prevent privacy disclosure while sustaining the data utility needed for adverse drug reaction (ADR) signal detection.
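The three intuitive strategies named in the abstract can be illustrated with a minimal sketch. It uses a toy table, not FAERS, and it does not implement the paper's consolidation strategy, which the abstract does not describe. The field names `age_group` and `sex`, the `"MISSING"` placeholder, mode imputation, and the use of the smallest equivalence-class size as a k-anonymity-style check are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical quasi-identifiers; real FAERS fields are not modeled here.
QI = ("age_group", "sex")

def include_missing(rows):
    """Keep every row and treat a missing value as its own category."""
    return [{k: ("MISSING" if r[k] is None else r[k]) for k in QI} for r in rows]

def exclude_missing(rows):
    """Drop every row that has at least one missing quasi-identifier."""
    return [dict(r) for r in rows if all(r[k] is not None for k in QI)]

def impute_mode(rows):
    """Fill each missing value with the column's most frequent observed value."""
    modes = {}
    for k in QI:
        observed = [r[k] for r in rows if r[k] is not None]
        modes[k] = Counter(observed).most_common(1)[0][0]
    return [{k: (modes[k] if r[k] is None else r[k]) for k in QI} for r in rows]

def smallest_class(rows):
    """Size of the smallest group of rows sharing all quasi-identifier values."""
    counts = Counter(tuple(r[k] for k in QI) for r in rows)
    return min(counts.values()) if counts else 0

# Toy data: None marks a missing value.
rows = [
    {"age_group": "60-69", "sex": "F"},
    {"age_group": "60-69", "sex": "F"},
    {"age_group": "60-69", "sex": None},
    {"age_group": None, "sex": "M"},
    {"age_group": "30-39", "sex": "M"},
    {"age_group": "30-39", "sex": "M"},
]

for name, strategy in [("include", include_missing),
                       ("exclude", exclude_missing),
                       ("impute", impute_mode)]:
    result = strategy(rows)
    print(name, "rows kept:", len(result), "smallest class:", smallest_class(result))
```

On this toy table, including missing values leaves singleton groups, excluding them discards a third of the rows, and imputing keeps every row but replaces unknowns with guessed values. These are the kinds of trade-offs the abstract refers to, shown here only at toy scale.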

Year:  2019        PMID: 31947331     DOI: 10.1109/EMBC.2019.8857025

Source DB:  PubMed          Journal:  Conf Proc IEEE Eng Med Biol Soc        ISSN: 1557-170X


  1 in total

1.  Improved privacy preserving method for periodical SRS publishing.

Authors:  Wei Huang; Tong Yi; Haibin Zhu; Wenqian Shang; Weiguo Lin
Journal:  PLoS One       Date:  2021-04-22       Impact factor: 3.240

