

HDDA: DataSifter: statistical obfuscation of electronic health records and other sensitive datasets.

Simeone Marino¹, Nina Zhou¹,², Yi Zhao¹, Lu Wang², Qiucheng Wu¹, Ivo D Dinov¹,³,⁴,⁵.

Abstract

There are no practical and effective mechanisms for sharing high-dimensional data that include sensitive information, in fields such as health, financial intelligence, or socioeconomics, without either compromising the utility of the data or exposing private personal or secure organizational information. Excessive scrambling or encoding of the information makes it less useful for modelling or analytical processing. Insufficient preprocessing may compromise sensitive information and introduce a substantial risk of re-identification of individuals by various stratification techniques. To address this problem, we developed a novel statistical obfuscation method (DataSifter) for on-the-fly de-identification of structured and unstructured sensitive high-dimensional data, such as clinical data from electronic health records (EHR). DataSifter provides complete administrative control over the balance between the risk of data re-identification and the preservation of the data information. Simulation results suggest that DataSifter can provide privacy protection while maintaining data utility for different types of outcomes of interest. The application of DataSifter to a large autism dataset provides a realistic demonstration of its promise for practical applications.
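
The trade-off the abstract describes, a single administrative setting that moves a dataset between full fidelity and strong obfuscation, can be illustrated with a toy sketch. This is not the DataSifter algorithm, whose details the abstract does not give. It is a stand-in technique (within-column value shuffling), and the function name `obfuscate` and the parameter `level` are hypothetical names chosen for illustration.

```python
import random


def obfuscate(rows, level, seed=0):
    """Toy obfuscation (NOT the DataSifter method; an illustrative assumption).

    For each column, pick a random fraction `level` of the rows and shuffle
    that column's values among the picked rows. Column-wise distributions are
    preserved exactly, while the link between a row and its values weakens
    as `level` grows from 0 (no change) to 1 (every row eligible).
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    rng = random.Random(seed)
    out = [list(r) for r in rows]  # copy so the input is left untouched
    n = len(out)
    if n == 0:
        return out
    k = round(level * n)  # number of rows touched per column
    for col in range(len(out[0])):
        idx = rng.sample(range(n), k)
        values = [out[i][col] for i in idx]
        rng.shuffle(values)
        for i, v in zip(idx, values):
            out[i][col] = v
    return out


# Hypothetical records: (age, systolic blood pressure)
records = [[34, 118], [51, 135], [47, 129], [29, 110], [63, 142]]
untouched = obfuscate(records, level=0.0)   # identical to the input
scrambled = obfuscate(records, level=1.0)   # columns shuffled independently
```

In this sketch, raising `level` lowers re-identification risk at the cost of row-level utility (for example, correlations between columns degrade), which is the kind of balance the abstract says an administrator controls.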


Keywords: Big Data; Data sharing; information protection; personal privacy; statistical method

Year: 2018; PMID: 30962669; PMCID: PMC6450541; DOI: 10.1080/00949655.2018.1545228

Source DB: PubMed; Journal: J Stat Comput Simul; ISSN: 0094-9655; Impact factor: 1.424

Cited

1 in total

1. Compressive Big Data Analytics: An ensemble meta-algorithm for high-dimensional multisource datasets.

Authors: Simeone Marino; Yi Zhao; Nina Zhou; Yiwang Zhou; Arthur W Toga; Lu Zhao; Yingsi Jian; Yichen Yang; Yehu Chen; Qiucheng Wu; Jessica Wild; Brandon Cummings; Ivo D Dinov
Journal: PLoS One; Date: 2020-08-28; Impact factor: 3.240
