A scalable software solution for anonymizing high-dimensional biomedical data.

Thierry Meurers, Raffael Bild, Kieu-Mi Do, Fabian Prasser.

Abstract

BACKGROUND: Data anonymization is an important building block for ensuring privacy and fostering the reuse of data. However, transforming data in a way that preserves the privacy of subjects while maintaining a high degree of data quality is challenging, and it is particularly difficult for complex datasets that contain a large number of attributes. In this article we present how we extended the open source software ARX to improve its support for high-dimensional biomedical datasets.
FINDINGS: To improve ARX's ability to find optimal transformations when processing high-dimensional data, we implemented 2 novel search algorithms. The first is a greedy top-down approach modeled on a formerly implemented bottom-up search. The second is based on a genetic algorithm. We evaluated the algorithms with different datasets, transformation methods, and privacy models. The novel algorithms mostly outperformed the previously implemented bottom-up search. In addition, we extended the GUI to provide a high degree of usability and performance when working with high-dimensional datasets.
CONCLUSION: With our additions we have significantly enhanced ARX's ability to handle high-dimensional data in terms of processing performance as well as usability and thus can further facilitate data sharing.
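
The abstract names a greedy top-down search but does not describe its internals, so the following Python sketch is only an illustration of the general idea and not ARX's implementation. It assumes a toy dataset (`RECORDS`), two hypothetical generalization hierarchies (`generalize_age`, `generalize_zip`), k-anonymity as the privacy model, and a simple normalized-level loss as the quality measure; all of these names and choices are assumptions made for the example. The search starts at the most generalized transformation and repeatedly moves to the least lossy, less generalized neighbor that still satisfies k-anonymity.

```python
from collections import Counter

# Hypothetical toy data: (age, ZIP code) quasi-identifiers.
RECORDS = [(34, "81675"), (36, "81677"), (35, "81925"),
           (52, "80331"), (58, "80335"), (51, "80339")]


def generalize_age(age, level):
    """Level 0: exact age, level 1: decade interval, level 2: suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_zip(zip_code, level):
    """Level n masks the last n digits of the ZIP code."""
    keep = len(zip_code) - level
    return zip_code[:keep] + "*" * level


# One (generalization function, maximum level) pair per attribute.
HIERARCHIES = [(generalize_age, 2), (generalize_zip, 5)]


def transform(records, levels):
    """Apply one generalization level per attribute to every record."""
    return [tuple(func(value, level)
                  for (func, _), value, level in zip(HIERARCHIES, record, levels))
            for record in records]


def is_k_anonymous(records, levels, k):
    """True if every generalized record occurs at least k times."""
    counts = Counter(transform(records, levels))
    return min(counts.values()) >= k


def loss(levels):
    """Toy quality measure: mean of the normalized generalization levels."""
    return sum(level / max_level
               for level, (_, max_level) in zip(levels, HIERARCHIES)) / len(HIERARCHIES)


def greedy_top_down(records, k):
    """Walk down from the most generalized transformation, one level at a time."""
    current = tuple(max_level for _, max_level in HIERARCHIES)
    if not is_k_anonymous(records, current, k):
        return None  # even full generalization does not satisfy the model
    while True:
        # Neighbors that lower exactly one attribute's level and stay k-anonymous.
        candidates = []
        for i, level in enumerate(current):
            if level > 0:
                child = current[:i] + (level - 1,) + current[i + 1:]
                if is_k_anonymous(records, child, k):
                    candidates.append(child)
        if not candidates:
            return current  # no admissible step remains
        current = min(candidates, key=loss)


best = greedy_top_down(RECORDS, k=3)
print(best, transform(RECORDS, best))
```

Because each step only looks at direct neighbors, the search checks a number of transformations that grows with the sum of the hierarchy heights rather than with their product, which is the appeal of such heuristics for datasets with many attributes. The trade-off is that a greedy walk can stop at a transformation that is not globally optimal.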

Keywords: anonymization; biomedical data; data privacy; data protection; de-identification; genetic algorithm; heuristics; privacy preserving data publishing; software tool

Year: 2021 PMID: 34605868 PMCID: PMC8489190 DOI: 10.1093/gigascience/giab068

Source DB: PubMed Journal: GigaScience ISSN: 2047-217X Impact factor: 6.524
