Utility-driven assessment of anonymized data via clustering.

Maria Eugénia Ferrão, Paula Prata, Paulo Fazendeiro

Abstract

In this study, clustering is conceived as an auxiliary tool to identify groups of special interest. This approach was applied to a real dataset concerning an entire Portuguese cohort of higher education Law students. Several anonymized clustering scenarios were compared against the original cluster solution. Clustering techniques were explored as data utility models in the context of data anonymization, using k-anonymity and (ε, δ)-differential privacy as privacy models. The purpose was to assess the utility of anonymized data by standard metrics, by the characteristics of the groups obtained, and by the relative risk (a relevant metric in social sciences research). For the sake of self-containment, we present an overview of anonymization and clustering methods. We used a partitional clustering algorithm and analyzed several clustering validity indices to understand to what extent the data structure is preserved after data anonymization. The results suggest that for datasets of low dimensionality or cardinality, the anonymization procedure easily jeopardizes the clustering endeavor. In addition, there is evidence that relevant field-of-study estimates obtained from anonymized data are biased.
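As a rough illustration of two notions named in the abstract, the sketch below computes the k-anonymity level of a small table and a relative risk before and after a simple generalization step. It is not the authors' code or data: the records, the field names (`age`, `region`, `dropout`), the age bands, and the helper functions are all hypothetical, chosen only to show how coarsening a quasi-identifier can raise k while shifting a relative-risk estimate.

```python
from collections import Counter


def k_anonymity_level(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values; the table is k-anonymous for any k up to it."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def relative_risk(records, exposed, outcome):
    """Risk of `outcome` among records where `exposed` holds, divided by
    the risk among the remaining records."""
    exp = [r for r in records if exposed(r)]
    unexp = [r for r in records if not exposed(r)]
    risk_exp = sum(outcome(r) for r in exp) / len(exp)
    risk_unexp = sum(outcome(r) for r in unexp) / len(unexp)
    return risk_exp / risk_unexp


# Hypothetical toy records (assumption: not taken from the study).
students = [
    {"age": 18, "region": "North", "dropout": True},
    {"age": 18, "region": "North", "dropout": False},
    {"age": 19, "region": "North", "dropout": False},
    {"age": 19, "region": "North", "dropout": False},
    {"age": 24, "region": "South", "dropout": True},
    {"age": 25, "region": "South", "dropout": True},
    {"age": 26, "region": "South", "dropout": False},
    {"age": 27, "region": "South", "dropout": False},
]


def generalize_age(record):
    """Replace the exact age with a coarse band (a simple generalization)."""
    band = "<=20" if record["age"] <= 20 else ">20"
    return {**record, "age": band}


anonymized = [generalize_age(r) for r in students]

qi = ["age", "region"]
print(k_anonymity_level(students, qi))    # smallest group in the raw table
print(k_anonymity_level(anonymized, qi))  # smallest group after banding

# Relative risk of dropout for 18-year-olds in the raw table, and for the
# closest group still identifiable after generalization ("<=20").
print(relative_risk(students, lambda r: r["age"] == 18, lambda r: r["dropout"]))
print(relative_risk(anonymized, lambda r: r["age"] == "<=20", lambda r: r["dropout"]))
```

In this toy table, banding the ages raises the smallest group size from 1 to 4, but the group of 18-year-olds can no longer be isolated, and the relative risk computed for the nearest remaining group differs from the original one. This is only meant to make the kind of bias mentioned in the abstract concrete; it says nothing about the magnitude observed in the study.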


Year: 2022 PMID: 35907927 PMCID: PMC9339002 DOI: 10.1038/s41597-022-01561-6

Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501

Date: 2022-07-30
