
Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.

Hasan B Kartal, Xiao-Bai Li.

Abstract

This study concerns the risks of privacy disclosure when sharing and releasing a dataset in which each individual may be associated with multiple records. Existing data privacy approaches and policies typically assume that each individual in a shared dataset corresponds to a single record, which leads to an underestimation of the disclosure risks in multiple-records-per-person scenarios. We propose two novel measures of privacy disclosure to arrive at a more appropriate assessment of disclosure risks. The first measure assesses individual-record disclosure risk based on the frequency distribution of individuals' occurrences. The second measure assesses sensitive-attribute disclosure risk based on the number of individuals affiliated with a sensitive value. We show that the two proposed disclosure measures generalize the well-known k-anonymity and l-diversity measures, respectively, and work for scenarios with either a single record or multiple records per person. We have developed an efficient computational procedure that integrates the two proposed measures and a data quality measure to anonymize data with multiple records per person when sharing and releasing the data for research and analytics. The results of the experimental evaluation using real-world data demonstrate the advantage of the proposed approach over existing techniques for protecting privacy while preserving data quality.
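The underestimation described above can be illustrated with a small Python sketch. It is not the paper's measures or procedure, which the abstract does not define. It only contrasts counting records with counting distinct individuals inside a quasi-identifier group, and counts the distinct individuals associated with each sensitive value. The field names (`pid`, `age`, `zip`, `dx`), the helper names, and the toy records are all hypothetical.

```python
from collections import defaultdict

def group_counts(records, qi_keys, person_key, sensitive_key):
    """Group records by quasi-identifier values and track, per group,
    the record count, the distinct persons, and the distinct persons
    associated with each sensitive value."""
    groups = defaultdict(lambda: {
        "records": 0,
        "persons": set(),
        "by_sensitive": defaultdict(set),
    })
    for r in records:
        g = groups[tuple(r[k] for k in qi_keys)]
        g["records"] += 1
        g["persons"].add(r[person_key])
        g["by_sensitive"][r[sensitive_key]].add(r[person_key])
    return groups

def summarize(groups):
    """Reduce each group to plain counts for inspection."""
    return {
        qi: {
            "record_k": g["records"],        # what record-level counting sees
            "person_k": len(g["persons"]),   # distinct individuals in the group
            "persons_per_sensitive": {
                s: len(p) for s, p in g["by_sensitive"].items()
            },
        }
        for qi, g in groups.items()
    }

# Hypothetical toy data: person A contributes three of the four records.
records = [
    {"pid": "A", "age": "30-39", "zip": "021**", "dx": "flu"},
    {"pid": "A", "age": "30-39", "zip": "021**", "dx": "asthma"},
    {"pid": "A", "age": "30-39", "zip": "021**", "dx": "flu"},
    {"pid": "B", "age": "30-39", "zip": "021**", "dx": "flu"},
]
summary = summarize(group_counts(records, ["age", "zip"], "pid", "dx"))
```

In this toy group, a record-level count reports four indistinguishable records, yet only two individuals are present, and the value `asthma` is tied to a single individual. A person-level view of the same group therefore shows a smaller anonymity set than the record count suggests.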

Keywords:  Data Privacy; Gini Index; k-Anonymity; kd-Trees; l-Diversity

Year: 2020    PMID: 34321961    PMCID: PMC8315096    DOI: 10.17705/1jais.00643

Source DB: PubMed    Journal: J Assoc Inf Syst    ISSN: 1536-9323    Impact factor: 5.149


