
Efficient Discovery of De-identification Policies Through a Risk-Utility Frontier.

Weiyi Xia, Raymond Heatherly, Xiaofeng Ding, Jiuyong Li, Bradley Malin.

Abstract

Modern information technologies enable organizations to capture large quantities of person-specific data while providing routine services. Many organizations hope, or are legally required, to share such data for secondary purposes (e.g., validation of research findings) in a de-identified manner. In previous work, it was shown that de-identification policy alternatives could be modeled on a lattice, which could be searched for policies that met a prespecified risk threshold (e.g., likelihood of re-identification). However, the search was limited in several ways. First, its definition of utility was syntactic - based on the level of the lattice - and not semantic - based on the actual changes induced in the resulting data. Second, the threshold may not be known in advance. The goal of this work is to build the optimal set of policies that trade off privacy risk (R) against utility (U), which we refer to as an R-U frontier. To model this problem, we introduce a semantic definition of utility, based on information theory, that is compatible with the lattice representation of policies. To solve the problem, we initially build a set of policies that define a frontier. We then use a probability-guided heuristic to search the lattice for policies likely to update the frontier. To demonstrate the effectiveness of our approach, we perform an empirical analysis with the Adult dataset of the UCI Machine Learning Repository. We show that our approach can construct a frontier closer to optimal than competitive approaches by searching a smaller number of policies. In addition, we show that a frequently followed de-identification policy (i.e., the Safe Harbor standard of the HIPAA Privacy Rule) is suboptimal in comparison to the frontier discovered by our approach.
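
The idea of a frontier that is "updated" by newly searched policies can be illustrated with a minimal sketch. It assumes each policy has already been scored with a risk value and a utility-loss value, both lower-is-better; the labels, the scores, and the function names `dominates`, `update_frontier`, and `build_frontier` are hypothetical and do not come from the paper. The sketch covers only the dominance bookkeeping, not the probability-guided lattice search or the information-theoretic utility measure the abstract describes.

```python
from typing import List, Tuple

# A scored policy: (label, risk, utility_loss). Both scores are
# assumed to be "lower is better"; the values below are invented.
Policy = Tuple[str, float, float]


def dominates(a: Policy, b: Policy) -> bool:
    """True if a is no worse than b on both scores and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def update_frontier(frontier: List[Policy], candidate: Policy) -> List[Policy]:
    """Return the frontier after considering one candidate policy."""
    for p in frontier:
        same_scores = p[1] == candidate[1] and p[2] == candidate[2]
        if dominates(p, candidate) or same_scores:
            return frontier  # candidate adds nothing new
    # Drop every policy the candidate dominates, then add the candidate.
    kept = [p for p in frontier if not dominates(candidate, p)]
    kept.append(candidate)
    return sorted(kept, key=lambda p: p[1])  # order by increasing risk


def build_frontier(policies: List[Policy]) -> List[Policy]:
    """Fold a sequence of scored policies into a non-dominated set."""
    frontier: List[Policy] = []
    for p in policies:
        frontier = update_frontier(frontier, p)
    return frontier


# Illustrative scores only.
candidates = [
    ("A", 0.10, 0.80),
    ("B", 0.30, 0.40),
    ("C", 0.30, 0.60),  # same risk as B, more utility loss
    ("D", 0.50, 0.20),
    ("E", 0.60, 0.25),  # worse than D on both scores
]
frontier = build_frontier(candidates)
print([label for label, _, _ in frontier])
```

With these made-up scores, policies A, B, and D remain on the frontier, while C and E are discarded because another policy is at least as good on both measures. A search heuristic such as the one the abstract mentions would decide which lattice policies to score and pass to `update_frontier`; that part is not modeled here.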

Keywords:  De-identification; Experimentation; Management; Optimization; Policy; Privacy; Security

Year:  2013        PMID: 25520961      PMCID: PMC4266184          DOI: 10.1145/2435349.2435357

Source DB:  PubMed          Journal:  CODASPY


