Selecting Optimal Subset to release under Differentially Private M-estimators from Hybrid Datasets.

Meng Wang, Zhanglong Ji, Hyeon-Eui Kim, Shuang Wang, Li Xiong, Xiaoqian Jiang.

Abstract

Privacy concerns in data sharing, especially for health data, are receiving increasing attention. Some patients now agree to open their information for research use, which raises a new question: how can public information be used effectively to better understand a private dataset without breaching privacy? In this paper, we frame this question as selecting an optimal subset of the public dataset for M-estimators within the framework of differential privacy (DP) [1]. From the perspective of non-interactive learning, we first construct a weighted private density estimate from the hybrid datasets under DP. Following [2], we then analyze the accuracy of the DP M-estimators based on the hybrid datasets. Our main contributions are: (i) we find that the bias-variance tradeoff in the performance of our M-estimators can be characterized in terms of the sample size of the released dataset; and (ii) based on this finding, we develop an algorithm that selects the optimal subset of the public dataset to release under DP. Simulation studies and applications to real datasets confirm our findings and provide guidance for practical use.
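As a rough illustration of what a weighted private density estimate from hybrid data could look like, the sketch below blends a Laplace-perturbed histogram of the private sample with an unperturbed histogram of the public sample. It is not the construction used in the paper: the function names, the restriction to data in [0, 1], the equal-width bins, the noise scale of 2/epsilon, and the fixed mixing `weight` are all assumptions made for illustration only.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def bin_counts(samples, bins):
    # Equal-width histogram counts on [0, 1]; values outside are ignored.
    counts = [0] * bins
    for x in samples:
        if 0.0 <= x <= 1.0:
            counts[min(int(x * bins), bins - 1)] += 1
    return counts

def hybrid_private_density(private, public, bins, epsilon, weight, rng):
    """Hypothetical weighted histogram density from private and public data.

    Assumption: replacing one private record changes at most two bin counts
    by 1 each, so Laplace noise of scale 2/epsilon on the private counts
    gives epsilon-DP; the clipping and normalization are post-processing.
    """
    priv = bin_counts(private, bins)
    pub = bin_counts(public, bins)
    noisy = [max(c + laplace_noise(2.0 / epsilon, rng), 0.0) for c in priv]
    total_noisy, total_pub = sum(noisy), sum(pub)
    # Fall back to a uniform mass if all noisy counts were clipped to zero.
    priv_mass = ([c / total_noisy for c in noisy]
                 if total_noisy > 0 else [1.0 / bins] * bins)
    # With no usable public data, rely on the private estimate alone.
    pub_mass = [c / total_pub for c in pub] if total_pub > 0 else priv_mass
    # Bin width is 1/bins, so density height = probability mass * bins.
    return [bins * (weight * p + (1.0 - weight) * q)
            for p, q in zip(priv_mass, pub_mass)]

rng = random.Random(0)
private = [rng.betavariate(2, 5) for _ in range(500)]
public = [rng.betavariate(2, 5) for _ in range(200)]
density = hybrid_private_density(private, public, bins=10,
                                 epsilon=1.0, weight=0.5, rng=rng)
```

In this sketch, a larger `weight` leans on the noisy private histogram, while a smaller one leans on the noise-free public sample, which may not follow the private distribution exactly. That is one simple way to picture the kind of tradeoff the abstract describes, under the stated assumptions.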

Keywords:  M-estimators; differential privacy; hybrid datasets

Year:  2017        PMID: 30034201      PMCID: PMC6051552          DOI: 10.1109/TKDE.2017.2773545

Source DB:  PubMed          Journal:  IEEE Trans Knowl Data Eng        ISSN: 1041-4347            Impact factor:   6.977


  13 in total

1.  Methods for observational post-licensure medical product safety surveillance. (Review)

Authors:  Jennifer C Nelson; Andrea J Cook; Onchee Yu; Shanshan Zhao; Lisa A Jackson; Bruce M Psaty
Journal:  Stat Methods Med Res       Date:  2011-12-02       Impact factor: 3.021

2.  Sensible use of observational clinical data.

Authors:  J Marc Overhage; Lauren M Overhage
Journal:  Stat Methods Med Res       Date:  2011-08-09       Impact factor: 3.021

3.  Protecting confidentiality in small population health and environmental statistics.

Authors:  L H Cox
Journal:  Stat Med       Date:  1996 Sep 15-30       Impact factor: 2.373

4.  Early diagnosis of acute myocardial infarction using clinical and electrocardiographic data at presentation: derivation and evaluation of logistic regression models.

Authors:  R L Kennedy; A M Burton; H S Fraser; L N McStay; R F Harrison
Journal:  Eur Heart J       Date:  1996-08       Impact factor: 29.983

5.  Individual privacy versus public good: protecting confidentiality in health research.

Authors:  Christine M O'Keefe; Donald B Rubin
Journal:  Stat Med       Date:  2015-06-05       Impact factor: 2.373

6.  Privacy-preserving heterogeneous health data sharing.

Authors:  Noman Mohammed; Xiaoqian Jiang; Rui Chen; Benjamin C M Fung; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2012-12-13       Impact factor: 4.497

7.  Imputation of confidential data sets with spatial locations using disease mapping models.

Authors:  Thais Paiva; Avishek Chakraborty; Jerry Reiter; Alan Gelfand
Journal:  Stat Med       Date:  2014-01-07       Impact factor: 2.373

8.  Disclosure control using partially synthetic data for large-scale health surveys, with applications to CanCORS.

Authors:  Bronwyn Loong; Alan M Zaslavsky; Yulei He; David P Harrington
Journal:  Stat Med       Date:  2013-05-13       Impact factor: 2.373

9.  Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery.

Authors:  Yongan Zhao; Xiaofeng Wang; Xiaoqian Jiang; Lucila Ohno-Machado; Haixu Tang
Journal:  J Am Med Inform Assoc       Date:  2014-10-28       Impact factor: 4.497

10.  Differentially private distributed logistic regression using private and public data.

Authors:  Zhanglong Ji; Xiaoqian Jiang; Shuang Wang; Li Xiong; Lucila Ohno-Machado
Journal:  BMC Med Genomics       Date:  2014-05-08       Impact factor: 3.063

  1 in total

1.  Differential privacy in health research: A scoping review. (Review)

Authors:  Joseph Ficek; Wei Wang; Henian Chen; Getachew Dagne; Ellen Daley
Journal:  J Am Med Inform Assoc       Date:  2021-09-18       Impact factor: 7.942

