
Effects of data anonymization by cell suppression on descriptive statistics and predictive modeling performance.

L Ohno-Machado, S A Vinterbo, S Dreiseitl.

Abstract

Protecting individual data in disclosed databases is essential. Data anonymization strategies can produce table ambiguation by suppression of selected cells. Using table ambiguation, different degrees of anonymization can be achieved, depending on the number of individuals that a particular case must become indistinguishable from. This number defines the level of anonymization. Anonymization by cell suppression does not necessarily prevent inferences from being made from the disclosed data. Preventing inferences may be important to preserve confidentiality. We show that anonymized data sets can preserve descriptive characteristics of the data, but might also be used for making inferences on particular individuals, which is a feature that may not be desirable. The degradation of predictive performance is directly proportional to the degree of anonymity. As an example, we report the effect of anonymization on the predictive performance of a model constructed to estimate the probability of disease given clinical findings.
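The notion of an anonymization level described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes that a suppressed cell is stored as `None`, that two cases are indistinguishable when no unsuppressed column tells them apart, and that the level of a table is the smallest number of other cases any single case can be confused with. The names `compatible` and `anonymity_level` and the toy records are hypothetical.

```python
from typing import Optional, Sequence

# Assumed convention: None marks a suppressed cell.
Row = Sequence[Optional[str]]


def compatible(a: Row, b: Row) -> bool:
    """Return True if no unsuppressed column tells the two rows apart."""
    return all(x is None or y is None or x == y for x, y in zip(a, b))


def anonymity_level(rows: Sequence[Row]) -> int:
    """Smallest number of *other* rows that any single row is compatible with."""
    return min(
        sum(compatible(row, other) for j, other in enumerate(rows) if j != i)
        for i, row in enumerate(rows)
    )


# Toy table (age, sex, finding); the values are invented for illustration.
original = [
    ("34", "F", "chest pain"),
    ("34", "M", "chest pain"),
    ("51", "M", "nausea"),
    ("51", "M", "chest pain"),
]

# The same table after suppressing four cells.
suppressed = [
    ("34", None, "chest pain"),
    ("34", None, "chest pain"),
    ("51", "M", None),
    ("51", "M", None),
]

print(anonymity_level(original))    # every case is unique
print(anonymity_level(suppressed))  # each case now has one look-alike
```

In this toy table, suppressing four cells raises the level from 0 to 1. The same four cells are no longer available to a predictive model, which is the trade-off the abstract examines.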

Year:  2001        PMID: 11825239      PMCID: PMC2243599     

Source DB:  PubMed          Journal:  Proc AMIA Symp        ISSN: 1531-605X


References: 7 in total

1.  A genetic algorithm to select variables in logistic regression: example in the domain of myocardial infarction.

Authors:  S Vinterbo; L Ohno-Machado
Journal:  Proc AMIA Symp       Date:  1999

2.  Evaluating variable selection methods for diagnosis of myocardial infarction.

Authors:  S Dreiseitl; L Ohno-Machado; S Vinterbo
Journal:  Proc AMIA Symp       Date:  1999

3.  Using Boolean reasoning to anonymize databases.

Authors:  A Ohrn; L Ohno-Machado
Journal:  Artif Intell Med       Date:  1999-03       Impact factor: 5.326

4.  Improving machine learning performance by removing redundant cases in medical data sets.

Authors:  L Ohno-Machado; H S Fraser; A Ohrn
Journal:  Proc AMIA Symp       Date:  1998

5.  Early diagnosis of acute myocardial infarction using clinical and electrocardiographic data at presentation: derivation and evaluation of logistic regression models.

Authors:  R L Kennedy; A M Burton; H S Fraser; L N McStay; R F Harrison
Journal:  Eur Heart J       Date:  1996-08       Impact factor: 29.983

6.  Driving toward guiding principles: a goal for privacy, confidentiality, and security of health information. (Review)

Authors:  S A Buckovich; H E Rippen; M J Rozen
Journal:  J Am Med Inform Assoc       Date:  1999 Mar-Apr       Impact factor: 4.497

7.  The Internet and electronic transmission of medical records.

Authors:  S G Campbell; G L Gibby; S Collingwood
Journal:  J Clin Monit       Date:  1997-09
Cited by: 5 in total

1.  A security architecture for query tools used to access large biomedical databases.

Authors:  Shawn N Murphy; Henry C Chueh
Journal:  Proc AMIA Symp       Date:  2002

2.  Strategies for maintaining patient privacy in i2b2.

Authors:  Shawn N Murphy; Vivian Gainer; Michael Mendis; Susanne Churchill; Isaac Kohane
Journal:  J Am Med Inform Assoc       Date:  2011-10-07       Impact factor: 4.497

3.  Protecting privacy using k-anonymity.

Authors:  Khaled El Emam; Fida Kamal Dankar
Journal:  J Am Med Inform Assoc       Date:  2008-06-25       Impact factor: 4.497

4.  Privacy technology to support data sharing for comparative effectiveness research: a systematic review. (Review)

Authors:  Xiaoqian Jiang; Anand D Sarwate; Lucila Ohno-Machado
Journal:  Med Care       Date:  2013-08       Impact factor: 2.983

5.  Reconsidering Anonymization-Related Concepts and the Term "Identification" Against the Backdrop of the European Legal Framework.

Authors:  Murat Sariyar; Irene Schlünder
Journal:  Biopreserv Biobank       Date:  2016-04-22       Impact factor: 2.300

