
The disclosure of diagnosis codes can breach research participants' privacy.

Grigorios Loukides, Joshua C Denny, Bradley Malin.

Abstract

OBJECTIVE: De-identified clinical data in standardized form (eg, diagnosis codes), derived from electronic medical records, are increasingly combined with research data (eg, DNA sequences) and disseminated to enable scientific investigations. This study examines whether released data can be linked with identified clinical records that are accessible via various resources to jeopardize patients' anonymity, and the ability of popular privacy protection methodologies to prevent such an attack.
DESIGN: The study experimentally evaluates the re-identification risk of a de-identified sample of Vanderbilt's patient records involved in a genome-wide association study. It also measures the level of protection from re-identification, and data utility, provided by suppression and generalization.
MEASUREMENT: Privacy protection is quantified using the probability of re-identifying a patient in a larger population through diagnosis codes. Data utility is measured at a dataset level, using the percentage of retained information, as well as its description, and at a patient level, using two metrics based on the difference between the distribution of International Classification of Diseases (ICD) version 9 codes before and after applying privacy protection.
RESULTS: More than 96% of 2800 patients' records are shown to be uniquely identified by their diagnosis codes with respect to a population of 1.2 million patients. Generalization is shown to reduce the percentage of re-identified records by less than 2%, and over 99% of the three-digit ICD-9 codes need to be suppressed to prevent re-identification.
CONCLUSIONS: Popular privacy protection methods are inadequate to deliver a sufficiently protected and useful result when sharing data derived from complex clinical systems. The development of alternative privacy protection models is thus required.
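
The risk measure and the effect of generalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a released record matches a population record when the population record contains every released code, and that generalization means truncating an ICD-9 code to the part before the decimal point. The function names and the toy records are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable

Codes = FrozenSet[str]


def generalize(codes: Iterable[str]) -> Codes:
    """Truncate each ICD-9 code to its category, e.g. '250.01' -> '250'."""
    return frozenset(code.split(".")[0] for code in codes)


def reidentification_risk(released: Codes, population: Dict[str, Codes]) -> float:
    """Return 1 / (number of population records containing all released codes).

    Assumption: the attacker links on a superset match of diagnosis codes.
    A risk of 1.0 means the released record is unique in the population.
    """
    matches = sum(1 for codes in population.values() if released <= codes)
    return 1.0 / matches if matches else 0.0


def fraction_unique(sample: Dict[str, Codes], population: Dict[str, Codes]) -> float:
    """Share of sample records that match exactly one population record."""
    unique = sum(
        1 for codes in sample.values()
        if reidentification_risk(codes, population) == 1.0
    )
    return unique / len(sample)


# Hypothetical toy data, not taken from the study.
population = {
    "p1": frozenset({"250.01", "401.9"}),
    "p2": frozenset({"250.02", "401.9"}),
    "p3": frozenset({"493.90"}),
}
sample = {
    "s1": frozenset({"250.01", "401.9"}),
    "s2": frozenset({"493.90"}),
}

risk_before = fraction_unique(sample, population)

# Apply the same generalization to the released sample and the population.
gen_population = {pid: generalize(c) for pid, c in population.items()}
gen_sample = {sid: generalize(c) for sid, c in sample.items()}
risk_after = fraction_unique(gen_sample, gen_population)

print(risk_before, risk_after)
```

In this toy case generalization merges the two diabetes codes, so one of the two sample records stops being unique. The study reports that on real data the corresponding reduction was less than 2%.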

Year:  2010        PMID: 20442151      PMCID: PMC2995712          DOI: 10.1136/jamia.2009.002725

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497

