Measuring the impact of spatial perturbations on the relationship between data privacy and validity of descriptive statistics.

Kelly Broen, Rob Trangucci, Jon Zelner.

Abstract

BACKGROUND: Like many scientific fields, epidemiology is addressing issues of research reproducibility. Spatial epidemiology, which often uses the inherently identifiable variable of participant address, must balance reproducibility with participant privacy. In this study, we assess the impact of several different data perturbation methods on key spatial statistics and patient privacy.
METHODS: We analyzed the impact of perturbation on spatial patterns in the full set of address-level mortality data from Lawrence, MA during the period from 1911 to 1913. The original death locations were perturbed using seven different published approaches to stochastic and deterministic spatial data anonymization. Key spatial descriptive statistics were calculated for each perturbation, including changes in spatial pattern center, Global Moran's I, Local Moran's I, distance to the k-th nearest neighbors, and the L-function (a normalized form of Ripley's K). A spatially adapted form of k-anonymity was used to measure the privacy protection conferred by each method, and its compliance with HIPAA and GDPR privacy standards.
RESULTS: Random perturbation at 50 m, donut masking between 5 and 50 m, and Voronoi masking maintain the validity of descriptive spatial statistics better than other perturbations. Grid center masking with both 100 × 100 and 250 × 250 m cells led to large changes in descriptive spatial statistics. None of the perturbation methods adhered to the HIPAA standard that all points have a k-anonymity > 10. All other perturbation methods employed had at least 265 points, or over 6%, not adhering to the HIPAA standard.
CONCLUSIONS: With the set of published perturbation methods applied in this analysis, HIPAA- and GDPR-compliant de-identification was not compatible with maintaining key spatial patterns as measured by our chosen summary statistics. Further research should investigate alternative methods for balancing the tradeoff between spatial data privacy and the preservation of key patterns in public health data that are of scientific and medical importance.
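
ILLUSTRATION: The sketch below is not the authors' code. It is a minimal example, under stated assumptions, of two of the masking families named in METHODS (donut masking and grid center masking) and of one common way to estimate spatial k-anonymity. The function names, the choice to sample the donut uniformly by area, and the k-anonymity estimate (the count of residences lying at least as close to the masked point as the true location) are all assumptions made for illustration. Coordinates are assumed to be projected, in metres.

```python
import math
import random


def donut_mask(x, y, r_min, r_max, rng):
    """Displace a point by a random distance in [r_min, r_max] in a random direction.

    Assumption: the radius is drawn so that masked points are uniform over the
    area of the annulus; published variants may weight the distance differently.
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
    return x + radius * math.cos(angle), y + radius * math.sin(angle)


def grid_center_mask(x, y, cell):
    """Snap a point to the centre of the square grid cell that contains it."""
    col = math.floor(x / cell)
    row = math.floor(y / cell)
    return (col + 0.5) * cell, (row + 0.5) * cell


def spatial_k_anonymity(original, masked, residences):
    """Estimate k as the number of residences within the displacement distance
    of the masked point (hypothetical helper, one of several possible definitions).
    """
    radius = math.dist(original, masked)
    return sum(1 for p in residences if math.dist(p, masked) <= radius)
```

For example, `donut_mask(0.0, 0.0, 5.0, 50.0, random.Random(1))` returns a point between 5 and 50 m from the origin, and `grid_center_mask(130, 270, 100)` moves a point to the centre of its 100 × 100 m cell. A masked point would meet the threshold quoted in RESULTS only if `spatial_k_anonymity` exceeded 10 for it.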

Keywords:  Geomasking; Privacy; Reproducibility; Spatial anonymity

Year: 2021; PMID: 33413390; PMCID: PMC7788553; DOI: 10.1186/s12942-020-00256-8

Source DB: PubMed; Journal: Int J Health Geogr; ISSN: 1476-072X; Impact factor: 3.918


