Kazuki Yoshida (1, 2), Susan Gruber (3), Bruce H Fireman (4), Sengwee Toh (3).
1. Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
2. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
3. Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA, USA.
4. Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA.
Abstract
PURPOSE: Privacy-protecting analytic and data-sharing methods that minimize the disclosure risk of sensitive information are increasingly important because of growing interest in using data from multiple sources. We conducted a simulation study to examine how avoiding the sharing of individual-level data in a distributed data network can affect analytic results.

METHODS: The base scenario had four sites of varying sizes with 5% outcome incidence, 50% treatment prevalence, and seven confounders. We varied treatment prevalence, outcome incidence, treatment effect, site size, number of sites, and covariate distribution. Confounding adjustment was conducted using propensity scores or disease risk scores. We compared analyses of three types of aggregate-level data requested from sites (risk-set, summary-table, or effect-estimate data, the last analyzed by meta-analysis) with the benchmark results from analysis of pooled individual-level data. We assessed the bias and precision of hazard ratio estimates as well as the accuracy of standard error (SE) estimates.

RESULTS: All the aggregate-level data-sharing approaches, regardless of confounding adjustment method, successfully approximated the pooled individual-level data analysis in most simulation scenarios. Meta-analysis showed minor bias when using inverse probability of treatment weights (IPTW) in infrequent-exposure (5%), rare-outcome (0.01%), and small-site (5,000 patients) settings. SE estimates became less accurate for the IPTW risk-set approach with less frequent exposure and for the propensity score-matching meta-analysis approach with rare outcomes.

CONCLUSIONS: Overall, we found that individual-level data sharing can be avoided while still obtaining valid results in many settings, although care must be taken with the meta-analysis approach in infrequent-exposure and rare-outcome scenarios, particularly when confounding adjustment is performed with IPTW.
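To make the effect-estimate (meta-analysis) approach concrete, the following is a minimal sketch of how site-specific log hazard ratios and their standard errors could be combined without any individual-level data leaving a site. It assumes fixed-effect inverse-variance weighting, which the abstract does not specify; the function name `pool_fixed_effect` and all numeric inputs are hypothetical illustrations, not values or code from the study.

```python
import math


def pool_fixed_effect(log_hrs, ses):
    """Inverse-variance (fixed-effect) pooling of site-level log hazard ratios.

    Assumption: fixed-effect weighting is used here for illustration only;
    the abstract does not state which meta-analytic model was applied.
    """
    if not log_hrs or len(log_hrs) != len(ses):
        raise ValueError("log_hrs and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each site's weight is the inverse of its estimated variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    # Weighted average of the log hazard ratios and its standard error.
    pooled_log_hr = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_log_hr, pooled_se


# Hypothetical estimates from four sites (made-up numbers, not study results).
site_log_hrs = [math.log(0.80), math.log(0.90), math.log(0.75), math.log(0.85)]
site_ses = [0.10, 0.15, 0.20, 0.12]

pooled_log_hr, pooled_se = pool_fixed_effect(site_log_hrs, site_ses)
pooled_hr = math.exp(pooled_log_hr)
ci_lower = math.exp(pooled_log_hr - 1.96 * pooled_se)
ci_upper = math.exp(pooled_log_hr + 1.96 * pooled_se)

print(f"Pooled HR: {pooled_hr:.3f} (95% CI {ci_lower:.3f} to {ci_upper:.3f})")
```

Each site only has to transmit two numbers (an estimate and its SE), which is why this approach discloses the least information of the three. The weights depend on the site-level SEs, so a sketch like this is only as reliable as those SEs, which is consistent with the abstract's caution about infrequent-exposure and rare-outcome settings.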