INTRODUCTION: Clinical researchers need to share data to support scientific validation and information reuse and to comply with a host of regulations and directives from funders. Various organizations are constructing informatics resources in the form of centralized databases to ensure reuse of data derived from sponsored research. The widespread use of such open databases is contingent on the protection of patient privacy.

METHODS: We review privacy-related problems associated with data sharing for clinical research from technical and policy perspectives. We investigate existing policies for secondary data sharing and privacy requirements in the context of data derived from research and clinical settings. In particular, we focus on policies specified by the US National Institutes of Health and the Health Insurance Portability and Accountability Act and touch on how these policies relate to current and future use of data stored in public database archives. We address aspects of data privacy and identifiability from a technical, although approachable, perspective and summarize how biomedical databanks can be exploited and how seemingly anonymous records can be reidentified using various resources without hacking into secure computer systems.

RESULTS: We highlight which clinical and translational data features, specified in emerging research models, are potentially vulnerable or exploitable. In the process, we recount a recent privacy-related concern, associated with the publication of aggregate statistics from pooled genome-wide association studies, that has had a significant impact on the data sharing policies of National Institutes of Health-sponsored databanks.

CONCLUSION: Based on our analysis and observations, we provide a list of recommendations that cover various technical, legal, and policy mechanisms that open clinical databases can adopt to strengthen data privacy protection as they move toward wider deployment and adoption.