Bradley Malin (1), Kathleen Benitez, Daniel Masys. (1) Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, Tennessee 37203, USA. b.malin@vanderbilt.edu
Abstract
OBJECTIVE: Healthcare organizations must de-identify patient records before sharing data. Many organizations rely on the Safe Harbor Standard of the HIPAA Privacy Rule, which enumerates 18 identifiers that must be suppressed (eg, ages over 89). An alternative model in the Privacy Rule, known as the Statistical Standard, can facilitate the sharing of more detailed data, but is rarely applied because of a lack of published methodologies. The authors propose an intuitive approach to de-identifying patient demographics in accordance with the Statistical Standard.

DESIGN: The authors conduct an analysis of the demographics of patient cohorts in five medical centers developed for the NIH-sponsored Electronic Medical Records and Genomics network, with respect to the US census. They report the re-identification risk of patient demographics disclosed according to the Safe Harbor policy and the relative risk rate for sharing such information via alternative policies.

MEASUREMENTS: The re-identification risk of Safe Harbor demographics ranged from 0.01% to 0.19%. The findings show alternative de-identification models can be created with risks no greater than Safe Harbor. The authors illustrate that the disclosure of patient ages over the age of 89 is possible when other features are reduced in granularity.

LIMITATIONS: The de-identification approach described in this paper was evaluated with demographic data only and should be evaluated with other potential identifiers.

CONCLUSION: Alternative de-identification policies to the Safe Harbor model can be derived for patient demographics to enable the disclosure of values that were previously suppressed. The method is generalizable to any environment in which population statistics are available.
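The comparison described in the abstract, measuring the risk of a Safe Harbor disclosure against population statistics and then checking that an alternative policy carries no greater risk, can be illustrated with a minimal sketch. This is not the authors' published procedure. The toy population, the two policy functions, and the uniqueness-based risk measure (`reidentification_risk`) are all assumptions made here for illustration, and the numbers do not come from the US census or from the study.

```python
from collections import Counter

# Hypothetical toy population of (age, sex, region) tuples.
# These values are invented for illustration; they are not census data.
POPULATION = [
    (34, "F", "A"), (34, "F", "A"), (34, "M", "A"), (34, "M", "B"),
    (91, "F", "A"), (93, "F", "A"), (91, "F", "B"), (93, "M", "B"),
    (91, "M", "A"), (93, "M", "A"),
]


def safe_harbor_like(record):
    """Assumed stand-in for Safe Harbor: top-code ages over 89, keep the rest."""
    age, sex, region = record
    return ("90+" if age > 89 else age, sex, region)


def alternative(record):
    """Assumed alternative policy: keep exact age, but suppress region."""
    age, sex, _region = record
    return (age, sex, "*")


def reidentification_risk(cohort, population, policy, threshold=1):
    """Fraction of cohort records whose generalized values are shared by
    at most `threshold` people in the population (1 means unique)."""
    group_sizes = Counter(policy(person) for person in population)
    at_risk = sum(1 for rec in cohort if group_sizes[policy(rec)] <= threshold)
    return at_risk / len(cohort)


if __name__ == "__main__":
    cohort = POPULATION  # for simplicity, treat everyone as a cohort member
    baseline = reidentification_risk(cohort, POPULATION, safe_harbor_like)
    candidate = reidentification_risk(cohort, POPULATION, alternative)
    print(f"Safe Harbor-like risk: {baseline:.2f}")
    print(f"Alternative risk:      {candidate:.2f}")
    print("Acceptable under this sketch:", candidate <= baseline)
```

In this toy setting the alternative policy discloses exact ages over 89 yet leaves fewer records unique, because the region field has been reduced in granularity. A real analysis would substitute actual population counts and a risk measure justified for the data being shared.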