George Hripcsak (1), Parsa Mirhaji (2), Alexander F H Low (3), Bradley A Malin (4, 5).
(1) Department of Biomedical Informatics, Columbia University Medical Center, New York, NY 10032, USA. hripcsak@columbia.edu
(2) Montefiore Medical Center/Albert Einstein College of Medicine, Bronx, New York, NY 10461, USA.
(3) Department of Healthcare Policy and Research, Weill Cornell Medical College, New York, NY 10065, USA.
(4) Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, TN 37203, USA.
(5) Department of Electrical Engineering and Computer Science, School of Engineering, Vanderbilt University, Nashville, TN 37203, USA.
Abstract
OBJECTIVE: Maintaining patient privacy is a challenge in large-scale observational research. To help reduce the risk of identifying study subjects through publicly available data, we introduce a method for obscuring date information for clinical events and patient characteristics.

METHODS: The method, which we call Shift and Truncate (SANT), obscures date information to any desired granularity. SANT first assigns each patient a random shift value, and every date in that patient's record is shifted by that amount. Data are then truncated from the beginning and end of the data set.

RESULTS: The resulting data set can be proven not to disclose temporal information finer than the chosen granularity. Unlike previous strategies such as a simple shift, SANT remains robust to frequent (even daily) updates and resists inference of dates at the beginning and end of date-shifted data sets. Time-of-day may be retained or obscured, depending on the goal and the anticipated knowledge of the data recipient.

CONCLUSIONS: The method can be useful as a scientific approach for reducing re-identification risk under the Privacy Rule of the Health Insurance Portability and Accountability Act and may contribute to qualification for the Safe Harbor implementation.
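The two steps described in METHODS can be illustrated with a minimal sketch. It is not the authors' published algorithm: the abstract does not state the shift distribution or the exact truncation bounds, so the sketch assumes a uniform backward shift of 0 to `window_days - 1` days per patient and keeps only shifted dates that any patient could have produced regardless of their shift. The function name `shift_and_truncate` and all parameter names are hypothetical.

```python
import random
from datetime import date, timedelta


def shift_and_truncate(records, data_start, data_end, window_days, seed=None):
    """Sketch of a shift-then-truncate date obfuscation.

    records: iterable of (patient_id, event_date) pairs.
    data_start, data_end: first and last real dates covered by the data set.
    window_days: desired granularity in days (assumption: shifts are drawn
        uniformly from 0..window_days-1 and applied backward in time).
    Returns (shifted_records, shifts), where shifts maps patient_id to the
    secret per-patient offset in days.
    """
    rng = random.Random(seed)
    shifts = {}

    # Assumed truncation bounds: keep only the interval that every patient
    # could populate whatever their shift, so that the first and last
    # visible dates of a record do not reveal how far it was moved.
    keep_from = data_start
    keep_to = data_end - timedelta(days=window_days - 1)

    shifted_records = []
    for patient_id, event_date in records:
        # One shift per patient, reused for all of that patient's dates,
        # so intervals between a patient's events are preserved.
        if patient_id not in shifts:
            shifts[patient_id] = rng.randrange(window_days)
        shifted = event_date - timedelta(days=shifts[patient_id])
        if keep_from <= shifted <= keep_to:
            shifted_records.append((patient_id, shifted))
    return shifted_records, shifts
```

Because a single offset is applied to all of a patient's dates, the intervals between that patient's events are unchanged, while events near the edges of the data set are dropped instead of exposing the shift. In a real deployment the per-patient shifts would need to be stored securely and reused across data refreshes, which this sketch does not address.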