OBJECTIVE: Record linkage to integrate uncoordinated databases is critical in biomedical research using Big Data. Balancing privacy protection against the need for high-quality record linkage requires a human-machine hybrid system to safely manage uncertainty in the ever-changing streams of chaotic Big Data.
METHODS: In the computer science literature, private record linkage is the most published area. It investigates how to apply a known linkage function safely when linking two tables. In practice, however, the linkage function is rarely known. As a result, many data linkage centers exist whose main role is to act as the trusted third party that determines the linkage function manually and links data for research via a master population list for a designated region. Recently, a more flexible computerized third-party linkage platform, Secure Decoupled Linkage (SDLink), has been proposed based on: (1) decoupling data via encryption; (2) obfuscation via chaffing (adding fake data) and universe manipulation; and (3) minimum information disclosure via recoding.
RESULTS: We synthesize this literature to formalize a new framework for privacy-preserving interactive record linkage (PPIRL) with tractable privacy and utility properties, and then analyze the literature using this framework.
CONCLUSIONS: Human-based third-party linkage centers for privacy-preserving record linkage are the accepted norm internationally. We find that a computer-based third-party platform that can precisely control the information disclosed at the micro level and allow frequent human interaction during the linkage process is an effective human-machine hybrid system that significantly improves on the linkage center model in terms of both privacy and utility.
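The three mechanisms named in METHODS (decoupling, chaffing, and recoding) can be illustrated with a toy sketch. This is not the SDLink implementation: the function names `decouple`, `add_chaff`, and `recode_pair`, the record fields, and the "same"/"different" recoding are all hypothetical, and random tokens stand in for the encryption step the abstract mentions. Universe manipulation is not shown.

```python
import random
import secrets


def decouple(records, id_fields):
    """Split each record into an identifying row and a sensitive row.

    The two parts share only a random token (an assumption standing in for
    the encryption-based decoupling described in the abstract).
    """
    identifying, sensitive = [], {}
    for rec in records:
        token = secrets.token_hex(8)
        row = {"token": token}
        row.update({f: rec[f] for f in id_fields})
        identifying.append(row)
        sensitive[token] = {k: v for k, v in rec.items() if k not in id_fields}
    return identifying, sensitive


def add_chaff(identifying, fake_rows, rng):
    """Mix fabricated rows into the identifying table and shuffle the result.

    Returns the mixed table and the set of chaff tokens, so that fake rows
    can be discarded after linkage.
    """
    chaff_tokens = set()
    mixed = list(identifying)
    for fake in fake_rows:
        token = secrets.token_hex(8)
        chaff_tokens.add(token)
        row = {"token": token}
        row.update(fake)
        mixed.append(row)
    rng.shuffle(mixed)
    return mixed, chaff_tokens


def recode_pair(a, b, fields):
    """Disclose only whether each field agrees, never the raw values."""
    return {f: ("same" if a[f] == b[f] else "different") for f in fields}


# Hypothetical example data.
records = [
    {"name": "Ann Lee", "dob": "1980-01-02", "diagnosis": "asthma"},
    {"name": "Anne Lee", "dob": "1980-01-02", "diagnosis": "asthma"},
]
identifying, sensitive = decouple(records, ["name", "dob"])
mixed, chaff_tokens = add_chaff(
    identifying, [{"name": "Bo Chan", "dob": "1975-06-30"}], random.Random(0)
)
view = recode_pair(identifying[0], identifying[1], ["name", "dob"])
print(view)  # {'name': 'different', 'dob': 'same'}
```

In this sketch, a person reviewing an uncertain pair would see only the recoded view, while the sensitive attributes remain in a separate table keyed by tokens.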
Keywords:
Electronic Health Records (EHR); decoupled data; entity resolution; medical record linkage; privacy; privacy preserving interactive record linkage (PPIRL)