BACKGROUND: Multi-site health sciences research is becoming more common, as it enables investigation of rare outcomes and diseases and of new healthcare innovations. Multi-site research usually involves the transfer of large amounts of research data between collaborators, which increases the potential for accidental disclosure of protected health information (PHI). Standard protocols for preventing release of PHI are extremely vulnerable to human error, particularly when the shared data sets are large.

METHODS: To address this problem, we developed an automated program (SAS macro) to identify possible PHI in research data before it is transferred between research sites. The macro reviews all data in a designated directory to identify suspicious variable names and data patterns. It looks for variables that may contain personal identifiers such as medical record numbers and social security numbers. In addition, the macro identifies dates and numbers that may identify people who belong to small groups and who may therefore be identifiable even in the absence of traditional identifiers.

RESULTS: Evaluation of the macro on 100 sample research data sets indicated a recall of 0.98 and a precision of 0.81.

CONCLUSIONS: When implemented consistently, the macro has the potential to streamline the PHI review process and significantly reduce accidental PHI disclosures.
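The kind of screening described in METHODS can be illustrated with a minimal sketch. This is not the authors' SAS macro; it is a Python illustration under stated assumptions. The name fragments, the regular expressions, the `flag_columns` function, and the `small_group_threshold` parameter are all hypothetical choices made for this example, and the rules actually used by the macro are not given in the abstract.

```python
import re
from collections import Counter

# Hypothetical name fragments that hint at an identifier column
# (an assumption for illustration, not the macro's actual list).
SUSPICIOUS_NAME_PARTS = ("mrn", "ssn", "name", "birth", "dob", "address", "phone")

# Hypothetical value patterns: SSN-like strings and ISO-style dates.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def flag_columns(table, small_group_threshold=5):
    """Return {column_name: [reasons]} for columns that may contain PHI.

    `table` maps each column name to a list of string values.
    Columns with no reason to flag are left out of the result.
    """
    flags = {}
    for name, values in table.items():
        reasons = []

        # Check 1: the variable name itself looks like an identifier.
        lowered = name.lower()
        if any(part in lowered for part in SUSPICIOUS_NAME_PARTS):
            reasons.append("suspicious variable name")

        # Check 2: values match a known identifier or date pattern.
        if any(SSN_PATTERN.match(v) for v in values):
            reasons.append("SSN-like values")
        if any(DATE_PATTERN.match(v) for v in values):
            reasons.append("date values")

        # Check 3: some value is shared by fewer records than the threshold,
        # so the people in that group may be identifiable.
        counts = Counter(values)
        if counts and min(counts.values()) < small_group_threshold:
            reasons.append("small groups")

        if reasons:
            flags[name] = reasons
    return flags
```

A reviewer would still inspect every flagged column by hand; a sketch like this only narrows down where to look, which is consistent with the reported trade-off of high recall and lower precision.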