Abstract
Electronic health records are increasingly being linked to DNA repositories and used as a source of clinical information for genomic research. Privacy legislation in many jurisdictions, and most research ethics boards, require that either personal health information is de-identified or that patient consent or authorization is sought before the data are disclosed for secondary purposes. Here, I discuss how de-identification has been applied in current genomic research projects. Recent metrics and methods that can be used to ensure that the risk of re-identification is low and that disclosures are compliant with privacy legislation and regulations (such as the Health Insurance Portability and Accountability Act Privacy Rule) are reviewed. Although these methods can protect against the known approaches for re-identification, residual risks and specific challenges for genomic research are also discussed.
Year: 2011 PMID: 21542889 PMCID: PMC3129641 DOI: 10.1186/gm239
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 11.117
Summary of de-identification methods for individual-level data
| De-identification method | Techniques | Details |
|---|---|---|
| Masking (applied to direct identifiers) | Suppression/redaction | Direct identifiers are removed from the data or replaced with tags |
| Masking (applied to direct identifiers) | Random replacement/randomization | Direct identifiers are replaced with randomly chosen values (for example, for names and medical record numbers) |
| Masking (applied to direct identifiers) | Pseudonymization | Unique numbers that are not reversible replace direct identifiers |
| Generalization (applied to quasi-identifiers) | Hierarchy-based generalization | Generalization is based on a predefined hierarchy describing how precision on quasi-identifiers is reduced |
| Generalization (applied to quasi-identifiers) | Cluster-based generalization | Individual transactions are empirically grouped or based on pre-defined utility policies |
| Suppression (applied to records flagged for suppression) | Casewise deletion | The full record is deleted |
| Suppression (applied to records flagged for suppression) | Quasi-identifier deletion | Only the quasi-identifiers are deleted |
| Suppression (applied to records flagged for suppression) | Local cell suppression | Optimization scheme is applied to the quasi-identifiers to suppress the fewest values but ensure a re-identification probability below the threshold |
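
As a rough illustration of how the three method families in the table can be chained, the following Python sketch applies pseudonymization, hierarchy-based generalization, and casewise deletion to a few toy records. Everything specific in it is an assumption made for illustration and does not come from the article: the field names (`mrn`, `age`, `postal_code`, `dx`), the generalization hierarchy (10-year age bands, three-character postal prefixes), the 0.5 threshold, and the use of 1 divided by the equivalence class size as the re-identification probability of a record.

```python
import secrets
from collections import Counter

# Hypothetical toy records; field names and values are illustrative only.
RECORDS = [
    {"mrn": "A100", "age": 34, "postal_code": "K1H8L1", "dx": "asthma"},
    {"mrn": "A101", "age": 37, "postal_code": "K1H7W9", "dx": "diabetes"},
    {"mrn": "A102", "age": 31, "postal_code": "K1H5B2", "dx": "asthma"},
    {"mrn": "A103", "age": 68, "postal_code": "M5V2T6", "dx": "copd"},
]


def pseudonymize(records, id_field="mrn"):
    """Masking: replace a direct identifier with a random pseudonym."""
    # The mapping lives only inside this call, so the pseudonyms
    # cannot be reversed once the function returns.
    mapping = {}
    masked = []
    for rec in records:
        pseudonym = mapping.setdefault(rec[id_field], secrets.token_hex(8))
        masked.append({**rec, id_field: pseudonym})
    return masked


def generalize(rec):
    """Generalization: reduce precision on the quasi-identifiers.

    Assumed hierarchy: exact age -> 10-year band,
    full postal code -> first three characters.
    """
    low = (rec["age"] // 10) * 10
    return {
        **rec,
        "age": f"{low}-{low + 9}",
        "postal_code": rec["postal_code"][:3],
    }


def casewise_delete(records, quasi_ids=("age", "postal_code"), threshold=0.5):
    """Suppression: delete full records whose estimated risk is too high.

    Assumed metric: a record's re-identification probability is
    1 / (number of records sharing its quasi-identifier values).
    """
    def key(rec):
        return tuple(rec[q] for q in quasi_ids)

    class_sizes = Counter(key(rec) for rec in records)
    return [rec for rec in records if 1 / class_sizes[key(rec)] <= threshold]


def deidentify(records):
    """Mask, then generalize, then suppress what is still too risky."""
    masked = pseudonymize(records)
    generalized = [generalize(rec) for rec in masked]
    return casewise_delete(generalized)


released = deidentify(RECORDS)
for rec in released:
    print(rec)
```

With these toy records, the three patients in their thirties fall into one equivalence class of size 3 and are released, while the single 68-year-old forms a class of size 1 and is deleted. Quasi-identifier deletion or local cell suppression would keep more of that record at the cost of a more involved procedure.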