Khaled El Emam, Sam Jabbouri, Scott Sams, Youenn Drouet, Michael Power.
Abstract
BACKGROUND: With the growing adoption of electronic medical records, there are increasing demands for the use of this electronic clinical data in observational research. A frequent ethics board requirement for such secondary use of personal health information in observational research is that the data be de-identified. De-identification heuristics are provided in the Health Insurance Portability and Accountability Act Privacy Rule, funding agency and professional association privacy guidelines, and common practice.
Year: 2006 PMID: 17213047 PMCID: PMC1794009 DOI: 10.2196/jmir.8.4.e28
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Illustration of record linkage of a research database and another identification database
Figure 2. The relationship between the research database, the identification database, and a hypothetical population database
Figure 3. A sample identification database (shown shaded) for data intrusion simulation
Figure 4. The three main source databases used to construct an identification database for a professional subpopulation
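The record linkage that Figure 1 illustrates can be sketched as a join on shared quasi-identifiers. This is a minimal illustration, not the authors' procedure: the field names (`gender`, `year_of_birth`, `region`), the helper `link_records`, and all records below are hypothetical and invented for the example.

```python
from collections import defaultdict

# Assumed quasi-identifier field names; the source does not specify a schema.
QUASI_IDENTIFIERS = ("gender", "year_of_birth", "region")


def link_records(research_db, identification_db, keys=QUASI_IDENTIFIERS):
    """Map each research record index to the names of identification
    records that share all of its quasi-identifier values."""
    index = defaultdict(list)
    for person in identification_db:
        index[tuple(person[k] for k in keys)].append(person["name"])
    return {
        i: index.get(tuple(record[k] for k in keys), [])
        for i, record in enumerate(research_db)
    }


# Made-up toy data: a "de-identified" research database ...
research_db = [
    {"gender": "F", "year_of_birth": 1970, "region": "K1", "diagnosis": "A"},
    {"gender": "M", "year_of_birth": 1965, "region": "M5", "diagnosis": "B"},
]
# ... and an identification database that carries names.
identification_db = [
    {"name": "Person 1", "gender": "F", "year_of_birth": 1970, "region": "K1"},
    {"name": "Person 2", "gender": "M", "year_of_birth": 1965, "region": "M5"},
    {"name": "Person 3", "gender": "M", "year_of_birth": 1965, "region": "M5"},
]

matches = link_records(research_db, identification_db)
for i, names in matches.items():
    print(i, names)
```

In this toy run the first research record matches exactly one named person, which is the situation a de-identification heuristic tries to prevent; the second matches two candidates and so remains ambiguous.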
Ability to get various data elements on physicians and lawyers, with the source of the data (n = 236 for CPSO; n = 189 for LSUC)
| Data element (source) | Physicians (CPSO), % | Lawyers (LSUC), % |
| home postal codes (source: PPSR and telephone directory) | 60 | 45 |
| practice/firm postal codes (source: CPSO/LSUC) | 100 | 100 |
| date of birth (source: PPSR) | 40 | 45 |
| gender (source: CPSO/genderizer for LSUC data) | 100 | 100 |
| initials (source: CPSO/LSUC) | 100 | 100 |
| date of birth (DoB) | forward sortation area |
| DoB – month and year | city |
| year of birth | region |
| gender | initials |
| postal code | |
Percentage of time a quasi-identifier or combination of quasi-identifiers was considered “safe” more than 50% of the time (as sample sizes were varied from 30 to the maximum)
| gender | 100 | 100 |
| region | 93 | 65 |
| DOB – year | 94 | 85 |
| gender + region | 85 | 82 |
| gender + DOB – year | 80 | – |
Results of evaluating the accuracy of various tools for predicting gender from first names
| Precision | 0.988 | 0.989 |
| Recall | 0.818 | 0.80 |
| F-measure | 0.89 | 0.88 |
| Precision | 0.98 | 0.99 |
| Recall | 0.82 | 0.79 |
| F-measure | 0.89 | 0.88 |
| Precision | 0.98 | 0.98 |
| Recall | 0.9 | 0.87 |
| F-measure | 0.94 | 0.93 |
| Precision | 0.988 | 0.997 |
| Recall | 0.78 | 0.77 |
| F-measure | 0.87 | 0.87 |
| Precision | 0.98 | 0.996 |
| Recall | 0.77 | 0.78 |
| F-measure | 0.86 | 0.88 |
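The F-measure rows above are consistent with the standard balanced F-measure, the harmonic mean of precision and recall, F = 2PR / (P + R). The source does not state which variant was used, so the balanced form is an assumption here, and the helper name `f_measure` is hypothetical. A minimal sketch, checked against one precision/recall pair from the table:

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumes the balanced (F1) variant; returns 0.0 when both inputs are 0
    to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 0.98 and recall 0.9 are one pair reported in the table.
score = f_measure(0.98, 0.9)
print(round(score, 2))  # prints 0.94, matching the reported F-measure
```

Because the table reports precision and recall already rounded, recomputing other rows this way can differ from the published F-measure by 0.01.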
| British Columbia | |
| Alberta | Available from authorized registry agents |
| Saskatchewan | |
| Manitoba | |
| Ontario | |
| Quebec | |
| New Brunswick | |
| Nova Scotia | |
| Prince Edward Island | |
| Newfoundland and Labrador | |
| Northwest Territories | |
| Nunavut | |