Khaled El Emam, David Buckeridge, Robyn Tamblyn, Angelica Neisa, Elizabeth Jonker, Aman Verma.
Abstract
BACKGROUND: The public is less willing to allow their personal health information to be disclosed for research purposes if they do not trust researchers and how researchers manage their data. However, the public is more comfortable with their data being used for research if the risk of re-identification is low. There are few studies on the risk of re-identification of Canadians from their basic demographics, and no studies on their risk from their longitudinal data. Our objective was to estimate the risk of re-identification from the basic cross-sectional and longitudinal demographics of Canadians.
Year: 2011 PMID: 21696636 PMCID: PMC3151203 DOI: 10.1186/1472-6947-11-46
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Figure 1. Example of data sets matched against a population registry with uniques.
Figure 2. Estimated uniqueness results for different levels of generalization on the three demographic variables for 1 year, 2 years, 5 years, and 11 years.
Figure 3. The maximum and minimum uniqueness values had we retained the records with data entry errors.
A comparison of postal code population sizes among Montreal, Ottawa, and Toronto.
| Ottawa | Ottawa | Montreal | Montreal | Toronto | Toronto |
|---|---|---|---|---|---|
| 362,211 | 362,211 | 1,870,336 | 1,870,336 | 699,936 | 699,936 |
| 181,106 | 181,106 | 267,191 | 273,240 | 116,656 | 88,401 |
| 16,464 | 16,368 | 19,792 | 19,602 | 12,498 | 10,892 |
| 3,853 | 4,016 | 5,616 | 4,954 | 3,910 | 2,286 |
| 289 | 220 | 351 | 264 | 324 | 228 |
| 40 | 23 | 48 | 30 | 51 | 23 |
Figure 4. Distribution of population sizes in postal codes for all ten provinces categorized by urban vs. rural.
The difference between real uniqueness and uniqueness estimated under the uniform distribution assumption.
| Year | | % Uniques | % Uniques |
|---|---|---|---|
| 2004 | 113,220 | 97.8% | 93.9% |
| 2005 | 120,803 | 97.5% | 94.0% |
| 2006 | 125,724 | 97.6% | 94.0% |
| 2007 | 136,980 | 98.0% | 94.5% |
| 2008 | 139,278 | 98.2% | 94.7% |
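
To make the comparison in the table above concrete, the following is a minimal sketch of how the two kinds of uniqueness could be computed. It is not the authors' procedure. It assumes that "real" uniqueness is the fraction of records whose combination of demographic values occurs exactly once, and that the uniform assumption means records fall independently and uniformly into equally likely cells. The function names and the toy records are hypothetical.

```python
from collections import Counter


def observed_uniqueness(records):
    """Fraction of records whose quasi-identifier tuple occurs exactly once."""
    if not records:
        return 0.0
    counts = Counter(records)
    uniques = sum(1 for r in records if counts[r] == 1)
    return uniques / len(records)


def uniform_expected_uniqueness(n_records, n_cells):
    """Expected fraction of unique records if n_records fall independently
    and uniformly into n_cells equally likely cells (an assumption)."""
    if n_records == 0:
        return 0.0
    # A given record is unique when none of the other n_records - 1
    # records lands in the same cell.
    return (1.0 - 1.0 / n_cells) ** (n_records - 1)


# Hypothetical toy records: (postal code prefix, year of birth, gender).
toy = [
    ("K1A", 1970, "F"),
    ("K1A", 1970, "F"),
    ("K1A", 1982, "M"),
    ("H2X", 1965, "M"),
]

observed = observed_uniqueness(toy)
expected = uniform_expected_uniqueness(len(toy), n_cells=8)
print(f"observed: {observed:.3f}, uniform estimate: {expected:.3f}")
```

On real data the two figures can diverge because people are not spread evenly across postal codes or birth years.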