Janice Branson, Nathan Good, Jung-Wei Chen, Will Monge, Christian Probst, Khaled El Emam.
Abstract
BACKGROUND: Regulatory agencies, such as the European Medicines Agency and Health Canada, are requiring the public sharing of clinical trial reports that are used to make drug approval decisions. Both agencies have provided guidance for the quantitative anonymization of these clinical reports before they are shared. There is limited empirical information on the effectiveness of this approach in protecting patient privacy for clinical trial data.
Year: 2020 PMID: 32070405 PMCID: PMC7029478 DOI: 10.1186/s13063-020-4120-y
Source DB: PubMed Journal: Trials ISSN: 1745-6215 Impact factor: 2.279
Distribution by country of nepafenac trial participants
| Country | Number of subjects | % of total subjects |
|---|---|---|
| United States | 500 | 83.6 |
| Panama | 26 | 4.3 |
| Puerto Rico | 54 | 9 |
| Other | 18 | 3 |
| Total | 598 | 100 |
Rate of correct verification from suspected matches
| Study | Data details | % of suspected matches verified as actual matches |
|---|---|---|
| Kwok and Lafky and colleagues | Matched 15,000 Safe Harbor de-identified admission records from a regional hospital to a marketing dataset of 30,000 records | 10% (2/20) |
| Elliot et al. | Sampled records from the UK Labour Force Survey (LFS) and the Living Costs and Food Survey (LCF) to re-identify. Matches were performed with and without the Output Area Classifier (OAC), which provides more precise geography | • LFS: 12% (6/50) using web-based info to match with; 28% (14/50) using commercial data • LCF: 10% (2/20) for dataset without OAC; 43% (18/42) for dataset with OAC |
| Tudor and colleagues | Data examined were tabular in nature, consisting of 89 tables that were determined to be potentially high risk | • 36% claims of identifying a neighbor were correct • 61% correct for identifying self/family • All claims, except one, involved people the intruder knew |
| Sweeney | News reports of hospitalizations | 23% (8/35) |
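
The percentages in the table above can be reproduced from the match counts shown in parentheses. The snippet below is a minimal sketch of that arithmetic; the dictionary keys are informal labels chosen here for illustration, and only rows that report explicit counts are included.

```python
# Verified matches / suspected matches, transcribed from the table above.
# The keys are informal labels, not names used in the source.
verified_counts = {
    "Kwok and Lafky": (2, 20),
    "Elliot LFS, web-based info": (6, 50),
    "Elliot LFS, commercial data": (14, 50),
    "Elliot LCF, without OAC": (2, 20),
    "Elliot LCF, with OAC": (18, 42),
    "Sweeney": (8, 35),
}

def verification_rate(verified: int, suspected: int) -> int:
    """Percentage of suspected matches verified, rounded to a whole percent."""
    if suspected <= 0:
        raise ValueError("suspected must be positive")
    return round(100 * verified / suspected)

rates = {name: verification_rate(v, s) for name, (v, s) in verified_counts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```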
Interpretation of the confidence levels attached to candidate matches [30]
| Confidence level | Confidence percentage | Meaning in words | Interpretation |
|---|---|---|---|
| 1 | 0–19 | Not at all confident, complete guess | Low confidence |
| 2 | 20–39 | Not very confident, bit of a guess | Low confidence |
| 3 | 40–59 | Not quite sure, uncertain | Medium confidence |
| 4 | 60–79 | Fairly sure, reasonably confident | High confidence |
| 5 | 80–100 | Very confident, absolutely sure | High confidence |
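
As an illustration of how the scale above could be applied, the following sketch maps a confidence percentage to its level and interpretation group. The function names are hypothetical and not from the source. Treating level 2 as "Low" agrees with the suspected-match table, where a score of 2 is labeled Low; treating level 5 as "High" is an assumption based on the table layout.

```python
# Interpretation group per confidence level.
# Level 2 -> "Low" matches the suspected-match table; level 5 -> "High"
# is an assumption inferred from the layout of the scale.
GROUPS = {1: "Low", 2: "Low", 3: "Medium", 4: "High", 5: "High"}

def confidence_level(percent: float) -> int:
    """Map a confidence percentage (0-100) to a level from 1 to 5."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Bands are 20 points wide; the top band (80-100) also includes 100.
    return min(int(percent // 20) + 1, 5)

def confidence_group(level: int) -> str:
    """Map a confidence level (1-5) to its interpretation group."""
    return GROUPS[level]

print(confidence_level(35), confidence_group(confidence_level(35)))
```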
Approaches used for each of the six suspected matches
| Approach (external source) | Confidence score | Confidence group | Reason for confidence score |
|---|---|---|---|
| Social media | 2 | Low | Date of surgery + location + symptoms + diabetic (inferred) |
| Social media | 1 | Low | Date of surgery + age + gender + diabetic |
| Death records | 1 | Low | Age + date of death + ethnicity (inferred) + unknown diabetic status |
| Death records | 1 | Low | Age + date of death + ethnicity (inferred) + unknown diabetic status |
| Death records | 1 | Low | Age + date of death + ethnicity (inferred) + unknown diabetic status |
| Death records | 2 | Low | Age + diabetic (inferred) + details of death + location |
Summary of re-identification confidence scores
| Confidence score | Count | % of total subjects |
|---|---|---|
| 1 | 4 | 0.8 |
| 2 | 2 | 0.4 |
| 3 | 0 | 0 |
| 4 | 0 | 0 |
| 5 | 0 | 0 |
| Partially verified | 0 | 0 |
| Fully verified | 0 | 0 |
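
The counts in the summary can be cross-checked against the six suspected matches listed earlier. The sketch below tallies the per-match confidence scores, transcribed by hand from that table; it reproduces the counts only and makes no assumption about the denominator behind the percentage column.

```python
from collections import Counter

# Confidence scores of the six suspected matches, in table order
# (two social media matches, then four death record matches).
scores = [2, 1, 1, 1, 1, 2]

counts = Counter(scores)

# Include every level from 1 to 5 so that empty levels show up as zero.
summary = {level: counts.get(level, 0) for level in range(1, 6)}

for level, count in summary.items():
    print(f"confidence score {level}: {count}")
```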