Siqi Zhang, Xiaohui Li.
Abstract
The era of big data is driving further development in medicine, and data release is an important step in that process. Existing medical data release methods mostly rely on the k-anonymity model for data protection. As technology advances, anonymity models are becoming progressively less resistant to consistency attacks and background knowledge attacks. To better protect patients' private information, this paper makes two major contributions: (1) the correlation between attributes is calculated to ensure that the data remain valid after release; (2) building on the previous step and combining the differential privacy model with a tree model, this paper proposes an attribute association-based differential privacy classification tree data publishing method (ACDP-Tree). Simulation experiments are carried out on real medical data sets. The experimental results show that the algorithm preserves the validity and availability of the data to a certain extent while ensuring that patients' privacy is not leaked.
Year: 2022 PMID: 36131115 PMCID: PMC9492700 DOI: 10.1038/s41598-022-19544-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Medical big data privacy protection model.
Hospital medical record.
| ID | Name | Age | Zipcode | Job | Disease |
|---|---|---|---|---|---|
| 1 | Alice | 23 | 21081 | Doctor | Flu |
| 2 | Marry | 26 | 22081 | Lawyer | Hepatitis |
| 3 | John | 46 | 21084 | Teacher | HIV |
| 4 | Mike | 42 | 21015 | Officer | Flu |
| 5 | Bob | 32 | 22074 | Clerk | HIV |
| 6 | Jenny | 37 | 21071 | Plumber | Hepatitis |
| 7 | Turman | 55 | 21009 | Repairer | Pneumonia |
| 8 | Loary | 48 | 22003 | Officer | Flu |
| 9 | Kitty | 22 | 21424 | Teacher | Pneumonia |
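
The abstract notes that existing release methods mostly build on k-anonymity. As a minimal sketch of what that model checks, the snippet below generalizes the table's Age and Zipcode columns and reports the size of the smallest equivalence class. The choice of quasi-identifiers (Age and Zipcode only), the 20-year age bands, and the names `generalize` and `k_anonymity` are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

# Rows from the hospital medical record table: (age, zipcode, disease).
# ID and Name are direct identifiers and are dropped before release.
RECORDS = [
    (23, "21081", "Flu"),
    (26, "22081", "Hepatitis"),
    (46, "21084", "HIV"),
    (42, "21015", "Flu"),
    (32, "22074", "HIV"),
    (37, "21071", "Hepatitis"),
    (55, "21009", "Pneumonia"),
    (48, "22003", "Flu"),
    (22, "21424", "Pneumonia"),
]


def generalize(age, zipcode, zip_digits):
    """Hypothetical generalization: 20-year age bands, truncated zipcode."""
    low = (age // 20) * 20
    masked = zipcode[:zip_digits] + "*" * (len(zipcode) - zip_digits)
    return (f"{low}-{low + 19}", masked)


def k_anonymity(records, zip_digits):
    """Return the size of the smallest equivalence class (the achieved k)."""
    classes = Counter(generalize(age, z, zip_digits) for age, z, _ in records)
    return min(classes.values())


if __name__ == "__main__":
    for digits in (5, 2, 0):
        print(f"zipcode digits kept = {digits}: k = {k_anonymity(RECORDS, digits)}")
```

Under these assumed bands, keeping two zipcode digits still leaves record 8 alone in its class (k = 1), while suppressing the zipcode entirely yields k = 4. Even then, a class whose members share one disease would still reveal it, which is the kind of weakness the abstract attributes to anonymity models.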
Figure 2. {Job} attribute generalization tree.
Importance rating.
| Importance of attribute i compared with attribute j | Quantized value |
|---|---|
| Slightly important | 1 |
| Strong and important | 2 |
| Strongly important | 3 |
Adult attributes and attribute types.
| Attributes | Type |
|---|---|
| Age | Continuous |
| Marital-status | Discrete |
| Occupation | Discrete |
| Sex | Discrete |
| Capital-gain | Continuous |
Figure 3. Changes in the degree of information loss with the size of the data.
Figure 4. The change of absolute error when the query interval is 100 and ε takes different values.
Figure 5. DLP variation trend under different privacy budgets.
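
The last two figures relate error to the privacy budget ε. As a minimal sketch of why accuracy generally depends on ε under differential privacy, the snippet below applies the standard Laplace mechanism to a single count query and measures the mean absolute error. This is not the ACDP-Tree algorithm; the function names, the sensitivity of 1, the true count of 100, and the trial count are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    """Average |noisy - true| over repeated releases (hypothetical setup)."""
    rng = random.Random(seed)
    errors = [
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / trials


if __name__ == "__main__":
    for eps in (0.1, 0.5, 1.0):
        print(f"epsilon = {eps}: mean absolute error = {mean_abs_error(100, eps):.2f}")
```

For Laplace noise the expected absolute error equals the scale, that is sensitivity / ε, so a smaller privacy budget gives stronger protection at the cost of a larger error in each released count.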