Ayesha Shaukat, Adeel Anjum, Saif U R Malik, Munam Ali Shah, Carsten Maple.
Abstract
Protecting the privacy of individuals is of utmost concern in today's society, as inscribed and governed by prevailing privacy laws such as the GDPR. In serial data, pieces of data are released continuously, and their combined effect may result in a privacy breach across the whole serial publication. Protecting serial data from adversaries is therefore crucial. Previous approaches provide privacy for relational data and serial data, but many loopholes remain when dealing with multiple sensitive values. We address these problems by introducing a novel privacy approach that limits the risk of privacy disclosure in republication and gives better privacy with much lower perturbation rates. Existing techniques provide a strong privacy guarantee against attacks on data privacy; however, in serial publication, the chance of attack persists because data are continuously added and deleted. Proper countermeasures against attacks such as correlation attacks have not been taken for serial data, so serial publication remains at risk. Moreover, protecting privacy is difficult because of the critical absence of sensitive values when dealing with multiple sensitive values. Due to this critical absence, signatures change in every release, which creates an opening for attacks. In this paper, we introduce a novel approach to counter the composition attack and the transitive composition attack, and we prove that the proposed approach is better than the existing state-of-the-art techniques. Our paper establishes the result with a systematic examination of the republication dilemma. Finally, we evaluate our work using benchmark datasets, and the results show the efficacy of the proposed technique.
Keywords: attribute disclosure attacks; multiple sensitive values; preserving privacy; serial publication; transactional data
Year: 2022; PMID: 35408425; PMCID: PMC9002876; DOI: 10.3390/s22072811
Source DB: PubMed; Journal: Sensors (Basel); ISSN: 1424-8220; Impact factor: 3.576
Notations.
| Symbol | Description |
|---|---|
| | set of terms known as a transaction |
| | subset of transactions known as a cluster |
| | set of all transactions in one timestamp known as a corpus |
| | known as serial corpora |
| | all private term sets in corpora |
| | all non-private term sets in corpora |
| | set of all private terms in the global bag |
| | union of anonymized transactions at time |
| | background knowledge at time |
| | counterfeits required for transactions |
| | any overlap belongs to |
| | set of overlapping transactions |
| | maximum counterfeits for all overlaps |
| | original version |
| | anonymized version |
| | final released version |
Serially generated prisoner health records (Release 1).
| Name | Record |
|---|---|
| Laura | |
| Lucy | |
| Martin | |
| Shane | |
| John | |
| Stacy | |
Serially generated prisoner health records (Release 2).
| Name | Record |
|---|---|
| Laura | |
| Lucy | |
| Martin | |
| Shane | |
| John | |
| Stacy | |
| Ben | |
| Ivy | |
Serially generated prisoner health records (Release 3).
| Name | Record |
|---|---|
| Laura | |
| Lucy | |
| Martin | |
| Shane | |
| John | |
| Stacy | |
| Ben | |
| Ivy | |
| Pam | |
| Jan | |
Anonymized prisoner health records (Release 1).
| | | |
|---|---|---|
| Cluster 1 | herpes, COVID-19 | |
| | HIV | |
| | lung cancer | |
| Cluster 2 | | |
| | cancer | |
| Cluster 3 | | |
| | HIV | |
Anonymized prisoner health records (Release 2).
| | | |
|---|---|---|
| Cluster 1 | COVID-19, diabetes | |
| | lung cancer | |
| | HIV | |
| | | |
| Cluster 2 | | |
| | cancer | |
| | | |
Anonymized prisoner health records (Release 3).
| | | |
|---|---|---|
| Cluster 1 | diabetes | |
| | lung cancer | |
| | HIV | |
| Cluster 2 | | |
| | cancer | |
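The risk that the abstract attributes to serial publication can be illustrated with the Cluster 1 rows of the three anonymized releases. The following is a minimal sketch of one simplified reading of a composition attack, assuming an adversary knows that a target individual falls in Cluster 1 in every release and simply intersects the sensitive values visible there. The helper name `intersect_candidates` and the set-based representation are hypothetical and are not taken from the paper.

```python
# Sensitive values visible in Cluster 1 of each anonymized release,
# copied from the three anonymized tables above.
cluster1_by_release = [
    {"herpes", "COVID-19", "HIV", "lung cancer"},    # Release 1
    {"COVID-19", "diabetes", "lung cancer", "HIV"},  # Release 2
    {"diabetes", "lung cancer", "HIV"},              # Release 3
]


def intersect_candidates(releases):
    """Return the values present in every release (hypothetical helper)."""
    candidates = set(releases[0])
    for values in releases[1:]:
        candidates &= values
    return candidates


candidates = intersect_candidates(cluster1_by_release)
print(sorted(candidates))  # values that survive in all three releases
```

Under these assumptions, four candidate values in Release 1 shrink to two once all three releases are combined, which is the kind of narrowing that republication has to guard against.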
Figure 1. Attacker model.
Figure 2. Proposed model.
Published prisoner health record (Release 1).
| | | |
|---|---|---|
| Cluster 1 | herpes, COVID-19 | |
| | HIV | |
| | lung cancer | |
| Cluster 2 | | |
| | cancer | |
| Cluster 3 | | |
| | HIV | |
Figure 3. Reasonable surjections.
Figure 4. Sensitive values responsible for a breach.
Published prisoner health record (Release 2).
| | | |
|---|---|---|
| Cluster 1 | {COVID-19, diabetes (Tc herpes, AIDS)} | |
| | lung cancer | |
| | HIV | |
| | | |
| Cluster 2 | | |
| | cancer | |
| | | |
Published prisoner health record (Release 3).
| | | |
|---|---|---|
| Cluster 1 | {diabetes (Tc herpes, COVID-19, AIDS)} | |
| | lung cancer | |
| | HIV | |
| Cluster 2 | | |
| | cancer | |
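The published tables differ from the anonymized ones by the `Tc` entries in Cluster 1. Assuming `Tc` marks counterfeit terms (the notation list mentions counterfeits, but their exact semantics are not spelled out here), the sketch below checks whether every sensitive value visible in Cluster 1 of Release 1 is still visible in later releases. The function name `absent_values` and the dictionaries are hypothetical; the values are copied from the Cluster 1 rows of the tables.

```python
# Cluster 1 sensitive values in Release 1 (same in both table families).
release1 = {"herpes", "COVID-19", "HIV", "lung cancer"}

# Cluster 1 in later releases, without counterfeit terms (anonymized tables).
anonymized = {
    2: {"COVID-19", "diabetes", "lung cancer", "HIV"},
    3: {"diabetes", "lung cancer", "HIV"},
}

# Cluster 1 in later releases, with the Tc terms included (published tables).
# Assumption: Tc terms are counterfeits that count as visible values.
published = {
    2: {"COVID-19", "diabetes", "herpes", "AIDS", "lung cancer", "HIV"},
    3: {"diabetes", "herpes", "COVID-19", "AIDS", "lung cancer", "HIV"},
}


def absent_values(earlier, later):
    """Values seen in the earlier release but missing from the later one."""
    return earlier - later


for release, values in anonymized.items():
    print("anonymized", release, sorted(absent_values(release1, values)))
for release, values in published.items():
    print("published", release, sorted(absent_values(release1, values)))
```

Under this reading, the anonymized releases lose values over time while the published releases do not, which is consistent with the abstract's point that absent sensitive values change the signature between releases.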
Description of the datasets.
| Datasets | No. of Trans. | No. of Terms | Max Trans. Length | Avg. Trans. Length | Avg. Sparsity |
|---|---|---|---|---|---|
| B1 | 50,000 | 497 | 267 | 2.5 | 99.49% |
| B2 | 70,378 | 3340 | 161 | 5.0 | 99.86% |
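The sparsity column can be roughly cross-checked against the other columns. The sketch below assumes that sparsity means the fraction of empty cells in a binary transaction-by-term matrix, which would make it approximately 1 minus the average transaction length divided by the number of terms. This definition is an assumption, not something stated in the table, and `estimated_sparsity` is a hypothetical helper.

```python
# Figures copied from the dataset description table above.
datasets = {
    "B1": {"num_terms": 497, "avg_len": 2.5, "reported": 99.49},
    "B2": {"num_terms": 3340, "avg_len": 5.0, "reported": 99.86},
}


def estimated_sparsity(num_terms, avg_len):
    """Percentage of empty cells, assuming a binary transaction-term matrix."""
    return 100.0 * (1.0 - avg_len / num_terms)


for name, row in datasets.items():
    estimate = estimated_sparsity(row["num_terms"], row["avg_len"])
    gap = abs(estimate - row["reported"])
    print(f"{name}: estimated {estimate:.2f}%, reported {row['reported']}%, gap {gap:.3f}")
```

The estimates land close to the reported values but not exactly on them, which is plausible given that the average transaction lengths in the table are rounded to one decimal place.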
Figure 5. Vulnerability vs. sample size.
Figure 6. Vulnerability vs. releases.
Figure 7. Perturbation rate vs. sample size.
Figure 8. Perturbation rate vs. release.
Figure 9. Perturbation rate vs. sample size.
Figure 10. Utility vs. sample size.
Figure 11. Runtime vs. sample size.