Peng Cheng, Chun-Wei Lin, Jeng-Shyang Pan.
Abstract
During business collaboration, partners may benefit from sharing data, and people may use data mining tools to discover useful relationships in the shared data. However, some relationships are sensitive, and data owners want to conceal them before sharing. In this paper, we address this problem in the form of association rule hiding. A hiding method based on evolutionary multi-objective optimization (EMO) is proposed, which performs the hiding task by selectively inserting items into the database to decrease the confidence of sensitive rules below specified thresholds. The side effects generated during the hiding process are taken as optimization goals to be minimized. HypE, a recently proposed EMO algorithm, is utilized to identify promising transactions for modification so as to minimize side effects. Results on real datasets demonstrate that, in most cases, the proposed method can effectively perform sanitization with less damage to the non-sensitive knowledge.
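The insertion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper uses HypE to choose which transactions to modify, whereas this sketch scans transactions greedily in database order. The function names and the rule used in the note below are hypothetical. The sketch relies only on the standard definition of confidence for a rule A → B, namely support(A ∪ B) / support(A), so adding the antecedent A to transactions that lack the consequent B raises the denominator without raising the numerator.

```python
from typing import List, Set


def support_count(db: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def confidence(db: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Confidence of the rule antecedent -> consequent over `db`."""
    denom = support_count(db, antecedent)
    if denom == 0:
        return 0.0
    return support_count(db, antecedent | consequent) / denom


def hide_by_insertion(
    db: List[Set[str]], antecedent: Set[str], consequent: Set[str], mct: float
) -> List[Set[str]]:
    """Greedy stand-in (an assumption, not the paper's HypE-based selection).

    Adds the antecedent items to transactions that contain neither the full
    antecedent nor the full consequent, until confidence drops below `mct`
    or no candidate transactions remain. The input database is not mutated.
    """
    sanitized = [set(t) for t in db]
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < mct:
            break
        if not antecedent <= t and not consequent <= t:
            t |= antecedent  # raises support(A) but not support(A ∪ B)
    return sanitized
```

For example, with three transactions {a, b} and three transactions that contain neither a nor b, the rule {a} → {b} starts at confidence 3/3. Adding a to two of the other transactions lowers it to 3/5, which is below a threshold of 0.7. Every inserted item also distorts the data and may create or destroy non-sensitive rules, which is why the choice of transactions is treated as an optimization problem.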
Year: 2015 PMID: 26070130 PMCID: PMC4466550 DOI: 10.1371/journal.pone.0127834
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Differences among hiding solutions.
| Method | Strategy of data modification | Knowledge form | Type of data modification | |||
|---|---|---|---|---|---|---|
| Distortion | Block | Rule | Itemset | Delete | Add | |
| EMO-AddItem | ✓ | ✓ | ✓ | |||
| Algo1.a [ | ✓ | ✓ | ✓ | |||
| Algo1.b, Algo2.a [ | ✓ | ✓ | ✓ | |||
| Algo2.b [ | ✓ | ✓ | ✓ | |||
| WSDA [ | ✓ | ✓ | ✓ | |||
| BA [ | ✓ | ✓ | ✓ | ✓ | ||
| GIH [ | ✓ | ✓ | ✓ | |||
| CR [ | ✓ | ✓ | ✓ | |||
| CR2 [ | ✓ | ✓ | ✓ | |||
| SIF-IDF [ | ✓ | ✓ | ✓ | |||
| Hybrid [ | ✓ | ✓ | ✓ | |||
| Border-based [ | ✓ | ✓ | ✓ | |||
| CSP-based [ | ✓ | ✓ | ✓ | |||
| Template-based [ | ✓ | ✓ | ✓ | ✓ | ||
| MaxMin [ | ✓ | ✓ | ✓ | |||
Fig 1. Conversion between a transactional database and bit-vectors.
Fig 2. The mechanism of chromosome encoding.
Fig 3. The shuffle crossover.
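Since the figures themselves are not reproduced here, the following sketch illustrates the two operations named in the captions of Fig 1 and Fig 3 under stated assumptions. The bit-vector layout (one bit per item, in a fixed item order) and the textbook form of shuffle crossover (shuffle gene positions with one shared permutation, apply one-point crossover, then restore the original order) are assumptions; the paper's exact encoding and operator may differ. All function names are hypothetical.

```python
import random
from typing import List, Sequence, Set, Tuple


def to_bit_vectors(db: Sequence[Set[str]], items: Sequence[str]) -> List[List[int]]:
    """Encode each transaction as a 0/1 vector over a fixed item order."""
    return [[1 if item in t else 0 for item in items] for t in db]


def from_bit_vectors(rows: Sequence[Sequence[int]], items: Sequence[str]) -> List[Set[str]]:
    """Decode 0/1 vectors back into transactions (sets of items)."""
    return [{item for item, bit in zip(items, row) if bit} for row in rows]


def shuffle_crossover(
    p1: Sequence[int], p2: Sequence[int], rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Textbook shuffle crossover on two equal-length parents."""
    n = len(p1)
    if n != len(p2) or n < 2:
        raise ValueError("parents must have equal length of at least 2")
    perm = list(range(n))
    rng.shuffle(perm)  # the same permutation is applied to both parents
    s1 = [p1[i] for i in perm]
    s2 = [p2[i] for i in perm]
    cut = rng.randint(1, n - 1)  # each child gets genes from both parents
    c1_shuffled = s1[:cut] + s2[cut:]
    c2_shuffled = s2[:cut] + s1[cut:]
    c1, c2 = [0] * n, [0] * n
    for pos, i in enumerate(perm):  # undo the shuffle
        c1[i] = c1_shuffled[pos]
        c2[i] = c2_shuffled[pos]
    return c1, c2
```

Because both parents are shuffled with the same permutation, each gene position in the children still holds the pair of values that the parents held at that position. Shuffling only removes the positional bias of one-point crossover, where neighboring genes tend to be inherited together.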
Characteristics of real datasets and parameter settings (MST: minimum support threshold; MCT: minimum confidence threshold).
| Dataset | # Transactions | # Items | Avg. transaction length | MST | MCT | # Frequent itemsets | # Strong rules |
|---|---|---|---|---|---|---|---|
| Mushroom | 8124 | 119 | 23 | 5% | 50% | 1329 | 1065 |
| Bms-1 | 59602 | 497 | 2.5 | 0.1% | 20% | 3065 | 3207 |
| Bms-2 | 77512 | 3340 | 5.0 | 0.2% | 20% | 1196 | 1598 |
| Retail | 88162 | 16469 | 10.3 | 0.1% | 50% | 5054 | 3276 |
Results with an increasing number of sensitive rules.
Side effects are given in the last three columns.
| Dataset | # Sensitive rules | Method | Hiding failure (%) | Knowledge distortion (%) | Data distortion (%) |
|---|---|---|---|---|---|
| Mushroom | 10 | EMO-AddItem | 20.000 | 2.449 | 49.489 |
| Mushroom | 10 | Algo1.a | 20.000 | 5.087 | 36.234 |
| Mushroom | 10 | WSDA | 0.000 | 2.935 | 36.148 |
| Mushroom | 10 | SIF-IDF | 0.000 | 8.431 | 26.105 |
| Mushroom | 20 | EMO-AddItem | 15.000 | 7.116 | 75.840 |
| Mushroom | 20 | Algo1.a | 15.000 | 13.244 | 47.532 |
| Mushroom | 20 | WSDA | 0.000 | 8.793 | 66.597 |
| Mushroom | 20 | SIF-IDF | 0.000 | 34.706 | 76.542 |
| Bms-1 | 10 | EMO-AddItem | 6.500 | 4.897 | 4.613 |
| Bms-1 | 10 | Algo1.a | 40.000 | 20.929 | 4.111 |
| Bms-1 | 10 | WSDA | 0.000 | 14.295 | 0.841 |
| Bms-1 | 10 | SIF-IDF | 0.000 | 8.257 | 0.532 |
| Bms-1 | 20 | EMO-AddItem | 21.000 | 16.113 | 9.920 |
| Bms-1 | 20 | Algo1.a | 50.000 | 38.219 | 6.538 |
| Bms-1 | 20 | WSDA | 0.000 | 33.479 | 1.435 |
| Bms-1 | 20 | SIF-IDF | 0.000 | 16.033 | 0.859 |
| Bms-2 | 10 | EMO-AddItem | 0.000 | 4.873 | 3.675 |
| Bms-2 | 10 | Algo1.a | 20.000 | 9.769 | 2.518 |
| Bms-2 | 10 | WSDA | 0.000 | 6.738 | 0.661 |
| Bms-2 | 10 | SIF-IDF | 0.000 | 2.707 | 0.190 |
| Bms-2 | 20 | EMO-AddItem | 0.000 | 8.750 | 5.738 |
| Bms-2 | 20 | Algo1.a | 40.000 | 20.563 | 3.335 |
| Bms-2 | 20 | WSDA | 0.000 | 8.302 | 0.955 |
| Bms-2 | 20 | SIF-IDF | 0.000 | 5.829 | 0.343 |
| Retail | 10 | EMO-AddItem | 0.000 | 0.031 | 0.222 |
| Retail | 10 | Algo1.a | 0.000 | 0.031 | 0.222 |
| Retail | 10 | WSDA | 0.000 | 0.061 | 0.129 |
| Retail | 10 | SIF-IDF | 0.000 | 3.735 | 1.000 |
| Retail | 20 | EMO-AddItem | 0.000 | 0.043 | 0.524 |
| Retail | 20 | Algo1.a | 5.000 | 0.061 | 0.382 |
| Retail | 20 | WSDA | 5.000 | 0.215 | 0.284 |
| Retail | 20 | SIF-IDF | 0.000 | 4.730 | 1.594 |
Fig 4. Tradeoffs exist among the different side effects.
Fig 5. Side effects with increasing MCT levels.