Sajid Mahmood, Muhammad Shahbaz, Aziz Guergachi.
Abstract
Association rule mining research typically focuses on positive association rules (PARs), generated from frequently occurring itemsets. In recent years, however, significant research has focused on finding interesting infrequent itemsets, leading to the discovery of negative association rules (NARs). Discovering infrequent itemsets is far more difficult than discovering their counterparts, the frequent itemsets. The difficulties include finding the infrequent itemsets, generating accurate NARs, and handling their huge number compared with positive association rules. In medical science, for example, one is interested in factors that can either establish the presence of a disease or rule it out. Positive symptoms are often obvious; negative symptoms are subtler and more difficult to recognize and diagnose. In this paper, we propose an algorithm for discovering positive and negative association rules among frequent and infrequent itemsets. We identify associations among medications, symptoms, and laboratory results using state-of-the-art data mining technology.
Year: 2014 PMID: 24955429 PMCID: PMC4052479 DOI: 10.1155/2014/973750
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
IDF scores of sample keywords of the corpus.
| Selected keywords | IDF |
|---|---|
| Chemo | 2.2693 |
| Radiation | 2.2535 |
| Tumor | 2.2316 |
| Surgery | 2.2135 |
| Cancer | 2.2013 |
| Temodar | 1.9830 |
| CT scan | 1.9812 |
| Glioblastoma | 1.9609 |
| Skull | 1.8906 |
| Cause | 1.8902 |
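The source does not state which IDF variant or logarithm base produced the scores above. As a minimal sketch, assuming the textbook form idf(t) = log10(N / df(t)) and a hypothetical toy corpus, the computation could look like this:

```python
import math

def idf_scores(documents):
    """Return {term: log10(N / df)} for a list of tokenized documents.

    Assumption: plain textbook IDF with a base-10 logarithm; the source
    does not say which variant it used.
    """
    n_docs = len(documents)
    doc_freq = {}
    for tokens in documents:
        # Count each term at most once per document.
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log10(n_docs / df) for term, df in doc_freq.items()}

# Hypothetical toy corpus, not data from the source.
docs = [
    ["chemo", "radiation", "tumor"],
    ["radiation", "surgery"],
    ["tumor", "radiation"],
    ["chemo", "temodar"],
]
scores = idf_scores(docs)
for term in sorted(scores, key=scores.get, reverse=True):
    print(f"{term}: {scores[term]:.4f}")
```

In this form, terms that appear in fewer documents receive higher scores.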
Algorithm 2
Figure 2: Positive and negative rules generated with varying minimum supports and confidence values.
Algorithm 3
Algorithm 4
Total generated frequent and infrequent itemsets using different support values.
| Support | Frequent itemsets | Infrequent itemsets |
|---|---|---|
| 0.1 | 146474 | 268879 |
| 0.15 | 105397 | 329766 |
| 0.2 | 94467 | 447614 |
| 0.25 | 79871 | 492108 |
| 0.3 | 57954 | 504320 |
Figure 1: Frequent and infrequent itemsets generated with varying minimum support values.
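To illustrate how a minimum support threshold divides itemsets into frequent and infrequent ones, here is a minimal brute-force sketch. It is not the paper's algorithm: the function name `split_itemsets`, the `max_size` cap, the choice to skip itemsets that never occur, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations

def split_itemsets(transactions, min_support, max_size=2):
    """Enumerate itemsets up to max_size and split them by support.

    Returns two dicts mapping frozenset -> support: itemsets with
    support >= min_support, and the remaining itemsets that occur at
    least once.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent, infrequent = {}, {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count == 0:
                continue  # assumption: never-occurring itemsets are ignored
            support = count / n
            target = frequent if support >= min_support else infrequent
            target[cset] = support
    return frequent, infrequent

# Hypothetical toy transactions, not data from the source.
transactions = [
    {"chemo", "radiation", "tumor"},
    {"radiation", "tumor"},
    {"radiation", "surgery"},
    {"chemo", "temodar"},
]
frequent, infrequent = split_itemsets(transactions, min_support=0.5)
print(len(frequent), "frequent,", len(infrequent), "infrequent")
```

Raising `min_support` in this sketch moves itemsets from the frequent side to the infrequent side, the same trend the table above reports.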
Positive and negative association rules using varying support and confidence values.
| Support | Confidence | PARs from FIs | PARs from IIs | NARs from FIs | NARs from IIs |
|---|---|---|---|---|---|
| 0.05 | 0.95 | 3993 | 254 | 27032 | 836 |
| 0.05 | 0.9 | 4340 | 228 | 29348 | 1544 |
| 0.1 | 0.9 | 3731 | 196 | 30714 | 1279 |
| 0.1 | 0.85 | 3867 | 246 | 37832 | 1170 |
| 0.15 | 0.8 | 3340 | 330 | 26917 | 1121 |
(a) Positive rules from frequent itemsets
| Rule | Support | Confidence | Lift |
|---|---|---|---|
| {chemo radiation treatment} → {tumor} | 0.2 | 1 | 2.0 |
| {surgery tumor} → {radiation} | 0.25 | 1 | 1.4286 |
| {radiation surgery} → {tumor} | 0.25 | 0.8333 | 1.6667 |
| {treatment} → {radiation} | 0.45 | 0.9 | 1.2857 |
| {treatment tumor} → {radiation} | 0.3 | 1 | 1.4286 |
| {tumor} → {radiation} | 0.5 | 1 | 1.4286 |
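Under the standard definition lift(A → B) = confidence(A → B) / support(B), the lift values in table (a) are consistent with consequent supports of 0.7 for radiation and 0.5 for tumor. These two supports are inferred from the reported numbers and are not stated in the source. The following sketch recomputes each lift under that assumption and reports the largest deviation from the reported value:

```python
def lift(confidence, consequent_support):
    """Standard lift: confidence(A -> B) divided by support(B)."""
    return confidence / consequent_support

# Assumption: consequent supports inferred from the reported values,
# not given explicitly in the source.
SUPP_RADIATION = 0.7
SUPP_TUMOR = 0.5

# (rule, reported confidence, assumed consequent support, reported lift)
TABLE_A = [
    ("{chemo radiation treatment} -> {tumor}", 1.0, SUPP_TUMOR, 2.0),
    ("{surgery tumor} -> {radiation}", 1.0, SUPP_RADIATION, 1.4286),
    ("{radiation surgery} -> {tumor}", 0.8333, SUPP_TUMOR, 1.6667),
    ("{treatment} -> {radiation}", 0.9, SUPP_RADIATION, 1.2857),
    ("{treatment tumor} -> {radiation}", 1.0, SUPP_RADIATION, 1.4286),
    ("{tumor} -> {radiation}", 1.0, SUPP_RADIATION, 1.4286),
]

def max_lift_gap(rows):
    """Largest absolute difference between recomputed and reported lift."""
    return max(abs(lift(conf, supp) - reported)
               for _, conf, supp, reported in rows)

print(f"largest gap: {max_lift_gap(TABLE_A):.4f}")
```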
(b) Negative rules from frequent itemsets
| Rule | Support | Confidence | Lift |
|---|---|---|---|
| {radiation} → {~cancer} | 0.5 | 0.7143 | 1.0204 |
| {~temodar} → {radiation} | 0.4825 | 0.8333 | 1.6667 |
| {~radiation} → {glioblastoma} | 0.65 | 1 | 1.4285 |
(c) Positive rules from infrequent itemsets
| Rule | Support | Confidence | Lift |
|---|---|---|---|
| {cancer treatment} → {tumor} | 0.15 | 1 | 2 |
| {doctor temodar} → {chemo} | 0.05 | 1 | 2.8571 |
| {brain chemo} → {doctor} | 0.15 | 1 | 2.2222 |
| {mri tumor} → {brain} | 0.1 | 1 | 2.5 |
(d) Negative rules from infrequent itemsets
| Rule | Support | Confidence | Lift |
|---|---|---|---|
| {chemo} → {~surgery} | 0.35 | 1 | 1.5385 |
| {glioblastoma} → {~treatment} | 0.3 | 1 | 2.13 |
| {chemo} → {~glioblastoma} | 0.35 | 0.95 | 1.4286 |
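For a rule with a negated consequent such as {A} → {~B}, support, confidence, and lift can be computed by counting transactions that contain A but not B. The sketch below assumes a single negated item and the standard definitions; the function names and toy transactions are hypothetical, and this is not necessarily how the paper computes its measures.

```python
def support(transactions, present=(), absent=()):
    """Fraction of transactions with all `present` items and no `absent` item."""
    present, absent = set(present), set(absent)
    hits = sum(1 for t in transactions
               if present <= t and not (absent & t))
    return hits / len(transactions)

def negative_rule_metrics(transactions, antecedent, negated_item):
    """Return (support, confidence, lift) for antecedent -> ~negated_item.

    Assumes the antecedent occurs at least once and the negated item is
    absent from at least one transaction, so no division by zero occurs.
    """
    supp_a = support(transactions, present=antecedent)
    supp_rule = support(transactions, present=antecedent, absent=[negated_item])
    supp_not_b = support(transactions, absent=[negated_item])
    confidence = supp_rule / supp_a
    return supp_rule, confidence, confidence / supp_not_b

# Hypothetical toy transactions, not data from the source.
transactions = [
    {"chemo", "temodar"},
    {"chemo", "radiation"},
    {"surgery", "radiation"},
    {"surgery", "tumor"},
]
metrics = negative_rule_metrics(transactions, {"chemo"}, "surgery")
print("support, confidence, lift:", metrics)
```

With a single negated item, confidence(A → ~B) equals 1 − confidence(A → B), so a negative rule is strong exactly when the corresponding positive rule is weak.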