Abstract
Association rule mining is an important topic in data mining that focuses on discovering relationships between database attributes. The maximum frequent itemsets contain the information of all frequent itemsets, whose discovery is one of the main difficulties in mining association rules, and some data mining applications only need the maximum frequent itemsets. Studying algorithms that mine maximum frequent itemsets is therefore of practical value. This work introduces FP-MFIA, a new maximum frequent itemset mining algorithm based on the FP-tree, inspired by the frequent pattern tree data structure and by the observation that the maximum frequent itemsets imply all frequent itemsets. First, FP-MFIA constructs a one-way FP-tree, which has pointers only from the root toward the leaves, so that FP-MFIA needs only two scans of the FP-tree. Second, it redefines a storage structure for maximum frequent itemsets, the MFI-list, and can quickly release unnecessary nodes of the FP-tree after scanning it. In this way, the information required for the maximum frequent itemsets can be mined quickly and the space needed to store them is reduced, which improves mining efficiency. Finally, experiments compared the mining efficiency of FP-MFIA with that of the IDMFIA and DMFIA algorithms. The results show that FP-MFIA is more efficient than the other two algorithms.
Year: 2022 PMID: 35634074 PMCID: PMC9132644 DOI: 10.1155/2022/7022168
Source DB: PubMed Journal: Comput Intell Neurosci
Pseudocode for constructing new FP-tree.
| Algorithm 1: The construction algorithm of new FP-tree. |
|---|
|
|
|
|
|
|
| Scan database |
| Create a new FP-tree node, assign the value of the node_name to null, and use it as the root node |
| Create FP-tree: call Create function |
| Function Create() |
| For each transaction in the database |
| {Initialize the current pointer in the FP-tree to point to the root node |
| Put the items in the transaction |
| For each item in the queue |
| FP-tree. Insert( |
| } |
| Function Insert(char name) // Insert a node with an item named name into the FP-tree |
| {if the node pointed to by the current pointer has no child nodes, or none of its child nodes has a node_name field equal to name |
| Then |
| {Create a new FP-tree node, node, whose node_name field is name; |
| current->node_children = node; |
| current = node; // Modify the current pointer to point to the new node |
| current->node_link = item_head; |
| item_head = current; // Add it to the item_name linked list whose item_name value is name in the header table |
| } |
| else increase the node_count value of that child node by 1 and modify the current pointer to point to it; |
| } |
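The construction steps in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the field names `node_name`, `node_count`, `node_children`, and `node_link` follow the pseudocode, while `FPNode`, `build_fp_tree`, the dictionary-based header table, the absolute `min_sup` count, the alphabetical tie-break, and the four toy transactions are assumptions made for the example.

```python
from collections import Counter


class FPNode:
    """One-way FP-tree node: it points to its children only, never to its parent."""

    def __init__(self, name):
        self.node_name = name
        self.node_count = 1
        self.node_children = {}  # item name -> child FPNode
        self.node_link = None    # next node carrying the same item (header-table chain)


def build_fp_tree(transactions, min_sup):
    # First database scan: count the support of every item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in support.items() if c >= min_sup}

    root = FPNode(None)  # root node with a null name
    root.node_count = 0
    header = {}          # item name -> most recently created node for that item

    # Second database scan: insert each transaction's frequent items,
    # ordered by descending support (ties broken alphabetically, an assumption).
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-support[i], i))
        current = root
        for name in items:
            child = current.node_children.get(name)
            if child is None:
                child = FPNode(name)
                current.node_children[name] = child
                child.node_link = header.get(name)  # prepend to the item's chain
                header[name] = child
            else:
                child.node_count += 1
            current = child
    return root, header, support


# Hypothetical toy data, not the paper's database.
transactions = ["abe", "abd", "abce", "bd"]
root, header, support = build_fp_tree(transactions, min_sup=2)
b = root.node_children["b"]
print(b.node_count, b.node_children["a"].node_count)  # 4 3
```

In this toy run, item c occurs once and is dropped, so the tree has the single branch b, a below the root, with e and d below a and a second d node directly below b.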
The transaction database.
| TID | Items | Frequent items in descending order of support |
|---|---|---|
| | abe | bae |
| | | bad |
| | abce | bace |
Figure 1. FP-tree constructed from the transactional database.
Figure 2. Conditional FP-tree based on the (c) node.
Pseudocode for constructing MFI-list.
| Algorithm 2: Algorithm of constructing MFI-list. |
|---|
|
|
|
|
|
|
| current = |
| InitStack(s); //initialize stack |
| While (there are untraversed paths in the FP-tree) |
| {while (current->node_count ≥ min_sup) |
| {current points to a child node that has not been visited; |
| Push( |
| } |
| //When there is no child node or the node_count value of the child node is less than min_sup |
|
|
| Convert the MFI into the corresponding bit vector and link it to the corresponding maximum frequent itemset linked list |
| current->node_pre = |
| While (current has child nodes in the current path) |
| {pre = current; |
| current points to its child node; |
| current->node_pre = pre->node_pre ∪ current->node_name; // Take the frequent itemset composed of the union of the node_pre of its parent node and the node_name of the node |
| }} |
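Algorithm 2 converts each maximum frequent itemset into a bit vector before linking it into the MFI-list. The sketch below shows one plausible way to do that in Python, assuming each item is mapped to a fixed bit position. The helper names `to_bits`, `is_subset`, and `add_maximal`, and the use of a plain Python list in place of the paper's linked lists, are illustrative assumptions; the exact MFI-list layout is not recoverable from the text above.

```python
def to_bits(itemset, index):
    """Encode an itemset as an integer bit vector; index maps item -> bit position."""
    bits = 0
    for item in itemset:
        bits |= 1 << index[item]
    return bits


def is_subset(a, b):
    """True if every item set in bit vector a is also set in bit vector b."""
    return (a & b) == a


def add_maximal(mfi_list, candidate):
    """Insert candidate unless a stored itemset already contains it.

    Stored itemsets that the candidate contains are removed, so the list
    only ever holds itemsets that are maximal among those seen so far.
    """
    if any(is_subset(candidate, m) for m in mfi_list):
        return False
    mfi_list[:] = [m for m in mfi_list if not is_subset(m, candidate)]
    mfi_list.append(candidate)
    return True


# Hypothetical item order a..e -> bits 0..4.
index = {item: pos for pos, item in enumerate("abcde")}
mfi_list = []
for itemset in ["ab", "abcd", "ae", "ad"]:
    add_maximal(mfi_list, to_bits(itemset, index))
print([format(m, "05b") for m in mfi_list])  # ['01111', '10001']
```

With bit vectors, the subset test that decides whether a candidate is already covered reduces to a single bitwise AND and a comparison.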
Pseudocode of FP-MFIA.
| Algorithm 3: The algorithm of FP-MFIA. |
|---|
|
|
|
|
|
|
| MFI-list = Ø; |
| Preorder traversal of the FP-tree to obtain a simplified FP-tree, and initialize the MFI-list at the same time; |
| Traverse the MFI-list and simplify the FP-tree; |
|
|
| while( |
|
|
|
|
| while ( |
| {According to |
| for( |
| for( |
| If( |
| {Convert |
|
|
|
|
| if(the number of “1” in |
| {nd.count = |
|
|
| If(nd.count ≥ |
| {while( |
| if( |
| else break; } |
| If( |
| }}}} |
Transaction database.
| TID | Items |
|---|---|
| | abcdep |
| | abcdf |
| | abcdm |
| | abcdi |
| | abcdho |
| | aef |
| | befn |
| | ae |
| | be |
| | ad |
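As a cross-check on a database this small, the maximum frequent itemsets can be enumerated by brute force. The sketch below is not FP-MFIA; it is an exhaustive reference that tests every combination of frequent items, which is feasible only for a handful of items. The function name `maximal_frequent_itemsets` is hypothetical, and `min_sup = 2` (an absolute transaction count) is an assumed threshold, since the value used in the paper's example is not given in the text above.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_sup):
    """Brute-force reference: return frequent itemsets with no frequent proper superset."""
    rows = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for r in rows if itemset <= r)

    items = sorted({i for r in rows for i in r})
    frequent_items = [i for i in items if support(frozenset([i])) >= min_sup]

    # Every frequent itemset consists of frequent items only, so it is
    # enough to test combinations of those items.
    frequent = [
        frozenset(c)
        for k in range(1, len(frequent_items) + 1)
        for c in combinations(frequent_items, k)
        if support(frozenset(c)) >= min_sup
    ]
    maximal = [s for s in frequent if not any(s < t for t in frequent)]
    return sorted("".join(sorted(s)) for s in maximal)


# The ten transactions listed in the table above.
D = ["abcdep", "abcdf", "abcdm", "abcdi", "abcdho",
     "aef", "befn", "ae", "be", "ad"]
print(maximal_frequent_itemsets(D, min_sup=2))
# ['abcd', 'ae', 'af', 'be', 'bf', 'ef']
```

Raising the assumed threshold to 3 removes the pairs containing f, leaving abcd, ae, be, and the single item f.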
Figure 3. The FP-tree corresponding to transaction database (D).
Figure 4. MFI-list obtained after preorder traversal of the FP-tree.
Figure 5. Preorder traversal of the FP-tree and the simplified FP-tree obtained after traversing the MFI-list.
Figure 6. The final MFI-list obtained.
The relevant parameters of the database.
| Database | Number of items | Number of transactions | Length of each transaction |
|---|---|---|---|
| Mushroom | 119 | 8124 | 23 |
| Connect | 129 | 67557 | 43 |
Figure 7. Runtime of the FP-MFIA, DMFIA, and IDMFIA algorithms on the Mushroom database (when min_sup is large).
Figure 8. Runtime of the FP-MFIA, DMFIA, and IDMFIA algorithms on the Mushroom database (when min_sup is small).
Figure 9. Runtime of different algorithms on the Connect database (when min_sup is large).
Figure 10. Runtime of different algorithms on the Connect database (when min_sup is small).