Seongju Kang, Kwangsue Chung
Abstract
In the current era of online information overload, recommendation systems are very useful for helping users locate content that may be of interest to them. A personalized recommendation system presents content based on information such as a user's browsing history and watched videos. However, information filtering-based recommendation systems are vulnerable to data sparsity and cold-start problems. Additionally, existing recommendation systems suffer from the large overhead incurred in learning regression models used for preference prediction or in selecting groups of similar users. In this study, we propose a preference-tree-based real-time recommendation system that uses various tree models to predict user preferences with a fast runtime. The proposed system predicts preferences based on two balance constants and one similarity threshold to recommend content with high accuracy while balancing generalized and personalized preferences. The results of comparative experiments and ablation studies confirm that the proposed system can accurately recommend content to users. Specifically, we confirmed that the accuracy and novelty of the recommended content were improved by 12.1% and 27.2%, respectively, compared to existing systems. Furthermore, we verified that the proposed system satisfies real-time requirements and mitigates both cold-start and overfitting problems.
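The abstract describes blending generalized and personalized preferences via two balance constants and a similarity threshold. The following is a minimal sketch of how such a blended prediction might look; the parameter names (`alpha`, `beta`, `theta`) and the blending order are illustrative assumptions, not the paper's actual formulation.

```python
def predict_preference(p_personal: float, p_general: float,
                       p_similar: float, similarity: float,
                       alpha: float = 0.5, beta: float = 0.5,
                       theta: float = 0.6) -> float:
    """Blend personalized and generalized preference scores.

    alpha and beta stand in for the two balance constants; theta
    stands in for the similarity threshold below which scores from
    similar users are ignored. All names are placeholders.
    """
    # First balance constant trades off personal vs. general preference
    score = alpha * p_personal + (1 - alpha) * p_general
    # Second balance constant mixes in similar-user preferences,
    # but only when the similarity clears the threshold
    if similarity >= theta:
        score = beta * score + (1 - beta) * p_similar
    return score
```

With `similarity` below `theta`, only the first blend applies; above it, the similar-user score contributes as well.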
Keywords: cold start; data sparsity; information filtering; preference tree; real-time requirements; recommendation systems
Year: 2022 PMID: 35455166 PMCID: PMC9030273 DOI: 10.3390/e24040503
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Classification of recommendation systems.
Figure 2. Proposed recommendation system architecture.
Figure 3. Structure of the proposed preference-tree model.
Figure 4. Time complexities of the data search with a matrix and an ordered HashMap.
Components of the preference-tree node.

| Component | Description |
|---|---|
| | Mapped category code of the node |
| | Number of category |
| | Child nodes of the category |
| | Depth of the category |
| | Total number of categories at depth |
| | Maximum depth of the descendant node |
| | User preference for the category |
| | Weight of historical data in the |
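The node components above can be sketched as a simple data structure. The field names below are assumptions chosen to match the descriptions in the table (the original symbols were lost in extraction), and the depth helper is one plausible way to derive the maximum descendant depth.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class PreferenceTreeNode:
    code: str                     # mapped category code of the node
    count: int = 0                # number of category occurrences
    depth: int = 0                # depth of the category in the tree
    preference: float = 0.0       # user preference for the category
    history_weight: float = 1.0   # weight of historical data
    children: Dict[str, "PreferenceTreeNode"] = field(default_factory=dict)

    def max_descendant_depth(self) -> int:
        """Maximum depth reached by any descendant of this node."""
        if not self.children:
            return self.depth
        return max(c.max_descendant_depth() for c in self.children.values())
```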
Figure 5. Average runtime and error of preference scores according to probability ρ.
Figure 6. Example of a collaborative similarity graph.
Average values of precision, recall, F1-measure, accuracy, and novelty measured while varying one of the three model parameters; the other two parameters were set to 0.5.
| | Precision | Recall | F1-measure | Accuracy | Novelty |
|---|---|---|---|---|---|
| 0.3 | 0.659 | 0.713 | 0.685 | 0.596 | 0.285 |
| 0.4 | 0.633 | 0.737 | 0.681 | 0.579 | 0.224 |
| 0.5 | 0.646 | 0.725 | 0.683 | 0.588 | 0.302 |
| 0.6 | 0.635 | 0.739 | 0.683 | 0.570 | 0.331 |
| 0.7 | 0.620 | 0.739 | 0.674 | 0.571 | 0.275 |
| 0.8 | 0.604 | 0.689 | 0.644 | 0.543 | 0.292 |
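The third numeric column of the tables above is the F1-measure, the harmonic mean of precision and recall. A quick check against the first table's rows confirms this reading:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# First row of the table: precision 0.659, recall 0.713 -> F1 = 0.685
print(round(f1(0.659, 0.713), 3))  # 0.685
```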
Average values of precision, recall, F1-measure, accuracy, and novelty measured while varying one of the three model parameters; the other two parameters were set to 0.5.
| | Precision | Recall | F1-measure | Accuracy | Novelty |
|---|---|---|---|---|---|
| 0.4 | 0.584 | 0.661 | 0.620 | 0.545 | 0.329 |
| 0.5 | 0.601 | 0.685 | 0.641 | 0.578 | 0.302 |
| 0.6 | 0.612 | 0.747 | 0.673 | 0.582 | 0.285 |
| 0.7 | 0.619 | 0.751 | 0.678 | 0.589 | 0.224 |
Average values of precision, recall, F1-measure, accuracy, and novelty measured while varying one of the three model parameters; the other two parameters were set to 0.5.
| | Precision | Recall | F1-measure | Accuracy | Novelty |
|---|---|---|---|---|---|
| 0.3 | 0.572 | 0.739 | 0.645 | 0.550 | 0.234 |
| 0.4 | 0.598 | 0.742 | 0.662 | 0.556 | 0.299 |
| 0.5 | 0.601 | 0.685 | 0.641 | 0.578 | 0.372 |
| 0.6 | 0.606 | 0.709 | 0.653 | 0.588 | 0.385 |
| 0.7 | 0.601 | 0.699 | 0.647 | 0.572 | 0.424 |
| 0.8 | 0.625 | 0.656 | 0.640 | 0.559 | 0.422 |
Average values of precision, recall, F1-measure, accuracy, and novelty measured with two of the model parameters set to 0.5 and the third set to 0.6. Proposed Scheme A uses the personalized tree only. Proposed Scheme B uses the personalized and federated trees. Proposed Scheme C uses the personalized and similarity trees. Proposed Scheme D uses the personalized, federated, and similarity trees.
| Scheme | Precision | Recall | F1-measure | Accuracy | Novelty |
|---|---|---|---|---|---|
| Proposed Scheme A | 0.581 | 0.704 | 0.637 | 0.513 | 0.188 |
| Proposed Scheme B | 0.554 | 0.759 | 0.640 | 0.588 | 0.341 |
| Proposed Scheme C | 0.583 | 0.722 | 0.645 | 0.529 | 0.332 |
| Proposed Scheme D | 0.606 | 0.709 | 0.653 | 0.588 | 0.385 |
| MF-based | 0.590 | 0.682 | 0.633 | 0.544 | 0.229 |
| Max-heap-tree-based | 0.553 | 0.611 | 0.581 | 0.467 | 0.113 |
| Knowledge-based | 0.718 | 0.633 | 0.673 | 0.575 | N/A |
Figure 7. Execution time required for updating the tree and predicting preferences.