Min Gao, Quan Yuan, Bin Ling, Qingyu Xiong.
Abstract
With the rapid development of e-business, personalized recommendation has become a core competence that helps enterprises increase profits and improve customer satisfaction. Although collaborative filtering is the most successful approach for building recommender systems, it is vulnerable to "shilling" attacks. Research on shilling attacks has advanced considerably in recent years. However, existing approaches suffer from two serious problems: dependence on a specific attack model and high computational cost. To address these problems, this paper proposes an approach for detecting abnormal items. Two features common to all attack models are analyzed first. A revised bottom-up discretization approach, based on time intervals and these features, is then proposed for the detection. The distributions of ratings in different time intervals are compared, and anomalies are detected based on the chi-square (χ²) statistic. We evaluated the approach on four types of items, which are defined according to their life cycles. The experimental results show that the proposed approach achieves a high detection rate with low computational cost when the number of attack profiles is more than 15. It improves the efficiency of shilling attack detection by narrowing down the set of suspicious users.
Year: 2014 PMID: 24693248 PMCID: PMC3945428 DOI: 10.1155/2014/845897
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
The features of the attack models.
| Attack model | | | |
|---|---|---|---|
| Random attack | Ø | | |
| Average attack | Ø | The ratings for | |
| Bandwagon attack | Widely popular items, | | |
| Segment attack | Similar items to target items, | | |
Significance levels and related boundary values.
| Significance level | 0.25 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|---|
| Boundary value | 5.385 | 7.779 | 9.488 | 11.143 | 13.277 | 14.860 |
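The boundary values above agree with the upper-tail critical values of a chi-square distribution with 4 degrees of freedom. The record does not state the degrees of freedom, so the value 4 is an assumption here (it would correspond to five rating levels). The following minimal sketch checks this using the closed-form tail probability for 4 degrees of freedom, P(X > x) = exp(-x/2)(1 + x/2); the names `chi2_sf_df4` and `BOUNDARIES` are illustrative only.

```python
import math


def chi2_sf_df4(x: float) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with 4 df.

    For an even number of degrees of freedom 2k the tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)^i / i!; with k = 2 this is exp(-x/2) * (1 + x/2).
    """
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Significance level -> boundary value, copied from the table above.
BOUNDARIES = {
    0.25: 5.385,
    0.10: 7.779,
    0.05: 9.488,
    0.025: 11.143,
    0.01: 13.277,
    0.005: 14.860,
}

for alpha, boundary in BOUNDARIES.items():
    # Each tail probability should land close to its significance level
    # if the assumed 4 degrees of freedom are right.
    print(f"level={alpha:<6} boundary={boundary:<7} tail={chi2_sf_df4(boundary):.4f}")
```

A χ² value larger than the boundary for a chosen significance level would then be treated as an anomalous rating distribution at that level.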
Monthly rating distributions.
| Month | Number | Number | Number | Number | Number | χ² |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 8 | 3 | 2 | 6.4189 |
| 2 | 0 | 6 | 9 | 14 | 20 | 19.501 |
| 3 | 1 | 3 | 6 | 6 | 5 | 0.7521 |
| 4 | 2 | 2 | 9 | 11 | 1 | 7.8567 |
| 5 | 0 | 1 | 4 | 8 | 0 | 7.5359 |
| 6 | 1 | 2 | 3 | 1 | 2 | 4.3157 |
| 7 | 1 | 1 | 8 | 4 | 2 | 3.4811 |
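The last column of the table can be reproduced, to within about 0.01, by a Pearson χ² test on a 2×5 contingency table that compares one month's rating counts with the pooled counts of all other months. This is a reconstruction from the tabulated numbers, not a formula stated in this record. The sketch below also assumes that the five "Number" columns are counts per rating level, and it uses the 0.05 boundary value 9.488 as an example threshold. The names `MONTHLY_COUNTS`, `chi_square_vs_rest`, and `flagged` are illustrative.

```python
# Rating counts per month, copied from the five "Number" columns of the table.
MONTHLY_COUNTS = [
    [0, 0, 8, 3, 2],
    [0, 6, 9, 14, 20],
    [1, 3, 6, 6, 5],
    [2, 2, 9, 11, 1],
    [0, 1, 4, 8, 0],
    [1, 2, 3, 1, 2],
    [1, 1, 8, 4, 2],
]

BOUNDARY_005 = 9.488  # boundary value at significance level 0.05 (assumed threshold)


def chi_square_vs_rest(counts, index):
    """Pearson chi-square of interval `index` against all other intervals pooled."""
    target = counts[index]
    totals = [sum(col) for col in zip(*counts)]      # column totals over all months
    rest = [t - c for t, c in zip(totals, target)]   # pooled counts of the other months
    n_target, n_rest = sum(target), sum(rest)
    n_all = n_target + n_rest
    stat = 0.0
    for row, n_row in ((target, n_target), (rest, n_rest)):
        for observed, col_total in zip(row, totals):
            expected = n_row * col_total / n_all
            if expected > 0:  # skip rating levels that never occur
                stat += (observed - expected) ** 2 / expected
    return stat


# Months (1-based) whose statistic exceeds the chosen boundary value.
flagged = []
for i in range(len(MONTHLY_COUNTS)):
    value = chi_square_vs_rest(MONTHLY_COUNTS, i)
    if value > BOUNDARY_005:
        flagged.append(i + 1)
    print(f"month {i + 1}: chi-square = {value:.4f}")
print("flagged months:", flagged)
```

Under these assumptions only month 2 exceeds the 0.05 boundary, which is consistent with its tabulated value of 19.501.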
Figure 1. The relationship between the number of ratings and the number of items.
Figure 2. The relationship between the number of ratings and life cycles.
Figure 3. χ² deviation values under different time interval sizes.
Figure 4. Detection rates and false alarm rates of fad items.
Figure 5. Detection rates and false alarm rates of fashion items.
Figure 6. Detection rates and false alarm rates of style items.
Figure 7. Detection rates and false alarm rates of scallop items.