Xibin Wang, Fengji Luo, Ying Qian, Gianluca Ranzi.
Abstract
With the rapid development of ICT and Web technologies, a large amount of information is becoming available, and in some instances this produces a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, assist a person in identifying possible products or services of interest based on his/her preferences. Among the available approaches, collaborative filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., a relatively simple similarity calculation and the cold start problem. In this context, this paper presents a new regression model based on support vector machine (SVM) classification and an improved particle swarm optimization (IPSO) algorithm for the development of an electronic movie PRS. In its implementation, an SVM classification model is first established to obtain a preliminary movie recommendation list, and an SVM regression model is then applied to that list to predict the movies' ratings. The proposed PRS not only considers the movies' content information but also integrates the users' demographic and behavioral information to better capture the users' interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set.
Year: 2016 PMID: 27898691 PMCID: PMC5127501 DOI: 10.1371/journal.pone.0165868
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The extraction of relationship feature information of “User-Movie”.
Fig 2. Regression model based on SVM classification for personalized recommendation.
Summary of the MovieLens 1M data set.
| File name | Description |
|---|---|
| user.dat | UserID, Gender (‘M’ for male and ‘F’ for female), Age (1: Under 18, 18: 18–24, 25: 25–34, 35: 35–44, 45: 45–49, 50: 50–55, 56: 56+), Occupation (0: “other” or not specified, 1: “academic/educator”, 2: “artist”, 3: “clerical/admin”, 4: “college/grad student”, 5: “customer service”, 6: “doctor/health care”, 7: “executive/managerial”, 8: “farmer”, 9: “homemaker”, 10: “K-12 student”, 11: “lawyer”, 12: “programmer”, 13: “retired”, 14: “sales/marketing”, 15: “scientist”, 16: “self-employed”, 17: “technician/engineer”, 18: “tradesman/craftsman”, 19: “unemployed”, 20: “writer”), and Zip-code |
| movie.dat | MovieID, Title, and Genres. Genres include: Action, Adventure, Animation, Children’s, Comedy, Crime, Documentary, Drama, Fantasy, Film-Noir, Horror, Musical, Mystery, Romance, Sci-Fi, Thriller, War, and Western. |
| Ratings.dat | UserID, MovieID, Rating, and Timestamp |
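As an illustration of how the coded user fields in the table could be decoded, the following is a minimal sketch and not the authors' preprocessing code. It assumes that each record is a single line whose fields are separated by `::`, which is the convention of the public MovieLens 1M release but is not stated in the table; the names `User` and `parse_user` and the sample line are hypothetical.

```python
from dataclasses import dataclass

# Code-to-label maps copied from the summary table above.
AGE_GROUPS = {
    1: "Under 18", 18: "18–24", 25: "25–34", 35: "35–44",
    45: "45–49", 50: "50–55", 56: "56+",
}
OCCUPATIONS = {
    0: "other or not specified", 1: "academic/educator", 2: "artist",
    3: "clerical/admin", 4: "college/grad student", 5: "customer service",
    6: "doctor/health care", 7: "executive/managerial", 8: "farmer",
    9: "homemaker", 10: "K-12 student", 11: "lawyer", 12: "programmer",
    13: "retired", 14: "sales/marketing", 15: "scientist",
    16: "self-employed", 17: "technician/engineer",
    18: "tradesman/craftsman", 19: "unemployed", 20: "writer",
}


@dataclass
class User:
    user_id: int
    gender: str       # 'M' or 'F'
    age_group: str    # decoded label, e.g. "25–34"
    occupation: str   # decoded label, e.g. "programmer"
    zip_code: str     # kept as text so leading zeros survive


def parse_user(line, sep="::"):
    """Parse one user record; the '::' separator is an assumption."""
    user_id, gender, age, occupation, zip_code = line.strip().split(sep)
    return User(
        user_id=int(user_id),
        gender=gender,
        age_group=AGE_GROUPS[int(age)],
        occupation=OCCUPATIONS[int(occupation)],
        zip_code=zip_code,
    )


# Hypothetical sample record in the assumed format.
sample = parse_user("1::F::1::10::48067")
print(sample)
```

An unknown age or occupation code raises a `KeyError` here, which makes malformed records visible instead of silently mapping them to a default.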
Fig 3. Accuracy of the recommendation models based on four methods.
The average classification accuracy (%) of four algorithms.
| Algorithm | 20% | 30% | 50% | 70% | 90% |
|---|---|---|---|---|---|
| IPSO | 72.3±3.3 | 72.9±2.8 | 73.7±2.5 | 74.9±2.1 | 75.4±1.9 |
| PSO | 69.7±3.6 | 70.9±3.2 | 71.6±2.7 | 72.1±2.4 | 73.7±2.2 |
| GA | 68.5±4.1 | 69.6±3.9 | 70.2±3.3 | 71.5±3.1 | 72.2±2.5 |
| GS | 70.7±5.1 | 72.1±4.6 | 73.3±3.8 | 74.2±3.1 | 74.5±2.9 |
Fig 4. The parameter optimization curve for the IPSO algorithm.
Fig 5. The parameter optimization curve for the GA.
Fig 6. The parameter optimization curve for the GS algorithm.
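To show what an optimization curve of this kind records, the following is a minimal sketch of a plain, textbook PSO loop. It is not the IPSO of the paper, whose modifications are not described in this record. The objective `toy_error` is a hypothetical stand-in for a model's validation error over two parameters, and the inertia weight `w` and acceleration coefficients `c1` and `c2` are common illustrative defaults rather than the authors' settings.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain PSO; returns (best position, best value, best-value history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    history = []
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)  # one point of the optimization curve
    return gbest, gbest_val, history


def toy_error(p):
    """Hypothetical error surface with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_val, history = pso_minimize(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

Plotting `history` against the iteration index gives a non-increasing error curve. A curve drawn in terms of accuracy would be the mirror image, rising as the search improves.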
Fig 7. The MAE of ratings based on six methods.
The figure compares regression based on classification, SVM direct regression, user-based collaborative filtering, item-based collaborative filtering, a BP neural network, and multiple linear regression.
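For reference, the mean absolute error (MAE) of N predicted ratings p_i against the true ratings r_i is conventionally the average of |p_i − r_i| over all N pairs, so lower values are better. The following is a minimal sketch of that standard definition; the function name and the sample ratings are hypothetical, and the paper's exact evaluation protocol is not shown in this record.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one rating pair is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical ratings on a 1-5 scale: errors are 0.5, 0.0 and 1.0.
mae = mean_absolute_error([3.5, 4.0, 2.0], [4, 4, 1])
print(mae)  # 0.5
```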