Yishuai Geng1, Xiao Xiao2, Xiaobing Sun1, Yi Zhu1.
Abstract
The last decades have witnessed a vast amount of interest and research in feature representation learning from multiple disciplines, such as biology and bioinformatics. Among all the real-world application scenarios, feature extraction from knowledge graph (KG) for personalized recommendation has achieved substantial performance for addressing the problem of information overload. However, the rating matrix of recommendations is usually sparse, which may result in significant performance degradation. The crucial problem is how to extract and extend features from additional side information. To address these issues, we propose a novel feature representation learning method for the recommendation in this paper that extends item features with knowledge graph via triple-autoencoder. More specifically, the comment information between users and items is first encoded as sentiment classification. These features are then applied as the input to the autoencoder for generating the auxiliary information of items. Second, the item-based rating, the side information, and the generated comment representations are incorporated into the semi-autoencoder for reconstructed output. The low-dimensional representations of this extended information are learned with the semi-autoencoder. Finally, the reconstructed output generated by the semi-autoencoder is input into a third autoencoder. A serial connection between the semi-autoencoder and the autoencoder is designed here to learn more abstract and higher-level feature representations for personalized recommendation. Extensive experiments conducted on several real-world datasets validate the effectiveness of the proposed method compared to several state-of-the-art models.Entities:
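As a concrete illustration of the first step described in the abstract (encoding user–item comments via sentiment classification to obtain auxiliary item features), the following minimal Python sketch shows one way such features could be produced. The keyword-based scorer and the three-dimensional feature layout are assumptions made here for illustration only; they are not the authors' classifier.

```python
# Illustrative sketch (not the authors' code): turn each item's comments into a
# small sentiment feature vector that can serve as auxiliary autoencoder input.
import numpy as np

POSITIVE = {"good", "great", "excellent", "love", "enjoyable"}
NEGATIVE = {"bad", "boring", "poor", "hate", "disappointing"}

def comment_sentiment(comment: str) -> float:
    """Crude polarity score in [-1, 1] based on keyword counts (stand-in classifier)."""
    words = comment.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def item_comment_features(comments_per_item: dict, n_items: int) -> np.ndarray:
    """Aggregate per-item comment sentiment into a feature vector:
    [mean polarity, share of positive comments, comment count]."""
    feats = np.zeros((n_items, 3))
    for item, comments in comments_per_item.items():
        scores = np.array([comment_sentiment(c) for c in comments])
        if scores.size:
            feats[item] = [scores.mean(), float((scores > 0).mean()), len(scores)]
    return feats
```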
Keywords: autoencoder; collaborative filtering; personalized recommendation; representation learning; semi-autoencoder
Year: 2022 PMID: 35719384 PMCID: PMC9204654 DOI: 10.3389/fgene.2022.891265
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1 Illustration of a semi-autoencoder, where the input and output layers can be inconsistent: the input layer is longer than the output layer in the left part and shorter in the right part.
FIGURE 2 Whole framework of the proposed KGTA.
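Figure 2 itself is not reproduced here. As a rough sketch of how the three autoencoders in the KGTA framework could be wired in series, the following hypothetical PyTorch illustration follows the abstract's description; layer sizes, activations, and the exact composition of the semi-autoencoder input are assumptions, not the published architecture.

```python
# Hypothetical wiring of the three autoencoders described in the abstract and Figure 2.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Plain autoencoder: input and output have the same dimension."""
    def __init__(self, dim_in: int, dim_hidden: int):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.Sigmoid())
        self.decode = nn.Sequential(nn.Linear(dim_hidden, dim_in), nn.Sigmoid())
    def forward(self, x):
        h = self.encode(x)
        return self.decode(h), h

class SemiAutoEncoder(nn.Module):
    """Semi-autoencoder: the output dimension may differ from the input dimension
    (here, only the rating part of the extended input is reconstructed)."""
    def __init__(self, dim_in: int, dim_hidden: int, dim_out: int):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.Sigmoid())
        self.decode = nn.Sequential(nn.Linear(dim_hidden, dim_out), nn.Sigmoid())
    def forward(self, x):
        h = self.encode(x)
        return self.decode(h), h

# Hypothetical dimensions: m users, item attribute size, comment feature size, hidden size.
m, dim_attr, dim_comment, k = 943, 32, 16, 64
comment_ae = AutoEncoder(dim_comment, k)           # step 1: encode comment sentiment features
semi_ae = SemiAutoEncoder(m + dim_attr + k, k, m)  # step 2: ratings + side info + comment codes
rating_ae = AutoEncoder(m, k)                      # step 3: refine the reconstructed ratings

def kgta_forward(ratings, attrs, comment_feats):
    """Serial connection: the semi-autoencoder's output feeds the third autoencoder."""
    _, comment_code = comment_ae(comment_feats)
    recon_ratings, _ = semi_ae(torch.cat([ratings, attrs, comment_code], dim=1))
    predictions, _ = rating_ae(recon_ratings)
    return predictions
```

The serial connection mirrors the abstract: the semi-autoencoder's reconstructed ratings feed a third autoencoder, which is intended to learn more abstract, higher-level representations used for the final prediction.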
TABLE 1 Important notations used in this article and their descriptions.

| Notation | Description |
|---|---|
|  | The rating matrix |
|  | The attribute vectors of all items |
|  | The reconstructed comment vectors of all items |
|  | The language vectors of all items |
|  | The prediction matrix |
|  | The number of users |
|  | The number of items |
|  | The column of the rating matrix |
|  | The row of the rating matrix |
|  | The feature dimension of the hidden units |
|  | The number of hidden units |
|  | The |
|  | The reconstructed output of |
|  | The hidden feature representation matrix |
|  | The map and remap weight matrices |
|  | The map and remap bias vectors |
| • | The element-wise product of vectors or matrices |
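The map/remap weights, biases, hidden representation, and reconstructed output listed above are typically related by the standard semi-autoencoder equations. The following is a hedged sketch of that formulation; the symbol names are chosen here for illustration and are not recovered from the paper.

```latex
% Hedged reconstruction of the standard semi-autoencoder mapping;
% symbol names are illustrative, not taken from the paper.
\[
\begin{aligned}
H           &= g(XW + b)   && \text{map: hidden feature representation}\\
\hat{X}     &= f(HW' + b') && \text{remap: reconstructed output}\\
\mathcal{L} &= \bigl\lVert \hat{X} - X_{\mathrm{out}} \bigr\rVert_F^2
            && \text{reconstruction loss; } X_{\mathrm{out}} \text{ may be shorter than the input } X
\end{aligned}
\]
```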
TABLE 2 Details of the three datasets used in our experiments.
| Dataset | Number of users | Number of items | Number of ratings | Rating density (%) |
|---|---|---|---|---|
| MovieLens 100K | 943 | 1,682 | 100,000 | 6.3 |
| MovieLens 1M | 6,039 | 3,883 | 1,000,209 | 4.27 |
TABLE 3 The performance of RMSE on the MovieLens 100K and MovieLens 1M datasets.

| Dataset | Method | Proportion of training data | | | |
|---|---|---|---|---|---|
| MovieLens 100K | - | | | | |
| | NMF | 0.991 | 0.976 | 0.965 | 0.960 |
| | SVD++ | 0.943 | 0.927 | 0.915 | 0.909 |
| | MetaHIN | 1.062 | 1.046 | 1.041 | 1.032 |
| | MeLU | 1.154 | 1.144 | 1.132 | 1.121 |
| | AutoRec | 1.023 | 1.003 | 0.981 | 0.964 |
| | HCRSA | 0.948 | 0.937 | 0.923 | 0.919 |
| | PRKG | 0.928 | 0.917 | 0.907 | 0.899 |
| | KGTA | | | | |
| MovieLens 1M | NMF | 0.928 | 0.925 | 0.921 | 0.918 |
| | MetaHIN | 1.024 | 0.993 | 0.965 | 0.959 |
| | MeLU | 1.082 | 1.038 | 1.008 | 0.973 |
| | NCF | 0.914 | 0.911 | 0.909 | 0.907 |
| | AutoRec | 0.914 | 0.905 | 0.896 | 0.888 |
| | HCRSA | 0.903 | 0.892 | 0.884 | 0.874 |
| | PRKG | 0.885 | 0.875 | 0.868 | 0.861 |
| | KGTA | | | | |
The bold values in Table 3 are the experimental results of our proposed method (KGTA) and are the best results among all compared methods.
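For reference, the RMSE reported in Table 3 follows the standard definition, computed over observed test ratings only. A minimal sketch is given below; the masking convention is an assumption, as the record does not spell out the evaluation protocol.

```python
# Standard RMSE over observed (mask == 1) test ratings; shown for clarity only.
import numpy as np

def rmse(predicted: np.ndarray, actual: np.ndarray, mask: np.ndarray) -> float:
    """Root-mean-square error restricted to observed entries of the rating matrix."""
    errors = (predicted - actual)[mask.astype(bool)]
    return float(np.sqrt(np.mean(errors ** 2)))
```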
FIGURE 3 RMSE of our KGTA and compared methods on the MovieLens 100K dataset.
FIGURE 4 RMSE of our KGTA and compared methods on the MovieLens 1M dataset.
FIGURE 5 The influence of the number of hidden-layer neurons on our KGTA. (A) Performance on MovieLens 100K. (B) Performance on MovieLens 1M.
FIGURE 6 The influence of the number of epochs on our KGTA. (A) Performance on MovieLens 100K. (B) Performance on MovieLens 1M.
FIGURE 7 The influence of the length of comments on our KGTA. (A) Performance on MovieLens 100K. (B) Performance on MovieLens 1M.