Developing and Evaluating a University Recommender System
Mehdi Elahi, Alain Starke, Nabil El Ioini, Anna Alexander Lambrix, Christoph Trattner.
Abstract
A challenge for many young adults is to find the right institution at which to pursue higher education. Global university rankings are a commonly used but inefficient tool, as they do not consider a person's preferences and needs. For example, some people pursue prestige in their higher education, while others prefer proximity. This paper develops and evaluates a university recommender system that elicits user preferences as ratings to build predictive models and to generate personalized university ranking lists. In Study 1, we performed an offline evaluation on a rating dataset to determine which recommender approaches had the highest predictive value. In Study 2, we selected three algorithms to produce different university recommendation lists in our online tool, asking users to compare and evaluate them in terms of different metrics (Accuracy, Diversity, Perceived Personalization, Satisfaction, and Novelty). We show that an SVD algorithm scores high on accuracy and perceived personalization, while a KNN algorithm scores higher on novelty. We also report findings on preferred university features.
Keywords: education; offline evaluation; recommender systems; university; usability; user study
Year: 2022 PMID: 35187474 PMCID: PMC8848746 DOI: 10.3389/frai.2021.796268
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Figure 1. System architecture of our university recommender system. It depicts the flow of information in the system, as well as the different steps and features that users (depicted on the left-hand side) go through when using it.
Figure 2. Snapshots of the system at different stages of user interaction. In (A), users had to select at least three features they found important when selecting a university. In (B), they were asked to rate at least three universities, while in (C) they were presented with three personalized university recommendation lists. Panel (D) depicts the System Usability Questionnaire. Not depicted are the demographics and user evaluation screens.
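The flow in Figures 1-2 is a standard rating-based collaborative-filtering loop: users rate at least three universities, a model is trained on all elicited ratings, and unrated universities are ranked by predicted rating. Below is a minimal sketch of that loop, assuming the Python Surprise library and hypothetical ratings; neither the paper's software stack nor its data are given in this record.

```python
# Minimal sketch of the rate-then-recommend flow from Figure 2 (B)-(C).
# Assumes the Surprise library; the paper's exact stack is not stated here.
from surprise import Dataset, Reader, SVD
import pandas as pd

# Hypothetical elicited ratings: each user rated at least three universities.
ratings = pd.DataFrame({
    "user": ["u1", "u1", "u1", "u2", "u2", "u2"],
    "university": ["UiB", "KTH", "TU Delft", "UiB", "ETH", "KTH"],
    "rating": [5, 3, 4, 2, 5, 4],
})

data = Dataset.load_from_df(ratings, Reader(rating_scale=(1, 5)))
model = SVD().fit(data.build_full_trainset())

# Rank the universities a user has not rated yet by predicted rating.
def top_n(user, n=3):
    rated = set(ratings.loc[ratings.user == user, "university"])
    candidates = set(ratings.university) - rated
    scored = [(u, model.predict(user, u).est) for u in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:n]

print(top_n("u1"))
```

In the deployed system, three such lists (from SVD, KNN1, and KNN2) were shown side by side, as in panel (C).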
Results of the offline experiment, performed using five-fold cross-validation.
| Algorithm | Type |  |  |  |  |
|---|---|---|---|---|---|
| SVD | - |  |  |  |  |
| SVD++ | - | 22.6 | 26.0 | 1.3 | 24.1 |
| KNN1 | Basic | 24.9 | 29.2 | 1.7 | 27.7 |
| KNN2 | With baselines | 23.5 | 25.9 | 1.1 | 24.9 |
| Co-clustering | - | 24.2 | 29.9 | 2.1 | 27.5 |
| SlopeOne | - | 25.7 | 28.3 | 1.1 | 26.8 |
| Random | - | 34.5 | 39.8 | 2.1 | 36.5 |
SlopeOne and Random were used as baselines. Lower values of the Root Mean Square Error (RMSE) indicate that an algorithm has a higher predictive value. Denoted in bold is the best-performing algorithm in terms of RMSE.
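The algorithm lineup above (SVD, SVD++, KNN with and without baselines, Co-clustering, SlopeOne, and a random baseline) matches the implementations shipped with the Surprise library; that the authors used Surprise is an assumption based on those names. A five-fold comparison of this kind could be scripted as sketched below, here on a stand-in public dataset rather than the paper's university ratings.

```python
# Sketch of a five-fold cross-validated comparison like the one in this table.
# Algorithm names follow the Surprise library's implementations.
from surprise import (Dataset, SVD, SVDpp, KNNBasic, KNNBaseline,
                      CoClustering, SlopeOne, NormalPredictor)
from surprise.model_selection import cross_validate

algorithms = {
    "SVD": SVD(),
    "SVD++": SVDpp(),
    "KNN1 (basic)": KNNBasic(),
    "KNN2 (with baselines)": KNNBaseline(),
    "Co-clustering": CoClustering(),
    "SlopeOne": SlopeOne(),
    "Random": NormalPredictor(),  # draws ratings from a normal fit to the train set
}

# Stand-in dataset (prompts to download on first use); the paper used its own ratings.
data = Dataset.load_builtin("ml-100k")
for name, algo in algorithms.items():
    out = cross_validate(algo, data, measures=["RMSE", "MAE"], cv=5, verbose=False)
    print(f"{name}: RMSE={out['test_rmse'].mean():.3f}")
```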
Results of paired t-tests on different evaluation metrics (based on Ekstrand et al., 2014), in which users were asked to choose a recommendation list in relation to specific metrics.
| Metric | Item | % | % | % | t | t | t |
|---|---|---|---|---|---|---|---|
| Acc | 1...has more selections that you find appealing? |  | 37 | 12 | 1.00 | 3.56 | 2.36 |
|  | 2... |  |  | 56 | 0.00 | -2.65 | 2.65 |
|  | 3... | 42 |  |  | 0.93 | 0.93 | 0.00 |
| Div | 4...has a more varied selection of universities? | 24 | 32 |  | -0.62 | -1.54 | -0.90 |
|  | 5...has items that match a wider variety of preferences? | 27 |  | 29 | -1.31 | -0.21 | 1.10 |
| Und | 6...better reflects your preferences in universities? |  | 29 | 12 | 2.08 | 4.18 | 1.74 |
|  | 7...seems more personalized to your preferences? |  | 39 | 10 | 0.82 | 3.96 | 2.91 |
| Sat | 8... | 44 | 37 |  | 0.52 | 2.03 | 1.48 |
|  | 9...would better help you find universities to consider? |  | 37 | 12 | 1.00 | 3.56 | 2.36 |
|  | 10...would you be more likely to recommend to friends? |  | 34 | 15 | 1.19 | 3.19 | 1.84 |
| Nov | 11...has more universities you did not expect? | 17 | 15 |  | 0.27 | -4.21 | -4.61 |
|  | 12... | 46 | 44 |  | 0.16 | 3.54 | 3.33 |
|  | 13...has more pleasantly surprising universities? |  |  | 32 | 0.00 | 0.19 | 0.19 |
|  | 14... | 44 | 37 |  | 0.52 | 2.03 | 1.48 |
Lists were generated by different algorithms (SVD, KNN1, KNN2); metrics were abbreviated as follows: Accuracy (Acc), Diversity (Div), Understands Me (Und), Satisfaction (Sat), and Novelty (Nov). For positive items, the highest percentages were denoted in bold; for negative items (set in italics), the lowest percentages.
***p < 0.001, **p < 0.01, *p < 0.05.
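The t values above come from paired t-tests over users' pairwise list comparisons. One plausible encoding, sketched below with hypothetical data (the authors' exact per-user coding is not given in this record), codes each user's choice between two lists as +1, -1, or 0 and tests the mean against zero, which is equivalent to a paired t-test on the per-user differences.

```python
# Sketch of one plausible paired comparison for an item such as
# "...has more selections that you find appealing?" between two lists.
# Counts are hypothetical; the per-user responses are not in this record.
import numpy as np
from scipy import stats

# +1: preferred the first list, -1: preferred the second, 0: no preference.
choices = np.array([+1] * 26 + [-1] * 19 + [0] * 7)  # hypothetical 52 users
t, p = stats.ttest_1samp(choices, popmean=0.0)
print(f"t = {t:.2f}, p = {p:.3f}")
```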
Self-reported features that users consider most important when choosing a university to attend, broken down by gender.
| Feature | All (%) | Female (%) | Male (%) |
|---|---|---|---|
| 1. | 80.8 | 90.9 | 81.6 |
| 2. | 55.8 | 72.7 | 50.0 |
| 3. | 51.9 | 63.6 | 50.0 |
| 4. | 42.3 | 72.7 | 36.8 |
| 5. | 36.5 | 54.5 | 31.6 |
| 6. | 36.4 | 29 | 31.6 |
| 7. | 32.7 | 27.3 | 34.2 |
| 8. | 26.9 | 27.3 | 28.9 |
| 9. | 23.1 | 9.1 | 23.7 |
| 10. | 19.2 | 36.4 | 15.8 |
| 11. | 1.9 | 0 | 2.6 |
Users could select multiple features. Three users did not wish to disclose their gender and are only considered in the “All” column.
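The footnote implies a specific denominator per column: users who did not disclose their gender count toward "All" but toward neither gender column. The sketch below reconstructs row 1 of the table under group sizes inferred from the printed percentages (11 female, 38 male, 3 undisclosed, 52 in total); the individual selections are hypothetical but consistent with those percentages.

```python
# Sketch of the percentage computation implied by the footnote.
import pandas as pd

users = pd.DataFrame({
    "gender": ["female"] * 11 + ["male"] * 38 + [None] * 3,
    "selected_feature_1": [True] * 10 + [False] * 1   # 10/11 females
                        + [True] * 31 + [False] * 7   # 31/38 males
                        + [True] * 1 + [False] * 2,   # undisclosed users
})

pct_all = 100 * users.selected_feature_1.mean()                      # -> 80.8
# groupby drops the undisclosed (None) rows, as the footnote requires.
by_gender = 100 * users.groupby("gender").selected_feature_1.mean()  # 90.9 / 81.6
print(round(pct_all, 1), by_gender.round(1).to_dict())
```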
Frequencies of user responses to questionnaire items (i.e., propositions, such as “P1”) from the System Usability Scale (SUS) (Brooke, 1996).
| Item | 1 | 2 | 3 | 4 | 5 | Score |
|---|---|---|---|---|---|---|
| P01 Positive | 3 | 4 | 9 | 16 | 5 | 1.43 |
| P02 Negative | 10 | 16 | 10 | 1 | 0 | 3.95 |
| P03 Positive | 1 | 1 | 4 | 16 | 15 | 2.16 |
| P04 Negative | 21 | 10 | 4 | 2 | 0 | 4.35 |
| P05 Positive | 1 | 2 | 11 | 19 | 4 | 1.62 |
| P06 Negative | 7 | 20 | 7 | 3 | 0 | 3.84 |
| P07 Positive | 0 | 2 | 2 | 18 | 15 | 2.24 |
| P08 Negative | 12 | 18 | 7 | 0 | 0 | 4.14 |
| P09 Positive | 0 | 2 | 8 | 19 | 8 | 1.89 |
| P10 Negative | 18 | 12 | 4 | 2 | 1 | 4.19 |
Items were measured using 5-point Likert scales. Each item measured whether a usability aspect was evaluated positively or negatively. The SUS score, computed as the sum of the denoted scores multiplied by 2.5, was 74.5 out of 100, which suggests that our system has good usability (Brooke, 1996).
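As a worked check of the footnote's computation, the per-item scores in the table are reproduced from the response frequencies if positive items contribute mean - 2 and negative items 6 - mean; this scoring is inferred from the printed Score column (Brooke's original scheme uses mean - 1 and 5 - mean with 1-5 coding). Summing and multiplying by 2.5 yields the reported 74.5.

```python
# Worked check of the SUS computation described in the footnote.
# Frequencies are taken from the table; the per-item scoring is inferred
# from the printed Score column, not stated by the authors.
freqs = {  # item: (counts for response options 1..5, item polarity)
    "P01": ([3, 4, 9, 16, 5], "pos"),  "P02": ([10, 16, 10, 1, 0], "neg"),
    "P03": ([1, 1, 4, 16, 15], "pos"), "P04": ([21, 10, 4, 2, 0], "neg"),
    "P05": ([1, 2, 11, 19, 4], "pos"), "P06": ([7, 20, 7, 3, 0], "neg"),
    "P07": ([0, 2, 2, 18, 15], "pos"), "P08": ([12, 18, 7, 0, 0], "neg"),
    "P09": ([0, 2, 8, 19, 8], "pos"),  "P10": ([18, 12, 4, 2, 1], "neg"),
}

total = 0.0
for item, (counts, polarity) in freqs.items():
    n = sum(counts)
    mean = sum(c * v for v, c in enumerate(counts, start=1)) / n
    total += mean - 2 if polarity == "pos" else 6 - mean

print(round(total * 2.5, 1))  # -> 74.5, matching the reported SUS score
```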