Yupei Zhang, Yue Yun, Rui An, Jiaqi Cui, Huan Dai, Xunqun Shang.
Abstract
Student performance prediction (SPP) aims to estimate the grade that a student will achieve before enrolling in a course or taking an exam. This prediction problem is a core task in personalized education and has attracted increasing attention in artificial intelligence and educational data mining (EDM). This paper provides a systematic review of SPP research from the perspective of machine learning and data mining. The review partitions SPP into five stages: data collection, problem formalization, model, prediction, and application. To build intuition about the methods involved, we conducted experiments on a dataset from our institute and on a public dataset. Our educational dataset, comprising 1,325 students and 832 courses, was collected from the information system and represents a typical higher-education setting in China. Based on the experimental results, we summarize current shortcomings and promising directions for future work, from data collection to practice. This work outlines the developments and challenges in SPP research and facilitates progress in personalized education.
Keywords: educational data mining (EDM); pattern recognition; personalized education; review and discussion; student performance prediction
Year: 2021 PMID: 34950079 PMCID: PMC8688359 DOI: 10.3389/fpsyg.2021.698490
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
The works studied in different situations.
| | | | |
|---|---|---|---|
| Offline classroom | | Al-Radaideh et al., | 20 |
| | Historical grade data & background information | Elbadrawy et al., | 5 |
| Online classroom | Historical grade data & background information | Tabandeh and Sami, | 18 |
| | Historical grade data & background information | Kloft et al., | 9 |
| Blending classroom | | Elbadrawy et al., | 7 |
The related works with different machine learning models.
| | | | |
|---|---|---|---|
| Decision trees | Classification | Safavian and Landgrebe, | 10 |
| Linear regression | Regression | Tabandeh and Sami, | 13 |
| Support vector machines | Classification | Kentli and Sahin, | 4 |
| Matrix factorization | Regression / Clustering | Lee and Seung, | 15 |
| Collaborative filtering | Classification / Clustering | Sheena et al., | 7 |
| Artificial neural network | Classification / Clustering / Regression | Andrews et al., | 9 |
| Deep learning | Classification / Clustering / Regression | Guo et al., | 5 |
| Other methods | Regression / Clustering | Slim et al., | 3 |
The list of references for performance evaluation.
| | | |
|---|---|---|
| Single course grade prediction | Tabandeh and Sami, | 9 |
| The next-term performance prediction | Sweeney et al., | 4 |
| Whole learning period's performance prediction | Oladokun et al., | 7 |
The list of references of the practical application of SPP.
| | | |
|---|---|---|
| The recommendation system | Ray and Sharma, | 6 |
| Early warning system | Blanchfield, | 9 |
| Other applications | Jussim, | 7 |
The details of the two datasets used in experiments.
| | | | | | |
|---|---|---|---|---|---|
| Dataset 1 | 0-9 | 10-11 | 12-13 | 14-15 | 16-20 |
| | Fail | Sufficient | Satisfactory | Good | Excellent |
| Dataset 2 | grade < 60 | 60 ≤ grade ≤ 80 | 80 < grade | | |
| | Warning | Good | Very Good | | |
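The grade bands in the table above can be read as a simple labeling rule that turns a numeric grade into a class for classification experiments. The following is a minimal sketch of that reading, not the authors' code: the function names `label_dataset1` and `label_dataset2` are hypothetical, and Dataset 1 grades are assumed to be integers on a 0-20 scale, as the band boundaries suggest.

```python
def label_dataset1(grade: int) -> str:
    """Map a 0-20 grade to the five levels listed for Dataset 1.

    Assumption: grades are integers, so the bands 0-9, 10-11, 12-13,
    14-15, and 16-20 cover every possible value without gaps.
    """
    if not 0 <= grade <= 20:
        raise ValueError(f"grade {grade} is outside the 0-20 scale")
    if grade <= 9:
        return "Fail"
    if grade <= 11:
        return "Sufficient"
    if grade <= 13:
        return "Satisfactory"
    if grade <= 15:
        return "Good"
    return "Excellent"


def label_dataset2(grade: float) -> str:
    """Map a grade to the three levels listed for Dataset 2.

    Boundaries follow the table: below 60 is a warning, 60 to 80
    inclusive is good, and anything above 80 is very good.
    """
    if grade < 60:
        return "Warning"
    if grade <= 80:
        return "Good"
    return "Very Good"


if __name__ == "__main__":
    print([label_dataset1(g) for g in (8, 10, 13, 15, 18)])
    print([label_dataset2(g) for g in (45, 60, 80, 92)])
```

Keeping the rule in one small function per dataset makes the boundary handling explicit, which matters for the inclusive limits at 60 and 80 in Dataset 2.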
Figure 1. Mathematics.
Figure 2. Portuguese language.
Figure 3. In (A,C): the solid lines are classification methods, evaluated by accuracy (ACC) on the left primary axis. The dotted lines are regression methods, evaluated by root mean squared error (RMSE) on the right secondary axis. In (B,D): the x-axis lists the courses whose weights are nonzero after feature selection by Lasso, and (*) denotes the semester of the course. The y-axis is the weight of the course.
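The two evaluation metrics named in the Figure 3 caption can be sketched as follows. This is an illustrative implementation of the standard definitions of accuracy and root mean squared error, assuming equal-length, non-empty sequences of true and predicted values; the function names `accuracy` and `rmse` are chosen here for illustration and are not taken from the paper.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that exactly match the true class label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the mean squared difference between true and predicted grades."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


if __name__ == "__main__":
    # Classification: higher ACC is better.
    print(accuracy(["Good", "Fail", "Good"], ["Good", "Good", "Good"]))
    # Regression: lower RMSE is better.
    print(rmse([10, 12, 14], [11, 12, 16]))
```

Because higher ACC is better while lower RMSE is better, the two curve families in panels (A,C) are read in opposite directions, which is why they sit on separate axes.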
Figure 4. The number of publications for student performance prediction per year.