Souad Larabi-Marie-Sainte1, Roohi Jan1, Ali Al-Matouq2, Sara Alabduhadi1.
Abstract
Students' academic performance is a central concern for both students and academic institutions in higher education. This performance can be affected by several factors, one of which is student absences, mainly because absent students miss lectures and other class activities. Studies of university timetabling investigate techniques and algorithms for designing course timetables without analyzing the relationship between student attendance behavior and timetable design. This article first aimed to demonstrate the impact of absences and timetable design on students' academic performance. Second, it showed that the number of absences can be caused by three main timetable design factors: (1) the number of courses per semester, (2) the average number of lectures per day, and (3) the average number of free timeslots per day. This was demonstrated using Educational Data Mining on a large dataset collected from Prince Sultan University. Prediction performance reached 92% when predicting students' GPA from absences and the timetable design factors, and 87% when predicting student absences from the three timetable factors listed above. The results demonstrate the importance of designing course timetables with student absence behavior in mind. Suggestions include limiting the number of enrolled courses based on a student's GPA, avoiding both overloaded and nearly empty days, and using automated timetabling to minimize the number of predicted absences. This in turn will help generate balanced student timetables and thus improve student academic performance.
Year: 2021 PMID: 34170914 PMCID: PMC8232426 DOI: 10.1371/journal.pone.0253256
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Impact on students' GPA of the number of absences due to timetabling factors.
Fig 2. Representation of over- and under-sampling.
Fig 3. The architecture of the Neural Network used.
The dataset attributes and their descriptions.
| Attribute number | Attribute name | Description |
|---|---|---|
| 1 | ID | Student Identification Number. |
| 2 | Prev GPA | GPA of the previous semester. |
| 3 | GPA | GPA of the current semester. |
| 4 | Department | Department to which the student belongs. |
| 5 | Level | The academic level of a student such as Freshman, Sophomore, Junior and Senior. |
| 6 | Term | Academic Term. Five academic terms were considered in this study (fall and spring terms between 2016 and 2019). |
| 7 | Attempted_Hours | Number of registered hours in a semester per student. |
| 8 | Number of courses | Number of courses in which a student is enrolled. |
| 9 | Number of Off days | Number of days when there are no classes for a student. |
| 10 | Average Lecture per day | Average number of lectures per day. |
| 11 | Average break per day | Average number of break hours per day for a student. |
| 12 | Total Number of absences | Total Number of absences for a student throughout the semester. |
The sizes of the CCIS and ENG datasets.
| Dataset | Sample Size |
|---|---|
| ENG | 2661 |
| CCIS | 1664 |
| Total | 4325 |
Fig 4. Scatter plot of the average number of absences and the GPA.
The results of the Linear Regression for the GPA variable using the whole dataset.
| Estimate | Std. Error | Pr(>\|t\|) | |
|---|---|---|---|
| 3.85732 | 0.11131 | <2 | |
| -0.06285 | 0.00406 | <2 | |
| 239 | |||
| 0.471 | |||
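As a worked illustration of how the two coefficient rows could be read, the sketch below evaluates a one-predictor linear model. It rests on an assumption that the table itself does not confirm, because its row labels were lost: that 3.85732 is the intercept and -0.06285 is the slope for the number of absences. The function name `predicted_gpa` is hypothetical.

```python
# Assumption: the first coefficient row is the intercept and the second is
# the slope for the number of absences; the table's row labels are missing.
INTERCEPT = 3.85732
SLOPE_PER_ABSENCE = -0.06285


def predicted_gpa(absences: float) -> float:
    """Return the fitted GPA for a given number of absences."""
    return INTERCEPT + SLOPE_PER_ABSENCE * absences


for n in (0, 5, 10, 20):
    print(f"{n:>2} absences -> predicted GPA {predicted_gpa(n):.3f}")
```

Under that reading, each additional absence lowers the fitted GPA by about 0.063 points.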
Parameter setting for the Neural Network.
| Decay | Size | Accuracy |
|---|---|---|
| 0.2 | 1 | 0.890 |
| 0.2 | 5 | 0.918 |
| 0.2 | 10 | 0.916 |
| 0.2 | 15 | 0.911 |
| 0.5 | 1 | 0.885 |
| 0.5 | 5 | 0.917 |
| 0.5 | 10 | 0.918 |
| 0.5 | 15 | 0.916 |
| 0.7 | 1 | 0.883 |
| 0.7 | 5 | 0.917 |
| 0.7 | 10 | 0.917 |
| 0.7 | 15 | 0.918 |
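The grid above can be scanned programmatically to see which settings reach the top accuracy. The following is a minimal sketch that only re-reads the table values; the helper name `best_settings` is hypothetical and nothing here reproduces the authors' training procedure.

```python
# (decay, size, accuracy) rows copied from the parameter-setting table.
GRID = [
    (0.2, 1, 0.890), (0.2, 5, 0.918), (0.2, 10, 0.916), (0.2, 15, 0.911),
    (0.5, 1, 0.885), (0.5, 5, 0.917), (0.5, 10, 0.918), (0.5, 15, 0.916),
    (0.7, 1, 0.883), (0.7, 5, 0.917), (0.7, 10, 0.917), (0.7, 15, 0.918),
]


def best_settings(grid):
    """Return the top accuracy and every (decay, size) pair that reaches it."""
    top = max(acc for _, _, acc in grid)
    return top, [(decay, size) for decay, size, acc in grid if acc == top]


top_accuracy, winners = best_settings(GRID)
print(top_accuracy, winners)
```

Three settings tie at 0.918, and every setting with size 5 or larger falls within 0.007 of that value, so the accuracy is fairly insensitive to these two parameters once size exceeds 1.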
Experimental results using both classifiers for the testing dataset.
| Measures/Classifiers | NN | k-NN |
|---|---|---|
| 0.923 | 0.834 | |
| 0.883 | 0.745 | |
| <2 | <2 |
Linear Regression results for the absence variable using ENG and CCIS datasets.
| CCIS | ENG | |||
|---|---|---|---|---|
| Factors | P-Value of Coefficient | Correlation | P-Value of Coefficient | Correlation |
| 0.0149 | 0.77 | 0.000978 | 0.87 | |
| 0.0955 | -0.81 | 0.333 | -0.87 | |
| 0.029175 | 0.40 | 1.18 | 0.70 | |
| 0.00949 | 0.37 | 0.0497 | 0.40 | |
The class size of ENG and CCIS datasets before applying the sampling approaches.
| Classes / Dataset | ENG Data | CCIS Data |
|---|---|---|
| 2271 | 1560 | |
| 225 | 104 | |
| 2496 | 1664 |
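The class counts above are strongly imbalanced (for ENG, 2271 versus 225). The sketch below shows plain random over- and under-sampling on placeholder records with those ENG counts. It is an illustration under assumptions: the source does not name the sampling techniques here, so the authors' approach may differ, and the labels `majority` and `minority` as well as both function names are hypothetical.

```python
import random


def random_oversample(majority, minority, rng):
    """Duplicate random minority samples until both classes are the same size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra


def random_undersample(majority, minority, rng):
    """Keep a random majority subset equal in size to the minority class."""
    return rng.sample(majority, len(minority)), minority


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Placeholder records standing in for the ENG rows (2271 vs 225 in the table).
majority = [("majority", i) for i in range(2271)]
minority = [("minority", i) for i in range(225)]

over_major, over_minor = random_oversample(majority, minority, rng)
under_major, under_minor = random_undersample(majority, minority, rng)
print(len(over_major), len(over_minor))
print(len(under_major), len(under_minor))
```

Over-sampling keeps every original record but repeats minority rows, while under-sampling discards most of the majority class; which trade-off is preferable depends on the classifier and the dataset size.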
Fig 5. The percentage of DN and NO_DN classes for CCIS (left) and ENG (right) datasets.
The prediction accuracy obtained by applying the NN and k-NN classifiers to the ENG and CCIS datasets.
| Sampling technique | ENG | CCIS | ||
|---|---|---|---|---|
| NN | k-NN | NN | k-NN | |
| 0.8316 | 0.6885 | 0.8717 | 0.7515 | |
| 0.7848 | 0.8636 | 0.7615 | 0.8076 | |
| 0.6845 | 0.7206 | 0.7054 | 0.7174 | |
| 0.7139 | 0.7326 | 0.7816 | 0.8657 | |
The prediction results obtained using both classifiers for the ENG dataset.
| ENG / | Neural Network | k-Nearest Neighbors | ||||
|---|---|---|---|---|---|---|
| Class | Precision | Recall | F1 measure | Precision | Recall | F1 measure |
| 0.8561 | 0.9542 | 0.9025 | 0.9060 | 0.9420 | 0.9237 | |
| 0.5821 | 0.2847 | 0.3824 | 0.4328 | 0.3118 | 0.3625 | |
| 0.7191 | 0.6194 | 0.6424 | 0.6694 | 0.6269 | 0.6430 | |
| 0.5821 | 0.4328 | |||||
| 0.8561 | 0.9060 | |||||
| 4.927 min | 40.863 sec | |||||
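The F1 values in the table can be checked against the standard definition, F1 = 2PR / (P + R). The sketch below recomputes them for the Neural Network columns of the first two rows. It also tests an inference that the source does not state, since the row labels are missing: that the third row is the unweighted (macro) average of the two class rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the two class rows, Neural Network columns.
nn_rows = [(0.8561, 0.9542, 0.9025), (0.5821, 0.2847, 0.3824)]

for p, r, reported in nn_rows:
    print(f"computed F1 {f1_score(p, r):.4f} vs reported {reported:.4f}")

# Inference, not stated in the source: the third row looks like the
# unweighted mean of the two class rows (reported 0.7191 and 0.6424).
macro_precision = sum(p for p, _, _ in nn_rows) / len(nn_rows)
macro_f1 = sum(f for _, _, f in nn_rows) / len(nn_rows)
print(f"macro precision {macro_precision:.4f}, macro F1 {macro_f1:.4f}")
```

The low recall on the second class (0.2847) is what pulls its F1 down to 0.3824 despite a precision of 0.5821, since the harmonic mean is dominated by the smaller of the two values.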
The prediction results obtained using both classifiers for the CCIS dataset.
| CCIS / | Neural Network | k-Nearest Neighbors | ||||
|---|---|---|---|---|---|---|
| Class | Precision | Recall | F1 measure | Precision | Recall | F1 measure |
| 0.8932 | 0.9676 | 0.9289 | 0.8953 | 0.9588 | 0.9260 | |
| 0.5484 | 0.2537 | 0.3469 | 0.4193 | 0.2097 | 0.2796 | |
| 0.7207 | 0.6107 | 0.6379 | 0.6573 | 0.5842 | 0.6028 | |
| 0.54839 | 0.41935 | |||||
| 0.89316 | 0.89530 | |||||
| 5.601 min | 2.685 min | |||||