Danial Hooshyar, Margus Pedaste, Yeongwook Yang.
Abstract
A significant body of research indicates that students' procrastination tendencies are an important factor influencing their performance in online learning. It is therefore vital for educators to be aware of such behavioral trends, as students with lower procrastination tendencies usually perform better than those with higher ones. In the present study, we propose a novel algorithm, called PPP, that uses students' assignment submission behavior to predict the performance of students with learning difficulties through their procrastination behavior. Unlike many existing works, PPP not only considers late or non-submissions but also investigates students' behavioral patterns before the assignment due date. PPP first builds feature vectors representing each student's submission behavior for each assignment, then applies a clustering method to the feature vectors to label students as procrastinators, procrastination candidates, or non-procrastinators, and finally employs and compares several classification methods to best classify students. To evaluate the effectiveness of PPP, we use data from a course with 242 students at the University of Tartu in Estonia. The results reveal that PPP successfully predicts students' performance through their procrastination behavior with an accuracy of 96%. Linear support vector machine appears to be the best classifier for continuous features, and neural network for categorical features, with categorical features tending to perform slightly better than continuous ones. Finally, we found that the predictive power of all classification methods is lowered by an increase in the number of classes formed by clustering.
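The three-stage pipeline described in the abstract (feature vectors → clustering into procrastination groups → supervised classification) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's implementation: `make_blobs` stands in for the real submission logs, and k-means stands in for the paper's spectral clustering step.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stage 1: per-student feature vectors. The paper's datasets hold 16
# features per student (spare time and inactive time for each of 8
# assignments); make_blobs provides a synthetic stand-in here.
X, _ = make_blobs(n_samples=242, n_features=16, centers=3, random_state=0)

# Stage 2: cluster students into k groups and treat the cluster
# assignments as labels (procrastinator / procrastination candidate /
# non-procrastinator for k = 3). K-means is used here as a simple
# stand-in for the paper's spectral clustering.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Stage 3: train and cross-validate a classifier on those labels
# (a linear SVM here; the paper compares eight such methods).
acc = cross_val_score(SVC(kernel="linear"), X, labels, cv=10).mean()
print(f"mean 10-fold accuracy: {acc:.3f}")
```

On well-separated synthetic clusters the classifier recovers the cluster labels almost perfectly; the paper's reported accuracies come from the real submission data instead.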
Keywords: educational data mining; higher education; online learning; prediction of students’ performance; procrastination behavior
Year: 2019 PMID: 33285787 PMCID: PMC7516418 DOI: 10.3390/e22010012
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Comparison of our proposed approach with related works. The Inactive Time and Spare Time columns indicate whether each work considers these behavioral patterns before submission.

| Work | Objective | Inactive Time | Spare Time | Attributes | Classification Techniques |
|---|---|---|---|---|---|
| [ ] | Prediction of assignment submission | no | no | students’ activity data; course and assignment information | DT (CART), Random Forest, NN, GaussianNB, Logit, LDA, SVC |
| [ ] | Prediction of students’ procrastination | no | yes | grade | ZeroR, OneR, ID3, J48, Random Forest, Decision Stump, JRip, PART, NBTree, Prism |
| [ ] | Prediction of students at risk through assignment submission | no | no | students’ activity data; course and assignment information; peers’ activity data | Neural Network |
| Our work | Prediction of procrastination | yes | yes | students’ activity and assignment data; grade | L-SVM, R-SVM, Gaussian Processes, Decision Tree, Random Forest, Neural Network, AdaBoost, Naive Bayes |
Notations. (The notation symbols were lost in extraction; the explanations are preserved below.)

| Notation | Explanation |
|---|---|
| | A set of students and assignments |
| | A specific student and assignment |
| | A spare time and an inactive time (both continuous and categorical values) |
| | The open date of an assignment |
| | The due date of an assignment |
| | The student’s first view date of an assignment |
| | The student’s assignment submission date |
| | A pair of continuous and categorical features for an assignment |
| | Continuous and categorical feature vectors for a student |
| | Weighted adjacency matrix |
| L | Unnormalized Laplacian |
| | Eigenvector |
| | The matrix containing the eigenvectors |
| | The set of performance metrics |
| | The best classification method |
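The notation table references a weighted adjacency matrix, the unnormalized Laplacian L, and a matrix of eigenvectors, i.e., the ingredients of unnormalized spectral clustering. A minimal sketch of that procedure follows; the Gaussian similarity kernel and the `sigma` parameter are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import euclidean_distances

def spectral_cluster(X, k, sigma=1.0):
    """Unnormalized spectral clustering sketch:
    W -> L = D - W -> eigenvectors -> k-means on the embedding."""
    # Weighted adjacency matrix W from a Gaussian similarity kernel
    # (kernel choice and sigma are illustrative assumptions).
    d2 = euclidean_distances(X, squared=True)
    W = np.exp(-d2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized Laplacian L = D - W, with D the degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # U contains the eigenvectors of L belonging to the k smallest
    # eigenvalues; its rows embed the students in R^k.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]

    # Cluster the embedded rows with k-means to obtain the k groups.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)
```

For k = 3 the resulting groups would correspond to the procrastinator, procrastination-candidate, and non-procrastinator labels used downstream.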
Figure 1. Framework of the PPP approach.
Datasets used in this study.
| Dataset | Course | Period | Type | # of Assignments | # of Students |
|---|---|---|---|---|---|
| Dataset 1 (16 continuous features) | Teaching and reflection | 2019 | blended | 8 | 242 |
| Dataset 2 (16 categorical features) | Teaching and reflection | 2019 | blended | 8 | 242 |
Statistical analysis.

| | Spare Time | Inactive Time | Score |
|---|---|---|---|
| spare time | 1 | –0.495 | 0.901 |
| inactive time | –0.495 | 1 | –0.508 |
| score | 0.901 | –0.508 | 1 |
| count | 242 | 242 | 242 |
| mean | 7.185 | 4 | 80.902 |
| standard deviation | 1.867 | 2.578 | 24.579 |
| minimum | 0 | 0 | –3.333 |
| maximum | 8 | 8 | 100 |
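A table of this shape (a Pearson correlation matrix stacked on descriptive statistics) is straightforward to reproduce with pandas. The sketch below uses synthetic data whose generating relationships merely mimic the signs reported above; column names and distributions are illustrative, not the paper's data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for the per-student aggregates: average spare
# time, average inactive time, and final score for 242 students.
# The relationships below only mimic the reported signs
# (spare time vs. score positive, inactive time negative).
spare = rng.uniform(0, 8, 242)
df = pd.DataFrame({
    "spare_time": spare,
    "inactive_time": 8 - spare + rng.normal(0, 2, 242),
    "score": 10 * spare + rng.normal(0, 5, 242),
})

corr = df.corr()        # Pearson correlation matrix, as in the table
stats = df.describe()   # count, mean, std, min, max, ...
print(corr.round(3))
```

`DataFrame.corr()` defaults to Pearson correlation; `describe()` supplies the count/mean/std/min/max rows shown in the table.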
Figure 2. Clusters produced by the spectral method: (a) k = 2, (b) k = 3, and (c) k = 4.
Figure 3. Elbow result: (a) continuous features, and (b) categorical features.
Performance metrics for all classification methods in the two-, three-, and four-class settings.

(a) Two-class.

| Metric | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| Precision | 0.992 | 0.974 | 0.981 | 0.983 | 0.983 | 0.985 | 0.987 | 0.985 |
| Recall | 0.993 | 0.980 | 0.984 | 0.985 | 0.985 | 0.986 | 0.986 | 0.984 |
| Accuracy | 0.993 | 0.980 | 0.984 | 0.985 | 0.985 | 0.986 | 0.986 | 0.984 |
| F1-score | 0.993 | 0.982 | 0.986 | 0.986 | 0.986 | 0.987 | 0.988 | 0.986 |
| *Categorical features* | | | | | | | | |
| Precision | 0.984 | 0.992 | 0.989 | 0.991 | 0.990 | 0.990 | 0.990 | 0.989 |
| Recall | 0.992 | 0.996 | 0.994 | 0.996 | 0.995 | 0.994 | 0.994 | 0.994 |
| Accuracy | 0.992 | 0.996 | 0.994 | 0.996 | 0.995 | 0.994 | 0.994 | 0.994 |
| F1-score | 0.996 | 0.998 | 0.997 | 0.998 | 0.997 | 0.997 | 0.997 | 0.997 |

(b) Three-class.

| Metric | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| Precision | 0.957 | 0.934 | 0.943 | 0.935 | 0.935 | 0.940 | 0.931 | 0.892 |
| Recall | 0.952 | 0.929 | 0.937 | 0.927 | 0.926 | 0.933 | 0.920 | 0.885 |
| Accuracy | 0.952 | 0.929 | 0.937 | 0.927 | 0.926 | 0.933 | 0.920 | 0.885 |
| F1-score | 0.952 | 0.930 | 0.938 | 0.929 | 0.928 | 0.935 | 0.922 | 0.886 |
| *Categorical features* | | | | | | | | |
| Precision | 0.867 | 0.920 | 0.938 | 0.946 | 0.950 | 0.952 | 0.954 | 0.956 |
| Recall | 0.930 | 0.954 | 0.961 | 0.963 | 0.963 | 0.965 | 0.963 | 0.962 |
| Accuracy | 0.930 | 0.954 | 0.961 | 0.963 | 0.963 | 0.965 | 0.963 | 0.962 |
| F1-score | 0.963 | 0.970 | 0.974 | 0.975 | 0.974 | 0.975 | 0.973 | 0.973 |

(c) Four-class.

| Metric | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| Precision | 0.764 | 0.809 | 0.862 | 0.861 | 0.862 | 0.877 | 0.850 | 0.813 |
| Recall | 0.842 | 0.841 | 0.880 | 0.868 | 0.868 | 0.881 | 0.837 | 0.805 |
| Accuracy | 0.842 | 0.841 | 0.880 | 0.868 | 0.868 | 0.881 | 0.837 | 0.805 |
| F1-score | 0.899 | 0.874 | 0.903 | 0.888 | 0.885 | 0.896 | 0.852 | 0.820 |
| *Categorical features* | | | | | | | | |
| Precision | 0.596 | 0.778 | 0.843 | 0.856 | 0.858 | 0.866 | 0.848 | 0.855 |
| Recall | 0.719 | 0.840 | 0.887 | 0.886 | 0.881 | 0.889 | 0.870 | 0.873 |
| Accuracy | 0.719 | 0.840 | 0.887 | 0.886 | 0.881 | 0.889 | 0.870 | 0.873 |
| F1-score | 0.788 | 0.874 | 0.911 | 0.905 | 0.898 | 0.905 | 0.889 | 0.891 |
Figure 4. Performance metrics of classification methods at different k-folds for two-class: (a) precision, (b) accuracy, and (c) F1-score.
Figure 5. Performance metrics of classification methods at different k-folds for three-class: (a) precision, (b) accuracy, and (c) F1-score.
Figure 6. Performance metrics of classification methods at different k-folds for four-class: (a) precision, (b) accuracy, and (c) F1-score.
Performance metrics of classification methods in the three-class setting at different k-folds (i.e., 5, 10, 15, and 20).

(a) Precision.

| k-fold | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| Precision_5 | 0.958 | 0.933 | 0.940 | 0.933 | 0.933 | 0.939 | 0.927 | 0.898 |
| Precision_10 | 0.963 | 0.937 | 0.944 | 0.933 | 0.934 | 0.938 | 0.930 | 0.899 |
| Precision_15 | 0.952 | 0.934 | 0.945 | 0.941 | 0.940 | 0.944 | 0.937 | 0.893 |
| Precision_20 | 0.954 | 0.934 | 0.943 | 0.933 | 0.934 | 0.940 | 0.930 | 0.879 |
| *Categorical features* | | | | | | | | |
| Precision_5 | 0.867 | 0.923 | 0.941 | 0.948 | 0.952 | 0.955 | 0.957 | 0.958 |
| Precision_10 | 0.866 | 0.913 | 0.934 | 0.941 | 0.944 | 0.945 | 0.949 | 0.949 |
| Precision_15 | 0.868 | 0.923 | 0.937 | 0.944 | 0.951 | 0.953 | 0.955 | 0.957 |
| Precision_20 | 0.868 | 0.921 | 0.941 | 0.951 | 0.953 | 0.955 | 0.957 | 0.959 |

(b) Accuracy.

| k-fold | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| Accuracy_5 | 0.950 | 0.924 | 0.932 | 0.926 | 0.926 | 0.934 | 0.918 | 0.884 |
| Accuracy_10 | 0.959 | 0.930 | 0.938 | 0.927 | 0.926 | 0.931 | 0.920 | 0.885 |
| Accuracy_15 | 0.946 | 0.932 | 0.941 | 0.932 | 0.931 | 0.937 | 0.927 | 0.890 |
| Accuracy_20 | 0.951 | 0.930 | 0.935 | 0.923 | 0.922 | 0.928 | 0.916 | 0.881 |
| *Categorical features* | | | | | | | | |
| Accuracy_5 | 0.929 | 0.954 | 0.961 | 0.961 | 0.961 | 0.963 | 0.962 | 0.962 |
| Accuracy_10 | 0.930 | 0.951 | 0.959 | 0.962 | 0.961 | 0.962 | 0.960 | 0.960 |
| Accuracy_15 | 0.930 | 0.955 | 0.960 | 0.962 | 0.963 | 0.965 | 0.963 | 0.962 |
| Accuracy_20 | 0.929 | 0.954 | 0.963 | 0.967 | 0.966 | 0.968 | 0.965 | 0.963 |

(c) F1-score.

| k-fold | L-SVM | R-SVM | Gaussian Processes | Decision Tree | Random Forest | Neural Network | AdaBoost | Naive Bayes |
|---|---|---|---|---|---|---|---|---|
| *Continuous features* | | | | | | | | |
| F1_5 | 0.949 | 0.922 | 0.931 | 0.925 | 0.926 | 0.933 | 0.917 | 0.874 |
| F1_10 | 0.958 | 0.929 | 0.937 | 0.928 | 0.927 | 0.932 | 0.921 | 0.881 |
| F1_15 | 0.948 | 0.935 | 0.944 | 0.936 | 0.935 | 0.940 | 0.929 | 0.895 |
| F1_20 | 0.953 | 0.934 | 0.938 | 0.927 | 0.926 | 0.933 | 0.921 | 0.894 |
| *Categorical features* | | | | | | | | |
| F1_5 | 0.963 | 0.969 | 0.973 | 0.971 | 0.972 | 0.973 | 0.970 | 0.970 |
| F1_10 | 0.963 | 0.969 | 0.974 | 0.974 | 0.972 | 0.974 | 0.972 | 0.972 |
| F1_15 | 0.963 | 0.971 | 0.972 | 0.973 | 0.974 | 0.975 | 0.973 | 0.973 |
| F1_20 | 0.962 | 0.972 | 0.979 | 0.979 | 0.978 | 0.979 | 0.978 | 0.977 |