Muhammad Haziq Bin Roslan, Chwen Jen Chen.
Abstract
This study attempts to predict secondary school students' performance in English and Mathematics subjects using data mining (DM) techniques. It aims to provide insights into predictors of students' performance in English and Mathematics, characteristics of students with different levels of performance, the most effective DM technique for students' performance prediction, and the relationship between these two subjects. The study employed the archival data of students who were 16 years old in 2019 and sat for the Malaysian Certificate of Examination (MCE) in 2021. The learning of English and Mathematics is a concern in many countries. Three main factors, namely students' past academic performance, demographics, and psychological attributes, were scrutinized to identify their impact on the prediction. This study utilized the Orange software for the DM process. It employed Decision Tree (DT) rules to determine the characteristics of students with low, moderate, and high performance in English and Mathematics subjects. The DT and Naïve Bayes (NB) techniques showed the best predictive performance for English and Mathematics, respectively. Such characteristics and predictions may cue appropriate interventions to improve students' performance in these subjects. This study revealed students' past academic performance as the most critical predictor, followed by a few demographic and psychological attributes. By examining the top predictors derived using four different classifier types, this study found that students' past Mathematics performance predicts their MCE English performance and students' past English performance predicts their MCE Mathematics performance. This finding shows that students' performances in the two subjects are interrelated.
Keywords: Data mining techniques; Educational data mining; English; Mathematics; Performance prediction; Secondary education
Year: 2022 PMID: 35919875 PMCID: PMC9334550 DOI: 10.1007/s10639-022-11259-2
Source DB: PubMed Journal: Educ Inf Technol (Dordr) ISSN: 1360-2357
Fig. 1 Research framework
Fig. 2 DM model development process
Confusion matrix
| | Actual class: Positive | Actual class: Negative |
|---|---|---|
| Predicted class: Positive | True Positive (TP) | False Positive (FP) |
| Predicted class: Negative | False Negative (FN) | True Negative (TN) |
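The measures reported later for each classifier (accuracy, precision, recall, and F1-score) can all be derived from the four cells of a confusion matrix. The following is a minimal sketch for the binary case only; the function name and the example counts are hypothetical and are not taken from the study. Since the study distinguishes three performance levels, its reported figures are presumably per-class values combined by some averaging scheme, which this sketch does not attempt to reproduce.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common measures from binary confusion-matrix counts.

    tp: predicted positive, actually positive
    fp: predicted positive, actually negative
    fn: predicted negative, actually positive
    tn: predicted negative, actually negative
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not data from the study).
example = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(example)
```

Precision answers "of the cases predicted positive, how many were right", while recall answers "of the actual positives, how many were found"; the F1-score is their harmonic mean.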
Fig. 3 Predicting students’ performance process using DT
List of data required
| Factors | Attributes |
|---|---|
| Past academic performance | Form Four English and Mathematics mid-term examination results |
| | Form Four English and Mathematics final examination results |
| | Actual MCE English and Mathematics results |
| Demographics | Gender: Male (121), Female (81) |
| | Ethnicity: Malay (186), Chinese (7), Indian (8), Others (1) |
| | Religion: Muslim (189), Buddhist (4), Hindu (6), Christian (3), Others (0) |
| | Parents’ occupational status (i.e., Permanent, Temporary): Father’s job: Permanent (111), Temporary (48); Mother’s job: Permanent (87), Temporary (72) |
| | Parents’ educational level (i.e., MCE, Diploma, Bachelor, Master, Ph.D.): Father’s education: MCE (75), Diploma (36), Bachelor (35), Master (8), Ph.D. (5); Mother’s education: MCE (80), Diploma (24), Bachelor (37), Master (12), Ph.D. (6) |
| | Parents’ marital status: Married (148), Divorced (10), Widowed (9) |
| Psychological attributes | Fifteen attributes were examined via a psychometric test (Hamzah, 2011) that was administered by the Ministry of Education, Malaysia, to all Form Four students. These attributes are autonomy, creativity, aggression, extrovert, achievement, diversity, intellectual, leadership, structure, resilience, help, analytical, self-criticism, vision, and transparency. Each attribute was measured on a 10-point scale. A score of 1 to 3 is considered Low, 4 to 6 Moderate, and 7 to 10 High |
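The banding rule for the psychological attributes (1 to 3 Low, 4 to 6 Moderate, 7 to 10 High) can be written as a small discretization step. This is only an illustrative sketch: the function name `score_band` is hypothetical, and it assumes scores are recorded as whole numbers on the 10-point scale.

```python
def score_band(score: int) -> str:
    """Map a 10-point psychometric score to Low, Moderate, or High.

    Assumes integer scores from 1 to 10, following the bands stated
    in the table above.
    """
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 3:
        return "Low"
    if score <= 6:
        return "Moderate"
    return "High"


# Hypothetical scores for one student (not data from the study).
student = {"autonomy": 2, "creativity": 5, "resilience": 9}
bands = {attribute: score_band(value) for attribute, value in student.items()}
print(bands)
```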
Fig. 4 Tree diagram of students with low English performance
Fig. 5 Tree diagram of students with moderate English performance
Fig. 6 Tree diagram of students with high English performance
Fig. 7 Tree diagram of students with low Mathematics performance
Fig. 8 Tree diagram of students with moderate Mathematics performance
Fig. 9 Tree diagram of students with high Mathematics performance
Fig. 10 Process of identifying main predictors using DT in Orange
Fig. 11 Process of identifying main predictors using NN in Orange
Fig. 12 Process of identifying main predictors using SVM in Orange
Fig. 13 Process of identifying main predictors using NB in Orange
Top five attributes on the prediction of English performance using DT, NN, SVM, and NB
| Classifier | Attributes | Impact |
|---|---|---|
| DT | Form Four English final examination | 0.225 |
| | Form Four Mathematics final examination | 0.202 |
| | Form Four English mid-term examination | 0.165 |
| | Form Four Mathematics mid-term examination | 0.142 |
| | Diversity | 0.090 |
| NN | Form Four English final examination | 0.168 |
| | Form Four Mathematics final examination | 0.132 |
| | Form Four English mid-term examination | 0.117 |
| | Form Four Mathematics mid-term examination | 0.091 |
| | Gender | 0.085 |
| SVM | Form Four English final examination | 0.152 |
| | Form Four Mathematics final examination | 0.121 |
| | Form Four English mid-term examination | 0.104 |
| | Form Four Mathematics mid-term examination | 0.090 |
| | Father’s educational level | 0.084 |
| NB | Form Four English final examination | 0.168 |
| | Form Four Mathematics final examination | 0.132 |
| | Form Four English mid-term examination | 0.117 |
| | Form Four Mathematics mid-term examination | 0.091 |
| | Gender | 0.085 |
Top five attributes on the prediction of Mathematics performance using DT, NN, SVM, and NB
| Classifier | Attributes | Impact |
|---|---|---|
| DT | Form Four Mathematics final examination | 0.341 |
| | Form Four Mathematics mid-term examination | 0.235 |
| | Form Four English final examination | 0.124 |
| | Diversity | 0.086 |
| | Form Four English mid-term examination | 0.083 |
| NN | Form Four Mathematics final examination | 0.223 |
| | Form Four Mathematics mid-term examination | 0.151 |
| | Form Four English final examination | 0.093 |
| | Ethnicity | 0.074 |
| | Gender | 0.065 |
| SVM | Form Four Mathematics final examination | 0.205 |
| | Form Four Mathematics mid-term examination | 0.135 |
| | Form Four English final examination | 0.082 |
| | Parents’ marital status | 0.064 |
| | Father’s educational level | 0.048 |
| NB | Form Four Mathematics final examination | 0.223 |
| | Form Four Mathematics mid-term examination | 0.151 |
| | Form Four English final examination | 0.093 |
| | Form Four English mid-term examination | 0.059 |
| | Diversity | 0.044 |
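One way to read the Mathematics top-five lists is to ask which attributes every classifier agrees on. The sketch below intersects the four lists as transcribed from the table above; the variable names are hypothetical, and the set intersection is only an illustration, not necessarily how the authors compared the classifiers.

```python
# Top-five attributes for Mathematics prediction, transcribed from the table above.
MATH_FINAL = "Form Four Mathematics final examination"
MATH_MID = "Form Four Mathematics mid-term examination"
ENG_FINAL = "Form Four English final examination"
ENG_MID = "Form Four English mid-term examination"

top_five_mathematics = {
    "DT": [MATH_FINAL, MATH_MID, ENG_FINAL, "Diversity", ENG_MID],
    "NN": [MATH_FINAL, MATH_MID, ENG_FINAL, "Ethnicity", "Gender"],
    "SVM": [MATH_FINAL, MATH_MID, ENG_FINAL,
            "Parents' marital status", "Father's educational level"],
    "NB": [MATH_FINAL, MATH_MID, ENG_FINAL, ENG_MID, "Diversity"],
}

# Attributes that appear in the top five of every classifier.
shared = set.intersection(*(set(attrs) for attrs in top_five_mathematics.values()))

# Keep the DT ordering so the output is stable and easy to read.
shared_ordered = [a for a in top_five_mathematics["DT"] if a in shared]
print(shared_ordered)
```

The Form Four English final examination is among the shared attributes, which is consistent with the abstract's statement that past English performance predicts MCE Mathematics performance.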
Summary of measures on DM classifiers for English performance prediction
| DM classifiers | Area under curve (%) | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| DT | 89.3 | 87.1 | 87.4 | 87.1 | 87.2 |
| NN | 83.3 | 71.0 | 72.0 | 71.0 | 71.1 |
| SVM | 80.3 | 74.2 | 75.6 | 74.2 | 74.6 |
| NB | 81.3 | 71.0 | 76.5 | 71.0 | 72.3 |
Summary of measures on DM classifiers for Mathematics performance prediction
| DM classifiers | Area under curve (%) | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| DT | 82.9 | 83.9 | 84.3 | 83.9 | 83.2 |
| NN | 87.1 | 74.2 | 77.6 | 74.2 | 71.2 |
| SVM | 72.2 | 71.0 | 80.3 | 71.0 | 65.0 |
| NB | 92.8 | 83.9 | 84.8 | 83.9 | 83.8 |
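The two summary tables can be compared programmatically to see which classifier leads on a given measure. The sketch below transcribes the reported percentages and picks the top classifier per measure; the helper `best_by` and the dictionary layout are hypothetical, and choosing a single "best" measure is a simplification, since the abstract does not state which measure the authors weighted most.

```python
# Reported measures (%), transcribed from the two summary tables above.
english = {
    "DT": {"auc": 89.3, "accuracy": 87.1, "precision": 87.4, "recall": 87.1, "f1": 87.2},
    "NN": {"auc": 83.3, "accuracy": 71.0, "precision": 72.0, "recall": 71.0, "f1": 71.1},
    "SVM": {"auc": 80.3, "accuracy": 74.2, "precision": 75.6, "recall": 74.2, "f1": 74.6},
    "NB": {"auc": 81.3, "accuracy": 71.0, "precision": 76.5, "recall": 71.0, "f1": 72.3},
}
mathematics = {
    "DT": {"auc": 82.9, "accuracy": 83.9, "precision": 84.3, "recall": 83.9, "f1": 83.2},
    "NN": {"auc": 87.1, "accuracy": 74.2, "precision": 77.6, "recall": 74.2, "f1": 71.2},
    "SVM": {"auc": 72.2, "accuracy": 71.0, "precision": 80.3, "recall": 71.0, "f1": 65.0},
    "NB": {"auc": 92.8, "accuracy": 83.9, "precision": 84.8, "recall": 83.9, "f1": 83.8},
}


def best_by(results: dict, measure: str) -> str:
    """Return the classifier with the highest value for one measure.

    Ties are resolved in favour of the classifier listed first.
    """
    return max(results, key=lambda name: results[name][measure])


for subject, results in (("English", english), ("Mathematics", mathematics)):
    print(subject, {m: best_by(results, m) for m in ("auc", "f1")})
```

On both area under curve and F1-score, DT leads for English and NB leads for Mathematics, matching the abstract. Note that DT and NB tie on accuracy and recall for Mathematics (83.9), so those measures alone would not separate them.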
Results of Spearman’s rank correlation
| Variable | Correlation with actual MCE English performance | Variable | Correlation with actual MCE Mathematics performance |
|---|---|---|---|
| Form Four Mathematics mid-term examination | 0.345 | Form Four English mid-term examination | 0.285 |
| Form Four Mathematics final examination | 0.391 | Form Four English final examination | 0.367 |
| Actual MCE Mathematics performance | 0.433 | Actual MCE English performance | 0.433 |
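Spearman's rank correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following is a minimal pure-Python sketch of that definition; the function names and the sample grade codes are hypothetical and do not reproduce the study's data or its software workflow.

```python
from math import sqrt


def average_ranks(values: list) -> list:
    """Rank values from 1 upward; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: list, y: list) -> float:
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical ordinal grade codes for five students (not data from the study).
english_grades = [3, 1, 2, 5, 4]
mathematics_grades = [2, 1, 3, 5, 4]
print(round(spearman_rho(english_grades, mathematics_grades), 3))
```

Because the coefficient depends only on rank order, it suits ordinal outcomes such as grade bands, and it is symmetric, which is why the MCE English and MCE Mathematics entries in the table share the same value (0.433).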