Pratiyush Guleria, Manu Sood
Abstract
Machine learning learns from experience, draws inferences, and can answer complex queries. Machine learning techniques can be used to develop an educational framework that understands inputs from students and parents and intelligently generates results. The framework integrates Machine Learning (ML) and Explainable AI (XAI) to analyze the educational factors that help students achieve career placements and make the right decisions for their career growth. It is intended to work like an expert system with decision support, identifying problems the way humans solve them: by understanding, analyzing, and remembering. In this paper, the authors propose a framework for career counseling of students using ML and AI techniques. ML-based White Box and Black Box models analyze an educational dataset comprising academic and employability attributes that are important for job placement and the skilling of students. In the proposed framework, the White Box and Black Box models are trained on the educational dataset taken in the study. Naive Bayes achieves a Recall of 91.2% and an F-Measure of 90.7%, the best scores among the Logistic Regression, Decision Tree, SVM, KNN, and Ensemble models considered in the study.
Keywords: Artificial intelligence; Career counseling; Educational data mining; Explainable AI; Machine learning
Year: 2022 PMID: 35875826 PMCID: PMC9287825 DOI: 10.1007/s10639-022-11221-2
Source DB: PubMed Journal: Educ Inf Technol (Dordr) ISSN: 1360-2357
Fig. 1 Standard Approach of Career Counseling
Fig. 2 Explainable AI (XAI) White Box and Black Box Model stages
Machine Learning Techniques and Approaches Followed by Authors Related to Career Counseling in the Educational Sector
| Reference | ML Techniques/Approach Followed | Description of the Work Done |
|---|---|---|
| Abidi et al. | Machine Learning Techniques | Intelligent tutoring system for identifying students who face problems when attempting homework exercises |
| AlaoKazeem & IbamOnwuka | Expert Systems | Web-enabled career counseling system for students in Nigeria seeking guidance on admission to a course |
| Al-Sudani and Palaniappan | Artificial Neural Network | Identifying low-performing students at an early stage of the semester |
| Goga et al. | ML Framework | Intelligent recommender system for improving students' performance and predicting first-year academic performance |
| Hendahewa et al. | Artificial Intelligence | Artificial intelligence approach for effective career guidance |
| Ho et al. | ML Model | Investigated the satisfaction of undergraduate students with remote learning in higher education |
| Hoffait & Schyns | Logistic Regression, Random Forest, and Artificial Neural Network | Early detection of potential difficulties faced by university students |
| Huang et al. | Deep Reinforcement Learning Framework | Recommending suitable learning exercises to students and increasing interaction by receiving students' performance feedback |
| Hussain et al. | Decision Tree | Observing students who show low engagement during assessment activities |
| Iatrellis et al. | K-Means Clustering Algorithm | Clustering-aided approach for predicting student outcomes in higher education |
| Jishan et al. | Naïve Bayes, Decision Tree, and Artificial Neural Networks | Forecasting students' final results before the final exam |
| Kausar et al. | Ensemble Techniques | Examining the relationship between students' semester coursework and final results |
| Khan et al. | Artificial Neural Networks | Conceptual framework for attribute selection and predicting student performance |
| Mimis et al. | Decision Tree, Naive Bayes, and Neural Network | Predicting students' performance |
| Musso et al. | Artificial Neural Networks | Predicting students' grade point average, academic retention, and degree completion outcomes |
| Pujari et al. | Random Forest and Linear Regression | Student career counseling using artificial intelligence |
| Villegas-Ch et al. | Artificial Intelligence and Data Analysis | Integrating machine learning and data analysis into a learning management system |
| Zhai et al. | ML Framework | Automatic assessment of student learning using responses, simulations, and educational assessments |
Fig. 3 Machine learning framework for career counseling
Mathematical description of the ML algorithms taken in the study
| Sr. No | Classifier Name | Description |
|---|---|---|
| 1 | K-Nearest Neighbor | K-Nearest Neighbor is used for both classification and regression. A new point is classified according to its distance from the nearest labeled points, using two measures, similarity and distance. KNN computes similarity with the Euclidean and Manhattan distance functions shown in Eqs. 1 and 2 (a code sketch of these formulas follows the table). Euclidean distance: $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ (Eq. 1). Manhattan distance: $d(x, y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert$ (Eq. 2) |
| 2 | SVM | SVM stands for Support Vector Machine, a supervised machine learning algorithm that works on labeled training data. SVM follows the separating-hyperplane approach, dividing the plane into two parts: one class lies on one side of the hyperplane and the second class on the other. For a linear SVM, the training dataset consists of $n$ points of the form $(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_n, y_n)$ with $y_i \in \{-1, +1\}$ (Eq. 3). The separating hyperplane is $\mathbf{w} \cdot \mathbf{x} - b = 0$; anything on or above the boundary $\mathbf{w} \cdot \mathbf{x} - b = 1$ belongs to one class, whereas anything on or below the boundary $\mathbf{w} \cdot \mathbf{x} - b = -1$ belongs to the other class |
| 3 | Naïve Bayes | Naïve Bayes performs predictions based on the probability of events. It predicts the probability of data belonging to an input class from prior knowledge, i.e., the given data, by evaluating the posterior probability shown in Eq. 7: $P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}$ |
| 4 | J48 | J48 is a class for generating a pruned or unpruned C4.5 decision tree, first proposed by Quinlan. C4.5 splits on the attribute that maximizes the information gain $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{\lvert S_v \rvert}{\lvert S \rvert}\, \mathrm{Entropy}(S_v)$, where $\mathrm{Entropy}(S) = -\sum_i p_i \log_2 p_i$ and $S_v$ is the subset of $S$ for which attribute $A$ takes value $v$. The pruned decision tree obtained for this dataset is shown below (Fig. 10) |
| 5 | Ensembling Methods | AdaBoost is an ensembling technique that boosts the performance of decision trees and is used mainly for classification. Random Forest, first proposed by Ho, aggregates $B$ decision trees trained on bootstrap samples. The final aggregate predictor for regression (Eq. 11) is the average of the individual trees, $\hat{f}(x) = \frac{1}{B} \sum_{b=1}^{B} f_b(x)$, where $x$ is the query point and $f_b$ is the $b$-th tree; the final aggregate classifier for classification (Eq. 12) is the majority vote of the individual trees |
| 6 | Association Rule Mining | Association Rule Mining is a rule-based machine learning method that derives interesting relations between attributes in databases (Piatetsky-Shapiro), evaluating rules of the form $X \Rightarrow Y$ with measures such as support and confidence |
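To make these formulas concrete, the following is a minimal Python sketch (illustrative, not from the paper) of the Euclidean and Manhattan distances (Eqs. 1 and 2), the Naïve Bayes posterior (Eq. 7), and the ensemble aggregates for regression and classification (Eqs. 11 and 12); all function names are ours.

```python
import numpy as np

def euclidean(x, y):
    # Eq. 1: d(x, y) = sqrt( sum_i (x_i - y_i)^2 )
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sqrt(np.sum((x - y) ** 2))

def manhattan(x, y):
    # Eq. 2: d(x, y) = sum_i |x_i - y_i|
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sum(np.abs(x - y))

def naive_bayes_posterior(prior, likelihood, evidence):
    # Eq. 7: P(c | x) = P(x | c) * P(c) / P(x)
    return likelihood * prior / evidence

def aggregate_regression(tree_predictions):
    # Eq. 11: average of the B individual tree predictions at a point x
    return float(np.mean(tree_predictions))

def aggregate_classification(tree_votes):
    # Eq. 12: majority vote over the B individual tree predictions
    values, counts = np.unique(tree_votes, return_counts=True)
    return values[np.argmax(counts)]

# Two students described by their (ssc_p, hsc_p) percentages:
print(euclidean([67, 91], [79.33, 78.33]))   # ~17.68
print(manhattan([67, 91], [79.33, 78.33]))   # 25.0
print(aggregate_classification(["Placed", "Placed", "Not Placed"]))  # Placed
```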
J48 pruned tree
Fig. 10 Decision Tree
Sample Dataset
| sl_no | gender | ssc_p (Secondary Education %) | ssc_b (Secondary Education Board) | hsc_p (Higher Secondary %) | hsc_b (Higher Secondary Board) | hsc_s (Higher Secondary Subjects) | degree_p (Degree %) | degree_t (Undergraduate Degree Subjects) | workex (Work Experience) | etest_p (Employability Test %) | specialisation (Post-Graduation MBA) | mba_p (MBA %) | status (Placement Status) | salary (Salary Offered) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | M | 67 | Others | 91 | Others | Commerce | 58 | Sci&Tech | No | 55 | Mkt&HR | 58.8 | Placed | 270,000 |
| 2 | M | 79.33 | Central | 78.33 | Others | Science | 77.48 | Sci&Tech | Yes | 86.5 | Mkt&Fin | 66.28 | Placed | 200,000 |
| 3 | M | 65 | Central | 68 | Central | Arts | 64 | Comm&Mgmt | No | 75 | Mkt&Fin | 57.8 | Placed | 250,000 |
| 4 | M | 56 | Central | 52 | Central | Science | 52 | Sci&Tech | No | 66 | Mkt&HR | 59.43 | Not Placed | NaN |
| 5 | M | 85.8 | Central | 73.6 | Central | Commerce | 73.3 | Comm&Mgmt | No | 96.8 | Mkt&Fin | 55.5 | Placed | 425,000 |
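Assuming the dataset is available as a CSV file with the columns above (the file name `placement_data.csv` is hypothetical; the column spellings follow the table), a minimal pandas sketch for loading it and one-hot encoding the categorical predictors could look like this:

```python
import pandas as pd

# Hypothetical file name; the campus-placement dataset has the columns above.
df = pd.read_csv("placement_data.csv")

# 'salary' is NaN for students who were not placed, and the placement status
# itself is the prediction target, so both are excluded from the features.
X = df.drop(columns=["sl_no", "salary", "status"])
y = df["status"]  # "Placed" / "Not Placed"

# One-hot encode the categorical predictors; percentage columns stay numeric.
categorical = ["gender", "ssc_b", "hsc_b", "hsc_s", "degree_t",
               "workex", "specialisation"]
X = pd.get_dummies(X, columns=categorical)

print(X.shape, y.value_counts().to_dict())
```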
Comparative metrics of ML models
| Sr. No | ML Model Type | Correctly Classified Instances | Incorrectly Classified Instances | TP Rate | FP Rate | Recall | F-Measure | Root Mean Squared Error | Accuracy | Time Taken to Build the Model (seconds) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | J48 (C4.5) | 178 | 37 | 0.828 | 0.249 | 0.828 | 0.826 | 0.3923 | 82.790% | 0.03 |
| 2 | Bagging | 180 | 35 | 0.837 | 0.262 | 0.837 | 0.833 | 0.3368 | 83.720% | 0.03 |
| 3 | KNN | 159 | 56 | 0.740 | 0.559 | 0.740 | 0.676 | 0.5078 | 73.953% | 0 |
| 4 | Logistic Regression | 188 | 27 | 0.874 | 0.204 | 0.874 | 0.872 | 0.3088 | 87.441% | 0.29 |
| 5 | Naïve Bayes | 196 | 19 | 0.912 | 0.195 | 0.912 | 0.907 | 0.2541 | 91.162% | 0 |
| 6 | SVM | 186 | 29 | 0.865 | 0.208 | 0.865 | 0.863 | 0.3673 | 86.511% | 0.03 |
| 7 | Random Forest | 184 | 31 | 0.856 | 0.261 | 0.856 | 0.849 | 0.3258 | 85.581% | 0.12 |
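These are Weka's standard outputs, where TP rate, Recall, and F-Measure are weighted averages over both classes. As a rough illustration only, not the authors' exact pipeline, comparable numbers for Naïve Bayes could be computed with scikit-learn, reusing `X` and `y` from the loading sketch above:

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score, f1_score, recall_score

# 10-fold cross-validated predictions, comparable to Weka's default setup.
pred = cross_val_predict(GaussianNB(), X, y, cv=10)

# Weka reports Recall and F-Measure as weighted averages over both classes.
print("Recall   :", recall_score(y, pred, average="weighted"))
print("F-Measure:", f1_score(y, pred, average="weighted"))
print("Accuracy :", accuracy_score(y, pred))
```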
White Box models: Decision Tree, SVM, and Logistic Regression
| Model Type | Kernel Function/Ensemble Method | Accuracy | Prediction Speed | Training Time | Feature Selection | Principal Component Analysis (PCA) | Predictors |
|---|---|---|---|---|---|---|---|
| Decision Tree | Fine Tree | 63.7% | ~ 2000 obs/sec | 6.2652 s | All features used in the model, before PCA | Numeric predictors are used | All 7 categorical predictors are used in the model; PCA is not applied to categorical predictors |
| | Maximum number of splits: 100 | | | | | | |
| | Split criterion: Gini's diversity index | | | | | | |
| Linear SVM | Gaussian Kernel Function | 65.6% | ~ 2200 obs/sec | 2.4079 s | ✓ | ✓ | ✓ |
| | Multi-Class Method: One-vs-One | | | | | | |
| Logistic Regression | — | 68.8% | ~ 1700 obs/sec | 4.9942 s | ✓ | ✓ | ✓ |
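The settings above correspond to a MATLAB Classification Learner-style workflow. As a hedged sketch (not the authors' exact configuration), they map approximately onto scikit-learn as follows, again reusing `X` and `y`; note that `max_leaf_nodes` only approximates a "maximum number of splits" cap:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Fine tree: Gini's diversity index, at most 100 splits. scikit-learn caps
# leaves rather than splits, so max_leaf_nodes=101 is only an approximation.
tree = DecisionTreeClassifier(criterion="gini", max_leaf_nodes=101)

# SVM with a Gaussian (RBF) kernel; SVC handles multi-class one-vs-one.
svm = SVC(kernel="rbf")

logreg = LogisticRegression(max_iter=1000)

for name, model in [("Decision Tree", tree), ("SVM", svm),
                    ("Logistic Regression", logreg)]:
    model.fit(X, y)
    print(name, "training accuracy:", model.score(X, y))
```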
Black Box models: ensemble methods and narrow neural network
| Model Type | Kernel Function/Ensemble Method | Accuracy | Prediction Speed | Training Time | Feature Selection | Principal Component Analysis (PCA) | Predictors |
|---|---|---|---|---|---|---|---|
| Boosted Trees | Ensemble method: AdaBoost | 64.2% | ~ 740 obs/sec | 2.9058 s | ✓ | Numeric predictors are used | All 7 categorical predictors are used in the model; PCA is not applied to categorical predictors |
| | Learner Type: Decision Tree | | | | | | |
| | Maximum number of splits: 20 | | | | | | |
| | Number of learners: 30 | | | | | | |
| | Learning rate: 0.1 | | | | | | |
| Bagged Trees | Ensemble method: Bag | 65.6% | ~ 830 obs/sec | 2.9635 s | ✓ | ✓ | ✓ |
| | Learner Type: Decision Tree | | | | | | |
| | Maximum number of splits: 214 | | | | | | |
| | Number of learners: 30 | | | | | | |
| RUS Boosted Trees | Ensemble method: RUSBoost | 58.1% | ~ 920 obs/sec | 2.8604 s | ✓ | ✓ | ✓ |
| | Learner Type: Decision Tree | | | | | | |
| | Maximum number of splits: 20 | | | | | | |
| | Number of learners: 30 | | | | | | |
| | Learning rate: 0.1 | | | | | | |
| | Split criterion: Gini's diversity index | | | | | | |
| Neural Network | Narrow Neural Network | 68.8% | ~ 2800 obs/sec | 2.841 s | ✓ | ✓ | ✓ |
| | Number of fully connected layers: 1 | | | | | | |
| | First layer size: 10 | | | | | | |
| | Activation: ReLU | | | | | | |
| | Iteration limit: 1000 | | | | | | |
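Likewise, the ensemble and neural-network hyperparameters listed above translate only approximately into scikit-learn. The sketch below assumes scikit-learn ≥ 1.2 (for the `estimator` keyword) and notes that RUSBoost lives in the separate imbalanced-learn package; it is an illustration, not the authors' MATLAB models:

```python
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Boosted trees: AdaBoost over 30 decision trees, learning rate 0.1, roughly
# 20 splits per tree (approximated here via max_leaf_nodes=21).
boosted = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_leaf_nodes=21),
    n_estimators=30, learning_rate=0.1)

# Bagged trees: 30 decision trees with up to 214 splits each.
bagged = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_leaf_nodes=215),
    n_estimators=30)

# RUSBoost is not in scikit-learn; imbalanced-learn provides
# imblearn.ensemble.RUSBoostClassifier with analogous parameters.

# Narrow neural network: one fully connected layer of 10 ReLU units.
nn = MLPClassifier(hidden_layer_sizes=(10,), activation="relu", max_iter=1000)

for name, model in [("Boosted Trees", boosted), ("Bagged Trees", bagged),
                    ("Neural Network", nn)]:
    model.fit(X, y)
    print(name, "training accuracy:", model.score(X, y))
```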
Confusion matrices of White Box models
| Model Type | True Class | Predicted: Not Placed | Predicted: Placed | TPR | FNR | PPV | FDR |
|---|---|---|---|---|---|---|---|
| Decision Tree | Not Placed | 20 | 47 | 29.9% | 70.1% | 39.2% | 60.8% |
| Decision Tree | Placed | 31 | 117 | 79.1% | 20.9% | 71.3% | 28.7% |
| SVM | Not Placed | 0 | 67 | 0% | 100% | 0% | 100% |
| SVM | Placed | 1 | 147 | 99.3% | 0.7% | 68.7% | 31.3% |
| Logistic Regression | Not Placed | 0 | 67 | 0% | 100% | 0% | 0% |
| Logistic Regression | Placed | 0 | 148 | 100% | 0% | 68.8% | 31.2% |
TPR = True Positive Rate; FNR = False Negative Rate; PPV = Positive Predictive Value; FDR = False Discovery Rate
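The per-class rates in these confusion-matrix tables follow directly from the counts. A small self-contained helper (ours, not the paper's) shows the arithmetic:

```python
def class_rates(tp, fn, fp):
    """Per-class rates from confusion-matrix counts.

    tp: points of this class predicted correctly
    fn: points of this class predicted as the other class
    fp: points of the other class predicted as this class
    """
    tpr = tp / (tp + fn) if tp + fn else 0.0  # True Positive Rate
    fnr = fn / (tp + fn) if tp + fn else 0.0  # False Negative Rate
    ppv = tp / (tp + fp) if tp + fp else 0.0  # Positive Predictive Value
    fdr = fp / (tp + fp) if tp + fp else 0.0  # False Discovery Rate
    return tpr, fnr, ppv, fdr

# Decision Tree, class "Placed": 117 correct, 31 missed, 47 false alarms.
print([round(v, 3) for v in class_rates(117, 31, 47)])
# -> [0.791, 0.209, 0.713, 0.287], matching the table row above.
```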
Confusion matrices of Black Box models
| Model Type | True Class | Predicted: Not Placed | Predicted: Placed | TPR | FNR | PPV | FDR |
|---|---|---|---|---|---|---|---|
| Boosted Trees | Not Placed | 17 | 50 | 25.4% | 74.6% | 38.6% | 61.4% |
| Boosted Trees | Placed | 27 | 121 | 81.8% | 18.2% | 70.8% | 29.2% |
| Bagged Trees | Not Placed | 17 | 50 | 25.4% | 74.6% | 39.5% | 60.5% |
| Bagged Trees | Placed | 26 | 122 | 82.4% | 17.6% | 70.9% | 29.1% |
| RUSBoosted Trees | Not Placed | 37 | 30 | 55.2% | 44.8% | 38.5% | 61.5% |
| RUSBoosted Trees | Placed | 59 | 89 | 60.1% | 39.9% | 74.8% | 25.2% |
| Neural Network | Not Placed | 0 | 67 | 0% | 100% | 0% | 0% |
| Neural Network | Placed | 0 | 148 | 100% | 0% | 68.8% | 31.2% |
Fig. 4 Machine learning workflow for performing predictions
Fig. 5 (a) Decision tree correct predictions ‘ssc_p’ vs ‘hsc_p’. (b) Decision tree incorrect predictions ‘ssc_p’ vs ‘hsc_p’
Fig. 6 (a) Decision tree correct predictions ‘etest_p’ vs ‘degree_t’. (b) Decision tree incorrect predictions ‘etest_p’ vs ‘degree_t’
Fig. 7 (a) Bagging correct predictions ‘ssc_p’ vs ‘hsc_p’. (b) Bagging incorrect predictions ‘ssc_p’ vs ‘hsc_p’
Fig. 8 (a) Bagging correct predictions ‘etest_p’ vs ‘degree_t’. (b) Bagging incorrect predictions ‘etest_p’ vs ‘degree_t’
Fig. 9 Comparative evaluation of the ML classifiers