Abstract
Although digital transformation is advancing, a significant portion of the population worldwide remains unfamiliar with the technological means that malicious users employ to deceive them and gain substantial financial benefits through phishing. Phishing is the deception of Internet users: the perpetrator pretends to be a credible entity, exploiting both the weak protection offered by electronic tools and the victim's lack of awareness to illegally obtain personal information, such as bank account codes and sensitive private data. The education sector is one of the most common targets of phishing attacks, as distance learning became necessary for billions of students worldwide during the pandemic and many educational institutions were forced to move to the digital environment with minimal or no preparation. This paper presents a semisupervised majority-weighted vote system for detecting phishing attacks, in a case study specific to the education sector. A realistic majority weighted vote scheme is used to optimize learning ability in selecting the most appropriate classifier, which proves to be exceptionally reliable in complex decision-making environments. In particular, the voting naive Bayes positive algorithm is presented, an innovative approach to probabilistic partially supervised learning that accurately predicts the class of test instances using pre-labeled training instances from the positive class only.
Year: 2022 PMID: 35401723 PMCID: PMC8989555 DOI: 10.1155/2022/7402085
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. The majority weighted vote methodology.
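The idea behind a weighted majority vote can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_majority_vote`, the labels, and the weights are hypothetical and chosen only for illustration. Each classifier casts one vote for a label, the votes are summed per label using the classifier's weight, and the label with the largest total wins.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    predictions: one predicted label per classifier
    weights: one non-negative weight per classifier (same order)
    Ties are resolved in favor of the label that was seen first.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical votes from three classifiers for a single URL.
votes = ["phishing", "legitimate", "legitimate"]

# Equal weights reduce to a plain majority vote.
plain = weighted_majority_vote(votes, [1.0, 1.0, 1.0])

# Assumed weights: the first classifier is trusted more than the other two combined.
weighted = weighted_majority_vote(votes, [0.6, 0.25, 0.25])

print(plain, weighted)
```

With equal weights the two "legitimate" votes win. With the assumed weights the single, more trusted classifier outweighs them, so the result changes to "phishing". How the weights are derived (for example, from each classifier's validation performance) is a design choice that this sketch leaves open.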
Performance measures.
| Model | Accuracy | AUC | Recall | Precision | F1 | Kappa | MCC | Training time (s) |
|---|---|---|---|---|---|---|---|---|
| Voting naive Bayes positive | 0.9314 | 0.9982 | 0.9292 | 0.9320 | 0.9312 | 0.8722 | 0.8871 | 2.339 |
| Light gradient boosting machine | 0.8949 | 0.9777 | 0.8770 | 0.8970 | 0.8941 | 0.8197 | 0.8218 | 0.244 |
| Extreme gradient boosting | 0.8942 | 0.9759 | 0.8745 | 0.8976 | 0.8935 | 0.8187 | 0.8211 | 15.896 |
| CatBoost classifier | 0.8926 | 0.9763 | 0.8710 | 0.8950 | 0.8921 | 0.8154 | 0.8172 | 4.328 |
| Random forest classifier | 0.8918 | 0.9739 | 0.8685 | 0.8961 | 0.8918 | 0.8145 | 0.8169 | 0.562 |
| Gradient boosting classifier | 0.8864 | 0.9747 | 0.8635 | 0.8914 | 0.8861 | 0.8053 | 0.8082 | 0.665 |
| SVM - radial kernel | 0.8726 | 0.9498 | 0.8388 | 0.8765 | 0.8716 | 0.7806 | 0.7832 | 0.387 |
| k-Neighbors classifier | 0.8687 | 0.9494 | 0.8336 | 0.8700 | 0.8666 | 0.7727 | 0.7753 | 0.128 |
| MLP classifier | 0.7988 | 0.8728 | 0.8076 | 0.7877 | 0.7541 | 0.7719 | 0.7056 | 6.322 |
Figure 2. Precision majority vote (left) vs. precision weighted vote (right).