Arushi Agarwal, Purushottam Sharma, Mohammed Alshehri, Ahmed A Mohamed, Osama Alfarraj.
Abstract
In today's cyber world, demand for the internet grows day by day, and concern about network security grows with it. An Intrusion Detection System (IDS) aims to defend against many fast-growing network attacks (e.g., DDoS, ransomware, and botnet attacks) by blocking harmful activity in the network system. In this work, three machine learning classification algorithms, Naïve Bayes (NB), Support Vector Machine (SVM), and K-nearest neighbor (KNN), were evaluated on the UNSW-NB15 dataset for detection accuracy and processing time, to find the algorithm best suited to learning the patterns of suspicious network activity efficiently. The data gathered from the feature set comparison was then fed to the IDS to train the system for future intrusion behavior prediction and analysis, using the best-fit algorithm chosen from the three on the basis of the performance metrics obtained. In addition, classification reports (precision, recall, and F1-score) and confusion matrices were generated and compared to finalize the support-validation status found throughout the testing phase of the model used in this approach.
Keywords: Intrusion detection system; K-Nearest Neighbors (KNN); Naive Bayes (NB); Support vector machine (SVM); UNSW-NB15 dataset
Year: 2021 PMID: 33954233 PMCID: PMC8049129 DOI: 10.7717/peerj-cs.437
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1. SVM classifier. (A) SVM classification technique. (B) SVM hyperplane selection.
Figure 2. Classification with KNN where K = 3.
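The K = 3 case named in the Figure 2 caption can be illustrated with a minimal sketch of nearest-neighbour voting. This is not the authors' implementation: the function name `knn_predict`, the Euclidean distance choice, and the toy 2-D records are assumptions made for illustration, and no UNSW-NB15 data is used.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point (assumed metric).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label occurs.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D records for illustration only (not taken from UNSW-NB15).
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["normal", "normal", "normal",
                "attack", "attack", "attack"]

prediction = knn_predict(train_points, train_labels, (4.9, 5.0), k=3)
```

With an odd K and two classes, the vote cannot tie, which is one common reason for choosing a value such as 3.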
Attack types classification of UNSW-NB15.
| Sl. No. | Attack class | No. of samples | Attack subcategory |
|---|---|---|---|
| 1. | Fuzzers | 5,052 | FTP, HTTP, RIP, Syslog, PPTP, FTP, DCERPC, OSPF |
| 2. | Reconnaissance | 1,760 | Telnet, SNMP, NetBIOS, DNS, SCTP, MSSQL, SMTP |
| 3. | Shellcode | 224 | FreeBSD, HP-UX, NetBSD, AIX, SCO Unix, decoders, IRIX, Mac OS X, BSDi, Solaris |
| 4. | Analysis | 527 | HTML, Port Scanner, Spam |
| 5. | Backdoors | 535 | Backdoors |
| 6. | DoS | 1,168 | Ethernet, VPN, IRC, DP, TCP, VNC, XINETD, NTP, Asterisk, RTSP, CUPS, Cisco Skinny |
| 7. | Worms | 25 | Worms |
| 8. | Generic | 7,523 | SIP, IXIA, Superflow, TFTP, HTTP |
| 9. | Exploits | 5,410 | Evasions, SCCP, WINS, DCERPC, Dameware, SCADA, VNC, CDAP, RTSP, LPD, RDesktop, NNTP, SMB, Evasions, RADIUS, SCCP, SIP, PPTP |
Figure 3. Confusion matrix.
Figure 4. Confusion matrix with rules.
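The abstract reports precision, recall, and F1-score alongside the confusion matrix. As a hedged sketch of how those metrics follow from the four cells of a binary confusion matrix, the snippet below uses the standard textbook definitions. The helper name `binary_metrics` and the cell counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive standard metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

For a multi-class problem such as the nine attack classes, the same formulas are usually applied per class by treating that class as positive and all others as negative.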
Figure 5. Proposed framework model.
Figure 6. Importing libraries.
Figure 7. Dataset information.
Figure 8. Importing algorithms.
Figure 9. Heat-map.
Figure 10. Training and testing phase.
Figure 11. Training and testing information. (A) Training data set for the x-axis. (B) Training data set for the y-axis.
Figure 12. Support vector machine (SVM).
Figure 13. K-nearest neighbour (KNN).
Figure 14. Naïve Bayes (NB).
Figure 15. (A) Implementation result of an existing model. (B) Implementation result of an existing model.
Comparison result between existing and proposed model.
| Model name | Accuracy of existing model (%) | Accuracy of proposed model (%) |
|---|---|---|
| Naïve Bayes (NB) | 87.4126 | 92.0591 |
| kNN | 93.7063 | 93.7228 |
| SVM | 91.6084 | 95.1488 |
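To make the comparison easier to read, the short sketch below computes the gain of the proposed model over the existing model for each classifier, using only the accuracy values from the table above. The variable names are illustrative, and the differences are percentage points under the assumption that the table values are percentages.

```python
# (existing, proposed) accuracy pairs copied from the comparison table.
accuracies = {
    "NB": (87.4126, 92.0591),
    "kNN": (93.7063, 93.7228),
    "SVM": (91.6084, 95.1488),
}

# Gain of the proposed model over the existing one, rounded to 4 decimals.
gains = {
    name: round(proposed - existing, 4)
    for name, (existing, proposed) in accuracies.items()
}

# Classifier with the highest accuracy under the proposed model.
best_model = max(accuracies, key=lambda name: accuracies[name][1])
```

On these figures the gain is 4.6465 points for NB, 0.0165 for kNN, and 3.5404 for SVM, with SVM reaching the highest accuracy under the proposed model.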
Accuracy comparison table (using UNSW-NB15 dataset).
| Serial No. | Classification algorithm | Accuracy obtained (%) | Remark |
|---|---|---|---|
| 1 | SVM | 97.7777 | Highest |
| 2 | kNN | 93.3333 | |
| 3 | NB | 95.5555 | |
Figure 16. Accuracy comparison graph (using UNSW-NB15 dataset).