Vikash Kumar, Ditipriya Sinha.
Abstract
As the Internet has become the mainstream channel for e-commerce, online banking, health systems and other day-to-day essentials, the risk of exposure to various attacks is increasing exponentially. Zero-day attacks, which target unknown vulnerabilities of software or systems, open up a further research direction in the field of cyber-attacks. Existing approaches use either ML/DNN or anomaly-based techniques to protect against these attacks. Detecting zero-day attacks with these techniques misses several parameters, such as the frequency of particular byte streams in network traffic and their correlation. Attacks that produce low traffic volumes are difficult to cover with neural network models, which require higher traffic volumes for correct prediction. This paper proposes a novel, robust and intelligent cyber-attack detection model that addresses these issues, using the concept of heavy-hitters and a graph technique to detect zero-day attacks. The proposed work consists of two phases: (a) signature generation and (b) evaluation. The model evaluates its performance using the signatures generated in the training phase. The result analysis of the proposed zero-day attack detection shows an accuracy of 91.33% for binary classification and 90.35% for multi-class classification on real-time attack data. Against the benchmark data set CICIDS18, the model achieves a promising 91.62% for binary-class classification. Thus, the proposed approach shows encouraging results for detecting zero-day attacks.
Keywords: Cyber-attacks; Heavy-hitters; High volume attack; Low volume attack; Signature generation; Token extraction; Zero-day attack
Year: 2021 PMID: 34777966 PMCID: PMC8160422 DOI: 10.1007/s40747-021-00396-9
Source DB: PubMed Journal: Complex Intell Systems ISSN: 2199-4536
Fig. 1 Different phases of ZA
Fig. 2 Example of the adjacency matrix for the above connectivity
Summary of key state-of-the-art approaches
| Author & Year | Methodology | Summary |
|---|---|---|
| Blaise et al. [ | Statistical approach based on analysis of ports | |
| R.M. et al. [ | DNN based approach | |
| Javed et al. [ | LSTM based CNN Model | through automotive vehicles on different classifier to make final decision |
| Sameera & Shashi et al. [ | DNN | |
| Hindy et al. [ | DNN | |
| Alauthman et al. [ 2020 | Reinforcement learning-based detection | |
| Singh et al. [ | Hybrid approach using Snort IDS | |
| Tang et al. [ | Statistical Model | |
| Khan et al. [ | Hybrid approach using Bloom filter & KNN | |
| Kumar et al. [ | Deep Learning approach | |
| Sun et al. [ | Bayesian networks based approach | |
| Kim et al. [ | GAN based on deep autoencoder | |
| Duessel et al. [ | One class SVM | |
Fig. 3 Building signature-base for ZA detection
Fig. 4 Pool example
Abbreviations used in algorithms of the proposed work
| Notation | Description |
|---|---|
| | Non-attack or Genuine traffic pool |
| | HVA pool |
| | High volume attack signature knowledge base |
| | Unqualified heavy-hitter tokens |
| | Merged tokens list |
| | Count of token in attack pool |
| | Count of token in genuine pool |
| | Threshold value for HVA signature qualification |
| | LVA pool |
| | Adjacency matrix of generated graph |
| | Merged tokens of LVA module |
| | Array of score value assigned to each vertex in the graph |
| | Constant value for tuning score value |
| | Weight of path from vertex A to B |
| | Average path weight |
| | Threshold value for LVA signature qualification |
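The HVA entries in the table above (token counts in the attack and genuine pools, and a qualification threshold) suggest a frequency-based filter. The following is a minimal sketch of that idea only, not the paper's algorithm: the fixed-length byte tokenization, the `min_count` heavy-hitter cutoff, the ratio score and all function and parameter names are assumptions introduced for illustration.

```python
from collections import Counter


def extract_tokens(payload: bytes, n: int = 4) -> list:
    """Slide an n-byte window over a payload (hypothetical tokenization)."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]


def qualify_hva_tokens(attack_pool, genuine_pool, n=4, min_count=2, threshold=0.8):
    """Keep frequent attack-pool tokens that rarely occur in genuine traffic.

    Assumed rule: a token qualifies when it appears at least `min_count`
    times in the attack pool and its share of occurrences that come from
    the attack pool is at least `threshold`.
    """
    attack_counts = Counter(t for p in attack_pool for t in extract_tokens(p, n))
    genuine_counts = Counter(t for p in genuine_pool for t in extract_tokens(p, n))

    signatures = {}
    for token, in_attack in attack_counts.items():
        if in_attack < min_count:
            continue  # not a heavy hitter in the attack pool
        in_genuine = genuine_counts.get(token, 0)
        score = in_attack / (in_attack + in_genuine)
        if score >= threshold:
            signatures[token] = score
    return signatures


# Toy pools: b"AAAA" is frequent in attack traffic and absent from genuine traffic.
attack_pool = [b"AAAAxx", b"AAAAyy"]
genuine_pool = [b"xxzzzz"]
signatures = qualify_hva_tokens(attack_pool, genuine_pool)
print(signatures)
```

An exact `Counter` is used here for clarity; on high-rate traffic a streaming heavy-hitter summary would typically replace it.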
Fig. 5 Extraction of tokens from input stream
Fig. 14 ROC curve to decide the best value of
Fig. 6 Advancing of LVA signature extraction phase
Fig. 7 Analogy graph used in the proposed work
Fig. 8 Detecting ZA by applying signature knowledge base
Fig. 9 Working procedure of the proposed work
System Specification
| Operating System | Version | System Specification |
|---|---|---|
| Kali Linux 64-bit | 2020-3, 2019-4 | |
| Ubuntu Server/Client | 14.04.5 LTS, 16.04 LTS | |
| Metasploitable | v4.11.4-2015071402 | |
| Windows 32/64-bit | NT 6.1, NT 6.3, NT 10.0 | |
Fig. 10 System setup to generate data for the proposed work
Fig. 11 Wireshark window capturing DoS attack
Fig. 12 Performing DoS attack using ettercap
Fig. 13 Performing probe on a target host
Class distribution among training and testing data
| Attack Type | Training Data | Testing Data |
|---|---|---|
| TCP SYN | Yes | No |
| UDP Flood | No | Yes |
| HTTP Flood | Yes | No |
| Probe | Yes (Service scan) | Yes (OS, Network scan) |
| Data Theft | HTTP | FTP |
Abbreviations used for the complexity analysis
| Notation | Description |
|---|---|
| | Number of tokens generated using HVA pool |
| | Number of tokens generated using LVA pool |
| | Number of tokens generated using genuine pool |
| | Number of packets in HVA pool |
| | Number of packets in LVA pool |
| | Number of packets in genuine traffic pool |
| | Total number of tokens extracted from traffic |
| | Number of frequent tokens |
| | Total signatures |
| | Edge set |
| | LVA signatures |
| | Vertex set comprising LVA qualified tokens |
| | Number of test packets |
Distribution of instances in different classes
| Classes | DoS | DDoS | OS scan | Network scan | Data theft | Normal |
|---|---|---|---|---|---|---|
| Instances | 5000 | 4000 | 5000 | 3400 | 1600 | 6032 |
Confusion matrix for real-time test data set under binary-class classification
| Classes | Attack | Normal |
|---|---|---|
| Attack | 16867 | 2133 |
| Normal | 38 | 5994 |
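The reported binary accuracy can be recomputed from the confusion matrix above. The sketch below assumes that rows are actual classes and columns are predicted classes, an orientation the table does not state; accuracy is unaffected by that choice, while precision and recall would swap roles. The function name `binary_metrics` is illustrative.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision and recall for the positive (Attack) class."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Assumption: rows = actual class, columns = predicted class.
# Attack row: 16867 detected as Attack, 2133 missed as Normal.
# Normal row: 38 flagged as Attack, 5994 correctly passed as Normal.
real_time = binary_metrics(tp=16867, fn=2133, fp=38, tn=5994)
print({name: round(value * 100, 2) for name, value in real_time.items()})
```

The accuracy rounds to 91.33%, which agrees with the figure given in the abstract for binary classification on the real-time data.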
Fig. 15 Performance evaluation of the proposed system for binary class
Confusion matrix for real-time test data set under multi-class classification
| | HVA | LVA | Normal |
|---|---|---|---|
| HVA | 6960 | 78 | 1962 |
| LVA | 166 | 9663 | 171 |
| Normal | 21 | 17 | 5994 |
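Per-class precision and recall for the three-class case can be derived from the matrix above in the same way. This is a sketch under the assumption that rows are actual classes and columns are predicted classes; the helper name `per_class_metrics` is hypothetical.

```python
labels = ["HVA", "LVA", "Normal"]

# Assumption: rows = actual class, columns = predicted class.
matrix = [
    [6960, 78, 1962],   # actual HVA
    [166, 9663, 171],   # actual LVA
    [21, 17, 5994],     # actual Normal
]


def per_class_metrics(matrix, labels):
    """Return overall accuracy and a per-class precision/recall report."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                  # all samples of this class
        col_total = sum(row[i] for row in matrix)   # all predictions of this class
        report[label] = {
            "precision": matrix[i][i] / col_total,
            "recall": matrix[i][i] / row_total,
        }
    return correct / total, report


accuracy, report = per_class_metrics(matrix, labels)
print(round(accuracy * 100, 2))
for label in labels:
    print(label, {k: round(v, 4) for k, v in report[label].items()})
```

The overall accuracy rounds to 90.35%, matching the multi-class figure in the abstract. Under the assumed orientation, most of the errors are HVA samples classified as Normal (1962 of 9000).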
Fig. 16 Precision recall comparison for multi-class classification
Fig. 17 Performance matrices for multiclass classification on real-time data
Confusion matrix for CICIDS18 under binary-class classification
| Classes | Attack | Normal |
|---|---|---|
| Attack | 314018 | 29243 |
| Normal | 38 | 5994 |
Confusion matrix for CICIDS18 data set under multi-class classification
| | HVA | LVA | Normal |
|---|---|---|---|
| HVA | 242014 | 8054 | 23537 |
| LVA | 1183 | 62764 | 5706 |
| Normal | 21 | 17 | 5994 |
Fig. 18 Performance matrices for binary-class classification on CICIDS18 data
Fig. 19 Performance matrices for multiclass classification on CICIDS18 data
Fig. 20 Comparative performance analysis of proposed framework with DNN-based approach [41]