Eva Rodríguez, Pol Valls, Beatriz Otero, Juan José Costa, Javier Verdú, Manuel Alejandro Pajuelo, Ramon Canal.
Abstract
Cyberattacks in the Internet of Things (IoT) are growing exponentially, especially zero-day attacks, which are mostly driven by security weaknesses in IoT networks. Traditional intrusion detection systems (IDSs) have adopted machine learning (ML), especially deep learning (DL), to improve the detection of cyberattacks. DL-based IDSs require balanced datasets with large amounts of labeled data; however, such large collections are lacking for IoT networks. This paper proposes an efficient intrusion detection framework based on transfer learning (TL), knowledge transfer, and model refinement for the effective detection of zero-day attacks. The framework is tailored to 5G IoT scenarios with unbalanced and scarce labeled datasets. The TL model is based on convolutional neural networks (CNNs). The framework was evaluated on its ability to detect a wide range of zero-day attacks. To this end, three specialized datasets were created. Experimental results show that the proposed TL-based framework achieves high accuracy and a low false prediction rate (FPR). The proposed solution has better detection rates for the different families of known and zero-day attacks than any previous DL-based IDS. These results demonstrate that TL is effective in the detection of cyberattacks in IoT environments.
Keywords: IoT networks; convolutional neural network; cybersecurity; intrusion detection systems; transfer learning
Year: 2022 PMID: 35957178 PMCID: PMC9371036 DOI: 10.3390/s22155621
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Transfer-learning-based intrusion detection in IoT environments.
| Reference | TL | Source Dataset | Target Dataset | Accuracy |
|---|---|---|---|---|
| Wu et al. | CNN-CNN | UNSW-NB15 | NSL-KDD | 81.94% |
| Masum et al. | DNN-DNN | VGG-16 | NSL-KDD | 70.97% |
| Sameera et al. | PCA-KNN | NSL-KDD | NSL-KDD | 89.79% |
| Singla et al. | DNN-DNN | UNSW-NB15 Subset | UNSW-NB15 Subset (single new attack) | 95–98% |
| Li et al. | SVM-RF | AWID | AWID | 96% |
| Mehedi et al. | CNN-CNN | Custom | Custom | 98.1% |
| Fan et al. | CNN-CNN | CICIDS2017 | Custom | 91.93% |
| Idrissi et al. | CNN-CNN | BoT-IoT | TON-IoT | 99.43% |
| Guan et al. | BiT EfficientNet | Custom | 10% USTC-TFC2016 | 96% |
| Mehedi et al. | CNN | Custom | Custom | 87% |
Figure 1. CNN architecture.
Figure 2. Overall structure of the proposed intrusion detection framework.
Figure 3. CNN-based IDS-model structure.
Figure 4. CNN-TL IDS-model structure.
BoT-IoT Dataset.
| Category | Subcategory | Records | Description |
|---|---|---|---|
| Normal | Normal | 9543 | Natural transaction data. |
| DoS | TCP | 38,532,480 | A malicious attack to cripple the services offered by a site, server, or network overloading the target of its associated infrastructure by flooding the site with many requests. |
| DDoS | TCP | 33,005,194 | Attack where multiple compromised computer systems attack a target, causing a DoS. |
| Reconnaissance | OS fingerprinting | 1,821,639 | All the different strikes simulating attacks gathering information. |
| Information Theft | Keylogging | 1587 | Stealing of personal user information. |
UNSW-NB15 dataset.
| Category | Records | Description |
|---|---|---|
| Normal | 2,218,761 | Natural transaction data. |
| Generic | 215,481 | Attack against blockciphers with a given block and key size (not considering its structure). |
| Exploits | 44,525 | Attack that exploits vulnerabilities, taking advantage of security problems (of an operating system or a piece of software) known by the attackers. |
| Fuzzers | 24,246 | Attack that suspends a program or network, feeding it with randomly generated data. |
| DoS | 16,353 | A malicious attack that makes a server or network resource unavailable, overloading the target of the associated infrastructure with a flood of Internet traffic. |
| Reconnaissance | 13,987 | Comprises different attacks that gather information. |
| Analysis | 2677 | Different attacks on penetrations (HTML files, spam, and port scan). |
| Backdoors | 2329 | An attack that bypasses a system security mechanism to access a computer or its data. |
| Shellcode | 1511 | Attack that exploits software vulnerabilities using small pieces of code as payloads. |
| Worms | 174 | Attack where the attacker replicates itself to spread to other computers. |
UNSW-NB15-Basic-Train and UNSW-NB15-Basic-Test datasets.
| Name | Basic-Train Records | Basic-Train Percentage | Basic-Test Records | Basic-Test Percentage |
|---|---|---|---|---|
| Normal | 217,552 | 49.95% | 72,794 | 50.14% |
| Generic | 161,865 | 37.17% | 53,616 | 36.93% |
| Exploits | 33,408 | 7.67% | 11,117 | 7.66% |
| DoS | 12,196 | 2.80% | 4157 | 2.86% |
| Reconnaissance | 10,498 | 2.41% | 3489 | 2.40% |
UNSW-NB15-Test+ and UNSW-NB15-Test datasets.
| Name | Test+ Records | Test+ Percentage | Test Records | Test Percentage |
|---|---|---|---|---|
| Normal | 30,937 | 50.00% | 321,283 | 50.00% |
| Generic | - | - | 215,481 | 33.53% |
| Exploits | - | - | 44,525 | 6.93% |
| DoS | - | - | 16,353 | 2.54% |
| Reconnaissance | - | - | 13,987 | 2.18% |
| Fuzzers | 24,246 | 39.19% | 24,246 | 3.77% |
| Analysis | 2677 | 4.33% | 2677 | 0.42% |
| Backdoor | 2329 | 3.76% | 2329 | 0.36% |
| Shellcode | 1511 | 2.44% | 1511 | 0.24% |
| Worms | 174 | 0.28% | 174 | 0.03% |
Dataset summary showing the number of records corresponding to normal and malicious traffic, the corresponding percentage of attacks, and the percentage of novel attacks.
| Dataset | Normal | Attack | % Attack | % Novel Attack |
|---|---|---|---|---|
| BoT-IoT | 9543 | 5,823,226 | 99.84% | - |
| UNSW-NB15-Basic-Train | 217,552 | 217,967 | 50.04% | - |
| UNSW-NB15-Basic-Test | 72,794 | 72,379 | 49.85% | 0.00% |
| UNSW-NB15-Test+ | 30,937 | 30,937 | 50.00% | 100.00% |
| UNSW-NB15-Test | 321,283 | 321,283 | 50.00% | 9.63% |
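The last two columns of this summary can be recomputed from the per-family record counts in the UNSW-NB15-Test+ and UNSW-NB15-Test tables. The sketch below assumes that "% Novel Attack" means the share of attack records belonging to families absent from the Basic-Train set (Fuzzers, Analysis, Backdoor, Shellcode, Worms); that reading is an interpretation of the tables, and the function and variable names are illustrative.

```python
# Record counts copied from the UNSW-NB15-Test+ and UNSW-NB15-Test tables.
KNOWN = {"Generic": 215_481, "Exploits": 44_525, "DoS": 16_353,
         "Reconnaissance": 13_987}
NOVEL = {"Fuzzers": 24_246, "Analysis": 2_677, "Backdoor": 2_329,
         "Shellcode": 1_511, "Worms": 174}


def summarize(normal, known, novel):
    """Return (attack records, % attack, % novel attack) for one dataset.

    Assumption: "% novel attack" is the fraction of attack records whose
    family does not appear in the training set.
    """
    novel_total = sum(novel.values())
    attack = sum(known.values()) + novel_total
    pct_attack = 100 * attack / (normal + attack)
    pct_novel = 100 * novel_total / attack
    return attack, round(pct_attack, 2), round(pct_novel, 2)


# UNSW-NB15-Test+ holds only novel families; UNSW-NB15-Test holds both.
test_plus = summarize(30_937, {}, NOVEL)
test_full = summarize(321_283, KNOWN, NOVEL)
print(test_plus)
print(test_full)
```

Under that assumption the computed values agree with the rows for UNSW-NB15-Test+ and UNSW-NB15-Test above.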
Common features for the BoT-IoT and UNSW-NB15 datasets.
| # | BoT-IoT | UNSW-NB15 | Type | Description |
|---|---|---|---|---|
| 1 | proto | proto | nominal | Textual representation of transaction protocols present in network flow. |
| 2 | saddr | srcip | nominal | Source IP address. |
| 3 | sport | sport | integer | Source port number. |
| 4 | daddr | dstip | nominal | Destination IP address. |
| 5 | dport | dsport | integer | Destination port number. |
| 6 | spkts | spkts | float | Source-to-destination packet count. |
| 7 | dpkts | dpkts | float | Destination-to-source packet count. |
| 8 | sbytes | sbytes | float | Source-to-destination byte count. |
| 9 | dbytes | dbytes | float | Destination-to-source byte count. |
| 10 | state | state | nominal | Transaction state. |
| 11 | stime | stime | timestamp | Record start time. |
| 12 | ltime | ltime | timestamp | Record last time. |
| 13 | dur | dur | float | Record total duration. |
| 14 | attack | label | binary | Class label: 0 for normal traffic, 1 for attack. |
| 15 | category | attack_cat | nominal | Cyberattack family. |
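Because the two datasets name several shared features differently, records from one have to be renamed before a model trained on the other can consume them. The following is a minimal sketch of that alignment step, with the mapping copied from the table; the helper `to_unsw_schema` and the sample record are hypothetical, not part of the published framework.

```python
# BoT-IoT column name -> UNSW-NB15 column name, copied from the table above.
BOT_IOT_TO_UNSW = {
    "proto": "proto", "saddr": "srcip", "sport": "sport",
    "daddr": "dstip", "dport": "dsport", "spkts": "spkts",
    "dpkts": "dpkts", "sbytes": "sbytes", "dbytes": "dbytes",
    "state": "state", "stime": "stime", "ltime": "ltime",
    "dur": "dur", "attack": "label", "category": "attack_cat",
}


def to_unsw_schema(record):
    """Keep only the shared features and rename them to UNSW-NB15 names."""
    return {BOT_IOT_TO_UNSW[key]: value
            for key, value in record.items()
            if key in BOT_IOT_TO_UNSW}


# Hypothetical BoT-IoT flow record; "pkSeqID" stands in for any field
# that is not among the 15 shared features and is therefore dropped.
sample = {"pkSeqID": 1, "proto": "tcp", "saddr": "192.168.100.150",
          "dport": 80, "attack": 1, "category": "DoS"}
aligned = to_unsw_schema(sample)
print(aligned)
```

Dropping non-shared fields rather than raising an error is a design choice made here for brevity; a stricter pipeline might prefer to fail on unexpected columns.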
Classification layers parameters summary: CNN-TL model.
| Classification Head | Layer 1 | Layer 2 | Layer 3 | Output Layer |
|---|---|---|---|---|
| Number of neurons | 448 | 224 | 112 | 2 |
| Dropout probability | 0.4 | 0.3 | 0.3 | - |
| Activation | ReLU | ReLU | ReLU | Softmax |
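To get a feel for the size of this classification head, the layer widths from the table can be turned into a trainable-parameter count. This is only a sketch: the width of the feature vector entering Layer 1 is not given in this section, so `input_dim` is a free parameter and the value 896 used below is purely hypothetical. Each fully connected layer is assumed to have one bias per neuron.

```python
# Classification head from the table: (neurons, dropout probability, activation).
HEAD = [
    (448, 0.4, "relu"),
    (224, 0.3, "relu"),
    (112, 0.3, "relu"),
    (2, None, "softmax"),  # output layer: normal vs. attack
]


def dense_param_count(input_dim, head=HEAD):
    """Count weights and biases of the fully connected head.

    Dropout adds no trainable parameters, so only layer widths matter.
    """
    total = 0
    fan_in = input_dim
    for neurons, _dropout, _activation in head:
        total += fan_in * neurons + neurons  # weight matrix + bias vector
        fan_in = neurons
    return total


# 896 is a hypothetical flattened-feature width, not a value from the paper.
params = dense_param_count(896)
print(params)
```

Most of the parameters sit in the first layer, so the true count depends strongly on the actual input width.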
TL model training parameters summary.
| Model | Epochs | Batch Size | Optimizer | Learning Rate | Loss |
|---|---|---|---|---|---|
| CNN-B | 25 | 208 | Adam | | Categorical cross-entropy |
| CNN-TL | 15 | 4096 | Adam | | Categorical cross-entropy |
Attack detection summary UNSW-NB15 dataset: Zero-day attacks.
| Traffic | Detection Rate | Detected Samples | Non Detected Samples |
|---|---|---|---|
| Normal | 98.34% | 30,358 | 513 |
| Analysis | 100.00% | 622 | 0 |
| Backdoor | 100.00% | 357 | 0 |
| Fuzzers | 99.95% | 21,507 | 10 |
| Shellcode | 99.93% | 1510 | 1 |
| Worms | 98.85% | 172 | 2 |
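The detection rates in this table appear to follow directly from the sample counts, as detected / (detected + non-detected). The sketch below recomputes them under that assumption; for the Normal row this amounts to the share of benign samples correctly classified as benign. The function name is illustrative.

```python
# (detected, non-detected) sample counts copied from the table above.
ZERO_DAY = {
    "Normal": (30_358, 513),
    "Analysis": (622, 0),
    "Backdoor": (357, 0),
    "Fuzzers": (21_507, 10),
    "Shellcode": (1_510, 1),
    "Worms": (172, 2),
}


def detection_rate(detected, missed):
    """Percentage of samples of one traffic class that were classified correctly."""
    return round(100 * detected / (detected + missed), 2)


rates = {name: detection_rate(*counts) for name, counts in ZERO_DAY.items()}
for name, rate in rates.items():
    print(f"{name:<10} {rate:6.2f}%")
```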
Attack detection summary UNSW-NB15 dataset: Known and zero-day attacks.
| Traffic | Detection Rate | Detected Samples | Non Detected Samples |
|---|---|---|---|
| Normal | 98.53% | 315,902 | 46,081 |
| DoS | 99.43% | 3841 | 22 |
| Exploits | 99.75% | 28,249 | 68 |
| Generic | 99.98% | 213,678 | 40 |
| Reconnaissance | 99.94% | 11,848 | 6 |
| Analysis | 99.84% | 621 | 1 |
| Backdoor | 99.44% | 355 | 2 |
| Fuzzers | 99.79% | 21,472 | 45 |
| Shellcode | 99.93% | 1510 | 1 |
| Worms | 98.85% | 172 | 2 |
Attack detection rate for known and zero-day attacks.
| Traffic | CNN (Test) | TL (Test) | Improvement (Test) | CNN (Test+) | TL (Test+) | Improvement (Test+) |
|---|---|---|---|---|---|---|
| Normal | 99.65% | 98.54% | −1.11% | 98.52% | 98.34% | −0.18% |
| DoS | 96.73% | 99.43% | 2.70% | - | - | - |
| Exploits | 97.90% | 99.76% | 1.86% | - | - | - |
| Generic | 99.16% | 99.98% | 0.82% | - | - | - |
| Reconnaissance | 92.85% | 99.95% | 7.10% | - | - | - |
| Analysis | 86.14% | 99.84% | 13.70% | 66.72% | 100.00% | 33.28% |
| Backdoor | 83.62% | 99.44% | 15.82% | 89.64% | 100.00% | 16.38% |
| Fuzzers | 80.76% | 99.79% | 19.03% | 69.20% | 99.95% | 30.75% |
| Shellcode | 89.43% | 99.93% | 10.50% | 98.34% | 99.93% | 1.59% |
| Worms | 96.31% | 98.85% | 2.54% | 95.97% | 98.85% | 2.88% |
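The "Improvement" columns read as a difference in percentage points between the TL and CNN detection rates, not as a relative gain. A short sketch for the UNSW-NB15-Test columns, with the rates copied from the table and the variable names chosen for illustration:

```python
# (CNN rate, TL rate) in percent on UNSW-NB15-Test, copied from the table above.
TEST_RATES = {
    "Normal": (99.65, 98.54),
    "DoS": (96.73, 99.43),
    "Exploits": (97.90, 99.76),
    "Generic": (99.16, 99.98),
    "Reconnaissance": (92.85, 99.95),
    "Analysis": (86.14, 99.84),
    "Backdoor": (83.62, 99.44),
    "Fuzzers": (80.76, 99.79),
    "Shellcode": (89.43, 99.93),
    "Worms": (96.31, 98.85),
}

# Improvement in percentage points: TL rate minus CNN rate.
improvement = {name: round(tl - cnn, 2) for name, (cnn, tl) in TEST_RATES.items()}

# Traffic class that benefits most from transfer learning on this test set.
largest_gain = max(improvement, key=improvement.get)
print(improvement)
print(largest_gain)
```

On these figures the gain is largest for the zero-day families, while the Normal class loses about one percentage point.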