Gaoming Yang, Xu Yu, Lingwei Xu, Yu Xin, Xianjin Fang.
Abstract
Sensor network intrusion detection has attracted extensive attention. However, previous intrusion detection methods face a highly imbalanced attack-class distribution and may not achieve satisfactory performance. To solve this problem, we propose a new intrusion detection algorithm for sensor networks based on normalized cut spectral clustering. The main aim is to reduce the degree of imbalance among classes in an intrusion detection system. First, we design a normalized cut spectral clustering method to reduce the imbalance between every pair of classes in the intrusion detection data set. Second, we train a network intrusion detection classifier on the new data set. Finally, we conduct extensive experiments and analyze the results in detail. Simulation experiments show that our algorithm reduces the imbalance among classes while preserving the distribution of the original data, and effectively improves detection performance.
Year: 2019 PMID: 31584950 PMCID: PMC6777755 DOI: 10.1371/journal.pone.0221920
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
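The pipeline the abstract describes (normalized cut spectral clustering to rebalance the classes, then training a classifier on the rebalanced data) can be sketched in NumPy. This is a minimal illustration, not the paper's NIDNCSC implementation: the RBF affinity, the bandwidth `sigma`, the farthest-point k-means initialization, and the rule of keeping one representative per cluster are all assumptions made for the sketch.

```python
import numpy as np

def ncut_spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Cluster the rows of X into k groups via the symmetric normalized cut."""
    # RBF affinity matrix with zeroed diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(X)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # Spectral embedding: eigenvectors of the k smallest eigenvalues
    _, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    # k-means on the embedding with deterministic farthest-point initialization
    centers = U[[0]]
    for _ in range(1, k):
        dist = ((U[:, None] - centers[None]) ** 2).sum(-1).min(axis=1)
        centers = np.vstack([centers, U[np.argmax(dist)]])
    for _ in range(n_iter):
        labels = ((U[:, None] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def undersample_majority(X_maj, n_keep, sigma=1.0):
    """Shrink the majority class to n_keep points: one representative
    (the sample nearest the cluster mean) per spectral cluster."""
    labels = ncut_spectral_clustering(X_maj, n_keep, sigma)
    reps = []
    for j in range(n_keep):
        members = np.where(labels == j)[0]
        if len(members):
            c = X_maj[members].mean(axis=0)
            reps.append(members[((X_maj[members] - c) ** 2).sum(1).argmin()])
    return X_maj[np.array(reps)]
```

Because the representatives are chosen per cluster rather than at random, the undersampled majority class keeps the shape of the original distribution, which is the property the abstract emphasizes.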
Fig 1The minimum cut criteria illustration.
Fig 2The NCSC algorithm.
Fig 3The NIDNCSC algorithm.
Fig 4An illustration on the NIDNCSC algorithm.
Fig 5The training sample distribution of the first artificial data set.
Fig 6The training sample distribution of the second artificial data set.
Fig 7Generation example of synthetic examples.
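The generation of synthetic examples illustrated in Fig 7 follows the SMOTE idea used as a baseline in the experiments: each new minority sample is a random interpolation between an existing minority point and one of its k nearest minority neighbours. A minimal sketch, where the function name, `k`, and the seeding are assumptions:

```python
import numpy as np

def synthesize_minority(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]  # skip self at column 0
    out = np.empty((n_new, X_min.shape[1]))
    for t in range(n_new):
        i = rng.integers(len(X_min))
        j = nn[i, rng.integers(k)]
        lam = rng.random()  # position on the segment between the two points
        out[t] = X_min[i] + lam * (X_min[j] - X_min[i])
    return out
```

Every synthetic point is a convex combination of two real minority samples, so the generated data never leaves the coordinate-wise range of the original minority class.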
Fig 8The new data distribution acquired by the NIDNCSC algorithm for the first artificial data set.
Fig 9The new data distribution acquired by the NIDNCSC algorithm for the second artificial data set.
Classification performance on the first artificial data set.
| Classification algorithm | Precision(%) | Recall(%) | Running time (s) | C | |
|---|---|---|---|---|---|
| SMOTE | 93.6 | 90.6 | 12.2 | 128 | 100 |
| RankRC | 93.2 | 90.1 | 8.6 | 64 | 50 |
| NIDNCSC | 95.2 | 92.2 | 5.3 | 16 | 200 |
Classification performance on the second artificial data set.
| Classification algorithm | Precision(%) | Recall(%) | Running time (s) | C | |
|---|---|---|---|---|---|
| SMOTE | 90.1 | 88.2 | 20.3 | 256 | 1000 |
| RankRC | 89.9 | 88.0 | 12.8 | 32 | 200 |
| NIDNCSC | 93.3 | 90.6 | 8.2 | 32 | 500 |
Data distributions of different classes.
| Class | Sample number | Percentage(%) |
|---|---|---|
| Normal | 97278 | 19.69 |
| Dos | 391458 | 79.24 |
| Probing | 4107 | 0.83 |
| R2L | 1126 | 0.23 |
| U2R | 52 | 0.01 |
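The percentages above follow directly from the sample counts (Probing, for example, is 4107/494021 ≈ 0.83%), and the ratio between the largest and smallest class quantifies the imbalance degree the algorithm targets. A quick check, with `counts` transcribed from the table:

```python
counts = {"Normal": 97278, "Dos": 391458, "Probing": 4107, "R2L": 1126, "U2R": 52}
total = sum(counts.values())
# Per-class share of the whole data set, in percent
percent = {c: round(100.0 * n / total, 2) for c, n in counts.items()}
# Majority-to-minority imbalance ratio (Dos vs. U2R)
imbalance_ratio = max(counts.values()) / min(counts.values())
```

The Dos-to-U2R ratio comes out above 7,500:1, which is why a plain classifier trained on this distribution can all but ignore the rare attack classes.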
The distribution of data set I.
| Class | Training sample number | Testing sample number |
|---|---|---|
| Normal | 900 | 960 |
| Dos | 3700 | 3790 |
| Probing | 600 | 800 |
| R2L | 300 | 398 |
| U2R | 30 | 22 |
The distribution of data set II.
| Class | Training sample number | Testing sample number |
|---|---|---|
| Normal | 900 | 960 |
| Dos | 3700 | 3790 |
| Probing | 600 | 800 |
| R2L | 1200 | 398 |
| U2R | 330 | 22 |
Fig 10Comparison of the precision rate.
Fig 11Comparison of the false alarm rate.
The best experimental parameter values and running time comparison.
| Algorithms | Running time (s) | C | |
|---|---|---|---|
| NIDNCSC | 56.2 | 256 | 200 |
| FMSVM | 70.1 | 32 | 50 |
| CVSGGDI | 82.3 | 1024 | 100 |
| RankRC | 66.2 | 128 | 50 |