Nurfazrina M Zamry, Anazida Zainal, Murad A Rassam, Eman H Alkhammash, Fuad A Ghaleb, Faisal Saeed.
Abstract
Wireless sensor networks (WSNs) have received significant attention from research and development because they collect data in many fields, such as smart cities, power grids, transportation systems, medical sectors, the military, and rural areas. Accurate and reliable measurements for insightful data analysis and decision-making are the ultimate goals of sensor networks in critical domains. However, the raw data collected by WSNs are usually unreliable and inaccurate because of the imperfect nature of WSNs. Identifying misbehaviours or anomalies in the network is therefore important for reliable and secure operation. Because of resource constraints, designing a lightweight detection scheme is a major challenge in sensor networks. This paper aims to design and develop a lightweight anomaly detection scheme that improves efficiency by reducing computational complexity and communication overhead and improving memory utilization while maintaining high accuracy. To achieve this aim, one-class learning and dimension reduction concepts were used in the design. The One-Class Support Vector Machine (OCSVM) with hyper-ellipsoid variance was used for anomaly detection because of its advantage in classifying unlabelled and multivariate data. Various OCSVM formulations were investigated, and the Centred-Ellipsoid formulation was adopted because it was the most effective among the formulations studied. To decrease the computational complexity and improve memory utilization, the dimensions of the data were reduced using the Candid Covariance-Free Incremental Principal Component Analysis (CCIPCA) algorithm. Extensive experiments were conducted to evaluate the proposed lightweight anomaly detection scheme. Results in terms of detection accuracy, memory utilization, computational complexity, and communication overhead show that the proposed scheme is effective and efficient compared with the few existing schemes evaluated. The proposed anomaly detection scheme achieved an accuracy higher than 98%, with O(nd) memory utilization and no communication overhead.
Keywords: anomaly detection; one-class support vector machine; principal component analysis; sensor data analysis; wireless sensors networks
Year: 2021 PMID: 34884022 PMCID: PMC8659524 DOI: 10.3390/s21238017
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
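The abstract names two building blocks: CCIPCA for dimension reduction and a centred hyper-ellipsoid boundary for one-class detection. A minimal NumPy sketch of how the two could fit together follows. It is not the authors' CESVM-DR implementation: the boundary here is approximated by a Mahalanobis-distance threshold set at an empirical quantile of the training data, rather than by solving the one-class SVM optimisation problem, and the function names, `k`, `quantile`, `amnesic`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def ccipca(X, k, amnesic=0.0, eps=1e-12):
    """Estimate k leading principal directions one sample at a time.

    X is assumed to be mean-centred. The covariance matrix is never formed.
    Returns a (k, d) array whose rows are unit-length direction estimates.
    """
    X = np.asarray(X, dtype=float)
    V = np.zeros((k, X.shape[1]))  # unnormalised eigenvector estimates
    for n, x in enumerate(X, start=1):
        u = x.copy()
        for i in range(min(k, n)):
            if i == n - 1:
                V[i] = u  # initialise component i with the current residual
            else:
                unit = V[i] / max(np.linalg.norm(V[i]), eps)
                w_old = (n - 1 - amnesic) / n
                w_new = (1 + amnesic) / n
                V[i] = w_old * V[i] + w_new * (u @ unit) * u
            unit = V[i] / max(np.linalg.norm(V[i]), eps)
            u = u - (u @ unit) * unit  # deflate before estimating the next one
    return V / np.maximum(np.linalg.norm(V, axis=1, keepdims=True), eps)

def fit_detector(X_train, k=2, quantile=0.98):
    """Learn the projection and a centred ellipsoid from normal data only."""
    X_train = np.asarray(X_train, dtype=float)
    mean = X_train.mean(axis=0)
    W = ccipca(X_train - mean, k)
    Z = (X_train - mean) @ W.T  # reduced representation, shape (n, k)
    precision = np.linalg.inv(np.cov(Z, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", Z, precision, Z)  # squared Mahalanobis distance
    return mean, W, precision, np.quantile(d2, quantile)

def is_anomaly(X, model):
    """Flag samples that fall outside the learned ellipsoid."""
    mean, W, precision, radius2 = model
    Z = (np.asarray(X, dtype=float) - mean) @ W.T
    return np.einsum("ij,jk,ik->i", Z, precision, Z) > radius2

# Synthetic stand-in for sensor readings (assumption, not the paper's data).
rng = np.random.default_rng(0)
scale = np.array([5.0, 3.0, 0.5, 0.5])
normal = rng.normal(size=(1000, 4)) * scale
anomalies = rng.normal(size=(50, 4)) * scale + np.array([40.0, 25.0, 0.0, 0.0])

model = fit_detector(normal, k=2)
detection_rate = is_anomaly(anomalies, model).mean()
false_alarm_rate = is_anomaly(normal, model).mean()
print(detection_rate, false_alarm_rate)
```

In this simplified version, an anomaly that deviates only along the discarded directions would be projected close to the origin and missed, so the choice of `k` trades memory against sensitivity.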
Figure 1. Overview of the proposed CESVM-DR anomaly detection scheme.
Figure 2. The training and detection phases of the proposed CESVM-DR anomaly detection scheme.
Figure 3. Flowchart of the training phase for the CESVM-DR model.
Figure 4. Flowchart of the detection phase in the proposed CESVM-DR scheme.
Statistical characteristics of the normal data and the generated artificial anomalies for the GSB datasets.
| Dataset | Variable | Normal (Mean) | Normal (Std. Dev) | Anomalies (Mean) | Anomalies (Std. Dev) |
|---|---|---|---|---|---|
| D1 | Ambient temperature | 5.26 | 8.28 | 7.75 | 9.90 |
| D2 | Ambient temperature | 3.61 | 7.44 | 5.39 | 9.00 |
| D3 | Ambient temperature | 3.29 | 6.86 | 5.33 | 9.77 |
| D4 | Ambient temperature | 4.56 | 8.15 | 7.69 | 10.02 |
| D5 | Ambient temperature | 3.37 | 7.92 | 10.65 | 11.60 |
Figure 5. Histogram plots for the (a) IBRL, (b) LUCE, (c) PDG and (d) NAMOS datasets.
Figure 6. Average performance comparison: accuracy and DR (left), FPR and FNR (right).
Performance comparison between the proposed CESVM-DR scheme and related anomaly detection schemes using simulated labelling with a kernel width of 2.
| Measure | Scheme | D1 | D2 | D3 | D4 | D5 | Average |
|---|---|---|---|---|---|---|---|
| DR (%) | CESVM-DR | 96.4 | 92 | 92 | 100 | 100 | 96.08 |
| | CESVM | 74.8 | 85.2 | 82 | 78.8 | 80.8 | 80.32 |
| | EOOD (local) | 100 | 100 | 100 | 100 | 100 | 100 |
| | kPCA (local) | 100 | 100 | 100 | 100 | 100 | 100 |
| FPR (%) | CESVM-DR | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | |
| | CESVM | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.2 |
| | EOOD (local) | 32.7 | 32.7 | 40.9 | 34.5 | 30.5 | 34.26 |
| | kPCA (local) | 7.4 | 37.6 | 47.2 | 35.4 | 7.4 | 27 |
| FNR (%) | CESVM-DR | 3.6 | 8 | 8 | 0 | 0 | |
| | CESVM | 25.2 | 14.8 | 18 | 21.2 | 19.2 | 19.68 |
| | EOOD (local) | 0 | 0 | 0 | 0 | 0 | 0 |
| | kPCA (local) | 0 | 0 | 0 | 0 | 0 | 0 |
| Accuracy (%) | CESVM-DR | 98.6 | 98.2 | 98.2 | 98.9 | 98.9 | |
| | CESVM | 96.6 | 97.6 | 97.3 | 97 | 97.2 | 97.14 |
| | EOOD (local) | 70.3 | 70.3 | 62.8 | 68.6 | 72.3 | 68.86 |
| | kPCA (local) | 93.3 | 65.3 | 57.1 | 64 | 63.6 | 68.66 |
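The Average cell for three CESVM-DR rows (FPR, FNR, Accuracy) is blank in this record. Assuming the Average column is the arithmetic mean over D1 to D5, which the CESVM-DR DR row (96.08) is consistent with, the blank cells can be recomputed from the per-dataset values in the table. The short sketch below does only that arithmetic; the recomputed figures are not quoted from the paper.

```python
from statistics import fmean

# Per-dataset CESVM-DR values (D1..D5) copied from the table above.
cesvm_dr = {
    "DR": [96.4, 92, 92, 100, 100],
    "FPR": [1.2, 1.2, 1.2, 1.2, 1.2],
    "FNR": [3.6, 8, 8, 0, 0],
    "Accuracy": [98.6, 98.2, 98.2, 98.9, 98.9],
}

# Assumption: "Average" means the arithmetic mean across the five datasets.
averages = {measure: round(fmean(values), 2) for measure, values in cesvm_dr.items()}
print(averages)
# {'DR': 96.08, 'FPR': 1.2, 'FNR': 3.92, 'Accuracy': 98.56}
```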
Figure 7. Average performance comparison: accuracy and DR (left), FPR and FNR (right).
Effectiveness comparison of the proposed scheme with other related anomaly detection schemes using histogram-based labelling.
| Dataset | Model | DR (%) | ACC (%) | FPR (%) | FNR (%) |
|---|---|---|---|---|---|
| IBRL | DWT + OCSVM | 100 | 98.3 | 1.9 | 0 |
| | DWT + SOM | 100 | 99 | 1.09 | 0 |
| | PCCAD | 100 | 99.7 | 0.3 | 0 |
| | CESVM-DR | 100 | 98.4 | 1.6 | 0 |
| LUCE | DWT + OCSVM | 100 | 98.3 | 1.9 | 0 |
| | DWT + SOM | 100 | 99 | 1.09 | 0 |
| | PCCAD | 100 | 99.9 | 0.09 | 0 |
| | CESVM-DR | 100 | 98 | 2 | 0 |
| PDG | DWT + OCSVM | 99.7 | 97.6 | 2.6 | 0.3 |
| | DWT + SOM | 83 | 97.8 | 0.5 | 16.5 |
| | PCCAD | 97.9 | 96.7 | 3.5 | 2.1 |
| | CESVM-DR | 99.1 | 78.6 | 25.8 | 0.01 |
| NAMOS | DWT + OCSVM | 100 | 88.6 | 12.8 | 0 |
| | DWT + SOM | 100 | 99.4 | 0.5 | 0 |
| | PCCAD | 100 | 90.2 | 11.5 | 0 |
| | CESVM-DR | 100 | 100 | 0 | 0 |
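The four measures reported in the tables (DR, ACC, FPR, FNR) are not defined in this record. Assuming the usual confusion-matrix definitions, with anomalies as the positive class, they could be computed as in the sketch below. The function name and the counts are hypothetical and are not taken from the paper's experiments.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return DR, FPR, FNR and ACC as percentages.

    tp: anomalies flagged as anomalies    fn: anomalies missed
    fp: normal samples flagged            tn: normal samples accepted
    """
    return {
        "DR": 100.0 * tp / (tp + fn),    # detection rate (recall on anomalies)
        "FPR": 100.0 * fp / (fp + tn),   # false alarms among normal samples
        "FNR": 100.0 * fn / (tp + fn),   # missed anomalies, equals 100 - DR
        "ACC": 100.0 * (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts: 50 anomalies and 950 normal samples.
metrics = detection_metrics(tp=45, fp=10, tn=940, fn=5)
print(metrics)
```

Under these definitions DR and FNR always sum to 100, which is a quick consistency check for any row in the tables.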
Efficiency comparison between CESVM-DR and other schemes.
| Scheme | Memory Utilization | Computational Complexity | Communication Overhead |
|---|---|---|---|
| CESVM | | | |
| EOOD | | | - |
| PCCAD | | | - |
| kPCA | | | |
| DWT + SOM | | | |
| DWT + OCSVM | | | |
| CESVM-DR | | | - |
Description of the efficiency parameters.
| Legends | Descriptions |
|---|---|
| | Low-rank approximation of the kernel Gram matrix |
| | Number of the data observations |
| | The dimension of the data vector |
| | The reduced dimension of the data vector |
| | Linear optimization problem calculation |
| | The calculation of CCIPCA |
| | Applying anomaly detection for DWT |
| | Applying anomaly detection online for OCSVM |
| | Applying anomaly detection for SOM |
| | Communication of wavelet coefficient to the central node |