Eman Ashraf1,2, Nihal F F Areed2,3, Hanaa Salem1, Ehab H Abdelhay2, Ahmed Farouk4.
Abstract
Internet of things (IoT)-based healthcare applications have grown considerably in recent years, yet they still lack adequate intrusion detection systems (IDS). Recent technologies such as machine learning (ML), edge computing, and blockchain can provide strong security solutions that preserve the privacy of medical data. This paper proposes FIDChain, an IDS that trains lightweight artificial neural networks (ANN) with federated learning (FL) to preserve the privacy of healthcare data. Blockchain technology provides a distributed ledger that aggregates the local weights and, after averaging, broadcasts the updated global weights; this prevents poisoning attacks and gives full transparency and immutability across the distributed system with negligible overhead. Applying the detection model at the edge protects the cloud when an attack occurs, because the data is blocked at the gateway; detection also takes less time and needs less computing and processing capacity, since FL deals with smaller sets of data. The ANN and eXtreme Gradient Boosting (XGBoost) models were evaluated on the BoT-IoT dataset. The results show that the ANN models achieve higher accuracy and cope better with the heterogeneous data of IoT devices, such as those in an intensive care unit (ICU) in healthcare systems. Testing FIDChain with different datasets (CSE-CIC-IDS2018, Bot Net IoT, and KDD Cup 99) shows that the BoT-IoT dataset gives the most stable and accurate results for testing IoT applications, such as those used in healthcare systems.
Keywords: IoT; blockchain; federated learning; healthcare security; intrusion detection; machine learning
Year: 2022 PMID: 35742161 PMCID: PMC9222634 DOI: 10.3390/healthcare10061110
Source DB: PubMed Journal: Healthcare (Basel) ISSN: 2227-9032
Figure 1. Layered architecture of the proposed system FIDChain [9,38].
Figure 2. Flow execution of the proposed FIDChain system.
Figure 3. Diagram of the FIDChain aggregation of weights into the blockchain network.
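
The aggregation step described in the abstract (local weights collected on a ledger, averaged, and broadcast as global weights) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample-weighted average, the function names `federated_average`, `append_block`, and `chain_is_valid`, the gateway data, and the toy hash-linked list standing in for a blockchain are all assumptions made for the example.

```python
import hashlib
import json


def federated_average(client_weights, client_sizes):
    """Sample-weighted average of per-client weight vectors (FedAvg-style)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def _digest(prev_hash, payload):
    """SHA-256 over the previous hash and the payload, serialized deterministically."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_block(chain, payload):
    """Append a hash-linked record; a toy stand-in for a real blockchain."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev": prev_hash, "payload": payload,
             "hash": _digest(prev_hash, payload)}
    chain.append(block)
    return block


def chain_is_valid(chain):
    """Recompute every hash; any edited payload or broken link is detected."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != _digest(prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical round with three edge gateways and two model parameters each.
local_updates = [[0.2, -0.4], [0.4, 0.0], [0.6, 0.4]]
sample_counts = [100, 100, 200]

ledger = []
for gateway_id, weights in enumerate(local_updates):
    append_block(ledger, {"gateway": gateway_id, "weights": weights})

global_weights = federated_average(local_updates, sample_counts)
append_block(ledger, {"round": 1, "global_weights": global_weights})
```

With equal sample counts the weighted average reduces to a plain mean. The hash chain only makes tampering with a recorded update detectable; it does not by itself implement consensus or any other property of a production blockchain.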
Figure 4. Artificial neural network architecture (binary classification).
The hyper-parameters used in the proposed detection model.
| Hyper-Parameters | Value |
|---|---|
| Learning rate | 0.001:0.1 (+0.01) |
| Number of epochs | 2:10 (+1) |
| Batch size | 100:1000 (+100) |
| Classification type | Binary |
| Activation function | Sigmoid |
| Optimization algorithm | Stochastic gradient descent (SGD) |
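
To make the table concrete, the following sketch trains a small binary classifier with sigmoid activations and mini-batch stochastic gradient descent, using values inside the listed ranges (learning rate 0.1, 10 epochs, batch size 100). The network size (one hidden layer of 4 units), the cross-entropy loss, the synthetic two-cluster data, and all function names are assumptions for illustration; the source does not specify them.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so math.exp stays in range
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, n_hidden, rng):
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return hidden, output


def mean_loss(net, xs, ys):
    """Mean binary cross-entropy over a dataset."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        _, p = forward(net, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def train(net, xs, ys, learning_rate, epochs, batch_size, rng):
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            scale = learning_rate / len(batch)
            # Accumulate gradients over the mini-batch before updating.
            g_w1 = [[0.0] * len(row) for row in net["w1"]]
            g_b1 = [0.0] * len(net["b1"])
            g_w2 = [0.0] * len(net["w2"])
            g_b2 = 0.0
            for i in batch:
                hidden, p = forward(net, xs[i])
                d_out = p - ys[i]  # gradient of cross-entropy w.r.t. output logit
                g_b2 += d_out
                for j, h in enumerate(hidden):
                    g_w2[j] += d_out * h
                    d_hidden = d_out * net["w2"][j] * h * (1.0 - h)
                    g_b1[j] += d_hidden
                    for k, xk in enumerate(xs[i]):
                        g_w1[j][k] += d_hidden * xk
            net["b2"] -= scale * g_b2
            for j in range(len(net["w2"])):
                net["w2"][j] -= scale * g_w2[j]
                net["b1"][j] -= scale * g_b1[j]
                for k in range(len(net["w1"][j])):
                    net["w1"][j][k] -= scale * g_w1[j][k]
    return net


# Synthetic stand-in for traffic records: two Gaussian clusters, one per class.
rng = random.Random(0)
xs, ys = [], []
for _ in range(1000):
    label = rng.randint(0, 1)
    center = 1.0 if label else -1.0
    xs.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
    ys.append(label)

net = init_network(n_inputs=2, n_hidden=4, rng=rng)
loss_before = mean_loss(net, xs, ys)
train(net, xs, ys, learning_rate=0.1, epochs=10, batch_size=100, rng=rng)
loss_after = mean_loss(net, xs, ys)
```

Comparing `loss_before` with `loss_after` shows whether the chosen hyper-parameters make progress on this toy problem; sweeping the ranges in the table would repeat the same call with other values.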
The description of BoT-IoT best features.
| Feature | Description |
|---|---|
| State_Number | Numerical representation of feature state |
| Seq | Argus sequence number |
| N_IN_Conn_P_SrcIP | Number of inbound connections per source IP |
| N_IN_Conn_P_DstIP | Number of inbound connections per destination IP |
| Srate | Source-to-destination packets per second |
| Drate | Destination-to-source packets per second |
| Min | Minimum duration of aggregated records |
| Max | Maximum duration of aggregated records |
| Mean | Average duration of aggregated records |
| Stddev | Standard deviation of aggregated records |
Figure 5. Feature ranking based on information gain.
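
Ranking features by information gain can be sketched as below. The example assumes discrete (already binned) feature values and uses made-up toy columns; how the authors discretized continuous features such as `Srate` is not stated in the source, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (feature name, information gain) pairs, best first."""
    scores = {name: information_gain(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: "informative" determines the label, "noise" is independent of it.
labels = [1, 1, 0, 0]
columns = {
    "noise": ["x", "y", "x", "y"],
    "informative": ["a", "a", "b", "b"],
}
ranking = rank_features(columns, labels)
```

Keeping only the top-ranked entries of `ranking` corresponds to selecting a "best features" subset.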
Figure 6. Confusion matrix.
Effectiveness main metrics.
| Metric | Equation | Definition |
|---|---|---|
| Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ | Ratio of correctly predicted instances to the total number of predicted instances. |
| Precision | $\frac{TP}{TP + FP}$ | Ratio of correctly predicted positive instances to the total number of positive predictions. |
| Recall | $\frac{TP}{TP + FN}$ | Ratio of correctly predicted positive instances to all instances in the actual positive class. |
| Specificity | $\frac{TN}{TN + FP}$ | Ratio of correctly predicted negative instances to all instances in the actual negative class. |
| F1-score | $\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ | Hybrid metric that indicates overall model performance with respect to both precision and recall; useful for unbalanced classes. |
| False alarm rate | $\frac{FP}{FP + TN}$ | Ratio of false positive alarms to all instances in the actual negative class (that is, 1 − specificity). |
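
The metrics in this table can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions (true/false positives and negatives as `tp`, `fp`, `tn`, `fn`); the counts in the example call are hypothetical and are not taken from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute the effectiveness metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
        "false_alarm_rate": fp / (fp + tn),
    }


# Hypothetical counts: 100 attack records and 9 benign records.
metrics = detection_metrics(tp=90, tn=8, fp=1, fn=10)
```

With so few negative records, a single false positive already moves specificity to 8/9 and the false alarm rate to 1/9, which illustrates why these two metrics are sensitive on datasets dominated by attack traffic.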
The performance analysis of FIDChain using ANN compared to XGBoost with BoT-IoT dataset.
| Metric | ANN (Full Features) | ANN (Best 10 Features) | XGBoost (Full Features) | XGBoost (Best 10 Features) |
|---|---|---|---|---|
| Accuracy | 99.99% | 99.99% | 98.40% | 98.96% |
| Precision (Detection Rate) | 100% | 100% | 99.36% | 99.38% |
| Recall (Sensitivity) | 99.99% | 99.99% | 99.59% | 99.57% |
| F-score | 99.99% | 99.99% | 99.47% | 99.47% |
| Specificity | 88.89% | 100% | 56.98% | 57.12% |
| False Alarm Rate | 11.11% | 0% | 43.02% | 42.88% |
Figure 7. The average of training and testing losses of edge gateways (clients) of the proposed algorithm using: (a) BoT-IoT (full features); (b) BoT-IoT (best features).
Figure 8. Losses in testing for each client of the proposed algorithm using: (a) BoT-IoT (full features); (b) BoT-IoT (best features).
Figure 9. Average global epoch time with and without blockchain.
Comparison with related work tested on BoT-IoT dataset.
| Ref. | Model | Classification Type | Accuracy | Precision (Detection Rate) | Recall | F1-Score | Mode | Integration with Blockchain |
|---|---|---|---|---|---|---|---|---|
| [ | CNN-TSODE | Binary | 99.99% | 99.99% | 99.99% | 99.99% | Centralized | No |
| | | Multi | 99.04% | 99.04% | 99.04% | 99.04% | | |
| [ | DNN | Multi | 98.37% | - | - | - | Centralized | No |
| | RNN | | | | | | | |
| | CNN | | | | | | | |
| [ | RNN | Multi | 98.20% | - | - | - | Centralized | No |
| [ | DeepDCA | Binary | 98.73% | 99.17% | 98.36% | 98.77% | Centralized | No |
| [ | Naive Bayes | Binary | 51.5% | - | - | - | Centralized | No |
| | KNN | | 92.1% | - | - | - | | |
| | ANN | | 82.8% | - | - | - | | |
| [ | RF | Multi | 99.99% | 99.99% | 99.99% | 99.99% | Centralized | Yes |
| | XGBoost | | 99.99% | 87.77% | 94.36% | 87.90% | | |
| [ | NB | Binary | 52.18% | 79.67% | 99.70% | 69.50% | Centralized | No |
| | KNN | | 99.48% | 99.65% | 99.68% | 99.58% | | |
| | RF | | 99.51% | 99.70% | 99.79% | 99.65% | | |
| | Log R | | 99.50% | 95.28% | 90.39% | 94.70% | | |
| | DT | | 99.47% | 99.69% | 99.79% | 99.63% | | |
| [ | Decision Tree | Multi | 99.99% | 97.10% | 94.27% | 98.95% | Centralized | No |
| | Naive Bayes | | 97.49% | 56.28% | 57.95% | 98.44% | | |
| | Random Forest | | 99.98% | 95.05% | 91.37% | 99.99% | | |
| | SVM | | 97.80% | 57.89% | 43.24% | 98.48% | | |
| [ | ANN | Multi | 99.9% | - | - | - | Centralized | No |
| | | | 92.5% | - | - | - | Federated | |
| Our work | ANN | Binary | 99.99% | 100% | 99.99% | 99.99% | Federated | Yes |
Description of used datasets.
| Dataset | Description |
|---|---|
| CSE-CIC-IDS2018 [ | Network traffic-based dataset proposed by the Communications Security Establishment (CSE) & the Canadian Institute for Cybersecurity (CIC) including 7 botnet types with 80 network flow features. |
| Bot Net IoT [ | Internet-connected devices-based dataset proposed by Beigi et al. which is divided into training (with 7 botnet types) and test datasets (with 16 botnet types) with four groups of features (byte-based, packet-based, time, and behavior-based). |
| KDD Cup 99 [ | Network traffic-based dataset consisting of approximately 4,900,000 vectors. The attacks are divided into four categories: user-to-root (U2R), remote-to-local (R2L), probing, and denial-of-service (DoS). Each vector contains 41 features, grouped into three classes (basic features, traffic features, and content features). |
Results of testing FIDChain with different datasets.
| Dataset | Precision (Detection Rate) | Recall (Sensitivity) | F-Score | Specificity | Accuracy | False Alarm Rate |
|---|---|---|---|---|---|---|
| CSE-CIC-IDS2018 | 0.4461 | 0.8581 | 0.5870 | 0.8589 | 0.8588 | 0.1411 |
| Bot Net IoT | 1.0000 | 0.9742 | 0.9869 | 0.9996 | 0.9756 | 0.0004 |
| BoT-IoT (10 Features) | 1.0000 | 0.9999 | 0.9999 | 1.0000 | 0.9999 | 0.0000 |
| BoT-IoT (All Features) | 1.0000 | 0.9999 | 0.9999 | 0.8889 | 0.9999 | 0.1111 |
| KDD Cup 99 | 0.9709 | 0.9491 | 0.9599 | 0.9928 | 0.9840 | 0.0072 |
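
As a worked check on this table, the F-score can be recomputed from the reported precision and recall, and the false alarm rate from the reported specificity. The sketch below copies the table values and assumes the standard relations F = 2PR/(P + R) and false alarm rate = 1 − specificity; the tolerance of 1e-4 is an assumption that allows for the four-decimal rounding of the published figures.

```python
# Values copied from the results table (four-decimal rounding as published).
reported = {
    "CSE-CIC-IDS2018": dict(precision=0.4461, recall=0.8581, f_score=0.5870,
                            specificity=0.8589, far=0.1411),
    "Bot Net IoT": dict(precision=1.0000, recall=0.9742, f_score=0.9869,
                        specificity=0.9996, far=0.0004),
    "BoT-IoT (10 Features)": dict(precision=1.0000, recall=0.9999, f_score=0.9999,
                                  specificity=1.0000, far=0.0000),
    "BoT-IoT (All Features)": dict(precision=1.0000, recall=0.9999, f_score=0.9999,
                                   specificity=0.8889, far=0.1111),
    "KDD Cup 99": dict(precision=0.9709, recall=0.9491, f_score=0.9599,
                       specificity=0.9928, far=0.0072),
}


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def row_is_consistent(row, tol=1e-4):
    """True if the reported F-score and false alarm rate match the recomputed ones."""
    f_ok = abs(f_score(row["precision"], row["recall"]) - row["f_score"]) <= tol
    far_ok = abs((1.0 - row["specificity"]) - row["far"]) <= tol
    return f_ok and far_ok


consistency = {name: row_is_consistent(row) for name, row in reported.items()}
```

For example, the CSE-CIC-IDS2018 row gives 2 × 0.4461 × 0.8581 / (0.4461 + 0.8581) ≈ 0.5870, which agrees with the reported F-score; the low precision on that dataset is what pulls the F-score well below the recall.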
Figure 10. The average of training and testing losses of edge gateways (clients) of the proposed algorithm using: (a) CSE-CIC-IDS2018; (b) Bot Net IoT; and (c) KDD Cup 99.
Figure 11. Losses in testing for each client of the proposed algorithm using: (a) CSE-CIC-IDS2018; (b) Bot Net IoT; and (c) KDD Cup 99.