Katarzyna Filus, Adam Domański, Joanna Domańska, Dariusz Marek, Jakub Szyguła.
Abstract
The paper examines the ability of neural networks to classify Internet traffic data in terms of self-similarity, expressed by the Hurst exponent. Fractional Gaussian noise is used to generate synthetic data that model genuine traffic. We show that the trained model is able to classify both synthetic data obtained from the Pareto distribution and real traffic data. We present training results for different cost-function optimizers and for different numbers of convolutional layers in the neural network.
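The classification target throughout the paper is the Hurst exponent H of a traffic series. For context, a classical (non-neural) baseline for estimating H is rescaled-range (R/S) analysis; the sketch below is a minimal numpy implementation of that baseline, not the paper's method, and the function name and chunk sizes are illustrative.

```python
import numpy as np

def hurst_rs(series, min_chunk=8):
    """Estimate the Hurst exponent via rescaled-range (R/S) analysis.

    The series is split into chunks of doubling size n; for each size the
    average rescaled range R/S is computed, and H is the slope of
    log(R/S) against log(n).
    """
    series = np.asarray(series, dtype=float)
    N = len(series)
    sizes, rs_vals = [], []
    n = min_chunk
    while n <= N // 2:
        chunks = series[: (N // n) * n].reshape(-1, n)
        rs_chunk = []
        for c in chunks:
            dev = np.cumsum(c - c.mean())   # cumulative deviation from the mean
            r = dev.max() - dev.min()       # range of the deviations
            s = c.std()                     # standard deviation of the chunk
            if s > 0:
                rs_chunk.append(r / s)
        if rs_chunk:
            sizes.append(n)
            rs_vals.append(np.mean(rs_chunk))
        n *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

rng = np.random.default_rng(0)
h_white = hurst_rs(rng.standard_normal(4096))   # white noise: H ≈ 0.5
```

For an uncorrelated series the estimate sits near 0.5, while a strongly persistent (e.g. integrated) series pushes it toward 1, which is the distinction the paper's networks learn directly from the raw data.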
Keywords: Hurst exponent; Internet traffic; convolutional neural networks; fractional Gaussian noise; neural networks; self-similarity
Year: 2020 PMID: 33286928 PMCID: PMC7597326 DOI: 10.3390/e22101159
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. The scenario of using a machine learning model (a convolutional neural network) in a router to adaptively change the active queue management mechanisms when the self-similarity level fluctuates.
Figure 2. The conceptual structure of a convolutional neural network used for the purpose of time-series analysis.
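As a rough illustration of the building blocks such a network stacks, a single convolution + ReLU + global-pooling stage over a 1-D series can be sketched in numpy. This is not the paper's actual architecture; the filter count and kernel width below are made up.

```python
import numpy as np

def conv1d_valid(x, kernels):
    """'Valid' 1-D convolution of a single-channel series x with a bank of
    kernels of shape (num_filters, kernel_len) -> (num_filters, L - k + 1)."""
    k = kernels.shape[1]
    # all sliding windows over the time axis, shape (L - k + 1, k)
    windows = np.lib.stride_tricks.sliding_window_view(x, k)
    return kernels @ windows.T

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(1)
x = rng.standard_normal(200)             # one traffic time series
kernels = rng.standard_normal((8, 5))    # 8 filters of width 5 (illustrative)
features = relu(conv1d_valid(x, kernels))
pooled = features.max(axis=1)            # global max pooling -> one value per filter
```

Stacking several such stages and ending with a dense softmax layer yields the kind of classifier the figure describes.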
The values of the parameters used for the optimizers in the experiment. The following symbols are used: η is the learning rate, γ is the momentum coefficient, β1 is the exponential decay rate for the first moment estimates, β2 is the exponential decay rate for the second moment estimates and μ is the history and future gradient discounting factor.
| Optimizer | Parameters |
|---|---|
| Adam | |
| RMSprop | |
| SGD + Momentum + NAG | |
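For reference, the update rules behind two of the optimizers compared above (Adam, and SGD with Nesterov momentum) can be sketched as follows. The hyperparameter values here are illustrative, chosen so the toy problem converges quickly; they are not the values used in the paper (Adam's canonical default learning rate, for instance, is 0.001).

```python
import numpy as np

def adam_step(theta, grad, state, eta=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; state carries the moment estimates (m, v) and step t."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad         # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    return theta - eta * m_hat / (np.sqrt(v_hat) + eps), (m, v, t)

def nag_step(theta, grad_fn, velocity, eta=0.01, gamma=0.9):
    """One SGD + Nesterov momentum step: gradient taken at the look-ahead point."""
    grad = grad_fn(theta - gamma * velocity)
    velocity = gamma * velocity + eta * grad
    return theta - velocity, velocity

# Minimize f(x) = x^2 with both rules.
grad_fn = lambda x: 2.0 * x
x_adam, adam_state = 5.0, (0.0, 0.0, 0)
x_nag, vel = 5.0, 0.0
for _ in range(2000):
    x_adam, adam_state = adam_step(x_adam, grad_fn(x_adam), adam_state)
    x_nag, vel = nag_step(x_nag, grad_fn, vel)
```

Both iterates end up near the minimum at 0; the difference the paper measures is how such rules behave on a real, noisy classification loss rather than on a toy quadratic.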
Figure 3. The training and validation loss during training.
The averaged and highest accuracy measurements (in percent) for different optimizers, for models with two, three and four convolutional layers and input dimension 50 × 200 (treating time as a spatial dimension), on a test dataset generated from the FGN source.
| Optimizer | Adam | RMSprop | SGD + Momentum + NAG |
|---|---|---|---|
| *Averaged* | | | |
| 2 layers | 89.23 | 89.52 | 88.11 |
| 3 layers | 90.41 | 90.51 | 90.70 |
| 4 layers | 89.79 | 91.38 | 90.63 |
| *Highest* | | | |
| 2 layers | 90.41 | 91.61 | 89.02 |
| 3 layers | 91.17 | 91.30 | 91.04 |
| 4 layers | 91.54 | 92.11 | 91.54 |
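The two input layouts compared in these tables (50 × 200 vs. 1000 × 10) amount to reshaping the same one-dimensional sample sequence into a 2-D array so that the CNN can treat time as a spatial dimension. A minimal sketch follows; the 10,000-sample length is implied by the two shapes, and the trailing channel axis is an assumption about the framework's input convention.

```python
import numpy as np

series = np.arange(10_000, dtype=float)   # one trace (placeholder values)

# Two layouts of the same 10,000 samples as a 2-D "image":
x_wide = series.reshape(50, 200)    # 50 rows of 200 consecutive samples
x_tall = series.reshape(1000, 10)   # 1000 rows of 10 consecutive samples

# Add a trailing channel axis, as most CNN frameworks expect (H, W, C).
x_wide = x_wide[..., np.newaxis]
x_tall = x_tall[..., np.newaxis]
```

Row-major reshaping keeps consecutive samples adjacent within each row, so a convolution kernel sliding along a row still sees local temporal structure.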
The averaged and highest accuracy measurements (in percent) for different optimizers, for models with two, three and four convolutional layers and input dimension 1000 × 10 (treating time as a spatial dimension), on a test dataset generated from the FGN source.
| Optimizer | Adam | RMSprop | SGD + Momentum + NAG |
|---|---|---|---|
| *Averaged* | | | |
| 2 layers | 94.55 | 94.30 | 94.60 |
| 3 layers | 95.34 | 95.73 | 94.93 |
| 4 layers | 95.67 | 96.29 | 95.46 |
| *Highest* | | | |
| 2 layers | 95.11 | 95.02 | 95.11 |
| 3 layers | 95.70 | 96.11 | 95.57 |
| 4 layers | 96.69 | 96.94 | 95.91 |
The accuracy measurements for test data from different sources, for models with two, three and four convolutional layers and input dimension 50 × 200 (treating time as a spatial dimension), on test datasets generated from the Pareto source and from real traffic.
| Optimizer | Adam | RMSprop | SGD + Momentum + NAG |
|---|---|---|---|
| | | | |
| 2 layers | 87.69 | 87.92 | 87.10 |
| 3 layers | 86.73 | 86.78 | 86.66 |
| 4 layers | 86.53 | 86.12 | 86.32 |
| | | | |
| 2 layers | 83.04 | 83.39 | 83.74 |
| 3 layers | 83.63 | 83.86 | 84.09 |
| 4 layers | 84.33 | 84.33 | 83.86 |
| | | | |
| 2 layers | 79.49 | 80.37 | 79.44 |
| 3 layers | 79.80 | 79.54 | 79.23 |
| 4 layers | 79.54 | 79.60 | 79.49 |
The accuracy measurements for test data from different sources, for models with two, three and four convolutional layers and input dimension 1000 × 10 (treating time as a spatial dimension), on test datasets generated from the Pareto source and from real traffic.
| Optimizer | Adam | RMSprop | SGD + Momentum + NAG |
|---|---|---|---|
| | | | |
| 2 layers | 88.54 | 88.12 | 88.90 |
| 3 layers | 88.36 | 88.58 | 89.65 |
| 4 layers | 88.94 | 88.35 | 88.94 |
| | | | |
| 2 layers | 86.20 | 86.20 | 86.90 |
| 3 layers | 86.20 | 85.50 | 85.96 |
| 4 layers | 85.26 | 85.61 | 86.43 |
| | | | |
| 2 layers | 82.61 | 83.07 | 83.07 |
| 3 layers | 83.28 | 82.14 | 81.36 |
| 4 layers | 80.58 | 81.72 | 82.19 |
Figure 4. Confusion matrices of the most accurate two-layer model on the testing data generated from the FGN source and from the source based on the Pareto distribution, plotted with the seaborn library.
Figure 5. Receiver operating characteristic (ROC) curves showing the micro- and macro-averaged results of the most accurate two-layer model on the testing data generated from the FGN source and the source based on the Pareto distribution.
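Micro- and macro-averaged ROC curves both reduce to the binary ROC computation, applied either to pooled one-vs-rest decisions or per class and then averaged. A minimal numpy sketch of that underlying binary ROC/AUC computation follows; the scores and labels are illustrative, not the paper's data.

```python
import numpy as np

def roc_points(scores, labels):
    """Cumulative TPR/FPR, thresholding at each score in descending order."""
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(labels)[order]
    tpr = np.cumsum(y) / y.sum()            # true-positive rate per threshold
    fpr = np.cumsum(1 - y) / (1 - y).sum()  # false-positive rate per threshold
    return np.concatenate([[0.0], fpr]), np.concatenate([[0.0], tpr])

def auc(fpr, tpr):
    """Area under the (fpr, tpr) polyline via the trapezoid rule."""
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2))

# toy one-vs-rest scores for six test samples
scores = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
labels = np.array([1, 1, 0, 1, 0, 0])
fpr, tpr = roc_points(scores, labels)
area = auc(fpr, tpr)   # 8/9: 8 of the 9 positive-negative pairs are ranked correctly
```

The AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, which is why it summarizes the curves shown in the figure in a single number.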