Nagaiah Mohanan Balamurugan, Malaiyalathan Adimoolam, Mohammed H Alsharif, Peerapong Uthansakul.
Abstract
Network data traffic is growing as networks expand to carry text, image, audio, and video for a widening range of applications. Identifying network traffic patterns and analyzing traffic content are essential across different needs and scenarios. Many approaches have been followed, both before and after the introduction of machine learning and deep learning algorithms as intelligent computation. Network traffic analysis is the process of capturing the traffic of a network and examining it closely to predict how that traffic will behave. To enhance the quality of service (QoS) of a network, it is important to estimate the network traffic and to evaluate the accuracy and precision of the estimate, as well as its false positive and false negative rates, with suitable algorithms. This work proposes a new method that uses an enhanced deep reinforcement learning (EDRL) algorithm to improve network traffic analysis and prediction. It aims to contribute to intelligence-based network traffic prediction and to help solve network management issues. An experiment was carried out to measure the accuracy, precision, false positive rate, and false negative rate of EDRL. Convolutional neural network (CNN) and deep learning algorithms were also used to predict the different types of network traffic, which are labeled as text-based, video-based, unencrypted, and encrypted data traffic. The EDRL algorithm outperformed the CNN algorithm, with a mean accuracy of 97.20%, a mean precision of 97.343%, a mean false positive rate of 2.657%, and a mean false negative rate of 2.527%.
Keywords: deep learning; internet traffic; machine learning; network traffic; reinforcement learning; traffic prediction
Year: 2022 PMID: 35808501 PMCID: PMC9269698 DOI: 10.3390/s22135006
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
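The abstract reports accuracy, precision, and false positive and false negative rates without giving formulas. The sketch below uses the standard confusion-matrix definitions, which is an assumption about how the reported percentages were obtained. The `ConfusionCounts` type, the `rates` function, and the example counts are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """One-vs-rest confusion-matrix counts for a single traffic class."""
    tp: int  # flows of this class predicted as this class
    fp: int  # flows of other classes predicted as this class
    tn: int  # flows of other classes predicted as other classes
    fn: int  # flows of this class predicted as another class


def rates(c: ConfusionCounts) -> dict:
    """Return the four measures as percentages (standard definitions, assumed)."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": 100.0 * (c.tp + c.tn) / total,
        "precision": 100.0 * c.tp / (c.tp + c.fp),
        "false_positive_rate": 100.0 * c.fp / (c.fp + c.tn),
        "false_negative_rate": 100.0 * c.fn / (c.fn + c.tp),
    }


# Made-up counts, used only to exercise the formulas.
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = rates(example)
```

If the study instead defines the false positive measure as the complement of precision, only the corresponding entry in `rates` would need to change.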
Various classes and methods of machine and deep learning techniques.
| Class | Method | Learning Technique |
|---|---|---|
| Discriminative | LSTM | Supervised |
| Discriminative | CNN | Supervised |
| Discriminative | RNN | Supervised |
| Discriminative | MLP | Supervised |
Data input, policy relationship for types of ML and DL algorithms.
| Method | Data Input: Known Answer | Data Input: Policy/Problems |
|---|---|---|
| Supervised learning | Learned output with supervision | Learning reward-based output with supervision |
| Reinforcement learning method | Maximize reward-based output | Feedback trained maximized reward-based output |
| Deep learning method | Deep learning-based output | Deep learning and feedback trained-based output |
| Deep reinforcement learning method | Deep-based maximize reward-based output | Deep and feedback trained-based maximize reward-based output |
| Enhanced deep reinforcement learning method | Accurate deep-based maximize reward-based output | Accurate output, based on deep and feedback trained maximize reward |
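The table characterizes reinforcement learning as feedback-trained maximization of reward. As a generic illustration of that idea, and not the authors' EDRL algorithm, here is a minimal tabular Q-learning update with epsilon-greedy action selection. The function names, the learning rate `alpha`, the discount `gamma`, and the two-state table are all assumptions made for the sketch.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Move Q(state, action) toward reward + gamma * max Q(next_state, .)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def choose_action(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise take the highest-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


# Two states and two actions, all values starting at zero (hypothetical setup).
q_table = [[0.0, 0.0], [0.0, 0.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
greedy = choose_action(q_table, state=0, epsilon=0.0, rng=random.Random(0))
```

In a traffic-classification setting the reward could, for example, be 1 for a correct class prediction and 0 otherwise; the source does not state the reward used by EDRL.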
Network data features and their notation.
| Feature Index | Notation | Feature Description |
|---|---|---|
| f1 | avg_seg_sz | Average size of segment |
| f2 | win_sz | Window size |
| f3 | r_t_t | Round Trip delay Time |
| f4 | var_pack | Variance in packets |
| f5 | Act_dt_pkt | Actual data packet |
| f6 | clt_pn | Client port number |
| f7 | svr_pn | Server port number |
Classes of network traffic.
| Class Index | Notation | Class Description | Applications |
|---|---|---|---|
| c1 | www_pkt | www packet | General browsing data |
| c2 | p2p_pkt | P2P network packet | Torrent streaming |
| c3 | ml_pkt | Mail service packet | SMTP, POP, MIME, IMAP |
| c4 | db_pkt | Database packet | SQL net |
| c5 | mul_pkt | Multimedia packet | Video storage server YouTube |
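The seven features and five classes listed above can be encoded directly as a record type and an enumeration. This is a sketch of one possible encoding: the type names and the sample flow values are hypothetical, and only the field and member names follow the notation in the tables.

```python
from enum import Enum
from typing import NamedTuple


class FlowFeatures(NamedTuple):
    """Features f1..f7, in table order."""
    avg_seg_sz: float   # f1: average size of segment
    win_sz: float       # f2: window size
    r_t_t: float        # f3: round-trip delay time
    var_pack: float     # f4: variance in packets
    act_dt_pkt: float   # f5: actual data packet (Act_dt_pkt in the table)
    clt_pn: int         # f6: client port number
    svr_pn: int         # f7: server port number


class TrafficClass(Enum):
    """Classes c1..c5, numbered by class index."""
    WWW_PKT = 1   # www packet
    P2P_PKT = 2   # P2P network packet
    ML_PKT = 3    # mail service packet
    DB_PKT = 4    # database packet
    MUL_PKT = 5   # multimedia packet


# Made-up values for a single flow, used only to show the encoding.
flow = FlowFeatures(512.0, 65535.0, 0.042, 3.7, 1460.0, 51234, 443)
vector = [float(v) for v in flow]
label = TrafficClass.WWW_PKT
```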
Figure 1. DNN multilayer perceptron for network traffic processing.
Figure 2. An EDRL architecture for network traffic analysis.
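Figure 1 refers to a multilayer perceptron for traffic processing. A minimal forward pass, assuming seven inputs (features f1 to f7), one hidden layer, and five softmax outputs (classes c1 to c5), might look like the following. The hidden width of 8, the random weights, and the input values are assumptions; the layer sizes used in the study are not given here.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    """Numerically stable softmax over a list of scores."""
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w1, b1 = init_layer(8, 7, rng)   # hidden layer: 7 features -> 8 units
w2, b2 = init_layer(5, 8, rng)   # output layer: 8 units -> 5 classes


def forward(x):
    return softmax(dense(relu(dense(x, w1, b1)), w2, b2))


# Hypothetical normalized feature vector for one flow.
probs = forward([0.5, 0.2, 0.1, 0.0, 0.3, 0.8, 0.4])
predicted_class = probs.index(max(probs)) + 1   # 1-based class index
```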
Group statistics for accuracy comparison of EDRL vs. CNN algorithms to measure mean, standard deviation, and standard error mean.
| Measure | Algorithm | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Accuracy | EDRL | 10 | 97.200 | 1.71156 | 0.538 |
| Accuracy | CNN | 10 | 93.055 | 2.29835 | 0.727 |
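The group statistics follow from the per-run measurements: the standard error of the mean is the sample standard deviation divided by the square root of N. The sketch below shows that relationship. The helper names are hypothetical, and because the ten individual run values are not given in this record, only the CNN row's summary figures are used as a check.

```python
import math
import statistics


def group_stats(samples):
    """N, mean, sample standard deviation, and standard error of the mean."""
    n = len(samples)
    sd = statistics.stdev(samples)   # uses the n - 1 denominator
    return {
        "n": n,
        "mean": statistics.fmean(samples),
        "sd": sd,
        "sem": sd / math.sqrt(n),
    }


def sem_from_sd(sd, n):
    """Standard error of the mean from a reported standard deviation."""
    return sd / math.sqrt(n)


# CNN accuracy row: SD 2.29835 with N = 10.
cnn_sem = sem_from_sd(2.29835, 10)

# Toy sample, unrelated to the study, to show group_stats on raw values.
toy = group_stats([1.0, 2.0, 3.0, 4.0])
```

Rounded to three decimals, `cnn_sem` agrees with the 0.727 reported for CNN.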
Comparison of the independent sample t-test parameters of EDRL and CNN algorithms.
| Measure | Assumption | Levene’s F | Levene’s Sig. | t | df | Sig. (2-Tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | Equal variances assumed | 1.111 | 0.306 | 4.519 | 18 | 0.000 | 4.095 | 0.90619 | 2.191 | 5.999 |
| Accuracy | Equal variances not assumed | | | 4.519 | 16.634 | 0.000 | 4.095 | 0.90619 | 2.180 | 6.010 |
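The standard error of the difference and the "equal variances not assumed" degrees of freedom can be recomputed from the group standard deviations with the Welch-Satterthwaite formula. The sketch below does this for the accuracy row. The function name is hypothetical, and the mean difference of 4.095 is taken as reported in the table rather than recomputed.

```python
import math


def welch_se_and_df(sd1, n1, sd2, n2):
    """Standard error of the mean difference and Welch-Satterthwaite df."""
    v1 = sd1 ** 2 / n1   # squared standard error of group 1
    v2 = sd2 ** 2 / n2   # squared standard error of group 2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df


# Accuracy: EDRL SD 1.71156 and CNN SD 2.29835, N = 10 each.
se, df = welch_se_and_df(1.71156, 10, 2.29835, 10)

# t statistic from the mean difference as reported in the table.
t_stat = 4.095 / se
```

Because both groups have the same size, the pooled (equal variances) standard error equals the Welch value, which is why both rows of the table show the same t and standard error and differ only in df and in the confidence limits.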
Group statistics for precision comparison of EDRL vs. CNN algorithms to measure mean, standard deviation, and standard error mean.
| Measure | Algorithm | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Precision | EDRL | 10 | 97.343 | 1.519 | 0.480 |
| Precision | CNN | 10 | 93.972 | 2.403 | 0.760 |
Comparison of independent samples t-test parameters of EDRL and CNN algorithms for precision.
| Measure | Assumption | Levene’s F | Levene’s Sig. | t | df | Sig. (2-Tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| Precision | Equal variances assumed | 2.351 | 0.143 | 4.295 | 18 | 0.000 | 3.861 | 0.899 | 1.972 | 5.750 |
| Precision | Equal variances not assumed | | | 4.295 | 15.20 | 0.001 | 3.861 | 0.899 | 1.947 | 5.780 |
Group statistics for false positive comparison of EDRL vs. CNN algorithms to measure the mean, standard deviation, and standard error mean.
| Measure | Algorithm | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| False positive | EDRL | 10 | 2.657 | 1.85335 | 0.586 |
| False positive | CNN | 10 | 6.325 | 2.19063 | 0.693 |
Comparison of independent samples t-test parameters of EDRL and CNN algorithms for false positive.
| Measure | Assumption | Levene’s F | Levene’s Sig. | t | df | Sig. (2-Tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| False positive | Equal variances assumed | 0.372 | 0.550 | −4.042 | 18 | 0.001 | −3.668 | 0.907 | −5.574 | −1.762 |
| False positive | Equal variances not assumed | | | −4.042 | 17.51 | 0.001 | −3.668 | 0.907 | −5.578 | −1.758 |
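The 95% confidence limits in the "equal variances assumed" row are the mean difference plus or minus the critical t value times the standard error of the difference. A minimal check for the false positive row is sketched below. The function name is hypothetical, and the critical value for 18 degrees of freedom is hard-coded as a constant because the Python standard library has no t-distribution quantile function.

```python
# Two-sided 95% critical value of Student's t with 18 degrees of freedom.
T_CRIT_975_DF18 = 2.100922


def confidence_interval(mean_diff, se_diff, t_crit):
    """Symmetric confidence interval around a mean difference."""
    margin = t_crit * se_diff
    return mean_diff - margin, mean_diff + margin


# False positive row: mean difference -3.668, standard error 0.907.
lower, upper = confidence_interval(-3.668, 0.907, T_CRIT_975_DF18)
```

Rounded to three decimals, the result matches the reported limits of −5.574 and −1.762. The interval excludes zero, consistent with the reported two-tailed significance of 0.001.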
Group statistics for false negative comparison of EDRL vs. CNN algorithms to measure the mean, standard deviation, and standard error mean.
| Measure | Algorithm | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| False negative | EDRL | 10 | 2.5270 | 1.22734 | 0.38812 |
| False negative | CNN | 10 | 5.6750 | 1.9920 | 0.61643 |
Comparison of independent sample t-test parameters of the EDRL and CNN algorithms for the false negative.
| Measure | Assumption | Levene’s F | Levene’s Sig. | t | df | Sig. (2-Tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| False negative | Equal variances assumed | 3.113 | 0.095 | −4.826 | 18 | 0.00 | −3.598 | 0.746 | −5.164 | −2.032 |
| False negative | Equal variances not assumed | | | −4.826 | 14.87 | 0.00 | −3.598 | 0.746 | −5.188 | −2.008 |
Figure 3. Mean accuracy comparison of EDRL and CNN, including error rates.
Figure 4. Precision comparison of EDRL and CNN algorithms with the standard error of the mean.
Figure 5. False positive comparison between the EDRL and CNN algorithms, with standard error bars, SD, and confidence interval.
Figure 6. False negative comparison between the EDRL and CNN algorithms, with standard error bars, SD, and confidence interval.
Various ML and DL accuracy measure comparisons with the proposed EDRL algorithm.
| Work Name | Algorithm Used | Accuracy |
|---|---|---|
| EDONKEY application network traffic [ | KNN and RF | 72.08% and 90.53% |
| FTP_CONTROL [ | ANN | 78.00% |
| The network traffic of FTP and P2P [ | KNN | 94% |
| The CNN based application identification task [ | CNN | 94% |
| Traffic classification was less with UNB ISCX VPN-Non-VPN dataset [ | SVM | 94.2% |
| Orange platform of Nigerian University [ | KNN, RF, NN, and NB | 79.6%, 84.8%, 84.6%, and 87.6% |
| Internet traffic of different applications | The proposed EDRL algorithm | 97.20% |