Joseph Isabona, Agbotiname Lucky Imoize, Yongsung Kim.
Abstract
Over the past two decades, many telecommunication industries have moved through the digital revolution by integrating artificial intelligence (AI) techniques into how they run and define their processes. Data acquisition, analysis, harnessing, and mining are now regarded as vital drivers of business growth in these industries. Machine learning, a subset of AI, can assist in particular with learning patterns in large datasets, intelligent extrapolative extraction of data, and automatic decision-making in predictive learning. First, this paper benchmarks the adaptive learning capacities of several key machine-learning-based regression models for extrapolative analysis of throughput data acquired at different user communication distances from the gNodeB transmitter in 5G New Radio networks. Second, it proposes a random forest (RF)-based machine learning model combined with a least-squares boosting algorithm and a Bayesian hyperparameter tuning method for further extrapolative analysis of the acquired throughput data. The proposed model is referred to as the RF-LS-BPT method. The least-squares boosting algorithm combines the weak RF learners into a single strong prediction model, while the Bayesian hyperparameter tuning automatically determines the best RF hyperparameter values, enabling the RF-LS-BPT model to reach its best prediction performance. On the acquired throughput data, the RF-LS-BPT method showed higher prediction accuracy than the ordinary random forest model and six other machine-learning-based regression models.
The coefficient of determination (Rsq) and mean absolute error (MAE) values obtained for throughput prediction at different user locations with the proposed RF-LS-BPT method range from 0.9800 to 0.9999 and from 0.42 to 4.24, respectively. The standard RF models attained Rsq values of 0.9644 to 0.9944 and MAE values of 5.47 to 12.56. The improved throughput prediction accuracy of the RF-LS-BPT method demonstrates the significance of hyperparameter tuning and optimization in developing precise and reliable machine-learning-based regression models. The proposed model would find valuable applications in throughput estimation and modeling in 5G and beyond-5G wireless communication systems.
Keywords: 5G performance measurement; adaptive learning; hyperparameter tuning; least-squares boosting; machine learning; optimization; random forest; throughput data
Year: 2022 PMID: 35632184 PMCID: PMC9146345 DOI: 10.3390/s22103776
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Limitations of some related works.
| Year | Reference | Focus and Coverage | Limitations | Comparison with This Paper |
|---|---|---|---|---|
| 1989 | Battiti [ | The work focuses on accelerated backpropagation learning, considering two optimization techniques. | There is a need to assess the performance of the models for networks with a large number of weights. | This paper presents a detailed statistical analysis of the acquired throughput data through performance status quality reporting at the different user equipment terminal locations. |
| 2008 | Castillo [ | Adaptive learning algorithms for Bayesian network classifiers were proposed. The work aims to handle the cost–performance trade-off and deals with concept drift. | The work did not provide adequate information on how to resolve the bottleneck challenges in a prequential learning framework as the training data increase over time. | The current work examined the performance of the proposed learning-based models for 5G wireless networks using large-scale throughput data acquired from several network operators in the United States. |
| 2011 | Khan, Tembine, and Vasilakos [ | The work presents game dynamics and the cost of learning in heterogeneous 4G networks. | The work provides numerical examples and OPNET simulations concerning network selection in WLAN and LTE. However, experimental validation of the numerical results is missing. | Our work presents performance benchmarking of adaptive learning capabilities of different machine-learning-based regression models based on the experimental 5G throughput data. |
| 2016 | Pandey and Janhunen [ | The work presents a method based on reinforcement learning for automating parts of the management of mobile networks. | The work did not cover the concept of learning with partial observability and cooperative learning that considers the neighboring base stations. | Our work addresses the problem of learning with partial observability and cooperative learning by integrating the neighboring base stations based on the 5G data analyzed. |
| 2018 | Li, Cao and Hao [ | The work presents an adaptive-learning-based network selection approach for 5G dynamic environments. The system enables users to adaptively adjust their selections in response to the gradually or abruptly changing environment. | Though the proposed approach enables a population of terminal users to adapt effectively to the network dynamics, experimental validation of the proposed approach is missing. | Our work proposed an RF-LS-BPT regression model for improved dataset predictive modeling and learning based on 5G experimental datasets. |
| 2020 | Narayanan et al. [ | The work focuses on commercial 5G performance on smartphones using 5G networks of three carriers in three US cities. Additionally, the work explored the feasibility of using location and other environmental data to predict network performance. | The work developed practical and sound measurement methodologies for 5G networks on COTS smartphones but did not provide learning-based models for the 5G performance measurements. | The current work proposed learning-based models for improved dataset predictive modeling and learning based on the 5G throughput data. |
| 2021 | Moodi, Ghazvini, and Moodi [ | The work considers a hybrid intelligent approach to detect Android botnets using a smart self-adaptive-learning-based PSO-SVM. | The authors observed that one of the factors influencing the selection of important features of a dataset is the approach and the parameters used on that dataset. However, practical deployment of the proposed hybrid intelligent approach was not considered. | An optimized RF-LS-BPT regression model was proposed for accurate throughput data modeling and learning using different performance indicators based on experimental datasets. |
| 2022 | Hervis Santana et al. [ | The work examines the application of a machine-learning-based algorithm to approximate a complex 5G path loss prediction model. Specifically, the decision tree ensembles (bagging) algorithm was employed to build a generic model which was used to estimate the pathloss. | Time optimization for the feature (input) calculation process was not considered in this work. Experimental validation of the proposed model is also missing. Lastly, practical testing of the model for accurate wireless network planning is required. | The current work captured optimization for the features (inputs) variables and experimentally validated the proposed model using practical 5G throughput data. |
Figure 1. (a) Flowchart for the proposed RF-LS-BPT model; (b) the RF-LS-BPT model and its hyperparameter tuning implementation process.
Figure 2. Measured throughput qualities attained at close communication distances of 25, 50, 75, 100, and 160 m between the transmitter and UET.
Throughput quality status attained from 25 to 160 m UET communication distance.
| Distance (m) | Max. | Min. | Mean | Median | STD |
|---|---|---|---|---|---|
| 25 | 2.35 × 10³ | 31.43 | 947.61 | 450.56 | 625.27 |
| 50 | 2.08 × 10³ | 335.54 | 925.27 | 734.02 | 604.43 |
| 75 | 2.07 × 10³ | 906.13 | 807.38 | 807.38 | 614.36 |
| 100 | 1.97 × 10³ | 10.49 | 855.26 | 718.17 | 540.10 |
| 160 | 1.99 × 10³ | 146.79 | 808.43 | 655.34 | 482.53 |
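
The per-distance summary in the table above can be reproduced from raw throughput samples with a few lines of Python. This is a minimal sketch, not the authors' procedure: the function name `summarize_throughput` and the sample readings are hypothetical, and the population form of the standard deviation is an assumption, since the source does not say which form was used.

```python
from statistics import fmean, median, pstdev


def summarize_throughput(samples):
    """Return the five summary statistics reported per distance."""
    if not samples:
        raise ValueError("need at least one throughput sample")
    return {
        "max": max(samples),
        "min": min(samples),
        "mean": fmean(samples),
        "median": median(samples),
        # Assumption: population standard deviation (divide by n).
        "std": pstdev(samples),
    }


# Hypothetical readings, not values from the measurement campaign.
readings = [100.0, 200.0, 300.0, 400.0, 1000.0]
print(summarize_throughput(readings))
```

For any set of samples, the mean and median must lie between the minimum and maximum, which gives a quick consistency check on a reported row.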
Figure 3. Throughput quality comparison of different machine learning models. (a) Least-squares (LS). (b) Neural networks (NNs). (c) Support vector machine (SVM). (d) Decision tree (DT). (e) Gaussian process regression (GPR). (f) K-nearest neighbor (KNN). (g) Random forest (RF).
Figure 4. Throughput quality prediction accuracy with MAE attained by the different machine learning models. (a) Least-squares (LS). (b) Neural networks (NNs). (c) Support vector machine (SVM). (d) Decision tree (DT). (e) Gaussian process regression (GPR). (f) K-nearest neighbor (KNN). (g) Random forest (RF).
Figure 5. Throughput quality prediction accuracy with Rsq attained by the different machine learning models. (a) Least-squares (LS). (b) Neural networks (NNs). (c) Support vector machine (SVM). (d) Decision tree (DT). (e) Gaussian process regression (GPR). (f) K-nearest neighbor (KNN). (g) Random forest (RF).
Figure 6. Minimum objective versus function evaluation pattern with Bayesian search.
Figure 7. Cross-validated MSE curves arising from hyperparameter tuning.
Optimal hyperparameter values using grid search and Bayesian search.
| Hyperparameters | Best Grid Search Hyperparameter Values | Best Bayesian Search Hyperparameter Values |
|---|---|---|
| Learning Rate | 0.25 | 0.29025 |
| Num. Trees | 52 | 23 |
| MaxNumSplits | 32 | 195 |
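
The least-squares boosting idea behind the tuned model can be illustrated with a minimal sketch. It is not the authors' implementation: it uses one-split regression trees (stumps) on a single hypothetical feature, takes the learning rate (0.29025) and tree count (23) from the Bayesian-search column above as defaults, and ignores MaxNumSplits. All function names and the toy data are assumptions for illustration, and the Bayesian search itself is not shown.

```python
from statistics import fmean


def fit_stump(xs, targets):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    cuts = sorted(set(xs))
    for thr in cuts[:-1]:  # splitting at the largest x would leave the right side empty
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        lv, rv = fmean(left), fmean(right)
        sse = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    if best is None:  # only one distinct x: fall back to a constant prediction
        m = fmean(targets)
        return (xs[0], m, m)
    return best[1:]


def ls_boost_fit(xs, ys, n_trees=23, learning_rate=0.29025):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = fmean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lv, rv = fit_stump(xs, residuals)
        stumps.append((thr, lv, rv))
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * (lv if x <= thr else rv)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def ls_boost_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= thr else rv
                                      for thr, lv, rv in stumps)


# Hypothetical toy data, not measured throughput.
xs = [float(i) for i in range(20)]
ys = [x * x for x in xs]
model = ls_boost_fit(xs, ys)
baseline_mse = fmean((y - fmean(ys)) ** 2 for y in ys)
boosted_mse = fmean((y - ls_boost_predict(model, x)) ** 2 for x, y in zip(xs, ys))
print(baseline_mse, boosted_mse)
```

Because every stump is the least-squares fit to the current residuals, each boosting round lowers the training error, so the boosted error ends below that of the constant baseline.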
Figure 8. Predicted throughput quality performance of the proposed RF-LS-BPT model and standard RF model at 25 m distance.
Figure 9. Predicted throughput quality performance of the proposed RF-LS-BPT model and standard RF model at 50 m distance.
Figure 10. Predicted throughput quality performance of the proposed RF-LS-BPT model and standard RF model at 75 m distance.
Figure 11. Predicted throughput quality performance of the proposed RF-LS-BPT model and standard RF model at 100 m distance.
Figure 12. Predicted throughput quality performance of the proposed RF-LS-BPT model and standard RF model at 160 m distance.
Figure 13. Predicted throughput quality correlation performance of the proposed RF-LS-BPT model and standard RF model at 25 m distance.
Figure 14. Predicted throughput quality correlation performance of the proposed RF-LS-BPT model and standard RF model at 50 m distance.
Figure 15. Predicted throughput quality correlation performance of the proposed RF-LS-BPT model and standard RF model at 75 m distance.
Figure 16. Predicted throughput quality correlation performance of the proposed RF-LS-BPT model and standard RF model at 100 m distance.
Figure 17. Predicted throughput quality correlation performance of the proposed RF-LS-BPT model and standard RF model at 160 m distance.
Proposed RF-LS-BPT and standard RF regression model accuracy for training.
| Model | Metric | 25 m | 50 m | 75 m | 100 m | 160 m |
|---|---|---|---|---|---|---|
| Optimized RF | MAE 1 | 2.40 | 0.42 | 0.86 | 2.95 | 4.24 |
| Standard RF | MAE 2 | 9.24 | 5.47 | 6.58 | 7.84 | 12.56 |
| Optimized RF | NRMSE 1 | 0007 | 0.0001 | 0.0027 | 0.0111 | 0.081 |
| Standard RF | NRMSE 2 | 0.009 | 0.0045 | 0.0049 | 0.0117 | 0.02 |
| Optimized RF | Rsq 21 | 0.9999 | 0.9999 | 0.9999 | 0.9986 | 0.9890 |
| Standard RF | Rsq 22 | 0.9998 | 0.9998 | 0.9997 | 0.9983 | 0.9488 |
Proposed RF-LS-BPT and standard RF regression model accuracy for testing.
| Model | Metric | 25 m | 50 m | 75 m | 100 m | 160 m |
|---|---|---|---|---|---|---|
| Optimized RF | MAE 1 | 1.33 | 0.88 | 1.12 | 11.37 | 11.92 |
| Standard RF | MAE 2 | 9.27 | 2.41 | 3.84 | 12.88 | 13.82 |
| Optimized RF | NRMSE 1 | 0.0041 | 0.0025 | 0.0029 | 0.2700 | 0.0490 |
| Standard RF | NRMSE 2 | 0.0043 | 0.0029 | 0.0035 | 0.2720 | 0.0494 |
| Optimized RF | Rsq 21 | 0.9998 | 0.9999 | 0.9999 | 0.9926 | 0.9881 |
| Standard RF | Rsq 22 | 0.9990 | 0.9977 | 0.9997 | 0.9920 | 0.9800 |
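
The three accuracy measures in the tables above can be computed as in the following sketch. The MAE and Rsq formulas are the standard ones; the NRMSE normalization is an assumption, because the source does not state whether the RMSE is divided by the range, the mean, or another scale of the measured data. The function names and the toy values are hypothetical.

```python
from math import sqrt
from statistics import fmean


def mae(y_true, y_pred):
    """Mean absolute error."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def nrmse(y_true, y_pred):
    """Root-mean-square error divided by the range of the measured values.

    Assumption: range normalization; the source does not specify the scale.
    """
    rmse = sqrt(fmean((t - p) ** 2 for t, p in zip(y_true, y_pred)))
    return rmse / (max(y_true) - min(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values, not data from the study.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mae(measured, predicted), nrmse(measured, predicted), r_squared(measured, predicted))
```

Lower MAE and NRMSE and an Rsq closer to 1 indicate a better fit, which is how the optimized and standard RF rows are compared.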
Figure 18. MAE values attained in engaging the LS-Boost and bagging algorithms in the random forest model in adaptively testing the throughput quality at different distances.
Throughput data training accuracy using LS-Boosting and bagging.
| Algorithm | Metric | 25 m | 50 m | 75 m | 100 m | 160 m |
|---|---|---|---|---|---|---|
| Training (LS-Boosting) | MAE 1 | 1.71 | 0.66 | 0.72 | 2.73 | 5.21 |
| Training (Bagging) | MAE 2 | 63.03 | 42.37 | 37.77 | 33.57 | 49.97 |
| Training (LS-Boosting) | NRMSE 1 | 0.0052 | 0.0012 | 0.0104 | 0.0102 | 0.0210 |
| Training (Bagging) | NRMSE 2 | 0.0684 | 0.0500 | 0.0342 | 0.0353 | 0.0560 |
| Training (LS-Boosting) | Rsq | 0.9996 | 0.9999 | 0.9999 | 0.9986 | 0.9935 |
| Training (Bagging) | Rsq | 0.9835 | 0.9984 | 0.9984 | 0.9883 | 0.9719 |
Throughput data testing accuracy using LS-Boosting and bagging.
| Algorithm | Metric | 25 m | 50 m | 75 m | 100 m | 160 m |
|---|---|---|---|---|---|---|
| Testing (LS-Boosting) | MAE 1 | 4.22 | 0.71 | 1.79 | 8.65 | 8.07 |
| Testing (Bagging) | MAE 2 | 77.39 | 27.91 | 47.58 | 43.76 | 50.08 |
| Testing (LS-Boosting) | NRMSE 1 | 0.012 | 0.0024 | 0.0047 | 0.024 | 0.0374 |
| Testing (Bagging) | NRMSE 2 | 0.090 | 0.0032 | 0.0466 | 0.047 | 0.0696 |
| Testing (LS-Boosting) | Rsq1 | 0.9983 | 0.9999 | 0.9998 | 0.9947 | 0.9860 |
| Testing (Bagging) | Rsq2 | 0.9935 | 0.9935 | 0.9935 | 0.9818 | 0.9654 |
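
For contrast with boosting, the bagging alternative compared in the tables above trains each tree independently on a bootstrap resample and averages the predictions. The sketch below is a hedged illustration with one-split trees on a single hypothetical feature; the function names, the tree count, the seed, and the toy data are assumptions and do not reproduce the authors' configuration.

```python
import random
from statistics import fmean


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    best = None
    cuts = sorted(set(xs))
    for thr in cuts[:-1]:  # splitting at the largest x would leave the right side empty
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lv, rv = fmean(left), fmean(right)
        sse = sum((y - lv) ** 2 for y in left) + sum((y - rv) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    if best is None:  # resample with one distinct x: constant prediction
        m = fmean(ys)
        return (xs[0], m, m)
    return best[1:]


def bagging_fit(xs, ys, n_trees=23, seed=0):
    """Bagging: fit each stump on an independent bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def bagging_predict(stumps, x):
    """Average the predictions of all stumps."""
    return fmean(lv if x <= thr else rv for thr, lv, rv in stumps)


# Hypothetical toy data, not measured throughput.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
stumps = bagging_fit(xs, ys)
print(bagging_predict(stumps, 4.0))
```

Unlike boosting, no tree here sees the errors of the others, so averaging reduces variance but does not progressively correct residual error.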