Ibrahim A. Nemer, Tarek R. Sheltami, Slim Belhaiza, Ashraf S. Mahmoud.
Abstract
Unmanned Aerial Vehicles (UAVs) are considered an important element in wireless communication networks due to their agility, mobility, and ability to be deployed as mobile base stations (BSs) in the network to improve the communication quality and coverage area. UAVs can be used to provide communication services for ground users in different scenarios, such as transportation systems, disaster situations, emergency cases, and surveillance. However, covering a specific area under a dynamic environment for a long time using UAV technology is quite challenging due to its limited energy resources, short communication range, and flying regulations and rules. Hence, a distributed solution is needed to overcome these limitations and to handle the interactions among UAVs, which leads to a large state space. In this paper, we introduced a novel distributed control solution to place a group of UAVs in the candidate area in order to improve the coverage score with minimum energy consumption and a high fairness value. The new algorithm is called the state-based game with actor-critic (SBG-AC). To simplify the complex interactions in the problem, we model SBG-AC using a state-based potential game. Then, we merge SBG-AC with an actor-critic algorithm to assure the convergence of the model, to control each UAV in a distributed way, and to have learning capabilities in case of dynamic environments. Simulation results show that the SBG-AC outperforms the distributed DRL and the DRL-EC3 in terms of fairness, coverage score, and energy consumption.Entities:
Keywords: UAV; actor–critic; coverage score; fairness; reinforcement learning
Year: 2022 PMID: 35271067 PMCID: PMC8915037 DOI: 10.3390/s22051919
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
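As a concrete illustration of the per-UAV learning loop the abstract describes (each UAV runs its own actor-critic learner and updates from a local reward), here is a minimal tabular sketch. The state/action sizes and the reward are hypothetical placeholders; the paper's actual neural networks are specified in the tables below, from which only the learning rates and discount factor are taken.

```python
import numpy as np

N_STATES, N_ACTIONS = 100, 5        # hypothetical: grid cells x {N, S, E, W, hover}
GAMMA, LR_ACTOR, LR_CRITIC = 0.999, 0.001, 0.002   # values from the tables below

theta = np.zeros((N_STATES, N_ACTIONS))   # actor: softmax action preferences
V = np.zeros(N_STATES)                    # critic: state-value estimates

def policy(s):
    """Softmax policy over the action preferences of state s."""
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def update(s, a, r, s_next):
    """One TD(0) actor-critic update for a single UAV agent."""
    td_error = r + GAMMA * V[s_next] - V[s]        # critic's TD error
    V[s] += LR_CRITIC * td_error                   # critic moves toward the target
    grad_log_pi = -policy(s)
    grad_log_pi[a] += 1.0                          # grad of log softmax at action a
    theta[s] += LR_ACTOR * td_error * grad_log_pi  # actor follows the critic's signal

update(s=0, a=1, r=1.0, s_next=2)                  # one illustrative step
```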
Literature summary of game- and learning-based models.
| Ref. | Method | Type | Objective(s) | 2D/3D | Utility/Reward | UAVs | Metrics |
|---|---|---|---|---|---|---|---|
| […] | Potential game | Game | Coverage maximization | 2D | Coverage and … | 4 to 9 | Coverage with iterations, … |
| […] | Potential game | Game | Maximize coverage | 2D | Coverage probability | 15 | Coverage with iterations |
| […] | DRL | Learning | Maximize energy … | 2D | Coverage score, … | 5 to 10 | Coverage score, … |
| […] | Actor-critic | Learning | Scheduling and … | 3D | Energy consumption | 1 | Energy |
| […] | Q-learning | Learning | Coverage and … | 3D | Energy consumption | 1 | Reward, energy, … |
| […] | DDPG | Learning | Enhance energy efficiency | 3D | Throughput fairness | 1 | Reward |
Figure 1. UAV network with the actor–critic algorithm.
Machine/software specifications.
| Hardware/Software | Description |
|---|---|
| Processor | Intel(R) Xeon(R) Gold 5218 CPU @ 2.30 GHz |
| Operating System | Microsoft Windows 10 Professional x64 |
| Memory | 256 GB |
| Python | v3.7 |
| Tensorflow | v2.0.0 |
Simulation setting for the UAV network.
| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Number of UAVs | 3 to 9 | LoS attenuation | 1 dB |
| Transmission power | 32 dBm | NLoS attenuation | 20 dB |
| Cells/squares | 100 × 100 | Environmental constants | 0.11, 0.6 |
| Timeslot | 1 s | Elevation angle | 45° |
| Duration/iterations | 200 | LoS link | 10.39, 0.05 |
| UAV speed | 10 m/s | NLoS link | 29.06, 0.03 |
| Path-loss exponent | 2.5 | Rotor tip speed | 120 m/s |
| Carrier frequency | 2 GHz | Mean rotor-induced velocity | 0.002 m/s |
| Speed of light | 3 × 10⁸ m/s | Fuselage drag ratio | 0.48 |
| Air density | 1.225 kg/m³ | Rotor solidity | 0.0001 |
| Rotor disc area | 0.5 m² | Blade profile power | 99.66 W |
| Sensing and power constants | Random (0.8–1) | Derived (induced) power | 120.16 W |
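The rotor constants in this table (tip speed, solidity, disc area, blade profile power, induced power, fuselage drag ratio) appear to parameterize the standard rotary-wing propulsion-power model, P(V) = P0(1 + 3V²/U_tip²) + P_i(√(1 + V⁴/4v0⁴) − V²/2v0²)^½ + ½d0ρsAV³. A hedged sketch under that assumption, using the listed values as-is:

```python
import math

P0   = 99.66    # blade profile power, W
Pi   = 120.16   # derived (induced) power, W
Utip = 120.0    # rotor tip speed, m/s
v0   = 0.002    # mean rotor-induced velocity, m/s (as listed in the table)
d0   = 0.48     # fuselage drag ratio
rho  = 1.225    # air density, kg/m^3
s    = 0.0001   # rotor solidity (as listed)
A    = 0.5      # rotor disc area, m^2

def propulsion_power(V):
    """Power (W) to fly at horizontal speed V (m/s) under the assumed model."""
    blade = P0 * (1.0 + 3.0 * V**2 / Utip**2)            # blade profile power
    y = V**2 / (2.0 * v0**2)
    # induced term sqrt(sqrt(1 + y^2) - y), in a numerically stable form
    induced = Pi * math.sqrt(1.0 / (math.sqrt(1.0 + y**2) + y))
    parasite = 0.5 * d0 * rho * s * A * V**3             # fuselage drag power
    return blade + induced + parasite

print(propulsion_power(10.0))  # power at the table's UAV speed of 10 m/s
```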
Figure 2. Actor–critic network.
Parameters for the actor and critic networks.
| Parameters | Actor | Critic |
|---|---|---|
| Number of hidden layers | 2 | 2 |
| Neurons per Hidden Layer 1 | 1000 | 500 |
| Neurons per Hidden Layer 2 | 500 | 400 |
| Activation function in hidden layers | ReLU | ReLU |
| Activation function in output layer | tanh | ReLU |
| Learning rate | 0.001 | 0.002 |
| Loss function | Equation (…) | Equation (…) |
| Optimizer | Adam | Adam |
| Batch size | 64 | |
| Memory capacity | 5000 | |
| Discount factor | 0.999 | |
| Noise variance | 0 and 0.01 | |
| Episodes | 400 | |
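The layer sizes, activations, learning rates, and optimizer in this table map directly onto TensorFlow 2 / Keras definitions (the version listed in the machine table). The input/output dimensions below, and the assumption that the critic scores state-action pairs, are hypothetical; only the table's values are taken as given.

```python
import tensorflow as tf

STATE_DIM, ACTION_DIM = 3, 3   # hypothetical: 3D location in, 3D movement out

# Actor: 1000 -> 500 ReLU hidden layers, tanh output (bounded actions)
actor = tf.keras.Sequential([
    tf.keras.layers.Dense(1000, activation="relu", input_shape=(STATE_DIM,)),
    tf.keras.layers.Dense(500, activation="relu"),
    tf.keras.layers.Dense(ACTION_DIM, activation="tanh"),
])
actor_opt = tf.keras.optimizers.Adam(learning_rate=0.001)

# Critic: 500 -> 400 ReLU hidden layers, ReLU output per the table
critic = tf.keras.Sequential([
    tf.keras.layers.Dense(500, activation="relu",
                          input_shape=(STATE_DIM + ACTION_DIM,)),
    tf.keras.layers.Dense(400, activation="relu"),
    tf.keras.layers.Dense(1, activation="relu"),
])
critic_opt = tf.keras.optimizers.Adam(learning_rate=0.002)
```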
Description of the SBG-AC, DRL-EC3, and distributed-DRL models.
| Features | SBG-AC | DRL-EC3 | Distributed-DRL |
|---|---|---|---|
| Simulation | Same setting | Same setting | Same setting |
| Model | Game + learning | Learning | Learning |
| Altitude | Varied with … | Fixed for all | Fixed for all |
| Network | Boundaries and … | Boundaries and … | Boundaries and … |
| State and action | 3D location … | Horizontal flying angle … | Horizontal flying angle … |
| Reward | Equation (…) | … | … |
| Learning | Multiple actor-critic networks | One actor-critic network | Multiple actor-critic networks |
| Network type | Distributed | Centralized | Distributed |
Figure 3. Average coverage score for 3-, 4-, 5-, 6-, 7-, 8-, and 9-UAV networks.
Figure 4. Fairness index for 3-, 4-, 5-, 6-, 7-, 8-, and 9-UAV networks.
Figure 5. Normalized average energy consumption for 3-, 4-, 5-, 6-, 7-, 8-, and 9-UAV networks.
Performance comparison for an 8-UAV network.
| Metrics | SBG-AC | DRL-EC3 | Distributed-DRL |
|---|---|---|---|
| Coverage score per episode | 0.846 | 0.808 | 0.827 |
| Fairness index per episode | 0.934 | 0.905 | 0.917 |
| Normalized average energy consumption per episode | 0.263 | 0.269 | 0.267 |
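The coverage score and fairness index compared above can be illustrated with a short sketch. It assumes the coverage score is the covered fraction of grid cells and the fairness metric is Jain's index, a common choice for per-cell coverage fairness; the exact definitions are given by the paper's equations, and all variable names here are hypothetical.

```python
import numpy as np

def coverage_score(covered):
    """Fraction of grid cells covered; `covered` is a boolean cell array."""
    return covered.mean()

def jain_fairness(x):
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly fair."""
    x = np.asarray(x, dtype=float)
    return x.sum() ** 2 / (x.size * (x ** 2).sum())

covered = np.random.rand(100, 100) < 0.85    # toy 100 x 100 cell grid
per_cell_time = np.random.rand(100 * 100)    # toy per-cell coverage times
print(coverage_score(covered), jain_fairness(per_cell_time))
```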
Figure 6. Accumulated reward over the number of episodes.