Abstract
Unmanned Aerial Vehicle (UAV)-assisted cellular networks operating over the millimeter-wave (mmWave) frequency band can meet the requirements of high data rates and flexible coverage in next-generation communication networks. However, the higher propagation loss and the large number of antennas used in mmWave networks give rise to high energy consumption, while UAVs are constrained by their low-capacity onboard batteries. Energy harvesting (EH) is a viable solution for reducing the energy cost of UAV-enabled mmWave networks. However, the random nature of renewable energy makes it challenging to maintain robust connectivity in UAV-assisted terrestrial cellular networks. Energy cooperation allows UAVs to send their excess energy to other UAVs whose energy is depleted. In this paper, we propose a power allocation algorithm based on energy harvesting and energy cooperation to maximize the throughput of a UAV-assisted mmWave cellular network. Since the channel state is uncertain and the amount of harvested energy can be treated as a stochastic process, we propose a multi-agent deep reinforcement learning (DRL) algorithm, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), to solve the renewable energy resource allocation problem for throughput maximization. Simulation results show that the proposed algorithm outperforms the Random Power (RP), Maximal Power (MP), and value-based Deep Q-Learning (DQL) algorithms in terms of network throughput.
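The energy-cooperation mechanism described above can be sketched as a per-slot battery update: each UAV harvests energy, spends energy on transmission, and may forward part of its surplus to another UAV, of which only a fraction (the transfer efficiency, 0.9 in the paper's setup) arrives. This is a minimal illustrative sketch, not the paper's implementation; all function and variable names are assumptions.

```python
# Hedged sketch of a per-slot battery update with energy harvesting and
# UAV-to-UAV energy cooperation. Names (step_battery, transfers, ...) are
# illustrative; only ETA (0.9) and B_MAX (4000 J) come from the paper's setup.

ETA = 0.9        # energy transfer efficiency between two UAVs
B_MAX = 4000.0   # battery capacity of each UAV, in joules

def step_battery(battery, harvested, consumed, transfers):
    """Advance each UAV's battery level by one time slot.

    battery   : list of current battery levels (J), one per UAV
    harvested : list of energy harvested this slot (J)
    consumed  : list of energy spent on transmission this slot (J)
    transfers : list of (src, dst, amount) energy-cooperation actions
    """
    b = list(battery)
    for i in range(len(b)):
        b[i] += harvested[i] - consumed[i]
    for src, dst, amount in transfers:
        sent = min(amount, b[src])   # a UAV cannot send more than it stores
        b[src] -= sent
        b[dst] += ETA * sent         # the receiver gets only eta * sent
    return [min(max(x, 0.0), B_MAX) for x in b]  # clip to [0, capacity]
```

For example, with two UAVs at 100 J and 50 J, the first harvesting 20 J, spending 30 J, and transferring 40 J to the second, the second ends up with 50 + 10 - 5 + 0.9 * 40 = 91 J.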
Keywords: Multi-Agent Deep Reinforcement Learning (MADDPG); Unmanned Aerial Vehicles (UAVs); energy cooperation; energy harvesting; power allocation
Year: 2021 PMID: 35009812 PMCID: PMC8749623 DOI: 10.3390/s22010270
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Network architecture.
List of notations.
| Notations | Definitions |
|---|---|
| | Number of UAVs |
| | Number of user sets |
| | Mean number of buildings per square kilometer |
| | Scale parameter |
| | Fraction of area covered by buildings to the total area |
| | Path loss |
| | LOS path loss exponent |
| | NLOS path loss exponent |
| | Intercept of the LOS link |
| | Intercept of the NLOS link |
| | Small-scale fading |
| | Nakagami fading parameter for LOS link |
| | Nakagami fading parameter for NLOS link |
| | Channel gain from the UAV to the user |
| | Directional antenna gain |
| | Maximum antenna gain |
| | Azimuth plane |
| | Elevation plane |
| | Main lobe gain |
| | Side lobe gain |
| | Signal-to-interference-plus-noise ratio from UAV |
| | Transmit power selected by UAV |
| | Maximum transmission power |
| | Interference to UAV |
| | Noise power level |
| | Total number of time slots |
| | Time slot duration |
| | Amount of harvested energy for UAV |
| | Maximum harvested energy |
| | Battery capacity of each UAV |
| | Battery state for UAV |
| | mmWave transmission bandwidth |
| | Total throughput |
| | Energy transferred from UAV |
| | Energy transfer efficiency between two UAVs |
| | State space |
| | Action space |
| | State transition function |
| | Reward function |
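The table's definitions of SINR, transmit power, interference, noise power, and mmWave bandwidth combine into the standard Shannon-rate throughput objective that the power allocation algorithm maximizes. The following is a minimal sketch under that standard formulation; the function names and the assumption of linear (non-dB) units are illustrative, not taken from the paper.

```python
# Hedged sketch: per-link SINR and Shannon-rate throughput, using the
# standard formulation B * log2(1 + SINR). Only the 1 GHz bandwidth comes
# from the paper's simulation setup; all names are illustrative.
import math

BANDWIDTH_HZ = 1e9  # mmWave transmission bandwidth

def sinr(p_tx, gain, interference, noise):
    """SINR at a user: desired received power (transmit power times channel
    gain) over interference plus noise. All quantities in linear units."""
    return p_tx * gain / (interference + noise)

def throughput(p_tx, gain, interference, noise, bandwidth=BANDWIDTH_HZ):
    """Achievable rate B * log2(1 + SINR), in bits per second."""
    return bandwidth * math.log2(1.0 + sinr(p_tx, gain, interference, noise))
```

With unit transmit power, unit channel gain, no interference, and unit noise power, the SINR is 1 and the rate is B * log2(2) = 1 Gbit/s.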
Figure 2. Sectorized antenna pattern.
Simulation parameters.
| Parameters | Values | |
|---|---|---|
| Number of UAVs | 4 | |
| Maximum flying altitude of UAVs | 100 m | |
| Number of users | 12 | |
| Mean number of buildings per square kilometer | 300/km² | |
| Fraction of area covered by buildings to the total area | 0.5 | |
| Scale parameter | 20 m | |
| | 1.39 | |
| | 2 | |
| | 3 | |
| | 3 | |
| | 2 | |
| Available bandwidth | 1 GHz | |
| | 10 dB | |
| | (0, 20) dBm | |
| Battery capacity | 4000 J | |
| | (0, 125) J | |
| Energy transfer efficiency between two UAVs | 0.9 | |
| Number of episodes | 5000 | |
| Number of time slots per episode | 500 | |
| Batch size | 500 | |
| Replay memory size | 50,000 | |
| Learning rate for DQL | | |
| Learning rate for MADDPG | Actor | Critic |
| | | |
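Both the DQL baseline and the MADDPG learner sample minibatches from a fixed-size experience replay memory; the table above gives a batch size of 500 and a memory size of 50,000. A minimal sketch of such a buffer follows; the class and method names are illustrative, not from the paper.

```python
# Hedged sketch of a fixed-size experience replay buffer, as used by both
# the DQL baseline and MADDPG-style learners. Only MEMORY_SIZE (50,000) and
# BATCH_SIZE (500) come from the simulation setup; names are illustrative.
import random
from collections import deque

MEMORY_SIZE = 50_000  # replay memory size
BATCH_SIZE = 500      # minibatch size

class ReplayBuffer:
    """Stores (state, action, reward, next_state) transitions; once full,
    the oldest transitions are discarded automatically by the deque."""

    def __init__(self, capacity=MEMORY_SIZE):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size=BATCH_SIZE):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive transitions.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Using `deque(maxlen=...)` keeps the eviction of stale experience O(1) without any bookkeeping in the learner itself.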
Figure 3. Average throughput as a function of the number of UAVs.
Figure 4. Average throughput as a function of the number of time slots.
Figure 5. Average throughput as a function of the number of users with 4 UAVs.
Figure 6. Energy transfer efficiency as a function of the number of users.
Figure 7. Average throughput as a function of the energy arrival.
Figure 8. Average throughput as a function of the battery capacity.
Figure 9. Average throughput as a function of the energy transfer efficiency between two UAVs.
Figure 10. Convergence behavior for the reinforcement learning protocols.