Siqi Tang ¹, Zhisong Pan ¹, Guyu Hu ¹, Yang Wu ², Yunbo Li ¹.
Abstract
The diverse QoS requirements of large-scale terminals are a key challenge confronting resource allocation in the Satellite Internet of Things (S-IoT). This paper presents a deep reinforcement learning-based online channel allocation and power control algorithm for an S-IoT uplink scenario. The intelligent agent determines the transmission channel and power simultaneously based on contextual information. Furthermore, a weighted normalized reward concerning success rate, power efficiency, and QoS requirements is adopted to balance increasing resource efficiency against meeting QoS requirements. Finally, a practical deployment mechanism based on transfer learning is proposed to promote onboard training efficiency and to reduce the computational cost of training. The simulation demonstrates that the proposed method can balance success rate and power efficiency while guaranteeing QoS requirements. Under S-IoT's normal operating condition, the proposed method improves power efficiency by 60.91% and 144.44% compared with GA and DRL-RA, respectively, while its power efficiency is only 4.55% lower than that of DRL-EERA. In addition, the method can be transferred and deployed to the space environment with merely 100 onboard training steps.
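The weighted normalized reward described above can be sketched as a convex combination of its three terms. The weights, the linear power-efficiency normalization, and the function name below are illustrative assumptions, not taken from the paper:

```python
def weighted_reward(success, power_used, power_max, qos_met,
                    w_s=0.4, w_p=0.3, w_q=0.3):
    """Hypothetical weighted normalized reward (sketch).

    success    : 1 if the transmission succeeded, else 0
    power_used : transmit power actually used (mW)
    power_max  : maximum transmit power (mW), e.g. 300 mW in the scenario
    qos_met    : 1 if the terminal's QoS requirement was satisfied, else 0
    Weights w_s, w_p, w_q are illustrative and assumed to sum to 1.
    """
    # Normalized power efficiency: 1 when no power is used, 0 at max power.
    power_efficiency = 1.0 - power_used / power_max
    return w_s * success + w_p * power_efficiency + w_q * qos_met
```

Weighting the three normalized terms lets a single scalar reward trade off throughput against energy use while still penalizing QoS violations.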
Keywords: Satellite Internet of Things; channel allocation; deep reinforcement learning; power control; transfer learning; various QoS
Year: 2022 PMID: 35458964 PMCID: PMC9024869 DOI: 10.3390/s22082979
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. S-IoT scenario based on a multibeam LEO satellite constellation.
Figure 2. Framework of the DRL_CAPC algorithm.
Detailed parameters of S-IoT scenario.
| Parameters | Value |
|---|---|
| Satellite altitude | 550 km |
| Beams | 19 |
| Data transmission channels | 8 |
| Channel bandwidth | 10 kHz |
| Frequency band | 14 GHz |
| Terminals’ antenna max power | 300 mW |
| Terminals’ antenna power levels | 3 |
| Terminals’ antenna gain | 10 dBi |
| Satellite receiving antenna G/T | 3.7 dB/K |
| Path loss | 170.38 dB |
| Satellite backhaul power limitation | 300 W |
| Beam power limitation | 25 W |
| Satellite amplifier magnification | 5 |
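As a rough consistency check, the scenario parameters above imply an uplink SNR via a standard link budget. Treating the 3.7 figure as G/T in dB/K is an assumption (the table's original unit was ambiguous), and implementation losses are ignored:

```python
import math

# Scenario parameters taken from the table above
P_TX_W   = 0.3      # terminal max transmit power: 300 mW
G_TX_DBI = 10.0     # terminal antenna gain
GT_DB    = 3.7      # satellite receive G/T (assumed dB/K)
L_DB     = 170.38   # path loss
BW_HZ    = 10e3     # channel bandwidth: 10 kHz
K_DBW    = -228.6   # Boltzmann's constant in dBW/K/Hz

# EIRP = transmit power (dBW) + antenna gain (dBi)
eirp_dbw = 10 * math.log10(P_TX_W) + G_TX_DBI

# SNR (dB) = EIRP + G/T - path loss - 10*log10(k) - 10*log10(B)
snr_db = eirp_dbw + GT_DB - L_DB - K_DBW - 10 * math.log10(BW_HZ)
```

With these values the budget yields roughly 26.7 dB at maximum transmit power, which is consistent with a feasible narrowband IoT uplink in this scenario.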
DRL-QoS-RA algorithm parameters.
| Algorithm Parameters | Value |
|---|---|
| Replay start size | 2000 |
| Replay memory | 20,000 |
| Batch size | 32 |
| Target network update step | 50 |
| Discount factor | 0.99 |
| Initial exploration rate | 1.0 |
| Final exploration rate | 0.01 |
| Exploration rate decay | 5 × 10⁻⁴ |
| Learning rate | 0.001 |
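With the exploration settings above, a linear ε decay reaches the final rate after (1.0 − 0.01)/5 × 10⁻⁴ = 1980 training steps. The linear form of the schedule is an assumption; the table only specifies the rate itself:

```python
# Exploration parameters from the table above
EPS_START, EPS_END, EPS_DECAY = 1.0, 0.01, 5e-4

def epsilon(step):
    """Epsilon-greedy exploration rate, linearly decayed (assumed schedule)."""
    return max(EPS_END, EPS_START - EPS_DECAY * step)
```

The agent would act randomly with probability `epsilon(step)` and greedily (arg-max Q-value) otherwise, so exploration dominates early training and vanishes to 1% after about 2000 steps.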
Figure 3. Convergence process of the DRL-QoS-RA method.
Figure 4. Training process of each compared method.
Figure 5. Deployment training process based on transfer learning.
Figure 6. Methods’ performance with requested traffic increase. (a) Trend of transmission success rate as traffic increases. (b) Trend of energy efficiency as traffic increases.
Performance of comparative methods with different request arrival rates.
(The left metric group corresponds to the lower request arrival rate, i.e., normal operation; the right group to the higher arrival rate.)

| Methods | Success Rate | Energy Efficiency | Computational Time (s) | Success Rate | Energy Efficiency | Computational Time (s) |
|---|---|---|---|---|---|---|
| random | 0.94 | 0.16 | - | 0.34 | 0.12 | - |
| GA | 0.99 | 0.41 | 3.24 × 10³ | 0.53 | 0.22 | 1.67 × 10⁴ |
| DRL-RA | 1 | 0.27 | 64.77 | 0.89 | 0.23 | 64.77 |
| DRL-EERA | 1 | 0.69 | 71.52 | 0.48 | 0.38 | 71.52 |
| DRL-QoS-RA | 1 | 0.66 | 78.31 | 0.71 | 0.33 | 78.31 |
| DRL-QoS-RA-transferred | 1 | 0.64 | 11.53 | 0.72 | 0.31 | 11.53 |
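The abstract's power-efficiency comparisons for the normal operation condition can be recomputed from the left metric group of the table above; small rounding differences against the quoted 60.91%, 144.44%, and 4.55% are expected:

```python
# Energy-efficiency values for the lower arrival rate (from the table above)
ee = {"GA": 0.41, "DRL-RA": 0.27, "DRL-EERA": 0.69, "DRL-QoS-RA": 0.66}

# Relative improvement of DRL-QoS-RA over GA and DRL-RA (percent)
gain_vs_ga    = (ee["DRL-QoS-RA"] / ee["GA"] - 1) * 100      # ~61%
gain_vs_drlra = (ee["DRL-QoS-RA"] / ee["DRL-RA"] - 1) * 100  # ~144%

# Relative shortfall of DRL-QoS-RA against DRL-EERA (percent)
loss_vs_eera = (1 - ee["DRL-QoS-RA"] / ee["DRL-EERA"]) * 100  # ~4%
```

These ratios match the abstract's claims to within rounding, confirming that the quoted percentages refer to the lower-arrival-rate column group.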