| Literature DB >> 35626523 |
Yixiang Ren ¹, Zhenhui Ye ², Guanghua Song ¹, Xiaohong Jiang ².
Abstract
Mobile crowdsensing (MCS) has attracted considerable attention in recent years as a new paradigm for large-scale information sensing. Unmanned aerial vehicles (UAVs) play a significant role in MCS tasks and serve as crucial nodes in the newly proposed space-air-ground integrated network (SAGIN). In this paper, we incorporate SAGIN into the MCS task and present a Space-Air-Ground integrated Mobile CrowdSensing (SAG-MCS) problem. Based on multi-source observations from embedded sensors and satellites, an aerial UAV swarm is required to carry out energy-efficient data collection and recharging tasks. To date, few studies have explored such a multi-task MCS problem with the cooperation of a UAV swarm and satellites. To address this multi-agent problem, we propose a novel deep reinforcement learning (DRL) based method called Multi-Scale Soft Deep Recurrent Graph Network (ms-SDRGN). Our ms-SDRGN approach incorporates a multi-scale convolutional encoder to process multi-source raw observations for better feature exploitation. We also use a graph attention mechanism to model inter-UAV communications and aggregate extra neighboring information, and utilize a gated recurrent unit for long-term performance. In addition, a stochastic policy can be learned through a maximum-entropy method with an adjustable temperature parameter. Specifically, we design a heuristic reward function to encourage the agents to achieve global cooperation under partial observability. We train the model to convergence and conduct a series of case studies. Evaluation results are statistically significant and show that ms-SDRGN outperforms three state-of-the-art DRL baselines in SAG-MCS: compared with the best-performing baseline, ms-SDRGN improves reward by 29.0% and the CFE score by 3.8%. We also investigate the scalability and robustness of ms-SDRGN in DRL environments with diverse observation scales or demanding communication conditions.
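The abstract describes two core ingredients of ms-SDRGN: attention-weighted aggregation of neighboring UAVs' features and a maximum-entropy stochastic policy with an adjustable temperature. The sketch below illustrates both ideas in plain NumPy under simplifying assumptions (single-head dot-product attention instead of a full GAT layer, Boltzmann action selection over Q-values); the function names are illustrative, not the paper's actual implementation.

```python
import numpy as np

def neighbor_attention(h_self, h_neighbors):
    """Attention-weighted aggregation of neighbor features.

    A simplified, single-head dot-product stand-in for the paper's
    graph attention (GAT) mechanism over communicating UAVs.
    h_self: (d,) feature vector of the ego UAV.
    h_neighbors: (n, d) feature matrix of its neighbors.
    """
    scores = h_neighbors @ h_self / np.sqrt(h_self.size)
    w = np.exp(scores - scores.max())   # softmax attention weights
    w /= w.sum()
    return w @ h_neighbors              # weighted neighbor aggregate

def soft_policy(q_values, alpha):
    """Maximum-entropy (Boltzmann) policy: pi(a) proportional to exp(Q(a)/alpha).

    A larger temperature alpha yields a higher-entropy (more exploratory)
    policy; a smaller alpha approaches the greedy deterministic policy.
    """
    z = (q_values - q_values.max()) / alpha  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def entropy(p):
    """Shannon entropy of a discrete distribution."""
    return float(-np.sum(p * np.log(p + 1e-12)))
```

For example, `soft_policy(q, alpha=10.0)` is close to uniform over actions, while `soft_policy(q, alpha=0.1)` concentrates almost all probability on the highest-valued action, which is how the adjustable temperature trades exploration against exploitation.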
Keywords: UAV control; deep reinforcement learning; graph network; maximum-entropy learning; mobile crowdsensing
Year: 2022 PMID: 35626523 PMCID: PMC9140918 DOI: 10.3390/e24050638
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.738
Figure 1. Proposed SAG-MCS scenario schematic.
Figure 2. The observation space of UAV u in SAG-MCS.
Figure 3. ms-SDRGN model architecture.
Figure 4. The training process of ms-SDRGN. In the flow chart, solid lines indicate feed-forward propagation, and dashed lines denote parameter updates by backpropagation.
Figure 5. (a) The episodic-reward learning curves of the DRL algorithms. (b) The global-metrics learning curves of ms-SDRGN.
Comparison of DRL baselines (values not recovered from the source record).
| Algorithm | Reward | CFE Score | Coverage | Fairness | Energy |
|---|---|---|---|---|---|
| ms-SDRGN | | | | | |
| DGN | | | | | |
| MAAC | | | | | |
| DQN | | | | | |
Figure 6. The evaluation results in environments with different communication dropout rates: (a) mean episodic reward and (b) CFE score.
Simulation environment scale experiment settings.
| Environment Scale Factor | 0.5 | 1.0 | 1.5 | 2.0 |
|---|---|---|---|---|
| Environment Size in Pixels | | | | |
| Coverage Range | 5 | 10 | 15 | 20 |
| Observation Range | 7 | 13 | 20 | 26 |
| Communication Range | 9 | 18 | 27 | 36 |
Figure 7. The evaluation results in environments with different scale factors: (a) mean episodic reward and (b) CFE score. ('w/o local CNN encoder' denotes using a linear encoder to process local observations.)
Ablation study of the ms-SDRGN method (values not recovered from the source record).
| Algorithm | Reward | CFE Score |
|---|---|---|
| ms-SDRGN | | |
| ms-SDRGN-ms | | |
| ms-SDRGN-Soft | | |
| ms-SDRGN-1GAT | | |
| ms-SDRGN-2GAT | | |
| ms-SDRGN-GRU | | |
'-ms' means removing the local CNN encoder. '-Soft' means training a deterministic policy instead of a stochastic one. '-1GAT' and '-2GAT' denote disabling one and two GAT layers, respectively. '-GRU' means disabling the GRU memory unit.