| Literature DB >> 36247361 |
Dandan Ma, Dequan Kong, Xiaowei Chen, Lingyu Zhang, Mingrun Yuan.
Abstract
In recent years, with the rapid development of science and technology, robot location-based service (RLBS) has become a major application service on mobile intelligent devices. Using location services generates a large amount of location data that carries real location information; if a malicious third party obtains it, users face the risk of location-related privacy disclosure. The wide application of crowdsensing services has further aggravated personal privacy leakage, and existing privacy protection strategies do not adapt well to the crowdsensing environment. In this paper, we propose a novel location privacy protection scheme based on a Q-learning particle swarm optimization (QLPSO) algorithm for mobile crowdsensing. By generalizing tasks, the algorithm makes an attacker unable to distinguish the specific tasks completed by users, cuts off the association between users and tasks, and thereby protects users' location privacy. The strategy uses Q-learning to continuously combine different confounding tasks and trains a confounding-task scheme that yields the lowest rejection rate. The Q-learning method is further improved with a particle swarm optimization algorithm, which strengthens its optimization ability. Experimental results show that the scheme performs well in privacy budget error, availability, and cloud timeliness, and greatly improves the security of user location data; in terms of inhibition ratio, the value is close to the optimal.
Keywords: Q-learning; RLBS; crowdsensing service; location privacy protection; particle swarm optimization
Year: 2022 PMID: 36247361 PMCID: PMC9561907 DOI: 10.3389/fnbot.2022.981390
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 3.493
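The abstract's core mechanism is hiding the user's real task inside a set of confounding tasks so an observer cannot link user and task. A minimal sketch of that idea follows; the function name and task representation are illustrative assumptions, and k = 6 is taken from the anonymous set size in the experimental parameters table below.

```python
import random

def generalize_task(real_task, region_tasks, k=6):
    """Return an anonymous set of k tasks that contains the real one.

    Illustrative sketch only: decoy tasks are drawn from the same
    anonymous region so the real task is indistinguishable among them.
    """
    decoys = random.sample([t for t in region_tasks if t != real_task], k - 1)
    anonymous_set = decoys + [real_task]
    random.shuffle(anonymous_set)  # hide the real task's position in the set
    return anonymous_set
```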
Figure 1. System framework.
Figure 2. Framework of Q-learning.
Decision space state.
| Range | State | Number |
|---|---|---|
| 0 ≤ d < 0.25 | Nearest | 1 |
| 0.25 ≤ d < 0.5 | Near | 2 |
| 0.5 ≤ d < 0.75 | Far | 3 |
| 0.75 ≤ d ≤ 1 | Furthest | 4 |
Target space state.
| Range | State | Number |
|---|---|---|
| 0 ≤ f < 0.25 | Minimum | 1 |
| 0.25 ≤ f < 0.5 | Small | 2 |
| 0.5 ≤ f < 0.75 | Large | 3 |
| 0.75 ≤ f ≤ 1 | Maximum | 4 |
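Both tables discretize a normalized quantity into four bins. A minimal sketch, assuming quartile bins on [0, 1] and assuming that the joint Q-learning state pairs the decision-space bin with the target-space bin (the pairing itself is an assumption, not stated above):

```python
def discretize(x):
    """Map a normalized value in [0, 1] to states 1-4 per the tables above."""
    if x < 0.25:
        return 1   # Nearest / Minimum
    if x < 0.5:
        return 2   # Near / Small
    if x < 0.75:
        return 3   # Far / Large
    return 4       # Furthest / Maximum

def q_state(distance, fitness):
    """Hypothetical joint state: (decision-space bin, target-space bin)."""
    return (discretize(distance), discretize(fitness))
```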
Relationship between particle action and parameter value.
| Search mode | Action | Number | ω | c1 | c2 |
|---|---|---|---|---|---|
| Global search | Large-range search | 1 | 1.0 | 2.5 | 0.5 |
| | Small-range search | 2 | 0.8 | 2.0 | 1.0 |
| | Slow convergence | 3 | 0.5 | 1.0 | 2.0 |
| | Fast convergence | 4 | 0.4 | 0.5 | 2.5 |
| Local search | | 5 | 0 | 0 | 3.0 |
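The table maps each Q-learning action to a PSO parameter triple. Reading the three numeric columns as the standard PSO inertia weight ω and acceleration coefficients c1 (cognitive) and c2 (social) is an assumption, but the values fit that interpretation: large-range search keeps high inertia and a strong pull toward each particle's own best, while local search is purely gbest-driven. A sketch of the per-action velocity update under that assumption:

```python
import random

ACTION_PARAMS = {            # action number: (omega, c1, c2) from the table
    1: (1.0, 2.5, 0.5),      # global, large-range search
    2: (0.8, 2.0, 1.0),      # global, small-range search
    3: (0.5, 1.0, 2.0),      # global, slow convergence
    4: (0.4, 0.5, 2.5),      # global, fast convergence
    5: (0.0, 0.0, 3.0),      # local search
}

def update_velocity(v, x, pbest, gbest, action):
    """Standard PSO velocity update with action-selected parameters."""
    omega, c1, c2 = ACTION_PARAMS[action]
    r1, r2 = random.random(), random.random()
    return (omega * v
            + c1 * r1 * (pbest - x)    # pull toward the particle's own best
            + c2 * r2 * (gbest - x))   # pull toward the swarm's best
```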
QLPSO-based inhibition rate optimization with experience replay.
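The body of the QLPSO routine named above did not survive extraction. Below is a hedged reconstruction of its likely shape from the surrounding text: ε-greedy Q-learning picks one of the five PSO parameter sets per step, transitions are stored in a replay buffer, and sampled minibatches update the Q-table. The names, batch size, ε value, and the reward definition (the paper ties reward to the inhibition rate) are assumptions; α and γ come from the experimental parameters table below.

```python
import random
from collections import defaultdict, deque

ACTIONS = [1, 2, 3, 4, 5]   # the five PSO parameter sets from the earlier table
ALPHA, GAMMA = 0.02, 0.8    # learning rate and discount per the parameter table
EPSILON, BATCH = 0.1, 32    # exploration rate and batch size: assumed values

Q = defaultdict(float)           # tabular Q-values keyed by (state, action)
replay = deque(maxlen=10_000)    # experience replay buffer

def choose_action(state):
    """Epsilon-greedy selection over the five parameter sets."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def train_step(state, env_step):
    """One QLPSO step: act, observe, store, then replay a minibatch."""
    action = choose_action(state)
    next_state, reward = env_step(action)   # caller runs one PSO iteration
    replay.append((state, action, reward, next_state))
    if len(replay) >= BATCH:
        for s, a, r, s2 in random.sample(list(replay), BATCH):
            target = r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS)
            Q[(s, a)] += ALPHA * (target - Q[(s, a)])
    return next_state
```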
Experimental environment.
| Item | Configuration |
|---|---|
| Processor | Intel Core i7-4790 @ 3.6 GHz |
| Memory | 32 GB |
| Video card | 1060 Ti |
| RAM | 16 GB |
| Solid-state drive (SSD) | 128 GB |
| System | Ubuntu 16.04 |
| Programming language | Python |
Experimental parameters.
| Parameter | Value |
|---|---|
| Total number of training fragments M | 3,000 |
| Number of tasks submitted by users before the submission deadline n | 10 |
| Action selection probability ε | |
| Learning parameter α | 0.02 |
| Reward coefficient δ | 10⁻³ |
| Discount factor γ | 0.8 |
| Number of users in an anonymous zone | 130 |
| User task completion rate per period of time | 0.9 |
| Number of anonymous area tasks | 255 |
| Anonymous set size | 6 |
| Task area size | 4,000 m × 4,000 m |
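For reference, the parameter table can be mirrored as a plain configuration dict (illustrative naming; ε is omitted because its value is not given above):

```python
# Experiment configuration from the parameter table; keys are illustrative.
CONFIG = {
    "episodes_M": 3000,
    "tasks_per_user_n": 10,
    "learning_rate_alpha": 0.02,
    "reward_coeff_delta": 1e-3,
    "discount_gamma": 0.8,
    "users_per_anonymous_zone": 130,
    "task_completion_rate": 0.9,
    "anonymous_area_tasks": 255,
    "anonymous_set_size": 6,
    "task_area_m": (4000, 4000),   # 4,000 m x 4,000 m
}
```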
The effect of learning rate on average inhibition rate (AIR) (%).
| Learning rate | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 |
|---|---|---|---|---|---|
| AIR | 10.2 | 10.2 | 10.2 | 15.3 | 15.6 |
The effect of training number on average reward value (%).
| Episode M | 1,000 | 2,000 | 3,000 | 4,000 |
|---|---|---|---|---|
| OPT | 92.5 | 92.5 | 92.5 | 92.5 |
| QLPSO | 70.2 | 83.6 | 86.7 | 91.6 |
Figure 3. Effect of user number on inhibition rate.
Figure 4. Effect of tolerance delay on inhibition rate.