Shaofei Chen, Feng Wu, Lincheng Shen, Jing Chen, Sarvapali D Ramchurn.
Abstract
We investigate a multi-agent patrolling problem in which information is distributed alongside threats in uncertain environments. Specifically, the information and the threat at each location are independently modelled as multi-state Markov chains, whose states are not observed until the location is visited by an agent. While agents obtain information at a location, they may also suffer damage from the threat at that location. The goal of the agents is therefore to gather as much information as possible while mitigating the damage incurred. To address this challenge, we formulate the single-agent patrolling problem as a Partially Observable Markov Decision Process (POMDP) and propose a computationally efficient algorithm to solve this model. Building upon this, to compute patrols for multiple agents, the single-agent algorithm is extended so that each agent aims to maximise its marginal contribution to the team. We empirically evaluate our algorithm on multi-agent patrolling problems and show that it outperforms a baseline algorithm by up to 44% for 10 agents and by 21% for 15 agents in large domains.
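The idea of each agent maximising its marginal contribution to the team can be illustrated with a small sketch. This is not the algorithm from the paper: it is a toy sequential greedy assignment, and `team_value`, `assign_by_marginal_contribution`, the candidate routes, and all numeric values are hypothetical names and numbers chosen for illustration. It assumes that a vertex's information counts once for the team, while damage is paid on every visit.

```python
def team_value(routes, info, damage):
    """Toy team value (assumption): information at a vertex counts once,
    damage is paid by every agent that visits the vertex."""
    visited = set().union(*routes)
    gain = sum(info[v] for v in visited)
    cost = sum(damage[v] for route in routes for v in route)
    return gain - cost


def assign_by_marginal_contribution(candidates, n_agents, info, damage):
    """Agents choose one after another; each picks the candidate route that
    adds the most to the value of the routes already chosen."""
    chosen = []
    for _ in range(n_agents):
        base = team_value(chosen, info, damage)
        best = max(
            candidates,
            key=lambda r: team_value(chosen + [r], info, damage) - base,
        )
        chosen.append(best)
    return chosen


# Hypothetical three-vertex instance.
info = {"a": 5.0, "b": 4.0, "c": 3.0}
damage = {"a": 1.0, "b": 2.0, "c": 0.5}
candidates = [("a", "b"), ("a", "c"), ("b", "c")]

plan = assign_by_marginal_contribution(candidates, 2, info, damage)
print(plan, team_value(plan, info, damage))
```

In this toy instance the second agent avoids repeating the first agent's route, because information already collected adds nothing further while the damage would be paid again.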
Year: 2015 PMID: 26086946 PMCID: PMC4472811 DOI: 10.1371/journal.pone.0130154
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Example of information and threat models at a vertex.
(a) A threat model with 2 states (R1 and R2) and (b) an information model with 3 states (I1, I2 and I3). The probability that each information or threat state changes to another over one time step is given (e.g., the probability that R1 changes to R2 is 0.1).
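Because a vertex's state is hidden between visits, an agent can only track a probability distribution (belief) over its states. The following is a minimal sketch of propagating such a belief through a two-state threat chain. Only the R1 to R2 probability of 0.1 comes from the caption above; the R2 row of the matrix and the helper name `propagate` are assumptions made for illustration.

```python
def propagate(belief, transition, steps=1):
    """Advance a belief over Markov-chain states by `steps` time steps.

    belief[i] is the probability of state i; transition[i][j] is the
    probability of moving from state i to state j in one step.
    """
    for _ in range(steps):
        belief = [
            sum(belief[i] * transition[i][j] for i in range(len(belief)))
            for j in range(len(transition[0]))
        ]
    return belief


# Two-state threat model (R1, R2).
THREAT_T = [
    [0.9, 0.1],  # from R1: P(R1 -> R2) = 0.1 as in the caption
    [0.2, 0.8],  # from R2: assumed values, not given in the source
]

# The agent has just visited the vertex and observed R1.
belief = [1.0, 0.0]
after_one = propagate(belief, THREAT_T, steps=1)
after_three = propagate(belief, THREAT_T, steps=3)
print(after_one, after_three)
```

The longer a vertex goes unvisited, the further the belief drifts from the last observation, which is what makes revisiting a location informative.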
Fig 2. Scenario of 15 agents patrolling.
Fig 3. Rewards in Scenario A.
Fig 4. Rewards in Scenario B.