

MOO-MDP: An Object-Oriented Representation for Cooperative Multiagent Reinforcement Learning.

Felipe Leno Da Silva, Ruben Glatt, Anna Helena Reali Costa.

Abstract

Reinforcement learning (RL) is a widely known technique to enable autonomous learning. Even though RL methods have achieved successes in increasingly large and complex problems, scaling solutions remains a challenge. One way to simplify (and consequently accelerate) learning is to exploit regularities in a domain, which allows generalization and reduction of the learning space. While object-oriented Markov decision processes (OO-MDPs) provide such generalization opportunities, we argue that the learning process may be further simplified by dividing the workload of tasks amongst multiple agents, solving problems as multiagent systems (MAS). In this paper, we propose a novel combination of OO-MDP and MAS, called multiagent OO-MDP (MOO-MDP). Our proposal accrues the benefits of both OO-MDP and MAS, better addressing scalability issues. We formalize the general MOO-MDP model and present an algorithm to solve deterministic cooperative MOO-MDPs. We show that our algorithm learns optimal policies while reducing the learning space by exploiting state abstractions. We experimentally compare our results with earlier approaches in three domains and evaluate the advantages of our approach in sample efficiency and memory requirements.


Year: 2017 PMID: 29990289 DOI: 10.1109/TCYB.2017.2781130

Source DB: PubMed Journal: IEEE Trans Cybern ISSN: 2168-2267 Impact factor: 11.448

Cited by

1 in total

1. Model Learning and Knowledge Sharing for Cooperative Multiagent Systems in Stochastic Environment.

Authors: Wei-Cheng Jiang; Vignesh Narayanan; Jr-Shin Li
Journal: IEEE Trans Cybern Date: 2021-12-22 Impact factor: 11.448
