Inter-module credit assignment in modular reinforcement learning.

Kazuyuki Samejima, Kenji Doya, Mitsuo Kawato.

Abstract

Critical issues in modular or hierarchical reinforcement learning (RL) are (i) how to decompose a task into sub-tasks, (ii) how to achieve independence of learning of sub-tasks, and (iii) how to assure optimality of the composite policy for the entire task. Requirements (ii) and (iii) often trade off against each other. We propose a method for propagating the reward for achieving the entire task between modules. This is done in the form of a 'modular reward', which is calculated from the temporal difference of the module gating signal and the value of the succeeding module. We implement modular reward for a multiple model-based reinforcement learning (MMRL) architecture and show its effectiveness in simulations of a pursuit task with hidden states and a continuous-time non-linear control task.
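
The abstract does not give the formula for the modular reward, so the following is only an illustrative sketch of one way its description could be read, not the authors' definition. It assumes (hypothetically) that a module is rewarded when its gating signal drops between two time steps, in proportion to the value estimate of the module that takes over. The function name `modular_reward`, its arguments, and the rule for picking the succeeding module are all assumptions made for illustration.

```python
import numpy as np

def modular_reward(gating_now, gating_next, values_next, module):
    """Hypothetical hand-off reward for one module (illustrative only).

    gating_now, gating_next: gating signals of all modules at t and t+1.
    values_next: each module's value estimate for the state at t+1.
    module: index of the module receiving the reward.
    """
    gating_now = np.asarray(gating_now, dtype=float)
    gating_next = np.asarray(gating_next, dtype=float)
    values_next = np.asarray(values_next, dtype=float)

    # Temporal difference of this module's gating signal
    # (assumption: only a decrease, i.e. a hand-off, is rewarded).
    handoff = max(0.0, gating_now[module] - gating_next[module])

    # Assumption: the "succeeding module" is the other module with
    # the largest gating signal at the next time step.
    others = [j for j in range(len(gating_next)) if j != module]
    successor = max(others, key=lambda j: gating_next[j])

    # Reward the outgoing module with the successor's value estimate,
    # scaled by how much control it gave up.
    return handoff * values_next[successor]
```

For example, with gating signals `[0.9, 0.1]` followed by `[0.2, 0.8]` and next-step values `[1.0, 5.0]`, module 0 hands control to module 1 and receives a reward of about 0.7 × 5.0, while module 1, whose gating signal rose, receives zero.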

Year:  2003        PMID: 14692633     DOI: 10.1016/S0893-6080(02)00235-6

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  7 in total

1.  Computational models of reinforcement learning: the role of dopamine as a reward signal.

Authors:  R D Samson; M J Frank; Jean-Marc Fellous
Journal:  Cogn Neurodyn       Date:  2010-03-21       Impact factor: 5.082

2.  Visuomotor coordination and cortical connectivity of modular motor learning.

Authors:  Pablo I Burgos; Juan J Mariman; Scott Makeig; Gonzalo Rivera-Lillo; Pedro E Maldonado
Journal:  Hum Brain Mapp       Date:  2018-05-15       Impact factor: 5.038

3.  Shifting responsibly: the importance of striatal modularity to reinforcement learning in uncertain environments.

Authors:  Ken-Ichi Amemori; Leif G Gibb; Ann M Graybiel
Journal:  Front Hum Neurosci       Date:  2011-05-27       Impact factor: 3.169

4.  Credit assignment in multiple goal embodied visuomotor behavior.

Authors:  Constantin A Rothkopf; Dana H Ballard
Journal:  Front Psychol       Date:  2010-11-22

5.  From internal models toward metacognitive AI.

Authors:  Mitsuo Kawato; Aurelio Cortese
Journal:  Biol Cybern       Date:  2021-10       Impact factor: 2.086

6.  Temporal-difference reinforcement learning with distributed representations.

Authors:  Zeb Kurth-Nelson; A David Redish
Journal:  PLoS One       Date:  2009-10-20       Impact factor: 3.240

7.  Modeling sensory-motor decisions in natural behavior.

Authors:  Ruohan Zhang; Shun Zhang; Matthew H Tong; Yuchen Cui; Constantin A Rothkopf; Dana H Ballard; Mary M Hayhoe
Journal:  PLoS Comput Biol       Date:  2018-10-25       Impact factor: 4.475
