
Multiple model-based reinforcement learning.

Kenji Doya, Kazuyuki Samejima, Ken-ichi Katagiri, Mitsuo Kawato.

Abstract

We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The "responsibility signal," which is given by the softmax function of the prediction errors, is used to weight the outputs of the modules and to gate the learning of both the prediction models and the reinforcement learning controllers. We formulate MMRL for both the discrete-time, finite-state case and the continuous-time, continuous-state case. The performance of MMRL was demonstrated for the discrete case in a nonstationary hunting task in a grid world, and for the continuous case in a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
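The role of the responsibility signal can be illustrated with a minimal sketch. The abstract states only that the signal is a softmax of the prediction errors; the specific form below (negative squared error scaled by a width parameter `sigma`) and the names `responsibility`, `blend_actions`, and `gated_model_update` are assumptions made for illustration, not the authors' implementation.

```python
import math


def responsibility(errors, sigma=1.0):
    """Softmax over negative scaled squared prediction errors.

    Assumed form: lambda_i proportional to exp(-e_i^2 / (2 * sigma^2)).
    A module that predicts well (small error) gets a large weight.
    """
    scores = [-(e ** 2) / (2.0 * sigma ** 2) for e in errors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def blend_actions(actions, weights):
    """Weight the scalar outputs of the module controllers."""
    return sum(w * a for w, a in zip(weights, actions))


def gated_model_update(params, targets, weights, lr=0.1):
    """Move each module's (scalar) parameter toward its target,
    scaled by that module's responsibility (hypothetical update rule)."""
    return [p + lr * w * (t - p) for p, t, w in zip(params, targets, weights)]


# Three hypothetical modules with different prediction errors.
errors = [0.1, 1.0, 2.0]
lam = responsibility(errors, sigma=0.5)
action = blend_actions([1.0, -1.0, 0.5], lam)
```

In this sketch the module with the smallest prediction error dominates both the blended action and the learning update, which is how the abstract describes the decomposition of a task by predictability. A smaller `sigma` makes the selection sharper; a larger one spreads responsibility more evenly across modules.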

Year: 2002; PMID: 12020450; DOI: 10.1162/089976602753712972

Source DB: PubMed; Journal: Neural Comput; ISSN: 0899-7667; Impact factor: 2.026

