
Modular inverse reinforcement learning for visuomotor behavior.

Constantin A Rothkopf, Dana H Ballard.

Abstract

In a large variety of situations one would like to have an expressive and accurate model of observed animal or human behavior. While general-purpose mathematical models may successfully capture properties of observed behavior, it is desirable to root models in biological facts. Because of ample empirical evidence for reward-based learning in visuomotor tasks, we use a computational model based on the assumption that the observed agent is balancing the costs and benefits of its behavior to meet its goals. This leads to the framework of reinforcement learning, which additionally provides well-established algorithms for learning visuomotor task solutions. To quantify the agent's goals as rewards implicit in the observed behavior, we propose to use inverse reinforcement learning. Based on the assumption of a modular cognitive architecture, we introduce a modular inverse reinforcement learning algorithm that estimates the relative reward contributions of the component tasks in navigation, which consists of following a path while avoiding obstacles and approaching targets. We show how to recover the component reward weights for individual tasks and that variability in observed trajectories can be explained succinctly through behavioral goals. Simulations demonstrate that good estimates can be obtained with modest amounts of observation data, which in turn allows the prediction of behavior in novel configurations.
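The idea of recovering component reward weights from observed behavior can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes, purely for illustration, that each module supplies a Q-value table for a unit reward, that the agent chooses actions by a softmax over the weighted sum of those tables, and that the weights are fitted by maximum likelihood with plain gradient ascent. All function names, the random placeholder Q-values, and the "true" weights are hypothetical.

```python
import numpy as np

def softmax_policy(weights, module_q):
    """Action probabilities from a weighted sum of per-module Q-values.

    module_q has shape (n_modules, n_states, n_actions) (assumed layout).
    """
    q = np.tensordot(weights, module_q, axes=1)   # (n_states, n_actions)
    q = q - q.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(q)
    return p / p.sum(axis=1, keepdims=True)

def log_likelihood(weights, module_q, states, actions):
    """Log-probability of the observed (state, action) pairs."""
    p = softmax_policy(weights, module_q)
    return np.log(p[states, actions]).sum()

def estimate_weights(module_q, states, actions, lr=0.2, n_iter=3000):
    """Maximum-likelihood module weights via gradient ascent (assumed fit)."""
    w = np.zeros(module_q.shape[0])
    observed = module_q[:, states, actions]       # (n_modules, T)
    per_step_q = module_q[:, states, :]           # (n_modules, T, n_actions)
    for _ in range(n_iter):
        p = softmax_policy(w, module_q)[states]   # (T, n_actions)
        expected = np.einsum("ta,mta->mt", p, per_step_q)
        # Gradient of the mean log-likelihood: observed minus expected Q.
        w += lr * (observed - expected).mean(axis=1)
    return w

# Synthetic demonstration with placeholder numbers (not data from the paper).
rng = np.random.default_rng(0)
n_modules, n_states, n_actions, n_obs = 3, 20, 4, 5000
module_q = rng.normal(size=(n_modules, n_states, n_actions))
true_w = np.array([1.0, 0.5, 2.0])                # hypothetical weights

states = rng.integers(0, n_states, size=n_obs)
policy = softmax_policy(true_w, module_q)
actions = np.array([rng.choice(n_actions, p=policy[s]) for s in states])

w_hat = estimate_weights(module_q, states, actions)
print("true weights:     ", true_w)
print("estimated weights:", np.round(w_hat, 2))
```

Under these assumptions the log-likelihood is concave in the weights, so simple gradient ascent is enough for the sketch; a real analysis would also need per-module Q-values learned for the actual navigation tasks rather than random tables.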


Year:  2013        PMID: 23832417      PMCID: PMC3773182          DOI: 10.1007/s00422-013-0562-6

Source DB:  PubMed          Journal:  Biol Cybern        ISSN: 0340-1200            Impact factor:   2.086


