
Human reinforcement learning subdivides structured action spaces by learning effector-specific values.

Samuel J Gershman, Bijan Pesaran, Nathaniel D Daw.

Abstract

Humans and animals are endowed with a large number of effectors. Although this enables great behavioral flexibility, it presents an equally formidable reinforcement learning problem of discovering which actions are most valuable because of the high dimensionality of the action space. An unresolved question is how neural systems for reinforcement learning, such as prediction error signals for action valuation associated with dopamine and the striatum, can cope with this "curse of dimensionality." We propose a reinforcement learning framework that allows for learned action valuations to be decomposed into effector-specific components when appropriate to a task, and test it by studying to what extent human behavior and blood oxygen level-dependent (BOLD) activity can exploit such a decomposition in a multieffector choice task. Subjects made simultaneous decisions with their left and right hands and received separate reward feedback for each hand movement. We found that choice behavior was better described by a learning model that decomposed the values of bimanual movements into separate values for each effector, rather than a traditional model that treated the bimanual actions as unitary with a single value. A decomposition of value into effector-specific components was also observed in value-related BOLD signaling, in the form of lateralized biases in striatal correlates of prediction error and anticipatory value correlates in the intraparietal sulcus. These results suggest that the human brain can use decomposed value representations to "divide and conquer" reinforcement learning over high-dimensional action spaces.
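The contrast between the two models in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the update rule, so a simple delta rule with learning rate `alpha` is assumed here, as is the choice of two options per hand. The class names `UnitaryLearner` and `DecomposedLearner` are hypothetical.

```python
# Illustrative sketch only. Assumptions (not from the abstract): a delta-rule
# update with learning rate alpha, and two options per hand.

class UnitaryLearner:
    """Treats each (left, right) pair as one action with a single value."""

    def __init__(self, n_left, n_right, alpha):
        self.alpha = alpha
        self.q = {(l, r): 0.0 for l in range(n_left) for r in range(n_right)}

    def update(self, left, right, reward_left, reward_right):
        # One prediction error on the summed reward of the bimanual action.
        delta = (reward_left + reward_right) - self.q[(left, right)]
        self.q[(left, right)] += self.alpha * delta
        return delta

    def joint_value(self, left, right):
        return self.q[(left, right)]


class DecomposedLearner:
    """Keeps a separate value table, and prediction error, for each hand."""

    def __init__(self, n_left, n_right, alpha):
        self.alpha = alpha
        self.q_left = [0.0] * n_left
        self.q_right = [0.0] * n_right

    def update(self, left, right, reward_left, reward_right):
        # Each hand's value is updated only by that hand's own reward.
        delta_left = reward_left - self.q_left[left]
        delta_right = reward_right - self.q_right[right]
        self.q_left[left] += self.alpha * delta_left
        self.q_right[right] += self.alpha * delta_right
        return delta_left, delta_right

    def joint_value(self, left, right):
        # Value of a bimanual action is the sum of its effector components.
        return self.q_left[left] + self.q_right[right]


unitary = UnitaryLearner(n_left=2, n_right=2, alpha=0.5)
decomposed = DecomposedLearner(n_left=2, n_right=2, alpha=0.5)

# One trial: left hand picks option 0 (rewarded), right hand picks option 1 (not).
unitary.update(0, 1, reward_left=1.0, reward_right=0.0)
decomposed.update(0, 1, reward_left=1.0, reward_right=0.0)

# The pair (0, 0) has never been tried. Only the decomposed learner
# carries the left hand's experience over to it.
print(unitary.joint_value(0, 0), decomposed.joint_value(0, 0))
```

Under these assumptions the unitary learner stores one value per combination (`n_left * n_right` entries), while the decomposed learner stores one per effector option (`n_left + n_right` entries), which is the sense in which decomposition eases the dimensionality problem.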


Year: 2009; PMID: 19864565; PMCID: PMC2796632; DOI: 10.1523/JNEUROSCI.2469-09.2009

Source DB: PubMed; Journal: J Neurosci; ISSN: 0270-6474; Impact factor: 6.167


