Human reinforcement learning subdivides structured action spaces by learning effector-specific values.

Samuel J Gershman, Bijan Pesaran, Nathaniel D Daw.

Abstract

Humans and animals are endowed with a large number of effectors. Although this enables great behavioral flexibility, it presents an equally formidable reinforcement learning problem of discovering which actions are most valuable because of the high dimensionality of the action space. An unresolved question is how neural systems for reinforcement learning (such as prediction error signals for action valuation associated with dopamine and the striatum) can cope with this "curse of dimensionality." We propose a reinforcement learning framework that allows for learned action valuations to be decomposed into effector-specific components when appropriate to a task, and test it by studying to what extent human behavior and blood oxygen level-dependent (BOLD) activity can exploit such a decomposition in a multieffector choice task. Subjects made simultaneous decisions with their left and right hands and received separate reward feedback for each hand movement. We found that choice behavior was better described by a learning model that decomposed the values of bimanual movements into separate values for each effector, rather than a traditional model that treated the bimanual actions as unitary with a single value. A decomposition of value into effector-specific components was also observed in value-related BOLD signaling, in the form of lateralized biases in striatal correlates of prediction error and anticipatory value correlates in the intraparietal sulcus. These results suggest that the human brain can use decomposed value representations to "divide and conquer" reinforcement learning over high-dimensional action spaces.
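The contrast between the two learning models in the abstract can be illustrated with a minimal sketch. It is not the authors' fitted model: the delta-rule update, the learning rate `alpha`, the summing of the two rewards in the unitary learner, and the class names `UnitaryLearner` and `DecomposedLearner` are all assumptions made for illustration, and the choice rule is omitted.

```python
class UnitaryLearner:
    """Assumed baseline: one value per joint (left, right) action."""

    def __init__(self, n_left, n_right, alpha=0.5):
        self.alpha = alpha  # learning rate (illustrative value)
        # n_left * n_right separate entries must be learned.
        self.q = {(l, r): 0.0 for l in range(n_left) for r in range(n_right)}

    def update(self, left, right, reward_left, reward_right):
        # Single prediction error on the summed reward (an assumption).
        delta = (reward_left + reward_right) - self.q[(left, right)]
        self.q[(left, right)] += self.alpha * delta
        return delta

    def joint_value(self, left, right):
        return self.q[(left, right)]


class DecomposedLearner:
    """Assumed alternative: separate values for each effector."""

    def __init__(self, n_left, n_right, alpha=0.5):
        self.alpha = alpha
        # Only n_left + n_right entries must be learned.
        self.q_left = [0.0] * n_left
        self.q_right = [0.0] * n_right

    def update(self, left, right, reward_left, reward_right):
        # One prediction error per hand, each driven by that hand's reward.
        delta_left = reward_left - self.q_left[left]
        delta_right = reward_right - self.q_right[right]
        self.q_left[left] += self.alpha * delta_left
        self.q_right[right] += self.alpha * delta_right
        return delta_left, delta_right

    def joint_value(self, left, right):
        # Value of a bimanual action is the sum of its effector values.
        return self.q_left[left] + self.q_right[right]


unitary = UnitaryLearner(n_left=2, n_right=2)
decomposed = DecomposedLearner(n_left=2, n_right=2)

# One trial: left hand picks option 0 (rewarded), right hand picks option 1 (not rewarded).
unitary.update(0, 1, reward_left=1.0, reward_right=0.0)
decomposed.update(0, 1, reward_left=1.0, reward_right=0.0)
```

After this single trial, the decomposed learner already assigns value to the untried joint action (0, 0), because the left-hand value carries over, whereas the unitary learner has learned nothing about it. This is the sense in which decomposition eases learning when the number of joint actions grows multiplicatively with the number of effectors.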

Substances: Oxygen

Year: 2009 PMID: 19864565 PMCID: PMC2796632 DOI: 10.1523/JNEUROSCI.2469-09.2009

Source DB: PubMed Journal: J Neurosci ISSN: 0270-6474 Impact factor: 6.167

References (55 in total)

1. Temporal difference models and reward-related learning in the human brain.

Authors: John P O'Doherty; Peter Dayan; Karl Friston; Hugo Critchley; Raymond J Dolan
Journal: Neuron Date: 2003-04-24 Impact factor: 17.173

2. Functional organization of human intraparietal and frontal cortex for attending, looking, and pointing.

Authors: Serguei V Astafiev; Gordon L Shulman; Christine M Stanley; Abraham Z Snyder; David C Van Essen; Maurizio Corbetta
Journal: J Neurosci Date: 2003-06-01 Impact factor: 6.167

Review 3. Mechanisms of selection and guidance of reaching movements in the parietal lobe.

Authors: John F Kalaska; Paul Cisek; Nadia Gosselin-Kessiby
Journal: Adv Neurol Date: 2003

4. Matching behavior and the representation of value in the parietal cortex.

Authors: Leo P Sugrue; Greg S Corrado; William T Newsome
Journal: Science Date: 2004-06-18 Impact factor: 47.728

5. Parietal neurons related to memory-guided hand manipulation.

Authors: A Murata; V Gallese; M Kaseda; H Sakata
Journal: J Neurophysiol Date: 1996-05 Impact factor: 2.714

6. The Psychophysics Toolbox.

Authors: D H Brainard
Journal: Spat Vis Date: 1997

7. Modular decomposition in visuomotor learning.

Authors: Z Ghahramani; D M Wolpert
Journal: Nature Date: 1997-03-27 Impact factor: 49.962

8. Functional magnetic resonance imaging of macaque monkeys performing visually guided saccade tasks: comparison of cortical eye fields with humans.

Authors: Minoru Koyama; Isao Hasegawa; Takahiro Osada; Yusuke Adachi; Kiyoshi Nakahara; Yasushi Miyashita
Journal: Neuron Date: 2004-03-04 Impact factor: 17.173

Review 9. Predictive reward signal of dopamine neurons.

Authors: W Schultz
Journal: J Neurophysiol Date: 1998-07 Impact factor: 2.714

10. FMRI evidence for a 'parietal reach region' in the human brain.

Authors: Jason D Connolly; Richard A Andersen; Melvyn A Goodale
Journal: Exp Brain Res Date: 2003-09-04 Impact factor: 1.972

Cited by (58 in total)

Review 1. Reinforcement learning models and their neural correlates: An activation likelihood estimation meta-analysis.

Authors: Henry W Chase; Poornima Kumar; Simon B Eickhoff; Alexandre Y Dombrovski
Journal: Cogn Affect Behav Neurosci Date: 2015-06 Impact factor: 3.282

2. A reinforcement learning mechanism responsible for the valuation of free choice.

Authors: Jeffrey Cockburn; Anne G E Collins; Michael J Frank
Journal: Neuron Date: 2014-07-24 Impact factor: 17.173

3. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis.

Authors: Michael J Frank; David Badre
Journal: Cereb Cortex Date: 2011-06-21 Impact factor: 5.357

4. Mechanisms of hierarchical reinforcement learning in cortico-striatal circuits 2: evidence from fMRI.

Authors: David Badre; Michael J Frank
Journal: Cereb Cortex Date: 2011-06-21 Impact factor: 5.357

5. Effort, success, and nonuse determine arm choice.

Authors: Nicolas Schweighofer; Yupeng Xiao; Sujin Kim; Toshinori Yoshioka; James Gordon; Rieko Osu
Journal: J Neurophysiol Date: 2015-05-06 Impact factor: 2.714

6. Causal Inference About Good and Bad Outcomes.

Authors: Hayley M Dorfman; Rahul Bhui; Brent L Hughes; Samuel J Gershman
Journal: Psychol Sci Date: 2019-02-13

7. Action selection in multi-effector decision making.

Authors: Seth Madlon-Kay; Bijan Pesaran; Nathaniel D Daw
Journal: Neuroimage Date: 2012-12-07 Impact factor: 6.556

8. Hierarchical learning induces two simultaneous, but separable, prediction errors in human basal ganglia.

Authors: Carlos Diuk; Karin Tsai; Jonathan Wallis; Matthew Botvinick; Yael Niv
Journal: J Neurosci Date: 2013-03-27 Impact factor: 6.167

9. Learning to represent reward structure: a key to adapting to complex environments.

Authors: Hiroyuki Nakahara; Okihide Hikosaka
Journal: Neurosci Res Date: 2012-10-13 Impact factor: 3.304

10. Motor preparatory activity in posterior parietal cortex is modulated by subjective absolute value.

Authors: Asha Iyer; Axel Lindner; Igor Kagan; Richard A Andersen
Journal: PLoS Biol Date: 2010-08-03 Impact factor: 8.029