Short-term memory traces for action bias in human reinforcement learning.

Rafal Bogacz, Samuel M McClure, Jian Li, Jonathan D Cohen, P Read Montague.

Abstract

Recent experimental and theoretical work on reinforcement learning has shed light on the neural bases of learning from rewards and punishments. One fundamental problem in reinforcement learning is the credit assignment problem, or how to properly assign credit to actions that lead to reward or punishment following a delay. Temporal difference learning solves this problem, but its efficiency can be significantly improved by the addition of eligibility traces (ET). In essence, ETs function as decaying memories of previous choices that are used to scale synaptic weight changes. It has been shown in theoretical studies that ETs spanning a number of actions may improve the performance of reinforcement learning. However, it remains an open question whether including ETs that persist over sequences of actions allows reinforcement learning models to better fit empirical data regarding the behaviors of humans and other animals. Here, we report an experiment in which human subjects performed a sequential economic decision game in which the long-term optimal strategy differed from the strategy that leads to the greatest short-term return. We demonstrate that human subjects' performance in the task is significantly affected by the time between choices in a surprising and seemingly counterintuitive way. However, this behavior is naturally explained by a temporal difference learning model which includes ETs persisting across actions. Furthermore, we review recent findings that suggest that short-term synaptic plasticity in dopamine neurons may provide a realistic biophysical mechanism for producing ETs that persist on a timescale consistent with behavioral observations.
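The abstract describes eligibility traces as decaying memories of previous choices that scale weight changes. The following is a minimal illustrative sketch of that idea for a stateless two-choice task, not the model fitted in the paper. The function name `update_with_traces`, the replacing-trace rule, and the parameters `alpha` (learning rate) and `lam` (per-trial trace decay) are assumptions made for illustration.

```python
def update_with_traces(q, e, action, reward, alpha=0.1, lam=0.5):
    """One trial of a value update with traces that persist across actions.

    q: list of action values; e: list of eligibility traces (same length).
    Hypothetical simplification: no states, replacing traces, and a
    prediction error computed against the chosen action's value only.
    """
    # Decay every trace, then mark the chosen action as fully eligible.
    e = [lam * x for x in e]
    e[action] = 1.0
    # Prediction error for the chosen action.
    delta = reward - q[action]
    # Each action's value moves in proportion to its current trace, so
    # recently chosen actions share credit for the present outcome.
    q = [qa + alpha * delta * ea for qa, ea in zip(q, e)]
    return q, e, delta


q, e = [0.0, 0.0], [0.0, 0.0]
q, e, _ = update_with_traces(q, e, action=0, reward=1.0)
q, e, _ = update_with_traces(q, e, action=1, reward=1.0)
```

After the second trial, action 0 is updated again even though action 1 was chosen, because its trace has decayed to `lam` rather than to zero. With `lam=0` the sketch reduces to an update of the chosen action only.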

Year: 2007 PMID: 17459346 DOI: 10.1016/j.brainres.2007.03.057

Source DB: PubMed Journal: Brain Res ISSN: 0006-8993 Impact factor: 3.252

Cited

29 in total

1. With age comes wisdom: decision making in younger and older adults.

Authors: Darrell A Worthy; Marissa A Gorlick; Jennifer L Pacheco; David M Schnyer; W Todd Maddox
Journal: Psychol Sci Date: 2011-09-29

Review 2. Navigating complex decision spaces: Problems and paradigms in sequential choice.

Authors: Matthew M Walsh; John R Anderson
Journal: Psychol Bull Date: 2013-07-08 Impact factor: 17.737

Review 3. Neurocomputational mechanisms of reinforcement-guided learning in humans: a review.

Authors: Michael X Cohen
Journal: Cogn Affect Behav Neurosci Date: 2008-06 Impact factor: 3.282

4. A simple computational algorithm of model-based choice preference.

Authors: Asako Toyama; Kentaro Katahira; Hideki Ohira
Journal: Cogn Affect Behav Neurosci Date: 2017-08 Impact factor: 3.282

5. To not settle for small losses: evidence for an ecological aspiration level of zero in dynamic decision-making.

Authors: Bo Pang; Nathaniel J Blanco; W Todd Maddox; Darrell A Worthy
Journal: Psychon Bull Rev Date: 2017-04

6. Optimizing vs. matching: response strategy in a probabilistic learning task is associated with negative symptoms of schizophrenia.

Authors: Zuzana Kasanova; James A Waltz; Gregory P Strauss; Michael J Frank; James M Gold
Journal: Schizophr Res Date: 2011-01-15 Impact factor: 4.939

7. How instructed knowledge modulates the neural systems of reward learning.

8. Learning from delayed feedback: neural responses in temporal credit assignment.

9. Scaffolding across the lifespan in history-dependent decision-making.

10. Short-term gains, long-term pains: how cues about state aid learning in dynamic environments.