
Stimulus representation and the timing of reward-prediction errors in models of the dopamine system.

Elliot A Ludvig, Richard S Sutton, E James Kehoe.

Abstract

The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances the correspondence between model and data in several experiments, including those in which rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
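The abstract's mechanism can be illustrated with a minimal sketch: a stimulus launches an exponentially decaying memory trace, a bank of Gaussian basis functions over that trace's height yields microstimuli that get weaker and broader with elapsed time, and linear TD(lambda) learns reward predictions over those features. All parameter values and function names below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def microstimuli(t, n_micro=10, decay=0.985, sigma=0.08):
    """Microstimulus levels t time steps after stimulus onset.

    A memory trace decays exponentially after onset; each Gaussian
    basis function is centered on a different trace height, so later
    microstimuli are weaker and more spread out in time (the trace
    moves more slowly through the lower centers).
    """
    y = decay ** t                                  # decaying memory trace
    centers = np.linspace(1.0, 0.0, n_micro, endpoint=False)
    return y * np.exp(-(y - centers) ** 2 / (2 * sigma ** 2))

def run_trial(w, reward_time=20, alpha=0.05, gamma=0.98, lam=0.95):
    """One trial of linear TD(lambda) over microstimulus features.

    The reward ends the trial (terminal state has zero features).
    Returns the TD error at each step -- the model's dopamine signal.
    """
    deltas = []
    e = np.zeros_like(w)                            # eligibility trace
    x_prev = microstimuli(0)
    for t in range(1, reward_time + 1):
        terminal = (t == reward_time)
        x = np.zeros_like(w) if terminal else microstimuli(t)
        r = 1.0 if terminal else 0.0
        delta = r + gamma * (w @ x) - w @ x_prev    # TD error
        e = gamma * lam * e + x_prev
        w += alpha * delta * e                      # update in place
        deltas.append(delta)
        x_prev = x
    return deltas

# On the first trial the TD error spikes at reward delivery; with
# training, the microstimulus features come to predict the reward
# and that terminal error shrinks.
w = np.zeros(10)
errors_first = run_trial(w)
for _ in range(300):
    errors_late = run_trial(w)
```

Because the Gaussian microstimuli overlap in time, predictions generalize across nearby moments, which is the temporal generalization the abstract contrasts with the complete serial compound's distinct per-moment representation.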


Year:  2008        PMID: 18624657     DOI: 10.1162/neco.2008.11-07-654

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


  48 in total

1.  Selective maintenance of value information helps resolve the exploration/exploitation dilemma.

Authors:  Michael N Hallquist; Alexandre Y Dombrovski
Journal:  Cognition       Date:  2018-11-28

2. (Review)  Reinforcement learning models and their neural correlates: An activation likelihood estimation meta-analysis.

Authors:  Henry W Chase; Poornima Kumar; Simon B Eickhoff; Alexandre Y Dombrovski
Journal:  Cogn Affect Behav Neurosci       Date:  2015-06       Impact factor: 3.282

3.  The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty.

Authors:  Clara Kwon Starkweather; Samuel J Gershman; Naoshige Uchida
Journal:  Neuron       Date:  2018-04-12       Impact factor: 17.173

4. (Review)  Striatal action-learning based on dopamine concentration.

Authors:  Genela Morris; Robert Schmidt; Hagai Bergman
Journal:  Exp Brain Res       Date:  2009-11-11       Impact factor: 1.972

5. (Review)  Reinforcement learning, conditioning, and the brain: Successes and challenges.

Authors:  Tiago V Maia
Journal:  Cogn Affect Behav Neurosci       Date:  2009-12       Impact factor: 3.282

6.  Alternative time representation in dopamine models.

Authors:  François Rivest; John F Kalaska; Yoshua Bengio
Journal:  J Comput Neurosci       Date:  2009-10-22       Impact factor: 1.621

7.  A model of interval timing by neural integration.

Authors:  Patrick Simen; Fuat Balci; Laura de Souza; Jonathan D Cohen; Philip Holmes
Journal:  J Neurosci       Date:  2011-06-22       Impact factor: 6.167

8.  Rethinking dopamine as generalized prediction error.

Authors:  Matthew P H Gardner; Geoffrey Schoenbaum; Samuel J Gershman
Journal:  Proc Biol Sci       Date:  2018-11-21       Impact factor: 5.349

9.  Learning to represent reward structure: a key to adapting to complex environments.

Authors:  Hiroyuki Nakahara; Okihide Hikosaka
Journal:  Neurosci Res       Date:  2012-10-13       Impact factor: 3.304

10.  Temporal-difference reinforcement learning with distributed representations.

Authors:  Zeb Kurth-Nelson; A David Redish
Journal:  PLoS One       Date:  2009-10-20       Impact factor: 3.240

