A neural network model with dopamine-like reinforcement signal that learns a spatial delayed response task.

R E Suri, W Schultz.

Abstract

This study investigated how the simulated response of dopamine neurons to reward-related stimuli could be used as reinforcement signal for learning a spatial delayed response task. Spatial delayed response tasks assess the functions of frontal cortex and basal ganglia in short-term memory, movement preparation and expectation of environmental events. In these tasks, a stimulus appears for a short period at a particular location, and after a delay the subject moves to the location indicated. Dopamine neurons are activated by unpredicted rewards and reward-predicting stimuli, are not influenced by fully predicted rewards, and are depressed by omitted rewards. Thus, they appear to report an error in the prediction of reward, which is the crucial reinforcement term in formal learning theories. Theoretical studies on reinforcement learning have shown that signals similar to dopamine responses can be used as effective teaching signals for learning. A neural network model implementing the temporal difference algorithm was trained to perform a simulated spatial delayed response task. The reinforcement signal was modeled according to the basic characteristics of dopamine responses to novel stimuli, primary rewards and reward-predicting stimuli. A Critic component analogous to dopamine neurons computed a temporal error in the prediction of reinforcement and emitted this signal to an Actor component which mediated the behavioral output. The spatial delayed response task was learned via two subtasks introducing spatial choices and temporal delays, in the same manner as monkeys in the laboratory. In all three tasks, the reinforcement signal of the Critic developed in a similar manner to the responses of natural dopamine neurons in comparable learning situations, and the learning curves of the Actor replicated the progress of learning observed in the animals. 
Several manipulations demonstrated further the efficacy of the particular characteristics of the dopamine-like reinforcement signal. Omission of reward induced a phasic reduction of the reinforcement signal at the time of the reward and led to extinction of learned actions. A reinforcement signal without prediction error resulted in impaired learning because of perseverative errors. Loss of learned behavior was seen with sustained reductions of the reinforcement signal, a situation in general comparable to the loss of dopamine innervation in Parkinsonian patients and experimentally lesioned animals. The striking similarities in teaching signals and learning behavior between the computational and biological results suggest that dopamine-like reward responses may serve as effective teaching signals for learning behavioral tasks that are typical for primate cognitive behavior, such as spatial delayed responding.
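The prediction-error behavior described in the abstract can be illustrated with a minimal temporal difference sketch. This is not the authors' network: it covers only a Critic-like value estimate, omits the Actor and the novelty responses, and assumes a simple one-weight-per-time-step stimulus representation. The function names (`value`, `run_trial`), the trial timing, and the parameters `gamma` and `alpha` are hypothetical choices made for illustration.

```python
def value(w, t, stim_t):
    """Predicted future reward at time t; zero before the stimulus appears."""
    return w[t - stim_t] if t >= stim_t else 0.0


def run_trial(w, stim_t, reward_t, n_steps, reward=1.0,
              gamma=1.0, alpha=0.2, learn=True):
    """Run one trial and return the TD error registered at each time step.

    w holds one weight per time step since stimulus onset (an assumed
    representation, not taken from the paper). w is updated in place
    when learn is True.
    """
    deltas = [0.0] * n_steps
    for t in range(n_steps - 1):
        # Reward arrives on the transition into time step reward_t.
        r = reward if t + 1 == reward_t else 0.0
        # TD error: received reward plus change in predicted reward.
        delta = r + gamma * value(w, t + 1, stim_t) - value(w, t, stim_t)
        deltas[t + 1] = delta
        # Only time steps after stimulus onset have a weight to adjust.
        if learn and t >= stim_t:
            w[t - stim_t] += alpha * delta
    return deltas


# Assumed trial layout: stimulus at step 2, reward at step 6, 10 steps total.
STIM_T, REWARD_T, N_STEPS = 2, 6, 10
w = [0.0] * N_STEPS

# First trial: the reward is unpredicted.
naive = run_trial(w, STIM_T, REWARD_T, N_STEPS)

# Repeated pairing of stimulus and reward.
for _ in range(500):
    run_trial(w, STIM_T, REWARD_T, N_STEPS)

# Probe trials without further learning.
trained = run_trial(w, STIM_T, REWARD_T, N_STEPS, learn=False)
omitted = run_trial(w, STIM_T, REWARD_T, N_STEPS, reward=0.0, learn=False)

print("naive, at reward:   ", round(naive[REWARD_T], 3))
print("trained, at stimulus:", round(trained[STIM_T], 3))
print("trained, at reward:  ", round(trained[REWARD_T], 3))
print("omitted, at reward:  ", round(omitted[REWARD_T], 3))
```

Under these assumptions the error signal is positive at the reward on the first trial, moves to the reward-predicting stimulus after training, is near zero for the fully predicted reward, and turns negative at the expected reward time when the reward is omitted, which matches the qualitative pattern the abstract attributes to dopamine neurons.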

Year: 1999   PMID: 10391468   DOI: 10.1016/s0306-4522(98)00697-6

Source DB: PubMed   Journal: Neuroscience   ISSN: 0306-4522   Impact factor: 3.590
