Dynamical model of salience gated working memory, action selection and reinforcement based on basal ganglia and dopamine feedback.

Adam Ponzi

Abstract

A simple working memory model based on recurrent network activation is proposed, and its application to the selection and reinforcement of an action is demonstrated as a solution to the temporal credit assignment problem. Reactivation of recent salient cue states is generated and maintained as a type of salience-gated, recurrently active working memory, while lower-salience distractors are ignored. Cue reactivation during the action selection period allows the cue to select an action, while its reactivation at the reward period allows reinforcement of the action selected by the reactivated state, which is necessarily the action that led to the reward being found. Down-gating of the external input during reactivation and maintenance prevents interference. A double winner-take-all system, which selects only one cue and only one action, allows the targeting of the cue-action allocation to be modified. This targeting works both to reinforce a correct cue-action allocation and to punish the allocation when cue-action allocations change. Here we suggest a firing-rate neural network implementation of this system based on basal ganglia anatomy, with input from a cortical association layer where reactivations are generated by signals from the thalamus. Striatal medium spiny neurons represent actions. Auto-catalytic feedback from a dopamine reward signal modulates three-way Hebbian long-term potentiation and depression at the cortical-striatal synapses, which represent the cue-action associations. The model is illustrated by numerical simulations of a simple example, that of associating a cue signal with a correct action to obtain reward after a delay period, typical of primate cue-reward tasks. Through learning, the model shows a transition from an exploratory phase, where actions are generated randomly, to a stable directed phase, where the animal always chooses the correct action for each experienced state. When cue-action allocations change, we show that the model notices the change: the incorrect cue-action allocations are punished and the correct ones are discovered.
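
The learning loop described in the abstract can be illustrated with a deliberately small sketch. It is not the paper's firing-rate model: it has no dynamics, no thalamic reactivation signal and no input gating. It keeps only three ingredients named above: a winner-take-all over cues (the salient cue beats a distractor), a winner-take-all over actions, and a three-factor update in which a signed dopamine-like signal changes only the synapse between the held cue and the selected action. All names (`run_trials`, `winner_take_all`), the salience values, the learning rate, the noise level and the weight clipping to [0, 1] are assumptions made for illustration.

```python
import random


def winner_take_all(values):
    """Return the index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=lambda i: values[i])


def run_trials(mapping, weights, n_trials, rng, lr=0.2, noise=0.1):
    """Run cue-action trials, updating `weights` in place.

    mapping[cue] is the rewarded action for that cue (hypothetical task).
    weights[cue][action] stands in for a cortico-striatal synapse.
    Returns the list of per-trial rewards (1 or 0).
    """
    n_cues, n_actions = len(weights), len(weights[0])
    rewards = []
    for _ in range(n_trials):
        # One salient cue and one weak distractor (assumed salience values).
        target = rng.randrange(n_cues)
        distractor = rng.randrange(n_cues)
        salience = [0.0] * n_cues
        salience[distractor] = 0.3
        salience[target] = 1.0  # written last so the target wins a clash

        # First winner-take-all: only the most salient cue is held in memory.
        held_cue = winner_take_all(salience)

        # Second winner-take-all: the held cue drives the action units;
        # noise produces random exploration while the weights are small.
        drive = [weights[held_cue][a] + rng.gauss(0.0, noise)
                 for a in range(n_actions)]
        action = winner_take_all(drive)

        # Reward arrives after the delay; the held cue is still available,
        # so credit goes to the cue-action pair that was actually used.
        reward = 1 if action == mapping[held_cue] else 0
        dopamine = 1.0 if reward else -1.0

        # Three-factor update: pre (held cue) x post (chosen action) x dopamine.
        w = weights[held_cue][action] + lr * dopamine
        weights[held_cue][action] = min(max(w, 0.0), 1.0)
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[0.0] * 3 for _ in range(3)]
    before = run_trials([0, 1, 2], weights, 300, rng)
    after = run_trials([1, 2, 0], weights, 300, rng)  # allocations change
    print("reward rate, last 50 trials before change:", sum(before[-50:]) / 50)
    print("reward rate, first 10 trials after change:", sum(after[:10]) / 10)
    print("reward rate, last 50 trials after change:", sum(after[-50:]) / 50)
```

With these assumed parameters the sketch should reproduce the qualitative sequence in the abstract: random exploration at first, a stable directed phase once the rewarded weights have grown, a run of punished trials after the allocations change, and then rediscovery of the new correct actions.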

Year:  2007        PMID: 18280108     DOI: 10.1016/j.neunet.2007.12.040

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  4 in total

1.  Adaptation, expertise, and giftedness: towards an understanding of cortical, subcortical, and cerebellar network contributions. (Review)

Authors:  Leonard F Koziol; Deborah Ely Budding; Dana Chidekel
Journal:  Cerebellum       Date:  2010-12       Impact factor: 3.847

2.  Striatal activity during intentional switching depends on pattern stability.

Authors:  Cinzia De Luca; Kelly J Jantzen; Silvia Comani; Maurizio Bertollo; J A Scott Kelso
Journal:  J Neurosci       Date:  2010-03-03       Impact factor: 6.167

3.  Basal ganglia neurons dynamically facilitate exploration during associative learning.

Authors:  Sameer A Sheth; Tarek Abuelem; John T Gale; Emad N Eskandar
Journal:  J Neurosci       Date:  2011-03-30       Impact factor: 6.167

4.  From Focused Thought to Reveries: A Memory System for a Conscious Robot.

Authors:  Christian Balkenius; Trond A Tjøstheim; Birger Johansson; Peter Gärdenfors
Journal:  Front Robot AI       Date:  2018-04-04