
Representation and timing in theories of the dopamine system.

Nathaniel D Daw, Aaron C Courville, David S Touretzky.

Abstract

Although the responses of dopamine neurons in the primate midbrain are well characterized as carrying a temporal difference (TD) error signal for reward prediction, existing theories do not offer a credible account of how the brain keeps track of past sensory events that may be relevant to predicting future reward. Empirically, these shortcomings of previous theories are particularly evident in their account of experiments in which animals were exposed to variation in the timing of events. The original theories mispredicted the results of such experiments due to their use of a representational device called a tapped delay line. Here we propose that a richer understanding of history representation and a better account of these experiments can be given by considering TD algorithms for a formal setting that incorporates two features not originally considered in theories of the dopaminergic response: partial observability (a distinction between the animal's sensory experience and the true underlying state of the world) and semi-Markov dynamics (an explicit account of variation in the intervals between events). The new theory situates the dopaminergic system in a richer functional and anatomical context, since it assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data. It also accounts for a range of experimental data, including the experiments involving programmed temporal variability and other previously unmodeled dopaminergic response phenomena, which we suggest are related to subjective noise in animals' interval timing. Finally, it offers new experimental predictions and a rich theoretical framework for designing future experiments.
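The abstract's starting point is that dopamine responses behave like a temporal-difference (TD) error for reward prediction. A minimal sketch of that idea, for readers unfamiliar with TD learning: the states, reward schedule, discount factor, and learning rate below are illustrative choices, not taken from the paper (which concerns richer partially observable, semi-Markov settings).

```python
# Minimal TD(0) sketch of the reward-prediction-error account of dopamine.
# Toy 3-step trial (cue -> delay -> reward); all parameters are illustrative.
gamma = 0.95   # discount factor (assumed for illustration)
alpha = 0.1    # learning rate (assumed for illustration)
V = [0.0, 0.0, 0.0]        # value estimates for the three trial states
rewards = [0.0, 0.0, 1.0]  # reward delivered only at the final state

for trial in range(200):
    for t in range(3):
        v_next = V[t + 1] if t + 1 < 3 else 0.0
        # TD error: the putative dopaminergic teaching signal
        delta = rewards[t] + gamma * v_next - V[t]
        V[t] += alpha * delta
```

With training, the TD error at reward delivery shrinks toward zero while value propagates back to the predictive cue, mirroring the classic shift of dopamine responses from reward to cue. The paper's contribution is to replace the raw state index `t` here with an inferred belief state under partial observability and semi-Markov timing.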


Year:  2006        PMID: 16764517     DOI: 10.1162/neco.2006.18.7.1637

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


  66 in total

1.  A pallidus-habenula-dopamine pathway signals inferred stimulus values.

Authors:  Ethan S Bromberg-Martin; Masayuki Matsumoto; Simon Hong; Okihide Hikosaka
Journal:  J Neurophysiol       Date:  2010-06-10       Impact factor: 2.714

2.  A Neural Circuit Mechanism for Encoding Aversive Stimuli in the Mesolimbic Dopamine System.

Authors:  Johannes W de Jong; Seyedeh Atiyeh Afjei; Iskra Pollak Dorocic; James R Peck; Christine Liu; Christina K Kim; Lin Tian; Karl Deisseroth; Stephan Lammel
Journal:  Neuron       Date:  2018-11-29       Impact factor: 17.173

3. [Review] Decision theory, reinforcement learning, and the brain.

Authors:  Peter Dayan; Nathaniel D Daw
Journal:  Cogn Affect Behav Neurosci       Date:  2008-12       Impact factor: 3.282

4.  The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty.

Authors:  Clara Kwon Starkweather; Samuel J Gershman; Naoshige Uchida
Journal:  Neuron       Date:  2018-04-12       Impact factor: 17.173

5.  Two-factor theory, the actor-critic model, and conditioned avoidance.

Authors:  Tiago V Maia
Journal:  Learn Behav       Date:  2010-02       Impact factor: 1.986

6. [Review] Reinforcement learning, conditioning, and the brain: Successes and challenges.

Authors:  Tiago V Maia
Journal:  Cogn Affect Behav Neurosci       Date:  2009-12       Impact factor: 3.282

7.  Alternative time representation in dopamine models.

Authors:  François Rivest; John F Kalaska; Yoshua Bengio
Journal:  J Comput Neurosci       Date:  2009-10-22       Impact factor: 1.621

8.  Computational models of reinforcement learning: the role of dopamine as a reward signal.

Authors:  R D Samson; M J Frank; Jean-Marc Fellous
Journal:  Cogn Neurodyn       Date:  2010-03-21       Impact factor: 5.082

9.  Learning to represent reward structure: a key to adapting to complex environments.

Authors:  Hiroyuki Nakahara; Okihide Hikosaka
Journal:  Neurosci Res       Date:  2012-10-13       Impact factor: 3.304

10.  Temporal-difference reinforcement learning with distributed representations.

Authors:  Zeb Kurth-Nelson; A David Redish
Journal:  PLoS One       Date:  2009-10-20       Impact factor: 3.240

