Long-term reward prediction in TD models of the dopamine system.

Nathaniel D Daw, David S Touretzky.

Abstract

This article addresses the relationship between long-term reward predictions and slow-timescale neural activity in temporal difference (TD) models of the dopamine system. Such models attempt to explain how the activity of dopamine (DA) neurons relates to errors in the prediction of future rewards. Previous models have been mostly restricted to short-term predictions of rewards expected during a single, somewhat artificially defined trial. Also, the models focused exclusively on the phasic pause-and-burst activity of primate DA neurons; the neurons' slower, tonic background activity was assumed to be constant. This has led to difficulty in explaining the results of neurochemical experiments that measure indications of DA release on a slow timescale, results that seem at first glance inconsistent with a reward prediction model. In this article, we investigate a TD model of DA activity modified so as to enable it to make longer-term predictions about rewards expected far in the future. We show that these predictions manifest themselves as slow changes in the baseline error signal, which we associate with tonic DA activity. Using this model, we make new predictions about the behavior of the DA system in a number of experimental situations. Some of these predictions suggest new computational explanations for previously puzzling data, such as indications from microdialysis studies of elevated DA activity triggered by aversive events.
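The abstract does not state the model's equations. As a minimal sketch only, assume an average-reward form of the TD error, in which each reward is measured against a slowly updated running average. This is one standard way to let a TD learner predict beyond a single trial, and may not match the authors' exact formulation. All function names, the learning rate, and the reward values below are hypothetical. The sketch shows how a slowly drifting average-reward term shifts the baseline error on steps where nothing happens:

```python
def average_reward_td_error(reward, value_now, value_next, avg_reward):
    """TD error with reward measured relative to a running average (assumed form)."""
    return reward - avg_reward + value_next - value_now


def update_avg_reward(avg_reward, reward, rate=0.01):
    """Slow exponential running average of reward; the rate is a hypothetical choice."""
    return avg_reward + rate * (reward - avg_reward)


def baseline_error(avg_reward):
    """Error on a step with no reward and no change in predicted value."""
    return average_reward_td_error(0.0, 0.0, 0.0, avg_reward)


# Long stretch of rewarding steps: the running average climbs slowly.
rho = 0.0
for _ in range(500):
    rho = update_avg_reward(rho, 1.0)
baseline_rewarding = baseline_error(rho)

# Long stretch of aversive steps: the running average falls slowly.
for _ in range(500):
    rho = update_avg_reward(rho, -1.0)
baseline_aversive = baseline_error(rho)

print(baseline_rewarding, baseline_aversive)
```

Under these assumptions the baseline error is simply the negative of the running average, so it sits below zero after sustained reward and rises above zero after sustained aversive events. That is one possible reading of how aversive events could raise a slow, tonic signal.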

Year:  2002        PMID: 12433290     DOI: 10.1162/089976602760407973

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


  19 in total

1.  Opponency revisited: competition and cooperation between dopamine and serotonin. (Review)

Authors:  Y-Lan Boureau; Peter Dayan
Journal:  Neuropsychopharmacology       Date:  2010-09-29       Impact factor: 7.853

2.  Two-factor theory, the actor-critic model, and conditioned avoidance.

Authors:  Tiago V Maia
Journal:  Learn Behav       Date:  2010-02       Impact factor: 1.986

3.  The Successor Representation: Its Computational Logic and Neural Substrates.

Authors:  Samuel J Gershman
Journal:  J Neurosci       Date:  2018-07-13       Impact factor: 6.167

4.  Serotonin and dopamine: unifying affective, activational, and decision functions. (Review)

Authors:  Roshan Cools; Kae Nakamura; Nathaniel D Daw
Journal:  Neuropsychopharmacology       Date:  2010-08-25       Impact factor: 7.853

5.  Tonic dopamine: opportunity costs and the control of response vigor.

Authors:  Yael Niv; Nathaniel D Daw; Daphna Joel; Peter Dayan
Journal:  Psychopharmacology (Berl)       Date:  2006-10-10       Impact factor: 4.530

6.  Habits, action sequences and reinforcement learning. (Review)

Authors:  Amir Dezfouli; Bernard W Balleine
Journal:  Eur J Neurosci       Date:  2012-04       Impact factor: 3.386

7.  Learning the opportunity cost of time in a patch-foraging task.

Authors:  Sara M Constantino; Nathaniel D Daw
Journal:  Cogn Affect Behav Neurosci       Date:  2015-12       Impact factor: 3.282

8.  The anatomy of choice: active inference and agency.

Authors:  Karl Friston; Philipp Schwartenbeck; Thomas Fitzgerald; Michael Moutoussis; Timothy Behrens; Raymond J Dolan
Journal:  Front Hum Neurosci       Date:  2013-09-25       Impact factor: 3.169

9.  The New Robotics-towards human-centered machines.

Authors:  Stefan Schaal
Journal:  HFSP J       Date:  2007-07-16

10.  Short-term gains, long-term pains: how cues about state aid learning in dynamic environments.

Authors:  Todd M Gureckis; Bradley C Love
Journal:  Cognition       Date:  2009-05-08