
Dopamine neurons learn to encode the long-term value of multiple future rewards.

Kazuki Enomoto, Naoyuki Matsumoto, Sadamu Nakai, Takemasa Satoh, Tatsuo K Sato, Yasumasa Ueda, Hitoshi Inokawa, Masahiko Haruno, Minoru Kimura.

Abstract

Midbrain dopamine neurons signal reward value, their prediction error, and the salience of events. If they play a critical role in achieving specific distant goals, long-term future rewards should also be encoded as suggested in reinforcement learning theories. Here, we address this experimentally untested issue. We recorded 185 dopamine neurons in three monkeys that performed a multistep choice task in which they explored a reward target among alternatives and then exploited that knowledge to receive one or two additional rewards by choosing the same target in a set of subsequent trials. An analysis of anticipatory licking for reward water indicated that the monkeys did not anticipate an immediately expected reward in individual trials; rather, they anticipated the sum of immediate and multiple future rewards. In accordance with this behavioral observation, the dopamine responses to the start cues and reinforcer beeps reflected the expected values of the multiple future rewards and their errors, respectively. More specifically, when monkeys learned the multistep choice task over the course of several weeks, the responses of dopamine neurons encoded the sum of the immediate and expected multiple future rewards. The dopamine responses were quantitatively predicted by theoretical descriptions of the value function with time discounting in reinforcement learning. These findings demonstrate that dopamine neurons learn to encode the long-term value of multiple future rewards with distant rewards discounted.
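The abstract refers to a value function with time discounting and to its prediction errors. As a rough illustration only, the sketch below computes a discounted sum of future rewards and a temporal-difference error in the standard reinforcement learning form. The function names, the reward magnitudes, and the discount factor `gamma = 0.5` are assumptions chosen for the example; they are not taken from the paper, and this is not the authors' fitted model.

```python
def discounted_value(rewards, gamma):
    """Return the sum of rewards, each discounted by gamma per step of delay.

    rewards[0] is the immediate reward, rewards[1] the next one, and so on.
    """
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def td_error(reward, next_value, current_value, gamma):
    """Return the temporal-difference error: r + gamma * V(next) - V(current)."""
    return reward + gamma * next_value - current_value


# Hypothetical values (assumptions, not data from the study):
gamma = 0.5                      # discount factor per trial step
rewards_ahead = [1.0, 1.0, 1.0]  # one immediate and two additional rewards

# Long-term value at the start of the sequence.
value_now = discounted_value(rewards_ahead, gamma)

# Value remaining after the first reward has been delivered.
value_next = discounted_value(rewards_ahead[1:], gamma)

# When the first reward arrives exactly as predicted, the error is zero.
error = td_error(rewards_ahead[0], value_next, value_now, gamma)

print(value_now, value_next, error)
```

With `gamma` below 1, later rewards contribute less to the value than the immediate one, which is the sense in which distant rewards are discounted.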


Year:  2011        PMID: 21896766      PMCID: PMC3174584          DOI: 10.1073/pnas.1014457108

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References (31 in total)

Review 1.  Is the short-latency dopamine response too short to signal reward error?

Authors:  P Redgrave; T J Prescott; K Gurney
Journal:  Trends Neurosci       Date:  1999-04       Impact factor: 13.837

2.  Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops.

Authors:  Saori C Tanaka; Kenji Doya; Go Okada; Kazutaka Ueda; Yasumasa Okamoto; Shigeto Yamawaki
Journal:  Nat Neurosci       Date:  2004-07-04       Impact factor: 24.884

3.  Midbrain dopamine neurons encode decisions for future action.

Authors:  Genela Morris; Alon Nevet; David Arkadir; Eilon Vaadia; Hagai Bergman
Journal:  Nat Neurosci       Date:  2006-07-23       Impact factor: 24.884

4.  An "as soon as possible" effect in human intertemporal decision making: behavioral evidence and neural mechanisms.

Authors:  Joseph W Kable; Paul W Glimcher
Journal:  J Neurophysiol       Date:  2010-02-24       Impact factor: 2.714

Review 5.  A neural substrate of prediction and reward.

Authors:  W Schultz; P Dayan; P R Montague
Journal:  Science       Date:  1997-03-14       Impact factor: 47.728

6.  Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort.

Authors:  F Denk; M E Walton; K A Jennings; T Sharp; M F S Rushworth; D M Bannerman
Journal:  Psychopharmacology (Berl)       Date:  2004-12-10       Impact factor: 4.530

7.  Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards.

Authors:  Matthew R Roesch; Donna J Calu; Geoffrey Schoenbaum
Journal:  Nat Neurosci       Date:  2007-11-18       Impact factor: 24.884

8.  Hyperbolically discounted temporal difference learning.

Authors:  William H Alexander; Joshua W Brown
Journal:  Neural Comput       Date:  2010-06       Impact factor: 2.026

9.  The temporal precision of reward prediction in dopamine neurons.

Authors:  Christopher D Fiorillo; William T Newsome; Wolfram Schultz
Journal:  Nat Neurosci       Date:  2008-08       Impact factor: 24.884

10.  Dopamine, time, and impulsivity in humans.

Authors:  Alex Pine; Tamara Shiner; Ben Seymour; Raymond J Dolan
Journal:  J Neurosci       Date:  2010-06-30       Impact factor: 6.167

Cited by (32 in total)

Review 1.  Components and characteristics of the dopamine reward utility signal.

Authors:  William R Stauffer; Armin Lak; Shunsuke Kobayashi; Wolfram Schultz
Journal:  J Comp Neurol       Date:  2015-09-08       Impact factor: 3.215

2.  Separate mesocortical and mesolimbic pathways encode effort and reward learning signals.

Authors:  Tobias U Hauser; Eran Eldar; Raymond J Dolan
Journal:  Proc Natl Acad Sci U S A       Date:  2017-08-14       Impact factor: 11.205

3.  Rethinking dopamine as generalized prediction error.

Authors:  Matthew P H Gardner; Geoffrey Schoenbaum; Samuel J Gershman
Journal:  Proc Biol Sci       Date:  2018-11-21       Impact factor: 5.349

4.  Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior.

Authors:  Kenji Morita; Mieko Morishima; Katsuyuki Sakai; Yasuo Kawaguchi
Journal:  J Neurosci       Date:  2013-05-15       Impact factor: 6.167

Review 5.  Dopamine reward prediction-error signalling: a two-component response.

Authors:  Wolfram Schultz
Journal:  Nat Rev Neurosci       Date:  2016-02-11       Impact factor: 34.870

Review 6.  Evaluating the role of the alpha-7 nicotinic acetylcholine receptor in the pathophysiology and treatment of schizophrenia.

Authors:  Jared W Young; Mark A Geyer
Journal:  Biochem Pharmacol       Date:  2013-07-12       Impact factor: 5.858

Review 7.  Roles of centromedian parafascicular nuclei of thalamus and cholinergic interneurons in the dorsal striatum in associative learning of environmental events.

Authors:  Ko Yamanaka; Yukiko Hori; Takafumi Minamimoto; Hiroshi Yamada; Naoyuki Matsumoto; Kazuki Enomoto; Toshihiko Aosaki; Ann M Graybiel; Minoru Kimura
Journal:  J Neural Transm (Vienna)       Date:  2017-03-21       Impact factor: 3.575

Review 8.  The power of price compels you: Behavioral economic insights into dopamine-based valuation of rewarding and aversively motivated behavior.

Authors:  Erik B Oleson; Jonté B Roberts
Journal:  Brain Res       Date:  2018-12-11       Impact factor: 3.252

9.  Disrupted expected value and prediction error signaling in youths with disruptive behavior disorders during a passive avoidance task.

Authors:  Stuart F White; Kayla Pope; Stephen Sinclair; Katherine A Fowler; Sarah J Brislin; W Craig Williams; Daniel S Pine; R James R Blair
Journal:  Am J Psychiatry       Date:  2013-03       Impact factor: 18.112

10.  Learning to represent reward structure: a key to adapting to complex environments.

Authors:  Hiroyuki Nakahara; Okihide Hikosaka
Journal:  Neurosci Res       Date:  2012-10-13       Impact factor: 3.304
