
Hyperbolically discounted temporal difference learning.

William H Alexander, Joshua W Brown.

Abstract

Hyperbolic discounting of future outcomes is widely observed to underlie choice behavior in animals. Additionally, recent studies (Kobayashi & Schultz, 2008) have reported that hyperbolic discounting is observed even in neural systems underlying choice. However, the most prevalent models of temporal discounting, such as temporal difference learning, assume that future outcomes are discounted exponentially. Exponential discounting has been preferred largely because it can be expressed recursively, whereas hyperbolic discounting has heretofore been thought not to have a recursive definition. In this letter, we define a learning algorithm, hyperbolically discounted temporal difference (HDTD) learning, which constitutes a recursive formulation of the hyperbolic model.
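The contrast the abstract draws between the two discounting schemes can be illustrated with a minimal Python sketch. It is not the HDTD algorithm, which the abstract does not specify. The hyperbolic form 1 / (1 + k * delay), the function names, and the parameter values (gamma = 0.9, k = 0.5) are assumptions chosen for illustration. The sketch shows that an exponentially discounted sum can be built one step at a time with a constant factor, while the per-step ratio of the hyperbolic curve changes with the delay already elapsed.

```python
def exponential_discount(delay, gamma=0.9):
    """Weight of a reward `delay` steps away under exponential discounting."""
    return gamma ** delay


def hyperbolic_discount(delay, k=0.5):
    """Weight under the assumed hyperbolic form 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


def exponential_value_direct(rewards, gamma=0.9):
    """Discounted sum computed directly from the definition."""
    return sum(exponential_discount(t, gamma) * r for t, r in enumerate(rewards))


def exponential_value_recursive(rewards, gamma=0.9):
    """Same sum via the recursion V(t) = r(t) + gamma * V(t + 1)."""
    value = 0.0
    for r in reversed(rewards):
        value = r + gamma * value
    return value


def step_ratios(discount, horizon):
    """Ratio discount(t + 1) / discount(t) for t = 0 .. horizon - 1."""
    return [discount(t + 1) / discount(t) for t in range(horizon)]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0, 5.0]  # hypothetical reward sequence
    print(exponential_value_direct(rewards), exponential_value_recursive(rewards))
    print(step_ratios(exponential_discount, 5))  # constant: always gamma
    print(step_ratios(hyperbolic_discount, 5))   # varies with elapsed delay
```

Because the exponential ratio is the same at every step, one stored value per state is enough for a temporal difference update; the hyperbolic ratio depends on the delay, which is why a naive one-step recursion of this kind does not carry over directly.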

Year:  2010        PMID: 20100071      PMCID: PMC3005720          DOI: 10.1162/neco.2010.08-09-1080

Source DB:  PubMed          Journal:  Neural Comput        ISSN: 0899-7667            Impact factor:   2.026


  10 in total

1.  Temporal difference model reproduces anticipatory neural activity.

Authors:  R E Suri; W Schultz
Journal:  Neural Comput       Date:  2001-04       Impact factor: 2.026

2.  Reward-predicting and reward-detecting neuronal activity in the primate supplementary eye field.

Authors:  N Amador; M Schlag-Rey; J Schlag
Journal:  J Neurophysiol       Date:  2000-10       Impact factor: 2.714

3.  Neuronal activity in monkey ventral striatum related to the expectation of reward.

Authors:  W Schultz; P Apicella; E Scarnati; T Ljungberg
Journal:  J Neurosci       Date:  1992-12       Impact factor: 6.167

4.  Discounting of delayed rewards: Models of individual choice.

Authors:  J Myerson; L Green
Journal:  J Exp Anal Behav       Date:  1995-11       Impact factor: 2.468

5.  Reward processing in primate orbitofrontal cortex and basal ganglia.

Authors:  W Schultz; L Tremblay; J R Hollerman
Journal:  Cereb Cortex       Date:  2000-03       Impact factor: 5.357

6.  Reward-related activity in the monkey striatum and substantia nigra.

Authors:  W Schultz; P Apicella; T Ljungberg; R Romo; E Scarnati
Journal:  Prog Brain Res       Date:  1993       Impact factor: 2.453

7.  Long-term reward prediction in TD models of the dopamine system.

Authors:  Nathaniel D Daw; David S Touretzky
Journal:  Neural Comput       Date:  2002-11       Impact factor: 2.026

8.  Low-serotonin levels increase delayed reward discounting in humans.

Authors:  Nicolas Schweighofer; Mathieu Bertin; Kazuhiro Shishida; Yasumasa Okamoto; Saori C Tanaka; Shigeto Yamawaki; Kenji Doya
Journal:  J Neurosci       Date:  2008-04-23       Impact factor: 6.167

9.  Preference for sequences of rewards: further tests of a parallel discounting model.

Authors:  D Brunner
Journal:  Behav Processes       Date:  1999-04       Impact factor: 1.777

10.  Influence of reward delays on responses of dopamine neurons.

Authors:  Shunsuke Kobayashi; Wolfram Schultz
Journal:  J Neurosci       Date:  2008-07-30       Impact factor: 6.167

  6 in total

1.  Dopamine neurons learn to encode the long-term value of multiple future rewards.

Authors:  Kazuki Enomoto; Naoyuki Matsumoto; Sadamu Nakai; Takemasa Satoh; Tatsuo K Sato; Yasumasa Ueda; Hitoshi Inokawa; Masahiko Haruno; Minoru Kimura
Journal:  Proc Natl Acad Sci U S A       Date:  2011-09-06       Impact factor: 11.205

2.  A reinforcement learning model of precommitment in decision making.

Authors:  Zeb Kurth-Nelson; A David Redish
Journal:  Front Behav Neurosci       Date:  2010-12-14       Impact factor: 3.617

3.  Discounting of reward sequences: a test of competing formal models of hyperbolic discounting.

Authors:  Noah Zarr; William H Alexander; Joshua W Brown
Journal:  Front Psychol       Date:  2014-03-06

4.  Does temporal discounting explain unhealthy behavior? A systematic review and reinforcement learning perspective. (Review)

Authors:  Giles W Story; Ivo Vlaev; Ben Seymour; Ara Darzi; Raymond J Dolan
Journal:  Front Behav Neurosci       Date:  2014-03-12       Impact factor: 3.558

5.  A Neural Network Framework for Cognitive Bias.

Authors:  Johan E Korteling; Anne-Marie Brouwer; Alexander Toet
Journal:  Front Psychol       Date:  2018-09-03

6.  Don't let me do that! - models of precommitment.

Authors:  Zeb Kurth-Nelson; A David Redish
Journal:  Front Neurosci       Date:  2012-10-08       Impact factor: 4.677
