Learning reward frequency over reward probability: A tale of two learning rules.

Hilary J Don, A Ross Otto, Astin C Cornwall, Tyler Davis, Darrell A Worthy.

Abstract

Learning about the expected value of choice alternatives associated with reward is critical for adaptive behavior. Although human choice preferences are affected by the presentation frequency of reward-related alternatives, this may not be captured by some dominant models of value learning, such as the delta rule. In this study, we examined whether reward learning is driven more by learning the probability of reward provided by each option, or how frequently each option has been rewarded, and assessed how well models based on average reward (e.g., the Delta model) and models based on cumulative reward (e.g., the Decay model) can account for choice preferences. In a binary-outcome choice task, participants selected between pairs of options that had reward probabilities of 0.65 (A) versus 0.35 (B) or 0.75 (C) versus 0.25 (D). Crucially, during training there were twice as many AB trials as CD trials, such that option A was associated with higher cumulative reward, while option C gave higher average reward. Participants then decided between novel combinations of options (e.g., AC). Most participants preferred option A over C, a result predicted by the Decay model but not the Delta model. We also compared the Delta and Decay models to both simpler models and more complex models that assumed additional mechanisms, such as representation of uncertainty. Overall, models that assume learning about cumulative reward provided the best account of the data.
Copyright © 2019 Elsevier B.V. All rights reserved.
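
The contrast between the two learning rules can be illustrated with a small simulation. The sketch below is not the authors' code: the abstract gives no equations, so it assumes one common formulation of each rule (the delta rule moves the chosen option's value toward the outcome; the decay rule shrinks every value on each trial and adds the reward to the chosen option). The trial counts (120 AB, 60 CD), the shared rate of 0.1, the random-choice training policy, and the names `simulate` and `mean_values` are all hypothetical choices made for illustration. Only the reward probabilities and the 2:1 ratio of AB to CD trials come from the abstract.

```python
import random

# Reward probabilities taken from the abstract.
REWARD_PROB = {"A": 0.65, "B": 0.35, "C": 0.75, "D": 0.25}


def simulate(rule, rng, n_ab=120, n_cd=60, rate=0.1):
    """Train one learner and return its final option values.

    Assumptions (not from the source): trial counts, the rate, and a
    learner that picks randomly within each presented pair.
    """
    trials = [("A", "B")] * n_ab + [("C", "D")] * n_cd  # 2:1 AB to CD
    rng.shuffle(trials)
    values = dict.fromkeys(REWARD_PROB, 0.0)
    for pair in trials:
        choice = rng.choice(pair)
        reward = 1.0 if rng.random() < REWARD_PROB[choice] else 0.0
        if rule == "delta":
            # Delta rule: nudge the chosen value toward the outcome,
            # so the value tracks average reward.
            values[choice] += rate * (reward - values[choice])
        elif rule == "decay":
            # Decay rule: every value decays, then the chosen option
            # accumulates the reward, so the value tracks cumulative reward.
            for option in values:
                values[option] *= 1.0 - rate
            values[choice] += reward
        else:
            raise ValueError(f"unknown rule: {rule}")
    return values


def mean_values(rule, n_learners=1000, seed=0):
    """Average the final values over many simulated learners."""
    rng = random.Random(seed)
    totals = dict.fromkeys(REWARD_PROB, 0.0)
    for _ in range(n_learners):
        for option, value in simulate(rule, rng).items():
            totals[option] += value
    return {option: total / n_learners for option, total in totals.items()}


if __name__ == "__main__":
    for rule in ("delta", "decay"):
        averages = mean_values(rule)
        print(rule, {k: round(v, 2) for k, v in averages.items()})
```

Under these assumptions, the averaged delta-rule values sit near each option's reward probability, so C ends up above A, whereas the averaged decay-rule values reflect how often each option was rewarded, so A ends up above C. That is the qualitative pattern the abstract describes for the AC test pairs.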

Keywords:  Decay rule; Delta rule; Prediction error; Probability learning; Reinforcement learning; Reward frequency

Year:  2019        PMID: 31430606      PMCID: PMC6814570          DOI: 10.1016/j.cognition.2019.104042

Source DB:  PubMed          Journal:  Cognition        ISSN: 0010-0277


  3 in total

Review 1.  Hearing hooves, thinking zebras: A review of the inverse base-rate effect.

Authors:  Hilary J Don; Darrell A Worthy; Evan J Livesey
Journal:  Psychon Bull Rev       Date:  2021-02-10

2.  The more, the merrier: Treatment frequency influences effectiveness perception and further treatment choice.

Authors:  Itxaso Barberia; Fernando Blanco; Javier Rodríguez-Ferreiro
Journal:  Psychon Bull Rev       Date:  2020-10-29

3.  Choice perseverance underlies pursuing a hard-to-get target in an avatar choice task.

Authors:  Michiyo Sugawara; Kentaro Katahira
Journal:  Front Psychol       Date:  2022-09-06