Darrell A Worthy, A Ross Otto, Astin C Cornwall, Hilary J Don, Tyler Davis.
Abstract
The Delta and Decay rules are two learning rules used to update expected values in reinforcement learning (RL) models. The Delta rule learns the average reward for each option, whereas the Decay rule learns the cumulative reward for each option. Participants learned to select between pairs of options that had reward probabilities of .65 (option A) versus .35 (option B) or .75 (option C) versus .25 (option D) on separate trials in a binary-outcome choice task. Crucially, during training there were twice as many AB trials as CD trials, so participants experienced more cumulative reward from option A even though option C had a higher average reward rate (.75 versus .65). Participants then decided between novel combinations of options (e.g., A versus C). The Decay model predicted more A choices because option A had the higher cumulative reward, whereas the Delta model predicted more C choices because option C had the higher average reward. Results were more in line with the Decay model's predictions. This suggests that people may retrieve memories of cumulative reward to compute expected value instead of learning average rewards for each option.
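The contrast between the two rules can be illustrated with a minimal sketch. The abstract does not give the update equations, so the forms below are assumptions based on the common textbook versions: the Delta rule moves an estimate a fraction `alpha` toward the latest reward, and the Decay rule shrinks every option's value by `decay` on each trial and adds the reward to the chosen option. The parameter values, the fixed trial schedule, the assumption that the learner always picks the better option of each pair, and the use of each option's reward probability in place of a sampled binary outcome are all simplifications made here to keep the example deterministic. They are not the authors' fitted models.

```python
# Reward probabilities taken from the abstract.
REWARD_PROB = {"A": 0.65, "B": 0.35, "C": 0.75, "D": 0.25}


def delta_update(ev, reward, alpha):
    """Assumed Delta rule: move the estimate a fraction alpha toward the reward."""
    return ev + alpha * (reward - ev)


def decay_update(ev, reward, decay):
    """Assumed Decay rule: shrink the old value, then add the reward (0 if unchosen)."""
    return decay * ev + reward


def train(n_cycles=100, alpha=0.1, decay=0.9):
    """Run both rules over a schedule with twice as many AB trials as CD trials.

    Simplifications (hypothetical, not from the paper): the learner always
    chooses A on AB trials and C on CD trials, and the reward on each trial
    is the option's expected reward rather than a random 0/1 outcome.
    """
    delta_ev = {option: 0.0 for option in REWARD_PROB}
    decay_ev = {option: 0.0 for option in REWARD_PROB}
    schedule = ["A", "A", "C"] * n_cycles  # chosen option on each trial
    for chosen in schedule:
        reward = REWARD_PROB[chosen]
        # Delta rule: only the chosen option is updated.
        delta_ev[chosen] = delta_update(delta_ev[chosen], reward, alpha)
        # Decay rule: every option decays; only the chosen one gains reward.
        for option in decay_ev:
            gain = reward if option == chosen else 0.0
            decay_ev[option] = decay_update(decay_ev[option], gain, decay)
    return delta_ev, decay_ev


delta_ev, decay_ev = train()
print("Delta rule:", {k: round(v, 3) for k, v in delta_ev.items()})
print("Decay rule:", {k: round(v, 3) for k, v in decay_ev.items()})
```

Under these assumptions the Delta values for A and C settle near their reward rates, so C ends above A, while the Decay value for A ends above C because A is rewarded on twice as many trials. This mirrors the opposing predictions for the A versus C test described in the abstract.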
Keywords: base rates; decay rule; delta rule; prediction error; probability learning; reinforcement learning
Year: 2018 PMID: 33937915 PMCID: PMC8086699
Source DB: PubMed Journal: Cogsci