
Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.

Abstract

Humans often face an exploration-versus-exploitation trade-off. Studies using a common paradigm, the multi-armed bandit task, have shown that humans exhibit an "uncertainty bonus", which combines with estimated reward to drive exploration. However, previous studies often modeled belief updating using either a Bayesian model that assumed the reward contingency to remain stationary, or a reinforcement learning model. Separately, we previously showed that human learning in the bandit task is best captured by a dynamic-belief Bayesian model. We hypothesize that the estimated uncertainty bonus may depend on which learning model is employed. Here, we re-analyze a bandit dataset using all three learning models. We find that the dynamic-belief model captures human choice behavior best, while also uncovering a much larger uncertainty bonus than the other models. More broadly, our results emphasize the importance of choosing an appropriate learning model, as it is crucial for correctly characterizing the processes underlying human decision making.
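To make the idea of an "uncertainty bonus" concrete, the following is a minimal sketch of one common way such a bonus can enter a choice rule. It is not the authors' model. It assumes a stationary Beta-Bernoulli belief for each arm (the fixed-belief case the abstract contrasts with the dynamic-belief model), binary rewards, and a softmax choice rule. The names `beta_mean_std`, `choice_probabilities`, `bonus_weight`, and `inverse_temperature` are hypothetical and chosen only for illustration.

```python
import math


def beta_mean_std(successes, failures, a0=1.0, b0=1.0):
    """Posterior mean and standard deviation of a Beta-Bernoulli belief.

    Assumes a Beta(a0, b0) prior and a stationary reward rate (illustrative only).
    """
    a = a0 + successes
    b = b0 + failures
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)


def choice_probabilities(counts, bonus_weight=1.0, inverse_temperature=5.0):
    """Softmax over (estimated reward + bonus_weight * uncertainty) per arm.

    counts: list of (successes, failures) tuples, one per arm.
    """
    values = []
    for successes, failures in counts:
        mean, std = beta_mean_std(successes, failures)
        values.append(mean + bonus_weight * std)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(values)
    exps = [math.exp(inverse_temperature * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Arm 0 has been sampled often and paid well; arm 1 has barely been tried.
counts = [(8, 2), (1, 0)]
print(choice_probabilities(counts, bonus_weight=0.0))
print(choice_probabilities(counts, bonus_weight=1.0))
```

With `bonus_weight=0.0` the rule favors the well-sampled arm with the higher estimated reward. With a positive bonus, the rarely sampled arm gains value from its larger posterior uncertainty. Under these assumptions, how large the fitted bonus appears depends on how the belief (mean and uncertainty) is computed, which is the dependence on the learning model that the abstract hypothesizes.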


Keywords: Bayesian modeling; decision making; multi-armed bandit; reinforcement learning

Year: 2021 PMID: 34368809 PMCID: PMC8341546

Source DB: PubMed Journal: Cogsci

References

17 in total

1. Relative and absolute strength of response as a function of frequency of reinforcement.

2. Sequential effects: Superstition or rational behavior?

3. Uncertainty and exploration in a restless bandit problem.

4. Humans use directed and random exploration to solve the explore-exploit dilemma.

5. Cortical substrates for exploratory decisions in humans.

6. Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. (Review)

7. Validation of decision-making models and analysis of decision variables in the rat basal ganglia.

8. Learning the value of information and reward over time when solving exploration-exploitation problems.

9. Dopamine blockade impairs the exploration-exploitation trade-off in rats.

10. Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations.
Authors: Corey Yishan Zhou; Dalin Guo; Angela J Yu
Journal: Cogsci Date: 2020 Jul-Aug