Computational noise in reward-guided learning drives behavioral variability in volatile environments.

Charles Findling, Vasilisa Skvortsova, Rémi Dromnelle, Stefano Palminteri, Valentin Wyart.

Abstract

When learning the value of actions in volatile environments, humans often make seemingly irrational decisions that fail to maximize expected value. We reasoned that these 'non-greedy' decisions, instead of reflecting information seeking during choice, may be caused by computational noise in the learning of action values. Here, using reinforcement learning models of behavior and multimodal neurophysiological data, we show that the majority of non-greedy decisions stem from this learning noise. The trial-to-trial variability of sequential learning steps and their impact on behavior could be predicted both by blood oxygen level-dependent responses to obtained rewards in the dorsal anterior cingulate cortex and by phasic pupillary dilation, suggestive of neuromodulatory fluctuations driven by the locus coeruleus-norepinephrine system. Together, these findings indicate that most behavioral variability, rather than reflecting human exploration, is due to the limited computational precision of reward-guided learning.
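
As a rough illustration of the idea in the abstract, the sketch below simulates a two-option task with reversing reward probabilities and a learner whose value updates are corrupted by random noise. It is not the authors' model: the delta-rule update, the assumption that the noise standard deviation scales with the size of the prediction error, the parameter names (`alpha`, `zeta`), and the task settings are all hypothetical choices made for this example. Choices are strictly greedy with respect to the noisy values, so any decision that disagrees with a noise-free learner fed the same outcomes comes from learning noise alone rather than from randomness at the choice stage.

```python
import random


def noisy_update(q, reward, alpha, zeta, rng):
    """One delta-rule step plus zero-mean Gaussian noise (hypothetical noise model)."""
    delta = reward - q
    exact = q + alpha * delta
    # Assumption: noise s.d. grows with the prediction error; zeta = 0 gives exact learning.
    return exact + rng.gauss(0.0, zeta * abs(delta))


def simulate(n_trials=500, alpha=0.3, zeta=0.5, seed=0):
    """Return the fraction of choices that are non-greedy w.r.t. noise-free values."""
    rng = random.Random(seed)
    p_reward = [0.7, 0.3]      # hypothetical reward probabilities
    q_noisy = [0.5, 0.5]       # values learned with computational noise
    q_exact = [0.5, 0.5]       # values a noise-free learner would hold
    non_greedy = 0
    for t in range(n_trials):
        if t > 0 and t % 50 == 0:
            p_reward.reverse()  # volatility: the better option switches
        # Pure argmax on the noisy values: no exploration or softmax at choice time.
        choice = 0 if q_noisy[0] >= q_noisy[1] else 1
        best_exact = 0 if q_exact[0] >= q_exact[1] else 1
        if choice != best_exact:
            non_greedy += 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        q_noisy[choice] = noisy_update(q_noisy[choice], reward, alpha, zeta, rng)
        q_exact[choice] = q_exact[choice] + alpha * (reward - q_exact[choice])
    return non_greedy / n_trials


if __name__ == "__main__":
    for zeta in (0.0, 0.25, 0.5):
        print(f"zeta={zeta}: non-greedy fraction = {simulate(zeta=zeta):.3f}")
```

With `zeta=0.0` the two learners coincide and no choice counts as non-greedy; raising `zeta` lets apparently suboptimal choices appear even though the choice rule itself never randomizes.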

Year: 2019   PMID: 31659343   DOI: 10.1038/s41593-019-0518-9

Source DB: PubMed   Journal: Nat Neurosci   ISSN: 1097-6256   Impact factor: 24.884


  29 in total

1.  Optimal utility and probability functions for agents with finite computational precision.

Authors:  Keno Juechems; Jan Balaguer; Bernhard Spitzer; Christopher Summerfield
Journal:  Proc Natl Acad Sci U S A       Date:  2021-01-12       Impact factor: 11.205

2.  Computational mechanisms of curiosity and goal-directed exploration.

Authors:  Philipp Schwartenbeck; Johannes Passecker; Tobias U Hauser; Thomas Hb FitzGerald; Martin Kronbichler; Karl J Friston
Journal:  Elife       Date:  2019-05-10       Impact factor: 8.140

3.  Gated recurrence enables simple and accurate sequence prediction in stochastic, changing, and structured environments.

Authors:  Cédric Foucault; Florent Meyniel
Journal:  Elife       Date:  2021-12-02       Impact factor: 8.140

4.  Imprecise action selection in substance use disorder: Evidence for active learning impairments when solving the explore-exploit dilemma.

Authors:  Ryan Smith; Philipp Schwartenbeck; Jennifer L Stewart; Rayus Kuplicki; Hamed Ekhtiari; Martin P Paulus
Journal:  Drug Alcohol Depend       Date:  2020-08-06       Impact factor: 4.492

5.  Balancing exploration and exploitation with information and randomization.

Authors:  Robert C Wilson; Elizabeth Bonawitz; Vincent D Costa; R Becket Ebitz
Journal:  Curr Opin Behav Sci       Date:  2020-11-06

6.  Lapses in perceptual decisions reflect exploration.

Authors:  Sashank Pisupati; Lital Chartarifsky-Lynn; Anup Khanal; Anne K Churchland
Journal:  Elife       Date:  2021-01-11       Impact factor: 8.140

7.  All or nothing belief updating in patients with schizophrenia reduces precision and flexibility of beliefs.

Authors:  Matthew R Nassar; James A Waltz; Matthew A Albrecht; James M Gold; Michael J Frank
Journal:  Brain       Date:  2021-04-12       Impact factor: 13.501

8.  Pupil Dilation and the Slow Wave ERP Reflect Surprise about Choice Outcome Resulting from Intrinsic Variability in Decision Confidence.

Authors:  Jan Willem de Gee; Camile M C Correa; Matthew Weaver; Tobias H Donner; Simon van Gaal
Journal:  Cereb Cortex       Date:  2021-06-10       Impact factor: 5.357

9.  Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex.

Authors:  Marco K Wittmann; Matthew F S Rushworth; Nadescha Trudel; Jacqueline Scholl; Miriam C Klein-Flügge; Elsa Fouragnan; Lev Tankelevitch
Journal:  Nat Hum Behav       Date:  2020-08-31

10.  Variability in Action Selection Relates to Striatal Dopamine 2/3 Receptor Availability in Humans: A PET Neuroimaging Study Using Reinforcement Learning and Active Inference Models.

Authors:  Rick A Adams; Michael Moutoussis; Matthew M Nour; Tarik Dahoun; Declan Lewis; Benjamin Illingworth; Mattia Veronese; Christoph Mathys; Lieke de Boer; Marc Guitart-Masip; Karl J Friston; Oliver D Howes; Jonathan P Roiser
Journal:  Cereb Cortex       Date:  2020-05-18       Impact factor: 5.357
