
A distributional code for value in dopamine-based reinforcement learning.

Will Dabney, Zeb Kurth-Nelson, Matthew Botvinick, Naoshige Uchida, Clara Kwon Starkweather, Demis Hassabis, Rémi Munos.

Abstract

Since its introduction, the reward prediction error theory of dopamine has explained a wealth of empirical phenomena, providing a unifying framework for understanding the representation of reward and value in the brain [1-3]. According to the now canonical theory, reward predictions are represented as a single scalar quantity, which supports learning about the expectation, or mean, of stochastic outcomes. Here we propose an account of dopamine-based reinforcement learning inspired by recent artificial intelligence research on distributional reinforcement learning [4-6]. We hypothesized that the brain represents possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel. This idea implies a set of empirical predictions, which we tested using single-unit recordings from mouse ventral tegmental area. Our findings provide strong evidence for a neural realization of distributional reinforcement learning.
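The core idea can be made concrete with a minimal sketch of distributional TD learning. Each value predictor applies different learning rates to positive versus negative prediction errors: optimistic predictors (larger rate for positive errors) converge above the mean reward, pessimistic ones below it, so the population jointly encodes the reward distribution rather than only its expectation. The function name and parameters below are illustrative assumptions, not taken from the paper's methods.

```python
import random

def train_expectile_predictors(sample_reward, taus, alpha=0.1, steps=20000, seed=0):
    """Sketch of distributional TD: each tau in `taus` sets one predictor's
    optimism, with alpha_plus = alpha * tau and alpha_minus = alpha * (1 - tau).
    Returns the learned value for each predictor (an expectile of the rewards)."""
    rng = random.Random(seed)
    values = [0.0 for _ in taus]
    for _ in range(steps):
        r = sample_reward(rng)
        for i, tau in enumerate(taus):
            delta = r - values[i]  # prediction error (single-step task, no discounting)
            rate = alpha * (tau if delta > 0 else (1 - tau))  # asymmetric scaling
            values[i] += rate * delta
    return values

# Bimodal rewards: 0 or 10 with equal probability (mean = 5).
reward = lambda rng: 10.0 if rng.random() < 0.5 else 0.0
learned = train_expectile_predictors(reward, taus=[0.1, 0.5, 0.9])
# The tau = 0.5 predictor tracks the mean; the others fan out around it,
# so the population as a whole reflects the spread of the distribution.
```

Under this toy reward distribution the three predictors settle near 1, 5, and 9 respectively, which is the fan-out pattern of reversal points the paper's empirical predictions are built on.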

Year:  2020        PMID: 31942076      PMCID: PMC7476215          DOI: 10.1038/s41586-019-1924-6

Source DB:  PubMed          Journal:  Nature        ISSN: 0028-0836            Impact factor:   49.962


Citing articles: 53 in total

1.  Beyond the Average View of Dopamine.

Authors:  Angela J Langdon; Nathaniel D Daw
Journal:  Trends Cogn Sci       Date:  2020-05-15       Impact factor: 20.229

2. [Review] Distributional Reinforcement Learning in the Brain.

Authors:  Adam S Lowet; Qiao Zheng; Sara Matias; Jan Drugowitsch; Naoshige Uchida
Journal:  Trends Neurosci       Date:  2020-10-19       Impact factor: 13.837

3.  Response-based outcome predictions and confidence regulate feedback processing and learning.

Authors:  Romy Frömer; Matthew R Nassar; Rasmus Bruckner; Birgit Stürmer; Werner Sommer; Nick Yeung
Journal:  eLife       Date:  2021-04-30       Impact factor: 8.140

4.  Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task.

Authors:  Iku Tsutsui-Kimura; Hideyuki Matsumoto; Korleki Akiti; Melissa M Yamada; Naoshige Uchida; Mitsuko Watabe-Uchida
Journal:  eLife       Date:  2020-12-21       Impact factor: 8.140

5.  Adiposity covaries with signatures of asymmetric feedback learning during adaptive decisions.

Authors:  Timothy Verstynen; Kyle Dunovan; Catherine Walsh; Chieh-Hsin Kuan; Stephen B Manuck; Peter J Gianaros
Journal:  Soc Cogn Affect Neurosci       Date:  2020-11-10       Impact factor: 3.436

6.  Transforming task representations to perform novel tasks.

Authors:  Andrew K Lampinen; James L McClelland
Journal:  Proc Natl Acad Sci U S A       Date:  2020-12-10       Impact factor: 11.205

7. [Review] How Outcome Uncertainty Mediates Attention, Learning, and Decision-Making.

Authors:  Ilya E Monosov
Journal:  Trends Neurosci       Date:  2020-07-28       Impact factor: 13.837

8.  Context-Dependent Multiplexing by Individual VTA Dopamine Neurons.

Authors:  Yves Kremer; Jérôme Flakowski; Clément Rohner; Christian Lüscher
Journal:  J Neurosci       Date:  2020-08-28       Impact factor: 6.167

9.  A Unified Framework for Dopamine Signals across Timescales.

Authors:  HyungGoo R Kim; Athar N Malik; John G Mikhael; Pol Bech; Iku Tsutsui-Kimura; Fangmiao Sun; Yajun Zhang; Yulong Li; Mitsuko Watabe-Uchida; Samuel J Gershman; Naoshige Uchida
Journal:  Cell       Date:  2020-11-27       Impact factor: 41.582

10. [Review] Dopamine, Updated: Reward Prediction Error and Beyond.

Authors:  Talia N Lerner; Ashley L Holloway; Jillian L Seiler
Journal:  Curr Opin Neurobiol       Date:  2020-11-14       Impact factor: 6.627

