Distributional Reinforcement Learning in the Brain.

Adam S Lowet, Qiao Zheng, Sara Matias, Jan Drugowitsch, Naoshige Uchida.

Abstract

Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
Copyright © 2020 The Author(s). Published by Elsevier Ltd. All rights reserved.

Keywords:  artificial intelligence; deep neural networks; dopamine; machine learning; population coding; reward

Year:  2020        PMID: 33092893      PMCID: PMC8073212          DOI: 10.1016/j.tins.2020.09.004

Source DB:  PubMed          Journal:  Trends Neurosci        ISSN: 0166-2236            Impact factor:   13.837


  5 in total

1.  Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task.

Authors:  Iku Tsutsui-Kimura; Hideyuki Matsumoto; Korleki Akiti; Melissa M Yamada; Naoshige Uchida; Mitsuko Watabe-Uchida
Journal:  Elife       Date:  2020-12-21       Impact factor: 8.140

2.  Task-induced neural covariability as a signature of approximate Bayesian learning and inference.

Authors:  Richard D Lange; Ralf M Haefner
Journal:  PLoS Comput Biol       Date:  2022-03-08       Impact factor: 4.475

Review 3.  Mesoaccumbal Dopamine Heterogeneity: What Do Dopamine Firing and Release Have to Do with It?

Authors:  Johannes W de Jong; Kurt M Fraser; Stephan Lammel
Journal:  Annu Rev Neurosci       Date:  2022-02-28       Impact factor: 15.553

4.  Interoception as modeling, allostasis as control.

Authors:  Eli Sennesh; Jordan Theriault; Dana Brooks; Jan-Willem van de Meent; Lisa Feldman Barrett; Karen S Quigley
Journal:  Biol Psychol       Date:  2021-12-20       Impact factor: 3.111

5.  Rational inattention in mice.

Authors:  Nikola Grujic; Jeroen Brus; Denis Burdakov; Rafael Polania
Journal:  Sci Adv       Date:  2022-03-04       Impact factor: 14.136
