| Literature DB >> 33092893 |
Adam S. Lowet, Qiao Zheng, Sara Matias, Jan Drugowitsch, Naoshige Uchida.
Abstract
Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
Keywords: artificial intelligence; deep neural networks; dopamine; machine learning; population coding; reward
Year: 2020 PMID: 33092893 PMCID: PMC8073212 DOI: 10.1016/j.tins.2020.09.004
Source DB: PubMed Journal: Trends Neurosci ISSN: 0166-2236 Impact factor: 13.837
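The abstract contrasts a classical prediction error, defined as actual reward minus predicted mean reward, with distributional algorithms that recover the full reward distribution. A minimal sketch of that contrast, assuming an expectile-style update with asymmetric learning rates (all variable names, rates, and the bimodal reward distribution are illustrative choices, not details from the paper):

```python
import random

def classical_update(v, reward, lr=0.05):
    # Classical TD-style update: the prediction error is the
    # difference between actual reward and predicted mean reward.
    delta = reward - v
    return v + lr * delta

def distributional_update(values, reward, alphas_pos, alphas_neg):
    # Each value channel scales positive vs. negative prediction
    # errors differently, so the channels converge to different
    # expectiles of the reward distribution rather than one mean.
    new = []
    for v, ap, an in zip(values, alphas_pos, alphas_neg):
        delta = reward - v
        lr = ap if delta > 0 else an
        new.append(v + lr * delta)
    return new

random.seed(0)
v = 0.0
values = [0.0, 0.0, 0.0]          # pessimistic, balanced, optimistic channels
alphas_pos = [0.02, 0.05, 0.08]   # learning rates for positive errors
alphas_neg = [0.08, 0.05, 0.02]   # learning rates for negative errors
for _ in range(20000):
    r = random.choice([0.0, 1.0])  # bimodal reward: either 0 or 1
    v = classical_update(v, r)
    values = distributional_update(values, r, alphas_pos, alphas_neg)

print(round(v, 2))                     # near the mean reward
print([round(x, 2) for x in values])   # channels fan out around the mean
```

The single classical estimate settles near the mean reward and carries no information about the reward being bimodal, while the asymmetric channels spread out below and above the mean, jointly encoding the shape of the distribution.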