An information-theoretic approach to curiosity-driven reinforcement learning.

Abstract

We provide a fresh look at the problem of exploration in reinforcement learning, drawing on ideas from information theory. First, we show that Boltzmann-style exploration, one of the main exploration methods used in reinforcement learning, is optimal from an information-theoretic point of view, in that it optimally trades expected return for the coding cost of the policy. Second, we address the problem of curiosity-driven learning. We propose that, in addition to maximizing the expected return, a learner should choose a policy that also maximizes the learner's predictive power. This makes the world both interesting and exploitable. Optimal policies then have the form of Boltzmann-style exploration with a bonus, containing a novel exploration-exploitation trade-off which emerges naturally from the proposed optimization principle. Importantly, this exploration-exploitation trade-off persists in the optimal deterministic policy, i.e., when there is no exploration due to randomness. As a result, exploration is understood as an emerging behavior that optimizes information gain, rather than being modeled as pure randomization of action choices.
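As a rough illustration of the Boltzmann-style exploration discussed in the abstract, the sketch below turns action values into action probabilities proportional to exp(value / temperature). The function name `boltzmann_policy`, the `temperature` parameter, and the additive `bonus` term are assumptions made for this example. In particular, the bonus is a generic placeholder for an information-based term and is not the paper's exact expression.

```python
import math


def boltzmann_policy(q_values, temperature, bonus=None):
    """Return action probabilities proportional to exp((Q + bonus) / temperature).

    q_values    : list of estimated returns, one per action (hypothetical inputs)
    temperature : positive scalar; high values flatten the distribution,
                  low values concentrate it on the best-scoring action
    bonus       : optional per-action bonus (a stand-in for an information
                  term; an assumption of this sketch, not the paper's formula)
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if bonus is None:
        bonus = [0.0] * len(q_values)
    if len(bonus) != len(q_values):
        raise ValueError("bonus and q_values must have the same length")

    scores = [(q + b) / temperature for q, b in zip(q_values, bonus)]
    shift = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical action values for a single state.
q = [1.0, 2.0, 0.5]
plain = boltzmann_policy(q, temperature=1.0)
with_bonus = boltzmann_policy(q, temperature=1.0, bonus=[2.0, 0.0, 0.0])
```

In this sketch, lowering the temperature toward zero makes the policy nearly deterministic, yet the preferred action is still the one with the largest value plus bonus. This mirrors, in a simplified way, the abstract's remark that the exploration-exploitation trade-off persists even without randomness.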

Year: 2012 PMID: 22791268 DOI: 10.1007/s12064-011-0142-z

Source DB: PubMed Journal: Theory Biosci ISSN: 1431-7613 Impact factor: 1.919
