
From free energy to expected energy: Improving energy-based value function approximation in reinforcement learning.

Stefan Elfwing, Eiji Uchibe, Kenji Doya.

Abstract

Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces. However, the FERL method works well only with binary, or close-to-binary, state input in which the number of active states is smaller than the number of non-active states. In the FERL method, the value function is approximated by the negative free energy of a restricted Boltzmann machine (RBM). In our earlier study, we demonstrated that the performance and robustness of the FERL method can be improved by scaling the free energy by a constant related to the size of the network. In this study, we propose that RBM function approximation can be further improved by approximating the value function by the negative expected energy (EERL) instead of the negative free energy; the proposed method also handles continuous state input. We validate our proposed method by demonstrating that EERL: (1) outperforms FERL, as well as standard neural network and linear function approximation, for three versions of a gridworld task with high-dimensional image state input; (2) achieves new state-of-the-art results in stochastic SZ-Tetris in both model-free and model-based learning settings; and (3) significantly outperforms FERL and standard neural network function approximation for a robot navigation task with raw and noisy RGB images as state input and a large number of actions.
Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.
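The distinction the abstract draws can be sketched numerically. For an RBM with visible state s, hidden bias c, visible bias b, and weights W, the free energy is F(s) = -b·s - Σ_k log(1 + exp(c_k + W_k·s)), while the expected energy under p(h|s) replaces each softplus term with σ(pre_k)·pre_k. The sketch below (a minimal illustration, not the paper's implementation; all variable names and the toy parameter values are assumptions) contrasts the two value estimates, V(s) = -F(s) for FERL versus V(s) = -⟨E(s,h)⟩ for EERL:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_free_energy(s, W, b, c):
    """FERL-style value estimate V(s) = -F(s), where
    F(s) = -b.s - sum_k log(1 + exp(c_k + W_k.s))."""
    pre = c + W @ s                       # hidden pre-activations
    return b @ s + np.sum(np.log1p(np.exp(pre)))

def negative_expected_energy(s, W, b, c):
    """EERL-style value estimate V(s) = -<E(s,h)>_{p(h|s)}, where
    <E> = -b.s - sum_k sigmoid(pre_k) * pre_k."""
    pre = c + W @ s
    return b @ s + np.sum(sigmoid(pre) * pre)

# Toy parameters (illustrative only): 2 visible units, 2 hidden units.
W = np.array([[1.0, -0.5],
              [0.3,  0.2]])
b = np.array([0.1, 0.2])
c = np.array([0.0, 0.1])
s = np.array([1.0, 0.0])

v_ferl = negative_free_energy(s, W, b, c)
v_eerl = negative_expected_energy(s, W, b, c)
```

Since the free energy equals the expected energy minus the (non-negative) entropy of p(h|s), the negative free energy always upper-bounds the negative expected energy; the two estimates coincide only when the hidden units are saturated.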


Keywords:  Expected energy; Function approximation; Reinforcement learning; Restricted Boltzmann machine; SZ-Tetris


Year:  2016        PMID: 27639720     DOI: 10.1016/j.neunet.2016.07.013

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


Related articles: 4 in total

1.  Bayesian mechanics of perceptual inference and motor control in the brain.

Authors:  Chang Sub Kim
Journal:  Biol Cybern       Date:  2021-01-20       Impact factor: 2.086

2.  Dark control: The default mode network as a reinforcement learning agent.

Authors:  Elvis Dohmatob; Guillaume Dumas; Danilo Bzdok
Journal:  Hum Brain Mapp       Date:  2020-06-05       Impact factor: 5.038

3. [Review] Variational ecology and the physics of sentient systems.

Authors:  Maxwell J D Ramstead; Axel Constant; Paul B Badcock; Karl J Friston
Journal:  Phys Life Rev       Date:  2019-01-07       Impact factor: 11.025

4.  Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning.

Authors:  Shota Ohnishi; Eiji Uchibe; Yotaro Yamaguchi; Kosuke Nakanishi; Yuji Yasui; Shin Ishii
Journal:  Front Neurorobot       Date:  2019-12-10       Impact factor: 2.650

