Efficient exploration through active learning for value function approximation in reinforcement learning.


Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiyama.

Abstract

Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares policy iteration (LSPI) framework allows us to employ statistical active learning methods for linear regression. Then we propose a design method of good sampling policies for efficient exploration, which is particularly useful when the sampling cost of immediate rewards is high. The effectiveness of the proposed method, which we call active policy iteration (API), is demonstrated through simulations with a batting robot. Copyright 2010 Elsevier Ltd. All rights reserved.
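The link between policy evaluation and linear regression mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear model Q(s, a) = θᵀφ(s, a), under which the Bellman equation gives r(s, a) = θᵀ(φ(s, a) − γ E[φ(s', π(s'))]), so θ can be fitted by regressing sampled rewards on the modified features. The two-state MDP, the one-hot features, the fixed policy, and all names (`phi`, `policy`, `step`, `GAMMA`) are hypothetical choices made for illustration; transitions are deterministic so the expectation needs no estimation.

```python
import numpy as np

GAMMA = 0.5                    # discount factor (assumed for this toy example)
N_STATES, N_ACTIONS = 2, 2


def phi(s, a):
    """One-hot feature vector for the state-action pair (s, a)."""
    v = np.zeros(N_STATES * N_ACTIONS)
    v[s * N_ACTIONS + a] = 1.0
    return v


def policy(s):
    """Hypothetical fixed policy under evaluation: always take action 1."""
    return 1


def step(s, a):
    """Toy deterministic dynamics: action a leads to state a; reward 1 on reaching state 1."""
    s_next = a
    return s_next, float(s_next == 1)


# Collect one sample per state-action pair. Which pairs get sampled is
# exactly what a sampling policy decides; here every pair is visited once.
rows, rewards = [], []
for s in range(N_STATES):
    for a in range(N_ACTIONS):
        s_next, r = step(s, a)
        # Regression input: phi(s, a) - gamma * phi(s', pi(s'))
        rows.append(phi(s, a) - GAMMA * phi(s_next, policy(s_next)))
        rewards.append(r)

Psi = np.array(rows)
r_vec = np.array(rewards)

# Ordinary least squares: find theta minimizing ||Psi @ theta - r_vec||^2
theta, *_ = np.linalg.lstsq(Psi, r_vec, rcond=None)
q_values = theta.reshape(N_STATES, N_ACTIONS)
print(q_values)
```

In this toy problem the fitted values match the exact ones, Q(s, 0) = 1 and Q(s, 1) = 2 for both states. Because the fit is an ordinary linear regression of rewards on inputs determined by the visited state-action pairs, the choice of which pairs to sample plays the role of input design in active learning, which is the setting the abstract describes.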

Year: 2010 PMID: 20080026 DOI: 10.1016/j.neunet.2009.12.010

Source DB: PubMed Journal: Neural Netw ISSN: 0893-6080

Cited

1 in total

1. Learning Inverse Statics Models Efficiently With Symmetry-Based Exploration.

Authors: Rania Rayyes; Daniel Kubus; Jochen Steil
Journal: Front Neurorobot Date: 2018-10-23 Impact factor: 2.650
