Algorithmic survey of parametric value function approximation.

Abstract

Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of RL concerns computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally a stochastic gradient descent or a recursive least-squares approach.
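To make the bootstrapping category concrete, here is a minimal sketch of TD(0) with a linear parametric value function, updated by stochastic gradient descent. It is an illustration under stated assumptions and is not taken from the surveyed paper: the three-state deterministic chain, the feature map `features`, the function name `td0_linear`, and the step size and discount values are all hypothetical choices made for this example.

```python
# Illustrative sketch only: TD(0) with linear value function approximation
# on a hypothetical deterministic chain 0 -> 1 -> 2, where state 2 is terminal
# and a reward of 1 is received on entering it.

TERMINAL = 2


def features(state):
    # Assumed feature map phi(s) = [1, s]; any other choice would work as well.
    return [1.0, float(state)]


def value(theta, state):
    # Parametric value estimate V_theta(s) = theta . phi(s).
    return sum(w * x for w, x in zip(theta, features(state)))


def td0_linear(episodes=500, alpha=0.1, gamma=0.9):
    theta = [0.0, 0.0]
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            next_state = state + 1
            reward = 1.0 if next_state == TERMINAL else 0.0
            # Bootstrapping: the target reuses the current estimate at the
            # next state (taken as 0 at the terminal state).
            bootstrap = 0.0 if next_state == TERMINAL else value(theta, next_state)
            td_error = reward + gamma * bootstrap - value(theta, state)
            # Semi-gradient step: the target is treated as a constant.
            phi = features(state)
            theta = [w + alpha * td_error * x for w, x in zip(theta, phi)]
            state = next_state
    return theta


theta = td0_linear()
```

In this toy chain the exact values are V(1) = 1 and V(0) = 0.9 for a discount of 0.9, and the chosen features can represent them exactly, so the learned estimates approach those numbers. Residual and projected fixed-point methods would minimize different cost functions over the same parameter vector.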

Year: 2013 PMID: 24808468 DOI: 10.1109/TNNLS.2013.2247418

Source DB: PubMed Journal: IEEE Trans Neural Netw Learn Syst ISSN: 2162-237X Impact factor: 10.451

Cited

1 in total

1. Multi-Agent Reinforcement Learning via Adaptive Kalman Temporal Difference and Successor Representation.

Authors: Mohammad Salimibeni; Arash Mohammadi; Parvin Malekzadeh; Konstantinos N Plataniotis
Journal: Sensors (Basel) Date: 2022-02-11 Impact factor: 3.576