A unified analysis of value-function-based reinforcement-learning algorithms.

Abstract

Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can provide a unified analysis of such value-function-based reinforcement-learning algorithms. The usefulness of the theorem lies in how it allows the convergence of a complex asynchronous reinforcement-learning algorithm to be proved by verifying that a simpler synchronous algorithm converges. We illustrate the application of the theorem by analyzing the convergence of Q-learning, model-based reinforcement learning, Q-learning with multistate updates, Q-learning for Markov games, and risk-sensitive reinforcement learning.
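As a rough illustration of the synchronous/asynchronous distinction drawn in the abstract, the sketch below compares a synchronous sweep over all state-action pairs with asynchronous Q-learning, which updates one pair at a time along a trajectory. It is not the paper's theorem or proof. The two-state deterministic MDP, the constant step size, and every name in the code (MODEL, synchronous_value_iteration, asynchronous_q_learning) are assumptions made for illustration only. The setting analyzed in the paper is more general (stochastic transitions, decaying learning rates).

```python
import random

GAMMA = 0.9
ACTIONS = [0, 1]

# Hypothetical toy MDP (an assumption, not from the paper):
# (state, action) -> (next_state, reward), deterministic for simplicity.
MODEL = {
    (0, 0): (0, 0.0),
    (0, 1): (1, 1.0),
    (1, 0): (0, 2.0),
    (1, 1): (1, 0.0),
}


def synchronous_value_iteration(sweeps=500):
    """Update every (state, action) entry at once from the previous table."""
    q = {sa: 0.0 for sa in MODEL}
    for _ in range(sweeps):
        q = {
            (s, a): r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
            for (s, a), (s2, r) in MODEL.items()
        }
    return q


def asynchronous_q_learning(steps=20000, alpha=0.5, seed=0):
    """Update only the visited (state, action) entry along one trajectory."""
    rng = random.Random(seed)
    q = {sa: 0.0 for sa in MODEL}
    s = 0
    for _ in range(steps):
        a = rng.choice(ACTIONS)  # uniform random exploration
        s2, r = MODEL[(s, a)]
        target = r + GAMMA * max(q[(s2, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (target - q[(s, a)])  # Q-learning update
        s = s2
    return q


if __name__ == "__main__":
    q_sync = synchronous_value_iteration()
    q_async = asynchronous_q_learning()
    for key in sorted(q_sync):
        print(key, round(q_sync[key], 4), round(q_async[key], 4))
```

On this toy problem both procedures settle on the same Q-values, which is the kind of agreement the abstract's theorem is meant to establish in general.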

Year: 1999 PMID: 10578043 DOI: 10.1162/089976699300016070

Source DB: PubMed Journal: Neural Comput ISSN: 0899-7667 Impact factor: 2.026

Cited

1 in total

1. DRL-RNP: Deep Reinforcement Learning-Based Optimized RNP Flight Procedure Execution.

Authors: Longtao Zhu; Jinlin Wang; Yi Wang; Yulong Ji; Jinchang Ren
Journal: Sensors (Basel) Date: 2022-08-28 Impact factor: 3.847