Discrete-Time Deterministic $Q$-Learning: A Novel Convergence Analysis.

Qinglai Wei, Frank L Lewis, Qiuye Sun, Pengfei Yan, Ruizhuo Song.

Abstract

In this paper, a novel discrete-time deterministic Q-learning algorithm is developed. In each iteration of the developed Q-learning algorithm, the iterative Q function is updated over the entire state and control spaces, instead of for a single state and a single control as in traditional Q-learning algorithms. A new convergence criterion is established to guarantee that the iterative Q function converges to the optimum, and it simplifies the convergence criterion on the learning rates used for traditional Q-learning algorithms. In the convergence analysis, the upper and lower bounds of the iterative Q function are analyzed to obtain the convergence criterion, instead of the iterative Q function itself. For convenience of analysis, the convergence properties for the undiscounted case of the deterministic Q-learning algorithm are developed first. Then, taking the discount factor into account, the convergence criterion for the discounted case is established. To facilitate the implementation of the deterministic Q-learning algorithm, neural networks are used to approximate the iterative Q function and to compute the iterative control law, respectively. Finally, simulation results and comparisons are given to illustrate the performance of the developed algorithm.
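The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of updating the Q function over all states and controls in each iteration. It assumes a standard relaxed Bellman update with a constant learning rate `alpha`, a small tabular system in place of the neural networks, and hypothetical dynamics and cost (`x' = clip(x + u)`, `U(x, u) = x^2 + u^2`) chosen purely for illustration; none of these specifics come from the paper.

```python
import numpy as np

def deterministic_q_learning(n_states=5, controls=(-1, 0, 1),
                             gamma=1.0, alpha=0.5, n_iters=500):
    """Sweep-style deterministic Q-learning on a toy tabular system.

    All names, the dynamics, and the cost are illustrative assumptions.
    """
    states = np.arange(n_states)
    u = np.array(controls)

    # Assumed deterministic dynamics F(x, u) = clip(x + u, 0, n_states - 1).
    next_state = np.clip(states[:, None] + u[None, :], 0, n_states - 1)

    # Assumed utility U(x, u) = x^2 + u^2 (a cost to be minimized).
    cost = (states[:, None] ** 2 + u[None, :] ** 2).astype(float)

    # Start from a zero Q function, one entry per (state, control) pair.
    Q = np.zeros_like(cost)

    for _ in range(n_iters):
        # Bellman target for every (x, u) pair at once:
        # U(x, u) + gamma * min_{u'} Q(F(x, u), u').
        target = cost + gamma * Q.min(axis=1)[next_state]
        # Relaxed update applied to the whole table, not a single entry.
        Q = (1.0 - alpha) * Q + alpha * target

    return Q

if __name__ == "__main__":
    Q = deterministic_q_learning()
    print("Approximate optimal cost per state:", Q.min(axis=1))
    print("Greedy control index per state:", Q.argmin(axis=1))
```

With `gamma=1.0` the sketch corresponds to the undiscounted case mentioned in the abstract; passing a value below 1 gives the discounted case under the same assumed update.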

Year: 2016 PMID: 27093714 DOI: 10.1109/TCYB.2016.2542923

Source DB: PubMed Journal: IEEE Trans Cybern ISSN: 2168-2267 Impact factor: 11.448

Cited

1 in total

1. Model Learning and Knowledge Sharing for Cooperative Multiagent Systems in Stochastic Environment.

Authors: Wei-Cheng Jiang; Vignesh Narayanan; Jr-Shin Li
Journal: IEEE Trans Cybern Date: 2021-12-22 Impact factor: 11.448
