Actor-Critic Off-Policy Learning for Optimal Control of Multiple-Model Discrete-Time Systems.

Jan Skach, Bahare Kiumarsi, Frank L Lewis, Ondrej Straka.

Abstract

In this paper, motivated by human neurocognitive experiments, a model-free off-policy reinforcement learning algorithm is developed to solve the optimal tracking control problem for multiple-model linear discrete-time systems. First, an adaptive self-organizing map neural network is used to determine the system behavior from measured data and to assign a responsibility signal to each of the system's possible behaviors. A new model is added if a sudden change in system behavior is detected from the measured data and that behavior has not been detected previously. The value function is represented as a combination of partially weighted value functions. Then, the off-policy iteration algorithm is generalized to multiple-model learning to find a solution without any knowledge of the system dynamics or the reference trajectory dynamics. The off-policy approach increases data efficiency and speeds up tuning, since a stream of experiences obtained by executing a behavior policy is reused to sequentially update several value functions corresponding to different learning policies. Two numerical examples demonstrate the performance of the off-policy algorithm.
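The abstract gives no formulas, so the following is only a minimal sketch of how responsibility signals might weight per-model value functions. Everything in it is an assumption made for illustration rather than the authors' method: the softmax-over-distance form of the responsibilities, the distance threshold for adding a model, the quadratic value kernels, and the names `responsibilities`, `maybe_add_model`, `quadratic_value`, and `blended_value`.

```python
import math


def squared_distances(x, prototypes):
    """Squared Euclidean distance from observation x to each model prototype."""
    return [sum((a - b) ** 2 for a, b in zip(x, p)) for p in prototypes]


def responsibilities(x, prototypes, temperature=1.0):
    """Assumed responsibility signal: softmax over negative squared distances.

    Returns one non-negative weight per model; the weights sum to 1.
    """
    d2 = squared_distances(x, prototypes)
    smallest = min(d2)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - smallest) / temperature) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def maybe_add_model(x, prototypes, threshold):
    """Assumed novelty rule: add x as a new prototype if no model is within `threshold`.

    Mutates `prototypes` in place and returns True when a model was added.
    """
    if min(squared_distances(x, prototypes)) > threshold ** 2:
        prototypes.append(list(x))
        return True
    return False


def quadratic_value(x, kernel):
    """Quadratic form x^T P x, a common value-function shape for linear systems."""
    n = len(x)
    return sum(x[i] * kernel[i][j] * x[j] for i in range(n) for j in range(n))


def blended_value(x, prototypes, kernels, temperature=1.0):
    """Responsibility-weighted sum of the per-model quadratic values."""
    r = responsibilities(x, prototypes, temperature)
    return sum(ri * quadratic_value(x, P) for ri, P in zip(r, kernels))
```

In this sketch an observation close to one prototype gives that model a weight near 1, so the blended value approaches that model's own value function, while an observation far from every prototype triggers the addition of a new model.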

Year: 2016 PMID: 27831897 DOI: 10.1109/TCYB.2016.2618926

Source DB: PubMed Journal: IEEE Trans Cybern ISSN: 2168-2267 Impact factor: 11.448

Cited

1 in total

1. Reinforcement Learning-Based End-to-End Parking for Automatic Parking System.

Authors: Peizhi Zhang; Lu Xiong; Zhuoping Yu; Peiyuan Fang; Senwei Yan; Jie Yao; Yi Zhou
Journal: Sensors (Basel) Date: 2019-09-16 Impact factor: 3.576
