Incremental Reinforcement Learning in Continuous Spaces via Policy Relaxation and Importance Weighting.

Abstract

This paper presents a systematic incremental learning method for reinforcement learning in continuous spaces where the learning environment is dynamic. The goal is to incrementally adjust the policy previously learned in the original environment to a new one whenever the environment changes. To improve adaptability to the ever-changing environment, we propose a two-step solution integrated with the incremental learning procedure: policy relaxation and importance weighting. First, the behavior policy is relaxed to a random one in the initial learning episodes to encourage proper exploration in the new environment. This alleviates the conflict between the new information and the existing knowledge, leading to better adaptation in the long term. Second, it is observed that episodes receiving higher returns are more in line with the new environment and hence contain more new information. During parameter updating, we assign higher importance weights to the learning episodes that contain more new information, thus encouraging the previous optimal policy to adapt faster to a new one that fits the new environment. Empirical studies on continuous control tasks with varying configurations verify that the proposed method adapts significantly faster to various dynamic environments than the baselines.
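
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the uniform random behavior policy, the softmax weighting over episode returns, and all names (`behavior_action`, `importance_weights`, `weighted_update`, `relax_episodes`, `temperature`) are assumptions made here for illustration only.

```python
import math
import random


def behavior_action(policy_action, episode, relax_episodes, low, high, rng):
    """Policy relaxation (assumed form): act uniformly at random during the
    first `relax_episodes` episodes in the new environment, then follow the
    learned policy's action."""
    if episode < relax_episodes:
        return rng.uniform(low, high)
    return policy_action


def importance_weights(returns, temperature=1.0):
    """Importance weighting (assumed form): softmax over episode returns,
    so episodes with higher returns receive larger weights."""
    m = max(returns)  # subtract the max for numerical stability
    exps = [math.exp((r - m) / temperature) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_update(theta, episode_grads, returns, lr=0.1, temperature=1.0):
    """One parameter update in which each episode's gradient estimate is
    scaled by its importance weight. A scalar parameter is used for brevity."""
    weights = importance_weights(returns, temperature)
    step = sum(w * g for w, g in zip(weights, episode_grads))
    return theta + lr * step
```

In this sketch, an agent would call `behavior_action` while collecting episodes after an environment change and pass the per-episode gradient estimates and returns to `weighted_update`. How the paper actually defines the weights and the relaxation schedule should be taken from the paper itself.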


Year: 2019 PMID: 31395556 DOI: 10.1109/TNNLS.2019.2927320

Source DB: PubMed Journal: IEEE Trans Neural Netw Learn Syst ISSN: 2162-237X Impact factor: 10.451

Cited

1 in total

Review 1. Deep Reinforcement Learning for Resource Management on Network Slicing: A Survey.

Authors: Johanna Andrea Hurtado Sánchez; Katherine Casilimas; Oscar Mauricio Caicedo Rendon
Journal: Sensors (Basel) Date: 2022-04-15 Impact factor: 3.847
