Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning.

Zhipeng Ren, Daoyi Dong, Huaxiong Li, Chunlin Chen.

Abstract

In this paper, a new training paradigm is proposed for deep reinforcement learning using self-paced prioritized curriculum learning with coverage penalty. The proposed deep curriculum reinforcement learning (DCRL) takes full advantage of experience replay by adaptively selecting appropriate transitions from replay memory based on the complexity of each transition. The complexity criterion in DCRL consists of a self-paced priority and a coverage penalty. The self-paced priority reflects the relationship between the temporal-difference error and the difficulty of the current curriculum, improving sample efficiency, while the coverage penalty promotes sample diversity. Compared with the deep Q-network (DQN) and prioritized experience replay (PER) methods, the DCRL algorithm is evaluated on Atari 2600 games, and the experimental results show that DCRL outperforms DQN and PER on most of these games. Further results show that the proposed curriculum training paradigm of DCRL is also applicable and effective for other memory-based deep reinforcement learning approaches, such as double DQN and the dueling network. All the experimental results demonstrate that DCRL can achieve improved training efficiency and robustness for deep reinforcement learning.
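The sampling scheme the abstract describes can be sketched as a replay buffer whose priorities combine the two criteria. The sketch below is a hypothetical illustration, not the paper's implementation: the class name, the Gaussian-shaped self-paced weighting around the current curriculum difficulty, and the exponential decay used for the coverage penalty are all assumptions for illustration only.

```python
import random
import math

class DCRLReplayBuffer:
    """Minimal sketch of a DCRL-style replay buffer (hypothetical API).

    Each transition's priority combines a self-paced term, which peaks
    when its |TD error| matches the current curriculum difficulty, with
    a coverage penalty that down-weights frequently replayed transitions.
    """

    def __init__(self, capacity=10000, coverage_beta=0.1):
        self.capacity = capacity
        self.coverage_beta = coverage_beta  # strength of the coverage penalty (assumed form)
        self.transitions = []    # stored (state, action, reward, next_state) tuples
        self.td_errors = []      # |TD error| per transition
        self.replay_counts = []  # how often each transition has been sampled

    def add(self, transition, td_error):
        # Evict the oldest transition once capacity is reached.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.td_errors.pop(0)
            self.replay_counts.pop(0)
        self.transitions.append(transition)
        self.td_errors.append(abs(td_error))
        self.replay_counts.append(0)

    def _priority(self, i, difficulty):
        # Self-paced term (assumed Gaussian shape): highest when the
        # |TD error| is close to the current curriculum difficulty.
        self_paced = math.exp(-((self.td_errors[i] - difficulty) ** 2))
        # Coverage penalty (assumed exponential decay): transitions
        # replayed often are progressively down-weighted.
        coverage = math.exp(-self.coverage_beta * self.replay_counts[i])
        return self_paced * coverage

    def sample(self, batch_size, difficulty):
        # Sample proportionally to priority, then record replays so the
        # coverage penalty takes effect on subsequent draws.
        priorities = [self._priority(i, difficulty)
                      for i in range(len(self.transitions))]
        total = sum(priorities)
        probs = [p / total for p in priorities]
        indices = random.choices(range(len(self.transitions)),
                                 weights=probs, k=batch_size)
        for i in indices:
            self.replay_counts[i] += 1
        return [self.transitions[i] for i in indices]
```

As the curriculum advances, the caller would raise `difficulty`, shifting the sampling focus toward harder transitions while the coverage penalty keeps any single transition from dominating the batches.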

Year:  2018        PMID: 29771673     DOI: 10.1109/TNNLS.2018.2790981

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  1 in total

1.  Fast prediction of blood flow in stenosed arteries using machine learning and immersed boundary-lattice Boltzmann method.

Authors:  Li Wang; Daoyi Dong; Fang-Bao Tian
Journal:  Front Physiol       Date:  2022-08-26       Impact factor: 4.755
