
Structure-Preserving Imitation Learning With Delayed Reward: An Evaluation Within the RoboCup Soccer 2D Simulation Environment.

Quang Dang Nguyen, Mikhail Prokopenko.

Abstract

We describe and evaluate a neural network-based architecture that aims to imitate and improve the performance of a fully autonomous soccer team in the RoboCup Soccer 2D Simulation environment. The approach uses a deep Q-network architecture for action determination and a deep neural network for parameter learning. The proposed solution is shown to be feasible for replacing a selected behavioral module in a well-established RoboCup base team, Gliders2d, in which behavioral modules have been evolved with human experts in the loop. Furthermore, we introduce an additional performance-correlated signal (a delayed reward signal), which enables a search for local maxima during the training phase. The extension is compared against a known benchmark. Finally, we investigate the extent to which preserving the structure of expert-designed behaviors affects the performance of a neural network-based solution.
Copyright © 2020 Nguyen and Prokopenko.
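
The abstract names two ingredients that a small example can make concrete: choosing a discrete action from Q-values, and learning from a reward that arrives only after a delay. The following is a minimal sketch of those two generic ideas, not the authors' implementation. The function names `greedy_action` and `discounted_returns`, the discount factor `gamma`, and the toy numbers are all assumptions made for illustration; no neural network is involved, and plain lists stand in for network outputs.

```python
def greedy_action(q_values):
    """Pick the index of the largest Q-value (greedy action selection).

    `q_values` is a plain list standing in for the output of a Q-network;
    this is an illustrative assumption, not the paper's interface.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode.

    With a delayed reward, most entries of `rewards` are zero and only a
    late step carries a signal; discounting spreads that signal backwards
    over the earlier steps.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each step sees the return of its successor.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Hypothetical Q-values for three candidate actions.
    print(greedy_action([0.1, 0.7, 0.2]))            # -> 1
    # Reward arrives only at the final step of a four-step episode.
    print(discounted_returns([0, 0, 0, 1], gamma=0.5))  # -> [0.125, 0.25, 0.5, 1.0]
```

In the toy episode the only non-zero reward is at the last step, yet every earlier step receives a non-zero return, which is the sense in which a delayed signal can still guide earlier decisions.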

Keywords:  deep learning; deep reinforcement learning; end-to-end learning; imitation learning; learning with delayed reward; learning with structure preservation

Year: 2020
PMID: 33501289
PMCID: PMC7805756
DOI: 10.3389/frobt.2020.00123
Journal: Front Robot AI
ISSN: 2296-9144
Source DB: PubMed


