Structure-Preserving Imitation Learning With Delayed Reward: An Evaluation Within the RoboCup Soccer 2D Simulation Environment.



Quang Dang Nguyen, Mikhail Prokopenko.

Abstract

We describe and evaluate a neural network-based architecture that aims to imitate and improve the performance of a fully autonomous soccer team in the RoboCup Soccer 2D Simulation environment. The approach uses a deep Q-network architecture for action determination and a deep neural network for parameter learning. The proposed solution is shown to be feasible for replacing a selected behavioral module in a well-established RoboCup base team, Gliders2d, in which behavioral modules have been evolved with human experts in the loop. Furthermore, we introduce an additional performance-correlated signal (a delayed reward signal), enabling a search for local maxima during the training phase. The extension is compared against a known benchmark. Finally, we investigate the extent to which preserving the structure of expert-designed behaviors affects the performance of a neural network-based solution.
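
As a rough illustration of the delayed reward idea mentioned in the abstract, the sketch below shows how a reward that arrives only at the end of an episode can propagate back to earlier decisions through the standard one-step Q-learning update. It is not the authors' implementation: it uses a plain lookup table instead of a deep Q-network, and the state names, action names, learning rate `alpha`, and discount factor `gamma` are all hypothetical values chosen for the example.

```python
from collections import defaultdict


def make_q():
    """Create an empty Q-table: q[state][action] -> value (defaults to 0.0)."""
    return defaultdict(lambda: defaultdict(float))


def q_learning_episode(q, episode, alpha=0.5, gamma=0.9):
    """Apply one-step Q-learning updates over a recorded episode.

    episode is a list of (state, action, reward, next_state) tuples;
    next_state is None on the terminal step.
    """
    for state, action, reward, next_state in episode:
        if next_state is None:
            # Terminal step: the target is just the (delayed) reward.
            target = reward
        else:
            # Bootstrap from the best known value of the next state.
            best_next = max(q[next_state].values(), default=0.0)
            target = reward + gamma * best_next
        q[state][action] += alpha * (target - q[state][action])
    return q


# Hypothetical two-step episode: the only non-zero reward is on the last step.
EPISODE = [
    ("midfield", "pass", 0.0, "attack"),
    ("attack", "shoot", 1.0, None),
]

q = make_q()
for _ in range(2):
    q_learning_episode(q, EPISODE)
```

After the first pass only the final action has a non-zero value; on the second pass that value is discounted and credited to the earlier action, which is the sense in which a delayed signal can still shape earlier choices.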


Keywords: deep learning; deep reinforcement learning; end-to-end learning; imitation learning; learning with delayed reward; learning with structure preservation

Year: 2020 PMID: 33501289 PMCID: PMC7805756 DOI: 10.3389/frobt.2020.00123

Source DB: PubMed Journal: Front Robot AI ISSN: 2296-9144

3 in total

1. Quantifying Long-Range Interactions and Coherent Structure in Multi-Agent Dynamics.

Authors: Oliver M Cliff; Joseph T Lizier; X Rosalind Wang; Peter Wang; Oliver Obst; Mikhail Prokopenko
Journal: Artif Life Date: 2017-01-31 Impact factor: 0.667

2. Supervised Learning for Dynamical System Learning.

Authors: Ahmed Hefny; Carlton Downey; Geoffrey J Gordon
Journal: Adv Neural Inf Process Syst Date: 2015

3. Human-level control through deep reinforcement learning.

Authors: Volodymyr Mnih; Koray Kavukcuoglu; David Silver; Andrei A Rusu; Joel Veness; Marc G Bellemare; Alex Graves; Martin Riedmiller; Andreas K Fidjeland; Georg Ostrovski; Stig Petersen; Charles Beattie; Amir Sadik; Ioannis Antonoglou; Helen King; Dharshan Kumaran; Daan Wierstra; Shane Legg; Demis Hassabis
Journal: Nature Date: 2015-02-26 Impact factor: 49.962
