Md Muhaimin Rahman, S M Hasanur Rashid, M M Hossain.
Abstract
In this paper, the implementations of two reinforcement learning algorithms, namely Q-learning and deep Q network (DQN), on the Gazebo model of a self-balancing robot are discussed. The goal of the experiments is to make the robot model learn the best actions for staying balanced in an environment. The longer it remains within a specified limit, the more reward it accumulates and hence the more balanced it is. We ran tests with a range of hyperparameters and present the resulting performance curves.
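As an illustration of the Q-learning side of the abstract, the following is a minimal sketch of a tabular Q-learning step. It is not the authors' implementation: the single-variable tilt-angle state, the bin count, the 0.4 limit, the hyperparameter defaults, and the helper names `discretize`, `q_update` and `epsilon_greedy` are all assumptions made for this example.

```python
import random


def discretize(angle, limit=0.4, bins=8):
    """Map a tilt angle to a bin index in [0, bins - 1].

    The limit and bin count are illustrative assumptions.
    """
    clipped = max(-limit, min(limit, angle))
    idx = int((clipped + limit) / (2 * limit) * bins)
    return min(idx, bins - 1)


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q(s, a).

    The target is r + gamma * max_a' Q(s', a'), or just r at episode end.
    """
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    values = q[state]
    return values.index(max(values))
```

In a training loop under these assumptions, each simulation step would discretize the observed angle, choose an action with `epsilon_greedy`, give a positive reward while the robot stays within the limit, and call `q_update` with the resulting transition.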
Year: 2018 PMID: 30613463 PMCID: PMC6302870 DOI: 10.1186/s40638-018-0091-9
Source DB: PubMed Journal: Robotics Biomim ISSN: 2197-3768
Fig. 1 Simple block diagram of the model
Fig. 2 Gazebo model
Fig. 3 Controller block diagram
Fig. 4 Rewards for different
Fig. 5 Sample deep Q network architecture
Fig. 6 Schematic diagram of DQN architecture used
Fig. 7 Rewards for three different s with 0.999
Fig. 8 Rewards versus episodes for new architecture
Fig. 9 Performance curve for PID, fuzzy logic and LQR