Fengyun Zhang, Shukai Duan, Lidan Wang.
Abstract
This paper proposes RNH-QL, an improved method for route searching in large state spaces that combines a radial basis function (RBF) network with heuristic Q-learning. First, it addresses the inefficiency of reinforcement learning when a problem's state space grows and prior information about the environment is lacking. Second, with the RBF network serving as the weight-updating rule, reward shaping gives the agent additional feedback in some intermediate states, which helps guide it towards the goal state in a more controlled fashion. Meanwhile, the Q-learning process gives access to the underlying dynamic knowledge, so background knowledge for the upper-level RBF network is not required. Third, it improves learning efficiency by incorporating a greedy exploitation strategy to train the neural network, as confirmed by the experimental results.
Keywords: Greedy exploitation; Heuristic reinforcement learning; Neural network; Route searching
Year: 2017 PMID: 28559954 PMCID: PMC5430242 DOI: 10.1007/s11571-017-9423-7
Source DB: PubMed Journal: Cogn Neurodyn ISSN: 1871-4080 Impact factor: 5.082
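
The reward-shaping idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' RNH-QL method: the RBF network is replaced here by a hand-written heuristic (negative Manhattan distance to the goal), and the grid world, the step cost of -1, and all names and parameters (`shaped_q_learning`, `greedy_route`, `potential`, `size`, `alpha`, `epsilon`) are assumptions made for illustration only. It shows tabular Q-learning with mostly greedy action selection, where a potential-based shaping term gives the agent extra feedback in intermediate states.

```python
import random


def shaped_q_learning(size=5, episodes=300, alpha=0.5, gamma=1.0,
                      epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical size x size grid with shaping."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(4)}

    def potential(s):
        # Assumed heuristic standing in for a learned network:
        # negative Manhattan distance to the goal (0 at the goal itself).
        return -(abs(goal[0] - s[0]) + abs(goal[1] - s[1]))

    def step(s, a):
        # Moves that would leave the grid keep the agent in place.
        r = min(max(s[0] + moves[a][0], 0), size - 1)
        c = min(max(s[1] + moves[a][1], 0), size - 1)
        return (r, c), -1.0  # assumed cost of -1 per step

    for _ in range(episodes):
        s = (0, 0)
        for _ in range(4 * size * size):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(4)  # occasional exploration
            else:
                a = max(range(4), key=lambda k: q[(s, k)])  # greedy choice
            nxt, reward = step(s, a)
            # Potential-based shaping: extra feedback in intermediate states.
            shaped = reward + gamma * potential(nxt) - potential(s)
            done = nxt == goal
            future = 0.0 if done else gamma * max(q[(nxt, k)] for k in range(4))
            q[(s, a)] += alpha * (shaped + future - q[(s, a)])
            s = nxt
            if done:
                break
    return q, goal, step


def greedy_route(q, goal, step, start=(0, 0), limit=100):
    """Follow the greedy policy from start until the goal or the step limit."""
    route, s = [start], start
    while s != goal and len(route) <= limit:
        a = max(range(4), key=lambda k: q[(s, k)])
        s, _ = step(s, a)
        route.append(s)
    return route
```

Calling `greedy_route(*shaped_q_learning())` returns the list of grid cells visited from the start corner to the goal corner under the learned greedy policy. Under these assumptions, moves towards the goal receive a shaped reward of 0 and all other moves a negative one, so the agent is steered at every intermediate state rather than only on arrival.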