Yu Du1, Shaochuan Fu1, Changxiang Lu1, Qiang Zhou1, Chunfang Li2,3,1.
Abstract
This paper presents a route design model for simultaneous pickup and delivery that accounts for the use of express lockers. Unlike the traditional traveling salesman problem (TSP), the model analyzes a scenario in which a courier serves a neighborhood over multiple trips. Given the locker and vehicle capacities, the total cost comprises back orders, lost sales, and traveling time. We aim to minimize the total cost while satisfying all requests. A modified deep Q-learning network is designed to solve the model, leveraging masked multi-head attention to select the courier paths. Our algorithm outperforms other stochastic optimization methods, finding better solutions with O(n) computational time during evaluation. The experiments show that reinforcement learning is a better choice than traditional stochastic optimization methods: it consumes less power and time during evaluation, which indicates that the approach is better suited to large-scale data and broad deployment.
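The masking idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' network: it uses a single attention head with a greedy choice, and every name in it (`masked_softmax`, `select_next_stop`, `query`, `keys`, `demands`, `remaining_capacity`) is hypothetical. The assumption is that a stop is masked out when it has already been visited or when its demand exceeds the remaining vehicle capacity, so it can never be selected.

```python
import math


def masked_softmax(scores, mask):
    """Softmax over scores; entries whose mask is True get probability 0."""
    feasible = [s for s, m in zip(scores, mask) if not m]
    if not feasible:
        raise ValueError("all candidate stops are masked")
    peak = max(feasible)  # subtract the max for numerical stability
    exps = [0.0 if m else math.exp(s - peak) for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def select_next_stop(query, keys, demands, visited, remaining_capacity):
    """Pick the next stop greedily from masked attention probabilities.

    Assumption: a stop is infeasible if it was already visited or if its
    demand exceeds the remaining vehicle capacity.
    """
    scale = math.sqrt(len(query))
    # Scaled dot-product compatibility between the courier state and each stop.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    mask = [v or d > remaining_capacity for v, d in zip(visited, demands)]
    probs = masked_softmax(scores, mask)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


# Illustrative values only: three stops with 2-dimensional embeddings.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
demands = [5, 2, 1]
visited = [False, False, True]
best, probs = select_next_stop(query, keys, demands, visited, remaining_capacity=3)
```

In this example stop 0 has the highest raw score but is masked because its demand exceeds the remaining capacity, and stop 2 is masked because it was already visited, so only stop 1 remains selectable.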
Year: 2021 PMID: 34122533 PMCID: PMC8166504 DOI: 10.1155/2021/5590758
Source DB: PubMed Journal: Comput Intell Neurosci