Reinforcement learning has potential in the area of intelligent transportation due to its generality and real-time feature. The Q-learning algorithm, which is an early proposed algorithm, has its own merits to solve the train timetable rescheduling (TTR) problem. However, it has shortage in two aspects: Dimensional limits of action and a slow convergence rate. In this paper, a deep deterministic policy gradient (DDPG) algorithm is applied to solve the energy-aimed train timetable rescheduling (ETTR) problem. This algorithm belongs to reinforcement learning, which fulfills real-time requirements of the ETTR problem, and has adaptability on random disturbances. Superior to the Q-learning, DDPG has a continuous state space and action space. After enough training, the learning agent based on DDPG takes proper action by adjusting the cruising speed and the dwelling time continuously for each train in a metro network when random disturbances happen. Although training needs an iteration for thousands of episodes, the policy decision during each testing episode takes a very short time. Models for the metro network, based on a real case of the Shanghai Metro Line 1, are established as a training and testing environment. To validate the energy-saving effect and the real-time feature of the proposed algorithm, four experiments are designed and conducted. Compared with the no action strategy, results show that the proposed algorithm has real-time performance, and saves a significant percentage of energy under random disturbances.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.