Article

Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem

Guang Yang, Feng Zhang, Cheng Gong and Shiwen Zhang

1 School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dongchuan RD, Shanghai 200240, China
2 School of Electrical and Computer Engineering, Georgia Institute of Technology, 85 5th Street NW, Atlanta, GA 30308, USA
* Author to whom correspondence should be addressed.
Energies 2019, 12(18), 3461; https://doi.org/10.3390/en12183461
Received: 17 July 2019 / Revised: 30 August 2019 / Accepted: 2 September 2019 / Published: 7 September 2019
Abstract: Reinforcement learning has potential in the area of intelligent transportation due to its generality and real-time capability. Q-learning, an early reinforcement learning algorithm, has its own merits for solving the train timetable rescheduling (TTR) problem, but it suffers from two shortcomings: dimensional limits on its action space and a slow convergence rate. In this paper, a deep deterministic policy gradient (DDPG) algorithm is applied to solve the energy-aimed train timetable rescheduling (ETTR) problem. As a reinforcement learning algorithm, DDPG fulfills the real-time requirements of the ETTR problem and adapts to random disturbances. Unlike Q-learning, DDPG operates over continuous state and action spaces. After sufficient training, the DDPG-based learning agent responds to random disturbances by continuously adjusting the cruising speed and dwelling time of each train in a metro network. Although training requires iterating over thousands of episodes, the policy decision in each testing episode takes very little time. Models of the metro network, based on a real case of Shanghai Metro Line 1, are established as the training and testing environment. To validate the energy-saving effect and the real-time capability of the proposed algorithm, four experiments are designed and conducted. Compared with the no-action strategy, the results show that the proposed algorithm performs in real time and saves a significant percentage of energy under random disturbances.
Keywords: deep deterministic policy gradient; reinforcement learning; random disturbances; train timetable rescheduling; timetable optimization
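
The abstract describes a DDPG agent with continuous state and action spaces, whose action continuously adjusts each train's cruising speed and dwelling time. The following is a minimal, hypothetical PyTorch sketch of the actor-critic update at the heart of DDPG; the state/action dimensions, network sizes, and hyperparameters are illustrative assumptions, not the configuration used in the paper.

    # Minimal DDPG sketch (hypothetical names and dimensions; not the authors' code).
    # State: assumed vector describing train positions, speeds, and disturbances.
    # Action: per-train [cruising-speed adjustment, dwelling-time adjustment] in [-1, 1].
    import torch
    import torch.nn as nn

    STATE_DIM = 8    # assumed size of the metro-network state vector
    ACTION_DIM = 2   # [cruising-speed adjustment, dwelling-time adjustment]

    class Actor(nn.Module):
        """Deterministic policy mu(s) -> a in [-1, 1]^ACTION_DIM."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM, 64), nn.ReLU(),
                nn.Linear(64, ACTION_DIM), nn.Tanh(),
            )
        def forward(self, s):
            return self.net(s)

    class Critic(nn.Module):
        """Action-value function Q(s, a)."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                nn.Linear(64, 1),
            )
        def forward(self, s, a):
            return self.net(torch.cat([s, a], dim=-1))

    actor, critic = Actor(), Critic()
    actor_target, critic_target = Actor(), Critic()
    actor_target.load_state_dict(actor.state_dict())
    critic_target.load_state_dict(critic.state_dict())
    actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
    critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
    GAMMA, TAU = 0.99, 0.005  # assumed discount factor and soft-update rate

    def update(batch):
        """One DDPG update from a replay-buffer batch (s, a, r, s2, done)."""
        s, a, r, s2, done = batch
        with torch.no_grad():
            q_next = critic_target(s2, actor_target(s2))
            y = r + GAMMA * (1.0 - done) * q_next          # TD target
        critic_loss = nn.functional.mse_loss(critic(s, a), y)
        critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

        actor_loss = -critic(s, actor(s)).mean()           # deterministic policy gradient
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

        # Polyak averaging of target networks
        for net, tgt in ((actor, actor_target), (critic, critic_target)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.data.mul_(1 - TAU).add_(TAU * p.data)

    # Example call with a synthetic batch of 32 transitions:
    batch = (torch.randn(32, STATE_DIM), torch.rand(32, ACTION_DIM) * 2 - 1,
             torch.randn(32, 1), torch.randn(32, STATE_DIM), torch.zeros(32, 1))
    update(batch)

In this sketch the tanh output keeps both adjustments bounded, a natural way to confine cruising speed and dwelling time within operational limits; a replay buffer and exploration noise (e.g., Ornstein-Uhlenbeck) would also be needed for full DDPG training, and the energy-based reward signal would come from the metro-network simulation environment.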
MDPI and ACS Style

Yang, G.; Zhang, F.; Gong, C.; Zhang, S. Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem. Energies 2019, 12, 3461. https://doi.org/10.3390/en12183461

AMA Style

Yang G, Zhang F, Gong C, Zhang S. Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem. Energies. 2019; 12(18):3461. https://doi.org/10.3390/en12183461

Chicago/Turabian Style

Yang, Guang, Feng Zhang, Cheng Gong, and Shiwen Zhang. 2019. "Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem." Energies 12, no. 18: 3461. https://doi.org/10.3390/en12183461
