This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Large-Scale Metro Train Timetable Rescheduling via Multi-Agent Deep Reinforcement Learning: A High-Dimensional Optimization Approach in Flatland Environment
by Jufen Yang, Haozhe Yang *, Weikang Wang and Chengyang Xia
School of Urban Rail Transportation, Shanghai University of Engineering Science, Shanghai 201620, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(7), 3338; https://doi.org/10.3390/app16073338
Submission received: 26 December 2025 / Revised: 6 March 2026 / Accepted: 21 March 2026 / Published: 30 March 2026
Abstract
Metro train timetable rescheduling (TTR) is a critical task for ensuring the reliability of urban rail transit systems. However, with the increasing density of railway networks and the growing number of operational trains, TTR has evolved into a typical high-dimensional, large-scale optimization problem. Traditional mathematical programming and heuristic approaches often struggle with the "curse of dimensionality" and fail to provide real-time responses under stochastic disturbances. To address these challenges, this paper proposes a novel framework based on Multi-Agent Deep Reinforcement Learning (MADRL). Specifically, we model the TTR problem as a decentralized cooperative process and use the Multi-Agent Advantage Actor-Critic (MAA2C) algorithm to optimize train schedules dynamically. The proposed framework is implemented in the Flatland simulation environment, which supports complex, arbitrary track topologies. We design a composite reward function that minimizes total delay deviation while maximizing passenger satisfaction, subject to constraints on headway, operating time, and train capacity. Furthermore, to enhance the robustness of the model against high-dimensional state uncertainties, random disturbances following a negative exponential distribution are injected during training. Experimental results across scenarios ranging from simple dual-track lines to complex random networks demonstrate that the MAA2C-based approach significantly outperforms traditional baselines: it not only achieves faster convergence in small-scale scenarios but also shows superior computational efficiency and scalability in large-scale environments, effectively reducing passenger waiting times. This study validates the potential of MADRL for solving high-dimensional traffic control problems in intelligent transportation systems.
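Two quantitative ingredients named in the abstract, the composite reward and the exponentially distributed disturbances, can be sketched even before the full text is available. The snippet below is a minimal illustration only: the weight values, the headway threshold, and all names (sample_disturbance, composite_reward) are hypothetical assumptions rather than the authors' published formulation, and it uses plain NumPy instead of the Flatland API.

import numpy as np

# Illustrative sketch of the reward and disturbance model described in the
# abstract. All names and constants below are assumptions, not the authors'
# published values.
W_DELAY = 1.0          # weight on total delay deviation (assumed)
W_SATISFACTION = 0.5   # weight on passenger satisfaction (assumed)
MIN_HEADWAY = 120.0    # minimum headway in seconds (assumed)

def sample_disturbance(mean_delay=300.0, rng=None):
    """Draw a random delay (seconds) from a negative exponential
    distribution, as injected during training per the abstract."""
    rng = rng if rng is not None else np.random.default_rng()
    return rng.exponential(scale=mean_delay)

def composite_reward(delays, satisfaction, headways):
    """Scalar reward: penalize total delay deviation, reward passenger
    satisfaction, and penalize headway-constraint violations."""
    delay_term = -W_DELAY * float(np.sum(np.abs(delays)))
    satisfaction_term = W_SATISFACTION * satisfaction
    # Soft penalty for any headway that falls below the minimum.
    violation = float(np.sum(np.maximum(0.0, MIN_HEADWAY - np.asarray(headways))))
    return delay_term + satisfaction_term - violation

# Example: three trains, one hit by a random exponential delay.
rng = np.random.default_rng(seed=0)
delays = [0.0, sample_disturbance(rng=rng), 30.0]
print(composite_reward(delays, satisfaction=0.8, headways=[150.0, 110.0]))

The exponential draw matches the "negative exponential distribution" stated in the abstract; folding the headway constraint into the reward as a max(0, ·) soft penalty is one common way to handle hard operational constraints in a scalar reward, though the paper may enforce them differently.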
Share and Cite
MDPI and ACS Style
Yang, J.; Yang, H.; Wang, W.; Xia, C. Large-Scale Metro Train Timetable Rescheduling via Multi-Agent Deep Reinforcement Learning: A High-Dimensional Optimization Approach in Flatland Environment. Appl. Sci. 2026, 16, 3338. https://doi.org/10.3390/app16073338
AMA Style
Yang J, Yang H, Wang W, Xia C. Large-Scale Metro Train Timetable Rescheduling via Multi-Agent Deep Reinforcement Learning: A High-Dimensional Optimization Approach in Flatland Environment. Applied Sciences. 2026; 16(7):3338. https://doi.org/10.3390/app16073338
Chicago/Turabian Style
Yang, Jufen, Haozhe Yang, Weikang Wang, and Chengyang Xia. 2026. "Large-Scale Metro Train Timetable Rescheduling via Multi-Agent Deep Reinforcement Learning: A High-Dimensional Optimization Approach in Flatland Environment." Applied Sciences 16, no. 7: 3338. https://doi.org/10.3390/app16073338
APA Style
Yang, J., Yang, H., Wang, W., & Xia, C. (2026). Large-Scale Metro Train Timetable Rescheduling via Multi-Agent Deep Reinforcement Learning: A High-Dimensional Optimization Approach in Flatland Environment. Applied Sciences, 16(7), 3338. https://doi.org/10.3390/app16073338
Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.