Priority Control of Intelligent Connected Dedicated Bus Corridor Based on Deep Deterministic Policy Gradient
Abstract
1. Introduction
- (1) A chain-type green-wave control method is proposed to resolve the coordination conflict between the social-vehicle arterial and the bus arterial. The method alternates between the arterial coordination schemes that serve social vehicles and dedicated buses, overcoming the limitations of a single arterial coordination scheme.
- (2) To address poor real-time decision-making in high-dimensional continuous state spaces, a deep reinforcement learning model for chain green-wave control is developed. The model makes real-time decisions over continuous states and actions.
2. Literature Review
2.1. Bus Arterial-Line Priority Phase Adjustment
2.2. Bus Arterial-Line Green-Wave Control
2.3. Summary
3. Outline of Research Framework
3.1. Research Framework
3.2. Assumptions
- (1) This study focuses on a segment of a local arterial road within the bus route network that is fully equipped with dedicated bus lanes, a design intended to minimize interference from non-public vehicles during bus operations.
- (2) Multi-modal vehicle detectors are assumed at the entrance and exit of every road section to obtain the cumulative number of passing vehicles.
4. Analysis of Operating Characteristics for Dedicated Bus Arterials
4.1. Analysis of Intersection State Characteristics for Dedicated Buses
4.2. Adjustment Strategy for Arterial Signal Control
5. Arterial Signal Control Method
5.1. State Space
5.2. Action Space
5.3. Reward Function
5.4. Fault-Tolerance Mechanism
- Normal Operation: Fuse GPS positioning with multi-modal detector data (e.g., video/loop counts). Cross-validation minimizes measurement errors [27].
- Partial Failure: If one detector type fails (e.g., video sensor), retain functional sources (e.g., GPS + loops) and trigger an alert.
- Complete Failure: If all detectors malfunction:
  - (a) Activate a high-priority alarm;
  - (b) Revert to the social vehicle arterial coordination scheme (baseline strategy);
  - (c) Maintain current signal timing until recovery.
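The three operating modes above can be read as a small decision rule over detector health. The following is a minimal sketch of that rule, not the authors' implementation: the source names (`gps`, `video`, `loop`), the `plan` function, and the returned field names are all hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Dict


class Mode(Enum):
    NORMAL = "normal"              # every source healthy: fuse and cross-validate
    PARTIAL = "partial_failure"    # some sources down: keep the rest, raise an alert
    COMPLETE = "complete_failure"  # nothing usable: alarm and fall back


def select_mode(health: Dict[str, bool]) -> Mode:
    """Classify detector health into one of the three operating modes."""
    working = [name for name, ok in health.items() if ok]
    if not working:
        return Mode.COMPLETE
    if len(working) == len(health):
        return Mode.NORMAL
    return Mode.PARTIAL


def plan(health: Dict[str, bool]) -> dict:
    """Return the data sources to use, the alert level, and the control scheme."""
    mode = select_mode(health)
    sources = sorted(name for name, ok in health.items() if ok)
    if mode is Mode.NORMAL:
        return {"mode": mode, "sources": sources, "alert": None, "scheme": "drl"}
    if mode is Mode.PARTIAL:
        return {"mode": mode, "sources": sources, "alert": "warning", "scheme": "drl"}
    # Complete failure: high-priority alarm and the baseline coordination scheme.
    return {"mode": mode, "sources": [], "alert": "high_priority",
            "scheme": "social_vehicle_baseline"}
```

For example, `plan({"gps": True, "video": False, "loop": True})` keeps the GPS and loop sources and raises a warning, while a dictionary with every source marked `False` selects the baseline scheme.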
6. Deep Reinforcement Learning Model
6.1. Simulation Environment Construction
6.2. Deep Reinforcement Learning Model
Algorithm 1. Deep deterministic policy gradient algorithm.
Input: state and reward
Output: action
1. For each bus, at each iteration:
2. Initialize the test system parameters, including the network parameters and the reward and punishment functions.
3. Combining the adjustment constraints with a random factor, the model determines the action and passes it to the simulation.
4. The simulation executes the action and returns the reward value and the new state.
5. If the sample pool overflows, delete the earliest sample records in chronological order.
6. The actor network stores the transition (state, action, reward, new state) in the experience replay pool, which supplies the training data for the online networks.
7. Sample N sets of data from the experience pool as the training set for the online actor network and Q network.
8. Use the standard backpropagation (BP) method to calculate the gradient of the online Q network.
9. Update the parameters of the online Q network.
10. Calculate the policy gradient (PG) of the actor network.
11. Update the parameters of the online actor network.
12. Update the parameters of the target networks.
13. End for.
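Steps 3 and 5 to 7 of Algorithm 1 (noisy action selection under constraints, overflow handling, and sampling from the experience pool) can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions: the class and function names are hypothetical, the action is treated as a single scalar, and plain Gaussian noise stands in for whatever random factor the authors use (the hyperparameter table refers to OU noise).

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity sample pool; the earliest record is dropped on overflow."""

    def __init__(self, capacity):
        # deque(maxlen=...) discards the oldest item automatically (step 5).
        self.pool = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        # Step 6: save one transition for later training.
        self.pool.append((state, action, reward, next_state))

    def sample(self, n, rng=random):
        # Step 7: draw up to n transitions without replacement.
        return rng.sample(list(self.pool), min(n, len(self.pool)))

    def __len__(self):
        return len(self.pool)


def noisy_action(policy_output, noise_scale, low, high, rng=random):
    """Step 3: perturb the actor output, then clamp it to the allowed range.

    Gaussian noise is an assumption made for brevity, not the paper's OU process.
    """
    perturbed = policy_output + rng.gauss(0.0, noise_scale)
    return max(low, min(high, perturbed))
```

Clamping after adding noise keeps every explored action inside the adjustment constraints, so the simulation never receives an infeasible signal adjustment.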
7. Experiment Analysis
7.1. Hyperparameter Configuration
7.2. State Space Analysis and Training Optimization
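- Gradient Clipping: Constrain policy gradients to the clipping threshold (5.0 in the hyperparameter table) to prevent explosion.
- Soft Updates: Target network parameters are updated as $\theta' \leftarrow \tau\theta + (1-\tau)\theta'$, where $\theta$ and $\theta'$ denote the online and target network parameters and $\tau = 0.01$ is the target network update coefficient.
- Convergence Criterion: Training terminates when the Q-value fluctuation stays below the 5% limit for 20 consecutive iterations.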
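The three stabilization measures can be illustrated on plain Python lists. The sketch below makes assumptions the text does not confirm: clipping is done on the L2 norm of the gradient, and "fluctuation" is taken to mean the relative change in Q-value between consecutive iterations. The function names are hypothetical; the default values (5.0, 0.01, 5%, 20) come from the hyperparameter table.

```python
import math


def clip_by_norm(grad, threshold=5.0):
    """Rescale the gradient vector if its L2 norm exceeds the threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= threshold:
        return list(grad)
    return [g * threshold / norm for g in grad]


def soft_update(target, online, tau=0.01):
    """Move each target parameter a fraction tau toward its online counterpart."""
    return [tau * w + (1.0 - tau) * wt for wt, w in zip(target, online)]


def has_converged(q_values, limit=0.05, patience=20):
    """True when the last `patience` relative changes are all below `limit`."""
    if len(q_values) < patience + 1:
        return False
    recent = q_values[-(patience + 1):]
    for prev, cur in zip(recent, recent[1:]):
        # Assumed definition: relative change between consecutive iterations.
        if prev == 0 or abs(cur - prev) / abs(prev) >= limit:
            return False
    return True
```

With a small coefficient such as 0.01, the target network trails the online network slowly, which is the usual reason soft updates stabilize the critic's training targets.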
7.3. Deep Reinforcement Learning Model Training
7.4. Verification of Arterial Signal Switching Strategy
7.5. Saturation Scenario Applicability Analysis
8. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
DDPG | Deep Deterministic Policy Gradient |
MDP | Markov Decision Process |
GPS | Global Positioning System |
References
- Guler, S.I.; Menendez, M. Analytical formulation and empirical evaluation of pre-signals for bus priority. Transp. Res. Part B Methodol. 2014, 64, 41–53.
- Truong, L.T.; Currie, G.; Wallace, M.; De Gruyter, C.; An, K. Coordinated Transit Signal Priority Model Considering Stochastic Bus Arrival Time. IEEE Trans. Intell. Transp. Syst. 2019, 20, 1269–1277.
- Colombaroni, C.; Fusco, G.; Isaenko, N. A Simulation-Optimization Method for Signal Synchronization with Bus Priority and Driver Speed Advisory to Connected Vehicles. Transp. Res. Procedia 2020, 45, 890–897.
- Zhang, X.; He, Z.; Zhu, Y.; You, L. DRL-based adaptive signal control for bus priority service under connected vehicle environment. Transp. B Transp. Dyn. 2023, 11, 1455–1477.
- Li, G.; Li, S.; Li, S.; Qin, Y.; Cao, D.; Qu, X.; Cheng, B. Deep reinforcement learning enabled decision-making for autonomous driving at intersections. Automot. Innov. 2020, 3, 374–385.
- Yoon, J.; Ahn, K.; Park, J.; Yeo, H. Transferable traffic signal control: Reinforcement learning with graph centric state representation. Transp. Res. Part C Emerg. Technol. 2021, 130, 103321.
- Liang, X.; Du, X.; Wang, G.; Han, Z. A deep reinforcement learning network for traffic light cycle control. IEEE Trans. Veh. Technol. 2019, 68, 1243–1253.
- Wang, Y.; Guo, Y. Signal Priority Control for Trams Using Deep Reinforcement Learning. Acta Autom. Sin. 2019, 45, 2366–2377.
- Shi, W.; Yu, C.; Ma, W.; Wang, L.; Nie, L. Simultaneous optimization of passive transit priority signals and lane allocation. KSCE J. Civ. Eng. 2020, 24, 624–634.
- Bie, Y.; Liu, Z.; Wang, H. Integrating Bus Priority and Presignal Method at Signalized Intersection: Algorithm Development and Evaluation. J. Transp. Eng. Part A Syst. 2020, 146, 04020044.
- Zeng, X.; Zhang, Y.; Jiao, J.; Yin, K. Route-based transit signal priority using connected vehicle technology to promote bus schedule adherence. IEEE Trans. Intell. Transp. Syst. 2020, 22, 1174–1184.
- Li, J.; Liu, Y.; Zheng, N.; Tang, L.; Yi, H. Regional coordinated bus priority signal control considering pedestrian and vehicle delays at urban intersections. IEEE Trans. Intell. Transp. Syst. 2021, 23, 16690–16700.
- Long, M.; Zou, X.; Zhou, Y.; Chung, E. Deep reinforcement learning for transit signal priority in a connected environment. Transp. Res. Part C Emerg. Technol. 2022, 142, 103814.
- Li, H.; Li, S.; Zhang, X.; Tong, P.; Guo, Y. Dynamic signal priority of the self-driving bus at an isolated intersection considering private vehicles. Sci. Rep. 2023, 13, 17482.
- Xu, M.; Zhai, X.; Sun, Z.; Zhou, X.; Chen, Y. Multiagent control approach with multiple traffic signal priority and coordination. J. Transp. Eng. Part A Syst. 2023, 149, 04022124.
- Hu, X.; Chen, X.; Guo, J.; Dai, G.; Zhao, J.; Long, B.; Zhang, T.; Chen, S. Optimization model for bus priority control considering carbon emissions under non-bus lane conditions. J. Clean. Prod. 2023, 402, 136747.
- Dai, G.; Wang, H.; Wang, W. A bandwidth approach to arterial signal optimisation with bus priority. Transp. A Transp. Sci. 2015, 11, 579–602.
- Xu, M.; An, K.; Ye, Z.; Wang, Y.; Feng, J.; Zhao, J. A bi-level model to resolve conflicting transit priority requests at urban arterials. IEEE Trans. Intell. Transp. Syst. 2018, 20, 1353–1364.
- Wu, K.; Lu, M.; Guler, S.I. Modeling and optimizing bus transit priority along an arterial: A moving bottleneck approach. Transp. Res. Part C Emerg. Technol. 2020, 121, 102873.
- Chen, Y.H.; Cheng, Y.; Chang, G.L. Incorporating bus delay minimization in design of signal progression for arterials accommodating heavy mixed-traffic flows. J. Intell. Transp. Syst. 2023, 27, 187–216.
- Ou, S.; An, K.; Ma, W.; Hegyi, A.; Van Arem, B. Stochastic-priority-integrated signal coordination considering connected bus operation uncertainties. Transp. B Transp. Dyn. 2024, 12, 2297152.
- Shao, Y.; Sun, J.; Kan, Y.; Tian, Y. Operation of dedicated lanes with intermittent priority on highways: Conceptual development and simulation validation. J. Intell. Transp. Syst. 2024, 28, 69–83.
- Seman, L.O.; Koehler, L.A.; Camponogara, E.; Kraus, W., Jr. Integrated headway and bus priority control in transit corridors with bidirectional lane segments. Transp. Res. Part C Emerg. Technol. 2020, 111, 114–134.
- Anderson, P.; Daganzo, C.F. Effect of transit signal priority on bus service reliability. Transp. Res. Part B Methodol. 2020, 132, 2–14.
- Pallela, S.S.; Mehar, A. Analysis of Time Headway Characteristics at the Curbside Bus Stop on Multi-Lane Divided Urban Arterials under Mixed Traffic Conditions. KSCE J. Civ. Eng. 2024, 28, 3520–3532.
- Thodi, B.T.; Chilukuri, B.R.; Vanajakshi, L. An analytical approach to real-time bus signal priority system for isolated intersections. J. Intell. Transp. Syst. 2022, 26, 145–167.
- DB50/T 1377-2023; Urban Road Traffic Operation Data Fusion Specification. Chongqing Local Standard. Chongqing Administration for Market Regulation: Chongqing, China, 2023.
- GA/T 527.6-2018; Road Traffic Signal Control Modes—Part 6: Control Rules for Priority Passage of Buses at Intersections. Public Security Industry Standard of China. Ministry of Public Security of China: Beijing, China, 2018.
- GB 50647-2011; Code for Planning of Urban Road Intersections. National Standard of China. Ministry of Housing and Urban-Rural Development of China: Beijing, China, 2011.
 | | V1 | V2 |
---|---|---|---|
Cases | | 400 | 400 |
Normal parameter a,b | Average value | 0.183841272 | 0.107313444 |
Normal parameter a,b | Standard deviation | 0.142599558 | 0.117999785 |
Most extreme difference | Absolute | 0.132 | 0.202 |
Most extreme difference | Positive | 0.132 | 0.202 |
Most extreme difference | Negative | −0.107 | −0.182 |
Test statistics | | 0.132 | 0.202 |
Asymptotically significant (two-tailed) | | 0.076 | 0.101 |
No. | Yellow | All Red | Morning/Evening Peak Cycle | Morning/Evening Peak Green | Morning/Evening Peak Red | Daytime Cycle | Daytime Green | Daytime Red | Nighttime Cycle | Nighttime Green | Nighttime Red |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 3 | 2 | 117 | 55 | 58 | 97 | 46 | 47 | 77 | 36 | 37 |
2 | 3 | 2 | 117 | 83 | 30 | 97 | 58 | 35 | 77 | 45 | 28 |
3 | 3 | 2 | 117 | 63 | 50 | 97 | 50 | 43 | 77 | 38 | 35 |
4 | 3 | 2 | 117 | 66 | 47 | 97 | 52 | 41 | 77 | 40 | 33 |
Phases | Type | Intersection 1 | Intersection 2 | Intersection 3 | Intersection 4 |
---|---|---|---|---|---|
Morning Peak and Evening Peak | Social vehicle | 0 | 43 | 10 | 53 |
Morning Peak and Evening Peak | Dedicated bus | 0 | 58 | 14 | 69 |
Daytime | Social vehicle | 0 | 55 | 94 | 58 |
Daytime | Dedicated bus | 0 | 68 | 2 | 72 |
Nighttime | Social vehicle | 0 | 36 | 35 | 35 |
Nighttime | Dedicated bus | 0 | 48 | 40 | 49 |
Category | Symbol | Value | Description |
---|---|---|---|
Optimization | | 0.9 | Discount factor for future rewards |
Optimization | | | Actor network learning rate |
Optimization | | | Critic network learning rate |
Optimization | T | 200 | Total training iterations |
Optimization | B | 32 | Batch size |
Structural | | | Hidden layer configuration |
Structural | | | Experience replay buffer size |
Structural | | 0.95 | PCA variance threshold |
Stabilization | | 1.80 | Initial OU noise variance |
Stabilization | | 0.02 | Minimum OU noise variance |
Stabilization | | | Noise decay coefficient |
Stabilization | | 5.0 | Gradient clipping threshold |
Stabilization | | 0.01 | Target network update coefficient |
Stabilization | | 5% | Q-value fluctuation limit |
Stabilization | | 20 | Stable iterations required |
Parameter Category | Comparison Group | Experimental Group |
---|---|---|
Time Period | 8:00–9:00 | Same as comparison group |
Social Vehicle Flow | 1987.98 veh/h | 3219.33 veh/h |
Bus Flow | Same as Section 6.1 | Same as comparison group |
Road Configuration | Same as Section 6.1 | Same as comparison group |
Actual Saturation Flow Rate | 1463.33 veh/h/lane | Same as comparison group |
Calculated Saturation Ratio | 0.68 (slightly congested) | 1.1 (moderate saturation) |
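The saturation ratios in the table can be checked with a short calculation. The sketch below assumes the ratio is the social vehicle flow divided by the total saturation flow across lanes, and it assumes two lanes; the lane count is not stated in this table and is inferred only because it reproduces the reported values of 0.68 and 1.1.

```python
def saturation_ratio(flow_veh_h, sat_flow_per_lane, lanes):
    """Demand flow divided by the combined saturation flow of all lanes."""
    return flow_veh_h / (sat_flow_per_lane * lanes)


LANES = 2            # assumption: not given in the table, inferred from the results
SAT_FLOW = 1463.33   # veh/h/lane, actual saturation flow rate from the table

comparison = saturation_ratio(1987.98, SAT_FLOW, LANES)
experimental = saturation_ratio(3219.33, SAT_FLOW, LANES)

print(round(comparison, 2), round(experimental, 2))  # prints: 0.68 1.1
```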
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shang, C.; Zhu, F.; Xu, Y.; Zhu, G.; Tong, X. Priority Control of Intelligent Connected Dedicated Bus Corridor Based on Deep Deterministic Policy Gradient. Sensors 2025, 25, 4802. https://doi.org/10.3390/s25154802