A Cooperative Optimization Method for Speed Planning and Energy Management of Fuel Cell Buses at Multi-Signalized Intersections

Guo, Wei; Yi, Fengyan; Zhou, Jiaming; Zhang, Jinming; Wang, Shuo; Gong, Hongtao; Wang, Shuaihua; Huang, Zongjing; Liu, Chunrui

doi:10.3390/wevj17020079


by Wei Guo ¹, Fengyan Yi ¹,*, Jiaming Zhou ², Jinming Zhang ², Shuo Wang ², Hongtao Gong ¹, Shuaihua Wang ¹, Zongjing Huang ² and Chunrui Liu ²

¹ School of Automotive Engineering, Shandong Jiaotong University, Jinan 250357, China

² School of Mechanical and Electrical Engineering, Weifang University of Science and Technology, Weifang 262700, China

* Author to whom correspondence should be addressed.

World Electr. Veh. J. 2026, 17(2), 79; https://doi.org/10.3390/wevj17020079

Submission received: 17 December 2025 / Revised: 30 January 2026 / Accepted: 4 February 2026 / Published: 5 February 2026

(This article belongs to the Section Vehicle Control and Management)


Abstract

Urban bus operations under signalized traffic conditions are characterized by frequent stop-and-start behaviors which significantly degrade fuel economy, especially for fuel cell buses (FCB). In this paper, a collaborative optimization method is proposed that combines speed planning and energy management for FCB in this situation. The method calculates the target speed of FCB using traffic light phase information and the remaining signal time. With an intelligent driving model, the vehicle can adjust its speed in advance when approaching intersections so it can pass through intersections without stopping. At the same time, a learning-based energy management strategy is used to reasonably share power between the fuel cell and the battery. The results indicate that the method proposed in this paper reduces hydrogen consumption by approximately 11.3% compared to the standard method.

Keywords: fuel cell electric vehicle; speed planning; energy management

1. Introduction

Climate change and air pollution mitigation have become global concerns, driving increasing efforts to reduce greenhouse gas and pollutant emissions from the transportation sector through a wide range of emission mitigation technologies [1]. As a clean energy, hydrogen further promotes fuel cell technologies as a promising power source for future transportation systems [2]. Fuel cell electric vehicles (FCEV) have advantages such as zero tailpipe emissions, long driving ranges, and fast refueling [3,4,5]. These features allow FCEV to balance environmental protection and practical use [6,7]. As a result, FCEV are an important option for green transportation development [8]. In recent years, fuel cell buses (FCB), as an important part of urban public transport systems, have been deployed and tested on a large scale in many countries and regions [9,10,11]. At the same time, their operating scenarios have become more varied and route coverage has continued to grow [12]. As urban traffic networks become more complex, FCB operating on roads with many signalized intersections often experience frequent stops and starts. These stops and starts caused by red lights not only reduce travel efficiency but also lead to frequent changes in driving power demand [13]. This reduces the efficiency of the fuel cell (FC) and increases hydrogen consumption. Therefore, achieving smooth intersection passing and efficient energy use in urban traffic light conditions has become a key issue for improving the economic performance of FCB.

To mitigate the energy losses induced by traffic lights, scholars have conducted extensive research on speed planning strategies. Suzuki et al. proposed a driving rule that assisted drivers in maintaining or decelerating their speed by real-time calculations of the minimum distance that can be traveled within the remaining green light duration and the feasible travel distance during the red light phase, thereby avoiding unnecessary stops [14]. Mintsis et al. enhanced driving comfort and traffic safety by introducing an early acceleration mechanism and a minimum speed limitation alongside maintaining speed and decelerating [15]. This approach improved the overall driving experience. Dong et al. applied Pontryagin’s minimum principle to optimize vehicle speed and trajectory, effectively reducing the operational cost [16]. However, these studies mainly focused on traditional vehicles and did not address the complex energy flow dynamics in multi-source power systems.

With the development of new energy vehicles, vehicle power systems are evolving towards a multi-energy coupled architecture. Especially in FCEV, energy distribution, recovery, and conversion among multiple power sources play a crucial role in overall energy consumption. Research on speed planning and energy management strategies (EMS) for traffic light scenarios has become a new research direction. Xu et al. employed model predictive control (MPC) and dynamic programming (DP) to achieve cooperative control of speed planning and energy management [17]. Zhang et al. solved the speed planning and energy management problems of FCEV by using Pontryagin’s minimum principle [18]. Wei et al. used DP to coarsely determine the optimal traversal time windows at each signalized intersection and then refined the vehicle speed profile through gradient-based optimization, while dynamically allocating power between the FC and the battery using a DP-based strategy [19]. However, although these studies have achieved relatively mature results in speed planning for multi-signalized intersections, their EMS remain largely static and lack real-time adaptability, thereby failing to fully exploit the energy-saving potential of multi-energy-source systems.

Existing EMS can generally be classified into three categories: rule-based control strategies [20], optimization-based strategies [21,22], and learning-based strategies [23,24,25]. Rule-based EMS rely heavily on empirical tuning and therefore exhibit limited adaptability, while optimization-based methods are computationally intensive and difficult to implement in real time [26]. By contrast, learning-based strategies provide both flexibility and real-time performance, and they are gradually becoming the main solution for energy management. Jia et al. proposed an EMS based on an improved Deep Deterministic Policy Gradient algorithm, which improved FC efficiency and extended its service life [27]. Li et al. developed an EMS based on TD3 to reduce hydrogen consumption while extending the life of the power battery [28]. Wu et al. proposed an EMS based on the Soft Actor–Critic algorithm, which improved the economic performance of vehicle operation [29]. Together, these studies showed that reinforcement learning methods are effective and adaptable for vehicle energy management, and that they have clear advantages in exploiting the energy-saving potential of complex driving conditions. However, even though advanced EMS can reduce energy use and extend vehicle life, overall operating cost reduction does not depend only on powertrain control. It is also closely related to vehicle speed. Proper speed planning directly affects driving power demand and energy consumption, and it further shapes acceleration and deceleration behavior in complex traffic. As a result, more economical and smoother driving conditions can be achieved.

As shown in Table 1, although learning-based EMS have attracted increasing attention in the research of FCEV, most existing studies conducted training and validation under predefined or simplified speed profiles. The scenarios investigated have been mainly limited to standard driving cycles or randomly generated speed sequences. In signal intersection traffic environments, vehicle speed is strongly affected by the signal phase state and remaining time, resulting in a significant dynamic characteristic in power demand. These approaches generally neglect the strong coupling between vehicle speed and traffic signal information in real urban environments. Existing learning-based EMS methods, which operate independently of traffic signal awareness or upper-layer speed planning, are therefore unable to fully exploit the energy-saving potential induced by coordinated vehicle motion. As illustrated in Figure 1, the proposed collaborative optimization framework adopts a hierarchical yet coordinated architecture that integrates signal-aware speed planning with learning-based energy management. At the upper layer, the speed planning strategy exploits V2X-enabled traffic signal information, including the signal phase and remaining time, together with intersection location data. Based on this information, a green-wave-oriented speed guidance strategy can be developed using an improved IDM, which generates an optimal and smoothly varying reference velocity trajectory. At the lower layer, learning-based EMS can be trained to allocate fuel cell power in real time, with the objective of minimizing hydrogen consumption while maintaining battery SOC within a desired operating range. The main contributions of this paper are summarized as follows:

(1): A speed planning method for multi-signal intersection scenarios is proposed, which integrates traffic signal phase status and remaining time information to achieve rapid passage and reduce vehicle power demand.
(2): A hierarchical framework integrating speed planning and EMS for FCB is developed, enabling the EMS to operate under more stable power input conditions and thereby reducing the hydrogen consumption of the FC.
(3): For the proposed learning-based EMS, an offline training and online testing scheme is adopted in which the policy performance is evaluated under unseen signalized traffic scenarios to assess the generalization capability of the proposed approach.

The remaining sections of this paper are structured as follows: Section 2 focuses on the operation scenario of a multi-signal urban intersection and performs system modeling of FCB to construct vehicle transmission system and power system models, formally describing the problem and clarifying the control objectives and constraints of the proposed method; Section 3 presents the specific implementation methods for vehicle speed planning strategies and EMS, and outlines the design process and implementation details of each module under the hierarchical control framework; Section 4 verifies and evaluates the performance of the proposed method, analyzing it from the aspects of speed planning effect and energy management effect, as well as comparing the energy consumption performance and system operation characteristics of different strategies in multi-signal traffic scenarios; and Section 5 summarizes the research content of the entire paper and its main conclusions, and looks forward to future research directions.

2. System Modeling

2.1. Multi-Signal Intersection Scene Description

In real-world traffic scenarios, intelligent and connected vehicles can access various types of traffic information, including traffic signal phase states and positional information. Table 2 summarizes all the information available to the FCEV considered in this study.

The traffic scenario constructed in this study consisted of multiple consecutively controlled signalized intersections. The signalized intersections were unevenly spaced along the route, as illustrated in Figure 2. The FCB were assumed to travel in the rightmost bus lane with no interfering vehicles present. This assumption was adopted to isolate the fundamental effects of traffic signal guidance on speed planning and energy management. In real-world mixed traffic, interactions with surrounding vehicles and headway constraints may affect the planned speed trajectory. Nevertheless, such factors can be incorporated into the proposed framework by extending the speed planning layer to account for vehicle interactions and safety constraints, while the EMS can adapt to the resulting power demand variations.

2.2. Modeling of Transmission and Power Systems for FCB

This study aimed to develop a model for FCB that provides an accurate and computationally tractable system foundation for subsequent control strategy design and optimization algorithms. The overall vehicle model included vehicle longitudinal dynamics, driving power calculation, modeling of the FC and the power battery, as well as methods for evaluating overall vehicle energy consumption.

The research object considered in this study was an FCB, whose energy flow is illustrated in Figure 3. The corresponding parameters are listed in Table 3 and were selected based on manufacturer specifications and commonly adopted values reported in the literature [30].

2.2.1. Transmission System Modeling

To comprehensively describe the energy flow path from road operating conditions to energy supply, the longitudinal driving power demand of the vehicle was first analyzed. At any time instant t, the required traction force of the vehicle should satisfy the dynamic equilibrium with the various resistance components, namely rolling resistance, aerodynamic drag, and acceleration resistance. The total required traction force can therefore be expressed as Equation (1):

F_{t} = C_{r} m g + \frac{1}{2} ρ_{a} C_{d} A v_{t}^{2} + δ_{r} m a_{t}

(1)

where C_{r} denotes the rolling resistance coefficient, δ_{r} is the mass conversion factor, m represents the vehicle mass, g is the gravitational acceleration, ρ_{a} denotes the air density, C_{d} is the aerodynamic drag coefficient, A represents the frontal area of the vehicle, v_{t} is the vehicle speed at time t, and a_{t} is the corresponding vehicle acceleration.

The required traction force can be converted into the motor-required wheel-end torque through the driveline according to Equation (2):

T_{M} = \frac{F_{t}}{i_{g}} \cdot R_{wheel}

(2)

where R_{wheel} denotes the wheel radius and i_{g} represents the gear ratio of the driveline.

Subsequently, by combining this with Equation (3), the electric motor power demand can be obtained as follows:

P_{M} = ω_{M} \cdot T_{M} \cdot (η_{M})^{-κ}

(3)

where ω_{M} denotes the motor rotational speed, η_{M} represents the motor efficiency, and κ is introduced to characterize the directionality of the efficiency model, accounting for the different efficiency effects under driving and regenerative braking conditions. The parameters were derived from experimentally reported data in the literature, as illustrated in Figure 4.

Based on the above formulations, the vehicle driving power demand at each time instant can be computed from the known speed profile, and the corresponding instantaneous motor power can be further derived. This provided a quantitative basis for power allocation and optimization between the FC and the power battery in the subsequent EMS.
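As a rough illustration of how Equations (1)–(3) chain together, the following Python sketch computes the motor power from a speed and acceleration sample. All numerical parameter values are hypothetical placeholders (Table 3 is not reproduced here), the constant motor efficiency stands in for the efficiency map of Figure 4, and the relation ω_M = v · i_g / R_wheel (no wheel slip) is an assumption that the text does not state explicitly.

```python
def traction_force(v, a, m=13500.0, c_r=0.0085, rho_a=1.2, c_d=0.55,
                   area=7.5, delta_r=1.1, g=9.81):
    """Equation (1): rolling + aerodynamic + acceleration resistance in N.

    All default values are illustrative assumptions, not the paper's data.
    """
    return c_r * m * g + 0.5 * rho_a * c_d * area * v ** 2 + delta_r * m * a


def motor_torque(f_t, r_wheel=0.47, i_g=6.2):
    """Equation (2): torque required at the motor in N*m."""
    return f_t / i_g * r_wheel


def motor_power(v, a, eta_m=0.9, r_wheel=0.47, i_g=6.2, **vehicle):
    """Equation (3): motor power in W with a direction-dependent efficiency.

    kappa = +1 while driving (losses raise the demand) and -1 while braking
    (losses shrink the recovered power). A constant eta_m is an assumption.
    """
    f_t = traction_force(v, a, **vehicle)
    t_m = motor_torque(f_t, r_wheel, i_g)
    omega_m = v / r_wheel * i_g      # rad/s, assumes no wheel slip
    mechanical = omega_m * t_m       # equals f_t * v
    kappa = 1.0 if mechanical >= 0 else -1.0
    return mechanical * eta_m ** (-kappa)


if __name__ == "__main__":
    print(f"cruise at 10 m/s: {motor_power(10.0, 0.0) / 1e3:.1f} kW")
    print(f"braking at -1 m/s^2: {motor_power(10.0, -1.0) / 1e3:.1f} kW")
```

With this sign convention, positive values correspond to traction and negative values to regenerative braking, which matches the behavior described for the driving cycle verification.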

2.2.2. Power System Modeling

In this study, a quasi-static modeling approach [31] was adopted for the fuel cell and battery subsystems. This choice was motivated by the fact that the proposed EMS operate at a supervisory level, where power demand variations occur at a time scale of seconds rather than the time scale of fast electrochemical or thermal dynamics. Therefore, transient fuel cell dynamics, air compressor behavior, temperature effects, and battery aging were not explicitly modeled. These effects would typically evolve over longer time scales and have a limited influence on instantaneous power allocation decisions under the urban driving conditions considered in this work.

In the FCB considered in this study, the driving electric motor is primarily powered by two energy sources, namely the FC and the power battery system. Accordingly, the power demand of the electric motor can satisfy the following energy balance equations, as expressed in Equation (4):

P_{M} = η_{dcac} \cdot (P_{Batt} + η_{dcdc} P_{FC})

(4)

where P_{FC} denotes the FC power, P_{Batt} represents the battery power, η_{dcac} denotes the DC/AC conversion efficiency, and η_{dcdc} denotes the DC/DC conversion efficiency.

In this study, a quasi-static model was employed to establish the mapping relationship between the output power of the FC and hydrogen consumption [31], as illustrated in Figure 5a. Accordingly, the hydrogen consumption C_{h} can be calculated using the following expression:

\frac{d C_{h}}{d t} = \frac{P_{FC}}{η_{FC} C_{H}}

(5)

where η_{FC} denotes the efficiency of the FC and C_{H} represents the chemical energy density of hydrogen.
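The power balance of Equation (4) and the hydrogen rate of Equation (5) can be sketched in Python as follows. The converter efficiencies, the constant FC efficiency (used here in place of the map in Figure 5a), and the value of 120 MJ/kg for the energy density of hydrogen (approximately its lower heating value) are assumptions made only for illustration.

```python
def battery_power(p_m, p_fc, eta_dcac=0.95, eta_dcdc=0.95):
    """Solve Equation (4) for the battery power in W.

    P_M = eta_dcac * (P_Batt + eta_dcdc * P_FC)
    Both efficiencies are hypothetical constants.
    """
    return p_m / eta_dcac - eta_dcdc * p_fc


def hydrogen_rate(p_fc, eta_fc=0.5, c_h=120e6):
    """Equation (5): hydrogen mass flow in kg/s for an FC power in W.

    eta_fc is held constant here; c_h is an assumed energy density in J/kg.
    """
    if p_fc <= 0.0:
        return 0.0  # the FC cannot absorb power, so no hydrogen is used
    return p_fc / (eta_fc * c_h)


if __name__ == "__main__":
    p_batt = battery_power(p_m=95e3, p_fc=50e3)
    print(f"battery supplies {p_batt / 1e3:.1f} kW")
    print(f"hydrogen flow {hydrogen_rate(60e3) * 1e3:.2f} g/s")
```

A negative return value from `battery_power` means the battery is being charged, either by surplus FC power or by regenerative braking.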

The power battery system was modeled using an internal-resistance model, and the terminal voltage V_{ter} and battery current I_{Batt} can be calculated according to Equations (6) and (7):

P_{Batt} = I_{Batt} \cdot V_{ter}

(6)

V_{ter} = V_{oc} - I_{Batt} \cdot R_{int}

(7)

where V_{oc} denotes the open-circuit voltage of the traction battery and R_{int} represents the internal resistance.

The change in the battery SOC during operation can be calculated according to Equation (8):

\frac{d SOC}{d t} = - \frac{I_{Batt}}{3600 C_{n}}

(8)

where SOC and C_{n} denote the state of charge and the nominal capacity of the traction battery, respectively. The corresponding data are shown in Figure 5b.
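Substituting Equation (7) into Equation (6) gives a quadratic in the current, R_int · I² − V_oc · I + P_Batt = 0, whose smaller root is the physically meaningful one. The Python sketch below uses that root and a forward-Euler step of Equation (8). The open-circuit voltage, internal resistance, and capacity are hypothetical constants (in the paper they depend on SOC, as in Figure 5b), and C_n is assumed to be expressed in Ah.

```python
import math


def battery_current(p_batt, v_oc=600.0, r_int=0.1):
    """Current in A from Equations (6) and (7); positive means discharge.

    v_oc and r_int are illustrative constants, not the paper's data.
    """
    disc = v_oc ** 2 - 4.0 * r_int * p_batt
    if disc < 0.0:
        raise ValueError("requested power exceeds the battery capability")
    # Smaller root of r_int * I^2 - v_oc * I + p_batt = 0.
    return (v_oc - math.sqrt(disc)) / (2.0 * r_int)


def soc_step(soc, p_batt, dt=1.0, c_n=100.0, v_oc=600.0, r_int=0.1):
    """One forward-Euler step of Equation (8); c_n is assumed to be in Ah."""
    current = battery_current(p_batt, v_oc, r_int)
    return soc - current * dt / (3600.0 * c_n)


if __name__ == "__main__":
    soc = 0.6
    for _ in range(10):              # ten seconds at 50 kW discharge
        soc = soc_step(soc, 50e3)
    print(f"SOC after 10 s: {soc:.5f}")
```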

The search boundaries of the model parameters are listed in Table 4.

2.2.3. Driving Cycle Verification

To verify the convergence and correctness of the transmission system modeling, a validation experiment was conducted under the standard CHTC-B driving cycle. Figure 6 illustrates the relationship between vehicle speed and motor power over time, where the motor power responded smoothly and consistently to speed variations during both traction and regenerative braking phases. Positive motor power corresponded to vehicle acceleration, while negative power appeared during deceleration, indicating effective regenerative braking behavior. Throughout the entire driving cycle, no numerical divergence, non-physical oscillations, or discontinuities were observed, which confirmed the stability and correctness of the transmission system modeling.

3. Vehicle Speed Planning and Energy Management Strategy Design

3.1. Vehicle Speed Planning Strategy Design

In multi-signal traffic scenarios, vehicles need to control their speed based on the upcoming traffic signal phase, the remaining signal time, and the distance to the stop line, with the goal of reducing unnecessary stops and improving traffic efficiency. To this end, a green-wave-based speed planning model was developed in this study. The model first determined a target vehicle speed according to the signal timing information and then embedded this obtained target speed into the intelligent driver model (IDM).

The core idea of the target speed determination rule is as follows: when the traffic signal is red, the vehicle is not allowed to pass the intersection; therefore, an average approaching speed is computed based on the remaining red-light duration, enabling the vehicle to gradually approach the stop line. When the signal is yellow, the handling method is the same as for a red light. When the signal is green, if the vehicle traveling at its current speed can reach the stop line within the green-light window, i.e., t_{to} \leq τ, the cruising speed v_{0} is maintained; otherwise, a speed is calculated based on the remaining green-light time. The corresponding formulation is given in Equation (9):

\hat{v}(d, τ, v, ϕ) = \begin{cases} \min(V_{max}, \max(0, \frac{d}{\max(τ, ε_{T})})), & ϕ = 3, \\ \min(V_{max}, \max(0, \frac{d}{\max(τ, ε_{T})})), & ϕ = 2, \\ v_{0}, & ϕ = 1,\ t_{to} \leq τ, \\ \min(V_{max}, \max(V_{min}, \frac{d}{\max(τ, ε_{T})})), & ϕ = 1,\ t_{to} > τ. \end{cases}

(9)

where \hat{v}(d, τ, v, ϕ) denotes the target vehicle speed at the current time instant; d represents the distance from the vehicle’s current position to the stop line of the nearest upcoming signalized intersection that has not yet been passed; τ denotes the remaining time of the current signal phase; ϕ \in \{1, 2, 3\} corresponds to the green, yellow, and red signal phases, respectively; v_{0} is the current vehicle speed; and t_{to} represents the time required for the vehicle to reach the stop line when traveling at the current speed. V_{max} indicates the maximum speed limit on the road. V_{min} is the lower speed bound applied during the green phase, which prevents an excessively low planned speed when the vehicle is already close to the intersection.
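The rule in Equation (9) can be sketched as a small Python function. The values chosen for V_max, V_min, and ε_T are hypothetical, and ε_T is interpreted here as a small positive time that keeps the division well defined when the remaining phase time approaches zero; the text does not define it explicitly.

```python
GREEN, YELLOW, RED = 1, 2, 3   # phase encoding used in Equation (9)


def target_speed(d, tau, v, phase, v_max=16.7, v_min=2.0, eps_t=0.1):
    """Target speed in m/s following the case structure of Equation (9).

    d     distance to the next stop line (m)
    tau   remaining time of the current phase (s)
    v     current vehicle speed (m/s)
    v_max, v_min, eps_t are illustrative assumptions.
    """
    pace = d / max(tau, eps_t)          # speed that covers d in the time left
    if phase in (RED, YELLOW):
        return min(v_max, max(0.0, pace))
    # Green phase: time to reach the stop line at the current speed.
    t_to = d / v if v > 0.0 else float("inf")
    if t_to <= tau:
        return v                        # keep cruising, the window is reachable
    return min(v_max, max(v_min, pace))


if __name__ == "__main__":
    print(target_speed(d=200.0, tau=20.0, v=10.0, phase=RED))
    print(target_speed(d=100.0, tau=20.0, v=10.0, phase=GREEN))
    print(target_speed(d=300.0, tau=20.0, v=10.0, phase=GREEN))
```

In the red case the returned speed makes the vehicle arrive at the stop line roughly when the red phase ends, which is the mechanism that avoids a full stop.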

The formulation above provides a target speed determined from the traffic signal timing, the distance to the stop line, and the remaining phase duration. However, this target speed cannot be applied directly to longitudinal vehicle control, because vehicle dynamics are subject to inertia and acceleration continuity constraints: speed variations must satisfy ride comfort requirements, and abrupt braking that may compromise safety must be avoided. Therefore, to enable the FCB to operate both feasibly and smoothly, this paper adopted the method proposed in reference [32]. A virtual vehicle was assumed to travel in front of the FCB, with its speed generated by Equation (9) and executed exactly. The FCB then followed this virtual vehicle using the IDM. The IDM generated smooth, continuous, and physically realistic acceleration commands based on the target speed and surrounding conditions, thereby enabling a natural transition from target speed to actual vehicle control inputs and ensuring the safety, feasibility, and comfort of the overall control strategy. Under IDM-based control, the acceleration of the FCB can be given as Equation (10):

a_{FCB} = a_{0} [1 - (\frac{v(t)}{\hat{v}(d, τ, v, ϕ)})^{4} - (\frac{s^{*}(v(t), Δv)}{s})^{2}]

(10)

where v(t) represents the current vehicle speed, \hat{v}(d, τ, v, ϕ) is the target speed, s^{*}(v(t), Δv) denotes the minimum expected spacing, and s is the actual distance between the FCB and the preceding virtual vehicle.

The minimum expected spacing is given by Equation (11):

s^{*}(v(t), Δv) = s_{0} + v(t) T + \frac{v(t) Δv}{2 \sqrt{a_{0} b_{0}}}

(11)

where s_{0} represents the minimum safe distance when stationary, T is the safe time headway, Δv represents the difference between the FCB speed and the target vehicle speed, and a_{0} and b_{0} represent the comfortable acceleration and deceleration, respectively.

Regarding the selection of IDM parameters for the FCB, based on reference [31] and data from the CHTC-B driving cycle, the maximum comfortable acceleration and deceleration were set to 1.25 m/s² and 1.1 m/s², respectively. Furthermore, s_{0} = 2.5 m was chosen as a reasonable stationary safety margin for heavy vehicles, consistent with common microscopic traffic simulation practices. The desired time headway was set to T = 1.0 s to maintain a sufficient response to green window guidance.
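A minimal Python sketch of Equations (10) and (11) with the parameter values quoted above is given below. The small floor on the target speed is an added guard against division by zero when the planned speed is zero, not part of the paper's formulation, and SI units (m, s, m/s²) are assumed throughout.

```python
import math


def desired_gap(v, dv, s0=2.5, headway=1.0, a0=1.25, b0=1.1):
    """Equation (11): minimum expected spacing in m.

    dv is the FCB speed minus the speed of the virtual leading vehicle.
    """
    return s0 + v * headway + v * dv / (2.0 * math.sqrt(a0 * b0))


def idm_acceleration(v, v_target, s, dv, s0=2.5, headway=1.0,
                     a0=1.25, b0=1.1):
    """Equation (10): IDM acceleration of the FCB in m/s^2.

    s must be positive. The floor on v_target is a numerical guard that is
    assumed here and does not appear in the paper.
    """
    v_target = max(v_target, 1e-3)
    s_star = desired_gap(v, dv, s0, headway, a0, b0)
    return a0 * (1.0 - (v / v_target) ** 4 - (s_star / s) ** 2)


if __name__ == "__main__":
    # Standing start with a distant virtual leader: close to a0.
    print(round(idm_acceleration(v=0.0, v_target=10.0, s=100.0, dv=0.0), 3))
    # Closing in on a slower virtual leader: braking.
    print(round(idm_acceleration(v=10.0, v_target=10.0, s=15.0, dv=3.0), 3))
```

The fourth-power term limits acceleration as the vehicle approaches the target speed, while the spacing term produces braking when the gap to the virtual vehicle shrinks.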

3.2. EMS Design

To address the energy management problem of FCB, a reinforcement learning-based intelligent control framework was developed, with the core objective of optimizing the power allocation between the FC and the battery while satisfying vehicle power demand. The proposed framework aimed to reduce hydrogen consumption and maintain the battery SOC within a target range. To achieve this goal, the agent observed the system state based on the current vehicle driving conditions and output control commands for FC power, affecting the overall energy flow of the vehicle.

In the design of the state space, four variables that were closely related to energy management were chosen as inputs. These were the current vehicle speed, vehicle acceleration, battery state of charge, and the current FC output power. The state vector can be defined as follows:

S = [v_{t}, a_{t}, SOC_{t}, P_{t}^{fc}]

(12)

where v_{t} denotes the vehicle speed, a_{t} represents the vehicle acceleration, SOC_{t} denotes the battery state of charge, and P_{t}^{fc} represents the output power of the FC.

The action space was defined as a one-dimensional continuous variable that controlled the increment of the FC output power. The action can be defined as follows:

A = \{ΔP_{fc} \mid ΔP_{fc} \in [-5 kW, 5 kW]\}

(13)
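One way such an incremental action could be applied is sketched below in Python. The normalized policy output in [-1, 1] (typical for a tanh-squashed actor), the maximum FC power of 80 kW, and the helper name `apply_action` are all assumptions made for illustration; only the ±5 kW increment bound comes from Equation (13).

```python
def apply_action(p_fc_kw, raw_action, dp_max_kw=5.0, p_fc_max_kw=80.0):
    """Turn a normalized policy output into the next FC power in kW.

    raw_action   policy output, assumed to lie in [-1, 1]
    dp_max_kw    increment bound from Equation (13)
    p_fc_max_kw  hypothetical rated FC power
    """
    raw_action = max(-1.0, min(1.0, raw_action))   # enforce the action bound
    delta_kw = raw_action * dp_max_kw              # increment in [-5, 5] kW
    return max(0.0, min(p_fc_max_kw, p_fc_kw + delta_kw))


if __name__ == "__main__":
    print(apply_action(30.0, 0.5))    # moderate increase
    print(apply_action(78.0, 1.0))    # saturates at the assumed rated power
```

Controlling the increment rather than the absolute power bounds the FC power ramp rate between consecutive control steps.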

The reward function was designed to minimize hydrogen consumption while maintaining battery SOC stability. Inspired by reference [31], the reward function was set as follows:

R = - (r_{H_{2}} + r_{SOC})

(14)

where r_{H_{2}} denotes the hydrogen consumption cost at the current time step and r_{SOC} represents the cost associated with SOC depletion.

The hydrogen consumption cost r_{H_{2}} can be defined as follows:

r_{H_{2}} = α \times C_{H_{2}}

(15)

where C_{H_{2}} denotes the hydrogen consumption, defined as the instantaneous hydrogen mass flow rate (kg/s) of the fuel cell system, and α represents the weighting factor. Note that C_{H_{2}} is an equivalent consumption: it includes not only the hydrogen consumed directly by the fuel cell but also the hydrogen that the fuel cell would need to generate the same amount of electricity when the SOC decreases. In reference [31], α was set as the price of hydrogen.

The SOC depletion cost r_{SOC} can be defined as follows:

r_{SOC} = β \times (SOC(t) - SOC_{0})^{2}

(16)

where SOC(t) denotes the battery state of charge at time t, SOC_{0} represents the initial state of charge, which was set to 0.6 in this study, and β is the weighting factor.
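Equations (14)–(16) combine into a single scalar reward, sketched below in Python. The weights α and β are hypothetical because their values are not reported in this section; only the reference SOC of 0.6 is taken from the text.

```python
def reward(h2_rate, soc, alpha=1.0, beta=1.0, soc_ref=0.6):
    """Reward of Equation (14) built from Equations (15) and (16).

    h2_rate  equivalent hydrogen mass flow at this step (kg/s)
    soc      current battery state of charge (0..1)
    alpha, beta are placeholder weights, not the paper's tuned values.
    """
    r_h2 = alpha * h2_rate                 # Equation (15)
    r_soc = beta * (soc - soc_ref) ** 2    # Equation (16)
    return -(r_h2 + r_soc)


if __name__ == "__main__":
    print(reward(h2_rate=0.001, soc=0.58, alpha=10.0, beta=100.0))
```

Because the SOC term is quadratic, small excursions around the reference are cheap while large deviations are penalized strongly, which steers the agent toward charge-sustaining behavior.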

The proposed strategy, including both the speed planning strategy and the EMS, was implemented in Python 3.11 using the PyTorch framework. All experiments were conducted on a computer equipped with an AMD Ryzen 9 7845HX CPU and an NVIDIA GeForce RTX 4060 GPU (8 GB). The GPU was only required during the offline training stage, while online deployment relied on inference-only execution of lightweight neural networks, which can be efficiently supported by standard onboard computing units. The speed planning strategy assumed that traffic signal phase and timing information were obtained via vehicle-to-infrastructure (V2I) communication with negligible latency, which was consistent with current intelligent transportation systems. In the present study, the offline training of the SAC-based EMS controller required several hours on the specified hardware platform, while the online inference time was on the order of milliseconds, which was negligible compared with the control sampling interval.

4. Results and Discussion

To ensure the generalization capability and robustness of the proposed method, a strict training–testing separation strategy was adopted. During the training phase, diverse multi-signalized traffic scenarios were constructed by randomly generating different combinations of traffic signal phases, enabling the EMS to learn adaptive responses to variations in power demand. Subsequently, to evaluate the model performance under non-training conditions, a set of three signalized intersection test scenarios that were not included in the training process was selected. The signal phase patterns and temporal structures of these test scenarios differed from those of the training set, thereby generating new vehicle speed profiles. On this basis, the standard IDM speed planning strategy and the integrated green wave speed planning intelligent driver model (IDM-G) were evaluated together with the corresponding Soft Actor–Critic (SAC) EMS to assess the power demand response and overall vehicle energy consumption under unseen operating conditions.

4.1. Speed Planning Results

To evaluate the effectiveness of the proposed green-wave-guided strategy, the standard IDM was adopted as a baseline for comparison. The standard IDM did not incorporate green wave speed planning and its target speed was simply set to the maximum allowable vehicle speed. Figure 7 presents the complete driving trajectories of the ego vehicle under the two strategies in the signalized traffic scenario, where the background colors indicate traffic signal phase changes. It can be observed that the IDM-G strategy proactively adjusted the approaching speed to synchronize vehicle motion with the green-light window. In contrast, due to the lack of advance signal awareness, the standard IDM exhibited passive hard braking when approaching intersections, resulting in pronounced stop-and-go behavior.

Figure 8 shows the energy recovered by braking and consumed by driving for the two strategies. The cumulative drive energy of the IDM was 2.77 kWh, with 0.56 kWh of regenerative energy recovered, whereas the drive energy of the IDM-G was reduced to 2.67 kWh, with regenerative energy recovery slightly increasing to 0.57 kWh. This indicated that the IDM-G reduced drive energy consumption by about 3.6% by decreasing frequent strong acceleration demands while maintaining a similar level of regenerative energy. Ultimately, the total energy consumption of the IDM was 2.21 kWh, while that of the IDM-G was reduced to 2.10 kWh, a decrease of about 5%.
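The totals quoted above follow from subtracting the recovered energy from the drive energy. The short Python check below reproduces them from the reported figures; the function and variable names are illustrative only.

```python
def net_energy(drive_kwh, regen_kwh):
    """Net traction energy: energy spent driving minus energy recovered."""
    return drive_kwh - regen_kwh


idm_net = net_energy(2.77, 0.56)     # standard IDM
idmg_net = net_energy(2.67, 0.57)    # IDM-G

drive_saving_pct = 100.0 * (2.77 - 2.67) / 2.77
net_saving_pct = 100.0 * (idm_net - idmg_net) / idm_net

print(f"IDM net: {idm_net:.2f} kWh, IDM-G net: {idmg_net:.2f} kWh")
print(f"drive energy saving: {drive_saving_pct:.1f} %")
print(f"net energy saving: {net_saving_pct:.1f} %")
```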

Table 5 presents a quantitative comparison of speed planning performance between the standard IDM and the proposed IDM-G strategy. Both methods resulted in the same travel time, which ensures a fair comparison. However, IDM-G eliminated intermediate stops completely, whereas the standard IDM exhibited three full stops along the corridor. In terms of ride comfort, IDM-G reduced the RMS acceleration from 0.530 m/s² to 0.443 m/s², indicating smoother longitudinal speed variations. Although the RMS jerk under IDM-G was slightly higher, this can be attributed to a small number of proactive speed adjustments near signalized intersections, which enabled stop-free traversal while maintaining overall trajectory smoothness and efficiency.

4.2. EMS Results

To prevent overfitting during offline training, a strict training–testing separation strategy was adopted: the policy was trained on vehicle speed profiles from randomly generated signal scenarios and evaluated on speed profiles from unseen scenarios. Figure 9 shows the evolution of the average reward of the agent over 150 training episodes. The average reward fluctuated considerably in the early training stage, with particularly pronounced oscillations between episodes 20 and 40. This happened because the agent was still in the exploration stage: action choices were more random and the balance between system behavior and energy limits had not yet been learned, so the rewards varied widely. As training continued, the agent gradually gained experience and learned better ways to share power between the FC and the battery. After about 50 episodes, the average reward rose clearly and then stabilized, which showed that the agent had learned to allocate FC and battery power in a reasonable way.

This paper used the DP strategy and the rule-based strategy as references to evaluate the SAC-based EMS, and the results are shown in Figure 10 and Figure 11. In Figure 10, the left panel presents the comparison of FC power profiles between SAC and DP under the standard IDM, while the right panel shows the corresponding results under the IDM-G. It can be observed that, although SAC is a real-time control strategy that does not rely on future driving information, its power trajectories in both scenarios closely approximated the optimal power allocation obtained by DP, demonstrating strong learning capability and policy generalization. Under the speed curve generated by the standard IDM, more aggressive speed changes led to frequent peak-to-trough shifts in power demand. Therefore, the SAC strategy exhibited slightly greater local variability in transient power regulation than the smooth optimal trajectory generated by DP. In contrast, the speed change was significantly smoother under the speed curve based on IDM-G, and the SAC output was closer to the steady-state regulation characteristics of DP, indicating that SAC was closer to the optimal control behavior. By comparison, the rule-based EMS lacked the ability to finely adjust power output in response to dynamic operating conditions. This contrast further highlighted the advantage of the learning-based SAC strategy in capturing complex power-demand patterns and achieving a closer approximation to the optimal DP solution.

Figure 11 shows the SOC curve and the total hydrogen use for different strategies. Under both speed profiles, the SAC-based strategy produced SOC changes and energy use trends similar to those of DP. However, differences in speed behavior led to different power demand patterns, and this clearly affected the performance of both strategies. Under the speed profile from the standard IDM, the FCB decelerated and accelerated strongly when approaching intersections. As a result, power demand changed sharply. Because of this, the SOC curve under SAC control showed more small fluctuations, and the total hydrogen use was slightly higher than that of the IDM–DP reference. In contrast, the speed profile from IDM-G led to smoother speed changes and gentler acceleration and deceleration, which created a more continuous power demand pattern. With these features, the SOC curve showed a smoother and more stable downward trend during the whole trip, and the total hydrogen consumption was lower. For the rule-based EMS, the SOC trajectory exhibited larger deviations and less consistent regulation under both speed profiles, reflecting a limited ability to adapt to varying power demand conditions. Under the standard IDM speed profile, frequent stop-and-go behavior led to pronounced SOC fluctuations and higher cumulative hydrogen consumption. Although the smoother IDM-G speed profile alleviated these effects to some extent, the rule-based strategy still resulted in greater hydrogen usage, highlighting its inherent limitations in coordinated energy management.

Table 6 summarizes the hydrogen consumption results of different EMS under the standard IDM and the proposed IDM-G speed profiles, with DP serving as the optimal benchmark. Under the standard IDM driving conditions, strong deceleration and re-acceleration led to a hydrogen use of 0.1594 kg for the IDM–SAC strategy, which was about 12.02% higher than the DP result of 0.1423 kg. In contrast, under the more stable IDM-G speed curve, the SAC strategy reduced hydrogen use to 0.1414 kg, about 11.3% lower than the IDM–SAC result, and the DP/SAC ratio rose from 89.3% to 95.5%. These results showed that when the power demand pattern was more stable, the SAC-based EMS could greatly improve economic performance.
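The percentages quoted above follow directly from the Table 6 values. The short script below is only an arithmetic check; the dictionary layout and the helper name `percent_change` are illustrative and not part of the study.

```python
# Hydrogen consumption in kg, copied from Table 6.
H2_KG = {
    "IDM": {"rule": 0.1914, "dp": 0.1423, "sac": 0.1594},
    "IDM-G": {"rule": 0.1852, "dp": 0.1350, "sac": 0.1414},
}


def percent_change(new, reference):
    """Signed change of `new` relative to `reference`, in percent."""
    return 100.0 * (new - reference) / reference


# SAC versus the DP benchmark under the standard IDM (about +12.02%).
sac_vs_dp_idm = percent_change(H2_KG["IDM"]["sac"], H2_KG["IDM"]["dp"])

# IDM-G-SAC versus IDM-SAC (about -11.3%).
idmg_sac_vs_idm_sac = percent_change(H2_KG["IDM-G"]["sac"], H2_KG["IDM"]["sac"])

# DP/SAC ratio per speed profile (about 89.3% and 95.5%).
dp_over_sac = {
    profile: 100.0 * row["dp"] / row["sac"] for profile, row in H2_KG.items()
}

print(f"SAC vs DP under IDM: {sac_vs_dp_idm:+.2f}%")
print(f"IDM-G-SAC vs IDM-SAC: {idmg_sac_vs_idm_sac:+.2f}%")
for profile, ratio in dp_over_sac.items():
    print(f"DP/SAC under {profile}: {ratio:.1f}%")
```

The same table also implies that SAC sits roughly 4.7% above DP under IDM-G, compared with about 12% under the standard IDM, which is the gap the DP/SAC column expresses as a ratio.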

To further evaluate the generalization performance of the proposed strategy, we conducted additional tests on four heterogeneous and previously unseen traffic scenarios. Table 7 summarizes the hydrogen consumption results for the baseline IDM–SAC strategy and the proposed IDM-G–SAC strategy. As shown in the table, the proposed IDM-G–SAC strategy consistently reduced hydrogen consumption in all test scenarios. On average, hydrogen consumption decreased from 0.1433 kg for IDM–SAC to 0.1326 kg for IDM-G–SAC, representing an average reduction of approximately 7.5%.
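As a quick check, the reported average reduction is consistent with the ratio of the two mean values in Table 7. Whether the authors averaged per-scenario reductions or compared the means is not stated here, so the second reading is an assumption.

```latex
\[
\frac{0.1433 - 0.1326}{0.1433} \times 100\%
  = \frac{0.0107}{0.1433} \times 100\%
  \approx 7.5\%
\]
```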

From a theoretical perspective, green-wave-guided speed planning altered the characteristics of the power demand profile by suppressing high-frequency and high-magnitude motor power fluctuations. As a result, the energy management problem became less aggressive and more structured, allowing the SAC-based EMS to achieve a solution closer to the optimal power allocation.

5. Conclusions

This study examined how green wave guidance affected the economy of FCEVs on urban roads with many traffic lights by considering both speed planning and energy management. First, an IDM-G was proposed. It used traffic light phase and remaining time information to calculate the target speed of the FCB and combined this with the IDM to meet vehicle dynamic limits. Second, an SAC-based EMS was introduced to share power between the FC and the battery in a reasonable way.
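The idea of deriving a target speed from signal phase and remaining time, and then tracking it with the IDM, can be sketched as follows. This is a simplified illustration and not the formulation used in this study: the class `V2IMessage`, the function names, the known red duration `red_s`, the treatment of yellow as red, and all parameter values are assumptions. The acceleration law is the commonly used form of the standard IDM.

```python
import math
from dataclasses import dataclass


@dataclass
class V2IMessage:
    """Fields mirror Table 2; the class itself is hypothetical."""
    phase: str          # "green" or "red" (yellow treated as red: an assumption)
    remaining_s: float  # time left in the current phase
    distance_m: float   # distance to the next stop line


def target_speed(msg, v_max, v_min, red_s):
    """Cruise speed that aims to reach the stop line during a green phase.

    `red_s` is an assumed, known red duration for the next cycle.
    """
    if msg.distance_m <= 0.0:
        return v_max
    if msg.phase == "green":
        if msg.distance_m / v_max <= msg.remaining_s:
            return v_max  # the current green can still be reached
        wait_s = msg.remaining_s + red_s  # aim for the start of the next green
    else:
        wait_s = msg.remaining_s  # aim for the moment the red ends
    if wait_s <= 0.0:
        return v_max
    # Arrive as the green begins, within the allowed speed range.
    return max(v_min, min(v_max, msg.distance_m / wait_s))


def idm_acceleration(v, v_desired, gap_m=None, dv=0.0,
                     a_max=1.0, b=1.5, s0=2.0, headway_s=1.5, delta=4.0):
    """Standard IDM acceleration; `dv` is own speed minus leader speed.

    Parameter defaults are illustrative, not the values used in the paper.
    """
    free_term = 1.0 - (v / v_desired) ** delta
    if gap_m is None:  # no leading vehicle: free-road behavior only
        return a_max * free_term
    s_star = s0 + max(0.0, v * headway_s + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (free_term - (s_star / gap_m) ** 2)


# Example: red light with 20 s remaining, stop line 200 m ahead.
msg = V2IMessage(phase="red", remaining_s=20.0, distance_m=200.0)
v_ref = target_speed(msg, v_max=15.0, v_min=3.0, red_s=30.0)
accel = idm_acceleration(v=12.0, v_desired=v_ref)
```

In this sketch the signal-derived speed simply replaces the desired speed of the IDM, so the bus eases off early instead of braking at the stop line. If the clamp to `v_min` is active, the bus would still arrive before the green starts, and a stopping rule would be needed in addition.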

For speed planning, the results showed that frequent hard braking and strong acceleration under the standard IDM caused large changes in motor power and higher driving energy use. In contrast, IDM-G reduced driving energy use by about 5.0% by adjusting speed over a longer distance.

At the energy management level, the results showed that although SAC does not use future driving information, its FC power control was close to the DP optimal result. Under the IDM speed curve, strong slowing down and speeding up near traffic lights caused large changes in power demand, so hydrogen use for the IDM–SAC strategy reached 0.1594 kg. In contrast, under the smoother IDM-G speed curve, acceleration and deceleration were more gradual, and power demand changes were smaller. Because of this, SAC could learn a more stable power sharing pattern. As a result, hydrogen use for the IDM-G–SAC strategy dropped to 0.1414 kg, which was about 11.3% lower than that of IDM–SAC.

In summary, the proposed strategy allowed the bus to pass smoothly through intersections at the speed planning level and clearly reduced hydrogen use at the energy management level. The combined IDM-G and SAC control framework worked well together and improved overall vehicle economy. These results showed that coordinated speed planning and energy management have strong potential in complex urban traffic and provided useful guidance for energy-saving control of FCEVs. Compared with existing studies on fuel cell vehicle energy management and eco-driving control, this study emphasized the coordination between traffic-light-aware speed planning and learning-based energy management, enabling smoother vehicle operation at signalized intersections.

It should be noted that the present study focused on a deterministic simulation setting and several real-world factors were not explicitly modeled, including stochastic signal fluctuations, traffic queues induced by surrounding vehicles, passenger load variations, adverse weather conditions, and long-term fuel cell aging effects. Moreover, although the simulations were conducted on a specific FCB configuration and a signalized urban corridor, the framework itself was not vehicle or route specific. With appropriate parameter adaptation, the same hierarchical structure can be extended to other vehicle classes, such as passenger cars or hybrid buses, as well as to different traffic layouts, including corridors with varying signal densities or mixed traffic conditions.

Author Contributions

Conceptualization, W.G. and F.Y.; methodology, J.Z. (Jinming Zhang); software, W.G.; validation, J.Z. (Jiaming Zhou), S.W. (Shuo Wang) and S.W. (Shuaihua Wang); formal analysis, F.Y. and C.L.; investigation, Z.H.; resources, W.G.; data curation, C.L.; writing – original draft preparation, W.G.; writing – review and editing, W.G.; visualization, H.G.; supervision, F.Y.; project administration, F.Y.; funding acquisition, J.Z. (Jiaming Zhou) and J.Z. (Jinming Zhang). All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by Weifang University of Science and Technology High-level Talent Research Start-up Fund Project (KJRC2023001), Campus level Project of Weifang University of Science and Technology (2023KJ02 and 2023KJ03) and Weifang City Science and Technology Development Plan Project (College and University Section) (2024GX031 and 2025GX037).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Hassan, S.S.; Mohamed, N.R.; Saad, M.M.; Salem, A.M.; Ibrahim, Y.H.; Elshakour, A.A.; Fathy, M.A. A novel non-woven fabric sandwich filter with activated carbon/polypyrrole nanocomposite for the removal of CO, SO₂ and NOx emitted from gasoline engines. Fuel 2025, 401, 135838.
2. Menesy, A.S.; Sultan, H.M.; Zayed, M.E.; Habiballah, I.O.; Dmitriev, S.; Safaraliev, M.; Kamel, S. A modified slime mold algorithm for parameter identification of hydrogen-powered proton exchange membrane fuel cells. Int. J. Hydrogen Energy 2024, 86, 853–874.
3. Li, M.; Wan, X.; Wu, J.; Yan, M.; He, H. Enhancing Fuel Cell Electric Vehicle Efficiency With an Information-Bridged Hierarchical Reinforcement Learning Method. IEEE Trans. Intell. Transp. Syst. 2025, 26, 16076–16089.
4. Lu, D.; Hu, D.; Wang, J.; Wei, W.; Zhang, X. A data-driven vehicle speed prediction transfer learning method with improved adaptability across working conditions for intelligent fuel cell vehicle. IEEE Trans. Intell. Transp. Syst. 2025, 26, 10881–10891.
5. Wang, X.; Ji, J.; Li, J.; Zhao, Z.; Ni, H.; Zhu, Y. Review and Outlook of Fuel Cell Power Systems for Commercial Vehicles, Buses, and Heavy Trucks. Sustainability 2025, 17, 6170.
6. Huang, Y.; Kang, Z.; Mao, X.; Hu, H.; Tan, J.; Xuan, D. Deep reinforcement learning based energy management strategy considering running costs and energy source aging for fuel cell hybrid electric vehicle. Energy 2023, 283, 129177.
7. Huo, W.; Zhao, T.; Yang, F.; Chen, Y. An improved soft actor-critic based energy management strategy of fuel cell hybrid electric vehicle. J. Energy Storage 2023, 72, 108243.
8. Lu, D.; Yi, F.; Hu, D.; Li, J.; Yang, Q.; Wang, J. Online optimization of energy management strategy for FCV control parameters considering dual power source lifespan decay synergy. Appl. Energy 2023, 348, 121516.
9. Ajanovic, A.; Glatt, A.; Haas, R. Prospects and impediments for hydrogen fuel cell buses. Energy 2021, 235, 121340.
10. Caponi, R.; Ferrario, A.M.; Del Zotto, L.; Bocci, E. Hydrogen refueling stations and fuel cell buses four year operational analysis under real-world conditions. Int. J. Hydrogen Energy 2023, 48, 20957–20970.
11. Danielis, R.; Scorrano, M.; Masutti, M.; Awan, A.M.; Niazi, A.M.K. Fuel cell electric buses: A systematic literature review. Energies 2024, 17, 5096.
12. Fakhreddine, O.; Gharbia, Y.; Derakhshandeh, J.F.; Amer, A. Challenges and solutions of hydrogen fuel cells in transportation systems: A review and prospects. World Electr. Veh. J. 2023, 14, 156.
13. Blades, L.A.W.; MacNeill, R.; Zhang, Y.; Cunningham, G.; Early, J. Determining the Distribution of Battery Electric and Fuel Cell Electric Buses in a Metropolitan Public Transport Network. In SAE Technical Paper; SAE International: Warrendale, PA, USA, 2022.
14. Suzuki, H.; Marumo, Y. A new approach to green light optimal speed advisory (GLOSA) systems for high-density traffic flow. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 362–367.
15. Mintsis, E.; Vlahogianni, E.I.; Mitsakis, E.; Ozkul, S. Enhanced speed advice for connected vehicles in the proximity of signalized intersections. Eur. Transp. Res. Rev. 2021, 13, 2.
16. Dong, H.; Zhuang, W.; Wu, G.; Li, Z.; Yin, G.; Song, Z. Overtaking-enabled eco-approach control at signalized intersections for connected and automated vehicles. IEEE Trans. Intell. Transp. Syst. 2023, 25, 4527–4539.
17. Xu, Y.; Xu, E.; Zheng, W.; Huang, Q. Hierarchical model-predictive-control-based energy management strategy for fuel cell hybrid commercial vehicles incorporating traffic information. Sustainability 2023, 15, 12833.
18. Zhang, L.; Liao, R.; Wei, X.; Huang, W. PMP method with a cooperative optimization algorithm considering speed planning and energy management for fuel cell vehicles. Int. J. Hydrogen Energy 2024, 79, 434–447.
19. Wei, X.; Leng, J.; Sun, C.; Huo, W.; Ren, Q.; Sun, F. Co-optimization method of speed planning and energy management for fuel cell vehicles through signalized intersections. J. Power Sources 2022, 518, 230598.
20. Zhao, X.; Xu, L.; Yan, R. Rule-based Energy Management Strategy for Fuel Cell Vehicles. In Proceedings of the 2024 IEEE 25th China Conference on System Simulation Technology and Its Application (CCSSTA), Tianjin, China, 21–23 July 2024; pp. 516–520.
21. Chen, J.; He, H.; Wang, Y.-X.; Quan, S.; Zhang, Z.; Wei, Z.; Han, R. Research on energy management strategy for fuel cell hybrid electric vehicles based on improved dynamic programming and air supply optimization. Energy 2024, 300, 131567.
22. Gómez-Barroso, Á.; Alonso Tejeda, A.; Vicente Makazaga, I.; Zulueta Guerrero, E.; Lopez-Guede, J.M. Dynamic Programming-Based ANFIS Energy Management System for Fuel Cell Hybrid Electric Vehicles. Sustainability 2024, 16, 8710.
23. Jouda, B.; Al-Mahasneh, A.J.; Mallouh, M.A. Deep stochastic reinforcement learning-based energy management strategy for fuel cell hybrid electric vehicles. Energy Convers. Manag. 2024, 301, 117973.
24. Ruan, J.; Wu, C.; Liang, Z.; Liu, K.; Li, B.; Li, W.; Li, T. The application of machine learning-based energy management strategy in a multi-mode plug-in hybrid electric vehicle, part II: Deep deterministic policy gradient algorithm design for electric mode. Energy 2023, 269, 126792.
25. Tang, X.; Zhou, H.; Wang, F.; Wang, W.; Lin, X. Longevity-conscious energy management strategy of fuel cell hybrid electric Vehicle Based on deep reinforcement learning. Energy 2022, 238, 121593.
26. Khalatbarisoltani, A.; Zhou, H.; Tang, X.; Kandidayeni, M.; Boulon, L.; Hu, X. Energy management strategies for fuel cell vehicles: A comprehensive review of the latest progress in modeling, strategies, and future prospects. IEEE Trans. Intell. Transp. Syst. 2023, 25, 14–32.
27. Jia, C.; Liu, W.; He, H.; Chau, K. Superior energy management for fuel cell vehicles guided by improved DDPG algorithm: Integrating driving intention speed prediction and health-aware control. Appl. Energy 2025, 394, 126195.
28. Li, K.; He, H.; Wu, J.; Wei, Z.; Zhou, Z.; Xing, B. Power-Aware Predictive Energy Management Integrating a Transformer-Based Hybrid Prediction Method for Fuel Cell Vehicles. IEEE Trans. Transp. Electrif. 2025, 11, 13339–13350.
29. Wu, J.; Wei, Z.; He, H.; Wei, H.; Li, S.; Gao, F. Ensembled traffic-aware transformer-based predictive energy management for electrified vehicles. IEEE Trans. Intell. Transp. Syst. 2024, 25, 12333–12346.
30. Yan, M.; Li, G.; Li, M.; He, H.; Xu, H.; Liu, H. Hierarchical predictive energy management of fuel cell buses with launch control integrating traffic information. Energy Convers. Manag. 2022, 256, 115397.
31. Jia, C.; Liu, W.; Chau, K.; He, H.; Zhou, J.; Niu, S. Passenger-aware reinforcement learning for efficient and robust energy management of fuel cell buses. eTransportation 2026, 27, 100537.
32. Li, Z.; Zhuang, W.; Yin, G.; Ju, F.; Wang, Q.; Ding, H. Learning-based eco-driving strategy design for connected power-split hybrid electric vehicles at signalized corridors. In Proceedings of the 2022 IEEE Intelligent Vehicles Symposium (IV), Aachen, Germany, 4–9 June 2022; pp. 1226–1233.

Figure 1. Schematic diagram of the training and testing process for the collaborative optimization framework of speed planning and energy management.

Figure 2. Schematic of the multi-signalized corridor.

Figure 3. Schematic diagram of energy flow and power system structure of FCB.

Figure 4. MAP diagram of drive motor.

Figure 5. Power system characteristic diagram. (a) Efficiency and hydrogen consumption of FC; (b) battery internal resistance and voltage characteristics vary with SOC.

Figure 6. Transmission system model validation under the CHTC-B driving cycle.

Figure 7. Comparison of FCB trajectory and speed between IDM and IDM-G. (a) Trajectory curve; (b) speed curve.

Figure 8. The energy recovered by braking and consumed by driving.

Figure 9. Reward curve of SAC EMS in the training process.

Figure 10. Comparison of FC power allocation between SAC and DP under different speed planning conditions. (a) FC power fluctuation curve under IDM; (b) FC power fluctuation curve under IDM-G.

Figure 11. Comparison of SOC and hydrogen consumption under speed planning and EMS. (a) SOC curve of power battery; (b) cumulative hydrogen consumption curve.

Table 1. Summary of representative studies.

Category	Reference	Methodology	Main Finding
Speed planning	Suzuki et al. [14]	Rule-based	Improved intersection passing efficiency
	Mintsis et al. [15]	Rule-based	Improved driving smoothness
	Dong et al. [16]	Pontryagin’s minimum principle	Reduced energy cost through optimized speed
	Xu et al. [17]	MPC and DP	Improved energy efficiency
	Zhang et al. [18]	Pontryagin’s minimum principle	Reduced hydrogen consumption
	Wei et al. [19]	DP	Improved energy efficiency
EMS	Jia et al. [27]	DDPG	Enhanced FC efficiency and lifespan
	Li et al. [28]	TD3	Reduced hydrogen consumption and battery degradation
	Wu et al. [29]	SAC	Improved economic performance

Table 2. V2I information.

Category	Description	Unit
V2I	Phase of 1st next traffic light	--
	Remaining time of 1st next traffic light	s
	Distance to 1st next traffic light	m

Table 3. FCB parameters.

Parameter	Value	Unit
Vehicle mass	13,500	kg
Frontal Area	8.16	m²
Wheel Radius	0.47	m
Air resistance coefficient	0.55	--
Air density	1.226	kg/m³
Rolling Resistance	0.0085	--
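To give a feel for what the Table 3 parameters imply, the sketch below evaluates the textbook longitudinal road-load equation (rolling resistance, aerodynamic drag, grade, and inertia) at the wheels. It is not the transmission model of this study: gravitational acceleration is an assumed constant, rotating-mass and driveline losses are ignored, and the function name is illustrative.

```python
import math

# Values from Table 3.
MASS_KG = 13_500.0
FRONTAL_AREA_M2 = 8.16
DRAG_COEFF = 0.55
AIR_DENSITY_KG_M3 = 1.226
ROLLING_COEFF = 0.0085

G_M_S2 = 9.81  # standard gravity; assumed, not listed in Table 3


def wheel_power_kw(speed_mps, accel_mps2, grade_rad=0.0):
    """Traction power at the wheels in kW (negative values mean braking)."""
    rolling_n = MASS_KG * G_M_S2 * ROLLING_COEFF * math.cos(grade_rad)
    aero_n = 0.5 * AIR_DENSITY_KG_M3 * DRAG_COEFF * FRONTAL_AREA_M2 * speed_mps ** 2
    grade_n = MASS_KG * G_M_S2 * math.sin(grade_rad)
    inertia_n = MASS_KG * accel_mps2
    return (rolling_n + aero_n + grade_n + inertia_n) * speed_mps / 1000.0


cruise_kw = wheel_power_kw(10.0, 0.0)    # steady 36 km/h on a flat road
launch_kw = wheel_power_kw(10.0, 1.0)    # same speed while accelerating at 1 m/s^2
```

With these numbers, steady cruising at 10 m/s needs about 14 kW at the wheels, while accelerating at 1 m/s² at the same speed adds 135 kW of inertial power. This illustrates why the stop-and-go behavior discussed in the results dominates the power demand of a 13.5 t bus.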

Table 4. Boundaries of the model parameters.

Parameter	Lower Bound	Upper Bound	Unit
Fuel cell power	0	60	kW
Fuel cell efficiency	0	56	%
DC/DC efficiency	90	95	%
Power battery voltage	540	738	V
Drive motor power	0	200	kW
Drive motor efficiency	85	97	%

Table 5. Comparison of speed planning performance between IDM and IDM-G.

Metric	IDM	IDM-G	Unit
Travel time	299	299	s
Stop count	3	0	--
RMS acceleration	0.530	0.443	m/s²
RMS jerk	0.186	0.228	m/s³
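Metrics such as the RMS acceleration and RMS jerk in Table 5 can be obtained from a sampled speed trace by finite differences. The sketch below assumes a uniformly sampled trace and simple forward differences; the sampling step and differencing scheme used in the study are not stated here, so both are assumptions.

```python
import math


def rms(values):
    """Root mean square of a non-empty sequence."""
    return math.sqrt(sum(x * x for x in values) / len(values))


def rms_accel_and_jerk(speeds_mps, dt_s=1.0):
    """RMS acceleration (m/s^2) and RMS jerk (m/s^3) of a speed trace.

    Uses forward differences on uniformly spaced samples (an assumption).
    """
    if len(speeds_mps) < 3:
        raise ValueError("need at least three speed samples")
    accel = [(b - a) / dt_s for a, b in zip(speeds_mps, speeds_mps[1:])]
    jerk = [(b - a) / dt_s for a, b in zip(accel, accel[1:])]
    return rms(accel), rms(jerk)


# Toy trace for illustration only; not data from the paper.
rms_a, rms_j = rms_accel_and_jerk([0.0, 1.0, 3.0, 6.0], dt_s=1.0)
```

The two metrics capture different things: Table 5 reports a lower RMS acceleration for IDM-G (0.443 versus 0.530 m/s²) but a higher RMS jerk (0.228 versus 0.186 m/s³), so a profile with gentler acceleration overall can still change its acceleration more quickly.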

Table 6. Comparison of hydrogen consumption under different EMS.

Speed Profile	Rule-Based (kg)	DP (kg)	SAC (kg)	DP/SAC
IDM	0.1914	0.1423	0.1594	89.3%
IDM-G	0.1852	0.1350	0.1414	95.5%

Table 7. Comparison of hydrogen consumption under unseen traffic scenarios.

Metric	IDM–SAC (Mean ± Std.)	IDM-G–SAC (Mean ± Std.)	Improvement
Hydrogen consumption (kg)	0.1433 ± 0.0085	0.1326 ± 0.0119	↓ 7.5%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Published by MDPI on behalf of the World Electric Vehicle Association. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.

Share and Cite

MDPI and ACS Style

Guo, W.; Yi, F.; Zhou, J.; Zhang, J.; Wang, S.; Gong, H.; Wang, S.; Huang, Z.; Liu, C. A Cooperative Optimization Method for Speed Planning and Energy Management of Fuel Cell Buses at Multi-Signalized Intersections. World Electr. Veh. J. 2026, 17, 79. https://doi.org/10.3390/wevj17020079

