Research on Global Optimization Algorithm of Integrated Energy and Thermal Management for Plug-In Hybrid Electric Vehicles

Power distribution and battery thermal management are important technologies for improving the energy efficiency of plug-in hybrid electric vehicles (PHEVs). In response to the global optimization of integrated energy thermal management strategy (IETMS) for PHEVs, a dynamic programming algorithm based on adaptive grid optimization (AGO–DP) is proposed in this paper to improve optimization performance by reducing the optimization range of SOC and battery temperature, and adaptively adjusting the grid distribution of state variables according to the actual feasible region. The simulation results indicate that through AGO–DP optimization, the reduction ratio of the state feasible region is more than 30% under different driving conditions. Meanwhile, the algorithm can obtain better global optimal driving costs more rapidly and accurately than traditional dynamic programming algorithms (DP). The computation time is reduced by 33.29–84.67%, and the accuracy of the global optimal solution is improved by 0.94–16.85% compared to DP. The optimal control of the engine and air conditioning system is also more efficient and reasonable. Furthermore, AGO–DP is applied to explore IETMS energy-saving potential for PHEVs. It is found that the IETMS energy-saving potential range is 3.68–23.74% under various driving conditions, which increases the energy-saving potential by 0.55–3.26% compared to just doing the energy management.


Introduction
With the development of new energy technologies in the automotive industry, the plugin hybrid electric vehicle (PHEV) has become one of the important approaches to reducing carbon emissions and the usage of fossil fuels and achieving the electrification of transport and sustainable development [1,2]. The PHEV has multiple power sources, including engine and battery, which can greatly alleviate the problem of 'mileage anxiety' in battery electric vehicles [3]. The PHEV has a larger battery capacity and a longer pure electric range than other hybrid electric vehicles. Meanwhile, it can directly supplement electricity from the power grid, with more energy-saving performance. Energy management among multiple power sources and battery thermal management are the crucial technologies for improving the performance of PHEVs.
The energy management strategies (EMS) of PHEV are used to reasonably and effectively distribute the output power between the engine and battery to meet the vehicle demand power [4,5], which can be divided into three categories: rule-based strategy, instantaneous optimization-based strategy, and global optimization-based strategy [6,7]. Rule-based strategy has the lowest computational complexity compared with other strategies, so it is widely applied in online vehicle control. Its control performance depends on logical rule formulation and parameter selection [8]. Although some studies have improved rules and introduced fuzzy logic for power distribution [9,10], it is still difficult to adapt to various real-world driving conditions. Optimization-based strategies require establishing an objective function and usually combine with vehicle speed prediction [11,12] and driving condition identification [13] to minimize the objective function within the optimization prediction horizon. Equivalent consumption minimization strategy [7,14] and model prediction control [5,[15][16][17] are common instantaneous optimization-based strategies, which have better economic efficiency but much higher computational complexity than rule-based strategies. In particular, global optimization-based strategies consider the entire vehicle driving cycle as a predictive horizon for the optimization problem, which is committed to obtaining the global optimal solution. The global optimization-based strategy needs to know the actual driving condition in advance and take it as input to optimize the power distribution of PHEVs, such as dynamic programming (DP) [18], Pontriagin's minimum principle [19], and the breadth-first search [20]. However, it inevitably increases the computational burden significantly, making it difficult to use for real-time vehicle control, although the global theoretical optimal solution obtained by DP through offline calculation is often used as a reference benchmark to evaluate the energy-saving potential of other online strategies. Otherwise, the optimal results of DP under different driving conditions are also applied to guide the development of online strategies for PHEVs. Liu et al. summarized the load-adaptive strategy rules by studying the relationship between demand power and battery output power in the DP optimal results [21]. The optimal SOC allocation method is proposed by Yu according to the minimum fuel consumption distribution under different types of driving conditions obtained by DP [22].
Meanwhile, PHEVs are equipped with large-capacity batteries, so efficient battery thermal management can extend battery lifetime and ensure battery safety. Optimizing thermal management strategies (TMS) is also an important method to reduce vehicle energy consumption [23]. Wei et al. found that implementing integrated energy and thermal management strategy (IETMS) for hybrid systems can improve driving economy [24]. Amini et al. established a control-oriented air conditioning (AC) system model. They used a hierarchical model predictive control to achieve IETMS, which improved the fuel economy by 2.2-5.3% under different driving conditions [25]. Fabio et al. consider the battery degradation cost in the optimization objective function of IETMS and constructs Pareto curves through convex programming collaborative optimization system matching and real-time control to balance battery degradation and energy consumption, increasing battery lifetime by more than 15% [26]. The classification of IETMS is similar to that of EMS, including rule-based, instantaneous optimization-based, and global optimization-based strategies. However, existing research on IETMS mostly focuses on instantaneous optimization-based control, lacking global optimization research for this problem. This makes it difficult to quantitatively evaluate the energy-saving potential of the real-time IETMS under different driving conditions, nor to guide the development of online control strategies. DP algorithm is the most common solving algorithm to obtain the global optimal power distribution solution for PHEVs. Battery state-of-charge (SOC) and engine output power are usually taken as the PHEV system state variable and control variables, respectively. Xu et al. also treated the vehicle work mode as a system state variable and added a penalty function to reduce the number of vehicle work mode switches [18]. Zhou et al. unified the EMS global optimization problem into a standardized expression based on the powertrain structural characteristics of series, parallel, and series-parallel hybrid electric vehicles [27]. However, the state variables and control variables of the PHEV system need to be discretization when solving the global optimization problem by DP. With the increase of system state variables and grid point density, the computational burden of the DP algorithm increases exponentially, resulting in a 'Curse of dimensionality'. To alleviate this issue, Claudio et al. adaptively adjusted the grid number of SOC based on the probability distribution of battery output power under a standard driving cycle to reduce computational burden [28]. Xu et al. established a hierarchical global optimization framework of 'information layer-physical layer-energy layer-dynamic programming' to obtain travel information at various times. They improved the optimization performance of the DP algorithm by optimizing the actual feasible region of SOC. The calculation time was reduced by more than 25% under different driving conditions [29]. The research on improving the solution performance of the EMS global optimization problem by DP has achieved good results, but its results cannot be directly used in IETMS. This is because EMS needs to consider only one continuous state variable, SOC, while IETMS includes two continuous state variables, SOC and battery temperature, which increases computational complexity exponentially. Additionally, the state equation of the PHEV system and optimization method also have significant difference.
To address the above issues in existing research, a novel global optimization algorithm is proposed in this work to solve the IETMS global programming problem for PHEV. Firstly, the PHEV state equation with SOC and battery temperature as variables is established. Secondly, an improved dynamic programming algorithm based on adaptive grid optimization (AGO-DP) is proposed, which has a shorter computation time and higher optimization accuracy than the traditional DP algorithm. Finally, the IETMS energy-saving potential of PHEV under different driving conditions is analyzed.
The remainder of this paper is organized as follows. The control-oriented model of PHEV is developed in Section 2. Section 3 illustrates the global optimization principle based on AGO-DP. The simulation results and discussion are presented in Section 4. Finally, the conclusion is summarized in Section 5.

Vehicle Model
The powertrain architecture of the plug-in hybrid electric sport-utility vehicle (SUV) studied in this paper is shown in Figure 1. The vehicle includes two power sources: the engine and the battery, which can provide vehicle demand power alone or together. The main parameters of the vehicle and components are shown in Table 1. the DP algorithm by optimizing the actual feasible region of SOC. The was reduced by more than 25% under different driving conditions [29] improving the solution performance of the EMS global optimization pr achieved good results, but its results cannot be directly used in IETM EMS needs to consider only one continuous state variable, SOC, while two continuous state variables, SOC and battery temperature, which in tional complexity exponentially. Additionally, the state equation of the P optimization method also have significant difference.
To address the above issues in existing research, a novel global o rithm is proposed in this work to solve the IETMS global programm PHEV. Firstly, the PHEV state equation with SOC and battery temperat established. Secondly, an improved dynamic programming algorithm b grid optimization (AGO-DP) is proposed, which has a shorter comp higher optimization accuracy than the traditional DP algorithm. Finall ergy-saving potential of PHEV under different driving conditions is ana The remainder of this paper is organized as follows. The control-o PHEV is developed in Section 2. Section 3 illustrates the global optim based on AGO-DP. The simulation results and discussion are presente nally, the conclusion is summarized in Section 5.

Vehicle Model
The powertrain architecture of the plug-in hybrid electric sport-uti studied in this paper is shown in Figure 1. The vehicle includes two p engine and the battery, which can provide vehicle demand power alon main parameters of the vehicle and components are shown in Table 1.    The vehicle demand power of PHEV is used to provide energy for the drive motor, AC system, and auxiliary system. Therefore, the power balance equation of PHEV can be expressed as: P eng + P bat = P motor + P AC + P aux (1) where P eng and P bat denote the output power of the engine and battery. P motor , P AC and P aux are the demand power of the drive motor, AC system, and auxiliary system, respectively. The longitudinal driving demand power accounts for the majority of the demand power during driving, so lateral demand power is ignored, and the longitudinal driving demand power of the drive motor can be calculated as follows: where η T and η m are the efficiency of the transmission system and drive motor. P f , P i , P w and P j are the demand power to overcome rolling resistance, slope resistance, air resistance, and acceleration resistance, which can be written as: where m is vehicle mass, δ is the rotating mass conversion coefficient, including the translational mass and the rotating mass. C D is the aerodynamic drag coefficient, A is the equivalent windward area, f is rolling resistance coefficient, g is gravitational acceleration, θ is road grade, ρ is air density. v and a denote the vehicle's instantaneous speed and acceleration, respectively.
Due to the complexity of the transient model of the AC system, which involves numerous state variables, it is difficult to directly apply it to global optimization. Therefore, a steady-state model is used to describe the cooling characteristics of the AC system. Specifically, if the electric power of the AC system is constant, its actual cooling power also remains constant, satisfying the following relationship: The engine working point of the PHEV series studied in this paper can be adjusted arbitrarily. Hence, the minimum fuel consumption ̇ of the engine at different output powers can be obtained by integrating the generator efficiency, forming the optimal fuel consumption curve of the engine [30], as shown in Figure 3.

Equivalent Circuit Model
The internal resistance equivalent circuit model is used to describe the electrical characteristics of the battery, as shown in Figure 4. The expression of battery current can be calculated by Equation (5). The engine working point of the PHEV series studied in this paper can be adjusted arbitrarily. Hence, the minimum fuel consumption . m f uel of the engine at different output powers can be obtained by integrating the generator efficiency, forming the optimal fuel consumption curve of the engine [30], as shown in Figure 3. The engine working point of the PHEV series studied in this paper can be adjusted arbitrarily. Hence, the minimum fuel consumption ̇ of the engine at different output powers can be obtained by integrating the generator efficiency, forming the optimal fuel consumption curve of the engine [30], as shown in Figure 3.

Equivalent Circuit Model
The internal resistance equivalent circuit model is used to describe the electrical characteristics of the battery, as shown in Figure 4. The expression of battery current can be calculated by Equation (5).  The internal resistance equivalent circuit model is used to describe the electrical characteristics of the battery, as shown in Figure 4. The expression of battery current can be calculated by Equation (5).
where U t is the battery terminal voltage. U oc and R 0 are the battery's open-circuit voltage and internal resistance. They are crucial parameters of the battery cell and can be expressed as a function of SOC and battery temperature, as shown in Figure 5.

Equivalent Circuit Model
The internal resistance equivalent circuit model is used to describe the electrical characteristics of the battery, as shown in Figure 4. The expression of battery current can be calculated by Equation (5).
where is the battery terminal voltage. and 0 are the battery's open-circuit voltage and internal resistance. They are crucial parameters of the battery cell and can be expressed as a function of SOC and battery temperature, as shown in Figure 5. Moreover, according to the Ampere-hour integration method, the battery SOC can be expressed as Equation (6). Where is the battery capacity.

Thermal Model
The battery thermal model describes the heat generation, heat exchange, and temperature change processes of the battery [31]. According to Bernardi equation, the battery heat generation ̇ is expressed as Equation (7). The first term in Equation (7) is the heat generation caused by the current thermal effect. The second term is reversible heat, which accounts for a small proportion of the total heat generation, so it can be ignored.
The battery heat exchange process mainly includes the heat exchange power between the battery and the external environment, as well as the heat exchange power between the battery and the AC system. After determining the heat generation power and heat exchange power of the battery, its temperature can be described as: where is ambient temperature. ℎ, = 0.0625 °C/W is the thermal resistance between the battery and the environment. , and are the battery-specific heat and battery mass.

Degradation Model
The empirical model based on the Arrhenius equation proposed by Song et al. [32] is used to describe the degree of battery capacity degradation , which is written as: Moreover, according to the Ampere-hour integration method, the battery SOC can be expressed as Equation (6). Where Cap bat is the battery capacity.

Thermal Model
The battery thermal model describes the heat generation, heat exchange, and temperature change processes of the battery [31]. According to Bernardi equation, the battery heat generation . Q bat is expressed as Equation (7). The first term in Equation (7) is the heat generation caused by the current thermal effect. The second term is reversible heat, which accounts for a small proportion of the total heat generation, so it can be ignored.
The battery heat exchange process mainly includes the heat exchange power between the battery and the external environment, as well as the heat exchange power between the battery and the AC system. After determining the heat generation power and heat exchange power of the battery, its temperature can be described as: where T amb is ambient temperature. R th,bat = 0.0625 • C/W is the thermal resistance between the battery and the environment. c p,bat and m bat are the battery-specific heat and battery mass.

Degradation Model
The empirical model based on the Arrhenius equation proposed by Song et al. [32] is used to describe the degree of battery capacity degradation Cap loss , which is written as: where C rate = |I b (t)|/Q bat is the battery discharge rate. A h is the Ampere-hour throughput. R = 8.31445 J/(mol × • C) is the gas constant. E a is the activation energy. z is the power-law factor. T b and T c are the reference temperatures. A 1 and B 1 are the fitting coefficients. For the battery cell studied in this work, four degradation cycle test conditions were set up. The battery capacity degradation curve and parameter fitting results are shown in Figure 6.
is the activation energy. is the power-law factor. and are the reference temperatures. 1 and 1 are the fitting coefficients. For the battery cell studied in this work, four degradation cycle test conditions were set up. The battery capacity degradation curve and parameter fitting results are shown in Figure 6.

Statement of IETMS Global Optimization Problem
To study the global optimization problem of IETMS, it is necessary to determine the control and state variables of the PHEV system. Considering the need to ensure the thermal comfort of the cabin during the driving process in a high-temperature environment, the cabin temperature is generally maintained in a constant suitable temperature range, which decreases significantly only in the initial driving phase. Therefore, it is assumed that the cabin temperature and cabin cooling power ̇ are constant during the entire driving process. The cabin cooling power ̇ can be calculated according to the ambient temperature and the setpoint of the AC system [33]. Furthermore, SOC, battery temperature , and engine status are selected as the state variables . Among them, SOC and battery temperature are continuous variables, while the engine status is a discrete variable with values of 0 or 1. Control variables , include engine output power and battery cooling power ̇, as expressed in following equation: The goal of PHEV energy management is to improve vehicle fuel economy. During the actual driving process of the vehicle, the battery SOC decreases and consumes electricity, leading to battery life degradation. Therefore, the total driving costs of fuel consumption , electricity consumption , and battery degradation are considered in the cost-to-go function instead of just considering fuel consumption which is widely applied in the previous research [5,16]. The cost-to-go function can be described as: where is the engine start penalty term of USD 9.55 × 10 −4 . It means that if the engine start state changes from 0 to 1, the additional fuel consumption for engine start is

Statement of IETMS Global Optimization Problem
To study the global optimization problem of IETMS, it is necessary to determine the control and state variables of the PHEV system. Considering the need to ensure the thermal comfort of the cabin during the driving process in a high-temperature environment, the cabin temperature is generally maintained in a constant suitable temperature range, which decreases significantly only in the initial driving phase. Therefore, it is assumed that the cabin temperature and cabin cooling power . Q cab cool are constant during the entire driving process. The cabin cooling power . Q cab cool can be calculated according to the ambient temperature and the setpoint of the AC system [33]. Furthermore, SOC, battery temperature T bat , and engine status S eng are selected as the state variables x. Among them, SOC and battery temperature are continuous variables, while the engine status is a discrete variable with values of 0 or 1. Control variables u, include engine output power P eng and battery cooling power . Q bat cool , as expressed in following equation: The goal of PHEV energy management is to improve vehicle fuel economy. During the actual driving process of the vehicle, the battery SOC decreases and consumes electricity, leading to battery life degradation. Therefore, the total driving costs of fuel consumption L f uel , electricity consumption L ele , and battery degradation L bat are considered in the costto-go function instead of just considering fuel consumption which is widely applied in the previous research [5,16]. The cost-to-go function can be described as: where p eng is the engine start penalty term of USD 9.55 × 10 −4 . It means that if the engine start state S eng changes from 0 to 1, the additional fuel consumption for engine start is considered to be 0.55 g [18] to avoid frequent engine starts and stops. Furthermore, other costs in Equation (11) are defined as follows: where ρ f uel = 749 g/L is fuel density. α f uel = USD 1.3/L is the fuel cost. α ele = USD 0.265 /kWh is the electricity price. α bat = USD 8870.4 is the total price of the battery pack, whose unit price is set as USD 176/kWh [34]. Due to the system characteristics of vehicle components, the engine output power, battery output power, and battery cooling power are only taken within a certain range. For the state variables, their value range also needs to be limited according to the actual situation. Otherwise, the cabin thermal comfort is preferred to the battery cooling, so the maximum battery cooling power is the difference between the maximum cooling power of the AC system . Q veh cool_max and the cabin cooling power . Q cab cool . Then, the state equation of the PHEV system must meet the corresponding constraints:

Dynamic Programming Principle
When solving global optimization problems using the DP algorithm, it is needed to decompose multi-stage decisions into a series of single-stage optimization problems. The initial and terminal values of state variables are determined in advance. Subsequently, to formulate the optimal decision sequence, each single-stage optimization problem is sequentially solved in a backward direction to minimize the cost-to-go function. Finally, the optimal state of each stage is calculated forward from the initial state according to the optimal decision sequence. The solving principle of the DP algorithm is shown in Figure 7. The specific solution process is as follows.

1.
Calculate cost-to-go function J * x i t N of each state x i t N in the terminal state space: 2.
Calculate cost-to-go function J * x i t of each state x i t at time t: where J * x i t+1 is the optimal cost-to-go function from the state x i t+1 at the t + 1 time stage to the terminal state. The recursive relationship is as follows: 3.
According to the optimal cost-to-go function J * x i t at time t, solve the corresponding optimal control u * x i t+1 : 4.
Repeat the above steps until the initial state, and complete the backward optimization process of the DP algorithm. Then, through the state equation of the PHEV system, the optimal state and optimal control sequences of each time stage are solved in a forward direction to determine the global optimal solution.

Dynamic Programming Principle
When solving global optimization problems using the DP algorithm, it is needed to decompose multi-stage decisions into a series of single-stage optimization problems. The initial and terminal values of state variables are determined in advance. Subsequently, to formulate the optimal decision sequence, each single-stage optimization problem is sequentially solved in a backward direction to minimize the cost-to-go function. Finally, the optimal state of each stage is calculated forward from the initial state according to the optimal decision sequence. The solving principle of the DP algorithm is shown in Figure  7. The specific solution process is as follows.

Adaptive Grid Optimization of State Variables
The constraints in Equation (15) specify the range of the control variables and state variables, but there is a coupling relationship between these. Due to the finite value range of the control variables, the change of the state variable SOC and battery temperature in adjacent time stages is also limited according to Equations (6) and (8), and there are no sudden changes in state variables. Therefore, if the initial and terminal values of the state variables are determined in advance, the actual feasible region of the state at each time may be less than the constraint range in Equation (15). AGO-DP is proposed in this paper to determine the actual value range of the state variables at different time stages and remove the infeasible region in the state constraints by combining the system state equation. At the same time, according to the practical, feasible region of state variables at different time stages, the interval of state grid points is adaptively adjusted to improve the accuracy of global optimization by DP without increasing the additional computation time.
To reduce the range of SOC and optimize the grid distribution, it is necessary to recursively calculate the SOC variation ranges along the forward and backward directions. Furthermore, combine the upper and lower limits of SOC constraints to take the intersection of these ranges as the actual feasible region of SOC at each time stage. The specific steps are as follows: 1.
The variation range of SOC at each time stage is related to the maximum and minimum output power of the battery at that time. The output power limit of the battery at time t can be calculated according to Equations (20) and (21). A positive battery output power indicates that the battery is discharging and the current is also positive; On the contrary, a negative output power indicates that the battery is charging, and the current is also negative.
After determining the actual limit of the battery output power, the recursive calculation in a forward direction can be performed from the SOC initial state. Then the upper and lower limits of SOC at the next time stage correspond to the minimum output power and maximum output power of the battery, which is expressed as: where I P bat_max (t) and I P bat_min (t) are the corresponding battery output currents when operating at maximum and minimum output power at time t.

3.
The recursive calculation in a backward direction is performed from the SOC terminal state. The upper and lower limits of SOC at the previous time correspond to the maximum output power and minimum output power of the battery, respectively, which is expressed as: 4. According to the above method, the forward recursive boundary SOC f _max/min and the backward recursive boundary SOC b_max/min of SOC are obtained, respectively. The optimized limits of the SOC feasible region can be calculated by Equations (26) and (27), as shown in Figure 8.

5.
According to the SOC feasible region range at each time stage, the SOC grid interval is adaptively adjusted to ensure that all state grid points are within the feasible region. In this paper, the state grid points are evenly distributed, and the number of SOC grid points in the global optimization problem is set to N soc,grid . The position of SOC grid points SOC grid (t, i) can be calculated by: where _max (t) and _min (t) are the corresponding battery output currents when operating at maximum and minimum output power at time .
3. The recursive calculation in a backward direction is performed from the SOC terminal state. The upper and lower limits of SOC at the previous time correspond to the maximum output power and minimum output power of the battery, respectively, which is expressed as: 4. According to the above method, the forward recursive boundary _max/min and the backward recursive boundary _max/min of SOC are obtained, respectively. The optimized limits of the SOC feasible region can be calculated by Equations (26) and (27), as shown in Figure 8. When optimizing the feasible region of battery temperature during the vehicle driving process, the key lies in determining the real-time output current range, thereby calculating the maximum and minimum heat generation of the battery at each time stage. According to Equation (8), it can be seen that when the absolute value of the battery current is the highest, and there is no battery cooling power, the battery has the maximum temperature rise. Conversely, when the absolute value of the battery current is the smallest and the cooling power is the highest, the battery temperature decreases to the maximum allowable extent. The feasible region optimization steps of battery temperature are as follows: 1.
If the maximum output power P bat_max (t) and minimum output power P bat_min (t) of the battery are both positive or negative at time t, the battery must be discharged or charged at this time to meet the vehicle demand power. At this time, the maximum and minimum heat generation of the battery are: . .
If the plus-minus sign of the maximum battery output power and the minimum battery output power are opposite, it means that the battery does not need to be operated at this moment, and the demand power can be met only by the engine. Hence, the maximum heat generation can still be calculated according to Equation (29), while the minimum heat generation is 0.

2.
The battery heat exchange boundary at each moment can be calculated by: 3.
The recursive calculation in a forward direction is performed from the battery temperature's initial state. The upper and lower boundaries of battery temperature at each time can be calculated by: T bat, f _max (t + 1) = min T bat, f _max (t) + P ex,bat_min (t)·∆t c p,bat m bat , T bat_max (33) T bat, f _min (t + 1) = max T bat, f _min (t) + 4. The recursive calculation in a backward direction is performed from the battery temperature terminal state. The boundaries of battery temperature can be expressed as: T bat,b_min (t − 1) = max T bat,b_min (t) −

5.
According to the forward recursive boundary T bat, f _max/min and backward recursive boundary T bat,b_max/min , the limits of the battery temperature feasible region can be calculated by: T bat,opt_min (t) = max T bat, f _min (t), T bat,b_min (t)  6. Similarly, the state grid points of battery temperature are evenly distributed. The number of grid points of battery temperature is set to N Tbat,grid , and the position of grid points T bat,grid (t, i) is: i = 0, 1, 2, . . . , N Tbat,grid

Comparison of Global Optimization Algorithms
In order to verify the superiority of AGO-DP in solving the IETMS global optimization problem of PHEV, three standard driving cycles are selected as test cycles: urban dynamometer driving schedule (UDDS), highway fuel economy test driving cycle (HWFET) and world light-duty vehicle test cycle (WLTC), and the speed curves are shown in Figure 9. These test cycles include high-speed driving conditions, low-speed driving conditions, and a combination of high and low-speed driving conditions, which can better cover various scenarios in daily driving.

Comparison of Global Optimization Algorithms
In order to verify the superiority of AGO-DP in solving the IETMS global optim tion problem of PHEV, three standard driving cycles are selected as test cycles: urban namometer driving schedule (UDDS), highway fuel economy test driving cycle (HWF and world light-duty vehicle test cycle (WLTC), and the speed curves are shown in Fi 9. These test cycles include high-speed driving conditions, low-speed driving conditi and a combination of high and low-speed driving conditions, which can better cover ious scenarios in daily driving. The simulation time step is set to 1 s to discretize the driving condition. The amb temperature and the light intensity are set to 35 °C and 800 W/m 2 , and the setpoint perature of the cabin is 25 °C. According to the cabin thermal load expression propo by Zhao [33], the cabin cooling power is 2320 W at this condition. The initial battery perature is consistent with the ambient temperature, and the battery terminal tempera should be at an appropriate temperature to rapidly obtain a large current when fast ch ing after driving, so it is set to 25 °C [35]. The SOC at the end of the driving cond should be kept at a low level to fully consume electricity and improve the vehicle en efficiency, so the initial and terminal SOCs are set to 0.85 and 0.45, respectively. For IETMS global optimization problem of PHEV, the relevant parameters of state and con variables are listed in Table 2. All computations in this work are performed on an Core i7-11800H CPU at 2.30 GHz with 16 GB RAM and 64 bits system. The simulation time step is set to 1 s to discretize the driving condition. The ambient temperature and the light intensity are set to 35 • C and 800 W/m 2 , and the setpoint temperature of the cabin is 25 • C. According to the cabin thermal load expression proposed by Zhao [33], the cabin cooling power is 2320 W at this condition. The initial battery temperature is consistent with the ambient temperature, and the battery terminal temperature should be at an appropriate temperature to rapidly obtain a large current when fast charging after driving, so it is set to 25 • C [35]. The SOC at the end of the driving condition should be kept at a low level to fully consume electricity and improve the vehicle energy efficiency, so the initial and terminal SOCs are set to 0.85 and 0.45, respectively. For the IETMS global optimization problem of PHEV, the relevant parameters of state and control variables are listed in Table 2. All computations in this work are performed on an Intel Core i7-11800H CPU at 2.30 GHz with 16 GB RAM and 64 bits system. The difference between the AGO-DP algorithm and the traditional DP algorithm is mainly reflected in the feasible region of the state variables. Taking the simulation result of the 5 × HWFET driving condition as an example, the practical, feasible region of SOC and battery temperature during the entire driving process optimized by AGO-DP is shown in Figure 10. The reduction effect of the state feasible range is most significant during the initial and final stages of the driving condition, and its range is much smaller than the limits of state variables in Table 2. This is because the initial and final values of system state variables are determined, and the driving demand power, engine output power, and battery cooling power all have limited value ranges. Therefore, the change rates of SOC and battery temperature are finite without sudden change, and their feasible regions gradually expand until reaching the state variable limits. The difference between the AGO-DP algorithm and the traditional DP algorith mainly reflected in the feasible region of the state variables. Taking the simulation r of the 5 × HWFET driving condition as an example, the practical, feasible region of and battery temperature during the entire driving process optimized by AGO-D shown in Figure 10. The reduction effect of the state feasible range is most significant ing the initial and final stages of the driving condition, and its range is much smaller the limits of state variables in Table 2. This is because the initial and final values of sy state variables are determined, and the driving demand power, engine output power battery cooling power all have limited value ranges. Therefore, the change rates of and battery temperature are finite without sudden change, and their feasible regions ually expand until reaching the state variable limits. The three test cycles in Figure 9 are repeated several times. Meanwhile, AGOused for global optimization, and the reduction ratio of state feasible region can be c lated as follows: It can be seen from Figure 11 that AGO-DP can effectively reduce the feasible r of state variables under various driving conditions, but the reduction ratio varies am different types of driving conditions. The reduction ratio of HWFET is particularly s icant, with a maximum reduction ratio of over 70%. This is because when the energy sumption of the entire driving process is at the same level, the high-speed driving c tion requires higher driving power and lasts for a shorter time. It results in a shorter range between the upper and lower limits of the state variables and a higher propo of feasible region reduction. As the driving cycle number increases, the duration o state feasible region boundary consistent with the state limit is longer, so the effe The three test cycles in Figure 9 are repeated several times. Meanwhile, AGO-DP is used for global optimization, and the reduction ratio of state feasible region can be calculated as follows: It can be seen from Figure 11 that AGO-DP can effectively reduce the feasible region of state variables under various driving conditions, but the reduction ratio varies among different types of driving conditions. The reduction ratio of HWFET is particularly significant, with a maximum reduction ratio of over 70%. This is because when the energy consumption of the entire driving process is at the same level, the high-speed driving condition requires higher driving power and lasts for a shorter time. It results in a shorter time range between the upper and lower limits of the state variables and a higher proportion of feasible region reduction. As the driving cycle number increases, the duration of the state feasible region boundary consistent with the state limit is longer, so the effect of feasible region reduction is weakened. However, AGO-DP can still maintain a reduction ratio of nearly 30%.
Sensors 2023, 23, x FOR PEER REVIEW feasible region reduction is weakened. However, AGO-DP can still maintain a re ratio of nearly 30%. To verify the impact of state-feasible region reduction on global optimization mance, three sets of driving conditions are selected for simulation analysis under d grid point numbers of the state SOC and battery temperature variables. The tra DP and AGO-DP are used for IETMS global optimization, respectively. The glob mal solution and computation time vary with the state grid point numbers, as sh Figure 12. The global optimal solution is the optimal driving cost under the corresp driving condition, and to eliminate the impact of driving distance on the optimal s it is normalized to driving cost per 100 km. The global optimization performance of AGO-DP is significantly better than the traditional DP algorithm under various driving conditions. The optimal driv of AGO-DP decreases slightly with the increase of the grid point number, and its gence rate is faster. For the driving conditions of WLTC and HWFET, even if t To verify the impact of state-feasible region reduction on global optimization performance, three sets of driving conditions are selected for simulation analysis under different grid point numbers of the state SOC and battery temperature variables. The traditional DP and AGO-DP are used for IETMS global optimization, respectively. The global optimal solution and computation time vary with the state grid point numbers, as shown in Figure 12. The global optimal solution is the optimal driving cost under the corresponding driving condition, and to eliminate the impact of driving distance on the optimal solution, it is normalized to driving cost per 100 km.
Sensors 2023, 23, x FOR PEER REVIEW feasible region reduction is weakened. However, AGO-DP can still maintain a re ratio of nearly 30%. To verify the impact of state-feasible region reduction on global optimization mance, three sets of driving conditions are selected for simulation analysis under d grid point numbers of the state SOC and battery temperature variables. The tra DP and AGO-DP are used for IETMS global optimization, respectively. The glob mal solution and computation time vary with the state grid point numbers, as sh Figure 12. The global optimal solution is the optimal driving cost under the corresp driving condition, and to eliminate the impact of driving distance on the optimal s it is normalized to driving cost per 100 km. The global optimization performance of AGO-DP is significantly better than the traditional DP algorithm under various driving conditions. The optimal driv of AGO-DP decreases slightly with the increase of the grid point number, and its gence rate is faster. For the driving conditions of WLTC and HWFET, even if t variable grid is only 30, an ideal optimal cost can also be obtained. The optima basically stable when the grid point number reaches 50. However, DP has poor op The global optimization performance of AGO-DP is significantly better than that of the traditional DP algorithm under various driving conditions. The optimal driving cost of AGO-DP decreases slightly with the increase of the grid point number, and its convergence rate is faster. For the driving conditions of WLTC and HWFET, even if the state variable grid is only 30, an ideal optimal cost can also be obtained. The optimal cost is basically stable when the grid point number reaches 50. However, DP has poor optimization performance when the grid point number is less than 100, and its convergence rate of the optimal cost is also slow. Meanwhile, the optimal driving cost of DP in a stable stage is slightly higher than AGO-DP. This is because the state grid points of DP are uniformly distributed within the limits of state variables, resulting in a large number of state grid points of DP being outside the actual feasible region, especially in the initial and final stages of the driving conditions. Not only does it increase the unnecessary computational burden, but it also has varying degrees of impact on the global optimization accuracy of each driving condition.
The two algorithms have the greatest difference in global optimization performance for UDDS. Due to the low driving demand power of UDDS, the descent of SOC feasible region boundary is the slowest, as shown in Figure 13, which minimizes the state grid point number in the feasible region for global optimization. Meanwhile, UDDS has the longest driving time, which causes greater interpolation cumulative errors and reduces optimization accuracy. AGO-DP adaptively adjusts the distribution of the state grid points according to the actual feasible region at different times, ensuring that all grid points are within the feasible region. It also refines the grid interval of the feasible region, thereby reducing interpolation errors and making the cost-to-go function of backward optimization more accurate. Therefore, AGO-DP can obtain the global optimal solution more rapidly and accurately. The driving demand power of HWFET and WLTC is larger than that of UDDS, so the boundary of the SOC feasible region decreases rapidly. It makes the difference between the optimized feasible region and the limits of state variables not significant during the initial and final stages of the driving condition. The traditional DP algorithm also has a large number of state grid points falling in the feasible region. Hence, its interpolation error is smaller than that of UDDS, and the optimal driving cost is close to AGO-DP. However, the convergence rate of the optimal solution is still significantly slower than that of AGO-DP.
Sensors 2023, 23, x FOR PEER REVIEW distributed within the limits of state variables, resulting in a large number of sta points of DP being outside the actual feasible region, especially in the initial an stages of the driving conditions. Not only does it increase the unnecessary comput burden, but it also has varying degrees of impact on the global optimization accu each driving condition.
The two algorithms have the greatest difference in global optimization perfor for UDDS. Due to the low driving demand power of UDDS, the descent of SOC f region boundary is the slowest, as shown in Figure 13, which minimizes the sta point number in the feasible region for global optimization. Meanwhile, UDDS h longest driving time, which causes greater interpolation cumulative errors and r optimization accuracy. AGO-DP adaptively adjusts the distribution of the stat points according to the actual feasible region at different times, ensuring that a points are within the feasible region. It also refines the grid interval of the feasible thereby reducing interpolation errors and making the cost-to-go function of backwa timization more accurate. Therefore, AGO-DP can obtain the global optimal so more rapidly and accurately. The driving demand power of HWFET and WLTC is than that of UDDS, so the boundary of the SOC feasible region decreases rapidly. It the difference between the optimized feasible region and the limits of state variab significant during the initial and final stages of the driving condition. The traditio algorithm also has a large number of state grid points falling in the feasible region. its interpolation error is smaller than that of UDDS, and the optimal driving cost i to AGO-DP. However, the convergence rate of the optimal solution is still signifi slower than that of AGO-DP. The solid and dotted lines in Figure 13 are the optimal state trajectories obtai AGO-DP and DP with the state grid points of 250, respectively. Figure 14 shows th mal trajectories of engine output power and battery cooling power. The decline o optimal trajectory of DP is faster than that of AGO-DP, which indicates that the outputs most of the vehicle demand power in the early stage of the driving conditio the engine is less involved. The overall driving condition characteristics are not we sidered by DP. The battery temperature trajectory obtained by DP under UDDS significantly from AGO-DP. It is also due to the smaller number of state grid points The solid and dotted lines in Figure 13 are the optimal state trajectories obtained by AGO-DP and DP with the state grid points of 250, respectively. Figure 14 shows the optimal trajectories of engine output power and battery cooling power. The decline of SOC optimal trajectory of DP is faster than that of AGO-DP, which indicates that the battery outputs most of the vehicle demand power in the early stage of the driving condition, Sensors 2023, 23, 7149 16 of 21 and the engine is less involved. The overall driving condition characteristics are not well considered by DP. The battery temperature trajectory obtained by DP under UDDS differs significantly from AGO-DP. It is also due to the smaller number of state grid points in the feasible region during the initial and final stages, resulting in significant interpolation errors. The global optimization performance difference exceeds more than 10% compared to AGO-DP. Furthermore, most of the engine output power of AGO-DP is between 30-60 kW, which is a high-efficient engine range, according to Figure 3. However, in the optimal control trajectory of DP, the engine often operates with a power above 60 kW during the final stage, decreasing fuel efficiency. Otherwise, the efficiency of the AC system decreases with an increase in cooling power, according to Figure 2, so the battery cooling power is usually maintained at around 1.5 kW to reduce energy consumption. The battery cooling power operates at maximum power only in the DP optimization result under UDDS driving conditions, resulting in much worse economy. It can also be seen from Figure 12 that the calculation time of global optimization shows an exponential growth trend. When the state grid point number is the same, the calculation time of the two algorithms is not significantly different because AGO-DP only adjusts the grid point distribution within the practical state feasible region but does not reduce the number of grid points. The calculation amount of the two algorithms is It can also be seen from Figure 12 that the calculation time of global optimization shows an exponential growth trend. When the state grid point number is the same, the calculation time of the two algorithms is not significantly different because AGO-DP only adjusts the grid point distribution within the practical state feasible region but does not reduce the number of grid points. The calculation amount of the two algorithms is basically the same, so the calculation time is also close. However, AGO-DP only requires fewer grid points to achieve a stable optimal driving cost than DP. Under the three types of driving conditions, AGO-DP can obtain the optimal cost with the grid point numbers of 200, 50, and 50, respectively. However, DP needs 250, 150, and 150, and the time-saving ratio of AGO-DP compared to DP is 33.29%, 83.33%, and 84.67%, respectively. Meanwhile, the optimal driving cost solved by the two algorithms under different driving conditions is shown in Table 3, and the state grid point number is set to 150. The IETMS global optimization performance of AGO-DP is significantly better than that of DP, and the average difference between the two algorithms is 6.99%. UDDS has the largest reduction ratio of feasible regions, resulting in the largest reduction of 16.85% in driving costs. In other types of driving conditions, the advantage of AGO-DP has weakened, but it is still better than the traditional DP algorithm.

IETMS Energy-Saving Potential Analysis
To illustrate the importance of developing IETMS for PHEVs, the energy-saving potential of IETMS and EMS global optimization is compared in this section. IETMS-DP strategy conforms to the development of the IETMS global optimization mentioned above. EMS-DP strategy only does EMS global optimization and ignores battery thermal management. Meanwhile, rule-based EMS (EMS-RB) and rule-based IETMS (IETMS-RB) are introduced for comparative analysis. Rule-based strategy uses charge-depleting and charge-sustaining (CD/CS) control [36] and proportion-integral-differential (PID) control to achieve power distribution and battery thermal management for PHEVs, respectively. In the CD stage, the battery is regarded as the only power source unless the driving demand power exceeds the maximum battery output power. The goal of the CS stage is to stabilize the battery SOC at 0.45. The engine output power can be adjusted according to SOC, expressed in Equation (41). Figure 15 shows the driving cost under various driving conditions obtained by different strategies.
where P cs,chg = 50 kW is the desired output power during engine operation. SOC cs,hi and SOC cs,lo are the SOC threshold for correcting engine power, set to 0.6 and 0.3, respectively. The driving cost increases linearly with the number of driving cycles. This is the energy the battery provides is the same in different driving conditions. As the of cycles increases, the proportion of engine energy supply gradually increases, to a gradual increase in fuel consumption and driving cost. Although battery management improves battery efficiency and slows down battery aging, the ene sumption cost of battery thermal management is higher than the cost of reducing aging. Therefore, the cost of IETMS is higher than that of EMS under the same condition. However, battery thermal management is crucial for the long-term he safety of batteries, so although it increases the driving cost, it is indispensable. F shows the proportion of cost reduction based on the global optimization strate pared to the rule-based strategy, which can measure the energy-saving potential o ent strategies. The larger the difference in driving costs between the rule-based and the global optimization strategy, the greater the energy-saving potential. The d in the proportion of battery energy supply leads to a gradual decrease in battery h eration. The demand for battery thermal management is reduced. Therefore, the saving potential of IETMS compared to EMS gradually decreases.  The driving cost increases linearly with the number of driving cycles. This is because the energy the battery provides is the same in different driving conditions. As the number of cycles increases, the proportion of engine energy supply gradually increases, leading to a gradual increase in fuel consumption and driving cost. Although battery thermal management improves battery efficiency and slows down battery aging, the energy consumption cost of battery thermal management is higher than the cost of reducing battery aging. Therefore, the cost of IETMS is higher than that of EMS under the same driving condition. However, battery thermal management is crucial for the long-term health and safety of batteries, so although it increases the driving cost, it is indispensable. Figure 16 shows the proportion of cost reduction based on the global optimization strategy compared to the rule-based strategy, which can measure the energy-saving potential of different strategies. The larger the difference in driving costs between the rule-based strategy and the global optimization strategy, the greater the energy-saving potential. The decrease in the proportion of battery energy supply leads to a gradual decrease in battery heat generation. The demand for battery thermal management is reduced. Therefore, the energy-saving potential of IETMS compared to EMS gradually decreases. The driving cost increases linearly with the number of driving cycles. This is b the energy the battery provides is the same in different driving conditions. As the n of cycles increases, the proportion of engine energy supply gradually increases, to a gradual increase in fuel consumption and driving cost. Although battery t management improves battery efficiency and slows down battery aging, the ener sumption cost of battery thermal management is higher than the cost of reducing aging. Therefore, the cost of IETMS is higher than that of EMS under the same condition. However, battery thermal management is crucial for the long-term hea safety of batteries, so although it increases the driving cost, it is indispensable. Fi shows the proportion of cost reduction based on the global optimization strateg pared to the rule-based strategy, which can measure the energy-saving potential o ent strategies. The larger the difference in driving costs between the rule-based s and the global optimization strategy, the greater the energy-saving potential. The d in the proportion of battery energy supply leads to a gradual decrease in battery he eration. The demand for battery thermal management is reduced. Therefore, the saving potential of IETMS compared to EMS gradually decreases. The distribution of driving costs varies under different driving conditions. Th speed driving demand power of HWFET is relatively high, and the engine can op a high-efficient region in various strategies. The energy consumption for the batte mal management is relatively small compared to the driving energy consumption The distribution of driving costs varies under different driving conditions. The highspeed driving demand power of HWFET is relatively high, and the engine can operate in a high-efficient region in various strategies. The energy consumption for the battery thermal management is relatively small compared to the driving energy consumption, so the driving cost difference of various strategies is the smallest, and the energy-saving potential is also the smallest. Under this driving condition, the average energy-saving potential of EMS is only 2.28%, while the energy-saving potential of IETMS can reach over 4%. This is because, considering the thermal management, the battery can be cooled down to the appropriate temperature range, thereby reducing the battery capacity degradation caused by the high current and high-temperature operation, reducing driving costs, and improving energy-saving potential. The urban driving condition of UDDS is mainly low speed, with low driving demand power. The battery is the main power source, so the driving cost is the lowest. Moreover, the thermal management energy consumption under this condition accounts for a relatively large proportion, so the average energy-saving potential of IETMS has increased to 8.39%, which is 3.17% higher than EMS. The comprehensive driving conditions of WLTC include different speed ranges, making it difficult to achieve a good driving economy because of significant fluctuations in engine output power based on rule-based strategies. Global optimization can balance the output power of the engine and power battery based on global driving conditions to achieve optimal driving costs. This driving condition has the most obvious energy-saving potential, reaching a maximum of over 20%. Meanwhile, IETMS only improves by 0.55-1.57% compared to EMS.

Conclusions
An improved DP algorithm is proposed in this paper to solve IETMS global optimization problem for PHEVs by reducing the optimization range of SOC and battery temperature and adaptively adjusting the grid distribution of state variables according to the actual feasible region. The feasible region of state variables under different driving conditions is effectively optimized by AGO-DP. The maximum reduction ratio of the feasible region reaches 74.1% of HWFET. Although the effect of feasible region reduction is weakened as the increase of driving cycle number, the reduction ratio is still nearly 30%, which reduces the computational burden for IETMS global optimization. Meanwhile, AGO-DP can obtain better global optimal driving costs more rapidly and accurately than DP. The global optimization computation time is reduced by 33.29-84.67%, and the accuracy of the global optimal solution is improved by 0.94-16.85%. The engine and AC system in AGO-DP both operate in a more efficient region. Furthermore, the IETMS energy-saving potential range for PHEVs is 3.68-23.74% under different driving conditions, which increases the energy-saving potential by 0.55-3.26% compared to EMS. In future research, AGO-DP proposed in this work will be used as a reference benchmark to guide the development of an online control strategy and further improve vehicle energy efficiency.