Online Control Strategy for Plug-In Hybrid Electric Vehicles Based on an Improved Global Optimization Algorithm

Abstract: Neural networks are widely used to learn offline global optimization rules in order to reduce the fuel consumption and improve the real-time performance of hybrid electric vehicles. However, because the torque and transmission ratio are direct control variables, online recognition of these two parameters by a neural network is insufficiently accurate. Meanwhile, the dynamic programming (DP) algorithm requires huge computing costs. To address these problems, a fusion algorithm combining a dynamic programming algorithm and an approximate equivalent fuel consumption minimum control strategy (A-ECMS) is proposed in this paper. Taking the equivalent factor as the control variable, the global optimal sequence of the factor is obtained offline. A back propagation (BP) neural network is then used to extract the sequence to form an online control strategy. The simulation results illustrate that, compared with the traditional dynamic programming algorithm, the fuel consumption increases slightly, but the computational cost of the proposed fusion algorithm is significantly reduced. Moreover, because the optimal sequence of the equivalent factors lies within a particular range, the online control strategy based on DP-A-ECMS is highly robust. Compared with an online control strategy based on the torque and transmission ratio, the fuel economy is improved by 2.46%.


Introduction
As a product of the transition from traditional fuel to electric vehicles (EVs), plug-in hybrid electric vehicles (PHEVs) are currently a hot topic in research on new energy vehicles. As one of the keys to energy-saving technologies for hybrid electric vehicles, energy management strategies (EMS) can be divided into three main approaches: (1) rule-based (RB) energy management strategies [1,2]; (2) optimization-based control strategies; and (3) intelligent control-based algorithm strategies. RB strategies commonly used in the research field include the charge-depleting-charge sustaining (CD-CS) strategy and fuzzy logic (FL)-based strategy. A rule-based strategy is simple and can be easily implemented in real time using look-up tables, although the fuel economy is relatively poor. Optimization-based EMS can be divided into instantaneous and global optimization. A representative of an instantaneous optimization algorithm is the equivalent fuel consumption minimization strategy (ECMS) [3][4][5], which achieves a good real-time performance, but cannot be operated stably under different working conditions. Global optimization algorithms mainly include dynamic programming (DP) [6,7], Pontryagin's minimum principle (PMP) [8,9], etc. Although global optimization algorithms can guarantee the minimum fuel consumption, all driving cycle information considers the influence of SOC on the equivalent factor and does not consider the driving conditions, which is too limited.
In [22][23][24][25][26], the authors proposed the approximate equivalent fuel consumption minimization strategy (A-ECMS) based on the approximate minimum principle (A-PMP). The motor output torque is constrained in a set of five optional values, which greatly simplifies the calculation cost, and its reliability has been proved in an actual vehicle experiment. Based on this, this study nests the A-ECMS into DP to form an improved dynamic programming algorithm with an equivalent factor as the single control variable. Firstly, the A-ECMS model is established, where the output torque of the motor is constrained to a limited number of optional values by numerically fitting the efficiency of the engine and the motor. The instantaneous energy distribution is obtained by A-ECMS of different equivalent factors at each instant, and the SOC variation under the energy distribution is calculated. Then, the DP global optimization model is established, in which the equivalent factor is the control variable, the battery SOC is the state variable, and the cumulative fuel consumption is the global optimization goal. Therefore, the DP-A-ECMS algorithm is formed. In order to verify the algorithm proposed in this paper, a PHEV computer simulation model was built to simulate under different driving conditions, and the optimal equivalent factor sequence was obtained. Finally, in order to realize the online application of DP-A-ECMS, a BP neural network was used to extract the nonlinear correlation between the optimal equivalent factor and real-time operating conditions, and the optimal equivalent factor sequence was transformed from a set of time-varying sequences to a set of state-varying sequences. By doing this, an online control strategy could be established.

Power System Model of the PHEV
The structure and parameters of the HEV considered in this study are presented in Figure 1 and Table 1, respectively. The main components include the engine, clutch C1, integrated starter and generator (ISG), CVT, main reducer, and battery pack. The engagement and disengagement of the wet multi-disc clutch C1 can realize the start of the engine and switch from pure electric mode to hybrid drive mode.
Appl. Sci. 2020, 10, 8352

Vehicle Dynamics Model
The structure and dynamics of the vehicle transmission system are shown in Figure 1, and the required torque and speed of the vehicle power system must satisfy Equations (1) and (2), where $T_r$ is the required torque at the CVT output; $T_e$ and $T_m$ are the torques of the engine and motor, respectively, with the motor torque negative in braking recovery mode; $i_{cvt}$ and $i_0$ are the speed ratios of the CVT and main reducer, respectively; $\eta_{cvt}$ is the transmission efficiency of the CVT; and $\omega_e$, $\omega_m$, and $\omega_w$ are the speeds of the engine, motor, and wheel, respectively.
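As a reading aid, one form of the torque and speed balance that is consistent with these definitions is sketched below. It assumes that the engine and ISG share a common shaft ahead of the CVT when clutch C1 is engaged; the placement of $\eta_{cvt}$ and $i_0$ is an assumption rather than the authors' confirmed expression.

```latex
\begin{align}
T_r &= (T_e + T_m)\, i_{cvt}\, \eta_{cvt} \\
\omega_e &= \omega_m = \omega_w\, i_0\, i_{cvt}
\end{align}
```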

Numerical Model of the Engine and Motor
The engine and motor were modeled numerically from experimental data in this study, and three-dimensional interpolation maps were established, as described by Equations (3) and (4). Equation (3) is the interpolation function for the engine fuel consumption, and Equation (4) is the interpolation function for the motor efficiency. Both the engine fuel consumption and the motor efficiency are functions of torque and speed, and in practical applications they are obtained by interpolating the corresponding table.
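The table lookup described above can be illustrated with a small bilinear interpolation routine. This is a minimal sketch, not the authors' implementation: the function name `bilinear_lookup`, the grid axes, and the fuel-rate numbers are all hypothetical, and queries outside the grid are simply clamped to the nearest edge.

```python
from bisect import bisect_right


def bilinear_lookup(speeds, torques, table, speed, torque):
    """Bilinearly interpolate table[i][j] defined on speeds[i] x torques[j].

    Queries outside the grid are clamped to the nearest edge.
    """
    def locate(axis, x):
        # Clamp the query, then find the lower index of its bracketing cell.
        x = min(max(x, axis[0]), axis[-1])
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        frac = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, frac

    i, u = locate(speeds, speed)
    j, v = locate(torques, torque)
    low = table[i][j] * (1 - v) + table[i][j + 1] * v
    high = table[i + 1][j] * (1 - v) + table[i + 1][j + 1] * v
    return low * (1 - u) + high * u


# Hypothetical fuel-rate map in g/s (illustrative numbers, not measured data).
SPEEDS = [1000.0, 2000.0, 3000.0]   # engine speed, r/min
TORQUES = [20.0, 60.0, 100.0]       # engine torque, N·m
FUEL_RATE = [
    [0.4, 0.8, 1.3],
    [0.7, 1.5, 2.4],
    [1.1, 2.3, 3.6],
]

fuel = bilinear_lookup(SPEEDS, TORQUES, FUEL_RATE, 1500.0, 40.0)
```

The same routine would serve for the motor efficiency map of Equation (4) by swapping in an efficiency table.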

CVT Model
The CVT speed ratio is a continuous variable. Adjusting it in real time moves the working points of the engine and motor so that both operate within their high-efficiency regions. The transmission efficiency of the CVT is also a continuously varying function. In this study, an experimental numerical model was selected in which the transmission efficiency of the CVT is a function of the speed ratio and torque, as shown in Equation (5), where $T_{cvt}$ is the input torque of the CVT.

Theoretical Model of the Battery
The internal resistance model was selected for the battery in this study. It relates the current, the SOC change rate, and the power through the following quantities: $\dot{SOC}$ is the battery SOC change rate; $I_b$ is the electric current; $\dot{m}_b$ represents the battery consumption/recovery power, which is positive when consumed and negative when recovered; $Q_0$ is the battery capacity (A·h); $V_{oc}$ is the battery open-circuit voltage (V); $R_b$ is the battery internal resistance (Ω); and $P_b$ is the battery output power. The battery output power is obtained from a separate equation, and Equation (7), which gives the SOC change rate, can then be rewritten in terms of the battery power.
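For reference, the textbook internal resistance model can be sketched in a few lines. This is a minimal sketch assuming the standard relations $P_b = V_{oc} I_b - I_b^2 R_b$ and $\dot{SOC} = -I_b / Q_0$ with discharge counted as positive; the function name `soc_rate`, the sign convention, and the numbers are illustrative assumptions and are not taken from Table 1.

```python
import math


def soc_rate(p_b, v_oc, r_b, q0_ah):
    """Return (current in A, SOC change rate in 1/s) for battery power p_b in W.

    Assumes the textbook internal resistance model
    p_b = v_oc * i_b - i_b**2 * r_b, with discharge counted as positive.
    """
    disc = v_oc ** 2 - 4.0 * r_b * p_b
    if disc < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    # Take the smaller root: the physically meaningful low-current solution.
    i_b = (v_oc - math.sqrt(disc)) / (2.0 * r_b)
    # Convert the capacity from A·h to A·s before dividing.
    return i_b, -i_b / (q0_ah * 3600.0)


# Illustrative numbers only (not taken from Table 1).
current, rate = soc_rate(p_b=2990.0, v_oc=300.0, r_b=0.1, q0_ah=20.0)
```

With these numbers the battery discharges at 10 A, so the SOC falls; a negative `p_b` (recovery) gives a negative current and a rising SOC.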

Approximately Equivalent Fuel Consumption Minimum Strategy
Originating from the PMP algorithm, the ECMS algorithm was proposed by Paganelli [5] and applied to the energy management strategy of HEVs. The core idea is to select an appropriate charge and discharge equivalent factor that converts the electricity consumption into fuel consumption, to minimize the instantaneous equivalent fuel consumption, and thereby to obtain an approximately globally optimal solution. In the mathematical model, $\dot{J}_{ecms}$ is the instantaneous equivalent fuel consumption rate; $SOC$ is the battery state of charge; $T_e$ and $i_{cvt}$ are the engine torque and CVT speed ratio, respectively, and are the system control variables; and $\dot{m}_e$ and $\dot{m}_b$ are the fuel consumption rate of the engine and the equivalent fuel consumption rate of the battery, respectively. The constraint conditions are given in Equation (12).
Ouyang and Qin proposed the A-ECMS. By numerically fitting the fuel consumption rate of the engine and battery, the search space of the optimal control variable is reduced to shorten the optimization time. In this paper, A-ECMS is analyzed according to the simulation model in Chapter 1. Figure 2a shows the fitting curves of the instantaneous fuel consumption of the engine, and Figure 2b shows the equivalent fuel consumption of the battery (the battery SOC is 0.5 and the equivalent factor is 1). It can be seen that, at a given speed, the relationship between the instantaneous fuel consumption and the output torque of the engine can be divided into three sections. For HEVs, the actual working requirement of the engine is to avoid working in the low-load area (the first section). Therefore, a linear function and a quadratic function are used to fit the second and third sections of the instantaneous fuel consumption curve of the engine, respectively. The intersection of the two curves is recorded as $T_{opt}$. The battery equivalent fuel consumption curve can be divided into two sections. It is composed of two quadratic functions, and the point where the motor output torque is zero is the connection point of the two quadratic curves.
Equations (13) and (14) are the fitting functions, where $a_i$, $b_i$, $c_i$, and $d_i$ ($i = 0, 1, 2$) are the fitting parameters. According to Figure 2 and the definition of a convex function, it can be determined that, at a given speed, the instantaneous fuel consumption of the engine is a convex function on $[T_{e,min}, T_{e,max}]$, and the instantaneous equivalent fuel consumption of the battery is a convex function on $[T_{m,min}, T_{m,max}]$.
The goal of ECMS is to minimize the equivalent fuel consumption at each instant. Because the instantaneous fuel consumption of the engine and the instantaneous equivalent fuel consumption of the battery are convex functions, the properties of convex functions imply that the minimum can only be attained at one of five candidate values of the motor torque: $0$, $T_r$, $T_r - T_{opt}$, and two limiting values denoted $U_{max}$, which are specified in Equation (15). Therefore, the constraint in the ECMS mathematical model (Equation (12)) is transformed into Equation (16), and the A-ECMS is then formed.
At this point, in the A-ECMS calculation process, the optimal solution can be determined simply by comparing the equivalent fuel function values at these five points. This reduces the search area of the optimal solution from the allowable reachable set of the entire control variable to five search points, significantly reducing the calculation time and storage space. This process also leads to a discontinuous relationship between the optimal energy allocation and the equivalent factor. Figure 3 shows the change in the optimal distribution of the power of the engine and motor corresponding to different equivalent factors when the vehicle speed is 60 km/h and the required torque is 30 Nm. It can be seen that the optimal power distribution curve is stepped; in other words, an interval of equivalent factors corresponds to a single optimal power distribution. Therefore, in the A-ECMS solution process, no matter how the equivalent factor changes, the power of the engine and motor corresponding to the optimal equivalent fuel consumption is concentrated at a few specific values.
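The five-point comparison and the resulting stepped behavior can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' model: the cost curves `engine_fuel` and `battery_fuel` use made-up coefficients, `u_lo` and `u_hi` are hypothetical names for the two limiting motor torques, and the required torque is assumed to be referred to the engine shaft with `t_r >= 0` and `u_lo <= 0`.

```python
def engine_fuel(t_e, t_opt=80.0):
    """Toy engine fuel rate: linear up to t_opt, quadratic growth above it."""
    if t_e <= 0.0:
        return 0.0  # engine off
    return 0.2 + 0.01 * t_e + 0.0002 * max(t_e - t_opt, 0.0) ** 2


def battery_fuel(t_m):
    """Toy battery equivalent fuel rate: two quadratics joined at t_m = 0."""
    if t_m >= 0.0:
        return 0.012 * t_m + 0.0001 * t_m ** 2   # discharging
    return 0.009 * t_m + 0.00005 * t_m ** 2      # charging


def a_ecms_split(t_r, s, t_opt, u_lo, u_hi):
    """Pick the motor torque among five candidates that minimizes
    engine_fuel(t_r - t_m) + s * battery_fuel(t_m)."""
    candidates = {0.0, t_r, t_r - t_opt, u_lo, u_hi}
    # Keep candidates inside the motor limits that leave a non-negative engine torque.
    feasible = [t for t in candidates if u_lo <= t <= u_hi and t_r - t >= 0.0]
    return min(feasible, key=lambda t: engine_fuel(t_r - t, t_opt) + s * battery_fuel(t))


# Sweep the equivalent factor from 0.5 to 5.0 and collect the distinct optima.
distinct = sorted({a_ecms_split(30.0, k / 10.0, 80.0, -60.0, 60.0) for k in range(5, 51)})
```

Sweeping the equivalent factor in this toy setting yields only four distinct motor torques, so each optimum holds over a whole interval of equivalent factors, which mirrors the stepped curve described for Figure 3.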

Improved Dynamic Programming Algorithm
In a traditional dynamic programming algorithm used in the energy management strategy of a PHEV, the battery SOC and the transmission speed ratio are usually taken as state variables, and the motor torque is taken as a control variable. For CVT transmissions, because the speed ratio changes continuously, the CVT speed ratio needs to be discretized into a series of values within its feasible range in order to maximize the transmission efficiency of the vehicle. In addition, the motor torque also requires a fine discretization. Therefore, the direct application of a dynamic programming algorithm to global optimization of the EMS requires significant computational costs.
As mentioned above, the equivalent minimum fuel strategy can solve the instantaneous optimal solution through the equivalent factor, and the optimal solution can only be obtained from a limited number of specific values after a numerical fitting of the fuel consumption rates of the engine and motor. In view of this, in this study, the A-ECMS is combined with the DP algorithm to form an improved dynamic programming algorithm. In this method, the A-ECMS is added to each step of the DP process. Equivalent factors are taken as the control variables to calculate the instantaneous energy distribution of the A-ECMS, and the variation of SOC and the instantaneous fuel consumption of the engine under the energy distribution are then calculated. Taking the battery SOC as the state variable and the cumulative instantaneous fuel consumption as the optimization objective of DP, the optimal control law of the equivalent factor is obtained by offline global optimization. A flow chart of the algorithm is illustrated in Figure 4.
The state transition equation and DP cost function are given by Equations (17) and (18), respectively. Starting from the determined end state of the system, the optimal equivalent factor $s_k^*$ of each stage and each state is obtained in reverse order according to Equations (17) and (18), and all optimal equivalent factors are stored to obtain the optimal control variable matrix $S_{dp}^*$. Then, starting from the initial state of the system, the optimal equivalent factor $s_k^*$ at the current instant is looked up in the optimal control variable matrix, and the engine fuel consumption under this equivalent factor is calculated using the ECMS model. The best equivalent factor is recorded at each instant, such that the best equivalent factor sequence and the best fuel consumption are obtained. Because of the discontinuous relationship between the optimal energy distribution and the equivalent factor, the optimal equivalent factor at each instant lies within an interval bounded by its upper and lower limits, and the sequence of the optimal equivalent factor over the entire cycle is therefore divided into an upper limit sequence and a lower limit sequence.
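The backward recursion and forward lookup described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' code: `toy_step` is a hypothetical stand-in for the A-ECMS stage model, the next SOC is snapped to the nearest grid point, and the lower edge of the SOC grid plays the role of the final SOC target.

```python
import math


def nearest_index(grid, x):
    """Index of the grid point closest to x, or None if x is off the grid."""
    half = 0.5 * (grid[1] - grid[0])
    if x < grid[0] - half or x > grid[-1] + half:
        return None
    return min(range(len(grid)), key=lambda i: abs(grid[i] - x))


def backward_dp(n_stages, soc_grid, factors, step):
    """Backward DP with the equivalent factor as the only control variable.

    step(k, soc, s) -> (fuel, next_soc) is the stage model.
    Returns policy[k][i], the best factor at stage k and grid state i.
    """
    cost = [0.0] * len(soc_grid)                 # terminal cost-to-go
    policy = [[None] * len(soc_grid) for _ in range(n_stages)]
    for k in range(n_stages - 1, -1, -1):        # reverse order over stages
        new_cost = [math.inf] * len(soc_grid)
        for i, soc in enumerate(soc_grid):
            for s in factors:
                fuel, nxt = step(k, soc, s)
                j = nearest_index(soc_grid, nxt)
                if j is not None and fuel + cost[j] < new_cost[i]:
                    new_cost[i], policy[k][i] = fuel + cost[j], s
        cost = new_cost
    return policy


def rollout(policy, soc_grid, step, soc0):
    """Forward pass: read the stored factor at each stage and accumulate fuel."""
    soc, total, seq = soc0, 0.0, []
    for k, row in enumerate(policy):
        s = row[nearest_index(soc_grid, soc)]
        fuel, soc = step(k, soc, s)
        total += fuel
        seq.append(s)
    return seq, total


DEMANDS = [1.0, 2.0, 3.0]  # toy per-stage energy demand (arbitrary units)


def toy_step(k, soc, s):
    """Hypothetical stage model: a low factor drives electrically, a high one uses the engine."""
    if s < 2.0:
        return 0.0, soc - 0.1 * DEMANDS[k]
    return DEMANDS[k], soc


SOC_GRID = [round(0.3 + 0.1 * i, 1) for i in range(7)]   # 0.3 ... 0.9
POLICY = backward_dp(len(DEMANDS), SOC_GRID, [1.0, 3.0], toy_step)
sequence, fuel_used = rollout(POLICY, SOC_GRID, toy_step, soc0=0.5)
```

Starting from an SOC of 0.5, the stored policy spends the limited battery budget on the second stage only, which is the cheapest feasible plan in this toy setting.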

Offline Optimization Simulation Results
To cover different driving conditions, including highways, urban suburbs, and congested streets, this article uses six standard driving cycles to simulate the operating conditions of the vehicle.
Figure 5 shows the battery SOC curves under different initial conditions for the 11-SC03 (a) and 9-HWFET (b) operating conditions. It can be seen that each curve fluctuates only slightly around the straight line connecting the initial SOC and the final SOC. Under this condition, the battery is discharged smoothly throughout the cycle, and the vehicle can fully use the battery energy to reduce the engine fuel consumption.
Figure 6 shows the optimal sequence of the upper and lower limits of the equivalent factor with initial SOCs of 0.9, 0.6, and 0.3 under the 11-SC03 and 9-HWFET operating conditions, in which meaningless points (such as speed and demand torque values of zero) and the points of pure electric driving are removed. In the SC03 driving cycle, the optimal sequence of the equivalent factor increases as the initial SOC decreases, and the number of effective data points increases. This is because, when the battery capacity is sufficient, the vehicle tends to drive in pure electric mode. In contrast, the time spent in hybrid mode increases when the SOC is low, and the equivalent factor is maintained at a higher level to ensure the SOC balance. Under the HWFET operating condition, owing to the higher average vehicle speed, the equivalent factor is always larger, and the engine thus needs to output a higher power.
Table 2 compares the fuel economy and simulation calculation time of the DP-A-ECMS and DP strategies under different initial SOCs, with the gap of DP-A-ECMS calculated relative to DP. Because of the numerical fitting used in the DP-A-ECMS algorithm, its fuel consumption is 0.18%-1.35% higher than that of DP. However, the control variable of DP-A-ECMS is limited to specific values instead of the entire feasible range, and its calculation cost is therefore reduced by more than 50%.

Online Control Strategy
The global optimization-based algorithm needs to predict all the information of the working conditions and has a large amount of computation, which limits the real-time application of the algorithm. However, the optimal equivalent factor sequence obtained by the DP-A-ECMS method can be used as a reference for the online controller. Considering that energy management strategies are influenced by many different factors, it is difficult to obtain deterministic equations or relationships. Neural networks can not only approximate any complex nonlinear mapping with any degree of accuracy, but also have the ability of learning and generalization [24]. Therefore, the nonlinear relationship between the optimal equivalent factor and state variables was learned based on the neural network in this study, and the relationship was then used to generate a real-time control strategy, such that the vehicle can obtain the optimal fuel economy performance during practical application.

Neural Network Training
In this study, a BP neural network was used. The input parameters of the network should contain sufficient information to reflect output parameters. The current time, speed, torque requirement, battery SOC, total mileage, residual mileage, average speed, maximum speed, minimum speed, maximum acceleration, and minimum acceleration were chosen as the input parameters, according to a large number of previous studies [13,27-30]. The time intervals used to calculate the average speed, the maximum and minimum speeds, and the maximum and minimum accelerations were all 60 s. The target output parameter takes the median of the upper and lower limits of the optimal equivalent factor. Figure 7a shows the network structure. To compare it with the traditional neural network method, this study used the same neural network to fit the CVT speed ratio and motor torque obtained by the dynamic programming algorithm, which is shown in Figure 7b.
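The windowed inputs listed above can be computed with a simple rolling buffer. This is a minimal sketch, assuming speed samples arrive at a fixed rate of one per second so that 60 samples span the 60 s interval; the class name `WindowFeatures` and the finite-difference acceleration are illustrative choices, not details given in the paper.

```python
from collections import deque


class WindowFeatures:
    """Rolling statistics over the most recent `window` speed samples."""

    def __init__(self, window=60, dt=1.0):
        self.dt = dt                          # sample period in seconds (assumed)
        self.speeds = deque(maxlen=window)    # old samples drop out automatically

    def update(self, speed):
        """Add one speed sample and return the current window statistics."""
        self.speeds.append(speed)
        v = list(self.speeds)
        # Finite-difference acceleration between consecutive samples;
        # fall back to zero while only one sample is available.
        acc = [(b - a) / self.dt for a, b in zip(v, v[1:])] or [0.0]
        return {
            "avg_speed": sum(v) / len(v),
            "max_speed": max(v),
            "min_speed": min(v),
            "max_accel": max(acc),
            "min_accel": min(acc),
        }


# Short demonstration with a 3-sample window so the roll-over is visible.
tracker = WindowFeatures(window=3)
for sample in (0.0, 2.0, 6.0, 5.0):
    features = tracker.update(sample)
```

After the fourth sample the window holds only the last three speeds, so the first sample no longer influences the statistics.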

Through the BP neural network fitting in Figure 7a, equivalent factors that vary with the real-time operating conditions and the vehicle state can be obtained. Figure 8 shows the change in the equivalent factor with regard to the battery SOC under different remaining driving ranges when the vehicle speed is 60 km/h and the demand torque is 30 Nm. When the remaining driving range is relatively small, the equivalent factor rapidly increases as the battery SOC decreases. The energy distribution of the vehicle changes from being driven mainly by electricity to being driven jointly by the engine and motor to maintain the SOC balance during the entire cycle. When the remaining driving range is large, to ensure that the battery can provide sufficient electricity in the future, the equivalent factor is always maintained at a high level and does not change significantly as the SOC decreases.
The trained BP neural network can be used to establish a real-time control strategy. Figure 9 shows a flowchart of the online controller. From the required torque signal given by the driver, the current battery SOC signal, and the real-time vehicle speed, the remaining mileage is calculated; combined with the historical working condition data, the neural network module can then calculate the currently optimal equivalent factor within a 60-s period. The equivalent factor is then input into the A-ECMS energy distribution module to minimize the equivalent fuel consumption, and the optimal engine and motor power output at this moment can be obtained.

Online Controller Verification
The adaptability of the real-time controller to driving conditions depends on the training data of the BP neural network. In this study, six sets of data representing three different driving conditions, including congestion, the urban area, and the highway, were used for training. To verify the practicability of the online controller, 10 New European Driving Cycles (NEDC) were used as the test condition. As shown in Figure 10, the NEDC is composed of four urban working conditions and one suburban driving condition, with maximum vehicle speeds of 50 and 120 km/h and average vehicle speeds of 19 and 63 km/h, respectively, which can represent urban and suburban highway conditions. The initial SOC is 0.9, and the target SOC is 0.3. The BP neural network control strategy was simulated on the basis of the simulation model built in MATLAB/Simulink. At the same time, the DP-A-ECMS algorithm was used to obtain the optimal sequence of equivalent factors offline for the test conditions.
Figure 11 shows the online identification results of the neural network for the equivalent factor over a short period of working conditions. Figure 11a compares the equivalent factor identified by the BP neural network with the median value of the optimal equivalent factors obtained by DP-A-ECMS. The mean square error (MSE) is usually used to measure the recognition accuracy of neural networks. The MSE can be expressed as
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(S_{BP}(i) - S_{mid}(i)\right)^{2} \tag{19}$$
where $S_{BP}$ is the equivalent factor recognized by the neural network, $S_{mid}$ is the median value of the optimal equivalent factors obtained by the DP-A-ECMS, and $N$ represents the total number of time steps.
The upper and lower limits of the optimal equivalent factors obtained by the DP-A-ECMS are shown in Figure 11b. According to the previous analysis, as long as the equivalent factor is within its upper and lower limits, the final output of the vehicle is the same. Therefore, the updated MSE can be expressed as
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\begin{cases}\left(S_{BP}(i) - S_{max}(i)\right)^{2}, & S_{BP}(i) > S_{max}(i)\\ 0, & S_{min}(i) \le S_{BP}(i) \le S_{max}(i)\\ \left(S_{BP}(i) - S_{min}(i)\right)^{2}, & S_{BP}(i) < S_{min}(i)\end{cases} \tag{20}$$
where $S_{max}$ and $S_{min}$ are the upper limit and lower limit of the optimal equivalent factors, respectively. It can be seen from Figure 11 that the neural network has a certain error in the recognition, but most of the recognized equivalent factors fall within the upper and lower limits of the optimal equivalent factor. Intuitively, the upper and lower limits of the optimal equivalent factor provide a fault-tolerant space for the online recognition of parameters. Mathematically, in the updated MSE (Equation (20)), the contribution of every time step at which the recognized equivalent factor lies within the upper and lower limits is 0, so its value is greatly reduced compared to the true recognition error of the neural network (Equation (19)). That is to say, under the test conditions, due to the characteristics of DP-A-ECMS, the influence of the recognition error of the neural network on the system performance is greatly diminished, so the generalization performance of the neural network is greatly improved. In contrast, the DP-based online control strategy has no fault tolerance space, and the recognition error of the neural network directly affects fuel consumption. Therefore, the DP-A-ECMS-based neural network control strategy has a higher robustness, and the fuel consumption in the simulation will be closer to the offline optimization results.
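The difference between the two error measures (Equations (19) and (20)) can be made concrete with a short numerical sketch. It assumes that the updated MSE counts only the distance by which the recognized equivalent factor leaves the band between the lower and upper limits, which is how the text describes Equation (20). The function names and the sample values are hypothetical and are not taken from the simulation results.

```python
import numpy as np


def mse_to_median(s_bp, s_mid):
    """Standard MSE between recognized factors and the median optimal factors."""
    s_bp = np.asarray(s_bp, dtype=float)
    s_mid = np.asarray(s_mid, dtype=float)
    return float(np.mean((s_bp - s_mid) ** 2))


def mse_outside_band(s_bp, s_min, s_max):
    """Band-tolerant MSE: zero error wherever s_min <= s_bp <= s_max."""
    s_bp, s_min, s_max = (np.asarray(a, dtype=float) for a in (s_bp, s_min, s_max))
    # Distance to the nearest violated limit; both terms vanish inside the band.
    excess = np.maximum(s_bp - s_max, 0.0) + np.maximum(s_min - s_bp, 0.0)
    return float(np.mean(excess ** 2))


if __name__ == "__main__":
    # Made-up sample: two recognized factors inside the band, two outside.
    s_bp = [2.5, 2.7, 3.1, 2.2]
    s_min = [2.4, 2.4, 2.4, 2.4]
    s_max = [2.8, 2.8, 2.8, 2.8]
    s_mid = [2.6, 2.6, 2.6, 2.6]
    print(mse_to_median(s_bp, s_mid), mse_outside_band(s_bp, s_min, s_max))
```

In this made-up sample the band-tolerant value is smaller than the standard MSE because the two in-band points contribute nothing, which mirrors the fault-tolerance argument made for the DP-A-ECMS-based strategy.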
Figure 12 shows the SOC curve over the entire driving cycle. The best SOC curves obtained offline by the DP-A-ECMS algorithm and the DP algorithm are added as a reference. The solid red line represents the online control strategy based on the DP-A-ECMS algorithm proposed in this article, and the solid green line represents the neural network control strategy based on the traditional DP algorithm. It can be seen that the offline calculation results of the DP-A-ECMS and DP are extremely close. The neural network online control strategy based on the DP-A-ECMS algorithm is closer to the global optimal solution. Table 3 shows the vehicle fuel consumption of the two methods. It can be seen that the fuel economy of the method proposed in this paper is improved by 2.46%.
From the verification results of the online controller, it can be seen that the BP neural network can output online equivalent factors close to the theoretical optimal value. Online recognition by the neural network is very fast: the average calculation time in the computer simulation process is 4.8 ms.
References [24,26] reported that A-ECMS exhibited good real-time performance, stability, and accuracy in both hardware-in-the-loop experiments and real-vehicle road tests. Therefore, the online control strategy proposed in this paper can effectively improve the fuel economy of PHEVs in the real world. Table 4 compares the existing online control strategies with the method proposed in this paper in terms of fuel economy, real-time performance, and robustness. The indicator of fuel economy is the fuel consumption per 100 km under 10 NEDC driving conditions. Because they are supported by optimization theory, the optimization-based strategy and the neural network-based strategy have great advantages over the RB strategy, which is empirical. The evaluation index of the real-time performance is the average calculation time per step. The rule-based strategy is essentially a table lookup, the instantaneous optimization strategy is a comparison of a limited number of values, and the neural network strategy adds a parameter recognition process to that comparison. Therefore, the computation speed of these three strategies is very fast. It is difficult to unify the evaluation criteria of robustness. In general, however, the rule-based strategy can adapt to most working conditions, while optimization-based strategies can only be applied to the single driving condition for which they were optimized. The robustness of neural network-based strategies depends on the generalization of the neural network. As mentioned above, the neural network based on DP-A-ECMS has better generalization. Therefore, the table gives a subjective evaluation of the robustness of the different strategies.

Conclusions
This paper has described the online energy management of PHEVs based on an offline optimization algorithm. The equivalent fuel consumption minimization strategy was analyzed, and an approximate equivalent fuel consumption minimization strategy (A-ECMS) was derived. Taking the equivalent factor as the control variable, the A-ECMS was nested in the DP algorithm to form the DP-A-ECMS algorithm, which calculates the global optimal sequence of the equivalent factor through DP global optimization.
Taking a single-axis parallel CVT PHEV as the research object, a Simulink simulation model of the vehicle was established, and an offline simulation calculation was carried out under the DP-A-ECMS strategy. The results show that, under different working conditions and different initial SOC conditions, the SOC curve under the DP-A-ECMS strategy fluctuates near the straight line formed by the initial SOC and the final SOC, and the battery can be discharged smoothly. Compared with the DP strategy, although the fuel consumption increases slightly, the calculation cost is significantly reduced.
A BP neural network was used to extract the nonlinear relationship between the optimal equivalent factor and the real-time operating conditions and vehicle states. The control strategy rules suitable for specific operating conditions were extracted, and an online control strategy based on a BP neural network was established. The simulation results show that the neural network control strategy based on optimal equivalent factor recognition has good robustness, and the battery SOC tracks the reference curve well. Compared with a direct method of identifying the motor torque and CVT speed ratio, the fuel economy is improved by 2.46%.