Real-Time Control Strategy for CVT-Based Hybrid Electric Vehicles Considering Drivability Constraints

Abstract: The energy management strategy has a great influence on the fuel economy of hybrid electric vehicles (HEVs), and the equivalent consumption minimization strategy (ECMS) has proved to be a useful tool for the real-time optimal control of HEVs. However, adapting the equivalent factor remains a major challenge in obtaining optimal fuel consumption as well as robustness to varying driving cycles. In this paper, an adaptive-ECMS based on driving pattern recognition (DPR) is established for hybrid electric vehicles with a continuously variable transmission (CVT). The learning vector quantization (LVQ) neural network model was adopted for the on-line DPR algorithm. The influence of the battery state of charge (SOC) on the optimal equivalent factor was studied under different driving patterns. On this basis, a method for adapting the equivalent factor was proposed that considers the type of driving pattern and the battery SOC. In addition, in order to enhance drivability, penalty terms were introduced to constrain frequent engine on/off events and large variations of the CVT speed ratio. Simulation results showed that the proposed method efficiently improved the equivalent fuel consumption with charge-sustaining operation while also taking driving comfort into account.


Introduction
Hybrid electric vehicles adopt multiple power sources to drive vehicles to improve fuel economy and reduce pollutant emissions. The energy management strategy (EMS) greatly impacts the fuel economy by controlling the power distribution among the power sources [1,2].
Global optimization methods such as dynamic programming (DP), genetic algorithm (GA), and simulated annealing (SA) are widely used to solve the optimal energy management problem over a finite horizon [3-5]. Major disadvantages of these numerical methods are the computational burden and the need to know the driving cycle in advance in order to calculate the required power at the wheels. Therefore, they are not applicable in an actual controller. To tackle this problem, based on Pontryagin's minimum principle, Paganelli et al. [6] first proposed the equivalent consumption minimization strategy (ECMS). The equivalent factor has a great influence on the results of the ECMS and on the constraint on the battery state of charge (SOC), and it varies with different driving cycles. In order to enhance robustness to varying driving cycles, Musardo et al. [7] added an on-the-fly algorithm for the estimation of the equivalent factor to the original ECMS framework, so that the SOC is maintained within its boundaries and the fuel consumption is minimized. Ambuhl et al. [8] employed a penalty function, and the equivalent factor was continuously estimated as a function of SOC deviations. In some studies, total fuel consumption was regarded as a function of the equivalent factor, and the control parameters under optimal fuel economy could be calculated by using systematic methods for given driving cycles [9,10]. In [11], a piecewise correction coefficient, updated from a period of historical driving data, was used to penalize SOC variations.
The abovementioned methods usually adopted fixed initial conditions while neglecting the influence of complex driving conditions on the performance of the EMS, thus often leading to suboptimal results. In order to improve the flexibility of the controller, it is necessary to periodically recognize the driving pattern and update the control parameters accordingly [12]. Typically, a learning vector quantization (LVQ) neural network model is well suited for realizing on-line driving pattern recognition (DPR) by calculating Euclidean distances [13]. In [14], based on a sample database of several standard driving cycles, the representative feature parameters were obtained by principal component analysis (PCA), and a real-time DPR algorithm was established by using an LVQ neural network model. The optimized control parameters were then periodically updated according to the driving pattern. A comprehensive review of DPR-based control algorithms can be found in [15]. In addition, the DPR algorithm can be incorporated into an adaptive-ECMS (A-ECMS) controller for on-line estimation of the equivalent factor. In [16], an A-ECMS based on DPR was proposed, and a reasonable time window of historical driving data for the DPR algorithm was determined. In [17], the standard driving cycles were classified by K-means clustering, and the nominal equivalent factor was periodically updated by the mean equivalent factor of the selected driving class. A similar method can be found in [18], in which fuzzy rules were adopted for driving cycle classification, and each driving pattern was represented by the optimal equivalent factor of a typical standard cycle in the driving class. These methods have shown promising results for on-line implementation of the ECMS; however, they ignore the influence of the SOC on the estimation of the equivalent factor. In fact, the SOC balance is very sensitive to the initial SOC and to subtle changes of the equivalent factor [19].
In this paper, a novel adaptation method of the equivalent factor is proposed by considering the type of driving pattern and the battery SOC to improve fuel economy.
In real-time applications of the optimization method, if fuel economy is the only criterion, the results will lead to poor drivability [20,21]. Previous research on the multiobjective optimization of fuel economy and drivability mainly focused on frequent engine on/off events, gear shifting, and improving the engine torque reserve [22-24]. Since the speed ratio changes continuously in a continuously variable transmission (CVT), drivability problems caused by an automatic transmission can be avoided to a large extent. However, a large variation of the speed ratio will cause an abrupt change of the speed at the CVT output shaft, which has a negative effect on the driveline dynamics [25]. Usually, an optimal speed ratio trajectory can be obtained by using global optimization methods; then, the speed ratio is controlled by a feedforward controller or static look-up tables [26]. In this paper, penalty terms are incorporated into the cost function to enhance drivability.
The remainder of this paper is organized as follows: in Section 2, a forward-facing model of a typical CVT-based hybrid electric vehicle is established. In Section 3, toward the real-time application of the A-ECMS, first, drivability problems caused by the single-criterion cost function for the CVT-based hybrid electric vehicle (HEV) are analyzed, and penalty terms are incorporated into the cost function to enhance drivability. Second, based on the real-time DPR algorithm, a novel adaptation method of the equivalent factor is proposed. In Section 4, a random test cycle is used to validate the effectiveness of the proposed method. Finally, conclusions are given in Section 5.

Vehicle Configuration
The structure of the CVT-based parallel hybrid electric vehicle studied in this paper is shown in Figure 1. In this system, an integrated starter/generator (ISG) is powered by a Li-ion battery pack and connected to the engine crankshaft through a disc clutch. The torque converter in a conventional CVT is removed, and the clutch is installed inside the rotor of the ISG. An electric oil pump (EOP) is employed in order to meet the pressure and flow demand of the CVT hydraulic system under electric driving mode. Driving mode transition is realized by controlling the working state of the engine, ISG, and clutch. The vehicle parameters are shown in Table 1.
Appl. Sci. 2019, 9, 2074

Vehicle Model
According to physical causality principles, two different modeling approaches can be used: forward- or backward-facing modeling. In the backward-facing model, the required torque at the wheel is calculated backwards through the driveline until the outputs of the components in the system are obtained. This method is usually adopted to design high-level control strategies. In the forward-facing model, based on the feedback of the actual vehicle speed, a driver model is adopted to generate the control signals of the accelerator and brake pedals. These signals are received by the vehicle control units (VCUs), and the optimal torque split is computed. Then, the energy consumption of the power sources can be calculated. This method serves as a platform for the design and calibration of real-time control strategies. In this study, a forward-facing simulator was developed in the Matlab/Simulink environment.
For the single-shaft parallel hybrid system described above, the power demand of the vehicle is satisfied by the sum of the output power of the engine and the ISG, and the demanded power to propel the vehicle can be calculated as
$$P_{req} = \left(\frac{1}{2}\rho C_D A v^2 + m g f_r \cos\alpha + m g \sin\alpha + m\frac{dv}{dt}\right) v$$
where $\rho$ is the air density, $C_D$ is the aerodynamic drag coefficient, $A$ is the frontal area, $m$ is the total mass of the vehicle, $g$ is the gravitational acceleration, $\alpha$ is the road slope, $v$ is the vehicle speed, and $f_r$ is the rolling resistance coefficient.

Engine
The engine was modeled as static look-up tables, and data were obtained through bench tests. Fuel flow rate is a function of the engine speed and output torque: $\dot{m}_f(t) = f(T_{Eng}(t), \omega_{Eng}(t))$. The engine output power can be calculated by
$$P_{Eng}(t) = \dot{m}_f(t) \cdot LHV \cdot \eta_{Eng}$$
where $LHV$ is the lower heating value of the fuel, and $\eta_{Eng}$ is the mechanical efficiency of the engine.
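As a rough illustration of the demanded-power calculation described above, the following sketch evaluates a standard longitudinal road-load balance (aerodynamic drag, rolling resistance, grade, and inertia). The function name and every numerical value are placeholders chosen for illustration; they are not the vehicle parameters of Table 1, and no rotating-mass factor is assumed.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def demanded_power(v, a, m, rho, c_d, area, f_r, slope=0.0):
    """Power at the wheels (W) for speed v (m/s) and acceleration a (m/s^2).

    slope is the road angle in radians. This is the textbook road-load form;
    any additional terms used in the original model are not known here.
    """
    f_aero = 0.5 * rho * c_d * area * v ** 2    # aerodynamic drag
    f_roll = m * G * f_r * math.cos(slope)      # rolling resistance
    f_grade = m * G * math.sin(slope)           # grade resistance
    f_inertia = m * a                           # acceleration force
    return (f_aero + f_roll + f_grade + f_inertia) * v


# Placeholder values, chosen only to exercise the function.
p_req = demanded_power(v=20.0, a=0.5, m=1500.0, rho=1.2, c_d=0.3, area=2.2, f_r=0.01)
print(f"{p_req / 1000:.1f} kW")
```

In a forward-facing simulator, a quantity like this would be evaluated at every sample from the driver model's request rather than from a known cycle.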

ISG
A permanent magnet synchronous motor (PMSM) was adopted in this research, and the lumped efficiency of the system was obtained through bench tests. The speed of the ISG is proportional to the wheel speed and can be calculated as
$$\omega_{ISG} = \omega_{wh}\, i_{CVT}\, i_{FD}$$
where $\omega_{wh}$ is the wheel speed, and $i_{CVT}$ and $i_{FD}$ represent the speed ratios of the CVT and the final drive, respectively. The output torque of the ISG is confined by the maximum torque available and the output power of the battery, and the ISG output power can be calculated as
$$P_{ISG} = T_{ISG}\, \omega_{ISG}\, \eta_{ISG}^{k}$$
where $\eta_{ISG}$ is the lumped efficiency of the ISG system; $k = 1$ when the ISG works as a generator, and $k = -1$ when it works as a motor.

CVT
The CVT employs an electric oil pump to decouple the flow of the oil pump from the engine speed, which reduces the energy consumption of the oil pump compared with a mechanical oil pump. The CVT mechanical efficiency is a function of the speed ratio, input speed, and input torque:
$$\eta_{CVT} = f(i_{CVT}, \omega_{in}, T_{in})$$
The power required by the EOP drive motor can be calculated as
$$P_{EOP,mot} = \frac{P_{pump,req}}{\eta_{pump} \cdot \eta_{EOP,mot}}$$
where $\eta_{pump}$ and $\eta_{EOP,mot}$ represent the efficiencies of the oil pump and the drive motor, respectively. $P_{pump,req}$ is the power demand of the oil pump, which can be calculated as
$$P_{pump,req} = P_{hyd}\, Q_{hyd}$$
where $P_{hyd}$ is the required pressure of the hydraulic system, and $Q_{hyd}$ is the flow of the hydraulic system, determined by the flow demand for speed ratio control, cooling, lubrication, and leakage.

Battery
The battery is modeled as a static equivalent circuit as described in [19], which is a series connection of an ideal voltage source and an internal resistance. The current can be calculated as
$$I_{bat} = \frac{U_{oc} - \sqrt{U_{oc}^2 - 4 R_{int} P_{bat}}}{2 R_{int}}$$
where $U_{oc}$ and $R_{int}$ represent the open-circuit voltage and the internal resistance of the battery, respectively, and $P_{bat}$ is the power of the battery, determined by the sum of the power demand of the ISG and the EOP drive motor. The value of $I_{bat}$ is positive during discharging and negative during charging. The battery SOC can be derived using ampere-hour integration:
$$SOC(t) = SOC_{init} - \frac{1}{C_{nom}} \int_{0}^{t} I_{bat}(\tau)\, d\tau \tag{9}$$
where $SOC_{init}$ is the initial SOC, and $C_{nom}$ is the nominal capacity of the battery pack.
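The equivalent-circuit relations above can be sketched in a few lines. This is a minimal illustration that assumes a constant open-circuit voltage and internal resistance over one step; the function names and the numbers in the usage lines are hypothetical.

```python
import math


def battery_current(p_bat, u_oc, r_int):
    """Current (A) of an ideal source in series with r_int delivering p_bat (W).

    Positive p_bat discharges the battery. Solves p_bat = u_oc*i - r_int*i**2
    and returns the smaller root, which is the physically meaningful one.
    """
    disc = u_oc ** 2 - 4.0 * r_int * p_bat
    if disc < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (u_oc - math.sqrt(disc)) / (2.0 * r_int)


def soc_step(soc, i_bat, dt, c_nom_ah):
    """One ampere-hour integration step; c_nom_ah is the capacity in Ah."""
    return soc - i_bat * dt / (c_nom_ah * 3600.0)


# Hypothetical pack: 300 V open-circuit voltage, 0.1 ohm, 10 Ah.
i = battery_current(3000.0, u_oc=300.0, r_int=0.1)
soc = soc_step(0.70, i, dt=0.1, c_nom_ah=10.0)
```

Because the current is positive during discharging, a positive power demand lowers the SOC, which matches the sign convention stated in the text.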

ECMS and SOC Management
The optimal control problem in terms of fuel economy can be stated as follows: for a given driving cycle, find the optimal control sequence $u^* \in U$ in the admissible control set that minimizes the total fuel consumption:
$$J = \phi_f(x(t_f)) + \int_{t_0}^{t_f} \dot{m}_f(u(t), t)\, dt \tag{10}$$
where $\dot{m}_f$ is the fuel flow rate, and $\phi_f(x(t_f))$ represents a constraint on the final state. For a CVT-based HEV, the battery SOC is chosen as the only state variable, $x(t) = SOC(t)$. The control variables are the CVT speed ratio and the torque split factor between the engine and the ISG motor. With respect to the real-time optimal control problem with state and control variable constraints, according to Pontryagin's minimum principle, the minimization of the cost function (10) is equivalent to minimizing, at each time instant, the following Hamiltonian function:
$$H(x(t), u(t), \lambda(t), t) = \dot{m}_f(u(t), t) + \lambda(t)\, \dot{SOC}(t) \tag{13}$$
where $\dot{SOC}(t)$ can be derived from Equation (9), and $\lambda(t)$ is the adjoint state, defined by the Euler-Lagrange equation:
$$\dot{\lambda}(t) = -\frac{\partial H}{\partial x} \tag{14}$$
The instantaneous cost is the sum of the fuel consumption of the engine and the equivalent fuel consumption of the ISG, $H = \dot{m}_f(t) + \dot{m}_{f,bat,eq}(t)$. The equivalent factor $s(t)$, which balances fuel and battery power consumption, is defined from the adjoint state $\lambda(t)$, and the equivalent fuel consumption can be calculated as
$$\dot{m}_{f,bat,eq}(t) = s(t)\, \frac{P_{bat}(t)}{LHV}$$
Commonly, a pair of equivalent factors ($s_{chg}$, $s_{dis}$) can be defined according to the charge and discharge of the battery, and a more accurate fuel consumption result can be expected. However, this also leads to a two-dimensional problem, which increases the computational cost. In fact, research shows that the performance is practically the same for an ECMS with a pair of equivalent factors and with a unique equivalent factor [7]. In order to reduce the computational cost, a unique equivalent factor is used for the charging and discharging of the battery [19,21]. In hybrid electric vehicles, the battery SOC is controlled to vary within a certain range, so the influence of SOC deviations on the battery parameters, such as the open-circuit voltage, internal resistance, and charge/discharge efficiencies, is rather small; in other words, the dependence of $H$ on the state variable $x$ can be neglected. Thus, in Equation (14), $\dot{\lambda}(t) = 0$, and the equivalent factor $s$ is a constant. For a given driving cycle, a shooting algorithm can be used to determine the optimal equivalent factor $s_{opt}$, so that the constraint on the final state, $SOC(t_f) = SOC(t_0)$, is fulfilled. However, the optimal equivalent factor varies with different driving cycles. In on-line applications, the equivalent factor needs to be continuously adjusted. By incorporating a reference SOC, the equivalent factor is calculated in Equation (16) from the nominal equivalent factor $s_{nom}$ and a penalty function $f_p$ that prevents large variations of the SOC. Typically, a proportional-integral (PI) controller can be adopted to maintain the SOC around the reference value.
In this study, the functions described in [27] were used for SOC correction. In these functions, $kT$ is the sampling time, and $n$ and $m$ are the tuning parameters for the P and I multipliers, respectively; $n = 4$ and $m = 5$ were used in the penalty function. The P control was used to keep the SOC within the confined boundary [$SOC_{min}$, $SOC_{max}$], and the I control was adopted to eliminate the accumulated deviation of the SOC from the reference value.
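The shooting step mentioned above, which searches for a constant equivalent factor that returns the SOC to its initial value, could be sketched as a simple bisection. This is an illustration under stated assumptions rather than the authors' procedure: `final_soc` is a hypothetical callable that simulates the whole cycle for a given factor, the final SOC is assumed to increase monotonically with the factor, and `toy_cycle` is a made-up stand-in for a real simulation.

```python
def shoot_equivalent_factor(final_soc, soc_target, s_lo, s_hi, tol=1e-4, max_iter=60):
    """Bisection on a constant equivalent factor s.

    final_soc(s) must run the cycle with that s and return SOC(t_f). It is
    assumed to grow with s, since a larger s discourages battery use.
    """
    for _ in range(max_iter):
        s_mid = 0.5 * (s_lo + s_hi)
        err = final_soc(s_mid) - soc_target
        if abs(err) < tol:
            return s_mid
        if err > 0.0:
            s_hi = s_mid   # battery ends too full: lower the factor
        else:
            s_lo = s_mid   # battery ends too empty: raise the factor
    return 0.5 * (s_lo + s_hi)


def toy_cycle(s):
    """Made-up monotone response standing in for a full cycle simulation."""
    return 0.65 + 0.08 * (s - 2.4)


s_opt = shoot_equivalent_factor(toy_cycle, soc_target=0.65, s_lo=1.5, s_hi=3.5)
```

Each bisection step costs one full cycle simulation, which is one reason such a search is only practical off-line for known cycles.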

Drivability Problems
In real-time applications, if fuel economy is the only criterion, drivability problems may arise. Typically, in a CVT-based HEV, the unconstrained control signals in Equation (13) will lead to frequent engine on/off events and drastic fluctuations of the CVT speed ratio. An engine on/off event often involves a driving mode transition; thus, excessive engine events will result in bad drivability. Additionally, when an abrupt change in the speed ratio occurs, the dynamic characteristics of the driveline will have a negative impact on the drivability. In engineering applications, the CVT is usually controlled to adjust engine operating points along the optimal operating line to ensure fuel economy. The target speed ratio is a function of vehicle speed and engine output torque. Considering the dynamic response of the hydraulic system, the rate of change of the speed ratio is usually confined to a certain range [28]. As long as there is no sudden change in driving conditions, the speed ratio will not fluctuate significantly. However, in an optimization method, the algorithm is designed to select the control policies that minimize fuel consumption without considering drivability, which can produce erratic control signals. Therefore, control policies that are more representative of real-world driving behavior and ensure good drivability are needed. In this paper, penalty terms have been incorporated into the cost function to constrain frequent engine on/off events and large variations of the CVT speed ratio. The Hamiltonian in Equation (13) can be reformulated as
$$H = \dot{m}_f(t) + \dot{m}_{f,bat,eq}(t) + \alpha\, I_{EE}(t) + \beta \left(\Delta i_{CVT}(t)\right)^2 \tag{20}$$
where $\alpha$ and $\beta$ are the weighting factors, $I_{EE}$ is an indicator function that equals 1 when an engine on or off event takes place, and $\Delta i_{CVT}(t)$ is the difference in speed ratio between consecutive time steps. The speed ratio difference is squared in order to keep the penalty term nonnegative and to penalize large variations of the speed ratio.
The Hamiltonian described above not only minimizes instantaneous fuel consumption but also takes the drivability into account. Note that the tuning of the weighting factors needs some trial and error depending on the desired outcomes.
Based on previous research [29], the weighting factors α = 0.3 and β = 1 were selected in this study.
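To make the effect of the two penalty terms concrete, the sketch below scores a small set of candidate controls with and without them. It assumes the penalties are simply added to the fuel terms; the candidate values, the dictionary keys, and any scaling between the terms are hypothetical.

```python
def penalized_cost(fuel_rate, equiv_rate, engine_on, prev_engine_on,
                   ratio, prev_ratio, alpha=0.3, beta=1.0):
    """Instantaneous cost with drivability penalties.

    alpha weights an engine on/off event and beta the squared change in CVT
    speed ratio. Units and normalisation of the terms are assumptions.
    """
    engine_event = 1.0 if engine_on != prev_engine_on else 0.0
    return (fuel_rate + equiv_rate
            + alpha * engine_event
            + beta * (ratio - prev_ratio) ** 2)


def best_control(candidates, prev_engine_on, prev_ratio):
    """Return the candidate with the lowest penalized cost."""
    return min(
        candidates,
        key=lambda c: penalized_cost(c["fuel"], c["equiv"], c["engine_on"],
                                     prev_engine_on, c["ratio"], prev_ratio),
    )


# Made-up candidates for one time step; the engine was off at the last step.
candidates = [
    {"fuel": 0.0, "equiv": 0.9, "engine_on": False, "ratio": 1.0},  # electric drive
    {"fuel": 0.8, "equiv": 0.0, "engine_on": True, "ratio": 1.0},   # engine, same ratio
    {"fuel": 0.6, "equiv": 0.0, "engine_on": True, "ratio": 1.8},   # engine, ratio jump
]
choice = best_control(candidates, prev_engine_on=False, prev_ratio=1.0)
```

With these made-up numbers, the third candidate has the lowest fuel-only cost, but the penalized cost favors staying in electric drive, which is the kind of trade-off the weighting factors are meant to tune.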

The Adaptation of the Equivalent Factor
In the A-ECMS, the selection of the nominal equivalent factor has a great influence on the estimation of the equivalent factor, which affects the minimization of fuel consumption and the final state of the SOC. In Figure 2, SOC trajectories for different nominal equivalent factors for an initial SOC of 65% under the Urban Dynamometer Driving Schedule (UDDS) are depicted. We can see that, as $s_{nom}$ increases, the final SOC value increases and does not converge to the initial value. Since the strategy is more inclined to constrain the use of battery power, this also leads to more fuel consumption, as shown in Table 2. Therefore, in on-line implementations, the nominal equivalent factor needs to be regularly updated according to different driving cycles.
The extraction of feature parameters and the clustering method of driving cycles were systematically introduced in [30]. On this basis, 11 typical driving cycles were selected as the basic sample database. The database was divided into four classes, namely, urban, suburban A, suburban B, and highway cycles, in which suburban driving was split into suburban A and suburban B according to the distribution of the optimal equivalent factors. The classification of driving cycles is shown in Table 3. Note that a sufficient number of samples was required to ensure the training effect of the LVQ neural network. Moreover, in order to better reflect the characteristics of the standard cycles, a microtrip extraction method [31] was used to collect the velocity information for statistical analysis. Later, the feature vectors of each microtrip were imported as training sample data. Finally, the 11 standard cycles were divided into 81 microtrips, which could ensure the recognition accuracy.
An LVQ neural network model was constructed in the Matlab/Simulink environment, which consisted of an input layer, a competition layer, and an output layer. The neural network was trained under supervision, and the competition layer weights were adjusted according to the learning results. In this study, the LVQ1 algorithm was used, and the training process was as follows:
Step 1: Initialize the competition layer weights $w_{ij}$ and the learning rate $\eta$ ($\eta > 0$).
Step 2: Send the input vector $X = (x_1, x_2, \cdots, x_R)^T$ into the input layer and calculate the Euclidean distance between the input vector and each competition layer neuron.
Step 3: Select the closest competition layer neuron (e.g., $d_m$), then mark the corresponding linear output layer neuron connected to it as $C_m$.
Step 4: If the category of the output layer neuron is consistent with that of the input vector, the competition layer weights are updated according to Equation (21); otherwise, they are updated according to Equation (22):
$$w_{ij,new} = w_{ij,old} + \eta\,(x - w_{ij,old}) \tag{21}$$
$$w_{ij,new} = w_{ij,old} - \eta\,(x - w_{ij,old}) \tag{22}$$
Eleven feature parameters were used as input neurons, including (1) average speed $v_m$, (2) maximum speed $v_{max}$, (3) maximum acceleration $a_{max}$, (4) average acceleration $a_m$, (5) maximum deceleration $d_{max}$, (6) average deceleration $d_m$, (7) idle time ratio $r_{idle}$, (8) acceleration time ratio $r_{acc}$, (9) deceleration time ratio $r_b$, (10) constant speed time ratio $r_c$, and (11) idle times $f_{idle}$. The competition layer comprised 20 neurons, and 4 output neurons represented the 4 driving patterns. The training result is shown in Figure 3. The learning rate was 0.001. After 42 iterations, the mean squared error was reduced to less than 0.05.
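The four training steps can be sketched in plain Python as follows. This is a minimal LVQ1 illustration on a two-dimensional toy problem with two prototypes, not the 11-input, 20-neuron network of the study; the sample points, the learning rate, and the epoch count are arbitrary assumptions.

```python
import math


def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x in Euclidean distance (Steps 2-3)."""
    dists = [math.dist(x, w) for w in prototypes]
    return dists.index(min(dists))


def lvq1_update(x, label, prototypes, proto_labels, lr):
    """Step 4: move the winner toward x if the labels agree, else away."""
    m = nearest_prototype(x, prototypes)
    sign = 1.0 if proto_labels[m] == label else -1.0
    prototypes[m] = [w + sign * lr * (xi - w) for w, xi in zip(prototypes[m], x)]
    return m


def train_lvq1(samples, labels, prototypes, proto_labels, lr=0.05, epochs=30):
    """Repeat the update over all samples for a fixed number of epochs."""
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            lvq1_update(x, y, prototypes, proto_labels, lr)
    return prototypes


# Toy data: two well-separated classes in a normalized feature plane.
samples = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels = [0, 0, 1, 1]
prototypes = [[0.4, 0.4], [0.6, 0.6]]   # Step 1: initial weights
proto_labels = [0, 1]
train_lvq1(samples, labels, prototypes, proto_labels)
```

After training, a new feature vector is classified by the label of its nearest prototype, which is why on-line recognition needs only a handful of distance evaluations per update.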

In order to verify the on-line DPR algorithm, a random combination of standard driving cycles was used, as shown in Figure 4. Since the driving cycle is not known a priori, in this paper, we assumed that the driving cycle did not change frequently. Thus, feature vectors extracted from 120 s of historical data were used to predict the driving pattern for the next 10 s. It is worth noting that the length of the recognition cycle has a great influence on the recognition accuracy. A long cycle will lead to poor accuracy, while a short cycle will increase the computational burden and may also result in frequent switching of the driving mode [16]. Due to the lack of historical data in the first 120 s at the beginning of the driving cycle, a default pattern of urban driving was used. As shown in Figure 5, the algorithm successfully recognized most of the driving patterns. However, at some points, when the cycle transitioned from one driving pattern to another and the vehicle speed changed dramatically, the recognition accuracy decreased, mainly because the feature vectors were relatively close and the algorithm was sensitive to changes in speed and acceleration.
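The windowing scheme described above (120 s of history, a new decision every 10 s, and a default urban pattern at start-up) could be sketched as follows. The sketch assumes a 1 Hz speed trace, computes only a subset of the eleven features, and takes the classifier as a hypothetical callable; the thresholds for idle and accelerating samples are also assumptions.

```python
def window_features(speeds, dt=1.0, idle_thresh=0.1, acc_thresh=0.1):
    """A subset of the feature parameters for one window of speeds (m/s)."""
    accels = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    n = len(speeds)
    pos = [a for a in accels if a > acc_thresh]
    neg = [a for a in accels if a < -acc_thresh]
    return {
        "v_mean": sum(speeds) / n,
        "v_max": max(speeds),
        "a_max": max(accels, default=0.0),
        "a_mean": sum(pos) / len(pos) if pos else 0.0,
        "d_max": min(accels, default=0.0),
        "d_mean": sum(neg) / len(neg) if neg else 0.0,
        "r_idle": sum(v < idle_thresh for v in speeds) / n,
    }


def recognize_online(speeds, classify, window=120, period=10, default="urban"):
    """Return one pattern label per sample; window and period are in samples."""
    labels, current = [], default
    for k in range(len(speeds)):
        # A new decision is made every `period` samples once a full window exists.
        if k >= window and (k - window) % period == 0:
            current = classify(window_features(speeds[k - window:k]))
        labels.append(current)
    return labels


# Hypothetical classifier: a threshold on mean speed instead of a trained network.
trace = [20.0] * 140
pattern = recognize_online(trace, lambda f: "highway" if f["v_mean"] > 15.0 else "urban")
```

In practice the `classify` callable would be the trained LVQ network applied to the full, normalized feature vector.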

Selection of the Nominal Equivalent Factor
In previous research, a fixed equivalent factor was used to represent each driving pattern [17,18], which, however, ignored the influence of the SOC on the estimation of the equivalent factor. Because of the complexity of driving conditions, when the type of driving pattern changes, the SOC does not necessarily equal the corresponding initial SOC of the optimal equivalent factor, as shown in Figure 6. At this point, the fixed equivalent factor will lead to inaccurate estimation of the equivalent factor during the driving cycle.

Figure 7 presents the optimal equivalent factors for initial SOCs ranging from 0.55 to 0.8 for different driving patterns when the target final SOC was set to 0.7. It can be seen that, for a given initial SOC, the optimal equivalent factors corresponding to different standard cycles were very concentrated, except for SOCs ranging from 0.72 to 0.8 in urban driving. Further, the relationship between the optimal equivalent factor and the initial SOC was almost linear for each driving pattern. In a statistical sense, the average curve can well reflect the general tendency of a class of driving cycles. In this study, the average curve (shown by the red line) was used to represent the relationship between the initial SOC and the optimal equivalent factor for each driving pattern. Thus, the nominal equivalent factor in Equation (16) was periodically updated as a function of the driving pattern and the SOC.
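One way to implement such an update is a per-pattern look-up with linear interpolation over the SOC. The sketch below is illustrative only: the breakpoints in `S_NOM_CURVES` are invented placeholders, not the averaged curves of Figure 7, and clamping outside the tabulated SOC range is an assumption.

```python
# Placeholder (SOC, s_nom) breakpoints per driving pattern. The real curves are
# the averaged lines of Figure 7 and are not reproduced here.
S_NOM_CURVES = {
    "urban": [(0.55, 2.60), (0.80, 2.30)],
    "suburban_a": [(0.55, 2.70), (0.80, 2.40)],
    "suburban_b": [(0.55, 2.80), (0.80, 2.50)],
    "highway": [(0.55, 2.90), (0.80, 2.60)],
}


def nominal_equivalent_factor(pattern, soc, curves=S_NOM_CURVES):
    """Interpolate s_nom for the recognized pattern at the current SOC."""
    pts = curves[pattern]
    if soc <= pts[0][0]:
        return pts[0][1]            # clamp below the tabulated range
    if soc >= pts[-1][0]:
        return pts[-1][1]           # clamp above the tabulated range
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= soc <= x1:
            return y0 + (y1 - y0) * (soc - x0) / (x1 - x0)
    return pts[-1][1]


s_nom = nominal_equivalent_factor("urban", 0.675)
```

A table of this kind would be queried each time the recognized pattern is refreshed, so the nominal factor follows both the pattern and the current SOC.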

Implementation of the A-ECMS
The schematic diagram of the adaptive-ECMS strategy based on driving pattern recognition is shown in Figure 8. Vehicle speed is fed back to the controller, and the DPR algorithm regularly recognizes the driving pattern according to the feature vectors. Then, the adapter in the A-ECMS controller updates the nominal equivalent factor according to the driving pattern and the current battery SOC. Finally, the ECMS algorithm computes the optimal controls based on the real-time equivalent factor and the torque request.
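The signal flow of the schematic can be summarized as one controller update, sketched below. All four callables (`recognize`, `s_nom_of`, `correct`, and `ecms`) are hypothetical interfaces invented for this illustration, and the stub implementations in the usage lines are placeholders rather than the study's models.

```python
def a_ecms_step(state, speed_window, torque_request,
                recognize, s_nom_of, correct, ecms):
    """One A-ECMS update: recognize, adapt the factor, then optimize.

    recognize(speed_window)         -> driving pattern label
    s_nom_of(pattern, soc)          -> nominal equivalent factor
    correct(s_nom, soc)             -> real-time equivalent factor
    ecms(s, torque_request, state)  -> chosen controls
    """
    pattern = recognize(speed_window)            # driving pattern recognition
    s_nom = s_nom_of(pattern, state["soc"])      # adapter: pattern and SOC
    s = correct(s_nom, state["soc"])             # SOC-based correction
    return ecms(s, torque_request, state)        # instantaneous optimization


# Stub components, used only to show how the pieces connect.
controls = a_ecms_step(
    state={"soc": 0.6},
    speed_window=[],
    torque_request=50.0,
    recognize=lambda window: "urban",
    s_nom_of=lambda pattern, soc: 2.5,
    correct=lambda s_nom, soc: s_nom + (0.7 - soc),
    ecms=lambda s, torque, state: {"s": s, "torque": torque},
)
```

Keeping the recognizer, adapter, and optimizer behind separate interfaces lets the slow pattern update and the fast torque-split optimization run at different rates.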

Simulation Results
In order to verify the effectiveness of the proposed method, simulation tests were conducted under a test driving cycle, which was a random combination of six standard driving cycles: NurembergR36, HL05, SC03, REP05, HWFET, and FTP. The velocity profile of the cycle and the result of the on-line DPR are shown in Figure 9. The sampling time for the simulation was 0.1 s. Both the initial and the reference SOCs were set to 0.7. For simplicity, the proposed adaptive-ECMS based on DPR is hereinafter referred to as the "proposed method".

In this part, simulation results are presented to verify the drivability improvements. A comparison of the torque split factor is shown in Figure 10. When the weighting factor α = 0, engine on/off events occurred as many as 828 times over the test cycle. After the constraint was imposed, the torque split factor changed far less frequently, which means that fewer driving mode switches occurred and better drivability could be expected; engine on/off events dropped by 62.6%. The improvement of the CVT speed ratio signal is shown in Figure 11. When the weighting factor β = 0, the speed ratio fluctuated frequently and drastically. After the constraint was imposed, the speed ratio signal was smoothed and more representative of real-world driving behavior. Although the introduction of penalty terms inherently leads to extra fuel consumption, for the test cycle, the equivalent fuel consumption increased by only 1.1%, which is acceptable.
To examine the fuel economy improvement of the proposed method, an adaptive-ECMS with a PI controller and a DPR-algorithm-based A-ECMS as reported in [17] were evaluated under the test cycle for comparison. For simplicity, the former is referred to as the "A-ECMS", and the mean equivalent factor of four driving patterns for an initial SOC of 0.7 was selected as its nominal equivalent factor. The latter is referred to as the "DPR A-ECMS", in which the nominal equivalent factor was periodically updated only by the selected driving pattern. The off-line ECMS with an optimal equivalent factor served as the benchmark for the on-line strategies.
In Figure 12, the optimal equivalent factor for the corresponding test cycle was 2.987, and the equivalent factors for the on-line strategies varied continuously with time and fluctuated around the optimal equivalent factor. As can be seen from Figure 13, for the proposed method, the battery SOC decreased at the beginning of the driving cycle, since the nominal equivalent factor was relatively small, and the penalty for the use of battery power was small as a result. At about 1200 s, when the HEV was in engine recharge mode, the vehicle speed suddenly decreased, and the battery SOC increased to 0.9 due to the braking energy regeneration. At this time, the equivalent factor decreased rapidly to penalize the output of the engine power, and the SOC was quickly brought back within the target range. At the end of the test cycle, compared with the off-line ECMS, the SOC for the on-line strategies did not perfectly converge to the initial SOC; however, the proposed method showed better robustness. Note that for the A-ECMS, better fuel economy and SOC management can be expected for a smaller nominal equivalent factor.
The comparison of the cumulated fuel consumption and the final SOC is shown in Table 4. The fuel consumption for the DPR A-ECMS was very close to that of the proposed method and the off-line ECMS; however, its constraint effect on the final state was relatively poor. Considering the correction of the total fuel consumption at the end of the driving cycle, its actual fuel consumption will be even higher. The proposed method showed the most promising results compared with the other two on-line strategies, and its performance was very close to that of the off-line ECMS with the optimal equivalent factor.
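The drivability figures reported for the torque split factor can be checked with a short calculation. The sketch below assumes that an engine on/off event is any change of the sampled engine state between consecutive samples; the source does not define the counting rule, and the function names are hypothetical. A 62.6% drop from 828 events implies roughly 828 × (1 − 0.626) ≈ 310 remaining events, a figure derived here and not stated in the source.

```python
from typing import Sequence


def count_on_off_events(engine_on: Sequence[bool]) -> int:
    """Count state changes (on to off, or off to on) between consecutive
    samples. The counting rule is an assumption, not the paper's definition."""
    return sum(1 for prev, cur in zip(engine_on, engine_on[1:]) if prev != cur)


def relative_reduction(baseline: int, constrained: int) -> float:
    """Fractional drop in event count relative to the unconstrained baseline."""
    return (baseline - constrained) / baseline


# 828 events at alpha = 0 is reported in the text; 310 is the derived estimate.
reduction_percent = round(relative_reduction(828, 310) * 100, 1)
```

The same counting function could be applied to an engine state trace sampled at the 0.1 s simulation step to compare runs with and without the penalty term.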

Conclusions
In this paper, a real-time optimal control strategy for a CVT-based hybrid electric vehicle was developed. In order to enhance drivability, penalty terms were incorporated into the Hamiltonian to constrain frequent engine on/off events and large variations of the CVT speed ratio. For the on-line estimation of the equivalent factor, an LVQ neural network model was adopted for on-line recognition of the driving pattern. The influence of the initial SOC on the optimal equivalent factor was analyzed under different driving patterns. On that basis, a method of adaptation of the equivalent factor was proposed by considering the driving pattern and the battery SOC. Simulation results indicate that the penalty terms effectively enhance drivability: on the test cycle, engine on/off events dropped by 62.6% while the equivalent fuel consumption increased by only 1.1%. In addition, compared with the other A-ECMS strategies considered, the proposed method has better fuel economy and robustness to varying driving cycles, and the result is close to that of the off-line ECMS with an optimal equivalent factor.