Stochastic Optimal Control of Parallel Hybrid Electric Vehicles

Energy management strategies (EMSs) in hybrid electric vehicles (HEVs) strongly influence fuel economy and emission performance. However, designing an EMS is challenging because of the complex structure of an HEV and because driving cycles are unknown or only partially known. To address this problem, this paper adopts a stochastic dynamic programming (SDP) method for the EMS of a specially designed vehicle, a pre-transmission single-shaft torque-coupling parallel HEV. In this parallel HEV, the auto-clutch output is connected to the transmission input through an electric motor, which enables efficient motor-assist operation. In this EMS, the torque demanded by the driver is modeled as a one-state Markov process to represent the uncertainty of future driving situations. The resulting EMS has been evaluated with ADVISOR2002 over two standard government drive cycles and a self-defined one, and compared with a dynamic programming (DP) strategy and a rule-based strategy. Simulation results show the real-time performance of the proposed approach and a potential improvement in vehicle performance relative to the rule-based strategy.


Introduction
Hybrid electric vehicles (HEVs) are regarded as an inevitable step for the vehicle industry in the transition from conventional vehicles to pure electric vehicles, driven by energy shortages and air pollution. Among all hybridization techniques, the energy management strategy (EMS), which defines the contribution of the two power sources (an internal combustion engine (ICE) and an electric battery) in fulfilling a given power demand, plays an important part in improving the fuel consumption of an HEV.
Extensive literature has dealt with EMS design [1,2]. In the automotive industry, rule-based control strategies, ranging from the thermostat strategy, the power follower control strategy, and the engine optimal working point strategy to fuzzy rule-based techniques (e.g., conventional fuzzy strategy, adaptive fuzzy logic strategy, and fuzzy Q-learning strategy), were developed on the basis of experimental trials and engineering experience because they are effective in real-time control [3-10]. However, their results are tied to a specific vehicle design and they do not cope well with multiple objectives, which limits the application of these rule-based control strategies in HEVs.
Energies 2017, 10, 214
In academic circles, optimal control strategies are more popular [11]. These methods usually find the minimum fuel consumption by minimizing a cost function over a fixed driving situation. Machine learning and related techniques, such as game theory, neural networks, particle swarm optimization, genetic algorithms, and reinforcement learning, are used in the optimization [10,12-15]. Meanwhile, both dynamic programming (DP) and Pontryagin's Minimum Principle (PMP) are well-known optimal control theories in EMS design [16,17]. DP, which serves as a basis of comparison for evaluating the quality of other control strategies, is used to analyze the control actions and extract implementable rules for vehicle control (e.g., switching of the drivetrain operating modes and power-split control), and then to design and improve other algorithms, such as rule-based algorithms [12,16,18-20]. Although computation time has improved significantly with high-performance multi-core processors and some computational techniques [21], the major disadvantages of the DP approach, compared with other methods, are its heavy computational cost and its need for complete driving-cycle information. In recent years, the equivalent consumption minimization strategy (ECMS), based on the assumption that the consumed electric energy can be converted into an equivalent fuel consumption, has been used to minimize the equivalent fuel consumption at each moment [22,23]. However, the equivalence factor cannot be determined quickly over complex driving conditions, and the method still cannot be implemented directly in real-world scenarios because it requires knowledge of future routes. The model predictive control method, characterized by receding-horizon optimization, feedback, and a forecasting model, has been applied to EMS because it does not require detailed drive schedule information in advance [24-31].
Meanwhile, since Lin et al. first introduced stochastic dynamic programming (SDP) in infinite-horizon form to the EMS of a parallel HEV in 2004 [32], the literature has applied SDP to all kinds of HEVs (e.g., ICE HEVs, fuel cell HEVs, plug-in HEVs, and hybrid electric buses) with different drivetrains, because most of the calculation can be carried out offline and the method has potential for real applications, especially with the development of high-performance computing equipment [33,34]. SDP EMS research based on more refined vehicle models has been validated with simulation, hardware, and on-road testing [34-36]. In the optimization, cost functions mainly focus on fuel economy, and may also include electricity consumption, emissions, drivability (e.g., gear changes, braking, and engine start-stop), component degradation (e.g., keeping the state of charge (SOC) near an expected final or nominal value, or limiting fuel cell degradation by reducing transient loading), and electrical powertrain stress (e.g., the square of the battery charge) [33,35,37,38]. The method also relies heavily on the discount factor, whose selection is still an open research problem [39]. SDP uses the DP method to solve a problem built on a statistical model of future driving conditions (e.g., slope, speed limits, traffic flow information, and vehicle load), which is usually represented by Markov chains. The Markov chains can be modeled as one-state or multi-step-state chains from both standard drive schedules and historical real-world driving data (e.g., based on a fixed route with road-segment discretization) [35,36,40]. As a special case, shortest-path SDP (SP-SDP) has been proposed, which usually does not have a discount factor and considers each cycle to end in an absorbing terminal state (e.g., vehicle stop) [35,38,41,42].
The main focus of this study is to develop a near-optimal EMS with real-time application potential for a pre-transmission single-shaft torque-coupling parallel HEV, one that improves fuel economy and reduces emissions without prior knowledge of future traffic situations and without deteriorating vehicle performance. For the parallel HEV discussed in this paper, the engine speed equals the motor speed when the clutch is engaged, whereas the engine speed is zero when the clutch is released. The EMS is therefore defined as a torque split between the engine and the motor, and a strategy based on splitting the demanded propelling torque is investigated. Unlike methods that treat the torque demanded by the driver as a priori information, this paper models it as a one-state Markov process to represent the uncertainty of congestion, road type, and so on. The Markov chains are built from the American Urban Dynamometer Driving Schedule (UDDS), the Japanese 1015 drive cycle, and a self-defined cycle. A stochastic dynamic programming approach is then applied to the EMS problem to alleviate the cycle sensitivity of the optimal control law.
The remainder of this paper is organized as follows. The parallel HEV model and the related EMS problem formulation are described in Section 2. Section 3 focuses on the stochastic modeling of the driver torque demand, the SDP control strategy, and its specific implementation steps. To test the performance of this EMS, a simulation platform based on ADVISOR2002 was built. Simulation results and a comparison with a rule-based strategy and a DP-based strategy over three different driving cycles are presented in Section 4. Finally, conclusions are given in Section 5.

Parallel HEV System Configuration
In this paper, we study a pre-transmission single-shaft torque-coupling parallel HEV, as shown in Figure 1. The engine, the electric motor, the nickel-metal hydride (Ni-MH) battery pack, the automatic clutch, and the automated transmission constitute the parallel hybrid electric powertrain. The auto-clutch output is connected to the transmission input through an electric motor. The distinguishing advantage of this architecture is efficient motor-assist operation. When the automatic clutch is disengaged, the engine is off and isolated from the transmission, and the motor alone drives the vehicle to provide pure electric propulsion. When the automatic clutch is engaged, the engine and the motor rotate at the same speed. Furthermore, during deceleration or downhill driving, the motor works as a generator, and the regenerative braking power, recovered from the vehicle's kinetic or potential energy, charges the battery pack. The main vehicle information is shown in Table 1.

System Equations
The drivetrain components of the parallel HEV are all modeled according to the quasi-static principle. In other words, dynamics whose response frequency is much higher than that of the vehicle energy flow are ignored.
Vehicle Dynamics: The torque required by the driver at the wheels, T_w, is determined by the following equation:
$$T_w = \eta \, R(g(k)) \, (T_e + T_m) \quad (1)$$
where η is the total transmission efficiency from the transmission to the driving wheels; R(g(k)) is the total transmission ratio when the gear is g(k); and T_e and T_m are the engine torque and the motor torque, respectively. The angular speed of the driving wheels, ω_w, is obtained from the kinematic equation:
$$\omega_w = \frac{v(k)}{r_w(v(k))} = \frac{\omega_m}{R(g(k))} \quad (2)$$
where v(k) is the vehicle speed; r_w(v(k)) is the dynamic radius of the wheel; and ω_m is the angular speed of the motor.
The operating equation of this parallel HEV involves the following quantities: B_w is the viscosity coefficient; F_r is the rolling resistance; F_a is the aerodynamic resistance; M_r = m + m_r is the effective mass of the vehicle; m is the equipment mass; and m_r is the equivalent translational mass of the rotating components of the vehicle.
Engine: In this paper, we adopt the quasi-static principle and ignore the dynamic response of the engine. In other words, we do not consider the effects of engine temperature variation and assume that the engine is fully warmed up. The fuel consumption rate and all emissions are assumed to be static functions of engine speed and engine torque, which are obtained from the engine map.
Transmission: The gearshift sequence of the automatic transmission is modeled as a discrete-time dynamic system in which g(k) is the current gear position number and shift(k) is the gear shift command. The value of shift(k) is constrained to 1, 0, and −1, representing the downshift, sustain, and upshift commands, respectively.
Motor: According to experimental data, motor efficiency is modeled as a function of the motor torque and the motor speed.
Battery: We ignore thermal effects and transient effects (due to internal capacitance) in the battery. The battery is modeled as a voltage source with an open-circuit voltage V_oc and an internal resistance R_int, both of which depend on the battery state of charge SOC(k). The other parameters of the model are as follows: Q_max is the maximum charge capacity of the battery; R_t is the terminal resistance of the battery; and η_m is the motor efficiency.
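The battery model can be illustrated with a short sketch. The Python function below assumes a common form of the internal-resistance model: the terminal power P satisfies P = V_oc·I − (R_int + R_t)·I², the motor efficiency η_m converts the mechanical power T_m·ω_m into electrical power, and the SOC changes by −I·Δt/Q_max. This specific formula, the name `soc_next`, the sign convention, and all numbers in the usage lines are assumptions made for illustration and are not taken from the paper; V_oc and R_int are passed in as values already looked up at the current SOC.

```python
import math


def soc_next(soc, t_m, w_m, eta_m, v_oc, r_int, r_t, q_max, dt=1.0):
    """One-step SOC update for an internal-resistance battery model (sketch).

    t_m > 0 means the motor is driving (battery discharging);
    t_m < 0 means regenerative braking (battery charging).
    q_max is in ampere-seconds, dt in seconds.
    """
    p_mech = t_m * w_m  # mechanical power at the motor shaft [W]
    # Losses raise the power drawn when motoring and cut the power returned
    # when generating.
    p_batt = p_mech / eta_m if p_mech >= 0 else p_mech * eta_m
    r = r_int + r_t
    disc = v_oc ** 2 - 4.0 * r * p_batt
    if disc < 0:
        raise ValueError("requested power exceeds what the battery can deliver")
    # Smaller root of P = V_oc*I - r*I^2; positive current means discharge.
    current = (v_oc - math.sqrt(disc)) / (2.0 * r)
    return soc - current * dt / q_max


# Hypothetical numbers: 300 V pack, 6.5 Ah capacity, 0.5 + 0.1 ohm resistance.
q = 6.5 * 3600.0
print(soc_next(0.6, 50.0, 100.0, 0.9, 300.0, 0.5, 0.1, q))   # motoring: SOC drops
print(soc_next(0.6, -50.0, 100.0, 0.9, 300.0, 0.5, 0.1, q))  # braking: SOC rises
```

With zero motor torque the current is zero and the SOC is unchanged, which is a quick sanity check on the sign convention.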

As a result, the state vector of the parallel HEV comprises four state variables: the torque demanded by the driver T_dem(k) = T_e + T_m, the angular speed of the driving wheels ω_w, the gear position g(k), and the battery state of charge SOC(k). The control vector of the parallel HEV system comprises the engine torque T_e(k) and the gear shift command shift(k).

Optimal Control Problem Formulation
The parallel HEV studied in this paper is modeled as a discrete-time dynamic system. The evolution of the system state is described by the following equation:
$$x(k+1) = f(x(k), u(k)) \quad (6)$$
where u(k) is the control vector at time step k and x(k) is the state vector at time step k. From the analysis in Section 2.2, the state and control vectors are:
$$x(k) = [T_{dem}(k), \ \omega_w(k), \ g(k), \ SOC(k)]^T, \quad u(k) = [T_e(k), \ shift(k)]^T \quad (7)$$
The parallel HEV operates according to Equations (1)-(5).
To ensure the safety of the engine and the battery and the smooth operation of the motor, constraints are imposed on the system in the optimization (Equation (8)). We define the EMS as a policy π that maps each state x to an action a. The state of the parallel HEV changes from x to x' under action a. The goal of the EMS is to find the control input u(k) that minimizes fuel consumption while satisfying the constraints in Equation (8). We adopt a value function J, defined as the expected accumulated future cost starting from state x, to evaluate the policy π. In other words, the optimal EMS is the optimal policy π*, which attains the minimum value function J*. Because the operating equations and the cost function of the parallel HEV system are both time-invariant and a driving cycle has no fixed final time, the EMS problem is formulated as an infinite-horizon dynamic optimization problem, shown in Equation (9). The benefit of the infinite-horizon formulation is that the generated control policy is time-invariant and can thus be implemented easily.
$$J = \lim_{N \to \infty} E\left\{ \sum_{k=0}^{N-1} \gamma^k L(x(k), u(k)) \right\} \quad (9)$$
where N is the number of samples in a driving cycle; γ is the discount factor, which ensures that the value function converges; and L(x(k),u(k)) is the cost function, defined as follows:
$$L(x(k), u(k)) = L_{fuel} + \alpha L_{ems} + \lambda L_{soc} + \upsilon L_{gs} \quad (10)$$
where L_fuel, L_ems, L_soc, and L_gs are the costs of fuel consumption, emissions, battery state of charge, and gear shifting, respectively, and α, λ, and υ are the weight factors of the emission, battery SOC, and gear shift terms, respectively. They represent the importance of each term in L(x(k),u(k)). L_soc is defined in Equation (11) to sustain charge and to minimize the probability of battery depletion.
Here, SOC_ref is the desired battery state of charge at the end of the driving cycle. The gear-shifting schedule is crucial to vehicle fuel economy, while frequent gear shifting degrades comfort and drivability. L_gs is therefore defined to penalize gear shifts.
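As a rough illustration of how a weighted one-step cost of this kind combines its four terms, the sketch below adds fuel, emission, SOC, and gear-shift penalties with the weights α, λ, and υ. The quadratic SOC penalty, the absolute-value gear-shift penalty, the function name `stage_cost`, and all default values are assumptions chosen for illustration; the paper's own definitions of L_soc and L_gs may differ.

```python
def stage_cost(l_fuel, l_ems, soc, shift,
               soc_ref=0.6, alpha=0.1, lam=100.0, upsilon=0.5):
    """Weighted one-step cost L = L_fuel + alpha*L_ems + lam*L_soc + upsilon*L_gs."""
    # Assumed SOC penalty: squared deviation from the reference value.
    l_soc = (soc - soc_ref) ** 2
    # Assumed gear-shift penalty: 1 for any shift command, 0 for "sustain".
    l_gs = abs(shift)
    return l_fuel + alpha * l_ems + lam * l_soc + upsilon * l_gs


# Same fuel and emissions; the second step is away from SOC_ref and shifts gear.
print(stage_cost(1.0, 2.0, soc=0.6, shift=0))
print(stage_cost(1.0, 2.0, soc=0.5, shift=1))
```

Raising λ makes the controller hold the SOC closer to its reference at the expense of fuel, and raising υ suppresses gear hunting; the weights therefore encode the trade-off among the competing objectives.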

Problem Implementation
We first use DP to obtain a best-achievable performance benchmark, even though DP cannot be applied in practice because it requires the full driving cycle in advance. Then we develop the SDP approach to solve the EMS problem.

Dynamic Programming (DP) Method
The DP technique is based on the principle of optimality and guarantees a global optimum. Bellman contributed greatly to the application of this principle [43]. The principle is stated as follows [44]: Let π* = {μ*_0, μ*_1, ..., μ*_{N−1}} be an optimal policy for the basic problem, and assume that when using π*, a given state x_i occurs at time step i with positive probability. Consider the subproblem whereby we are at x_i at time step i and wish to minimize the "cost-to-go" from time step i to time step N. Then the truncated policy {μ*_i, μ*_{i+1}, ..., μ*_{N−1}} is optimal for this subproblem. In the EMS problem, because the driving cycle is known in advance, the DP approach is deterministic. The overall optimization problem can be decomposed into a sequence of simpler minimization sub-problems as follows.
The recursion starts at time step N − 1 and proceeds backward through time steps k = N − 2, ..., 0 (Equations (13) and (14)), where J*_k(x(k)) is the optimal value function at time step k in state x(k). The optimal control vector at time step k is the minimizer of the corresponding sub-problem, and the sequence of these minimizers forms the optimal control law. The main disadvantage of the DP approach, compared with other methods, remains its heavy computational burden, which limits the application of DP in EMS.
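The backward recursion can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions and is not the paper's implementation: the state is an integer battery level, the control is the number of torque units supplied by the battery, the engine covers the rest at a quadratic fuel cost, and there is no terminal cost. The names `backward_dp`, `step`, `cost`, and `demand` are hypothetical.

```python
def backward_dp(n_steps, states, controls, step, cost):
    """Deterministic finite-horizon DP solved by backward recursion.

    step(k, x, u) returns the next state, or None if u is infeasible in x.
    cost(k, x, u) returns the stage cost.
    Returns (J, policy): J[k][x] is the optimal cost-to-go from x at step k.
    """
    J = [dict() for _ in range(n_steps + 1)]
    policy = [dict() for _ in range(n_steps)]
    for x in states:
        J[n_steps][x] = 0.0  # assumption: no terminal cost
    for k in range(n_steps - 1, -1, -1):
        for x in states:
            best_u, best_j = None, float("inf")
            for u in controls:
                x_next = step(k, x, u)
                if x_next is None:
                    continue
                j = cost(k, x, u) + J[k + 1][x_next]
                if j < best_j:
                    best_u, best_j = u, j
            J[k][x] = best_j
            policy[k][x] = best_u
    return J, policy


# Toy cycle, known in advance: torque demand per step in arbitrary units.
demand = [1, 2, 1]
states = range(5)  # battery level 0..4


def step(k, x, u):
    # The battery cannot supply more than the demand or more than it holds.
    if u > demand[k] or u > x:
        return None
    return x - u


def cost(k, x, u):
    # The engine covers the remainder; fuel cost grows quadratically.
    return float((demand[k] - u) ** 2)


J, policy = backward_dp(len(demand), states, range(3), step, cost)
print(J[0][0], J[0][2], J[0][4])
```

Every state is visited at every time step, so the work grows with the product of the grid size, the number of controls, and the cycle length, which is the computational burden noted in the text.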

SDP Method Description
SDP is widely used in optimization problems involving planning under uncertainty. The essence of the SDP problem is to reach the termination state with minimum expected cost, where termination occurs with probability 1. The DP method is a special case of SDP in which the transition probability is equal to 1 for each state-control pair [44]. Generally speaking, the SDP approach significantly reduces the computational burden compared with the DP method.
We define the SDP problem as M = (X, A, T, L), where X, A, T, and L are the finite state space, the finite action space, the transition probability matrix, and the cost function, respectively. At each step, the system is in one state of X = {x_1, x_2, ..., x_N}. In each state x_k ∈ X, there is a finite set of actions A = {a_1, a_2, ..., a_M} that the system can perform. The parallel HEV system evolves according to P(x_k, a_k, x_{k+1}), the probability of transitioning from state x_k to state x_{k+1} when action a_k is performed. T is composed of all the P(x_k, a_k, x_{k+1}). Performance is evaluated by the cost function L(x_k, a_k, x_{k+1}), the cost that the vehicle incurs for the transition from x_k to x_{k+1} under action a_k. A policy π is defined as a sequence of state-to-action maps specifying which action is chosen in the current state at each transition time.
A value function U(x_k) is defined as the expected sum of discounted future costs from state x_k under a policy π. The optimal value function U*(x_k) is the minimum of this expected cost over all policies and is obtained by solving the corresponding discounted-cost equation after k_0 time steps, where γ ∈ (0, 1) is the discount factor, which ensures that the infinite sum of costs converges. The closer γ is to 1, the more the long-term performance of the system is taken into account. The Bellman equation relates the value function of each state to that of its successor states:
$$U^*(x_k) = \min_{a_k \in A} \sum_{x_{k+1} \in X} P(x_k, a_k, x_{k+1}) \left[ L(x_k, a_k, x_{k+1}) + \gamma \, U^*(x_{k+1}) \right]$$
The agent selects the optimal action in each state as the minimizer of the right-hand side of this equation.
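For a small finite problem M = (X, A, T, L), the Bellman relation can be solved numerically by repeated backups. The sketch below uses value iteration, which is a different solver from the policy iteration adopted later in the paper and is shown only because it is the most direct reading of the Bellman relation. The two-state, two-action problem, the function name `value_iteration`, and all numbers are made up for illustration.

```python
def value_iteration(P, L, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Discounted-cost value iteration.

    P[a][x][y] is the probability of moving from x to y under action a.
    L[a][x][y] is the cost of that transition.
    Returns (U, policy) with policy[x] the greedy (cost-minimizing) action.
    """
    n_actions, n_states = len(P), len(P[0])
    U = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman backup: expected one-step cost plus discounted future value.
        Q = [[sum(P[a][x][y] * (L[a][x][y] + gamma * U[y])
                  for y in range(n_states))
              for a in range(n_actions)]
             for x in range(n_states)]
        U_new = [min(q) for q in Q]
        done = max(abs(u1 - u0) for u1, u0 in zip(U_new, U)) < tol
        U = U_new
        if done:
            break
    policy = [min(range(n_actions), key=lambda a, x=x: Q[x][a])
              for x in range(n_states)]
    return U, policy


# Hypothetical 2-state, 2-action problem (state 0 is the cheaper place to be).
P = [[[0.9, 0.1], [0.0, 1.0]],   # action 0
     [[0.5, 0.5], [0.8, 0.2]]]   # action 1
L = [[[1.0, 1.0], [4.0, 4.0]],   # action 0
     [[2.0, 2.0], [2.0, 2.0]]]   # action 1
U, policy = value_iteration(P, L)
print(U, policy)
```

Setting every transition probability to 0 or 1 turns the expectation into a single term, which recovers the deterministic DP recursion as a special case.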

Stochastic Modeling of the Driver Torque Demand
Markov processes are applicable in fields characterized by uncertain state transitions and a need for sequential decisions [28]. They satisfy the Markov property, which means that state transitions are independent of the actions and states encountered before the current decision step.
The torque required by the driver is another expression of the demanded power. When operating a vehicle, a driver generates acceleration and braking signals based on personal intent and on interference from driving conditions (e.g., congestion and stops), which makes the driver's torque demand uncertain. We assume that T_dem at the next time step depends only on its value at the current time step, so that T_dem satisfies the Markov property. Therefore, it is modeled as a Markov process in this paper.
Meanwhile, T_dem at each time step can be calculated from the inverse kinematics model of the parallel HEV, using historical data of driver operations or a government driving cycle.
After that, T_dem is discretized into a finite set of N_T values, {T^1_dem, T^2_dem, ..., T^{N_T}_dem}. We then adopt Maximum Likelihood Estimation (MLE) to calculate the transition probability matrix T = (T_ij)_{m×n}, where m and n are the numbers of rows and columns of the matrix, respectively, and T_ij is the one-step transition probability from T_dem = T^i_dem at time step k to T_dem = T^j_dem at time step k + 1:
$$T_{ij} = P\left(T_{dem}(k+1) = T^j_{dem} \mid T_{dem}(k) = T^i_{dem}\right) = \frac{n_{ij}}{n_i}, \quad n_i = \sum_{j} n_{ij}$$
where n_ij is the number of observed transitions from T^i_dem to T^j_dem and n_i is the total number of transitions out of T^i_dem. As an example, the transition probability of an American standard driving cycle, the Urban Dynamometer Driving Schedule (UDDS), is shown in Figure 2.
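For a first-order Markov chain, the maximum likelihood estimate of each transition probability is the observed transition count divided by the number of departures from the originating bin. The sketch below applies this to a short torque trace. The function name `estimate_transition_matrix`, the bin edges, the trace values, and the uniform fallback for bins that are never visited are assumptions made for illustration.

```python
from bisect import bisect_right


def estimate_transition_matrix(torque_trace, bin_edges):
    """MLE of a first-order Markov transition matrix from a torque trace.

    bin_edges are ascending interior edges; they split the torque axis
    into len(bin_edges) + 1 bins.
    """
    n = len(bin_edges) + 1
    idx = [bisect_right(bin_edges, t) for t in torque_trace]
    counts = [[0] * n for _ in range(n)]
    for i, j in zip(idx, idx[1:]):
        counts[i][j] += 1  # one observed transition from bin i to bin j
    T = []
    for row in counts:
        total = sum(row)
        # Assumption: bins that are never left get a uniform row so that
        # every row of T still sums to 1.
        T.append([c / total for c in row] if total else [1.0 / n] * n)
    return T


# Hypothetical torque demand trace in N*m, binned as <0, [0, 20), >=20.
trace = [-10, 5, 5, 30, 5, -10, 5]
T = estimate_transition_matrix(trace, bin_edges=[0, 20])
for row in T:
    print(row)
```

In practice the trace would come from the inverse vehicle model run over a recorded or standard driving cycle, and a finer torque grid needs a longer trace to populate every row reliably.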

Implementation of the EMS Based on the SDP Approach
When implementing the SDP EMS, we first discretized each of the continuous vehicle variables (e.g., the angular speed of the driving wheels ω_w, the state of charge of the battery pack SOC, and the output torque of the engine T_e) onto a finite grid. A summary of the parallel HEV modeling parameters is given in Table 2. Second, from previous driving records, the Maximum Likelihood Estimation (MLE) method was used to obtain the transition probability matrix of the torque required by the driver. Third, the policy iteration method was used to calculate the optimal cost function. The maximum number of iterations in this paper is about 70. Policy iteration alternates between a policy evaluation step and a policy improvement step until the optimal cost function converges. Policy evaluation calculates the value function U_k = U^{π_k} at step k (k = 1, 2, ..., N) for the given policy π_k. In the policy improvement step, a greedy algorithm is applied to the value function calculated at step k to obtain a new policy π_{k+1}. The overall schematic diagram of the SDP approach is shown in Figure 3.
In the SDP algorithm, there is always a trade-off between the discretization resolution and the computational complexity. A coarse discretization yields non-smooth states and actions and poor algorithm performance, whereas more discretization points lead to a large number of state-action pairs and hence a long computing time. We first used a coarse discretization to compute the corresponding cost function matrix. Linear interpolation was then adopted to produce a finer cost function matrix.
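The alternation between policy evaluation and greedy policy improvement can be sketched as follows. This is a minimal illustration on a made-up two-state problem and not the paper's MATLAB implementation: the evaluation step uses repeated sweeps rather than a linear solve, the default cap of 70 iterations simply mirrors the figure quoted in the text, and the names `policy_iteration` and `q_value` are hypothetical.

```python
def policy_iteration(P, L, gamma=0.95, eval_tol=1e-10, max_iter=70):
    """Policy iteration for a finite discounted-cost problem.

    P[a][x][y] is the transition probability, L[a][x][y] the transition cost.
    Returns (U, policy, iterations_used).
    """
    n_actions, n_states = len(P), len(P[0])

    def q_value(x, a, U):
        return sum(P[a][x][y] * (L[a][x][y] + gamma * U[y])
                   for y in range(n_states))

    policy = [0] * n_states
    U = [0.0] * n_states
    for it in range(1, max_iter + 1):
        # Policy evaluation: sweep until U matches the current policy.
        while True:
            U_new = [q_value(x, policy[x], U) for x in range(n_states)]
            delta = max(abs(a - b) for a, b in zip(U_new, U))
            U = U_new
            if delta < eval_tol:
                break
        # Policy improvement: act greedily with respect to U.
        new_policy = [min(range(n_actions), key=lambda a, x=x: q_value(x, a, U))
                      for x in range(n_states)]
        if new_policy == policy:
            return U, policy, it
        policy = new_policy
    return U, policy, max_iter


# Hypothetical 2-state, 2-action problem.
P = [[[0.9, 0.1], [0.0, 1.0]],   # action 0
     [[0.5, 0.5], [0.8, 0.2]]]   # action 1
L = [[[1.0, 1.0], [4.0, 4.0]],   # action 0
     [[2.0, 2.0], [2.0, 2.0]]]   # action 1
U, policy, iterations = policy_iteration(P, L)
print(policy, iterations)
```

The returned policy is a plain state-to-action table, which is the form that makes the offline result cheap to look up at run time.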

Although the SDP approach has fewer time steps than the DP approach, it requires more computation to handle the probabilities for each state and each policy iteration step, whereas DP needs only a single pass. In this research, we adopted the following measures to accelerate the optimization search of the SDP approach. First, the cost matrix of one-step state-action pairs was stored in memory in the form (x(k), a(k), x(k + 1), L_fuel, L_ems, L_soc, L_gs) for fast computation. Second, we used vector operations in MATLAB 7.14 to select the parameters of the cost function efficiently. The computation time was also significantly reduced by reusing existing engineering tools, e.g., the SDP toolbox in MATLAB. Third, the SDP strategy took the form of a table that maps the current state of the parallel HEV to an optimal action. This table can be condensed into a function for implementation in an embedded system.

Performance Evaluation
Simulation experiments were carried out with ADVISOR2002 (NREL's Advanced Vehicle Simulator) on a Pentium IV computer with an Intel Core 2 Duo 3.0 GHz CPU and 2 GB of memory to evaluate the performance of the SDP approach, with a time consumption of 180 seconds. ADVISOR is a set of models, data, and script text files for use with MATLAB and Simulink. The modeling parameters of the parallel HEV are shown in Table 2. We define a torque-split ratio TSR = T_e/T_dem to quantify the positive power flows in the powertrain and to analyze the EMS performance. Four operation modes are defined for positive required torque: pure electric traction mode (TSR = 0), hybrid traction mode (0 < TSR < 1), pure engine traction mode (TSR = 1), and battery charging mode (TSR > 1).
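The four modes follow directly from the value of TSR, as the short sketch below shows. The function name `operating_mode`, the tolerance used to compare floating-point values with 0 and 1, and the choice to return None for non-positive torque demand are assumptions made for illustration.

```python
def operating_mode(t_e, t_dem, tol=1e-6):
    """Classify the powertrain mode from TSR = T_e / T_dem.

    The modes are only defined for positive torque demand; None is returned
    otherwise (for example, during regenerative braking).
    """
    if t_dem <= 0:
        return None
    tsr = t_e / t_dem
    if abs(tsr) <= tol:
        return "pure electric"
    if abs(tsr - 1.0) <= tol:
        return "pure engine"
    if 0.0 < tsr < 1.0:
        return "hybrid"
    if tsr > 1.0:
        return "battery charging"
    raise ValueError("negative engine torque is outside the four modes")


for t_e in (0.0, 50.0, 100.0, 120.0):
    print(t_e, operating_mode(t_e, 100.0))
```

In battery charging mode the engine delivers more than the demanded torque, and the surplus is absorbed by the motor acting as a generator.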

Simulation Results under the Urban Dynamometer Driving Schedule (UDDS)
We first ran simulations under the standard UDDS (Urban Dynamometer Driving Schedule) driving cycle to test the performance of the SDP EMS.
The TSR maps of the SDP EMS at driving wheel angular speeds of 8 rad/s, 20 rad/s, 40 rad/s, 60 rad/s, 80 rad/s, and 100 rad/s are shown in Figure 4a-f, respectively. Figure 4 shows that when SOC > 0.7 and T_dem < 180 N·m (a relatively small required torque), TSR = 0 and the vehicle is at low speed, so the drivetrain is in the electric-only traction mode. When SOC > 0.7 and T_dem ≥ 180 N·m, 0 < TSR < 1 and the vehicle is in the hybrid traction mode. When 0.5 ≤ SOC ≤ 0.7, the SOC has reached its lower limit and the drivetrain is in the engine-only traction mode. When SOC < 0.5, the SOC constraint is not met, TSR > 1, and the engine charges the battery regardless of the vehicle speed (from 20 rad/s to 100 rad/s). Figure 4b-f also shows that, for the same T_dem and regardless of vehicle speed, TSR increases as the SOC decreases. In other words, for a given T_dem, the power through the engine and the fuel consumption increase as the SOC decreases, in order to meet the battery constraints. When T_dem < 0, the EMS is simple: the motor operates in regenerative braking mode and recovers as much regenerative energy as possible within the constraints imposed by the motor and the battery, and the mechanical brake supplies the remainder.
The time histories of the SDP EMS simulation are shown in Figure 5. Figure 5 shows that the SDP EMS tends to keep the SOC within the range of 50%-65%, which guarantees efficient battery operation and prevents battery depletion. This SOC range also leaves enough capacity to sustain an extended period of battery discharge and enough headroom to absorb a long period of charging. The battery is thus maintained near a balance point that ensures charge sustaining. The simulation results show the reliability and viability of the SDP EMS.
Figure 6 shows the torque distribution trajectories from 160 s to 300 s, which further explain the benefits of the SDP EMS in improving fuel economy. The engine provides the cruising torque demand while the battery pack, through the motor, helps meet the peak torque demand. The engine output torque profile, between 40 N·m and 60 N·m, has a large constant region with little peaking, which is consistent with the quasi-static model of the engine.
Figure 7a,b depict the torque-speed operating points of the engine and the motor, respectively, using the rule-based EMS under UDDS. Figure 8a,b report those using the SDP EMS. Figures 7 and 8 show that the engine operating points of SDP lie in a higher-efficiency region than those of the rule-based method. Hence, the fuel consumption of SDP is lower than that of the rule-based method, and the SDP approach helps improve fuel economy and reduce emissions. Figure 8a shows that most of the engine operating points are in the 35.6%-38% efficiency range, with the engine torque between 40 N·m and 60 N·m. When higher torque is needed, the engine under SDP still performs well. In the rule-based EMS, the motor operating points are mainly concentrated in the low-efficiency region; in the SDP strategy, more of the motor operating points are located in the high-efficiency region, which shows that the SDP method achieves higher motor performance than the rule-based EMS.

Comparisons of Simulation Results under Different Driving Cycles
To test the robustness of the SDP controller, two additional driving cycles, 1015 and a new one, were also simulated. The new driving cycle, defined by the authors, consists of repetitions of three urban schedules of different natures (e.g., UDDS + 1015 + WVUCITY). The 1015 driving cycle used here is the one in ADVISOR2002, depicting the Japanese 1015 mode driving cycle, which represents an urban cycle on roads of zero or near-zero grade [45]. In the new driving cycle simulation, the transition probability matrix was built for each driving cycle individually.
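As an illustration of building a transition probability matrix for a given driving cycle, the following minimal sketch estimates one by counting transitions between discretized torque-demand levels. The function name, the bin edges, the sample data, and the uniform fallback for unvisited states are all assumptions made for illustration; the sketch also ignores any conditioning on vehicle or wheel speed and is not the paper's implementation.

```python
from collections import Counter


def estimate_transition_matrix(torque_demand, bin_edges):
    """Estimate P[i][j] = Pr(next bin = j | current bin = i) by counting."""

    def to_bin(value):
        # Return the first bin whose upper edge exceeds the value;
        # values beyond the last edge are clipped into the last bin.
        for k, edge in enumerate(bin_edges[1:]):
            if value < edge:
                return k
        return len(bin_edges) - 2

    n_bins = len(bin_edges) - 1
    states = [to_bin(t) for t in torque_demand]
    # Count every observed (current, next) pair along the cycle.
    counts = Counter(zip(states[:-1], states[1:]))

    matrix = []
    for i in range(n_bins):
        row_total = sum(counts[(i, j)] for j in range(n_bins))
        if row_total == 0:
            # Unvisited state: fall back to a uniform row (an assumption).
            matrix.append([1.0 / n_bins] * n_bins)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_bins)])
    return matrix


# Hypothetical torque-demand samples in N·m, not data from the paper.
demand = [5, 12, 30, 28, 45, 50, 22, 8, 15, 35]
edges = [0, 20, 40, 60]
P = estimate_transition_matrix(demand, edges)
```

Each row of `P` sums to one, so it can be used directly as the conditional distribution of the next torque-demand level. Repeating the estimate on each cycle's own torque-demand trace yields one matrix per cycle.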

A rule-based approach, the Parallel Electric Assist Control Strategy (PEACS), and a DP approach were also simulated and compared with the SDP simulation to evaluate the SDP performance through four aspects: fuel economy, engine efficiency, motor efficiency, and generating efficiency. The PEACS is a heuristic strategy defined in the ADVISOR documentation with five different operating modes. When the vehicle speed is below a preset minimum speed, the motor propels the vehicle alone. When the demand torque exceeds the maximum output torque of the engine, the motor provides an auxiliary torque. In the regenerative braking mode, the braking torque drives the motor to charge the battery. When the engine would run at low efficiency for the given rotational speed and demand torque, it is switched off and the motor drives the vehicle alone. At low SOC, the engine drives the motor to charge the battery. The cost function expressions and parameter selection in PEACS and the DP approach are the same as in the SDP strategy discussed above.
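The PEACS rules described above can be sketched as a simple mode selector. This is only an illustrative reading of the text: the function name, the mode labels, every threshold value, and the priority order in which the rules are checked are assumptions and are not taken from ADVISOR.

```python
def select_peacs_mode(speed, torque_demand, soc, engine_efficiency,
                      min_speed=5.0, max_engine_torque=80.0,
                      min_engine_efficiency=0.25, soc_low=0.4):
    """Return a mode label for one time step.

    All default thresholds are hypothetical placeholders.
    """
    if torque_demand < 0:
        # Braking torque drives the motor to charge the battery.
        return "regenerative_braking"
    if speed < min_speed:
        # Below the preset minimum speed the motor propels the vehicle alone.
        return "motor_only"
    if torque_demand > max_engine_torque:
        # Demand exceeds the engine maximum, so the motor adds torque.
        return "motor_assist"
    if soc < soc_low:
        # Low state of charge: the engine also drives the motor as a generator.
        return "engine_charging"
    if engine_efficiency < min_engine_efficiency:
        # The engine would run inefficiently, so it is switched off.
        return "motor_only"
    return "engine_only"


# Hypothetical operating point: moderate speed and demand, healthy SOC.
mode = select_peacs_mode(speed=20.0, torque_demand=40.0, soc=0.6,
                         engine_efficiency=0.3)
```

A fixed rule table of this kind is cheap to evaluate in real time, but its thresholds are tuned for a particular vehicle and do not account for the probability of future torque demand, which is what the SDP policy adds.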
Simulation results for UDDS, 1015, and the composite driving cycle are reported in Tables 3-5. The SDP strategy achieves clearly better results in both fuel consumption and component efficiency (engine efficiency, motoring efficiency, and generating efficiency) than the rule-based control strategy, PEACS. The SDP results differ from the global optimum obtained with DP by only a few percent.

Conclusions
EMS design for HEVs is a challenging problem due to the complex vehicle structure and uncertain driving conditions. Stochastic dynamic programming (SDP) is adopted to solve the EMS problem of a pre-transmission single-shaft torque-coupling parallel HEV, a configuration that enables effective motor assist operation. In SDP, the torque demanded by the driver is modeled as a one-state Markov process to represent the uncertainty of future driving situations. ADVISOR2002 simulation results under three driving cycles, UDDS, 1015, and a self-defined one (UDDS + 1015 + WVUCITY), indicate that the SDP EMS performs only slightly worse than the DP method, while engine efficiency and motor efficiency are greatly improved compared with a traditional rule-based strategy, PEACS. Therefore, the SDP approach has potential for real-time on-board control based on off-line computation, using a host computer/slave computer structure. The host computer is responsible for establishing the transition probability matrix and solving the problem. The slave computer, or embedded system, is responsible for data collection and regular updating of the energy management strategy.
The SDP in this paper is a near-optimal control strategy that considers only fuel economy, gear shifting, and SOC sustaining. Our future work will focus on aspects that could be traded off against fuel economy, e.g., PM emissions, engine noise characteristics, and other battery safety indicators.
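One way to picture the split between the host computer and the embedded slave computer is that the host solves the SDP problem off-line and exports a control table, which the embedded system only has to look up at run time. The sketch below assumes a two-dimensional table indexed by SOC and torque demand with nearest-grid-point lookup; the class name, grids, and table values are hypothetical and are not results from the paper.

```python
from bisect import bisect_left


def nearest_index(grid, value):
    """Index of the grid point closest to value (grid must be sorted)."""
    pos = bisect_left(grid, value)
    if pos == 0:
        return 0
    if pos == len(grid):
        return len(grid) - 1
    # Pick whichever neighbour is closer; ties go to the lower point.
    return pos if grid[pos] - value < value - grid[pos - 1] else pos - 1


class LookupPolicy:
    """Precomputed control table evaluated by nearest-grid lookup."""

    def __init__(self, soc_grid, torque_grid, table):
        self.soc_grid = soc_grid
        self.torque_grid = torque_grid
        self.table = table  # table[i][j]: control for soc_grid[i], torque_grid[j]

    def control(self, soc, torque_demand):
        i = nearest_index(self.soc_grid, soc)
        j = nearest_index(self.torque_grid, torque_demand)
        return self.table[i][j]


# Hypothetical table of control values (for instance, a torque split ratio).
policy = LookupPolicy(
    soc_grid=[0.4, 0.5, 0.6],
    torque_grid=[0.0, 40.0, 80.0],
    table=[[1.0, 1.0, 0.9],
           [0.0, 0.8, 0.7],
           [0.0, 0.6, 0.5]],
)
u = policy.control(soc=0.52, torque_demand=45.0)
```

Under this arrangement the run-time cost is two binary searches and one table read, and updating the strategy amounts to replacing the table with one recomputed on the host.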

Figure 1. Configuration of the parallel HEV drive train.

Figure 2. Transition probability of T_dem for standard driving cycle UDDS.

Figure 3. Overall schematic diagram of the SDP approach.

Figure 4. The TSR maps for SDP EMS under UDDS at different driving wheel angular speeds, ω_w.

Figure 7. Torque-speed operating points of engine and motor using the PEACS EMS.

Figure 8. Torque-speed operating points of engine and motor using the SDP EMS.

Table 1. Important parameters of the parallel HEV.