Towards Optimal Power Management of Hybrid Electric Vehicles in Real-Time: A Review on Methods, Challenges, and State-of-the-Art Solutions

In light of growing concern about limited energy sources and environmental degradation, it has become essential to search for alternatives to thermal engine-based vehicles, which are a major source of air pollution and fossil fuel depletion. Hybrid electric vehicles (HEVs), encompassing multiple energy sources, are a short-term solution that meets performance requirements and contributes to fuel saving and emission reduction aims. Power management methods, which regulate efficient energy flow to the vehicle propulsion, are core technologies of HEVs. Intelligent power management methods, capable of acquiring optimal power handling, accommodating system inaccuracies, and suiting real-time applications, can significantly improve powertrain efficiency at different operating conditions. Rule-based methods are simply structured and easily implementable in real-time; however, they achieve only limited optimality in power handling decisions. Optimization-based methods are more capable of achieving this optimality at the price of an increased computational load. In the last few years, these optimization-based methods have been developed further to suit real-time application using predictive, recognitive, and artificial intelligence tools. This paper presents a review-based discussion of these new trends in real-time optimal power management methods. More focus is given to the adaptation tools used to boost the optimality of these methods in real-time. The contribution of this work can be identified in two points: first, to provide researchers and scholars with an overview of different power management methods; second, to point out the state-of-the-art trends in real-time optimal methods and to highlight promising approaches for future development.


Introduction
Hybrid electric vehicles (HEVs) are considered an innovative solution towards clean and efficient transportation. Compared to conventional vehicles, HEVs offer lower fuel consumption, less hazardous emissions, and extended mileage. Therefore, they are receiving more attention from scholars, industry, and governments. Consequently, the development of hybrid powertrains has advanced rapidly during the last two decades [1].
Hybrid powertrains, encompassing two or more on-board energy sources, deal with different energy forms to perform the required vehicle propulsion. Due to the complexity of such powertrains, multiple perspectives have to be considered in the development process to achieve the desired performance level, as illustrated in Figure 1. At the design level, the design requirements and solutions are conflicting and competing in nature. Therefore, the initial determination of design goals and solutions sets a baseline powertrain performance for the operation level. However, implementing efficient control schemes, i.e., power management methods, at the operation level can significantly improve fuel consumption and emission rates, mileage extension, and operation robustness [2]. The interdependency between these perspectives makes it necessary to determine an efficient development strategy that achieves a higher impact on vehicle performance and minimizes tuning cost and time. In this context, a conceptual review of reported strategies leads to the conclusion that the development of HEVs falls into three main categories: design configuration, component sizing, and power management methods [3]. Design configuration and component sizing expend more effort to find the optimal driveline topology and the corresponding components, i.e., engine/motor size and capacity of the energy storage systems [4,5]. However, power management, which defines efficient control schemes, has proved to have a higher impact on powertrain performance at lower cost, with the ability to target multiple objectives simultaneously or selectively [6].
The power management of HEVs can be defined as a set of algorithms regulating powertrain operation based on measured inputs and controlled outputs to achieve predefined process goals [7]. The objective of such control algorithms is the goal or set of goals to be attained instantaneously or over a specified driving time. Power management objectives include fuel consumption minimization, on-board charge sustenance/depletion, emission reduction, driveability, and component lifetime maximization. Recently, further objectives have been considered, such as smooth gear shifting, minimal driveline vibration, and the handling and ride characteristics of the vehicle. The power management method is required to achieve the defined objectives and to accommodate changes in vehicle and driving conditions [8].
The mathematical formulation of control methods in terms of inputs, outputs, objectives, and constraints is referred to as the power management problem. The solution method, number of targeted objectives, and availability of inputs (measured/estimated) affect the applicability of these methods in real-time and the optimality of the solution. In this context, power management methods are classified into two main categories: rule-based and optimization-based methods as illustrated in Figure 2. In rule-based methods, the control law is defined by deterministic on-off rules or by fuzzy-logic rules. Optimization-based methods implement optimal control approaches to find a global optimal solution. This optimality can be obtained using backward calculation in global optimization methods; however, in real-time, instant-wise control decisions are taken based on estimated future cost minimization [9].

Problem Statement
The conceptual evaluation of each power management category reveals that rule-based (RB) strategies are characterized by lower computational requirements and higher applicability in real-time, but the solution provided is non-optimal over the specified drive range. In contrast, optimization-based methods are sophisticated approaches, capable of finding the global optimal solution to the power management problem. However, full a priori knowledge about the driving conditions should be available, which is not the case in real applications. Briefly, a real-time-applicable control method should be able to search and find power handling decisions within the runtime limits of the on-board controller. Defining power management methods that are able to approach the global optimal solution in real-time is the state-of-the-art challenge in HEVs.
To adapt the basic power management methods to real-time application, further tools are integrated into them to provide short-time prediction, code simplification, or situation identification in real-time. These tools, here denoted as "subsidiary tools", include pattern recognition, prediction/estimation techniques, multi-rate computing, and intelligent traffic systems (ITS). A non-trivial balance between excessive complexity and solution optimality has to be achieved while integrating these tools into power management methods.

Contribution and Novelty
The main focus of this work is to review and analyze how solution optimality can be achieved in real-time power management of HEVs. To this aim, a review on power management methods in terms of mathematical formulations, objectives, and problem constraints is introduced. This review points out the pros, cons, and challenges in each method. Moreover, an in-depth analysis about the state-of-the-art trends to approach optimal solutions in real-time is given. Using categorized tables, this analysis illustrates how the subsidiary tools have been integrated into the basic power management methods.
This contribution presents, for the first time, a special insight into real-time power management methods and reveals the associated innovative approaches. Through the comparative analysis of real-time methods and integrated subsidiary tools, this work aims to help researchers identify the promising approaches as well as the open research questions in this field. This paper is organized as follows: Rule-based and optimization-based algorithms are introduced in Sections 2 and 3, respectively. The analysis of optimal real-time methods, with more focus on the state-of-the-art trends, is given in Section 4, followed by the conclusion in Section 5.

Rule-Based Methods
Rule-based (RB) methods are heuristic strategies in which the control law is defined as a set of "if-then" rules to determine the control action [10]. In HEVs, RB methods are formulated using human expertise, intuition, operation boundaries, and safety considerations. The main advantage of these methods is the low computational requirements; therefore, they are widely used in several commercial vehicles like the Toyota Prius and the Honda Insight [11]. Rule-based methods are classified into: deterministic and fuzzy RB algorithms. A brief explanation of each method is given in the sequel.

Deterministic Rule-Based Methods
Deterministic RB methods are formulated in terms of fixed rules. Briefly, engine shut-off, power-split, or battery charging commands are regulated by algorithms wherein the operation limits are already prescribed [10]. Deterministic RB methods are explained as follows. The first is the classical on/off control method, based on defining allowed working limits for the driveline components. The main objective of the control is to keep the engine, electric motor, and battery working in specified operating ranges [12]. Overall system efficiency cannot be guaranteed under unplanned power demand. However, this method is widely used due to its low computational requirements [13].

Power Follower (Baseline) Control Strategy
This is a well-known strategy that reformulates the on/off algorithms into a more advanced yet applicable strategy. The main idea is to define different case-based control modules for the power management. In [14], four cases have been established for the engine on/off status based on two levels of the battery's state of charge $SoC_b$ and the engine power, as illustrated in Figure 3.
Figure 3. Engine on/off rules for power follower strategy (according to [14]).
The engine operating point is mapped to leveled optimal points according to the power demand $P_d$ and the available power from the battery $P_b$. The simulation results of [14] show an improvement in engine power efficiency; however, the overall fuel economy could not be optimized compared to similar powertrains. This illustrates the main disadvantage of such engine-oriented control schemes.
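The case-based structure of such a strategy can be illustrated with a minimal sketch. The thresholds `soc_low`, `soc_high`, and `p_min`, as well as the exact case boundaries, are hypothetical placeholders chosen for illustration; they are not the values or rules reported in [14].

```python
def engine_command(soc, p_demand, engine_on,
                   soc_low=0.4, soc_high=0.8, p_min=10.0):
    """Return True if the engine should run in the next step.

    All thresholds are hypothetical placeholders, not values from [14].
    soc       : battery state of charge in [0, 1]
    p_demand  : driver power demand in kW
    engine_on : engine status in the previous step
    """
    if soc < soc_low:
        return True        # battery depleted: engine must run
    if p_demand >= p_min:
        return True        # demand is high enough for efficient engine operation
    if soc > soc_high:
        return False       # battery nearly full and low demand: drive electrically
    return engine_on       # intermediate SoC, low demand: keep the previous state
```

Keeping the previous engine state in the intermediate band acts as a hysteresis, which avoids frequent engine restarts when the SoC hovers around a single threshold.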

Modified Power Follower-Adaptive RB (ARB)
In this method, a step-wise decision-taking process is performed. According to the current operating condition, a range of acceptable operating points $S_{1 \ldots n}$ is defined for the next transition based on the driver demand and constraints, as shown in Figure 4. The selection of the next point is based on instantaneous cost function minimization, considering the predicted future cost associated with each point.

A well-known example [15] is based on further development of the baseline method applied in the ADVISOR simulator [16]. In this work, the candidate points are evaluated based on normalized constituent cost factors, user and target weighting, and final overall impact minimization. The method includes emission reduction in the optimization objectives; hence, NOx emissions have been decreased by 27% at the cost of a slight increase of 1.4% in energy consumption.

Frequency-Based Approach
The idea of using frequency analysis in power management is based on decomposing the power demand into high- and low-frequency components, as shown in Figure 5. Super/ultra-capacitors are characterized by relatively high power and low energy capacity; therefore, they are more suitable to take over the fast load dynamics. In multi-source HEVs, batteries hold a balanced power/energy capacity and hence are assigned to the moderate load dynamics. The low-frequency components are assigned to sources like engines or fuel cells to mitigate the aggressive transients of the load [17]. A hardware-in-the-loop validation of this method is conducted using an experimental HEV. The results have been compared to the thermostatic strategy, stating a fuel economy improvement of 5.9%, a soot emissions reduction of 62.7%, and a reduction of high current demand to the battery that lowered its average operating temperature by 3 °C. The battery life is then estimated using an Ah-processed model, revealing a reduction of 23% and consequently an extended lifespan [18].
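As a rough illustration of the decomposition idea, the following sketch splits a sampled power demand into three bands using two first-order low-pass filters. The filter type and the smoothing constants `alpha_slow` and `alpha_mid` are assumptions made here for illustration; the cited works may use different filters and cut-off frequencies.

```python
def low_pass(signal, alpha):
    """First-order low-pass filter (exponential smoothing), alpha in (0, 1]."""
    out, y = [], signal[0]
    for x in signal:
        y = y + alpha * (x - y)
        out.append(y)
    return out


def split_demand(p_demand, alpha_slow=0.05, alpha_mid=0.3):
    """Split the demand into slow, moderate, and fast components.

    alpha_slow and alpha_mid are hypothetical tuning constants.
    """
    slow = low_pass(p_demand, alpha_slow)        # engine or fuel cell
    mid_band = low_pass(p_demand, alpha_mid)
    battery = [m - s for m, s in zip(mid_band, slow)]        # moderate dynamics
    supercap = [p - m for p, m in zip(p_demand, mid_band)]   # fast transients
    return slow, battery, supercap
```

By construction, the three components always sum to the original demand, so the split only redistributes the load between sources and never changes the delivered power.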

Optimal Points Tracking
This method concurs with the baseline approach in that both are targeting optimal engine operating conditions as a primary source; however, these optimal points are here targeted at higher precision using a prescribed engine map. The application of these optimal points is performed at a higher hierarchical level, then the power splitting between secondary power sources is regulated according to the given available power and charging/discharging efficiency of the secondary sources.
In [19], arbitrary local optimal points have been selected according to the emissions and efficiency maps of the engine, achieving a 22.9% improvement in fuel economy. This principle is developed into a new definition of the optimal operation line for engines in HEVs [20]. The new definition is based on merging constant battery power lines as contours into the engine map; a new optimal line is then found on these contours based on the solution of the corresponding optimization problem. The obtained power split solutions showed a qualitative fuel consumption improvement compared to the conventional line tracking method.

Fuzzy Rule-Based Methods
Fuzzy control is introduced to the power management in HEVs offering the advantages of output proportionality to different operating conditions, ease of fuzzy rules tuning, and robustness to modeling errors and inaccurate measurements. Fuzzy-based approaches are explained more in the sequel.

Basic Fuzzy
In this method, the controller performs the well-known basic steps of fuzzy logic. First, the inputs are fuzzified into membership functions, wherein human- and expertise-based rules are used to compute the fuzzy output. Finally, the output values are defuzzified into proportional control signals. It has been introduced to the power management of HEVs using the load leveling principle [21].
Offline control parameters optimization substantially improves the performance of fuzzy methods [22]. The investigative analysis of the results revealed that the method is adaptive to initial SoC and road-grade change achieving fuel consumption improvement of 20% compared to the conventional RB method [22].
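The fuzzification, rule evaluation, and defuzzification steps can be sketched for a single input as follows. The triangular membership functions, the three linguistic SoC levels, and the singleton rule outputs are hypothetical choices for illustration and are not taken from [21] or [22].

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def engine_share(soc):
    """Share of the power demand assigned to the engine, in [0, 1]."""
    soc = min(max(soc, 0.0), 1.0)
    # Fuzzification: degree of membership in each linguistic SoC level.
    mu = {
        "low": tri(soc, -0.5, 0.0, 0.5),
        "medium": tri(soc, 0.0, 0.5, 1.0),
        "high": tri(soc, 0.5, 1.0, 1.5),
    }
    # Rule base (hypothetical): the lower the SoC, the more the engine delivers.
    rule_output = {"low": 1.0, "medium": 0.6, "high": 0.2}
    # Defuzzification: weighted average of the rule outputs.
    total = sum(mu.values())
    return sum(mu[k] * rule_output[k] for k in mu) / total
```

The output varies continuously with the SoC, which reflects the output proportionality mentioned above, in contrast to the switching behavior of deterministic on/off rules.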

Adaptive Fuzzy
The performance of basic fuzzy methods can be further enhanced if the control parameters are adaptive to the current operating conditions. In [23], an intelligent situation awareness agent (IEMA) is integrated to the fuzzy-based torque distribution algorithm. The driving conditions are classified based on road type and driver behavior. A learning vector quantization (LVQ) is implemented to determine the driving conditions using a limited duration of data. The results in [24] proved that IEMA is a successful extension of the basic fuzzy method for overall performance improvement.
On the other hand, fuzzy logic itself is applied for drive cycles classification into five sub-patterns [25]. An RB strategy is structured and optimized using DP for every pattern, then the driving data over 100 s are used for online pattern recognition. The achieved energy saving results showed that the developed adaptive method outperforms the basic fuzzy method.

Predictive Fuzzy
In addition to current driving condition recognition, a prediction of upcoming scenarios can be integrated into the fuzzy controller to achieve near optimal results. In [26], traffic and road type data are used to acquire knowledge about near future driving conditions. Furthermore, a protective state-of-health extension strategy is integrated into the method at the price of relatively higher fuel consumption and emissions.
A versatile perceptive model of driving conditions is developed based on four traffic congestion levels [27]. Fuzzy logic parameters are optimized offline for every condition using a genetic algorithm (GA). Simulation results revealed that this approach is better suited to real-life application than optimizing these parameters for the whole drive cycle.

Optimization-Based Methods
In these methods, the target of power management is to minimize the operation cost over the considered time span. From this perspective, these methods fall into two categories: global optimization and real-time optimization. Recent bibliometrics reveal that optimization-based methods attract more research attention (56.7%) than rule-based methods (32.9%) [11]. In the following, more details about each category are given.

Global Optimization
Global optimization methods are designed to attain the global optimum solution for the whole trip based on a priori knowledge about upcoming driving conditions. The real-time application of these methods is limited due to augmented computational load; however, they can serve as a benchmark solution to analyze, tune, and evaluate other methods [28]. The application of these methods can be conceptually categorized into dynamic and static optimization.
In dynamic optimization, the control parameters are decisively determined. The optimal values for these parameters at each time step are searched using backward calculation to minimize the overall cost function [11]. On the other hand, static optimization aims to find a fixed optimal value for each control parameter that yields balanced results at various operating conditions. To this aim, gradient-free methods are preferable where Lipschitz condition, i.e., existence and uniqueness of an optimal solution, cannot always be fulfilled [29]. A brief discussion on global optimization methods is presented in the following.

Linear Programming
Nonlinear models of hybrid powertrains can be simplified to formulate a less complex power management problem. Linearized models can be easily solved by several solvers, and they achieve near-optimal results at reduced computational effort. In [30], the convex power management problem is reduced to a linear one by considering constant battery efficiency and neglecting power bus voltage ripples and engine transients, in order to find the minimum fuel consumption of a series propulsion system. In [31], the optimal power-split problem is formulated as a mixed-integer nonlinear optimization problem as: subject to where $T$ denotes the trip time, $N$ the number of discrete power levels, $C$ the gap threshold between the initial and minimum battery SoC, $P_k$ the power demand, and $P_{e,i}$ and $\eta_{e,i}$ the discrete power level and associated efficiency. By precalculating $\Delta SoC$ for every discrete power level in terms of maximum, minimum, and initial SoCs as: the optimization constraint in Equation (2) can be linearized, thus reducing the problem to a mixed-integer linear programming one. Henceforth, numerous efficient solvers can be used to find the global optimal solution. The obtained fuel saving results showed an improvement of 10-15% compared to the binary-mode strategy [31]. Nevertheless, because complex powertrain models cannot always be linearized, the application of this method is still limited.

Dynamic Programming (DP)
Dynamic programming can be applied to solve different optimization problems in discrete form based on Bellman's principle of optimality [32]. The discretized formulation considers a finite number of solutions $u_k$ at each time step $k$ of the driving time span. The optimal solution is searched sequentially by evaluating the cost function $J_\pi$ of the states $x_k$ backwards in terms of the instantaneous cost $g_k(x_k, u_k)$ and the remaining cost $J_{k+1}$. The constraints of the optimization function can be considered using a penalty term $\phi_{k,F}$ at every step $k$ and at the final step $F$, respectively [33]. The cost function in discrete form can be written as: For the deterministic problem (DDP), the instantaneous cost function can be defined as: where $P_i$ is the individual power consumption for the $Y$ driveline components. However, the next state $x_{k+1}$ does not necessarily coincide with the predefined discrete states; the associated cost function $J_{k+1}$ should therefore be interpolated to the nearest discrete point [33]. In the stochastic problem (SDP), the future operating conditions are defined as a probability function $P$ for the transition from the current state $x_k$ to the next state $x_{k+1}$ [34]. This transition probability model can be described as a normal finite-state Markov model [33] or as a homogeneous one, where the future states depend only on the knowledge of the current state $x_k$ and not on the previous ones [34]. The SDP problem can then be formulated as: where $\lambda$ is a discount factor to ensure that $J_{\pi_i}$ converges to a predefined limit. The cost is evaluated backwards for $k = N-1, N-2, \ldots, 0$ while checking for convergence during the iteration. The iteration terminates if a predefined convergence limit is achieved or if the iteration index $i$ reaches $N$ complete steps. The solution obtained by SDP is inferior to that of DDP due to the limited knowledge of upcoming driving conditions, which leads to non-optimal policies $\pi_i$ [34].
Nevertheless, due to higher computational load of DP and the fact that the solution is exclusively optimal for a given driving cycle, it is used as a benchmark solution.
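The backward evaluation of the deterministic problem, including the interpolation of the remaining cost between grid points, can be sketched as follows. This is a minimal illustration under stated assumptions: a single state (battery SoC), a quadratic placeholder `fuel` cost, a hypothetical battery `capacity`, and a terminal penalty on the deviation from a target final SoC. None of these modeling choices are taken from [33].

```python
import bisect


def interp(grid, values, x):
    """Linear interpolation of values over a sorted grid, clamped at the ends."""
    i = min(max(bisect.bisect_right(grid, x) - 1, 0), len(grid) - 2)
    t = min(max((x - grid[i]) / (grid[i + 1] - grid[i]), 0.0), 1.0)
    return (1.0 - t) * values[i] + t * values[i + 1]


def fuel(p_eng):
    """Hypothetical convex fuel cost of delivering p_eng (arbitrary units)."""
    return 0.01 * p_eng ** 2


def dp_power_split(demand, soc_grid, u_options, capacity=10.0,
                   soc_final=0.5, penalty=1e3, eps=1e-9):
    """Backward DP; returns the cost-to-go at k = 0 and the policy table."""
    # Terminal cost: penalize deviation from the target final SoC.
    J = [penalty * abs(s - soc_final) for s in soc_grid]
    policy = []
    for k in range(len(demand) - 1, -1, -1):       # backward in time
        J_new, u_best = [], []
        for s in soc_grid:
            best_cost, best_u = float("inf"), None
            for u in u_options:                    # u: battery power (+ discharge)
                s_next = s - u / capacity
                p_eng = demand[k] - u
                if (s_next < soc_grid[0] - eps or s_next > soc_grid[-1] + eps
                        or p_eng < 0.0):
                    continue                       # infeasible transition
                # Instantaneous cost plus interpolated remaining cost.
                cost = fuel(p_eng) + interp(soc_grid, J, s_next)
                if cost < best_cost:
                    best_cost, best_u = cost, u
            J_new.append(best_cost)
            u_best.append(best_u)
        J = J_new
        policy.append(u_best)
    policy.reverse()
    return J, policy
```

The nested loops over time steps, states, and controls make the growth of the computational load with grid resolution directly visible, which is the main obstacle to real-time use noted above.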

Genetic Algorithm (GA)
Genetic algorithm is a metaheuristic method inspired by evolution mechanisms. It initially postulates a set of solutions (chromosomes) as the first population. The solutions obtained from this first population are evaluated with respect to an objective fitness function. The best solutions are given a higher chance to develop (grow) and form the next generation. This procedure is repeated up to fulfillment of the stopping criteria (solution convergence or maximum number of iterations). The mechanism in which the next generation is established is regulated by the operators of crossover, mutation, migration, and extinction [35].
As a derivative-free algorithm, GA explores the solution space more thoroughly and avoids being trapped in a local minimum. Therefore, it has proved suitable for unconstrained multi-objective optimization problems in HEVs [36]. Operational constraints are handled by penalty functions that degrade infeasible solutions and reduce their fitness [36]. In addition, to prevent premature convergence of the solution during the selection phase, simulated annealing (SA) is typically applied to the algorithm [37].
Non-dominated sorting GA (NSGA-II) includes both subpopulations of parents $P_i$ and offspring $Q_i$ in the sorting mechanism. The individuals are evaluated according to their rank, which indicates the convergence to the optimal Pareto set, and the crowding distance, which reflects the solution diversification [38,39]. The computational load is then reduced to $O(MN^2)$ instead of $O(MN^3)$ in GA, as shown in Figure 6. However, GA does not give designers insight into the powertrain behavior during the optimization; therefore, expertise-based rule extraction is not practical.
Figure 6. Illustration of multi-objective optimization steps using NSGA-II.
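A minimal single-variable sketch of the selection, crossover, and mutation loop with a penalty-handled constraint is given below. The objective, the constraint `x <= 0.6`, the penalty weight, and all algorithm settings are hypothetical; migration, extinction, and simulated annealing are omitted for brevity.

```python
import random


def ga_minimize(cost, bounds, pop_size=30, generations=60, p_mut=0.2, seed=0):
    """Minimize cost(x) over a scalar x in bounds with a simple elitist GA."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        elite = pop[: pop_size // 2]               # selection: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            w = rng.random()
            child = w * a + (1.0 - w) * b          # arithmetic crossover
            if rng.random() < p_mut:
                child += rng.gauss(0.0, 0.1 * (hi - lo))   # mutation
            children.append(min(max(child, lo), hi))
        pop = elite + children
    return min(pop, key=cost)


def penalized_cost(x):
    """Hypothetical objective with a penalty for violating x <= 0.6."""
    objective = (x - 0.7) ** 2
    violation = max(0.0, x - 0.6)
    return objective + 100.0 * violation
```

A call such as `ga_minimize(penalized_cost, (0.0, 1.0))` searches the interval for the best feasible value; because the best half of each generation is carried over unchanged, the best cost found never deteriorates from one generation to the next.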

Optimal Control Theory
In optimal control theory, the control law is defined for the given system such that the optimization goals are met. It is based on introducing Lagrangian/costate multipliers to the cost function according to the problem's constraints, so that a sufficient condition for optimality can be found from the Hessian of the Lagrangian $H(L)$. Referring to the described system in Equation (6), the cost function can be defined as: where the optimization constraints are defined in $L$ and $u(t)$ denotes the control parameters vector. The nonlinear control path constraints and boundary conditions can be defined explicitly [40]. At lower computational effort, the application of optimal control methods in a fuel cell HEV outperformed the fuel economy obtained by an equivalent cost minimization strategy (ECMS) and exhibited a smoother profile of the battery's SoC, which is a further advantage of optimal control [41].
Nevertheless, the estimation of initial Lagrangian parameters L(0) (similarly initial costate or Hamiltonian) has a non-negligible monotonic effect on the solution convergence if the drive cycle is not given. To bring this method into real-time application, continuous adjustment of L(0) is needed to achieve acceptable target values [41,42].

Particle Swarm Optimization (PSO)
Inspired by intelligent swarm behavior, PSO is introduced as an optimization method for nonlinear dynamic systems. In this optimization problem, solution parameters are defined within a multi-dimensional space. In this space, a swarm of particles (solutions) is generated as an initial set of solutions. The movement (value change) of each particle can be characterized by two variables: position and speed; then adjusted for the best values for the individual particles and the whole swarm as a group. The particle's movement is then oriented to follow the best direction with respect to the group [43].
In [44], the initial solution is chosen based on a revised RB strategy. In this case, the optimal values are searched using PSO and SA for six different drive cycles. In [45], five initial solutions are set up. Considering a limit of 100 iterations, the best particle position at the last step denoted the optimal values of control parameters. However, the aforementioned contributions gave the insight that PSO suffers from limited solution optimality (based on N iterations) and improper handling of solution divergence. These issues are addressed in [46] using dynamically changing inertia weight (DCWPSO) by introducing a dynamic factor w = f (a, e), where e and a denote the particle evolution and convergence respectively. This means that the range of position and speed change of each particle was adapted to the solution closeness to global optimality achieving 15.8% reduction of fuel consumption compared to normal PSO.
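The position and speed update described above can be sketched for a scalar parameter as follows. This is the standard PSO update with a constant inertia weight `w`, not the dynamically changing weight of [46]; the coefficients `c1` and `c2` are common textbook-style settings assumed here, while the defaults of five particles and 100 iterations merely echo the setup mentioned for [45].

```python
import random


def pso_minimize(cost, lo, hi, n_particles=5, iterations=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost(x) for a scalar x in [lo, hi] with a basic PSO."""
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    p_best = pos[:]                       # best position of each particle
    g_best = min(pos, key=cost)           # best position of the whole swarm
    for _ in range(iterations):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Speed update: inertia + pull to personal best + pull to swarm best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (p_best[i] - pos[i])
                      + c2 * r2 * (g_best - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lo), hi)
            if cost(pos[i]) < cost(p_best[i]):
                p_best[i] = pos[i]
            if cost(pos[i]) < cost(g_best):
                g_best = pos[i]
    return g_best
```

Replacing the constant `w` by a value that depends on the swarm's evolution and convergence would be the natural place to introduce an adaptive inertia weight of the kind discussed in [46].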

Further Methods
Many other methods are introduced into the power management problem. However, most of them are still in the early development and exploration phase. Based on an integrated use of orthogonal polynomials in DIRECT method [47], a method called pseudo-spectrum is developed to enhance the direct collocation algorithm [48]. The results obtained could achieve relatively higher accuracy of the optimal solution point; however, optimality of the point itself is still not guaranteed.
Game theory (GT) is a mathematical approach, originally introduced in economics, that depends on learning, understanding, and predicting human behaviors [49]. The behavior of different energy sources in HEVs can be modeled as a multi-agent game system, where the Nash equilibrium between these agents denotes an optimal (balanced) power split ratio [50]. The mathematical computation of GT is simpler than that of other methods, and the solution is relatively less dependent on the drive cycle; however, it is less applicable to vehicular control schemes due to the dependency on human behavior expectation and the nonlinearity of complex powertrain models.
Space exploration and unimodal region elimination (SEUMRE) is a stochastic method based on spreading sampling points to explore the solution space and predict where the global solution may exist. The results converge to a final solution of highly nonlinear problems faster than SA and GA; however, the obtained solution is suboptimal [51].

Real-time Optimization
Real-time optimization methods apply instantaneous power handling policies to minimize the cost function based on future equivalence assumptions of the energy consumption. In general, the mathematical formulation of these methods should be suitable for real-time application in terms of computational requirements and memory resources [11]. A brief review of these methods is given in the sequel.

ECMS
Equivalent cost minimization strategy (ECMS) is a well-established method that converts the on-board electric energy depletion into an equivalent fuel consumption using equivalence factors and predicts the future cost of compensating this energy [52]. The key challenge in ECMS is the estimation of these equivalence factors considering the individual components' efficiencies and the transient dynamics of the power sources. In the early version of ECMS, powertrain components have been assumed to have constant (mean value) efficiencies; the cost $C_{tot}(k(t), T_e(t))$ is then defined as:
$$C_{tot}(k(t), T_e(t)) = C_{ICE} + C_{eq},$$
where $C_{ICE}$ and $C_{eq}$ are the real engine fuel consumption and the electric motor equivalent fuel consumption, respectively. The design variables of this problem are the torque demand by the driver $T_{th}(t)$ and the gear number $k(t)$. The equivalence of electric energy is calculated considering the different charge/discharge processes of the battery as:
$$C_{eq} = \begin{cases} \dfrac{SFC_{rech} \cdot P_e(\omega_e, T_e)}{\bar{\eta}_e \cdot \bar{\eta}_{batt} \cdot 3.6 \cdot 10^{6}} & \forall\, T_e < 0, \\[2ex] \dfrac{SFC_{dis} \cdot P_e(\omega_e, T_e) \cdot \bar{\eta}_e \cdot \bar{\eta}_{batt}}{3.6 \cdot 10^{6}} & \forall\, T_e \geq 0, \end{cases}$$
where $SFC_{rech}$ and $SFC_{dis}$ are the mean specific fuel consumption for the recharge and discharge cases, respectively, $\bar{\eta}_e$ and $\bar{\eta}_{batt}$ are the mean efficiencies of the electric motor and the battery, and $P_e$ is the motor power at torque $T_e$ and speed $\omega_e$. Finally, the instantaneous optimal control problem is defined to minimize the total cost $C_{tot}(k(t), T_e(t))$. The obtained results revealed the ability of ECMS to yield a near-optimal solution compared to DP at lower computational requirements [53]. Because the estimation of the equivalence factors directly influences the performance of ECMS, this point is further investigated from two perspectives: finding optimal static values or dynamically updating the estimated value [11]. The former is a typical static optimization problem that can be solved using DP or GA [54]. The latter can be classified into three subcategories: first, utilizing a correction term for the offline optimized factor using feedback control [55]; second, considering an equivalence factor function in terms of two optimized values for the charging/discharging cases and a probability factor based on the current and expected electric energy depletion [56]; third, implementing a multi-dimensional look-up table (LUT) using the current information of vehicle load [57], position [58], speed [59], or trip length [60].
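The instantaneous minimization at the core of ECMS can be illustrated with a simplified sketch. Note that this sketch uses a single dimensionless equivalence factor `s` applied to the battery power, which is a common simplified form and not the SFC-based formulation of the early ECMS version described above. The constant-efficiency engine model, the heating value `LHV`, and the candidate set are assumptions for illustration.

```python
LHV = 42.5e6  # J/kg, assumed lower heating value of the fuel


def fuel_rate(p_eng, efficiency=0.35):
    """Hypothetical engine model: constant efficiency, fuel mass flow in kg/s."""
    return p_eng / (efficiency * LHV)


def ecms_split(p_demand, s, candidates):
    """Pick the battery power (W) that minimizes the equivalent fuel rate.

    s          : equivalence factor weighting electric energy against fuel
    candidates : admissible battery powers (positive = discharge)
    """
    best_u, best_cost = None, float("inf")
    for p_batt in candidates:
        p_eng = p_demand - p_batt
        if p_eng < 0.0:
            continue                               # engine cannot absorb power
        # Real fuel rate plus fuel-equivalent rate of the electric power.
        cost = fuel_rate(p_eng) + s * p_batt / LHV
        if cost < best_cost:
            best_u, best_cost = p_batt, cost
    return best_u
```

With this linear engine model, a small `s` makes electric energy cheap and favors discharging, while a large `s` favors recharging, which shows why the estimation of the equivalence factor dominates the behavior of the strategy.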

Pontryagin's Minimum Principle (PMP)
As a special case of the Euler-Lagrange equation, PMP is suitable for solving state-constrained problems in real-time under some reasonable postulates. Referring to the optimal control problem in Equations (6)-(9), the Hamiltonian $H$ can be defined for all $t \in [0, t_f]$ as: where $\lambda$ is the Lagrangian multipliers vector defined as $\dot{\lambda}^T = -\partial H/\partial x$, whose elements are the costate variables of the system. The condition for optimality states that there is an optimal control $u^*$ that yields optimal state and costate trajectories $x^*$ and $\lambda^*$, respectively, and satisfies: in the predefined time span. Assuming in the discrete form that $T_f$ is known a priori, i.e., $\partial H/\partial t = 0$, the problem becomes much simpler and can be solved in real-time to achieve a near-optimal solution [61,62].
Hereby, the core factor of this method is the initial costate determination due to its direct impact on state trajectory and solution convergence to optimality. Dynamic correction of this value is realized by feedback control methods; for example, PI-control [63] including also observers [64] to reduce the estimation error. In approximate PMP (A-PMP), a simple convex approximation of the Hamiltonian could achieve fuel consumption reduction of 6.96% in a PHEV compared to the conventional PMP [65].
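For orientation, the textbook form of the PMP conditions can be written as below. This is a generic statement under assumed notation (system dynamics $f$, running cost $L$, admissible control set $U$), given as a sketch rather than as the exact equations of the formulation referenced above.

```latex
\begin{align}
  \dot{x}(t) &= f\bigl(x(t), u(t), t\bigr), \qquad
  J = \int_{0}^{t_f} L\bigl(x(t), u(t), t\bigr)\,dt, \\
  H(x, u, \lambda, t) &= L(x, u, t) + \lambda^{T} f(x, u, t), \\
  \dot{\lambda}^{T} &= -\frac{\partial H}{\partial x}, \\
  H\bigl(x^{*}, u^{*}, \lambda^{*}, t\bigr) &\le H\bigl(x^{*}, u, \lambda^{*}, t\bigr)
  \quad \forall\, u \in U,\; t \in [0, t_f].
\end{align}
```

In this form, the role of the initial costate becomes apparent: once $\lambda(0)$ is fixed, the costate equation and the pointwise minimization of $H$ determine the whole trajectory, so an error in $\lambda(0)$ propagates over the entire time span.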

Model Predictive Control (MPC)
This is an intelligent method that depends on a definite model of the dynamic system to predict the output behavior and then optimize the control parameters to achieve the desired output. Referring to Equation (8), for the discrete system description in state space: the steps of a typical MPC algorithm can be illustrated as follows: First, the manipulated output $y(x)$ is predicted stepwise ahead as $\hat{y}_n(k+1 \mid k)$ based on $n$ different controller sets $u_{1 \ldots n}(k)$. An optimization function is then applied in order for $\hat{y}$ to yield the desired output, as shown in Figure 7.
The corresponding optimization function can also be defined as a multi-objective one, in which Γ_l and B_l are weighting matrices. The output error arises from the unknown disturbance input d(t) and from inaccurate modeling of the plant P as P̂. Finally, the optimization horizon is shifted one step ahead and the procedure is repeated for the new horizon. More details on the output prediction steps and on solution methods for constrained MPC problems are given in [66,67].
In the absence of a priori knowledge about the driving cycle, future driving conditions are predicted using stochastic Markov chain models, neural networks (Hamming NN), and fuzzy logic methods [68][69][70]. Moreover, MPC can operate with a non-uniform sampling time, where small sampling intervals allow better stability, continuity, and smoothness of the power references and large sampling intervals (relaxation) are applied for long-term planning of the energy sources [71,72].

Figure 7. Working principle of MPC based on the "moving horizon" approach [66].
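The moving-horizon steps described above (predict the output for candidate control sets, select the best one, apply the first control move, shift the horizon) can be sketched as follows. The scalar plant parameters `A` and `B`, the discrete control levels, the horizon length, and the quadratic cost are hypothetical choices for illustration. The exhaustive enumeration stands in for the constrained solvers discussed in [66,67].

```python
import itertools
import numpy as np

A, B = 0.9, 0.5                        # assumed plant: x(k+1) = A x(k) + B u(k)
U_LEVELS = np.linspace(-1.0, 1.0, 9)   # assumed admissible control levels
HORIZON = 3                            # assumed prediction horizon (steps)

def predict(x, u_seq):
    """Predict the output over the horizon for one candidate control sequence."""
    outputs = []
    for u in u_seq:
        x = A * x + B * u
        outputs.append(x)
    return np.array(outputs)

def mpc_step(x, ref, w_u=0.01):
    """Return the first move of the cheapest candidate control sequence."""
    best_u, best_cost = None, np.inf
    for u_seq in itertools.product(U_LEVELS, repeat=HORIZON):
        y_hat = predict(x, u_seq)
        cost = np.sum((y_hat - ref) ** 2) + w_u * np.sum(np.square(u_seq))
        if cost < best_cost:
            best_u, best_cost = u_seq[0], cost
    return best_u

def run(x0=0.0, ref=2.0, steps=30):
    """Closed loop: optimize, apply the first move, shift the horizon."""
    x, history = x0, []
    for _ in range(steps):
        u = mpc_step(x, ref)
        x = A * x + B * u              # plant update (no disturbance modeled)
        history.append(x)
    return np.array(history)

outputs = run()
```

Because only a coarse set of control levels is enumerated, the output settles in a small band around the reference rather than exactly on it.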

Adaptive Dynamic Programming (ADP)
Dynamic programming, as an efficient optimization method, has been further modified to suit real-time application. Here, ADP refers to the adaptive methods in which the main DP algorithm is simplified, discretized, or reconstructed to reduce the computational load during real-time control. Intuitively, the simplified version yields a suboptimal solution; however, it outperforms many other real-time control methods [73].
Iterative DP (IDP) is an approach that uses relatively coarse grids for both the state and control vectors to determine the optimal policies for the next iteration [74]. These grids are refined through progressive iterations to achieve the global optimal fuel saving (29.76% compared to the RB method) with substantially fewer computational steps and lower memory requirements [75]. The optimal control strategy found by IDP can be fitted to real-time application using an Elman NN, achieving a suboptimal result of 24.6%. The drawbacks of this method are that the drive cycle must be known a priori and that small changes in traffic or ridership, even for fixed-route buses, can entirely change the driving profile [75].
In [76], a multi-rate DP is introduced such that slowly evolving dynamic states, e.g., battery thermal states, can be handled at different rates than faster ones such as load dynamics. In [77], a dual-scale DP solves two optimization problems based on an advanced traffic model. On the macro-scale, a global optimal SoC profile is found for the whole trip. In real-time, the route is divided into nearly equal segments, for which the micro-scale DP solves a limited-horizon optimization problem to find the optimal control parameters according to the target SoC profile. Artificial NNs (ANNs) are implemented in neural DP (NDP) to reinforce the optimal decision making of the algorithm [78]. However, the offline learning time of ANNs is relatively long. The principle of limited-horizon DP is further developed based on a generic definition of vehicle states in terms of the power demand and speed dynamics of the vehicle [73,79]. The results show better fuel economy than adaptive RB methods and are close to the global DP solution.
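The limited-horizon DP idea can be sketched as a backward recursion over a SoC grid. This is an illustrative sketch only and not the formulation of [73,77,79]. The convex fuel map, the grids, the terminal SoC penalty that pulls the solution towards a target SoC, and the battery capacity are all assumptions.

```python
import numpy as np

def fuel(p_eng):
    """Hypothetical convex fuel cost per step for engine power p_eng (kW)."""
    return 2.5 * p_eng + 0.02 * p_eng**2

def limited_horizon_dp(p_dem, soc_grid, u_grid, soc_target,
                       cap_kwh=1.0, dt_s=10.0, penalty=1e6):
    """Backward DP over one horizon; returns cost-to-go and first-step policy."""
    cost_to_go = 1e4 * (soc_grid - soc_target) ** 2   # terminal SoC penalty
    policy = np.zeros_like(soc_grid)
    delta_soc = u_grid * dt_s / 3600.0 / cap_kwh      # SoC drop per control
    for p in reversed(p_dem):
        soc_next = soc_grid[:, None] - delta_soc[None, :]
        stage = fuel(np.clip(p - u_grid, 0.0, None))[None, :]
        future = np.interp(soc_next.ravel(), soc_grid,
                           cost_to_go).reshape(soc_next.shape)
        total = stage + future
        # Penalize SoC limit violations and negative engine power.
        total = total + penalty * ((soc_next < soc_grid[0]) |
                                   (soc_next > soc_grid[-1]))
        total = total + penalty * ((p - u_grid) < 0.0)[None, :]
        policy = u_grid[np.argmin(total, axis=1)]
        cost_to_go = total.min(axis=1)
    return cost_to_go, policy

p_dem = [10.0, 30.0, 5.0, 25.0, 15.0, 20.0]   # assumed demand per step (kW)
soc_grid = np.linspace(0.4, 0.8, 41)
u_grid = np.linspace(-20.0, 20.0, 41)         # battery power candidates (kW)
cost, first_u = limited_horizon_dp(p_dem, soc_grid, u_grid, soc_target=0.6)
```

Since zero battery power is always among the candidates, the cost obtained at the target SoC cannot exceed the engine-only cost of the same horizon. The grid resolution sets the trade-off between computational load and the accuracy of the interpolated cost-to-go.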

Extremum Seeking (ES)
Extremum seeking is a model-free real-time optimization method that differs from classical control paradigms in that stability is not exclusively addressed in its formulation [80]. The method is based on stimulating unknown plants with periodic excitation inputs and then using the system output to probe the gradient and seek the minima/maxima at the zero-gradient locations [81]. As shown in Figure 8, the method employs a sinusoidal perturbation signal sin(ωt) to stimulate the nonlinear system f(θ) and then measures the corresponding output. This output is filtered using a washout filter and then multiplied by the same sinusoidal signal to estimate the static gradient of the plant f(θ). A final integrator is then applied to estimate a value θ̂(t) such that a zero gradient, i.e., a local minimum of f(θ), is achieved [81].
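The perturbation, washout, demodulation, and integration steps described above can be sketched in discrete time as follows. The static map `f`, the perturbation amplitude and frequency, the washout corner frequency, and the integrator gain are assumed values chosen for illustration. The washout filter is realized here as the output minus its first-order low-pass component.

```python
import math

def f(theta):
    """Hypothetical unknown static map with its minimum at theta = 2."""
    return (theta - 2.0) ** 2 + 1.0

def extremum_seeking(theta0=0.0, a=0.2, omega=10.0, omega_h=1.0,
                     gain=1.0, dt=0.01, t_end=100.0):
    """Perturbation-based extremum seeking; returns the final estimate."""
    theta_hat = theta0
    y_low = f(theta0)                   # low-pass state used for the washout
    for k in range(int(t_end / dt)):
        probe = math.sin(omega * k * dt)
        y = f(theta_hat + a * probe)    # stimulate the plant and measure
        y_low += dt * omega_h * (y - y_low)
        y_high = y - y_low              # washout (high-pass) output
        gradient_estimate = y_high * probe          # demodulation
        theta_hat -= dt * gain * gradient_estimate  # integrator (descent)
    return theta_hat

theta_final = extremum_seeking()
```

On average, the demodulated signal is proportional to the local gradient of `f`, so the integrator drives the estimate towards the zero-gradient point without using a model of the plant.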
The application of ES in HEVs has been studied comparatively using three filter types, namely first-order, high-pass, and band-pass filters, to regulate a charging/discharging policy for the batteries. The experimental validation was conducted on a HiL setup, and the results were near-optimal compared to the global optimal SoC profile found by DP. Moreover, the band-pass type outperformed the other two types, with a better ability to reduce the load dynamics applied to the fuel cell and hence to improve the durability of the energy storage system [82]. In [83], the gradient probing step is improved to yield an adaptive static map. To this end, two factors are used to modify the optimization surface of the static map such that the optimal points achieve the currently required goals, i.e., range extension, efficiency maximization, or maximum power delivery, based on route conditions and the next refueling stations. Comparison of the results with a basic RB strategy showed an improvement in energy efficiency of 1-2.1%, depending on the weighting factors.

Robust Control (RC)
The principle of RC is applied to control systems with uncertain or unknown models, disturbance inputs, noisy or inaccurate measurements, and high-order nonlinear dynamics in order to maintain higher stability and robustness of the system. The method is based on better exploration of the design parameters to achieve the desired output and to mitigate the system's sensitivity to the multiplicative effect of the aforementioned unknown inputs. H2, H∞, and linear matrix inequalities (LMI) are typical methods applied in the power management of HEVs [53,84].
In the power management of a parallel HEV using RC, the chemical energy (fuel in the tank), the electric energy depletion in terms of SoC_B, and the total power consumption defined in the state vector are used for the dynamic output-feedback controller [53]. Defining the command input u as the electric power share, the fuel consumption minimization problem can then be interpreted as keeping u as close as possible to the power demand. The obtained results were inferior to both the DP and ECMS approaches; however, better results could be obtained when the electric synergy is omitted in pure-diesel mode. The inherent disadvantages are the complexity of the problem formulation, heavy online computations, and the inability to yield near-optimal solutions [53].

Towards Optimality in Real-time
Based on the reviewed literature, the main trend in recent power management systems is to obtain optimal solutions in real-time. To this end, the basic methods have to be adapted to avoid excessive computation and entrapment in non-optimal solutions. In this section, more focus is given to the subsidiary adaptation tools used to enable optimal power management in real-time.

Subsidiary Adaptation Tools
The main drawbacks of classical RB methods are their fixed formulation and limited flexibility to suit different driving conditions. On the other hand, optimization-based methods are less applicable in real-time due to their higher computational load. Subsidiary adaptation tools are integrated into both classes of methods to address these drawbacks. The analysis of optimal real-time methods shows that these tools can be divided into five main categories, as illustrated in Table 1. Rules optimization is applied to RB methods (deterministic and fuzzy) to increase their robustness against variations in driving conditions. This can be done by searching for optimal values of the control parameters (quantitatively) or by defining the optimal structure of the control vector (qualitatively), i.e., the optimal set of control parameters that achieves balanced performance under different driving conditions. Multi-rate computing splits the optimization problem into two subproblems, each of which is assigned a suitable processing rate. This rate can be fixed for predefined cases (slow and fast) or adapted to the length of the look-ahead window to find instantaneous (short-term) or strategic (long-term) optimal solutions.
Pattern recognition is a well-known tool used to apply certain optimized control modules based on situation identification. Recognized patterns can be the driver style (e.g., calm or aggressive), the road type (urban or highway), or specific routes. Pattern recognition algorithms are formulated using fuzzy logic rules, NNs, or more sophisticated machine learning approaches. In addition, prediction of upcoming driving conditions is used to solve a limited-horizon optimization problem. Often, a Markov model is applied to foresee the change in the next driving conditions based on statistical data analysis. Furthermore, observers and Kalman filters are used to estimate the SoC behavior so that the power split strategy can be optimized accordingly.
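The Markov-based prediction mentioned above can be sketched as follows: a transition matrix is estimated from a recorded sequence of discretized driving states and then used to predict the distribution of the next state. The three state classes and the short recorded sequence are hypothetical and only serve as an illustration.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate a first-order Markov transition matrix from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for current, following in zip(states[:-1], states[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows that were never visited fall back to a uniform distribution.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1),
                    1.0 / n_states)

def predict_distribution(P, current, steps=1):
    """Probability of each state after the given number of steps."""
    start = np.zeros(P.shape[0])
    start[current] = 1.0
    return start @ np.linalg.matrix_power(P, steps)

# Assumed classes: 0 = low, 1 = medium, 2 = high power demand.
history = [0, 0, 1, 1, 2, 1, 0, 0, 1, 2, 2, 1]
P = transition_matrix(history, n_states=3)
next_probs = predict_distribution(P, current=1)
```

For this sequence, the most probable successor of the medium-demand state is the high-demand state. A power management strategy could weight its limited-horizon cost with such probabilities.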
Intelligent transportation systems (ITS) add more knowledge to the power management problem based on vehicular communication techniques. They are also referred to as Vehicle-to-everything (V2X) communication, which includes Vehicle-to-Vehicle (V2V), Vehicle-to-Grid (V2G), Vehicle-to-Infrastructure (V2I), and other forms. Such intelligent systems offer more information about the nearest refueling options, congested routes, traffic lights, and the shortest path to the destination. This information enables better planning of the power management and of the battery's charge depletion/sustenance policies.

Integration to Power Management Methods
To develop optimal real-time power management methods, two main requirements have to be fulfilled: reducing the computational load to suit vehicular control platforms, and the ability to search for and find near-optimal solutions. To this end, the adaptation tools explained above are integrated into many power management methods. To give an overview from this perspective, different power management methods are arranged in Figure 9 according to their computational load, solution optimality, and the adaptation tools integrated into each method. The ability of each method to achieve near-optimal results is assessed from the comparative analyses in the associated literature. The computational load is expressed in terms of the mathematical operations associated with each method. For example, in Region-I in Figure 9, classic conditional reasoning can be improved by defining nested case-based modules that apply the related optimal control parameters. These cases are correlated to specific load levels, driving patterns, or a frequency analysis of the power demand [13,15,18]. Moreover, a simple nearest-optimal point selection can be defined within the deterministic rules using look-up tables [19,20]. The same principle applies to fuzzy-based reasoning, where adaptive and predictive fuzzy methods can achieve better fuel saving results, i.e., an overall cost function reduction [22,25,27]. The adapted RB methods in Region-I perform better than the conventional on/off method; however, a gap to the global optimal solution remains.
Methods in Region-II are based on limited-horizon optimization to minimize the instantaneous equivalent cost and hence yield near-optimal solutions in real-time. Moreover, pattern/route recognition is also applied to update the cost equivalent factors of PMP and ECMS [59,65]. Adaptation of the computing rate is widely applied to MPC [72] and adaptive/stochastic DP [34,73]. Some methods of Region-II are discussed in more detail in the following section. In Region-III, the global cost minimum is searched for by a systematic step-by-step backward calculation in DDP [33], by evolutionary algorithms [36,39,46], or by linear optimization [85]. These methods are used as benchmarks for the evaluation and analysis of other methods.

Insight Into Optimal Real-Time Methods
With more focus on real-time power management methods, the role of the subsidiary tools is pointed out in this section. As illustrated in Table 2, offline rules optimization has been applied in the literature to RB, ECMS, and PMP methods. Rules optimization can be applied for two conceptual purposes: to find an optimal fixed formulation of the rules that mitigates the control sensitivity to changes in driving conditions (quantitatively or qualitatively) [86][87][88], or to find optimal control parameters for every driving condition separately [25,44]. In ECMS and PMP, the equivalent factors are crucial control parameters that are optimized offline as balanced fixed values or as case-based values [54,89].
Pattern recognition is an efficient tool that is applied to most of the real-time power management methods. It offers the advantage of achieving better results at much lower computational effort. An important aspect of pattern recognition application in HEVs is the determination of recognized parameters (speed, acceleration, dynamics, etc.) and their discrete values for recognition. Briefly, the set of parameters to be considered for pattern recognition determines the effort of offline optimization (number of patterns) and the accuracy of online recognition [90]. Machine learning-based techniques are the state-of-the-art tools that enable better online recognition when a large number of recognized parameters are considered [91].
Once the system behavior can be characterized based on statistical analysis or classified using machine learning, a predictive model can be developed for predicting the driving conditions. Stochastic or Markov models are typical examples that deliver a prediction of the upcoming driving condition, wherein low-probability predictions have to be penalized in the cost function [92,93]. System state estimation based on mathematical models can be performed by observers and Kalman filters. They are mostly used to estimate upcoming SoC changes because of their substantial impact on the power handling strategy [64,94].
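As an illustration of model-based SoC estimation, the following sketch implements a scalar Kalman filter that combines Coulomb counting with a noisy SoC-equivalent measurement. It is not the estimator of [64,94]. The linear measurement model, the noise variances `q` and `r`, the battery capacity, and the current profile are assumptions.

```python
import numpy as np

def kalman_soc(current_a, soc_meas, soc0, p0=0.04, q=1e-6, r=4e-4,
               capacity_ah=50.0, dt_s=1.0):
    """Scalar Kalman filter for the SoC; returns the estimate at each step."""
    soc_est, p = soc0, p0
    estimates = []
    for i_k, z_k in zip(current_a, soc_meas):
        # Predict: Coulomb counting (positive current discharges the battery).
        soc_est -= i_k * dt_s / 3600.0 / capacity_ah
        p += q
        # Update with the SoC-equivalent measurement z_k.
        gain = p / (p + r)
        soc_est += gain * (z_k - soc_est)
        p *= 1.0 - gain
        estimates.append(soc_est)
    return np.array(estimates)

n = 200
current = np.full(n, 25.0)                               # assumed 25 A discharge
true_soc = 0.8 - np.cumsum(current) * 1.0 / 3600.0 / 50.0
rng = np.random.default_rng(0)
noisy_meas = true_soc + rng.normal(0.0, 0.02, n)         # assumed sensor noise
estimate = kalman_soc(current, noisy_meas, soc0=0.6)     # wrong initial guess
```

The filter starts from a deliberately wrong initial SoC. The large initial variance `p0` lets the first measurements correct it quickly, after which the estimate mainly follows the Coulomb-counting model.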
The application of multi-rate computing tools has been found to be limited to DP and MPC methods. It solves two parallel optimization problems: a long-term and a short-term one. For the long-term optimization problem, slow dynamical parameters are preferably considered. The strategic optimal path is then applied as a target to the faster-rate problem, which solves a limited-horizon optimization problem that follows the global optimal path [76]. Another concept is to adapt the sampling time of MPC according to the prediction horizon, i.e., a small sampling time for short-term prediction and vice versa. This decoupling concept offers better control of dynamical parameters and a longer prediction horizon [72].
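The idea of adapting the sampling time to the prediction horizon can be illustrated with a small sketch that builds a non-uniform prediction grid: a few fine steps for the near future followed by coarse steps for long-term planning. The step counts and step sizes are assumed values and are not taken from [72].

```python
import numpy as np

def nonuniform_horizon(n_fine=5, dt_fine=1.0, n_coarse=5, dt_coarse=10.0):
    """Return prediction time instants (s): fine steps first, then coarse."""
    steps = np.concatenate([np.full(n_fine, dt_fine),
                            np.full(n_coarse, dt_coarse)])
    return np.cumsum(steps)

times = nonuniform_horizon()
uniform_reach = len(times) * 1.0    # reach of a uniform 1 s grid, same size
```

With these assumed values, ten prediction points cover 55 s, whereas a uniform 1 s grid of the same size would cover only 10 s. The number of decision variables therefore stays the same while the horizon becomes longer.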
Although ITS have been available on the market for a few years, only a limited number of contributions have integrated them into real-time power management algorithms. The information offered by different V2X communications enables the power management system to determine short- and long-term power splitting strategies that yield better fuel saving results [95].

* Numbering of subsidiary tools is according to Figure 9; + Driving pattern recognition; ++ Route/road type recognition; † HMM; †† Machine learning; ‡ Statistical-based; K Kalman filter; ü Luenberger observer; A ARMA filter; N NN.

Evaluation and Discussion
Based on the presented review of state-of-the-art real-time power management methods and the analysis of the associated challenges, it can be observed that near-optimal solutions can be achieved in real-time by two main principles: finding optimal case-based solutions or performing limited-horizon optimization. The former relies on accurate case recognition, and the latter relies on accurate prediction of the driving conditions and simplification of the short-term optimization problem. In this context, the integration of specific subsidiary tools is becoming increasingly important for providing the required information in real-time.
Pattern recognition methods are a central theme of modern real-time power management algorithms. They offer essential knowledge about the current operating case using a set of recognized parameters. However, no clear discussion of how effectively particular parameters improve pattern recognition is reported in the literature. A thorough analysis of the impact of the parameters (mean/current speed, acceleration, dynamics, number of stops, etc.) on case recognition (drive style, road type, etc.) can contribute to defining near-optimal case-based solutions in real-time.
Driving conditions prediction based on machine learning is an advantageous tool due to its capability to consider a large number of parameters, evaluate their impact on the learning process, and enhance the prediction accuracy. Such tools require a relatively long offline learning time; however, their online integration into the power management algorithm offers the necessary a priori knowledge about the next driving conditions. Moreover, simplifying the optimal control problem by considering multi-rate sub-problems is a time-efficient tool that has shown better results when applied to DP and MPC.
Finally, V2X communication systems are promising tools that, so far, are not sufficiently integrated into real-time power management systems. The knowledge offered by V2X systems, if integrated into pattern recognition or machine learning tools, can enhance the accuracy of current case recognition and improve the long-term planning of the power management. In addition, this knowledge is transferable to other inter-connected vehicles, grids, or even a database, thereby reducing the computational effort.

Summary and Conclusions
This contribution presents, for the first time, a special focus on real-time optimal power management methods, revealing the recent innovative trends in this field. To understand the challenges in developing such methods, a comprehensive review of basic power management methods is introduced. The explanation of each method revealed its merits, demerits, and potential aspects for improvement. Rule-based methods are easier to implement in real-time vehicular control systems; however, they are less capable of finding optimal power handling solutions. Optimization-based methods perform more mathematical operations to search for the global optimal solution and hence are less applicable in real-time. The adaptation of these basic methods to develop real-time optimal ones is receiving increasing attention in the recent literature.
The thorough analysis of these optimal real-time methods revealed a significant role of subsidiary tools in adapting the basic power management methods to real-time application. These tools are classified into five main categories: rules optimization, multi-rate computing, pattern recognition, prediction/estimation, and intelligent transportation systems (ITS). These tools offer better situation identification, limited-horizon prediction of driving conditions, and more knowledge about traffic, the next refueling options, and alternative routes.
The integration of the subsidiary tools into different basic power management methods is discussed in detail to explain how these tools contribute to computational load reduction and add more functionality to the basic methods in real-time application. This discussion points out the well-received approaches as well as the insufficiently investigated fields. This work aims to provide researchers and scholars with the necessary knowledge about power management methods in HEVs, to shed light on the state-of-the-art approaches and solutions, and to help identify the most promising points for future development.
Author Contributions: Both authors contributed equally to the paper, whereby the corresponding author was responsible for the writing, figures, and literature research and the second author for organizing, reviewing, and proofreading the entire contribution.

Conflicts of Interest: The authors declare no conflict of interest.