5.4.2. Energy Management Strategy of Hybrid Power Systems Based on Deep Reinforcement Learning
In 2013, Mnih et al. first applied convolutional neural networks (CNNs) from deep learning to reinforcement learning, introducing the deep Q-network (DQN) algorithm and laying the foundation for DRL [
86]. As a subfield of deep learning, DRL integrates the feature-extraction capability of deep learning with the decision-making capability of RL [
87], enabling optimal decisions to be made directly from high-dimensional input data. This approach constitutes an end-to-end decision and control system that reduces reliance on manually designed rules.
Chen et al. proposed an intelligent energy management strategy based on deep reinforcement learning for an unmanned-ship hybrid power system; simulations verified the fuel economy and environmental performance of hybrid ships using this strategy under different working conditions [
88].
To manage the energy of multi-mode vehicles, Hua et al. proposed a multi-agent DRL strategy in which two DDPG agents work cooperatively. By introducing a correlation degree and analyzing the factors affecting the learning performance of the two agents, a unified configuration of the two agents is obtained, and the optimal working mode under the strategy is then determined through a parametric study of the correlation degree [
89].
The basic idea of value-based DRL is to fit the value function of each state–action pair with a deep neural network and to select the action with the greatest value as the output of the policy. Q-learning is a value-function-based RL algorithm whose core is to learn an optimal action-value function that guides the agent to take the action with the highest expected return in a given state; replacing this tabular value function with a deep neural network is what turns reinforcement learning into deep reinforcement learning.
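To make the value-based idea concrete, the following minimal sketch (not taken from any of the cited studies) shows a tabular Q-learning update with an ε-greedy policy; the state discretization, action set, and hyperparameters are illustrative assumptions.

```python
import numpy as np

# Hypothetical discretization: 10 SOC bins x 10 demanded-power bins, 5 engine-power levels.
N_STATES, N_ACTIONS = 10 * 10, 5
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1     # learning rate, discount factor, exploration rate

Q = np.zeros((N_STATES, N_ACTIONS))        # tabular action-value function
rng = np.random.default_rng(0)

def select_action(state):
    """Epsilon-greedy selection over the current Q-table (random action with probability EPSILON)."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state):
    """One Q-learning (temporal-difference) update toward the greedy bootstrap target."""
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA * (td_target - Q[state, action])
```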
The DQN algorithm mainly consists of an experience replay pool, an evaluation network, and a target network. DQN is not well suited to continuous-action problems and suffers from shortcomings such as limited stability and value overestimation. Its policy is not updated at every moment; instead, the Q-value targets are computed with a deferred-update target network, which greatly improves the convergence and stability of neural-network training. Based on the Q-learning global optimization algorithm, You Jie distributed the energy of the whole vehicle to obtain the optimal engine and motor torques and, while maintaining the battery SOC balance, simulated the fuel consumption of the hybrid vehicle against that of a conventional compact car, showing a marked improvement in fuel economy [
90].
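The replay-pool and target-network mechanics described above might be sketched as follows in PyTorch; the network size, state and action dimensions, and hyperparameters are assumptions, not parameters from the cited work.

```python
import random
from collections import deque

import torch
import torch.nn as nn

GAMMA = 0.99  # discount factor (assumed)

class QNet(nn.Module):
    """Small MLP mapping a state (e.g., [SOC, demanded power]) to a Q-value per discrete action."""
    def __init__(self, state_dim=2, n_actions=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)

policy_net, target_net = QNet(), QNet()          # evaluation network and target network
target_net.load_state_dict(policy_net.state_dict())
replay = deque(maxlen=10_000)                    # experience replay pool of
                                                 # (state_tensor, action_int, reward, next_state_tensor)
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)

def train_step(batch_size=32):
    """Sample past transitions and regress Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    if len(replay) < batch_size:
        return
    s, a, r, s2 = zip(*random.sample(replay, batch_size))
    s, s2 = torch.stack(s), torch.stack(s2)
    a = torch.tensor(a, dtype=torch.int64)
    r = torch.tensor(r, dtype=torch.float32)
    q_sa = policy_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * target_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

def sync_target():
    """Deferred target update: periodically copy the evaluation network into the target network."""
    target_net.load_state_dict(policy_net.state_dict())
```

Calling `sync_target()` only every fixed number of training steps, rather than at every step, is what the deferred-update target network refers to.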
For plug-in hybrid electric vehicles, Tian Zejie proposed an energy management strategy based on the Q-learning algorithm, with the battery SOC and the required power as state variables and the output power of the dynamic assistance unit as the control variable; the temporal-difference algorithm is used to update the action-state values in real time. Compared with the global optimal algorithm based on PMP over a typical urban bus scenario of more than 100 km, the total cost of the Q-learning strategy is only CNY 1.57 higher, indicating the effectiveness of the Q-learning strategy from an application perspective [
91]. It is believed that, from an applied standpoint, this strategy can enhance the overall economic level of the vehicle. The results of the Q-learning EMS are shown in
Table 9.
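As an illustration of how such a state–action formulation might look in code, the sketch below discretizes the (SOC, required power) state and defines a cost-based reward; the grids, power levels, reference SOC, and weights are assumed values for illustration, not those used in [91].

```python
import numpy as np

# Illustrative discretization of the EMS state (SOC, required power) and the control
# variable (assistance-unit output power); all grids and weights are assumptions.
SOC_BINS = np.linspace(0.3, 0.8, 11)          # battery SOC grid
P_REQ_BINS = np.linspace(0.0, 60.0, 13)       # required power grid [kW]
P_OUT_LEVELS = np.linspace(0.0, 40.0, 9)      # candidate assistance-unit output powers [kW]

def state_index(soc, p_req):
    """Map continuous (SOC, required power) to a single discrete state index."""
    i = int(np.clip(np.digitize(soc, SOC_BINS), 0, len(SOC_BINS) - 1))
    j = int(np.clip(np.digitize(p_req, P_REQ_BINS), 0, len(P_REQ_BINS) - 1))
    return i * len(P_REQ_BINS) + j

def step_reward(fuel_g, soc, soc_ref=0.6, fuel_weight=1.0, soc_weight=50.0):
    """Negative operating cost: instantaneous fuel use plus a quadratic SOC-deviation penalty."""
    return -(fuel_weight * fuel_g + soc_weight * (soc - soc_ref) ** 2)
```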
Policy-based reinforcement learning directly maps the current state to the optimal action and excels at solving high-dimensional and continuous problems. Such algorithms can be subdivided into deterministic-policy algorithms and stochastic-policy algorithms. The most widely used, DDPG, combines the Actor–Critic framework with the deterministic policy gradient algorithm while retaining the experience replay mechanism and target neural networks. The determinism of DDPG lies in the fact that the agent obtains one definite action from the policy function rather than a probability density over all actions in a given state. This simplifies the sampling integration over the action space and yields a specific output action, which improves the operating efficiency of the system and offers better stability, robustness, and environmental exploration. Traditional reinforcement learning cannot handle high-dimensional state and action spaces and therefore relies on DRL, which leverages neural networks to address this high dimensionality [
92]. DQN and the double deep Q-network (DDQN) use an ε-greedy policy to retain randomness in the agent's actions for exploration. DQN and DDQN have roughly the same structure; however, DQN is prone to overestimation, which DDQN alleviates by decoupling action selection from action evaluation in the learning target [
93].
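A minimal PyTorch sketch of the DDPG components described above (a deterministic actor, a critic, target networks, and soft target updates) is given below; the state/action dimensions and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

GAMMA, TAU = 0.99, 0.005   # discount factor and soft target-update rate (assumed values)

class Actor(nn.Module):
    """Deterministic policy: maps the state directly to one continuous action (e.g., power split)."""
    def __init__(self, state_dim=3, action_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim), nn.Tanh())  # action scaled to [-1, 1]
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Action-value function Q(s, a) for the continuous action chosen by the actor."""
    def __init__(self, state_dim=3, action_dim=1):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_t, critic_t = Actor(), Critic()            # target networks
actor_t.load_state_dict(actor.state_dict()); critic_t.load_state_dict(critic.state_dict())

def critic_target(r, s2, done):
    """Bootstrapped DDPG target: r + gamma * Q_target(s', mu_target(s'))."""
    with torch.no_grad():
        return r + GAMMA * (1 - done) * critic_t(s2, actor_t(s2)).squeeze(-1)

def soft_update(net, target):
    """Polyak averaging of target-network parameters (the 'soft' deferred update)."""
    for p, p_t in zip(net.parameters(), target.parameters()):
        p_t.data.mul_(1 - TAU).add_(TAU * p.data)
```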
Maroto et al. developed a double deep Q-network algorithm to manage the energy of the system in a commercial simulation environment and used a convolutional neural network to characterize the pollutant emissions of the hybrid system, thereby reducing fuel consumption and improving fuel economy [
94].
Reddy et al. developed an EMS based on Q-learning algorithms, which interacted with the simulation model of on-board hybrid systems to achieve autonomous learning. The simulation results showed that the control strategy can regulate the SOC to about 0.7, reduce SOC variation, extend battery life, improve control stability, and improve the efficiency of the hydrogen fuel system, thereby contributing to improved fuel economy [
95].
Chen Zeyu proposed an EMS for plug-in HEVs with online learning capability, based on the deep Q-network (DQN) algorithm in deep reinforcement learning. The EMS establishes an adaptive optimal power-distribution law, using the change in engine power as the action space, and constructs an “offline interactive learning + online updating” algorithm framework [
98].
Simulations were conducted both in software and under real driving conditions to validate its performance against PSO-optimized control strategies. The results indicated that the proposed deep reinforcement learning strategy reduces the online running cost of the vehicle’s hybrid system by 6.9% compared with PSO optimization. When the driving conditions change significantly, the newly developed DRL-based EMS demonstrates good adaptability to road conditions and, compared with the EMS without online learning capability, further reduces fuel consumption by 3.64%. The cost comparisons of the different energy management strategies are shown in
Table 10. The energy consumption comparison before and after controller updates is shown in
Table 11. The DRL EMS can effectively improve energy efficiency during online operation and has the ability to adapt to sudden changes in driving characteristics.
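A purely schematic sketch of such an “offline interactive learning + online updating” loop is shown below; `sim_env`, `vehicle`, and the agent methods (`act`, `store`, `learn`, `observe`, `apply`, `running`) are hypothetical interfaces standing in for a vehicle-model simulator and a DQN-style agent, not APIs from the cited study.

```python
def offline_phase(agent, sim_env, episodes=500):
    """Interact with the offline simulation model to pre-train the policy."""
    for _ in range(episodes):
        state, done = sim_env.reset(), False
        while not done:
            action = agent.act(state, explore=True)         # epsilon-greedy exploration
            next_state, reward, done = sim_env.step(action)
            agent.store(state, action, reward, next_state)  # fill the replay pool
            agent.learn()                                   # gradient step on replayed batches
            state = next_state

def online_phase(agent, vehicle, update_every=100):
    """Deploy the pre-trained policy and keep refining it from measured driving data."""
    step, state = 0, vehicle.observe()
    while vehicle.running():
        action = agent.act(state, explore=False)            # act greedily on-board
        next_state, reward = vehicle.apply(action)
        agent.store(state, action, reward, next_state)
        if step % update_every == 0:
            agent.learn()                                   # low-rate online updates
        state, step = next_state, step + 1
```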
To achieve real-time optimization and improve certain parameters of hybrid vehicles, Su Mingliang et al. proposed an energy management strategy based on deep reinforcement learning, introduced neural networks to predict the working conditions during the simulation process, and carried out simulations. The results showed that the different DRL algorithms can all regulate and improve the battery power; however, according to the optimization results obtained under the different algorithms, the DDPG-based strategy demonstrates better optimization and power-regulation performance for the battery than the DQN- and DDQN-based strategies. In terms of convergence, the DDPG-based EMS also converges faster, indicating a more accurate and superior algorithm model [
99].
Table 12 shows the results of the comparison of energy management strategies. In terms of suppressing battery power fluctuations, the DDPG-based EMS exhibits smaller fluctuations compared to the DQN- and DDQN-based strategies, leading to a more stable system.
Xie Si Wuo pointed out that hybrid electric vehicles require balancing fuel economy, battery life, and driving performance. Traditional energy management strategies such as rule-based control and dynamic programming have limitations in real-time performance or show dependence on global state information. A novel optimization strategy based on DRL, combined with model-free learning and environment adaptability, was proposed to achieve real-time optimization. Through various algorithm variants and improvement methods, the validity of this strategy was verified [
100].
Dynamic programming was adopted as the global optimal benchmark for subsequent strategy comparison. With the SOC as a state variable and the engine torque as the control input, simulations were conducted under the NEDC (New European Driving Cycle), achieving a fuel economy of 4.352 L/100 km. Its limitation, however, is the need for prior knowledge of the complete driving cycle, which makes it unsuitable for real-time application. Algorithmic improvements were then made, including the following (a minimal code sketch of the double and dueling value heads is given after the list):
Double DQN: reducing overestimation of Q-values to enhance stability.
Dueling DQN: separating state value functions from action advantage functions to improve generalization.
D3QN: combining double and dueling structures to further optimize performance.
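The double and dueling mechanisms listed above might be sketched as follows in PyTorch; using the dueling network together with the double-DQN target corresponds to D3QN. The network sizes, action count, and discount factor are assumed values.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling head: a shared trunk feeds a state-value stream V(s) and an advantage stream A(s,a)."""
    def __init__(self, state_dim=4, n_actions=7):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.value = nn.Linear(64, 1)
        self.advantage = nn.Linear(64, n_actions)

    def forward(self, s):
        h = self.trunk(s)
        a = self.advantage(h)
        # Q(s,a) = V(s) + A(s,a) - mean_a A(s,a), the standard dueling aggregation
        return self.value(h) + a - a.mean(dim=-1, keepdim=True)

def double_dqn_target(reward, next_state, online_net, target_net, gamma=0.99):
    """Double-DQN target: the online net picks argmax a', the target net evaluates it."""
    with torch.no_grad():
        best_a = online_net(next_state).argmax(dim=-1, keepdim=True)
        return reward + gamma * target_net(next_state).gather(-1, best_a).squeeze(-1)
```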
The experimental results showed that D3QN achieved the best fuel economy at 4.697 L/100 km, reaching 90.2% of the DP benchmark, albeit with larger training fluctuations, while Dueling DQN demonstrated more stable convergence, making it better suited to practical application. Building on the policy-gradient branch of DRL, further improvements integrated heuristic rules such as engine/motor torque constraints and SOC limits to avoid unreasonable actions and accelerate convergence.
By combining the historical best experience with real-time experience, adaptability to environmental disturbances is improved and fuel economy rises by 3.2%. The ability to handle continuous action spaces is better than that of DQN, with fuel economy reaching 94.2% of the DP benchmark. A comparison with model predictive control shows a fuel economy of 5.0025 L/100 km at a prediction horizon of 50 s, i.e., 86.9% of the DP benchmark, but the computational burden is large and the real-time performance is poor.
Table 13 shows the performance of the strategy.
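The heuristic-rule integration mentioned above (torque constraints and SOC limits applied to the policy output) might look roughly like the following post-processing step; all limits and the override logic are placeholder assumptions, not values from the cited study.

```python
import numpy as np

T_ENG_MAX, T_MOT_MAX = 150.0, 200.0     # assumed engine / motor torque limits [N*m]
SOC_MIN, SOC_MAX = 0.3, 0.8             # assumed allowed battery SOC window

def apply_heuristic_rules(raw_action, soc, demanded_torque):
    """Clip the policy's proposed engine torque to feasible bounds and override it near SOC limits."""
    t_eng = float(np.clip(raw_action, 0.0, T_ENG_MAX))
    if soc <= SOC_MIN:      # battery nearly depleted: bias toward charge-sustaining operation
        t_eng = min(demanded_torque + 10.0, T_ENG_MAX)
    elif soc >= SOC_MAX:    # battery full: favor pure electric drive
        t_eng = 0.0
    t_mot = float(np.clip(demanded_torque - t_eng, -T_MOT_MAX, T_MOT_MAX))
    return t_eng, t_mot
```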
This approach demonstrated improved real-time optimization and adaptability, though its computational demands necessitated further algorithmic refinement for broader practical application.
Robustness tests across multiple working conditions show a standard deviation of only 1.24%, and the overall results indicate that DRL, and especially the improved DDPG, is superior, providing an important foundation for follow-up research and practical application. The drawbacks are that training relies on the simulation environment and does not account for real traffic uncertainties such as random traffic flow, and that neural-network parameters such as the number of layers and the activation functions were not systematically optimized. Future research could combine traffic prediction models to improve adaptability across working conditions, carry out hardware-in-the-loop verification of real-time performance, and extend the work to more practical scenarios and multi-objective optimization.
Li Jiaxi et al. noted that energy management strategies in HEVs are critical to fuel economy and battery life, and that traditional methods such as rule-based control or PID-controlled ECMS suffer from complex parameter tuning and poor adaptability. They proposed a parallel DRL method based on DDPG to optimize the equivalence factor, improving fuel economy while keeping the battery SOC under stable control. The DDPG algorithm is combined with the Actor–Critic framework, and the equivalence factor is optimized through offline training and online adjustment, overcoming the limitation of traditional ECMS, which relies on a fixed equivalence factor. An edge computing architecture is introduced through a parallel framework in which multiple edge devices jointly train the global network, significantly improving convergence speed; experiments show a 334% acceleration when eight edge devices are trained in parallel. The state design includes the battery SOC, the remaining driving range, and the previous action, supplemented by connected-vehicle information such as real-time positioning, to enhance the adaptability of the strategy.
Simulation experiments were carried out under the FTP72 driving cycle and compared with a traditional PID-controlled A-ECMS and a globally optimized Nonadaptive-ECMS. In terms of fuel economy, the DDPG strategy reduces fuel consumption by 7.2% compared with PID-A-ECMS, with consumption per 100 km falling from 8.3 L to 7.7 L. In terms of SOC retention, both DDPG and Nonadaptive-ECMS maintain the SOC effectively, but DDPG allows larger SOC fluctuations over long distances to optimize engine efficiency. The reward function takes a Gaussian form, balancing an SOC-deviation weight of 0.7 against an instantaneous fuel-consumption weight of 0.3. The Actor and Critic networks are three-layer neural networks with 120 neurons per layer and learning rates of 0.0001 and 0.0002, respectively. Gradients are synchronized in parallel between the cloud global network and the edge devices, breaking data correlation and improving training efficiency.
The theoretical contribution of the work is to combine DDPG with edge computing to provide a real-time, adaptive solution for HEV energy management. Its engineering value lies in significantly reducing fuel consumption while meeting SOC-maintenance requirements, making it suitable for dynamic, changeable real-world driving scenarios, and it offers a forward-looking design idea for energy management strategies in intelligent connected vehicles. The study also has limitations: the experiments are based on a simulation environment and require further real-vehicle verification, and the parallel framework is sensitive to communication latency. Multi-agent reinforcement learning, more complex driving conditions such as urban congestion, and hardware-in-the-loop (HIL) testing could be explored. Overall, the study achieves real-time optimization of the HEV energy management strategy through DDPG and an edge computing framework, outperforming traditional methods in fuel economy and algorithmic efficiency and providing an innovative solution for the energy management of intelligent vehicles [
101].
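One possible reading of the reported Gaussian-form reward, with a 0.7 weight on SOC deviation and a 0.3 weight on instantaneous fuel consumption, is sketched below; the Gaussian widths, the fuel-rate normalization, and the SOC reference are assumptions rather than values from the cited work.

```python
import math

W_SOC, W_FUEL = 0.7, 0.3                # reported reward weights (SOC deviation, fuel use)
SIGMA_SOC, SIGMA_FUEL = 0.05, 0.5       # assumed Gaussian widths
FUEL_SCALE = 2.0                        # assumed normalization for the fuel rate [g/s]

def gaussian_reward(soc, fuel_rate, soc_ref=0.6):
    """Reward is highest when the SOC stays near its reference and instantaneous fuel use is low."""
    r_soc = math.exp(-((soc - soc_ref) ** 2) / (2 * SIGMA_SOC ** 2))
    r_fuel = math.exp(-((fuel_rate / FUEL_SCALE) ** 2) / (2 * SIGMA_FUEL ** 2))
    return W_SOC * r_soc + W_FUEL * r_fuel
```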
In addition, Pang S Z et al. noted that improved DRL-based algorithms, such as hierarchical reinforcement learning, safe reinforcement learning, multi-agent reinforcement learning, and meta-reinforcement learning, have also appeared in energy management for aeronautical and vehicle engineering, but these were not discussed in depth [
102].
Reinforcement learning shows strong adaptive optimization capability in hybrid energy management; its core value lies in responding to dynamic environments in real time without requiring accurate condition prediction. End-to-end decision-making reduces the burden of manual rule design, and multi-objective balancing weighs economy, battery life, and emissions against one another. In the future, as edge computing and transfer learning mature, DRL is expected to become the core decision-making brain of intelligent hybrid systems, driving the evolution of hybrid systems toward full-scenario intelligence.
For a comparative analysis of rule-based control strategies, DRL-based intelligent control strategies, and multi-objective optimization control strategies in HEV EMSs, Zhou Jinping et al. proposed a heuristic control strategy for parallel hybrid electric vehicles, aiming to optimize fuel economy and battery state-of-charge stability [
103]. Zhang Song et al., targeting dual-planetary-row hybrid electric buses, applied DRL methods such as DDQN and the twin delayed deep deterministic policy gradient (TD3) to optimize fuel economy and SOC balance under complex operating conditions [
104]. Miao Dongxiao et al. proposed a heuristic control strategy for ship series-hybrid power systems based on non-dominated sorting genetic algorithm II (NSGA-II) optimization of logic thresholds, aiming to reduce fuel consumption and carbon emissions by dynamically adjusting the power thresholds according to the SOC and engine speed. The approach builds on rule-based heuristic control, combining a threshold-changing mechanism with a load-following method: the power thresholds are adjusted dynamically according to the SOC and engine speed to achieve coordinated operation of the engine and motor. Rule-based control strategies are simple and computationally light but rely heavily on expert experience. The DRL control strategies use the discrete DDQN and the continuous TD3 algorithms to address the limitations of the conventional approaches: DDQN suits discrete action spaces and reduces value overestimation through its double Q-networks, while TD3 handles continuous action spaces and improves stability through delayed policy updates and clipped double-Q learning. These algorithms learn the optimal control logic autonomously, with enhanced adaptability [
105]. The fuel consumption comparison results under the three strategies are shown in
Table 14.
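The TD3 mechanisms mentioned above, delayed policy updates combined with clipped double-Q learning and target-policy smoothing, are largely captured by how the learning target is computed. The sketch below assumes actor/critic modules like those shown earlier for DDPG; the noise parameters are assumed values.

```python
import torch

GAMMA = 0.99
POLICY_NOISE, NOISE_CLIP = 0.2, 0.5   # target-policy smoothing parameters (assumed values)

def td3_target(reward, next_state, done, actor_t, critic1_t, critic2_t, max_action=1.0):
    """TD3 learning target: perturb the target action with clipped noise, then take the
    minimum of two target critics (clipped double-Q) to curb value overestimation."""
    with torch.no_grad():
        a2 = actor_t(next_state)
        noise = (torch.randn_like(a2) * POLICY_NOISE).clamp(-NOISE_CLIP, NOISE_CLIP)
        a2 = (a2 + noise).clamp(-max_action, max_action)
        q_min = torch.min(critic1_t(next_state, a2), critic2_t(next_state, a2)).squeeze(-1)
        return reward + GAMMA * (1.0 - done) * q_min
```

The “delay” in TD3 refers to updating the actor and the target networks less frequently than the critics, which further stabilizes training.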
In the multi-objective optimization control strategy, the NSGA-II algorithm is used to optimize the logic-threshold rules. The optimized threshold values include the diesel-engine power limits and the SOC boundaries, with the aim of jointly optimizing fuel consumption and carbon emissions. Offline optimization is combined with online application, balancing real-time performance against optimization quality.
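To illustrate the kind of multi-objective screening NSGA-II performs over logic thresholds, the toy sketch below evaluates candidate (diesel power limit, SOC lower bound, SOC upper bound) triples against two objectives and keeps the non-dominated set; the surrogate objective model is an invented stand-in for the ship simulation, and a full NSGA-II run would additionally apply crowding-distance sorting, crossover, and mutation.

```python
import random

def evaluate(thresholds):
    """Return (fuel, co2) for one candidate; a toy surrogate standing in for the ship model."""
    p_limit, soc_lo, soc_hi = thresholds
    window_penalty = 5.0 * abs((soc_hi - soc_lo) - 0.3)
    fuel = 22.0 - 0.01 * p_limit + window_penalty          # higher diesel limit: less generator cycling
    co2 = 55.0 + 0.03 * p_limit + 2.0 * window_penalty     # but more diesel running: more CO2
    return fuel, co2

def dominates(f_a, f_b):
    """True if candidate a is no worse in both objectives and differs from b (Pareto dominance)."""
    return all(x <= y for x, y in zip(f_a, f_b)) and f_a != f_b

def non_dominated(candidates):
    """Keep only Pareto-optimal candidates (the front an NSGA-II run converges toward)."""
    scores = [evaluate(c) for c in candidates]
    return [c for c, f in zip(candidates, scores)
            if not any(dominates(g, f) for g in scores)]

population = [(random.uniform(200, 400), random.uniform(0.3, 0.5), random.uniform(0.6, 0.8))
              for _ in range(20)]
pareto_front = non_dominated(population)
```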
In terms of system performance, the load-following threshold-changing strategy (LTS) improves fuel economy by 3.1~10.4% compared with the EACS and by 2.5~5.7% compared with the ECMS. The DRL strategy achieves a fuel economy equivalent to 93% of that of the DP strategy (around 19.5 L/100 km). Compared with a traditional internal-combustion system, the NSGA-II-optimized logic-threshold rules reduce fuel consumption by 11.09%, and they provide a further reduction in emissions of approximately 1.18% relative to the unoptimized hybrid powertrain. In terms of SOC stability, the LTS strategy keeps the SOC above 60%, the DRL strategy uses a charge-sustaining mode to achieve SOC balance, and the NSGA-II-optimized logic-threshold rules raise the terminal SOC by more than 1.81% compared with the unoptimized values.
In terms of carbon emissions, the NSGA-II-optimized logic-threshold strategy reduces carbon emissions by 4.32% compared with a traditional power system. Each strategy therefore has its own characteristics. The LTS strategy offers simple rules, a small computational load, and easy practical application, but it relies on expert experience and has limited global optimization capability. The DRL strategy offers self-learning, strong adaptability, and the ability to handle continuous control problems, but training is complex and demands large amounts of data and computing resources. The NSGA-II optimization strategy handles multiple objectives, accounting for both fuel consumption and carbon emissions, and can be applied online after offline optimization, but the choice of optimization variables depends on experience and its real-time performance is slightly inferior to that of pure rule-based strategies.
Comparative analysis suggests that the LTS strategy suits scenarios with high real-time requirements and limited computing resources, such as conventional hybrid vehicles; the DRL strategy suits systems such as hybrid buses operating on fixed routes with stable driving patterns; and the NSGA-II strategy balances multi-objective optimization against real-time performance, making it suitable for large-scale hybrid systems such as ships. The analysis also points to improvement directions: the LTS strategy can be combined with optimization algorithms such as genetic algorithms to adjust the thresholds dynamically and improve adaptability across working conditions; the DRL strategy can incorporate transfer learning or online learning to improve generalization and real-time performance; and the NSGA-II strategy can explore online optimization methods to further improve real-time capability. In short, rule-based methods are simple and efficient and suit traditional applications, learning-based methods are highly adaptable and suit complex scenarios, and optimization-based methods handle multiple objectives and suit large-scale systems. Future research can explore the fusion of these methods, such as using DRL to optimize the thresholds of rule-based policies or using NSGA-II to optimize the reward function of DRL, so as to balance efficiency and performance.
Based on the above analysis, the main conclusions of the energy management strategy for hybrid power systems based on intelligent control are as follows:
- (1)
Algorithms such as fuzzy logic and neural networks are used to handle the nonlinear characteristics of the system. Deep reinforcement learning enhances the ability of traditional reinforcement learning to handle complex tasks by introducing neural networks, and ADP may provide a more efficient optimization route for this type of deep reinforcement learning.
- (2)
Energy management strategies for hybrid power systems based on reinforcement learning are optimized in a data-driven manner; they are highly adaptable but require large amounts of training data, and the model’s generalization ability depends on the algorithm design.
- (3)
Adaptive dynamic programming can provide a theoretical framework or acceleration algorithms for reinforcement learning, especially for solving complex control problems. A-ECMS introduces SOC feedback regulation on top of ECMS to dynamically optimize the equivalence factor and balance energy consumption against battery life; its disadvantage is that it still relies on accurate subsystem models and its parameter calibration is complex.
- (4)
Typical DRL algorithms such as DQN, PPO, and DDPG are widely used in hybrid power systems. The DQN algorithm is trained on historical operating data to achieve a real-time mapping from the required power and SOC to the control action, while DDPG suits continuous actions such as the power-split ratio, optimizing engine start–stop frequency and motor torque distribution.
- (5)
With the development of artificial intelligence and networking, the control strategy of hybrid power systems is becoming more “intelligent” and closer to human learning.
- (6)
The control strategy continues to evolve towards deep learning, allowing the system to learn optimal control rules from massive historical data and even predict future operating conditions, achieving more accurate and adaptive energy management.
The intelligent control strategy drives the hybrid power system to shift from “parameter competition” to an “energy efficiency revolution”, with real-time capability, predictiveness, and reliability as its core values. The main evolutionary logic can be summarized as: rules → static optimization → dynamic learning → autonomy. In the future, with improvements in AI chip computing power and the popularization of V2X technology, global optimization based on intelligent control will become mainstream, further narrowing the economic gap with traditional fuel vehicles and providing key technical support for carbon-neutral transportation. Learning-based energy management is shifting from optimizing a single vehicle to promoting transportation–energy collaboration, and from algorithm innovation to system reconstruction. Its development will profoundly reshape the application forms and value ecology of hybrid power technology. With breakthroughs in basic theory and the maturation of enabling technologies, this field is expected to see rapid innovation in the coming years, providing core support for low-carbon transportation.