Article

Multi-Objective Scheduling Method for Integrated Energy System Containing CCS+P2G System Using Q-Learning Adaptive Mutation Black-Winged Kite Algorithm

Faculty of Electrical and Control Engineering, Liaoning Technical University, Huludao 125105, China
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(13), 5709; https://doi.org/10.3390/su17135709
Submission received: 21 April 2025 / Revised: 25 May 2025 / Accepted: 16 June 2025 / Published: 20 June 2025

Abstract

This study proposes an improved multi-objective black-winged kite algorithm (MOBKA-QL) integrating Q-learning with adaptive mutation strategies for optimizing multi-objective scheduling in integrated energy systems (IES). The algorithm dynamically selects mutation strategies through Q-learning to enhance solution diversity and accelerate convergence. First, an optimal scheduling model is established, incorporating a carbon capture system (CCS), power-to-gas (P2G), solar thermal, wind power, and energy storage to minimize economic costs and carbon emissions while maximizing energy efficiency. Second, the heat-to-power ratio of the cogeneration system is dynamically adjusted according to load demand, enabling flexible control of combined heat and power (CHP) output. The integration of CCS+P2G further reduces carbon emissions and wind curtailment, with the produced methane utilized in boilers and cogeneration systems. Hydrogen fuel cells (HFCs) are employed to mitigate cascading energy losses. Using forecasted load and renewable energy data from a specific region, dispatch experiments demonstrate that the proposed system reduces economic costs and CO2 emissions by 14.63% and 13.9%, respectively, while improving energy efficiency by 28.84%. Additionally, the adjustable heat-to-power ratio of CHP yields synergistic economic, energy, and environmental benefits.

1. Introduction

With socioeconomic development, population growth, and rising living standards, the demand for electricity, heating, and cooling resources has surged, while traditional fossil energy reserves are progressively depleting [1]. Renewable energy sources (e.g., wind and solar power) have emerged as essential alternatives to fossil fuels due to their cleanliness, sustainability, and cost-effectiveness [2,3]. CHP systems, recognized for their high efficiency, low-carbon footprint, and energy-saving potential, are widely adopted in modern power systems to simultaneously generate electricity and utilize waste heat [4]. Recently, the integration of fossil fuel-based integrated energy systems (IES) with renewable energy has garnered significant attention, as it enhances energy diversity while reducing carbon emissions [5]. However, the operational complexity of IES—driven by diverse equipment and dynamic conditions [6]—necessitates strategic optimization to ensure efficient performance [7]. To address this challenge, this study proposes an intelligent optimization algorithm that combines Q-learning with adaptive mutation strategies. The proposed approach aims to resolve the multi-device coordination and coupling issues in IES and improve overall operational efficiency and energy utilization.

1.1. Literature Review

In IES, pollutant emissions primarily originate from fossil fuel-based equipment. Integrating renewable energy (e.g., solar/wind) and clean technologies is critical for emission reduction. Zhang et al. [8,9,10] demonstrated that incorporating solar and wind power into IES not only alleviates supply–demand imbalances and climate impacts but also enhances economic performance by minimizing energy waste through wind curtailment utilization. However, the inherent intermittency of renewables remains a research challenge. Addressing this, Karolina et al. [11] developed a hydrogen storage solution using electrolysis for stable energy recovery.
Recent studies have demonstrated that integrating energy conversion and storage devices significantly enhances system performance. Li et al. [12] proposed a two-layer alternating optimization dispatch model incorporating CCS systems, quantifying both economic benefits and carbon emission reductions. Chen et al. [13,14,15] implemented P2G technology, utilizing surplus electricity for water electrolysis to produce hydrogen, which subsequently reacted with CO2 captured by CCS to synthesize methane. Their approach incorporated hydrogen storage devices and HFC to effectively reduce energy cascade losses. For renewable energy management, Marco et al. [16] verified that battery energy storage systems effectively mitigate renewable generation and load fluctuations. Through comparative scenario analysis, Rakibul et al. [17] established that a thermal energy storage tank (TES) combined with excess energy recovery could increase renewable energy penetration while reducing carbon emissions. As a core component of IES, CHP systems require the coordinated optimization of both equipment design and operational strategies. Common strategies such as following electric load (FEL) and following thermal load (FTL) exhibit scenario-dependent performance variations. Song et al. [18] systematically evaluated CHP operation strategies, emphasizing that optimal strategy selection is critical for maximizing system efficiency.
The integration of multiple devices significantly increases the coupling complexity of Integrated Energy Systems (IES). To efficiently obtain optimal solutions under complex constraints, intelligent optimization algorithms have been increasingly adopted in IES scheduling problems. Commonly used algorithms include the sparrow search algorithm (SSA) [19], genetic algorithm (GA) [20], and whale optimization algorithm (WOA) [21]. Recent advancements have focused on incorporating multi-objective optimization mechanisms and adaptive mutation strategies to enhance global search capability and solution diversity. For instance, Li et al. [22] proposed the MOAOA algorithm, which integrates non-dominated sorting and mutation operations to effectively optimize the system configuration in multi-objective scenarios. Algorithmic improvements often focus on the design of mutation strategies. Yu et al. [23] employed a time-varying Gaussian mutation strategy to address the loss of population diversity in the later stages of Particle Swarm Optimization (PSO). Li et al. [24] developed the Improved Dung Beetle Optimizer (IDBO) based on an adaptive t-distribution, which significantly enhanced both the optimization performance and search accuracy. While these methods demonstrate effective solution space exploration through distinct mutation strategies, most still employ static, predefined mechanisms without real-time environmental interaction.
In recent years, deep reinforcement learning (DRL) has demonstrated significant potential for IES modeling and scheduling owing to its strong generalization capabilities and adaptability to complex environments. Dong et al. [25] developed a soft actor–critic (SAC) algorithm and built an environmental interaction model to simulate real-time feedback for DRL training. Suo et al. [26,27] employed deep learning to reduce modeling complexity and uncertainty in IES, significantly enhancing algorithmic computational efficiency. Chen et al. [28,29] implemented Q-learning for energy management optimization, improving both sample efficiency and generalization ability. As a model-free reinforcement learning method, Q-learning dynamically learns optimal strategies through environmental interaction. When integrated into mutation selection processes, it facilitates adaptive strategy adjustment based on real-time search-state feedback, thereby effectively balancing global exploration and local exploitation.
The black-winged kite algorithm (BKA), proposed by Wang et al. in 2024 [30], is a novel metaheuristic optimization method inspired by the migratory and predatory behaviors of black-winged kites. This algorithm innovatively integrates a Cauchy mutation strategy with a leader-following mechanism. Comprehensive benchmark tests have demonstrated its effectiveness in solving constrained optimization problems. Given these advantages, we adopt BKA in this study to ensure reliable and efficient energy management in integrated energy systems.

1.2. Research Gap

(1) Existing studies have rarely integrated energy storage and conversion units—including renewable energy sources, CCS, P2G systems, and oxygen storage tanks (OST)—into a unified IES. Furthermore, the selection of CHP operation strategies, which critically impacts IES performance, often lacks synergistic coordination with equipment configurations. Most current approaches adopt fixed heat-to-power ratio strategies while failing to accommodate dynamic variations in system load fluctuations and energy demands, thereby constraining system adaptability.
(2) Current heuristic optimization methods for IES scheduling frequently exhibit premature convergence and slow convergence rates, largely attributable to static mutation mechanisms. Although Q-learning has demonstrated potential in energy management applications, its implementation has been restricted to policy-level optimization without integration into mutation operations. This limitation results in insufficient autonomous learning capacity, compromising adaptability in dynamic operating conditions. Moreover, critical multi-objective evaluation metrics (e.g., hypervolume (HV), the multi-objective value index (MOVI)) remain underutilized, highlighting persistent challenges in multi-objective optimization.

1.3. Research Contribution

A review of the literature on the optimal dispatch of IES shows that coupling wind and solar energy with these systems has become essential. Introducing the CCS+P2G system can significantly improve system operating efficiency, reduce carbon emissions, mitigate wind power curtailment, and reinforce system stability and reliability. A CHP system with an adjustable heat-to-power ratio can flexibly adapt to demand, optimize electric and heat output, improve efficiency, and lower costs. Introducing an OST supports oxyfuel combustion and reduces natural gas consumption. To address the complexity of a multi-variable IES, this paper proposes the MOBKA-QL algorithm with adaptive mutation, which leverages Q-learning to interact with the environment and enhance optimization performance.
Therefore, the main research contributions of this paper are drawn as follows:
(1)
A multi-objective optimal scheduling model for the IES was established, aiming to minimize economic cost and CO2 emissions while maximizing energy efficiency. The IES integrates multiple energy conversion and storage devices, involving technologies such as power generation, energy storage, gas production, and CCS. It can efficiently satisfy the combined load demand for power, heat, and cooling.
(2)
MOBKA-QL integrates five mutation strategies and employs Q-learning for adaptive selection during iterations, enabling environment-aware and self-learning capabilities. The original BKA is extended to handle multi-objective optimization. In this framework, solution sets are first evaluated by MOVI, then ranked based on crowding distance, and finally, the optimal Pareto solution is selected using TOPSIS, making it well-suited for IES scheduling.
(3)
The operation of CHP adopts an adjustable heat-to-power ratio strategy, which combines real-time data on demand-side loads and renewable energy to dynamically adjust energy supply. Using the same algorithm and model, we compared this strategy with a constant heat-to-power ratio strategy in a test, which verified the superiority of the former in terms of economy, environment, and energy.
The rest of this paper is organized as follows. Section 2 describes the structure and mathematical model of the IES. Section 3 presents MOBKA-QL and the scheduling model of the IES. Section 4 tests the performance of the algorithms and, through optimization experiments for a typical season under different scenarios, evaluates the energy utilization, economic benefits, and environmental benefits of the IES, as well as the choice of CHP heat-to-power ratio strategy. The results are analyzed and discussed in Section 5.

2. Integrated Energy System Modeling

The internal energy demand of the IES was addressed through the coordinated integration of diverse energy sources and supply equipment. An extended CCS model is proposed in this study, improving upon conventional frameworks. As illustrated in Figure 1, the study focuses on the efficient utilization of hydrogen energy, the adjustable heat-to-power ratio of CHP systems, and the power equipment involved in the two-stage P2G conversion process.
As illustrated in Figure 1, the IES proposed in this study comprises wind turbines, CHP units, solar collector panels, and gas boilers to supply energy for overall system operation. On the demand side, the system accommodates thermal, electrical, and cooling loads. The cooling load is met by an absorption chiller (AC) and an electric chiller (EC). The intermediate energy conversion and storage infrastructure includes a carbon capture unit, a heat storage tank, a battery, an OST, and a two-stage P2G unit. The P2G system generates hydrogen for HFC, which supplies electricity to the system. This setup facilitates the decoupling of heat and power, thereby improving operational flexibility and enabling better utilization of wind power. The integration of CCS with P2G technology reduces system emissions and mitigates wind power curtailment while supplying natural gas, electricity, and heat. Additionally, the oxygen produced during the two-stage P2G process is stored in oxygen tanks and supplied to CHP units for oxygen-enriched combustion, which decreases natural gas consumption, lowers costs, and enhances overall system efficiency.

2.1. Two-Stage P2G Operational Process

Hydrogen energy, recognized for its purity and efficiency, holds significant potential across various applications, such as hydrogen-powered vehicles and hydrogen fuel cells [31]. The two-stage operation process of P2G is illustrated in Figure 2.
In the two-stage P2G system, the direct utilization of HFCs reduces energy losses associated with multi-step conversions compared to the conventional methanation reaction (MR) pathway. Additionally, an OST is incorporated into the P2G model to store oxygen generated via water electrolysis and supply it to the CHP unit for oxyfuel combustion, thereby reducing the unit’s reliance on natural gas. The corresponding energy conversion process is formulated as follows.
(1)
Electrolytic cell
$$
\begin{cases}
W_{el,H_2,t} = \eta_{el} E_{e,el,t} \\
E_{e,el}^{\min} \le E_{e,el,t} \le E_{e,el}^{\max} \\
\Delta E_{e,el}^{\min} \le E_{e,el,t+1} - E_{e,el,t} \le \Delta E_{e,el}^{\max}
\end{cases}
$$
where $E_{e,el,t}$ represents the electrical energy input to the electrolytic cell (EL) in time period t; $W_{el,H_2,t}$ denotes the hydrogen energy output from the EL in time period t; $\eta_{el}$ indicates the energy conversion efficiency of the EL; $E_{e,el}^{\max}$ and $E_{e,el}^{\min}$ specify the upper and lower limits of the electrical energy input to the EL, respectively; and $\Delta E_{e,el}^{\max}$ and $\Delta E_{e,el}^{\min}$ refer to the upper and lower ramp-rate limits of the EL, respectively.
(2)
Methane reactor
$$
\begin{cases}
W_{mr,g,t} = \eta_{mr} W_{H_2,mr,t} \\
W_{H_2,mr}^{\min} \le W_{H_2,mr,t} \le W_{H_2,mr}^{\max} \\
\Delta W_{H_2,mr}^{\min} \le W_{H_2,mr,t+1} - W_{H_2,mr,t} \le \Delta W_{H_2,mr}^{\max}
\end{cases}
$$
where $W_{H_2,mr,t}$ represents the hydrogen energy input to the MR in time period t; $W_{mr,g,t}$ indicates the natural gas power output from the MR in time period t; $\eta_{mr}$ denotes the energy conversion efficiency of the MR; $W_{H_2,mr}^{\max}$ and $W_{H_2,mr}^{\min}$ stand for the upper and lower limits of the hydrogen energy input to the MR, respectively; and $\Delta W_{H_2,mr}^{\max}$ and $\Delta W_{H_2,mr}^{\min}$ refer to the upper and lower ramp-rate limits of the MR, respectively.
(3)
Hydrogen fuel cell
Since the electrical and thermal conversion efficiencies of the HFC can be treated as constants, the HFC is modeled as follows:
$$
\begin{cases}
E_{hfc,e,t} = \eta_{hfc,e} W_{hfc,H_2,t} \\
Q_{hfc,h,t} = \eta_{hfc,h} W_{hfc,H_2,t} \\
W_{H_2,hfc}^{\min} \le W_{H_2,hfc,t} \le W_{H_2,hfc}^{\max} \\
\Delta W_{H_2,hfc}^{\min} \le W_{H_2,hfc,t+1} - W_{H_2,hfc,t} \le \Delta W_{H_2,hfc}^{\max}
\end{cases}
$$
where $W_{hfc,H_2,t}$ represents the hydrogen energy input to the HFC in time period t; $E_{hfc,e,t}$ and $Q_{hfc,h,t}$ denote the electric and thermal energy output from the HFC in time period t, respectively; $\eta_{hfc,e}$ and $\eta_{hfc,h}$ indicate the efficiencies with which the HFC converts hydrogen to electric and thermal energy, respectively; $W_{H_2,hfc}^{\max}$ and $W_{H_2,hfc}^{\min}$ refer to the upper and lower limits of the hydrogen energy input to the HFC, respectively; and $\Delta W_{H_2,hfc}^{\max}$ and $\Delta W_{H_2,hfc}^{\min}$ stand for the upper and lower ramp-rate limits of the HFC, respectively.
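The EL, MR, and HFC models above share one pattern: a constant-efficiency conversion, an input bound, and a ramp-rate limit between consecutive periods. The following is a minimal Python sketch of that pattern. The class name `Converter`, its field names, and all numerical values are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Converter:
    """Linear energy converter with input bounds and ramp limits (hypothetical helper)."""
    efficiency: float  # eta: output energy per unit of input energy
    in_min: float      # lower bound on the input in any period
    in_max: float      # upper bound on the input in any period
    ramp_min: float    # lowest allowed change of input between periods (may be negative)
    ramp_max: float    # highest allowed change of input between periods

    def output(self, inputs: Sequence[float]) -> list[float]:
        """Output series, eta * input for each period."""
        return [self.efficiency * x for x in inputs]

    def is_feasible(self, inputs: Sequence[float], tol: float = 1e-9) -> bool:
        """Check the bound and ramp constraints over the whole horizon."""
        in_bounds = all(self.in_min - tol <= x <= self.in_max + tol for x in inputs)
        ramps_ok = all(
            self.ramp_min - tol <= later - earlier <= self.ramp_max + tol
            for earlier, later in zip(inputs, inputs[1:])
        )
        return in_bounds and ramps_ok


# Assumed electrolyzer parameters, chosen only for illustration.
el = Converter(efficiency=0.8, in_min=0.0, in_max=500.0, ramp_min=-100.0, ramp_max=100.0)
schedule = [100.0, 180.0, 250.0]   # electrical input per period
hydrogen = el.output(schedule)     # hydrogen energy output per period
feasible = el.is_feasible(schedule)
```

The same class could represent the MR or the HFC by changing the parameters; the HFC would need a second efficiency for its thermal output.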

2.2. CCS+P2G System

When CCS operates independently, it faces challenges such as the high cost of carbon sequestration and long-distance transportation. To address these issues, this study proposes the coordinated operation of P2G and CCS to achieve dual benefits. On one hand, CCS can supply the captured CO2 directly to the P2G process, thereby reducing overall system emissions. On the other hand, P2G enhances the IES utilization of clean energy, supports carbon recycling, and reduces dependence on purchased natural gas [32]. Figure 3 illustrates the carbon cycle between CCS and P2G.
The CCS+P2G coupling model is described as follows.
$$
\begin{cases}
E_{ccs,t} = \beta M_{ccs,t} \\
W_{p2g,t} = \rho_{CO_2} V_{p2g,CH_4,t} \\
W_{CO_2,H_2,t} = \rho_{H_2} V_{p2g,CH_4,t} \\
V_{p2g,CH_4,t} = 3.6\, \eta_{p2g} E_{p2g,t} / L_{CH_4} \\
W_{H_2,t} = W_{CO_2,H_2,t} + W_{hfc,H_2,t} \\
E_{hfc,e,t} = W_{hfc,H_2,t}\, \eta_{hfc,e} \\
Q_{hfc,h,t} = W_{hfc,H_2,t}\, \eta_{hfc,h}
\end{cases}
$$
where $E_{ccs,t}$ represents the energy consumption of the CCS system in time period t; $M_{ccs,t}$ denotes the amount of CO2 captured by the CCS system in time period t; $W_{p2g,t}$ indicates the amount of CO2 consumed by the P2G system in time period t; $V_{p2g,CH_4,t}$ stands for the amount of natural gas produced by the P2G system in time period t; $E_{p2g,t}$ refers to the energy consumption of the P2G system in time period t; $\beta$ is the electrical energy required to capture one unit of CO2 in the carbon capture system; $\rho_{CO_2}$ is the amount of CO2 consumed per unit of CH4 generated; $\eta_{p2g}$ specifies the P2G conversion efficiency; $L_{CH_4}$ signifies the calorific value of natural gas; $\rho_{H_2}$ is the amount of H2 consumed per unit of CH4 generated; $W_{CO_2,H_2,t}$ is the H2 that reacts with CO2 in the MR; $W_{H_2,t}$ is the total H2 produced by the EL; and $W_{hfc,H_2,t}$ is the H2 consumed by the HFC.
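To make the coupling relations concrete, the following Python sketch evaluates them for a single time period. It is an illustration under stated assumptions, not the authors' implementation: the function name `ccs_p2g_step`, the argument names, and every numerical value are hypothetical.

```python
def ccs_p2g_step(m_ccs, e_p2g, w_hfc_h2, *, beta, rho_co2, rho_h2,
                 eta_p2g, l_ch4, eta_hfc_e, eta_hfc_h):
    """Evaluate the CCS+P2G coupling relations for one time period."""
    e_ccs = beta * m_ccs                      # electricity used by carbon capture
    v_ch4 = 3.6 * eta_p2g * e_p2g / l_ch4     # methane produced by P2G
    w_co2 = rho_co2 * v_ch4                   # CO2 consumed by P2G
    w_co2_h2 = rho_h2 * v_ch4                 # H2 reacting with CO2 in the MR
    w_h2 = w_co2_h2 + w_hfc_h2                # total H2 required from the EL
    e_hfc = eta_hfc_e * w_hfc_h2              # HFC electric output
    q_hfc = eta_hfc_h * w_hfc_h2              # HFC thermal output
    return {
        "e_ccs": e_ccs, "v_ch4": v_ch4, "w_co2": w_co2,
        "w_co2_h2": w_co2_h2, "w_h2": w_h2, "e_hfc": e_hfc, "q_hfc": q_hfc,
    }


# All parameter values below are assumed for illustration only.
step = ccs_p2g_step(
    m_ccs=10.0, e_p2g=100.0, w_hfc_h2=50.0,
    beta=0.5, rho_co2=2.0, rho_h2=4.0,
    eta_p2g=0.6, l_ch4=36.0, eta_hfc_e=0.5, eta_hfc_h=0.4,
)
```

Comparing `step["w_co2"]` with the captured amount `m_ccs` indicates whether the P2G unit can absorb everything that the CCS captures in that period.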
In this setup, the energy required by the CCS+P2G integrated system was entirely supplied by wind power, aiming to maximize wind energy utilization and reduce curtailment. Any surplus wind power was subsequently allocated to the IES for electricity generation and other operational demands. The energy flow within the CCS+P2G system is detailed as follows:
$$
E_{wt,t} = E_{ccs,t} + E_{p2g,t} + E_{wte,t}
$$
where $E_{wt,t}$ represents the wind turbine output in time period t; $E_{ccs,t}$ indicates the CCS energy consumption in time period t; $E_{p2g,t}$ denotes the P2G energy consumption in time period t; and $E_{wte,t}$ signifies the electricity that the wind turbines supply to the rest of the power system in time period t.
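The wind energy balance can be rearranged to give the electricity left for the rest of the system once CCS and P2G are served. A minimal sketch follows; the function name `surplus_wind` and the sample values are assumptions, and raising an error on a negative remainder is one possible way to enforce that CCS+P2G is supplied entirely by wind.

```python
def surplus_wind(e_wt, e_ccs, e_p2g):
    """Return E_wte per period, i.e. wind output minus CCS and P2G consumption."""
    e_wte = []
    for t, (wt, ccs, p2g) in enumerate(zip(e_wt, e_ccs, e_p2g)):
        rest = wt - ccs - p2g
        if rest < 0:
            # The balance assumes CCS+P2G never draws more than the wind output.
            raise ValueError(f"period {t}: CCS+P2G demand exceeds wind output")
        e_wte.append(rest)
    return e_wte


# Assumed two-period example.
e_wte = surplus_wind([120.0, 90.0], [20.0, 15.0], [60.0, 75.0])
```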

2.3. Adjustable Thermoelectric Ratio for CHP

In cogeneration systems, electricity is typically generated by consuming natural gas. In this study, the CHP unit is capable of small-scale oxygen-enriched combustion, enhancing efficiency through the utilization of oxygen stored in the OST, which is generated during the P2G process [33]. Conventional CHP systems are typically categorized into two modes: heat-led and power-led, both of which operate under a fixed heat-to-power ratio. However, this study investigates a CHP system with an adjustable heat-to-power ratio, which dynamically adjusts to daily heating and electricity demands to improve overall operational efficiency. The corresponding operational model is described as follows:
$$
\begin{cases}
E_{chp,e,t} = \eta_{chp,e} W_{g,chp,t} \\
Q_{chp,h,t} = \eta_{chp,h} W_{g,chp,t} \\
W_{g,chp}^{\min} \le W_{g,chp,t} \le W_{g,chp}^{\max} \\
\Delta W_{g,chp}^{\min} \le W_{g,chp,t+1} - W_{g,chp,t} \le \Delta W_{g,chp}^{\max} \\
\kappa_{chp}^{\min} \le Q_{chp,h,t} / E_{chp,e,t} \le \kappa_{chp}^{\max}
\end{cases}
$$
where $W_{g,chp,t}$ represents the natural gas power input to the CHP in time period t; $E_{chp,e,t}$ and $Q_{chp,h,t}$ indicate the electrical and thermal energy output from the CHP in time period t, respectively; $\eta_{chp,e}$ and $\eta_{chp,h}$ denote the efficiencies with which the CHP converts natural gas to electrical and thermal energy, respectively; $W_{g,chp}^{\max}$ and $W_{g,chp}^{\min}$ stand for the upper and lower limits of the natural gas power input to the CHP, respectively; $\Delta W_{g,chp}^{\max}$ and $\Delta W_{g,chp}^{\min}$ signify the upper and lower ramp-rate limits of the CHP, respectively; and $\kappa_{chp}^{\max}$ and $\kappa_{chp}^{\min}$ refer to the upper and lower limits of the CHP's heat-to-power ratio, respectively.
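The heat-to-power ratio constraint can be checked directly from a candidate pair of CHP outputs. The sketch below is a hedged illustration: the function names, the treatment of an idle unit, and all numbers are assumptions. Note that with the two linear output equations the ratio equals $\eta_{chp,h}/\eta_{chp,e}$, so adjusting the ratio amounts to choosing a different efficiency pair within the allowed band.

```python
def chp_outputs(w_gas, eta_e, eta_h):
    """Return (electric, heat) output for a given natural gas input."""
    return eta_e * w_gas, eta_h * w_gas


def ratio_within_limits(q_heat, e_elec, kappa_min, kappa_max, tol=1e-9):
    """Check kappa_min <= Q/E <= kappa_max for one period."""
    if e_elec <= tol:
        # The ratio is undefined without electric output; accept only an idle unit.
        return q_heat <= tol
    ratio = q_heat / e_elec
    return kappa_min - tol <= ratio <= kappa_max + tol


# Assumed efficiencies and ratio limits, for illustration only.
e_chp, q_chp = chp_outputs(w_gas=200.0, eta_e=0.35, eta_h=0.45)
ok = ratio_within_limits(q_chp, e_chp, kappa_min=0.5, kappa_max=2.0)
too_hot = ratio_within_limits(q_heat=300.0, e_elec=100.0, kappa_min=0.5, kappa_max=2.0)
```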

2.4. Integrated Energy System Equipment

(1)
Wind turbine
The electrical power output of wind turbines is calculated by:
$$
W_{wt} =
\begin{cases}
0, & v_x < v_{ci} \ \text{or} \ v_x > v_{co} \\
W_{wt,r}\, (v_x - v_{ci}) / (v_r - v_{ci}), & v_{ci} \le v_x \le v_r \\
W_{wt,r}, & v_r \le v_x \le v_{co}
\end{cases}
$$
where $W_{wt}$ represents the electrical power of the wind turbine; $W_{wt,r}$ is its rated power; $v_x$ denotes the actual wind speed at the site; $v_{ci}$ and $v_{co}$ signify the cut-in and cut-out wind speeds of the wind turbine, respectively; and $v_r$ indicates the rated wind speed of the wind turbine.
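The piecewise wind power curve translates directly into a small function. This is a sketch with assumed turbine parameters (cut-in 3 m/s, rated 12 m/s, cut-out 25 m/s, rated power 1000 kW); none of these values come from the paper.

```python
def wind_power(v, v_ci, v_r, v_co, p_rated):
    """Piecewise-linear wind turbine power curve."""
    if v < v_ci or v > v_co:
        return 0.0                                   # below cut-in or above cut-out
    if v <= v_r:
        return p_rated * (v - v_ci) / (v_r - v_ci)   # linear ramp up to rated speed
    return p_rated                                   # rated output up to cut-out


# Assumed turbine data for illustration.
speeds = [2.0, 7.5, 12.0, 20.0, 26.0]
powers = [wind_power(v, v_ci=3.0, v_r=12.0, v_co=25.0, p_rated=1000.0) for v in speeds]
```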
(2)
Solar thermal collector
The thermal power of the solar thermal collectors is calculated by:
$$
Q_{st} = \gamma\, \eta_{st}\, S\, W_{st}
$$
where $Q_{st}$, $\gamma$, and $\eta_{st}$ represent the collector power, unit conversion coefficient, and efficiency of the collector, respectively; $S$ and $W_{st}$ stand for the collector area and solar radiation intensity, respectively.
(3)
Gas boiler
The gas boiler (GB) is activated when the heat generated by the rest of the system is insufficient, and it burns natural gas to cover the shortfall. The heat produced is calculated by:
$$
Q_{gb} = V_{gb}\, \eta_{gb}
$$
where $Q_{gb}$, $V_{gb}$, and $\eta_{gb}$ represent the heat generation, gas consumption, and thermal efficiency of the GB equipment, respectively.
(4)
Refrigeration equipment
The system was equipped with two types of refrigeration units: absorption chillers (ACs) and electric chillers (ECs). The AC utilized thermal energy (typically from CHP waste heat) to meet cooling demand, while the EC operated using electrical energy. The allocation of cooling load between these two units depended on the CHP system’s operational mode, particularly the availability of thermal energy.
The corresponding calculation models for the refrigeration units are presented as follows:
$$
\begin{cases}
C_{ac} = Q_{ac}\, \eta_{ac} \\
C_{ec} = E_{ec}\, \eta_{ec}
\end{cases}
$$
where $C_{ac}$, $Q_{ac}$, and $\eta_{ac}$ represent the cooling power, heat energy absorbed, and refrigeration efficiency of an absorption chiller, respectively; $C_{ec}$, $E_{ec}$, and $\eta_{ec}$ signify the cooling power, electrical power consumed, and refrigeration efficiency of an electric chiller, respectively.
(5)
Battery
The battery in the system functions to store and release electrical energy. When there is surplus electricity, the battery stores the excess energy; when the electricity supply is insufficient, it discharges the stored energy to meet the demand.
The formula for calculating battery charging and discharging is given as follows:
$$
E_{t}^{ba} = E_{char,t-1}^{ba}\, \eta_{char}^{ba} - E_{dis,t-1}^{ba} / \eta_{dis}^{ba} + (1 - \eta_{loss}^{ba})\, E_{t-1}^{ba}
$$
where $E_{t}^{ba}$, $E_{char,t}^{ba}$, and $E_{dis,t}^{ba}$ denote the state of charge, charging power, and discharging power of the battery in time period t, respectively; $\eta_{char}^{ba}$, $\eta_{dis}^{ba}$, and $\eta_{loss}^{ba}$ represent the charging efficiency, discharging efficiency, and loss coefficient of the battery, respectively.
(6)
Thermal energy storage tank
The TES system improves thermal energy utilization by capturing and storing excess heat, thereby helping to compensate for thermal energy deficits within the system. The processes of heat absorption and release in TES are calculated as follows:
$$
Q_{t}^{tes} = Q_{char,t-1}^{tes}\, \eta_{char}^{tes} - Q_{dis,t-1}^{tes} / \eta_{dis}^{tes} + (1 - \eta_{loss}^{tes})\, Q_{t-1}^{tes}
$$
where $Q_{t}^{tes}$, $Q_{char,t}^{tes}$, and $Q_{dis,t}^{tes}$ represent the heat storage capacity, heat absorption power, and heat release power of the heat storage tank at time t, respectively; $\eta_{char}^{tes}$, $\eta_{dis}^{tes}$, and $\eta_{loss}^{tes}$ indicate the heat storage efficiency, heat release efficiency, and loss coefficient of the heat storage tank, respectively.
(7)
Oxygen storage tank
The OST recovers the oxygen by-product of the electrolysis reaction so that it is not wasted. The stored oxygen is injected into the CHP unit for oxygen-enriched combustion and power generation, which enhances equipment operating efficiency and curtails energy consumption. The OST storage is calculated as follows:
$$
V_{t}^{ost} = V_{t-1}^{ost} + \eta_{char}^{ost}\, V_{char,t}^{ost} - V_{dis,t}^{ost} / \eta_{dis}^{ost}
$$
where $V_{t-1}^{ost}$ represents the volume of oxygen remaining in the oxygen storage tank at time t − 1; $V_{char,t}^{ost}$ and $V_{dis,t}^{ost}$ signify the amount of oxygen charged and discharged at time t, respectively; and $\eta_{char}^{ost}$ and $\eta_{dis}^{ost}$ denote the efficiency of oxygen charging and discharging in the tank, respectively.
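The battery, thermal storage tank, and oxygen tank all follow the same kind of recurrence: the new level is the retained previous level plus the efficiency-weighted charge minus the efficiency-weighted discharge. The sketch below simulates that recurrence. The function names and the numbers are illustrative assumptions; the OST case is obtained by setting the loss coefficient to zero, and the small difference in time indexing between the formulas is ignored here.

```python
def next_level(level, charge, discharge, eta_char, eta_dis, loss=0.0):
    """One step of the storage recurrence used for battery, TES, and OST."""
    return charge * eta_char - discharge / eta_dis + (1.0 - loss) * level


def simulate(level0, charges, discharges, eta_char, eta_dis, loss=0.0):
    """Return the storage level at the start and after each period."""
    levels = [level0]
    for c, d in zip(charges, discharges):
        levels.append(next_level(levels[-1], c, d, eta_char, eta_dis, loss))
    return levels


# Assumed battery data: charge 20 in the first period, discharge 9 in the second.
battery = simulate(100.0, [20.0, 0.0], [0.0, 9.0], eta_char=0.9, eta_dis=0.9, loss=0.01)
# Assumed oxygen tank data: no self-discharge loss term.
oxygen = simulate(50.0, [10.0], [0.0], eta_char=0.95, eta_dis=0.95)
```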

3. Multi-Objective Optimization Method for IES Based on MOBKA-QL

Based on this framework, a multi-objective optimization dispatch model for the IES was developed to minimize economic costs and pollutant emissions while maximizing energy utilization efficiency. The model was addressed using the MOBKA-QL algorithm. Detailed formulations of the model and the corresponding solution methodology are presented below.

3.1. Decision Variables

The CHP unit, the core equipment of the IES, operates in an adjustable heat-to-power ratio mode and outputs both heat and electricity. It therefore significantly enhances the overall energy efficiency of the system and strongly influences the operation of the other equipment. The gas boiler (GB) is engaged to meet the heat demand when the heat supply is inadequate. The AC and the EC are treated as parts of the heat load and the electric load, respectively, and they meet the cooling load by converting heat and electricity into cooling, which diversifies how heat and power are used. Meanwhile, the CCS+P2G system curtails carbon emissions and consumes wind energy, and the resulting methane and oxygen are supplied to the CHP unit for combustion. The use of HFCs also diminishes multi-stage energy losses. However, the output of these resources cannot be controlled artificially.
The following decision variables were set to coordinate the optimal operating states of the various devices in the IES:
$$
X = [E_{buy}, E_{chp,e}, Q_{chp,h}, E_{hfc,e}, Q_{hfc,h}, Q_{ac}, E_{ec}, E_{ccs}, E_{el}, Q_{gb}, E_{ba}, V_{ost}, Q_{tes}]
$$
Other variables can be obtained through coupling constraints.

3.2. Objective Function

The synergistic benefits of multi-objective scheduling in IES were realized by optimizing equipment configurations and operational strategies and addressing complex constraints, all while ensuring system feasibility. A high-dimensional, multi-objective optimization model was established to minimize economic costs, enhance energy efficiency, and reduce carbon emissions. The proposed model offers a comprehensive framework for IES operation optimization, demonstrating significant improvements in economic performance and environmental impact.
(1)
Economic dispatch: Economic cost minimization
The economic cost of the system encompasses several components: the operational expenses associated with the CCS unit, the costs for operating and maintaining energy conversion and storage equipment, and the expenditure for procuring energy to optimize both electrical and thermal energy use. These components are mathematically formulated as follows:
$$
f_1 = \sum_{t=1}^{T} \left( \mu_t E_{ccs,t} + \varpi M_{ccs,t} \right)
$$
$$
f_2 = \sum_{i=1}^{m} \left( c_i \sum_{t=1}^{T} W_{i,t} \right)
$$
$$
f_3 = f_{buy} + f_{gas}
$$
$$
\begin{cases}
f_{buy} = \sum_{t=1}^{T} c_{buy,t} E_{buy,t} \\
f_{gas} = c_{gas,t} \sum_{t=1}^{T} \left( V_{chp,t} + V_{gb,t} \right)
\end{cases}
$$
$$
\min F_1 = f_1 + f_2 + f_3
$$
where $\mu_t$ represents the operating coefficient of the carbon capture system; $\varpi$ denotes the cost coefficient of carbon capture and storage of CO2; $M_{ccs,t}$ signifies the amount of CO2 captured and stored in time period t; $c_{buy,t}$ stands for the price of electricity in time period t; $c_i$ refers to the operating cost coefficient of device i; $m$ is the number of devices; $W_{i,t}$ symbolizes the output of device i in time period t; $f_{buy}$ and $f_{gas}$ are the costs of purchasing electricity and natural gas, respectively; and $c_{gas,t}$ is the price of natural gas in time period t.
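As a worked illustration of the economic objective, the following sketch adds up the CCS cost, the device operation cost, and the energy purchase cost over a short horizon. The function signature, the `devices` mapping, and all prices and quantities are assumptions made for this example; the gas price is treated as a single scalar.

```python
def economic_cost(e_ccs, m_ccs, mu, varpi, devices, c_buy, e_buy, c_gas, v_chp, v_gb):
    """Return F1 = f1 + f2 + f3 for one scheduling horizon.

    devices maps a device name to (cost coefficient c_i, list of outputs W_i,t).
    """
    # f1: CCS operating cost plus capture-and-storage cost
    f1 = sum(mu_t * e + varpi * m for mu_t, e, m in zip(mu, e_ccs, m_ccs))
    # f2: operation and maintenance cost of each device over the horizon
    f2 = sum(c_i * sum(outputs) for c_i, outputs in devices.values())
    # f3: purchased electricity plus purchased natural gas
    f_buy = sum(price * energy for price, energy in zip(c_buy, e_buy))
    f_gas = c_gas * sum(a + b for a, b in zip(v_chp, v_gb))
    return f1 + f2 + f_buy + f_gas


# Two-period example with assumed data.
total = economic_cost(
    e_ccs=[10.0, 20.0], m_ccs=[4.0, 6.0], mu=[0.1, 0.1], varpi=0.5,
    devices={"chp": (0.02, [50.0, 60.0]), "gb": (0.01, [10.0, 0.0])},
    c_buy=[0.5, 1.0], e_buy=[20.0, 10.0],
    c_gas=2.0, v_chp=[5.0, 6.0], v_gb=[1.0, 0.0],
)
```

In this example the three components are 8.0, 2.3, and 44.0, so purchased energy dominates the total.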
(2)
Energy dispatch: Energy efficiency maximizing
The IES reinforced energy utilization efficiency by incorporating various energy sources. In this study, both the quantity and quality of energy were considered to maximize energy efficiency, which served as the second optimization objective [29]. Energy use efficiency effectively evaluated the high-quality consumption of energy and ensured the optimal use of different energy types. The specific expressions for this are:
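$$
\max F_2 = \frac{\left[ \sum_{t=1}^{T} \left( E_{Eload,t} + E_{char,t}^{ba} + \omega_1 \left( Q_{Qload,t} + Q_{char,t}^{tes} \right) + \omega_2 C_{Cload,t} + E_{ccs,t} + E_{p2g,t} \right) \right]_{\max}}{\left[ \sum_{t=1}^{T} \left( E_{buy,t} + E_{wt,t} + E_{dis,t}^{ba} + \omega_1 \left( Q_{dis,t}^{tes} + Q_{hfc,h,t} + Q_{st,t} \right) + E_{hfc,e,t} + L_{CH_4} \left( V_{gas,t} + V_{mr,t} \right) \right) \right]_{\min}}
$$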
where the numerator and the denominator represent the required load and energy supply of the integrated energy system, respectively; $E_{Eload,t}$, $Q_{Qload,t}$, and $C_{Cload,t}$ indicate the electrical, thermal, and cooling loads in time period t, respectively; $E_{char,t}^{ba}$ denotes the electrical energy stored by the battery in time period t; $Q_{char,t}^{tes}$ signifies the thermal energy stored in the thermal energy storage tank in time period t; $E_{dis,t}^{ba}$ describes the electrical energy released from the battery in time period t; $Q_{dis,t}^{tes}$ expresses the thermal energy released by the thermal energy storage tank in time period t; $V_{gas,t}$ indicates the natural gas purchased by the system; $V_{mr,t}$ reflects the natural gas supplied by the MR in the P2G process; $\omega_1$ specifies the conversion coefficient between thermal and electrical energy; and $\omega_2$ denotes the conversion coefficient between cooling and electrical energy.
(3)
Low-carbon dispatch: Carbon dioxide emissions minimizing
The total carbon emissions of the system were diminished to reduce environmental pollution, highlighting the environmental advantages of the IES. The objective function for the system’s pollutant gas emissions is defined as:
$$
\min F_3 = \sum_{t=1}^{T} \left( E_{chp,t} \sum_{j=1}^{n} c_j m_{chp,j} + Q_{gb,t} \sum_{j=1}^{n} c_j m_{gb,j} + E_{buy,t} \sum_{j=1}^{n} m_{buy,j} - M_{ccs,t} \right)
$$
where $m_{chp,j}$, $m_{gb,j}$, and $m_{buy,j}$ represent the emission factors of pollutant j for the CHP, the GB, and electricity purchased from the grid, respectively; $n$ denotes the number of pollutant gas types, mainly CO2, SO2, and NOx; and $M_{ccs,t}$ indicates the amount of CO2 captured by the carbon capture and storage system.
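The emission objective can be evaluated in the same way as the cost objective. The sketch below is an assumption-laden illustration: the coefficient list `c` stands for the per-pollutant weights $c_j$, which the text does not define, the purchased-electricity term carries no such weight, and the captured CO2 is subtracted from the total. All numbers are invented for the example.

```python
def emissions(e_chp, q_gb, e_buy, m_ccs, c, m_chp, m_gb, m_buy):
    """Return F3, the weighted pollutant emissions net of captured CO2."""
    k_chp = sum(cj * mj for cj, mj in zip(c, m_chp))   # weighted CHP emission factor
    k_gb = sum(cj * mj for cj, mj in zip(c, m_gb))     # weighted GB emission factor
    k_buy = sum(m_buy)                                 # grid factor, unweighted
    return sum(
        e * k_chp + q * k_gb + b * k_buy - captured
        for e, q, b, captured in zip(e_chp, q_gb, e_buy, m_ccs)
    )


# Single-period example with two pollutant types; all values are assumed.
f3 = emissions(
    e_chp=[100.0], q_gb=[40.0], e_buy=[10.0], m_ccs=[20.0],
    c=[1.0, 0.5], m_chp=[0.4, 0.02], m_gb=[0.2, 0.01], m_buy=[0.8, 0.05],
)
```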

3.3. Constraints

(1)
Output constraints for devices in the system
$$
\begin{cases}
0 \le E_{wt,t} \le E_{wt}^{\max} \\
0 \le Q_{st,t} \le Q_{st}^{\max} \\
0 \le E_{ccs,t} \le E_{ccs}^{\max} \\
Q_{gb,t} = \eta_{gb} W_{gas,gb,t} \\
W_{gas,gb}^{\min} \le W_{gas,gb,t} \le W_{gas,gb}^{\max} \\
\Delta W_{gas,gb}^{\min} \le W_{gas,gb,t+1} - W_{gas,gb,t} \le \Delta W_{gas,gb}^{\max}
\end{cases}
$$
where $E_{wt}^{\max}$ signifies the upper limit of wind power output; $Q_{st}^{\max}$ refers to the upper limit of the thermal power captured by the solar collectors; $E_{ccs}^{\max}$ reflects the upper limit of CCS energy consumption; $W_{gas,gb}^{\max}$ and $W_{gas,gb}^{\min}$ describe the upper and lower limits of the input power to the GB, respectively; and $\Delta W_{gas,gb}^{\max}$ and $\Delta W_{gas,gb}^{\min}$ symbolize the upper and lower ramp-rate limits of the GB, respectively.
(2)
Electricity purchase constraints
The power that the system purchases from the main grid is limited:
$$0 \le E_{\mathrm{buy},t} \le E_{\mathrm{buy}}^{\max}$$
where $E_{\mathrm{buy}}^{\max}$ represents the maximum power of purchased electricity in time period $t$.
(3)
Energy storage device constraints
The three types of energy storage devices considered in this paper (electrical, thermal, and oxygen) share the same operating mechanism. Thus, the energy storage devices were modeled in a standardized manner [34], expressed as:
$$\left\{\begin{array}{l}
0 \le W^{\mathrm{char}}_{\mathrm{es},n,t} \le B^{\mathrm{char}}_{\mathrm{es},n,t} W^{\max}_{\mathrm{es},n} \\
0 \le W^{\mathrm{dis}}_{\mathrm{es},n,t} \le B^{\mathrm{dis}}_{\mathrm{es},n,t} W^{\max}_{\mathrm{es},n} \\
W_{\mathrm{es},n,t} = W^{\mathrm{char}}_{\mathrm{es},n,t} / \eta^{\mathrm{char}}_{\mathrm{es},n} - W^{\mathrm{dis}}_{\mathrm{es},n,t} / \eta^{\mathrm{dis}}_{\mathrm{es},n} \\
S_{n,t} = S_{n,t-1} + W_{\mathrm{es},n,t} / W^{\mathrm{cap}}_{\mathrm{es},n} \\
S_{n,1} = S_{n,T} \\
B^{\mathrm{char}}_{\mathrm{es},n,t} + B^{\mathrm{dis}}_{\mathrm{es},n,t} = 1 \\
S^{\min}_{n} \le S_{n,t} \le S^{\max}_{n}
\end{array}\right.$$
where $W^{\mathrm{char}}_{\mathrm{es},n,t}$ and $W^{\mathrm{dis}}_{\mathrm{es},n,t}$ represent the charging and discharging power of the $n$th type of energy storage device in time period $t$, respectively; $W^{\max}_{\mathrm{es},n}$ denotes the maximum power of the $n$th type of energy storage device in a single charge or discharge; $B^{\mathrm{char}}_{\mathrm{es},n,t}$ and $B^{\mathrm{dis}}_{\mathrm{es},n,t}$ are binary variables reflecting the charging and discharging states of the $n$th type of energy storage device in time period $t$; $B^{\mathrm{char}}_{\mathrm{es},n,t} = 1$ and $B^{\mathrm{dis}}_{\mathrm{es},n,t} = 0$ indicate the charging state; $B^{\mathrm{char}}_{\mathrm{es},n,t} = 0$ and $B^{\mathrm{dis}}_{\mathrm{es},n,t} = 1$ indicate the discharging state; $W_{\mathrm{es},n,t}$ describes the final output power of the $n$th type of energy storage device in time period $t$; $\eta^{\mathrm{char}}_{\mathrm{es},n}$ and $\eta^{\mathrm{dis}}_{\mathrm{es},n}$ refer to the charging and discharging efficiencies of the $n$th type of energy storage device, respectively; $S_{n,t}$ stands for the capacity of the $n$th type of energy storage device in time period $t$; $W^{\mathrm{cap}}_{\mathrm{es},n}$ embodies the rated capacity of the $n$th type of energy storage device; and $S^{\max}_{n}$ and $S^{\min}_{n}$ signify the upper and lower limits of the capacity of the $n$th type of energy storage device, respectively.
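The standardized storage model can be illustrated with a one-period state update. The sketch below follows the net-power and capacity expressions as printed in the constraints above; the function name and the choice to raise `ValueError` on a violated constraint are assumptions for illustration only.

```python
def step_storage(s_prev, w_char, w_dis, eta_char, eta_dis,
                 w_cap, w_max, s_min, s_max):
    """Advance the capacity state of one storage device by one period."""
    # The binary state variables forbid charging and discharging together.
    if w_char > 0 and w_dis > 0:
        raise ValueError("device cannot charge and discharge in the same period")
    if not (0 <= w_char <= w_max and 0 <= w_dis <= w_max):
        raise ValueError("charging or discharging power exceeds its limit")
    # Net power as printed: W = W_char / eta_char - W_dis / eta_dis.
    w_net = w_char / eta_char - w_dis / eta_dis
    # Capacity state is updated relative to the rated capacity.
    s_next = s_prev + w_net / w_cap
    if not (s_min <= s_next <= s_max):
        raise ValueError("capacity state leaves its allowed range")
    return s_next
```

The cyclic condition $S_{n,1} = S_{n,T}$ is not enforced here, because it couples the first and last periods and has to be checked over the whole horizon.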
(4)
The constraints of CCS, P2G, and CHP equipment are expressed in Equations (1)–(3) and (6).
(5)
Cooling power balance constraints
$$\left\{\begin{array}{l}
Q_{\mathrm{ac}}^{\min} \le Q_{\mathrm{ac},t} \le Q_{\mathrm{ac}}^{\max} \\
E_{\mathrm{ec}}^{\min} \le E_{\mathrm{ec},t} \le E_{\mathrm{ec}}^{\max} \\
Q_{\mathrm{ac},t} + E_{\mathrm{ec},t} = C_{\mathrm{Cload},t}
\end{array}\right.$$
where $Q_{\mathrm{ac},t}$ represents the power required by the absorption chiller in time period $t$; $Q_{\mathrm{ac}}^{\min}$ and $Q_{\mathrm{ac}}^{\max}$ signify the lower and upper limits of the power of the absorption chiller, respectively; $E_{\mathrm{ec},t}$ indicates the power required by the electric chiller in time period $t$; $E_{\mathrm{ec}}^{\min}$ and $E_{\mathrm{ec}}^{\max}$ denote the lower and upper limits of the power of the electric chiller, respectively; and $C_{\mathrm{Cload},t}$ embodies the cooling load demand in time period $t$.
(6)
Electric power balance constraints
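$$E_{\mathrm{chp},t} + E_{\mathrm{wt},t} + E_{\mathrm{hfc},t} + E^{\mathrm{ba}}_{\mathrm{char},t} + E_{\mathrm{buy},t} = E_{\mathrm{ec},t} + E_{\mathrm{p2g},t} + E_{\mathrm{ccs},t} + E_{\mathrm{Eload},t} + E^{\mathrm{ba}}_{\mathrm{dis},t}$$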
(7)
Thermal power balance constraints
$$Q_{\mathrm{st},t} + Q_{\mathrm{chp},t} + Q_{\mathrm{gb},t} + Q_{\mathrm{hfc},t} + Q_{\mathrm{tes,char},t} = Q_{\mathrm{ac},t} + Q_{\mathrm{Qload},t} + Q_{\mathrm{tes,dis},t}$$

3.4. Improved Multi-Objective Black-Winged Kite Algorithm with Adaptive Mutation Based on Q-Learning

The Black-winged Kite Algorithm (BKA), as a novel metaheuristic approach, demonstrates efficient optimization capabilities for constrained problems while exhibiting strong robustness and superior convergence performance. To address the challenges of multi-objective IES optimization—including obtaining well-distributed Pareto solutions and avoiding local optima—this study enhances the original BKA by integrating multi-objective optimization with Pareto ranking, Q-learning for adaptive parameter tuning, and multiple mutation strategies to maintain population diversity. The resulting MOBKA-QL algorithm effectively solves the proposed model, achieving balanced optimization across competing objectives.

3.4.1. BKA

The BKA was developed by simulating the predatory and migratory behaviors of black-winged kites in nature. It adopts a global search strategy inspired by whole-map migration patterns. The mechanism is defined as follows:
(1)
Initialization phase
$$y_{i,j} = BK_{dl} + \mathrm{rand} \times \left(BK_{ul} - BK_{dl}\right)$$
(2)
Attacking behavior
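$$y^{t+1}_{i,j} = \begin{cases} y^{t}_{i,j} + n\left(1 + \sin(r)\right) \times y^{t}_{i,j}, & p < r \\ y^{t}_{i,j} + n\left(2r - 1\right) \times y^{t}_{i,j}, & \text{else} \end{cases}$$
$$n = 0.05 \times e^{-2 \times \left(t/T\right)^{2}}$$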
(3)
Migration behavior
$$y^{t+1}_{i,j} = \begin{cases} y^{t}_{i,j} + C(0,1) \times \left(y^{t}_{i,j} - L^{t}_{j}\right), & F_i < F_{ri} \\ y^{t}_{i,j} + C(0,1) \times \left(L^{t}_{j} - m \times y^{t}_{i,j}\right), & \text{else} \end{cases}$$
$$m = 2 \times \sin\left(r + \pi/2\right)$$
$$f(x, \delta, \mu) = \frac{1}{\pi} \frac{\delta}{\delta^{2} + (x - \mu)^{2}}, \quad -\infty < x < \infty$$
where $BK_{dl}$ and $BK_{ul}$ represent the lower and upper bounds of the $i$th black-winged kite in the $j$th dimension, respectively; $y^{t}_{i,j}$ and $y^{t+1}_{i,j}$ denote the position of the $i$th black-winged kite in the $j$th dimension in the $t$th and $(t+1)$th iterations, respectively; $r$ indicates a random number between 0 and 1; $p$ signifies the parameter that selects between the two attack behaviors; $T$ refers to the total number of iterations; $t$ stands for the number of iterations completed so far; $L^{t}_{j}$ describes the leader (the best position found so far) in the $j$th dimension at the $t$th iteration; $F_i$ expresses the fitness value of the current position of any black-winged kite in the $j$th dimension at the $t$th iteration; $F_{ri}$ embodies the fitness value of a random position of any black-winged kite in the $j$th dimension at the $t$th iteration; and $C(0, 1)$ symbolizes the Cauchy mutation defined in Equation (33).
The traditional BKA has shortcomings such as weak global exploration ability, parameter sensitivity, and slow convergence, which are addressed in this paper.
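The three BKA phases described above can be sketched in a few lines of standard-library Python. This is an illustrative reading of the printed update rules, not the authors' implementation: the function names are hypothetical, one random number `r` is drawn per individual, and one Cauchy sample is drawn per dimension, both of which are assumptions.

```python
import math
import random

def initialize(pop, lower, upper, rng=random):
    """Uniform random positions between the lower and upper bounds."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(pop)]

def attack(y, t, T, p=0.9, rng=random):
    """Attacking behavior: the step scale n decays with iteration t."""
    n = 0.05 * math.exp(-2 * (t / T) ** 2)
    r = rng.random()
    if p < r:
        return [yj + n * (1 + math.sin(r)) * yj for yj in y]
    return [yj + n * (2 * r - 1) * yj for yj in y]

def cauchy(rng=random):
    """Standard Cauchy sample C(0, 1) via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))

def migrate(y, leader, f_i, f_ri, rng=random):
    """Migration behavior: the branch depends on the fitness comparison."""
    m = 2 * math.sin(rng.random() + math.pi / 2)
    if f_i < f_ri:
        return [yj + cauchy(rng) * (yj - lj) for yj, lj in zip(y, leader)]
    return [yj + cauchy(rng) * (lj - m * yj) for yj, lj in zip(y, leader)]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is useful when comparing mutation strategies.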

3.4.2. Selection of Multiple Mutation Strategies for MOBKA-QL

The original BKA tends to excessively focus on certain regions of the solution space while neglecting others, leading to premature convergence near local optima and hindering its ability to locate the global optimum. In IES, the complexity increases with the number of devices and operational constraints, resulting in an extensive range of possible scheduling plans. Therefore, it is essential for the optimization algorithm to escape local optima to explore more feasible and diverse operating strategies. Mutation strategies address this issue by enhancing population diversity and expanding the search space. By incorporating adaptive mutation strategies, our research group has significantly improved BKA’s performance in solving complex optimization problems.
This paper integrates thirteen commonly used mutation strategies into the BKA algorithm, as summarized in Table 1. The mutation operations generate updated individuals (denoted as mx), which represent modified versions of the original solutions x [24,25,35,36,37,38,39]. Through multi-iteration performance evaluation, five superior mutation strategies were selected, with their comparative effectiveness illustrated in Figure 4.
As shown in Figure 4, the five most effective mutation strategies were identified as periodic variation, random-elite differential variation, random differential variation, elite differential variation, and heterogeneous variation. Therefore, all five mutation strategies were incorporated into the migration behavior update formula (Equation (31)).

3.4.3. Implementation of Adaptive Mutation Strategies Based on Q-Learning

In this study, Q-learning serves as a control mechanism to dynamically select optimal mutation strategies during each iteration, based on real-time system states and evaluation results. The algorithm updates the Q-table through a reward mechanism that evaluates Multi-Objective Variation Index (MOVI) differences between consecutive iterations, thereby adaptively adjusting strategy selection probabilities. This approach enables optimal action selection at each iteration stage, significantly enhancing the algorithm’s adaptability and search efficiency.
MOVI quantitatively measures solution set diversity and evaluates mutation strategy effectiveness. When MOVI increases, it provides positive feedback indicating improved solution set distribution quality. This mechanism guides the algorithm toward more diverse exploration patterns, effectively reducing local optima entrapment risks while promoting stable convergence behavior.
In this mechanism, the algorithm’s actions interact with the environment: each selected action modifies the system state, while the environment provides feedback through a reward function that quantifies the action’s effectiveness [40]. This feedback subsequently guides future decision-making processes. This design achieves an effective balance between reinforcing successful mutation strategies and preserving solution diversity, eliminating the need for additional performance indicators. Consequently, it establishes a dynamic yet stable search scheduling mechanism. The Q-table update formula is formally defined as:
$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$
where $Q(s, a)$ represents the Q-value of taking action $a$ in state $s$; $\alpha$ is the learning rate; $r$ denotes the reward obtained after executing that action; $\gamma$ signifies the discount factor; and $Q(s', a')$ embodies the Q-value of taking action $a'$ in the next state $s'$.
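The update rule above is the standard one-step Q-learning update, which can be sketched as follows. The nested-list Q-table, the integer indexing of states and actions, and the default values of `alpha` and `gamma` are assumptions for illustration; the paper's actual settings are not given in this section.

```python
def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(s, a).

    q is a list of rows, one row of action values per state.
    """
    best_next = max(q[s_next])            # max over a' of Q(s', a')
    td_error = r + gamma * best_next - q[s][a]
    q[s][a] += alpha * td_error
    return q[s][a]
```

For example, with five actions (one per mutation strategy) and two states, `q = [[0.0] * 5 for _ in range(2)]` creates an empty table.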
In this paper, Q-learning was employed to achieve adaptive mutation of the update formula, with the following parameters requiring definition:
(1)
State
The one-step Q-learning method implemented in this study utilizes only immediate state–action pairs for Q-value updates, fulfilling dynamic search requirements during iterations. This computationally efficient approach ensures fast environmental adaptation and excellent real-time decision-making performance.
(2)
Action
Following the above discussion, the five mutation strategies that most improved MOBKA-QL were selected. These strategies form the action set used to dynamically adjust the search capability and position at different stages.
Action 1: Mutation strategy selects periodic variation.
Action 2: Mutation strategy selects random-elite differential variation.
Action 3: Mutation strategy selects random differential variation.
Action 4: Mutation strategy selects elite differential variation.
Action 5: Mutation strategy selects heterogeneous variation.
(3)
Reward
In this paper, MOVI is used to evaluate optimization performance by comparing its values between consecutive iterations. A positive reward (+2) is given when MOVI increases relative to the previous iteration, and a negative reward (−1) is assigned otherwise. This asymmetric reward mechanism reinforces effective exploration while avoiding the premature elimination of strategies. The design balances learning efficiency and policy stability by encouraging the retention of well-performing mutation strategies while maintaining exploration capability.
To verify the effectiveness of the reward settings, a sensitivity analysis was conducted using three schemes: (1) symmetric (+1/−1), (2) over-penalized (+1/−5), and (3) the proposed asymmetric positive scheme (+2/−1). As shown in Table 2, the proposed reward configuration achieves better performance in MOVI improvement, convergence speed, and the number of non-dominated solutions, demonstrating higher learning efficiency and multi-objective optimization capability.
The final reward scheme is therefore set as follows:
$$r_t = \begin{cases} +2, & \text{if } MOVI_t > MOVI_{t-1} \\ -1, & \text{else} \end{cases}$$
where $r_t$ and $MOVI_t$ represent the reward and metric values of state $t$, respectively.
(4)
Epsilon calculation
The value of epsilon, which controls the trade-off between exploration and exploitation in the ε-greedy strategy, is computed dynamically by the following equation. By adjusting epsilon dynamically, the algorithm adopts different strategies at different learning stages, thereby enhancing learning efficiency and effectiveness. In the initial stage, epsilon is set higher to encourage extensive exploration and comprehensive experience acquisition; in later stages, epsilon is gradually reduced to emphasize the exploitation of the learned knowledge. This smooth transition balances exploration and exploitation, ultimately leading to improved overall learning performance.
$$\mathrm{epsilon} = W_a - \left(W_a - W_b\right)\left(t/T\right)$$
where $W_a$ represents the initial high exploration weight and $W_b$ indicates the final low exploration weight.
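The reward rule and the linear epsilon schedule can be combined into a small action-selection sketch. The default values `w_a=0.9` and `w_b=0.1` and the function names are assumptions chosen for illustration; the source does not report the weights in this section.

```python
import random

def reward(movi_now, movi_prev):
    """Asymmetric reward: +2 if MOVI increased, otherwise -1."""
    return 2 if movi_now > movi_prev else -1

def epsilon(t, T, w_a=0.9, w_b=0.1):
    """Linear decay from w_a at t = 0 to w_b at t = T."""
    return w_a - (w_a - w_b) * (t / T)

def select_action(q_row, eps, rng=random):
    """Epsilon-greedy choice among the mutation strategies."""
    if rng.random() < eps:
        return rng.randrange(len(q_row))                      # explore
    return max(range(len(q_row)), key=q_row.__getitem__)      # exploit
```

With `eps` close to `w_a` early in the run, most iterations pick a random strategy; near `t = T` the strategy with the largest Q-value is chosen most of the time.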

3.4.4. Multi-Objective Optimization of the MOBKA-QL Algorithm

The BKA was originally designed for single-objective optimization, using fitness values to evaluate solutions and guide search behavior. To address the simultaneous optimization of economy, carbon emissions, and energy efficiency in IES, this study extends the original BKA framework to a multi-objective version called MOBKA-QL. This new algorithm combines Q-learning with a Pareto dominance mechanism, enabling dynamic updates of the solution set and effective simultaneous optimization of multiple objectives.
In MOBKA-QL, the evaluation of individual solutions no longer depends on a single fitness value but is instead based on Pareto ranking, which improves the balance of the solution set. In particular, considering that the original BKA used fitness-based comparisons to determine migration behavior, this study proposed an improved strategy: update rules are selected based on the dominance relationship between the current individual $y^{t}_{i,j}$ and the historical Pareto-optimal archive $P^{*}$. As a convergence-oriented behavior, migration enhances local search ability and accelerates convergence to optimal solutions.
To further improve adaptability in complex multi-objective environments, a Q-learning-based dynamic mutation strategy was embedded into the migration process. This mechanism allows the algorithm to adjust mutation strategies based on feedback from the search environment, thus achieving a better balance between global exploration and local exploitation while enhancing solution diversity and convergence stability.
The specific logic of the improved migration behavior is as follows:
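$$y^{t+1}_{i,j} = \begin{cases} y^{t}_{i,j} + C(0,1) \times \left(y^{t}_{i,j} - L^{t}_{j}\right), & \text{if } \exists\, x^{*} \in P^{*} : x^{*} \prec y^{t}_{i} \\ y^{t}_{i,j} + C(0,1) \times \left(L^{t}_{j} - m \times y^{t}_{i,j}\right), & \text{else} \end{cases}$$
where $x^{*} \prec y^{t}_{i}$ means that a solution $x^{*}$ in the historical Pareto archive $P^{*}$ dominates the current individual $y^{t}_{i}$, and $L^{t}_{j}$ indicates the optimal solution in the $j$-th objective within the historical Pareto archive.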
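The branch condition in the improved migration rule is a Pareto dominance test against the archive. A minimal sketch is shown below, assuming every objective is expressed as a minimization (so a maximized objective such as energy efficiency would be negated first); the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def dominated_by_archive(f, archive):
    """True if at least one archived objective vector dominates f."""
    return any(dominates(x, f) for x in archive)
```

When `dominated_by_archive` returns `True`, the first branch of the migration rule applies; otherwise the second branch is used.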

3.4.5. Optimization Result Selection for the MOBKA-QL Algorithm

In this paper, the optimal solutions from the previous and current generations were combined and sorted by Pareto dominance to preserve the best solutions across iterations. As the optimization progresses and the solution sets merge, the Pareto front gradually grows. To maintain an even distribution of individuals, the crowding distance was used to measure population density [41]. The size of the solution set was controlled by imposing an upper limit on the number of retained individuals. If the number of individuals at the optimal dominance level did not exceed this limit, all were included in the new population. Otherwise, individuals were selected in descending order of crowding distance until the limit was reached, and the remaining individuals were discarded.
The multi-objective crowding distance is calculated as:
$$n_d = n_{d_i} + \frac{F_m(i+1) - F_m(i-1)}{F_{\max} - F_{\min}}$$
where $F_m(i+1)$ and $F_m(i-1)$ indicate the values of the objective function for the two neighboring individuals after and before the black-winged kite individual $i$, respectively; $F_{\max}$ and $F_{\min}$ are the maximum and minimum values of that objective function; $n_{d_i}$ denotes the crowding distance for a particular objective function; and $n_d$ signifies the total crowding distance over multiple objective functions.
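The crowding-distance accumulation can be sketched as below. Assigning an infinite distance to the boundary individuals of each objective follows the common NSGA-II convention and is an assumption here, as is the function name; the source only gives the interior formula.

```python
def crowding_distance(front):
    """Crowding distance for a list of objective vectors on one front."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(len(front[0])):
        # Sort individual indices by the m-th objective value.
        order = sorted(range(n), key=lambda i: front[i][m])
        f_min, f_max = front[order[0]][m], front[order[-1]][m]
        # Boundary individuals are always kept (assumed convention).
        dist[order[0]] = dist[order[-1]] = float("inf")
        if f_max == f_min:
            continue  # no spread in this objective, nothing to add
        for k in range(1, n - 1):
            i = order[k]
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[i] += gap / (f_max - f_min)
    return dist
```

Truncating the archive then amounts to sorting individuals by this distance in descending order and keeping the first ones up to the size limit.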
After the Pareto ranking and crowding-distance calculations, the solution whose TOPSIS value is closest to 1 is selected from the resulting solution set. This solution is taken as the optimal scheduling solution.
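A TOPSIS score in the sense used above can be computed as the relative closeness of each solution to the ideal point. The sketch below uses vector normalization and equal weights, both of which are assumptions, since the paper does not state its normalization or weighting in this section; the function name is hypothetical.

```python
import math

def topsis(front, benefit, weights=None):
    """Return the closeness score in [0, 1] for each objective vector.

    benefit[j] is True when objective j should be maximized.
    """
    n_obj = len(front[0])
    weights = weights or [1.0 / n_obj] * n_obj
    # Vector normalization per objective (guard against an all-zero column).
    norms = [math.sqrt(sum(row[j] ** 2 for row in front)) or 1.0
             for j in range(n_obj)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_obj)]
         for row in front]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal point
        d_neg = math.dist(row, worst)   # distance to the anti-ideal point
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 0.0)
    return scores
```

The selected schedule is the one with the largest score, for example `best = max(range(len(scores)), key=scores.__getitem__)`.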

3.4.6. MOBKA-QL Algorithm Steps

Figure 5 illustrates the detailed solution process of the MOBKA-QL.

3.4.7. Benchmark Testing and Result Analysis of the MOBKA-QL Algorithm

To verify the effectiveness of the improved MOBKA-QL algorithm in multi-objective optimization, four representative benchmark algorithms—MOSSA, MOPSO, MODE, and MOGOOSE—were selected for comparison. Performance evaluations were conducted on two standard test functions, Viennet2 and DTLZ2. The experimental results are presented in Figure 6, which illustrates the Pareto front distributions obtained by each algorithm under the respective test functions.
Figure 6a shows the results for the Viennet2 function. Due to its pronounced curvature and turning regions on the Pareto front, this function poses challenges to both global exploration and local exploitation capabilities. As can be seen from the figure, the solution set obtained by MOBKA-QL is overall well-distributed, with clear boundaries and moderate density. It not only covers a majority of the Pareto front but also effectively fills the sparse regions with significant curvature changes, demonstrating strong resolution and adaptability. Figure 6b,c depict the Pareto front distributions of the DTLZ2 test function, where Figure 6b provides a side view and Figure 6c a front view. From Figure 6b, it can be observed that the solution set of MOBKA-QL closely adheres to the theoretical Pareto front surface, forming a regular and coherent shape, indicating good convergence. In Figure 6c, the MOBKA-QL solutions exhibit higher density and more uniform distribution, achieving comprehensive coverage of the front surface in the objective space. In contrast, the other algorithms display varying degrees of deviation or non-uniformity in both views, with phenomena such as solution drift or sparsity, suggesting that MOBKA-QL offers superior stability and global coverage capability.

4. Simulation and Analysis

A series of experimental analyses were designed and conducted with load and energy forecast data for typical seasons in a specific region to validate the effectiveness of the proposed IES model and MOBKA-QL algorithm. The comparative analysis of the experimental results further revealed the feasibility and advantages of the proposed method in practical applications. The specific methods of this study are exhibited in Figure 7.

4.1. Original Data

The experiments in this section were based on typical seasonal forecast data for cooling, heating, electric load, and renewable energy in specific regions of China. The scheduling period spanned 24 h, with each hour representing one time step. The forecasts of renewable energy production, electricity, heating, and cooling loads are depicted in Figure 8. The parameters of each device in IES are provided in Table 3.

4.2. Optimization of Algorithm Parameters

The experimental parameters include the population size (pop), the number of iterations (T), the parameter (p) controlling the attack behavior in the BKA, and the upper limit of congestion (AC). The levels of these four parameters are listed in Table 4. Taguchi’s method was employed to examine the effect of these parameters on the performance of the algorithm. The results of the orthogonal experiments are provided in Table 5, with RV as the response variable for the three target means [42]. Figure 9 illustrates the results of different parameters in three cases, showing that the MOBKA-QL algorithm performed best at pop = 300, T = 200, p = 0.9, and AC = 50.

4.3. Algorithm Comparison Test

In this paper, MODBO, MOSSA, and BKA were selected as comparison algorithms to evaluate the performance of MOBKA-QL, which incorporates an adaptive mutation strategy. Using the same model settings, each algorithm was independently executed 50 times to ensure fairness. According to the parameter optimization experiment in Section 4.2, the population size and maximum number of iterations were set to their optimal values. The algorithms’ performance was assessed using five multi-objective evaluation metrics: HV, confidence interval, sample mean, solution running time, and Spread. The running time was averaged over 10 runs.
The test results of the four algorithms under the same model are presented in Table 6. MOBKA-QL outperformed the other three algorithms on the HV, Spread, and Mean metrics. Its confidence intervals were also better than those of the other three algorithms; although they overlap for some objective functions, this does not weaken its overall advantage. The running time of MOBKA-QL (244.2983 s) was slightly longer than that of BKA (238.6687 s), a difference that is acceptable given the improvement in the other performance metrics. Thus, the algorithm achieved a favorable balance between solution quality and running efficiency.
The Pareto frontier curves for the optimal solutions of the three objective functions are depicted in Figure 10 to provide a deeper analysis of the four algorithms’ performance. The figure demonstrates that the population distribution of the MOBKA-QL algorithm was more structured and spanned a wider range, showcasing a more balanced advantage across the three dimensions of energy efficiency, carbon emissions, and economic cost. Particularly, the solutions of the MODBO and MOSSA algorithms had a certain degree of concentration regarding energy efficiency and carbon emissions. Nevertheless, their economic costs were not as good as MOBKA-QL, and the population distribution was relatively dispersed. The BKA algorithm also performed well in terms of energy efficiency, whereas its performance concerning carbon emissions and economic costs was suboptimal, resulting in an uneven distribution of its overall solution. This verifies that MOBKA-QL possessed stronger exploration and exploitation capabilities than MODBO, MOSSA, and BKA.
Figure 11 shows the curves of the three objective functions over the number of iterations. At the 200th iteration, MOBKA-QL outperformed the other three algorithms in all three objective functions: economic cost (Figure 11a), carbon emissions (Figure 11b), and energy utilization efficiency (Figure 11c). Although MOBKA-QL escaped local optima slightly more slowly than the other algorithms, it showed a clear convergence trend within the first 50 iterations. Overall, MOBKA-QL demonstrated strong global search capabilities and produced superior solutions.

4.4. Model Comparison Test

In this study, a conventional integrated energy system (CIES) without the CCS+P2G model was used as a reference, and the performance improvement of the IES containing the CCS+P2G model was analyzed in depth. Figure 12 illustrates the block diagram of the CIES. The values of the three objective functions were then compared to evaluate the advantages and disadvantages of the CIES and the IES.
Three key metrics, energy utilization efficiency (EUE), the carbon dioxide emission reduction rate (CDERR), and the annual total cost savings rate (ATCSR), were adopted in this study to comprehensively assess the system’s performance. The formulas for these metrics are:
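$$\left\{\begin{array}{l}
ATCSR = \left(COST_{\mathrm{CIES}} - COST_{\mathrm{IES}}\right) / COST_{\mathrm{CIES}} \\
EUE = \left(E_{\mathrm{IES}} - E_{\mathrm{CIES}}\right) / E_{\mathrm{CIES}} \\
CDERR = \left(CDE_{\mathrm{CIES}} - CDE_{\mathrm{IES}}\right) / CDE_{\mathrm{CIES}}
\end{array}\right.$$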
where COST represents the economic cost of the system; E denotes the energy utilization efficiency of the system; and CDE indicates the carbon dioxide emissions of the system.
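These three metrics are relative differences with respect to the CIES baseline: ATCSR and CDERR measure a reduction, while EUE measures an increase. A small sketch follows; the helper names and the numbers in the usage note are hypothetical and are not results from the paper.

```python
def relative_saving(reference, value):
    """Fractional reduction relative to the reference (ATCSR, CDERR form)."""
    return (reference - value) / reference

def relative_gain(reference, value):
    """Fractional increase relative to the reference (EUE form)."""
    return (value - reference) / reference
```

For instance, with illustrative costs of 1000 for the CIES and 850 for the IES, `relative_saving(1000.0, 850.0)` gives an ATCSR of about 0.15.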
Table 7 lists the performance metrics of the IES under the optimization model. The data suggest that the IES model introduced in this paper markedly outperformed the CIES model in economic, energy efficiency, and environmental benefits, revealing its significant advantages under multi-dimensional optimization.
Figure 13 presents the Pareto frontier of the three objective functions for the CIES and IES systems. Each point represents an optimal solution for the corresponding system. The surface illustrates the confidence regions associated with these optimal solutions, thereby better illustrating the distribution range of the solution sets. The figure indicates that the optimal solution surface of the IES system encompassed that of the CIES system, particularly in the low-carbon emission region. Moreover, the IES system achieved significantly higher energy efficiency and lower economic costs compared to the CIES system. Therefore, the IES system demonstrated significant advantages in overall optimization.

4.5. Operation Strategy Optimization Analysis

The heat-to-power ratio in cogeneration systems is a crucial metric for evaluating both energy efficiency and economic performance. It is essential for the design and optimization of such systems. This paper investigates the operational efficiency and economic benefits of CHP systems under different heat-to-power ratio strategies.
Strategy 1: Electricity determines heat (EDH).
Strategy 2: Heat determines electricity (HDE).
Strategy 3: Adjustable thermoelectric ratio (ATR).
Using the typical seasonal temperature and energy demand as an example, Figure 14 depicts the temporal variation of the system’s adjustable heat-to-power ratio.
MOBKA-QL was utilized to optimize the system. The Pareto solutions of the IES under the three strategies are illustrated in Figure 15: a three-dimensional plot of the three solution sets over the three objective functions (Figure 15d) and two-dimensional projections from different angles (Figure 15a–c). The solution set of the ATR strategy was more concentrated and exhibited stability and equilibrium, achieving lower economic cost and carbon emissions while ensuring higher energy utilization. In contrast, the HDE strategy maintained a good energy utilization rate, but its economic cost and carbon emission performance were poorer. Moreover, the EDH strategy presented a more dispersed solution set, and its energy utilization rate was significantly lower than that of the other two strategies. Hence, the ATR strategy demonstrated significant advantages in multiple objectives.
Five key metrics were utilized to evaluate the performance of the three strategies relative to the CIES system: energy utilization efficiency (EUE), the boiler energy saving rate (BESR), the carbon dioxide emission reduction rate (CDERR), the primary energy saving rate (PESR), and the annual total cost savings rate (ATCSR). The equations are:
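$$\left\{\begin{array}{l}
BESR = \left(ES_{\mathrm{CIES}} - ES_{\mathrm{IES}}\right) / ES_{\mathrm{CIES}} \\
PESR = \left(PE_{\mathrm{CIES}} - PE_{\mathrm{IES}}\right) / PE_{\mathrm{CIES}}
\end{array}\right.$$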
where ES represents the boiler energy consumption of the system and PE denotes the primary energy consumption of the system.
Figure 16 presents the evaluation results of the optimal solutions for different strategies. It demonstrates that the adjustable heat-to-power ratio strategy outperformed the constant heat-to-power ratio strategy in terms of EUE, CDERR, and ATCSR. Specifically, the electricity-determined-by-heat strategy achieved the highest BESR at 0.73, due to its prioritization of thermal energy supply, which maximizes boiler energy efficiency. The adjustable heat-to-power ratio strategy had a slightly lower BESR of 0.64 but still showed advantages in overall performance thanks to its flexibility. Regarding PESR, the electric heat strategy reached the highest value of 0.25. Although the adjustable heat-to-power ratio strategy’s economic cost-saving rate was slightly lower at 0.21, it still achieved superior overall economic cost optimization. Overall, the adjustable heat-to-power ratio strategy offers the best trade-off among EUE, ATCSR, and CDERR metrics, demonstrating significantly better comprehensive performance than alternative approaches.

4.6. Analysis of Typical Seasonal Operations

The IES model featuring an adjustable heat-to-power ratio was optimized using MOBKA-QL and TOPSIS. Figure 17 illustrates that the system output primarily supports user-side cooling, heating, and electricity load demands and the energy needs of internal system components.
As demonstrated by the adjustable heat-to-power ratio (Figure 14) and raw operational data (Figure 8), electrical load demand reaches its minimum during nighttime hours while thermal load remains consistently high and stable. During this period, wind turbines achieve peak output in the early night phase, satisfying the majority of power demand. When the system prioritizes CHP operation for thermal load supply, the accompanying electricity generation effectively fills the remaining power gap, resulting in the system’s maximum heat-to-power ratio. During daytime operation, electrical load rises significantly to high levels, coinciding with peak output from solar thermal collectors. Here, the system switches CHP operation to prioritize electricity generation. The relatively small accompanying thermal output is supplemented by gas boilers, thereby reducing the heat-to-power ratio to its minimum value. Throughout this operational cycle, battery storage, thermal energy storage tanks, and the upper-level grid participate in coordinated energy coupling. This multi-agent interaction ensures stable operation of the IES, enabling the efficient collaborative dispatch of multiple energy sources.
When the heat-to-power ratio exceeded 1, the AC served as the primary cooling source, with the EC handling supplementary load. When the ratio fell below 1, the EC assumed the dominant role while the AC provided auxiliary support. This operational strategy, based on heat-to-power ratio characteristics, significantly improved both energy efficiency and economic performance.
Figure 18 presents the action selection frequencies of the Q-learning mechanism during iterative optimization. The convergence of reward curves for all actions reflects the increasing challenge and varying pace in locating optimal solutions through mutation operations. These observations demonstrate the solution set’s asymptotic convergence toward Pareto-optimal solutions.
To visualize optimal CHP dispatch under different scenarios, 17:00 was selected for its representative device operational states. Figure 19 shows real-time energy flows from supply-side sources (wind power, CHP) to demand-side loads (electric/thermal/cooling, storage, and converters) via the multi-energy network. This demonstrates the system’s complexity and flexibility in multi-energy coordination across scenarios.

5. Conclusions

This study constructed an IES coupled with wind and solar energy and developed a dispatch model for this system. The MOBKA-QL algorithm was proposed to solve the optimization problem. Based on five evaluation metrics, the superiority of the IES was verified, the optimal heat-to-power ratio operation strategy was selected, and the stability of the MOBKA-QL algorithm was analyzed. The findings are as follows:
(1)
The proposed IES integrated with CCS+P2G demonstrated significant advantages over CIES during seasonal evaluation, achieving a 14.6% reduction in economic costs, a 13.9% decrease in carbon emissions, and a 28.8% improvement in energy efficiency. These results clearly indicate that the CCS+P2G-enhanced integrated energy system outperforms conventional systems in terms of operational efficiency, energy conservation, emission reduction, and cost-effectiveness. The performance metrics validate the substantial improvements offered by this innovative system configuration compared to traditional approaches.
(2)
The experimental results for other typical seasons further confirm that the ATR strategy consistently outperformed the constant-ratio strategies. Specifically, compared with the EDH strategy, the ATR strategy reduced economic costs by 9.54%, decreased CO2 emissions by 11.5%, and improved system energy efficiency by 3.3%. When compared with the HDE strategy, the ATR strategy achieved reductions of 16.1% in economic cost and 20.1% in carbon emissions, along with a 0.8% improvement in energy efficiency. These results demonstrate that the ATR strategy provides significant advantages in minimizing operating costs, reducing environmental impact, and enhancing overall energy performance.
(3) An adaptive mutation strategy based on Q-learning was integrated into the BKA. Evaluated through the MOVI, this approach prevented the population from converging to local optima and increased mutation diversity. The algorithm was also extended to multi-objective optimization to improve its adaptability to complex problems. The results demonstrate that MOBKA-QL outperformed both the original BKA and other representative algorithms (e.g., MOPSO, MODE, and MOSSA) on the IES scheduling problem, yielding a wider Pareto front and higher solution accuracy.
However, this study still has certain limitations. For example, the impact of wind power forecast errors has not been systematically considered, the equipment models are relatively simplified, and the generalization ability of the algorithm in complex scenarios requires further validation. In addition, the current research is primarily based on small- to medium-scale systems, and a systematic evaluation of computational scalability and operational efficiency in larger or real-time IES systems is lacking. Future work will focus on enhancing the adaptability and practicality of the algorithm in complex systems, including scenarios such as multi-time-scale scheduling, carbon trading mechanisms, and demand response integration. Meanwhile, the regional adaptability of the model will be assessed under varying energy pricing mechanisms and infrastructure conditions to improve its generalizability.

Author Contributions

Conceptualization, R.S. and N.T.; methodology, R.S.; validation, N.T.; investigation, Z.F.; resources, Z.F.; data curation, X.Y.; writing—original draft preparation, R.S., X.Y., Z.F. and N.T.; writing—review and editing, R.S.; supervision, N.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (61601212, 52177047) and the Liaoning Provincial Department of Education Fund (LJ2019JL011, LJ2017QL012).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

AC: Absorption chiller
ATCSR: Annual total cost savings rate
ATR: Adjustable thermoelectric ratio
Ba: Battery
BKA: Black-winged kite algorithm
CCS: Carbon capture system
CDERR: Carbon dioxide emission reduction ratio
CHP: Combined heat and power
CIES: Conventional integrated energy systems
Cload: Cooling load
EC: Electric chiller
EDH: Electrically determined heat
EL: Electrolytic cell
Eload: Electric load
BESR: Boiler energy savings rate
EUE: Energy utilization efficiency
GB: Gas boiler
HDE: Heat determined electricity
HFC: Hydrogen fuel cell
HV: Hypervolume
IES: Integrated energy systems
MOBKA-QL: Multi-objective black-winged kite algorithm based on Q-learning
MODBO: Multi-objective dung beetle optimizer
MOSSA: Multi-objective sparrow search algorithm
MOVI: Multi-objective variation index
MR: Methane reactor
OST: Oxygen storage tank
P2G: Power to gas
PESR: Primary energy saving rate
RV: Response variable
ST: Solar thermal
TES: Thermal energy storage tank
Tload: Thermal load
WT: Wind turbine
Parameters
$E_{e,el}$: Electrical energy input to the electrolytic cell, kW
$W_{el,H_2}$: Hydrogen energy output by the electrolytic cell, kW
$\eta_{el}$: Energy conversion efficiency of the electrolytic cell
$W_{mr,H_2}$: Hydrogen energy input to the methane reactor, kW
$W_{mr,g}$: Natural gas output of the methane reactor, kW
$\eta_{mr}$: Energy conversion efficiency of the methane reactor
$W_{hfc,H_2}$: Hydrogen energy input to the hydrogen fuel cell, kW
$E_{hfc,e}$: Electrical energy output of the hydrogen fuel cell, kW
$Q_{hfc,h}$: Thermal energy output of the hydrogen fuel cell, kW
$\eta_{hfc,e}$: Electrical conversion efficiency of the hydrogen fuel cell
$\eta_{hfc,h}$: Thermal conversion efficiency of the hydrogen fuel cell
$E_{ccs}$: Electricity consumed by the carbon capture system, kW
$M_{ccs}$: Carbon dioxide captured by the carbon capture system
$W_{p2g}$: Carbon dioxide consumed by power to gas
$V_{p2g,CH_4}$: Amount of gas produced by power to gas
$E_{p2g}$: Electricity consumed by power to gas, kW
$\rho_{CO_2}$: Carbon dioxide consumed to produce a unit of methane
$\eta_{p2g}$: Energy conversion efficiency of power to gas
$L_{CH_4}$: Calorific value of natural gas
$\rho_{H_2}$: Hydrogen consumed to produce a unit of methane
$W_{CO_2,H_2}$: Hydrogen that reacts with carbon dioxide in the methane reactor
$W_{H_2}$: Total hydrogen produced by water electrolysis
$W_{hfc,H_2}$: Hydrogen consumed by the hydrogen fuel cell
$E_{wt}$: Electricity output of the wind turbine, kW
$Q_{st}$: Thermal output of the solar thermal unit, kW
$W_{g,chp}$: Natural gas power input to the combined heat and power unit, kW
$E_{chp,e}$: Electrical power output of the combined heat and power unit, kW
$Q_{chp,h}$: Thermal energy output of the combined heat and power unit, kW
$\eta_{chp,e}$: Electrical conversion efficiency of the combined heat and power unit
$\eta_{chp,h}$: Thermal conversion efficiency of the combined heat and power unit
$\kappa_{chp}$: Thermoelectric ratio of the combined heat and power unit
$Q_{gb}$: Heat output of the gas boiler, kW
$V_{gb}$: Gas consumed by the gas boiler
$\eta_{gb}$: Energy conversion efficiency of the gas boiler
$C_{ac}$: Cooling power of the absorption chiller, kW
$C_{ec}$: Cooling power of the electric chiller, kW
$Q_{ac}$: Thermal energy absorbed by the absorption chiller, kW
$E_{ec}$: Electrical power consumed by the electric chiller, kW
$\eta_{ac}$: Refrigeration efficiency of the absorption chiller
$\eta_{ec}$: Refrigeration efficiency of the electric chiller
$E_{char}^{ba}$: Charging power of the battery, kW
$E_{dis}^{ba}$: Discharging power of the battery, kW
$\eta_{char}^{ba}$: Charging efficiency of the battery
$\eta_{dis}^{ba}$: Discharging efficiency of the battery
$\eta_{loss}^{ba}$: Battery loss factor
$Q_{char}^{tes}$: Heat charging power of the heat storage tank, kW
$Q_{dis}^{tes}$: Heat discharging power of the heat storage tank, kW
$\eta_{char}^{tes}$: Heat storage efficiency of the heat storage tank
$\eta_{dis}^{tes}$: Heat release efficiency of the heat storage tank
$\eta_{loss}^{tes}$: Loss coefficient of the heat storage tank
$V^{ost}$: Volume of oxygen in the tank
$V_{char}^{ost}$: Oxygen charged into the tank
$V_{dis}^{ost}$: Oxygen discharged from the tank
$\eta_{char}^{ost}$: Oxygen storage coefficient of the oxygen storage tank
$\eta_{dis}^{ost}$: Oxygen discharge coefficient of the oxygen storage tank
$W_{gas,gb}$: Input power of the gas boiler, kW
$E_{buy}$: Electricity purchased from the grid, kW
$B_{es,n}$: Binary variable of the nth energy storage device
$W_{es,n}$: Charging and discharging power of the nth energy storage device, kW
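
To illustrate how the conversion parameters listed above fit together, the following minimal sketch chains the electrolytic cell and the methane reactor of the two-stage P2G process and clamps a load-following heat-to-power ratio to the CHP bounds. It assumes simple linear relations (output = efficiency × input) and a ratio set to thermal load divided by electric load; both are illustrative assumptions, not the paper's exact model. The efficiencies and bounds are the Table 3 values, while the function names, the `h2_share_to_mr` argument, and any input values are hypothetical.

```python
# Illustrative sketch only: the linear efficiency relations and the
# load-following ratio rule are assumptions, not the paper's equations.

ETA_EL = 0.85     # electrolytic cell efficiency (Table 3)
ETA_MR = 0.7      # methane reactor efficiency (Table 3)
KAPPA_MIN = 0.5   # lower bound of the CHP thermoelectric ratio (Table 3)
KAPPA_MAX = 2.0   # upper bound of the CHP thermoelectric ratio (Table 3)


def two_stage_p2g(e_el_kw, h2_share_to_mr=1.0):
    """Return (hydrogen produced, methane produced, hydrogen left over) in kW.

    e_el_kw is the electricity fed to the electrolytic cell; h2_share_to_mr
    is the (hypothetical) fraction of hydrogen sent on to the methane reactor.
    """
    if not 0.0 <= h2_share_to_mr <= 1.0:
        raise ValueError("h2_share_to_mr must lie in [0, 1]")
    w_el_h2 = ETA_EL * e_el_kw           # stage 1: electricity -> hydrogen
    w_mr_h2 = h2_share_to_mr * w_el_h2   # hydrogen routed to the reactor
    w_mr_g = ETA_MR * w_mr_h2            # stage 2: hydrogen -> methane
    return w_el_h2, w_mr_g, w_el_h2 - w_mr_h2


def adjustable_ratio(thermal_load_kw, electric_load_kw):
    """Heat-to-power ratio that follows the loads, clamped to the CHP bounds."""
    if electric_load_kw <= 0:
        return KAPPA_MAX
    ratio = thermal_load_kw / electric_load_kw
    return min(KAPPA_MAX, max(KAPPA_MIN, ratio))
```

Under these assumptions, hydrogen that is not routed to the methane reactor remains available for other uses such as the hydrogen fuel cell, and the clamped ratio stays within the range that the CHP unit is allowed to operate in.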

References

  1. Ul’yanin, Y.A.; Kharitonov, V.V.; Yurshina, D.Y. Forecasting the dynamics of the depletion of conventional energy resources. Stud. Russ. Econ. Dev. 2018, 29, 153–160. [Google Scholar] [CrossRef]
  2. Olabi, A.G.; Obaideen, K.; Abdelkareem, M.A.; AlMallahi, M.N.; Shehata, N.; Alami, A.H.; Mdallal, A.; Hassan, A.A.M.; Sayed, E.T. Wind energy contribution to the sustainable development goals: Case study on London array. Sustainability 2023, 15, 4641. [Google Scholar] [CrossRef]
  3. Pourasl, H.H.; Barenji, R.V.; Khojastehnezhad, V.M. Solar energy status in the world: A comprehensive review. Energy Rep. 2023, 10, 3474–3493. [Google Scholar] [CrossRef]
  4. Bagherian, M.A.; Mehranzamir, K.; Pour, A.B.; Rezania, S.; Taghavi, E.; Nabipour-Afrouzi, H.; Dalvi-Esfahani, M.; Alizadeh, S.M. Classification and analysis of optimization techniques for integrated energy systems utilizing renewable energy sources: A review for CHP and CCHP systems. Processes 2021, 9, 339. [Google Scholar] [CrossRef]
  5. Zhao, J.; Luo, X.; Tu, Z.; Chan, S.H. A novel CCHP system based on a closed PEMEC-PEMFC loop with water self-supply. Appl. Energy 2023, 338, 120921. [Google Scholar] [CrossRef]
  6. Zou, D.; Gong, D.; Ouyang, H. A non-dominated sorting genetic approach using elite crossover for the combined cooling, heating, and power system with three energy storages. Appl. Energy 2023, 329, 120227. [Google Scholar] [CrossRef]
  7. Pan, C.; Jin, T.; Li, N.; Wang, G.; Hou, X.; Gu, Y. Multi-objective and two-stage optimization study of integrated energy systems considering P2G and integrated demand responses. Energy 2023, 270, 126846. [Google Scholar] [CrossRef]
  8. Chen, Z.; Yiliang, X.; Hongxia, Z.; Yujie, G.; Xiongwen, Z. Optimal design and performance assessment for a solar powered electricity, heating and hydrogen integrated energy system. Energy 2023, 262, 125453. [Google Scholar] [CrossRef]
  9. Meng, Q.; Xu, J.; Ge, L.; Wang, Z.; Wang, J.; Xu, L.; Tang, Z. Economic optimization operation approach of integrated energy system considering wind power consumption and flexible load regulation. J. Electr. Eng. Technol. 2024, 19, 209–221. [Google Scholar] [CrossRef]
  10. Li, Z.; Zhu, X.; Huang, X.; Tian, Y.; Huang, B. Sustainability design and analysis of a regional energy supply CHP system by integrating biomass and solar energy. Sustain. Prod. Consum. 2023, 41, 228–241. [Google Scholar] [CrossRef]
  11. Zaik, K.; Werle, S. Solar and wind energy in Poland as power sources for electrolysis process-A review of studies and experimental methodology. Int. J. Hydrogen Energy 2023, 48, 11628–11639. [Google Scholar] [CrossRef]
  12. Li, J.; He, X.; Li, W.; Zhang, M.; Wu, J. Low-carbon optimal learning scheduling of the power system based on carbon capture system and carbon emission flow theory. Electr. Power Syst. Res. 2023, 218, 109215. [Google Scholar] [CrossRef]
  13. Chen, Z.; Zhang, Y.; Ji, T.; Cai, Z.; Li, L.; Xu, Z. Coordinated optimal dispatch and market equilibrium of integrated electric power and natural gas networks with P2G embedded. J. Mod. Power Syst. Clean Energy 2018, 6, 495–508. [Google Scholar] [CrossRef]
  14. Calise, F.; Cappiello, F.L.; Cimmino, L.; D’aCcadia, M.D.; Vicidomini, M. Dynamic simulation and thermoeconomic analysis of a power to gas system. Renew. Sustain. Energy Rev. 2023, 187, 113759. [Google Scholar] [CrossRef]
  15. He, K.; Zeng, L.; Yang, J.; Gong, Y.; Zhang, Z.; Chen, K. Optimization Strategy for Low-Carbon Economy of Integrated Energy System Considering Carbon Capture-Two Stage Power-to-Gas Hydrogen Coupling. Energies 2024, 17, 3205. [Google Scholar] [CrossRef]
  16. Stecca, M.; Elizondo, L.R.; Soeiro, T.B.; Bauer, P.; Palensky, P. A comprehensive review of the integration of battery energy storage systems into distribution networks. IEEE Open J. Ind. Electron. Soc. 2020, 1, 46–65. [Google Scholar] [CrossRef]
  17. Hassan, R.; Das, B.K.; Al-Abdeli, Y.M. Investigation of a hybrid renewable-based grid-independent electricity-heat nexus: Impacts of recovery and thermally storing waste heat and electricity. Energy Convers. Manag. 2022, 252, 115073. [Google Scholar] [CrossRef]
  18. Song, Z.; Liu, T.; Lin, Q. Multi-objective optimization of a solar hybrid CCHP system based on different operation modes. Energy 2020, 206, 118125. [Google Scholar] [CrossRef]
  19. Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control Eng. 2020, 8, 22–34. [Google Scholar] [CrossRef]
  20. Gen, M.; Lin, L. Genetic algorithms and their applications. In Springer Handbook of Engineering Statistics; Springer: London, UK, 2023; pp. 635–674. [Google Scholar]
  21. Rana, N.; Latiff, M.S.A.; Abdulhamid, S.M.; Chiroma, H. Whale optimization algorithm: A systematic review of contemporary applications, modifications and developments. Neural Comput. Appl. 2020, 32, 16245–16277. [Google Scholar] [CrossRef]
  22. Li, L.L.; Ren, X.Y.; Tseng, M.L.; Wu, D.-S.; Lim, M.K. Performance evaluation of solar hybrid combined cooling, heating and power systems: A multi-objective arithmetic optimization algorithm. Energy Convers. Manag. 2022, 258, 115541. [Google Scholar] [CrossRef]
  23. Yu, H.; Gao, Y.; Wang, J. A multiobjective particle swarm optimization algorithm based on competition mechanism and gaussian variation. Complexity 2020, 2020, 5980504. [Google Scholar] [CrossRef]
  24. Li, S.; Li, J. Chaotic dung beetle optimization algorithm based on adaptive t-Distribution. In Proceedings of the 2023 IEEE 3rd International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China, 26–28 May 2023; Volume 3, pp. 925–933. [Google Scholar]
  25. Dong, Y.; Zhang, H.; Wang, C.; Zhou, X. Soft actor-critic DRL algorithm for interval optimal dispatch of integrated energy systems with uncertainty in demand response and renewable energy. Eng. Appl. Artif. Intell. 2024, 127, 107230. [Google Scholar] [CrossRef]
  26. Suo, L.; Peng, T.; Song, S.; Zhang, C.; Wang, Y.; Fu, Y.; Nazir, M.S. Wind speed prediction by a swarm intelligence based deep learning model via signal decomposition and parameter optimization using improved chimp optimization algorithm. Energy 2023, 276, 127526. [Google Scholar] [CrossRef]
  27. Li, Y.; Bu, F.; Li, Y.; Long, C. Optimal scheduling of island integrated energy systems considering multi-uncertainties and hydrothermal simultaneous transmission: A deep reinforcement learning approach. Appl. Energy 2023, 333, 120540. [Google Scholar] [CrossRef]
  28. Chen, L.; Wu, J.; Tang, H.; Jin, F.; Wang, Y. A Q-learning based optimization method of energy management for peak load control of residential areas with CCHP systems. Electr. Power Syst. Res. 2023, 214, 108895. [Google Scholar]
  29. Dong, Y.; Wang, C.; Zhang, H.; Zhou, X. A novel multi-objective optimization framework for optimal integrated energy system planning with demand response under multiple uncertainties. Inf. Sci. 2024, 663, 120252. [Google Scholar] [CrossRef]
  30. Wang, J.; Wang, W.; Hu, X.; Qiu, L.; Zang, H.-F. Black-winged kite algorithm: A nature-inspired meta-heuristic for solving benchmark functions and engineering problems. Artif. Intell. Rev. 2024, 57, 98. [Google Scholar] [CrossRef]
  31. Mohammad, J.; Shahriyar, H.G.; Ata, C.; Song, J.; Markides, C.N. Electrolyzer cell-methanation/Sabatier reactors integration for power-to-gas energy storage: Thermo-economic analysis and multi-objective optimization. Appl. Energy 2023, 329, 120268. [Google Scholar]
  32. Hu, J.; Zou, Y.; Zhao, Y. Robust operation of hydrogen-fueled power-to-gas system within feasible operating zone considering carbon-dioxide recycling process. Int. J. Hydrogen Energy 2024, 58, 1429–1442. [Google Scholar] [CrossRef]
  33. Wu, M.; Wu, Z.; Shi, Z. Low carbon economic dispatch of integrated energy systems considering utilization of hydrogen and oxygen energy. Int. J. Electr. Power Energy Syst. 2024, 158, 109923. [Google Scholar] [CrossRef]
  34. Gao, J.; Meng, Q.; Liu, J.; Wang, Z. Thermoelectric optimization of integrated energy system considering wind-photovoltaic uncertainty, two-stage power-to-gas and ladder-type carbon trading. Renew. Energy 2024, 221, 119806. [Google Scholar] [CrossRef]
  35. Liang, J.; Tian, M.; Liu, Y.; Zhou, J. Coverage optimization of soil moisture wireless sensor networks based on adaptive Cauchy variant butterfly optimization algorithm. Sci. Rep. 2022, 12, 11687. [Google Scholar] [CrossRef] [PubMed]
  36. Wen, J.; Wu, X.; Jiang, K.; Cao, B. Particle swarm algorithm based on normal cloud. In Proceedings of the 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–6 June 2008; pp. 1492–1496. [Google Scholar]
  37. Cui, L.; Li, G.; Zhu, Z.; Lin, Q.; Wong, K.-C.; Chen, J.; Lu, N.; Lu, J. Adaptive multiple-elites-guided composite differential evolution algorithm with a shift mechanism. Inf. Sci. 2018, 422, 122–143. [Google Scholar] [CrossRef]
  38. Lin, M.; Wang, Z.; Chen, D.; Zheng, W. Particle swarm-differential evolution algorithm with multiple random mutation. Appl. Soft Comput. 2022, 120, 108640. [Google Scholar] [CrossRef]
  39. Saadaoui, D.; Elyaqouti, M.; Assalaou, K.; Ben Hmamou, D.; Lidaighbi, S. Parameters optimization of solar PV cell/module using genetic algorithm based on non-uniform mutation. Energy Convers. Manag. X 2021, 12, 100129. [Google Scholar] [CrossRef]
  40. Ren, X.Y.; Li, L.L.; Ji, B.X.; Liu, J.-Q. Design and analysis of solar hybrid combined cooling, heating and power system: A bi-level optimization model. Energy 2024, 292, 130362. [Google Scholar] [CrossRef]
  41. Li, Q.; Zeng, X.; Wei, W. Multi-objective particle swarm optimization algorithm using Cauchy mutation and improved crowding distance. Int. J. Intell. Comput. Cybern. 2023, 16, 250–276. [Google Scholar] [CrossRef]
  42. Yu, H.; Li, J.; Chen, X.; Niu, W.; Sang, H.-Y. An improved multi-objective imperialist competitive algorithm for surgical case scheduling problem with switching and preparation times. Clust. Comput. 2022, 25, 3591–3616. [Google Scholar] [CrossRef]
Figure 1. Integrated energy system structure.
Figure 2. Two-stage P2G operation process.
Figure 3. Carbon flow in the CCS+P2G system.
Figure 4. Comparison of the improved BKA with the original BKA algorithm for the 13 mutation strategies.
Figure 5. Specific flow of MOBKA-QL algorithm.
Figure 6. Pareto front distributions of different algorithms on standard test functions.
Figure 7. Overall research content of IES based on MOBKA-QL.
Figure 8. Typical seasonal raw data and renewable energy projections.
Figure 9. Calculation of trend clusters of key parameter factor levels.
Figure 10. 3D Pareto plots of the objective function computed by the four algorithms.
Figure 11. Convergence curves of the optimal solutions of the four algorithms with different objective functions.
Figure 12. Conventional integrated energy system structure.
Figure 13. Pareto plot of the objective function for CIES and IES calculations.
Figure 14. Combined heat and power ratios by time period.
Figure 15. Pareto plots of the computed objective function for different strategies.
Figure 16. Performance metrics under different strategies.
Figure 17. Optimized scheduling results of IES for different loads in a typical season.
Figure 18. Q Learning process curve.
Figure 19. Real-time energy flow diagram for optimal scheduling of 17 h system.
Table 1. Summary of the 13 variant strategy formulas.
| Variation Strategy | Variation Formula |
| --- | --- |
| Gaussian variation | $x \sim N(\mu, \sigma^2),\ m_x = x$ |
| Gaussian elite variation | $r \sim N(\mu, \sigma^2),\ m_x = r \cdot x$ |
| Cauchy variation | $m_x = x_{best} + \dfrac{\sigma}{\pi\left((x-\mu)^2 + \sigma^2\right)} \cdot x_{best}$ |
| Inverse cumulative distribution function | $m_x = \tan\left(\pi\left(r_p - \tfrac{1}{2}\right)\right)$ |
| t-distribution variation | $m_x = x_{best} + x_{best} \cdot \mathrm{trnd}(t)$ |
| Adaptive t-distribution variation | $m_x = x_{best} + x_{best} \cdot \mathrm{trnd}\left(\exp\left((t/T)^2\right)\right)$ |
| Normal cloud variation | $E_n = \exp(t/T),\ r_a = N(x_{best}, E_n),\ m_x = \exp\left(-\dfrac{(r_a - x_{best})^2}{2E_n^2}\right)$ |
| Periodic variation | $m_x = x \cdot 1.5\,\mathrm{rand}(1, dim)$ |
| Elite differential variation 1 | $m_x = x_{best} + \mathrm{rand} \cdot (x_{r1} - x_{r2})$ |
| Random elite differential variation | $m_x = x + \mathrm{rand} \cdot (x_{best} - x) + \mathrm{rand} \cdot (x_{r1} - x_{r2})$ |
| Random difference variation | $m_x = x_{r1} + \mathrm{rand} \cdot (x_{r2} - x_{r3}) + \mathrm{rand} \cdot (x_{r4} - x_{r5})$ |
| Elite differential variation 2 | $m_x = x_{best} + \mathrm{rand} \cdot (x_{r1} - x_{r2}) + \mathrm{rand} \cdot (x_{r3} - x_{r4})$ |
| Heterogeneous variation | $p = 1 - t/T,\ m_x = \begin{cases} x + (ul - x)\left(1 - \mathrm{rand}^{p^b}\right) & \text{if } F = 0 \\ x - (x - dl)\left(1 - \mathrm{rand}^{p^b}\right) & \text{if } F = 1 \end{cases}$ |
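
As a rough illustration of how operators of the kind listed in Table 1 can be coded, the sketch below implements two of them in plain Python: the inverse cumulative distribution function sample and elite differential variation 1. It assumes that rand is a single uniform scalar per call and that r1 and r2 are distinct random indices into the population; the bound-clipping helper and all function names are hypothetical additions for illustration, not part of the paper.

```python
import math
import random


def cauchy_inverse_cdf(rng):
    """Standard Cauchy sample via tan(pi * (r - 1/2)) with r ~ U(0, 1)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def elite_differential_1(x_best, population, rng):
    """m_x = x_best + rand * (x_r1 - x_r2), with r1 != r2 (assumed)."""
    r1, r2 = rng.sample(range(len(population)), 2)
    step = rng.random()  # assumed: one scalar rand shared by all dimensions
    return [b + step * (p1 - p2)
            for b, p1, p2 in zip(x_best, population[r1], population[r2])]


def clip(x, lower, upper):
    """Hypothetical helper: keep a mutated vector inside its box bounds."""
    return [min(u, max(l, xi)) for xi, l, u in zip(x, lower, upper)]


if __name__ == "__main__":
    rng = random.Random(42)  # seeded generator for repeatable runs
    population = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5], [0.3, 0.8]]
    x_best = population[2]
    mutant = clip(elite_differential_1(x_best, population, rng),
                  lower=[0.0, 0.0], upper=[1.0, 1.0])
    print(mutant, cauchy_inverse_cdf(rng))
```

The differential operator perturbs the elite solution along a direction taken from two other population members, so the step size shrinks naturally as the population contracts; the Cauchy sample has heavy tails and occasionally produces very large jumps.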
Table 2. Comparative effect experiment of reward mechanism design.
| Reward Mechanism | Average MOVI Improvement | Convergence Times | Number of Non-Dominated Solutions |
| --- | --- | --- | --- |
| Symmetrical (+1/−1) | 0.083 | 78 | 35 |
| Excessive punishment (+1/−5) | 0.071 | 92 | 28 |
| Asymmetric (+2/−1) | 0.096 | 51 | 40 |
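
The role of the reward mechanism in Table 2 can be sketched with a small tabular Q-learning selector over the mutation strategies. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a single state, an epsilon-greedy policy, and a boolean flag standing in for whether the chosen mutation improved the population. Only the asymmetric +2/−1 reward comes from Table 2; the class name, the learning rate, discount factor, and exploration rate are hypothetical.

```python
import random


class MutationSelector:
    """Single-state tabular Q-learning over mutation strategies (assumed setup)."""

    def __init__(self, n_strategies=13, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.q = [0.0] * n_strategies  # one Q-value per mutation strategy
        self.alpha = alpha             # learning rate (hypothetical value)
        self.gamma = gamma             # discount factor (hypothetical value)
        self.epsilon = epsilon         # exploration rate (hypothetical value)
        self.rng = random.Random(seed)

    def choose(self):
        """Epsilon-greedy choice; ties among the best are broken at random."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        return self.rng.choice([i for i, v in enumerate(self.q) if v == best])

    def update(self, action, improved):
        """Apply the asymmetric reward (+2 on improvement, -1 otherwise)."""
        reward = 2.0 if improved else -1.0
        target = reward + self.gamma * max(self.q)
        self.q[action] += self.alpha * (target - self.q[action])
        return reward


if __name__ == "__main__":
    selector = MutationSelector()
    for _ in range(200):
        a = selector.choose()
        # Stand-in for "did strategy a improve the population?":
        # here strategy 3 is arbitrarily made the most useful one.
        improved = selector.rng.random() < (0.8 if a == 3 else 0.2)
        selector.update(a, improved)
    print(selector.q)
```

With an asymmetric reward, a strategy that succeeds only part of the time can still accumulate a positive Q-value, whereas a heavy penalty such as +1/−5 pushes most Q-values negative quickly and discourages exploration of strategies that help occasionally.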
Table 3. IES system device parameters setting.
| Parameters | Values | Parameters | Values | Parameters | Values |
| --- | --- | --- | --- | --- | --- |
| $\eta_{el}$ | 0.85 | $\eta_{char}^{ths}$ | 0.95 | $\eta_{p2g}$ | 0.56 |
| $\eta_{mr}$ | 0.7 | $\eta_{dis}^{ths}$ | 0.95 | $\eta_{gb}$ | 0.85 |
| $\eta_{hfc,e}$ | 0.785 | $\eta_{loss}^{ths}$ | 0.01 | $\eta_{ac}$ | 0.93 |
| $\eta_{hfc,h}$ | 0.613 | $\eta_{char}^{ost}$ | 0.95 | $\eta_{ec}$ | 0.6 |
| $m_{chp,CO_2}$ | 0.724 t/kWh | $\eta_{dis}^{ost}$ | 0.95 | $E_{hfc}^{max}$ | 800 kWh |
| $m_{chp,SO_2}$ | 0.00328 t/kWh | $\varpi$ | 35 | $E_{el}^{max}$ | 1000 kWh |
| $m_{chp,NO_x}$ | 0.00376 t/kWh | $\omega_1$ | 0.5 | $E_{mr}^{max}$ | 800 kWh |
| $m_{buy,CO_2}$ | 0.55 t/kWh | $\omega_2$ | 0.25 | $E_{buy}^{max}$ | 3000 kWh |
| $\eta_{char}^{ba}$ | 0.95 | $\kappa_{chp}^{max}$ | 2 | $m_{gb,CO_2}$ | 0.392 t/kWh |
| $\eta_{dis}^{ba}$ | 0.95 | $\kappa_{chp}^{min}$ | 0.5 | $m_{gb,SO_2}$ | 0.0016 t/kWh |
| $\eta_{loss}^{ba}$ | 0.1 | $\rho_{H_2}$ | 4 m³ | $m_{gb,NO_x}$ | 0.00197 t/kWh |
| $\beta$ | 0.33 kWh/h | $L_{CH_4}$ | 11 kWh | $c_{ec}$ | 0.01 ¥/kWh |
| $\rho_{CO_2}$ | 1 m³ | $E_{ccs}^{max}$ | 500 kW | $c_{el}$ | 0.096 ¥/kWh |
| $E_{wt}^{max}$ | 3000 kWh | $Q_{gb}^{max}$ | 1500 kW | $c_{mr}$ | 0.122 ¥/kWh |
| $Q_{st}^{max}$ | 2000 kWh | $c_{chp}$ | 0.13 ¥/kWh | $c_{ost}$ | 0.065 ¥/kWh |
| $E_{ec}^{max}$ | 800 kWh | $c_{hfc}$ | 0.0835 ¥/kWh | $c_{ac}$ | 0.024 ¥/kWh |
| $Q_{ac}^{max}$ | 800 kWh | $c_{gb}$ | 0.028 ¥/kWh | $c_{tes}$ | 0.01 ¥/kWh |
| $W_{chp}^{max}$ | 4000 kWh | $m_{gb,SO_2}$ | 0.0012 t/kWh | $c_{ba}$ | 0.01 ¥/kWh |
Table 4. Levels of key parameters.
| Parameter | Level 1 | Level 2 | Level 3 |
| --- | --- | --- | --- |
| pop | 100 | 200 | 300 |
| T | 100 | 200 | 300 |
| p | 0.8 | 0.85 | 0.9 |
| AC | 50 | 65 | 80 |
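Table 5. Algorithm parameter combinations.
| Number | Factor: pop | Factor: T | Factor: p | Factor: AC | RV |
| --- | --- | --- | --- | --- | --- |
| 1 | 1 | 1 | 1 | 1 | 0.4677735 |
| 2 | 1 | 2 | 2 | 2 | 0.5148920 |
| 3 | 1 | 3 | 3 | 3 | 0.4838482 |
| 4 | 2 | 1 | 2 | 3 | 0.4016394 |
| 5 | 2 | 2 | 3 | 1 | 0.7434524 |
| 6 | 2 | 3 | 1 | 2 | 0.5579683 |
| 7 | 3 | 1 | 3 | 2 | 0.6305002 |
| 8 | 3 | 2 | 1 | 3 | 0.6367929 |
| 9 | 3 | 3 | 2 | 1 | 0.6099820 |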
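
The nine combinations in Table 5 form a three-level, four-factor orthogonal array, so each level of each factor appears in exactly three runs. A conventional way to read such a design is to average the response variable (RV) per factor level and compare the spread of those averages. The sketch below does this with the Table 5 values; it is an illustrative calculation assuming this standard range analysis, and the function names are hypothetical. It does not assert which RV direction is preferable, since that is not stated in this section.

```python
# Rows of Table 5: (pop level, T level, p level, AC level, RV).
TABLE5 = [
    (1, 1, 1, 1, 0.4677735),
    (1, 2, 2, 2, 0.5148920),
    (1, 3, 3, 3, 0.4838482),
    (2, 1, 2, 3, 0.4016394),
    (2, 2, 3, 1, 0.7434524),
    (2, 3, 1, 2, 0.5579683),
    (3, 1, 3, 2, 0.6305002),
    (3, 2, 1, 3, 0.6367929),
    (3, 3, 2, 1, 0.6099820),
]
FACTORS = ("pop", "T", "p", "AC")


def level_means(rows):
    """Mean RV for every (factor, level) pair of the orthogonal array."""
    means = {}
    for j, name in enumerate(FACTORS):
        means[name] = {}
        for level in (1, 2, 3):
            values = [row[4] for row in rows if row[j] == level]
            means[name][level] = sum(values) / len(values)
    return means


def level_ranges(means):
    """Spread (max minus min) of the level means for each factor."""
    return {name: max(levels.values()) - min(levels.values())
            for name, levels in means.items()}


if __name__ == "__main__":
    means = level_means(TABLE5)
    for name, spread in level_ranges(means).items():
        print(name, means[name], round(spread, 4))
```

A factor with a larger spread between its level means has a stronger influence on the response, which is the usual basis for ranking parameters in this kind of experiment.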
Table 6. Calculation of multi-objective evaluation indicators for the four algorithms.
| Algorithm | Objective | Confidence Interval (lower) | Confidence Interval (upper) | Sample Mean | HV | Time | Spacing |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MOBKA-QL | Economic cost | 36,920 | 39,044 | 37,982 | 0.0388 | 244.2983 | 17,202.8877 |
| | Carbon emission | 17,115 | 18,316 | 17,716 | | | |
| | Energy efficiency | 0.8121 | 0.82943 | 0.82076 | | | |
| MODBO | Economic cost | 39,006 | 40,591 | 39,798 | 0.0240 | 476.1714 | 13,204.9437 |
| | Carbon emission | 18,760 | 19,583 | 19,171 | | | |
| | Energy efficiency | 0.8062 | 0.8242 | 0.8152 | | | |
| MOSSA | Economic cost | 42,122 | 42,644 | 42,383 | 0.0054 | 239.0350 | 1434002.8724 |
| | Carbon emission | 20,409 | 20,737 | 20,573 | | | |
| | Energy efficiency | 0.7982 | 0.8092 | 0.8037 | | | |
| BKA | Economic cost | 41,076 | 42,492 | 41,784 | 0.0270 | 238.6687 | 9745.7401 |
| | Carbon emission | 20,304 | 21,076 | 20,690 | | | |
| | Energy efficiency | 0.7928 | 0.8228 | 0.8078 | | | |
Table 7. Performance metrics for the IES system.
| ATCSR | EUE | CDERR |
| --- | --- | --- |
| 14.63% | 28.84% | 13.90% |

Share and Cite

MDPI and ACS Style

Shi, R.; Yan, X.; Fan, Z.; Tu, N. Multi-Objective Scheduling Method for Integrated Energy System Containing CCS+P2G System Using Q-Learning Adaptive Mutation Black-Winged Kite Algorithm. Sustainability 2025, 17, 5709. https://doi.org/10.3390/su17135709
