Article

Virtual Power Plant Optimization Process Under the Electricity–Carbon–Certificate Multi-Market: A Case Study in Southern China

1
China Energy Engineering Group, Guangdong Electric Power Design Institute Co., Ltd., Guangzhou 510663, China
2
Institute of Energy, Environment and Economy, Tsinghua University, Beijing 100084, China
3
Management Science Research Institute of Guangdong Power Grid Corporation, Guangzhou 510308, China
*
Author to whom correspondence should be addressed.
Processes 2025, 13(7), 2148; https://doi.org/10.3390/pr13072148
Submission received: 2 June 2025 / Revised: 29 June 2025 / Accepted: 2 July 2025 / Published: 6 July 2025

Abstract

Over the past decade, China has vigorously supported the development of renewable energy and has established a preliminary electricity–carbon–certificate multi-market. Virtual power plants (VPPs) are a typical market-oriented demand-side management model, so studying their optimization process and practical cases under the multi-market is important for raising the operation level of VPPs and for sharing the corresponding experience. Based on the mechanisms and impacts of the electricity–carbon–certificate multi-market, this manuscript takes a VPP project in southern China as a case, constructs a sequential decision-making optimization model for the VPP under a diversified market, and solves it using reinforcement learning and Markov decision theory. The case analysis shows that, although income from certificate trading and carbon trading is relatively limited compared with energy supply income, participating in the electricity–carbon–certificate multi-market can significantly enhance VPPs’ willingness to accommodate the uncertainties of renewable energy and can significantly improve the economic and environmental performance of VPPs, which is of great significance for improving the energy structure and accelerating the low-carbon energy transformation.

1. Introduction

China’s power industry accounts for nearly 40% of the country’s total CO2 emissions [1,2]. To address the challenges posed by climate change, China has embarked on a series of explorations aimed at transforming its energy supply structure. However, relying solely on the large-scale construction of centralized renewable energy bases on the supply side is insufficient to fully meet the requirements for carbon neutrality. It is also necessary to fully tap into the potential of demand-side resources [3].
As a demand-side management model capable of transcending geographical boundaries to widely aggregate distributed devices and multi-energy loads, the virtual power plant (VPP) promotes the safe, large-scale, and efficient utilization of renewable energy by effectively coordinating the aggregated resources. This increases the share of renewable energy in the power consumption structure. Research on the optimization methods and processes of VPPs under the electricity–carbon–certificate multi-market is therefore of great significance for enhancing the comprehensive benefits of VPPs and supporting the green and low-carbon transformation of the energy sector [4,5].
From the perspective of research scenarios, current scholarly research on VPPs mainly falls into two categories: the coordinated scheduling of VPP internal resources and strategy optimization for VPPs’ participation in the external market [6,7]. Ding et al. [8], Liao et al. [9], and Elgamal et al. [10] have studied the optimization methods for internal resource scheduling within the VPP, aiming to promote the absorption level of renewable energy and enhance economic benefits. Based on the forecasting of renewable power output, they considered the technical characteristics of combined heat and power equipment, carbon capture equipment, power-to-gas (P2G) facilities, energy storage, and other typical equipment operated by the VPP. Pandzic et al. [11] and Zhou et al. [12] further incorporated demand response load resources into the optimization scope, and the results indicated that demand response can effectively improve the load curve, but its contribution to enhancing environmental benefits is relatively limited. Ju et al. [13] and Ge et al. [14] further incorporated typical distribution network-side constraints such as renewable power uncertainty, peer-to-peer energy trading, and carbon emission characteristics into the VPP internal resource scheduling problems. Wu et al. [15], Song et al. [16], and Aldegheishem et al. [17] conducted research on optimizing market bidding strategies and the output of distributed energy resources (DERs), energy storage, and electric vehicles by robust optimization and two-stage optimization, providing corresponding references for VPPs to enhance their economic performance when participating in the electricity market. 
Furthermore, in light of the construction and reform demands for multi-markets in China, as exemplified by the electricity market and carbon market, scholars have conducted a series of in-depth studies on multi-market mechanisms, including coupling mechanisms [18,19], price transmission [20], and joint clearing methods [21,22]. Correspondingly, Sun et al. [23] and Wu et al. [24] conducted research on the scenario where VPPs participate in both the electricity market and carbon market, and based on designing carbon pricing mechanisms and analyzing the impact of carbon trading, they also proposed corresponding bidding strategies and scheduling strategies.
From the perspective of research methods, considering the accumulated research, the modeling and solving of VPP optimization problems can be categorized into two approaches: holistic modeling and solving and hierarchical modeling and solving. Under the holistic modeling and solving approach, scholars typically adopt game theory and multi-stage coordinated optimization to construct nonlinear mixed-integer VPP optimization models. They further employ linearization techniques based on McCormick envelopes and intelligent algorithms to solve these problems [25,26,27,28]. Considering the difficulty and limitations of solving the holistic modeling problem, Liu et al. [29], Li et al. [30], and Mao et al. [31] decomposed the optimization problem into upper-level problems and lower-level subproblems based on theoretical methods such as Lagrangian dual relaxation, column-and-constraint generation (C&CG), and multi-agent architecture. By solving these problems hierarchically, they derived optimization strategies for the VPP, effectively reducing the difficulty of solution and improving the computational efficiency. Specifically, Xie et al. [32], Aghdam et al. [33], Shen et al. [34], and Liu et al. [35] incorporated uncertainty factors that may be involved in the VPP optimization process into their research scope by utilizing methods such as Information Gap Decision Theory (IGDT), stochastic optimization, and chance constraints, thereby further enhancing the robustness of the solution results.
Based on the above research, the contributions and original work of this paper are primarily manifested in two aspects.
From the perspective of research scenarios, this paper analyzes the implementation and interaction mechanisms of China’s diverse “carbon neutrality”-themed policies, including renewable energy development and carbon quota systems. It investigates how these mechanisms drive and guide the multi-market. Based on these mechanisms, an optimized VPP model is constructed within the electricity–carbon–certificate multi-market. This approach addresses the limitations of the existing literature in research scenario construction, which often deviates significantly from real-world energy policy mechanisms, neglects the guiding role of policy mechanisms on the market, and overlooks the influence of the renewable portfolio standard (RPS) and carbon allowance (CA) mechanisms on VPPs’ willingness to absorb renewable energy and participate in multi-market transactions.
From the perspective of research methodology, this paper constructs a Markov Decision Process (MDP) model for the VPP optimization process under the multi-market, utilizing the principles of reinforcement learning. The model incorporates factors such as the uncertainty of renewable energy and flexible loads. By optimizing VPP decisions sequentially, in chronological order, throughout the dispatch period, it simulates the interaction between VPP decision-making behaviors and the decision-making environment, as well as the impact of prior decisions on subsequent problem-solving environments and decision-making processes. This study overcomes the limitations of existing research approaches, which typically employ either a “holistic modeling and solving” or a “hierarchical modeling and solving” strategy to derive VPP strategies for each time instant through a single one-time solution. These conventional approaches often overlook the influence of prior VPP strategies, RPS compliance progress, and other factors on the problem-solving process in subsequent time periods. In contrast, our research offers a valuable reference for VPP strategy formulation, behavior simulation, and policy effectiveness evaluation.
The rest of this paper is organized as follows: Section 2 analyzes the electricity–carbon–certificate multi-market mechanism and its related impacts and proposes an optimization framework for VPPs under the multi-market. Section 3 constructs the VPP optimization model and proposes a solution method based on the concept of reinforcement learning. In Section 4, a case study and discussion are carried out on a VPP located in southern China. Finally, Section 5 draws the relevant conclusions.

2. VPP Optimization Framework Under the Multi-Market

2.1. Electricity–Carbon–Certificate Multi-Market

Over the past decade, guided by initiatives such as the Clean Development Mechanism (CDM) and energy security strategies, the Chinese government has made a series of efforts centered around the construction of the electricity–carbon–certificate multi-market. Initially, two primary mechanisms have been formed: the “CA + carbon trading” mechanism, which is overseen by the Ministry of Ecology and Environment (MEE), and the “RPS + green electricity + green electricity certificates (GECs)” mechanism, which is managed by the National Development and Reform Commission (NDRC). Below are the typical mechanisms and their impacts within this multi-market framework.
Firstly, renewable energy electricity (green electricity) trading can be conducted under two basic models: “certificate–electricity integration” and “certificate–electricity separation.” Moreover, GECs can serve as an important means for entities to meet their RPS targets. These mechanisms clarify the relationships among GECs, green electricity, and green environmental value, thereby addressing the potential risk of double-counting green environmental value.
Secondly, the electricity generated by distributed renewable energy units will also be eligible for the GEC issuance. This provides an important policy basis for tapping into the potential of distributed renewable energy to participate in GEC trading and meet RPS targets.
Thirdly, at this stage, carbon allowances in the carbon trading market will be distributed free of charge, and the consumption of renewable energy will not be covered under carbon allowance management. Carbon trading has emerged as a crucial approach for correcting the externalities associated with the production and operation activities of relevant entities.
Fourthly, the integration of the electricity and carbon markets has emerged as an important development trend, and the multi-market mechanisms and policies explicitly state that the consumption of renewable energy is not included in the scope of carbon emission control. This means that distributed renewable energy production, direct consumption of renewable energy, and GEC trading will directly or indirectly influence the settlement process of RPS and carbon allowances, thereby strengthening the interconnections and mutual influences among the electricity, GEC, and carbon markets.
Lastly, and most importantly for the VPP optimization process, under the constraints of the multi-market and the relevant compliance policies, VPPs have the dual attributes of both supplier and consumer. That is, while operating various types of generation equipment, VPPs also aggregate demand-side loads represented by electricity and heat. Therefore, according to relevant policy requirements, VPPs are among the entities subject to the constraints of RPS and CA. This means that while VPPs need to control their carbon emissions, they also need to absorb a certain proportion of renewable energy when providing energy to demand-side loads. In this scenario, VPPs act as market participants and become one of the hubs connecting the multi-market, which includes electricity, carbon, and certificates. This implies that under the electricity–carbon–certificate multi-market, VPPs can maximize their overall benefits by flexibly adopting diversified strategies such as “certificate–electricity integration” trading, “certificate–electricity separation” trading, and renewable energy unit dispatch. This also provides an important basis and scenario for this research.

2.2. Optimization Assumptions and Framework

Based on the analysis of the multi-market mechanisms in Section 2.1, the VPP is required not only to meet the energy consumption demands but also to comply with the obligations for RPS and CA as verified by relevant authorities. Taking a VPP project situated in southern China as a case study, we establish the research assumptions and the framework for the VPP optimization process, as illustrated in Figure 1.
(a) The VPP aggregates several types of resources, including photovoltaic units (PVs), gas combined heat and power units (CHP), and electric and heat loads. The electric load resources consist of rigid loads and elastic loads. Among these, the elastic loads are predominantly temperature-controlled loads represented by air conditioning, and the VPP can acquire a certain range of load regulation capabilities by adjusting the retail electricity price.
(b) In this case, the electricity consumption demand is met by PVs and CHP. In situations where there is a deficiency or surplus in electricity supply, the VPP can apply to purchase or sell electricity from/to the grid through the balancing market. However, the power exchange between the VPP and the grid is constrained by interconnection bus capacity. The thermal energy consumption demand is supplied by the CHP units, and the natural gas required for the CHP is purchased in the upstream wholesale market. In addition, since CHP is the primary equipment for thermal energy production, to ensure that heat load energy consumption needs are met, it is further stipulated that the VPP will operate the CHP in a “power determined by heat” mode.
(c) Both the VPP and the aggregated elastic loads are boundedly rational entities. Among these, the VPP pursues the maximization of overall benefits within the dispatch day, whereas the elastic loads pursue the optimization of the comprehensive benefits in terms of economy and comfort.
(d) China’s RPS, CA, carbon trading, and other related mechanisms are still in their initial stages. Therefore, based on international experience and existing research, the following assumptions are proposed accordingly.
  • Regarding the RPS, firstly, as the responsible entity, the VPP shall be subject to certain economic penalties if it fails to meet the RPS target, with the unit penalty equivalent to one GEC. Secondly, the annual RPS target can be disaggregated into daily targets based on the RPS weights approved by the government and the daily load, and assessments can be carried out accordingly.
  • In terms of CA, although the primary emission sources during VPP operation are CHP and electricity purchased from the grid, the root cause of the VPP’s carbon emissions, when viewed from the essence of economic activity, lies in its responsibility to supply electricity and heat to the aggregated loads it serves. Drawing on Section 2.1, carbon trading is recognized as a significant means to address the externalities associated with the activities of entities. In this paper, it is stipulated that the CA for VPPs is calculated based on their load levels. Additionally, given the differences in settlement cycles across the multi-market, carbon trading settlement and assessment are conducted on a daily basis, with the carbon market adopting a stepped pricing mechanism. The difference between actual emissions and carbon allowances can be sold or purchased in the carbon market. Finally, although some scholars have conducted relevant research on the mutual recognition, quota offsetting, or deduction in transactions involving products such as GECs, CA, and green electricity [36,37], under China’s current policy mechanisms, the trading of green electricity and GECs is managed by the NDRC, while CA and carbon trading are managed by the MEE. The outcomes of GEC trading do not impact the issuance, trading, or compliance of carbon quotas. Therefore, this paper assumes that CA will not be influenced by transactions involving GECs.
  • Concerning GECs, according to the multi-market mechanism, GECs can be traded through two modes: “certificate–electricity integration” and “certificate–electricity separation.” This implies that when the consumption of renewable energy by the VPP exceeds the RPS target, it can adopt the “certificate–electricity separation” mode to separate GECs from the renewable energy and generate profits by selling them; conversely, if the consumption falls short of the RPS target, the VPP needs to purchase GECs to meet the target or face penalties.

3. Modeling and Solution

3.1. VPP Modeling

3.1.1. Objective Function

Based on the assumptions outlined in Section 2, the VPP can pursue the maximization of comprehensive benefits over the scheduling period by optimizing the actions at each time step, including dispatching generation units (PV or CHP), regulating elastic loads, and engaging in transactions involving carbon, certificates, and electricity. Indeed, it is evident that optimizing the VPP’s strategies within a single scheduling day under the multi-market presents a typical sequential decision-making optimization problem. Based on this observation, the following model is constructed:
$$\max R = I^{\mathrm{sale}} - C^{\mathrm{CHP}} - C^{\mathrm{ele}} - C^{\mathrm{PV}} + C^{\mathrm{GEC}} + C^{\mathrm{CO_2}} \tag{1}$$
Here, $I^{\mathrm{sale}}$ represents the energy supply income. $C^{\mathrm{CHP}}$ represents the CHP cost. $C^{\mathrm{ele}}$ represents the cost/income of purchasing/selling electricity from/to the balancing market. $C^{\mathrm{PV}}$ represents the PV LCOE. $C^{\mathrm{GEC}}$ and $C^{\mathrm{CO_2}}$ represent the GEC trading item and carbon trading item, respectively, with positive values indicating income and negative values indicating cost.
The incomes and costs in (1) can be further derived as follows:
  • Energy supply income
$$I^{\mathrm{sale}} = \sum_{t=1}^{T} \left( p_t^{\mathrm{sale},e} L_t^{e} + p_t^{\mathrm{sale},h} L_t^{h} \right) \tag{2}$$
Here, $p_t^{\mathrm{sale},e}$ and $p_t^{\mathrm{sale},h}$ represent the retail electricity price and retail heat price of the VPP at time $t$. $L_t^{e}$ and $L_t^{h}$ represent the electric load and heat load aggregated by the VPP.
Considering that the VPP can guide elastic loads to alter their electricity consumption characteristics by adjusting retail electricity price, this process can be depicted as follows:
$$L_t^{e} = \sum_{j=1}^{J} \left( P_{t,j}^{\mathrm{base}} + P_{t,j}^{\mathrm{DR}} \right), \qquad p_t^{\mathrm{sale},e} = \theta_t^{\mathrm{DR}} \, p_t^{e} \tag{3}$$
where $P_{t,j}^{\mathrm{base}}$ and $P_{t,j}^{\mathrm{DR}}$ represent the power of load $j$ before participating in the VPP regulation/demand response (DR) and its responsive power, respectively. For rigid loads, $P_{t,j}^{\mathrm{DR}} = 0$. $\theta_t^{\mathrm{DR}}$ represents the discount coefficient for the retail electricity price under the VPP price-based DR. When $\theta_t^{\mathrm{DR}}$ is greater than 0 and less than 1, the retail electricity price is reduced to encourage an increase in load demand, at which point $P_{t,j}^{\mathrm{DR}} \ge 0$. Conversely, when $\theta_t^{\mathrm{DR}}$ is greater than 1, the retail electricity price is increased to prompt a decrease in load demand, and $P_{t,j}^{\mathrm{DR}} \le 0$. $p_t^{e}$ represents the retail electricity price before the VPP DR.
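As a small illustration of how the discount coefficient feeds into the energy supply income, the following Python sketch evaluates the aggregated load and the income for given inputs. The function name, the list-based argument layout, and any numbers passed to it are illustrative assumptions and not part of the original model implementation.

```python
def energy_supply_income(base_load, dr_power, theta, p_e, p_h, heat_load):
    """Energy supply income over T steps (hypothetical helper).

    base_load, dr_power: length-T lists, each holding the per-load powers
        (baseline power and demand-response adjustment) for J loads.
    theta: retail price discount coefficient per step.
    p_e: retail electricity price before DR per step.
    p_h, heat_load: retail heat price and aggregated heat load per step.
    """
    income = 0.0
    for t in range(len(theta)):
        # Aggregated electric load: baseline plus DR adjustment of every load.
        load_e = sum(b + d for b, d in zip(base_load[t], dr_power[t]))
        # Retail electricity price after applying the discount coefficient.
        price_e = theta[t] * p_e[t]
        income += price_e * load_e + p_h[t] * heat_load[t]
    return income
```

A discount coefficient below 1 paired with a non-negative DR adjustment, and a coefficient above 1 paired with a non-positive adjustment, reproduces the two cases described in the text.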
  • CHP operational cost
$$C^{\mathrm{CHP}} = \sum_{t=1}^{T} p_t^{\mathrm{gas}} V_t^{\mathrm{gas}}, \qquad V_t^{\mathrm{gas}} = \frac{P_t^{\mathrm{CHP}}}{\eta^{e} q} \tag{4}$$
Here, $p_t^{\mathrm{gas}}$ represents the natural gas price in the wholesale market. $V_t^{\mathrm{gas}}$ represents the natural gas input volume of the CHP. $P_t^{\mathrm{CHP}}$ and $H_t^{\mathrm{CHP}}$ represent the electric output and heat output of the CHP at time $t$, respectively. $\eta^{e}$ represents the efficiency coefficient. $q$ represents the lower heating value (LHV) of natural gas.
  • Balancing market cost
$$C^{\mathrm{ele}} = \sum_{t=1}^{T} p_t^{\mathrm{grid}} P_t^{\mathrm{grid}} \tag{5}$$
Here, $p_t^{\mathrm{grid}}$ represents the real-time price in the balancing market. $P_t^{\mathrm{grid}}$ represents the electricity that the VPP declares to purchase/sell in the balancing market at time $t$.
  • PV cost
$$C^{\mathrm{PV}} = \sum_{t=1}^{T} c^{\mathrm{PV}} P_t^{\mathrm{PV}} \tag{6}$$
Here, $c^{\mathrm{PV}}$ represents the PV LCOE cost coefficient. $P_t^{\mathrm{PV}}$ represents the PV output at time $t$.
  • GEC trading cost
$$C^{\mathrm{GEC}} = \begin{cases} Q^{\mathrm{RPS}} \, p^{\mathrm{GEC}}, & Q^{\mathrm{RPS}} \ge 0 \\ Q^{\mathrm{RPS}} \, p^{\mathrm{penalty}}, & Q^{\mathrm{RPS}} < 0 \end{cases} \tag{7}$$
$$Q^{\mathrm{RPS}} = \sum_{t=1}^{T} P_t^{\mathrm{PV}} - \omega \sum_{t=1}^{T} L_t^{e} \tag{8}$$
Here, $Q^{\mathrm{RPS}}$ represents the excess amount of RPS, that is, the renewable energy consumed beyond the RPS target. When $Q^{\mathrm{RPS}}$ is greater than 0, the corresponding RPS target has been achieved, and under “certificate–electricity separation”, GECs can be stripped from the renewable energy and sold to obtain income. When $Q^{\mathrm{RPS}}$ is less than 0, the VPP has failed to meet the RPS target and will face corresponding penalties. $p^{\mathrm{GEC}}$ represents the GEC price in the certificate market. $p^{\mathrm{penalty}}$ represents the penalty for failing to meet the RPS target. $\omega$ represents the RPS weighting.
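The settlement rule can be sketched in a few lines of Python. This is a minimal illustration that assumes the RPS excess is measured as PV consumption minus the RPS-weighted electric load, which matches the description of a positive excess meaning the target has been met. The function and parameter names are hypothetical.

```python
def gec_settlement(pv_output, electric_load, rps_weight, gec_price, penalty_price):
    """Return (RPS excess, GEC trading item) for one settlement period.

    Assumption: excess = total PV consumption - rps_weight * total load.
    A positive item is income from selling GECs; a negative item is a penalty.
    """
    q_rps = sum(pv_output) - rps_weight * sum(electric_load)
    # Surplus is valued at the GEC price, a shortfall at the penalty price.
    unit_price = gec_price if q_rps >= 0 else penalty_price
    return q_rps, q_rps * unit_price
```

Because the shortfall case multiplies a negative excess by the penalty price, the returned item is negative and enters the objective as a cost.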
  • Carbon trading cost
The excess carbon emissions are calculated as the difference between the actual carbon emissions and the government-approved carbon allowances. As shown in Figure 2, under the stepped carbon trading mechanism, when excess carbon emission is positive, the VPP is required to purchase carbon allowances from the carbon market. Conversely, when it is negative, the VPP can sell its surplus carbon allowances in the carbon market to generate income [38].
C C O 2 = p C O 2 d ( 2 + α ) + p C O 2 ( 1 + 2 α ) Q C O 2 + 2 d         2 d < Q C O 2 d p C O 2 d + p C O 2 ( 1 + α ) Q C O 2 + d                                                 2 d < Q C O 2 d p C O 2 Q C O 2                                                                                                                             d < Q C O 2 0 p C O 2 Q C O 2                                                                                                                                   0 < Q C O 2 d p C O 2 d p C O 2 ( 1 + β ) ( Q C O 2 d ) d < Q C O 2 2 d p C O 2 d ( 2 + β ) p C O 2 ( 1 + 2 β ) ( Q C O 2 2 d ) 2 d < Q C O 2 3 d
Here, Q C O 2 represents the excess carbon emissions of the VPP. p C O 2 represents the benchmark price for carbon trading. d represents the step size of carbon emission intervals in the stepped carbon trading mechanism, indicating the proportion of carbon emissions exceeded or reduced relative to the government-approved carbon allowances. α and β represents the coefficient for the increase/decrease of carbon price.
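To make the stepped pricing concrete, the following Python sketch evaluates the carbon trading item over six intervals of width d. It assumes the sign convention stated for the objective function (positive values are income, negative values are cost) and assumes the intervals span from -3d to 3d. The function name and the error raised outside that range are illustrative choices, not taken from the source.

```python
def stepped_carbon_item(q, price, d, alpha, beta):
    """Carbon trading item under a stepped price (hypothetical helper).

    q: excess emissions (positive = over allowance, negative = surplus).
    price: benchmark carbon price; d: interval width.
    alpha, beta: price growth coefficients on the surplus and excess sides.
    Returns income (> 0) when allowances are sold, cost (< 0) when bought.
    """
    if not -3 * d < q <= 3 * d:
        raise ValueError("q lies outside the six intervals modeled here")
    if q <= -2 * d:
        return price * d * (2 + alpha) + price * (1 + 2 * alpha) * abs(q + 2 * d)
    if q <= -d:
        return price * d + price * (1 + alpha) * abs(q + d)
    if q <= d:
        # Both inner intervals trade at the benchmark price.
        return -price * q
    if q <= 2 * d:
        return -price * d - price * (1 + beta) * (q - d)
    return -price * d * (2 + beta) - price * (1 + 2 * beta) * (q - 2 * d)
```

Under these assumptions the item is continuous at every interval boundary, so a marginal tonne is only ever priced at the rate of the interval it falls in.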
Furthermore, according to the current multi-market mechanism and assumptions, the carbon emissions of the VPP primarily stem from meeting the energy consumption demands of the aggregated loads. Consequently, the excess carbon emissions can be expressed as follows:
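$$\begin{aligned} Q^{\mathrm{CO_2}} &= Q^{\mathrm{CO_2,VPP}} - Q^{\mathrm{CO_2,CA}} \\ Q^{\mathrm{CO_2,VPP}} &= \sum_{t=1}^{T} \left( Q_t^{\mathrm{CO_2,CHP}} + \mu^{\mathrm{grid}} P_t^{\mathrm{grid}} \right) \\ Q_t^{\mathrm{CO_2,CHP}} &= \mu^{\mathrm{CHP},a} + \mu^{\mathrm{CHP},b} P_t^{\mathrm{CHP}} + \mu^{\mathrm{CHP},c} \left( P_t^{\mathrm{CHP}} \right)^2 \\ Q^{\mathrm{CO_2,CA}} &= \sum_{t=1}^{T} L_t^{e} \, \omega^{\mathrm{CO_2}} \omega^{\mathrm{free}} \end{aligned} \tag{10}$$
where $Q^{\mathrm{CO_2,VPP}}$ and $Q^{\mathrm{CO_2,CA}}$ represent the actual carbon emissions and the government-approved carbon allowances of the VPP. $\mu^{\mathrm{grid}}$ represents the carbon emission factor of grid electricity issued by the government. $\mu^{\mathrm{CHP},a}$, $\mu^{\mathrm{CHP},b}$, and $\mu^{\mathrm{CHP},c}$ are the carbon emission coefficients of the CHP. $\omega^{\mathrm{CO_2}}$ is the CA weight for the VPP. $\omega^{\mathrm{free}}$ represents the free portion of the government-approved carbon allowances for the VPP.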
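The emission accounting above can be sketched as follows. The function name and the tuple layout of the CHP coefficients are assumptions made for illustration; the grid term is applied to the declared exchange exactly as written in the formula.

```python
def excess_emissions(p_chp, p_grid, load_e, mu_grid, mu_chp, w_co2, w_free):
    """Excess emissions = actual VPP emissions - load-based allowance.

    p_chp, p_grid, load_e: per-step CHP output, grid exchange, electric load.
    mu_grid: grid emission factor; mu_chp: (a, b, c) quadratic coefficients.
    w_co2, w_free: CA weight and free share of the allowance.
    """
    a, b, c = mu_chp
    # Quadratic emission characteristic of the CHP at each time step.
    chp_emissions = sum(a + b * p + c * p * p for p in p_chp)
    grid_emissions = sum(mu_grid * g for g in p_grid)
    allowance = sum(load_e) * w_co2 * w_free
    return chp_emissions + grid_emissions - allowance
```

A negative result means the VPP holds surplus allowances that it could sell in the carbon market.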

3.1.2. Constraints

  • Energy supply and demand balance
Based on the assumptions in Section 2, the VPP is required to meet the electricity and heat energy consumption demands of the aggregated loads, so the VPP must adhere to the energy balance constraints as follows:
$$L_t^{e} = P_t^{\mathrm{PV}} + P_t^{\mathrm{CHP}} + P_t^{\mathrm{grid}} \tag{11}$$
$$L_t^{h} = H_t^{\mathrm{CHP}} \tag{12}$$
  • Interconnection bus/pipeline capacity
The flow of electricity/natural gas transmitted between the VPP and the grid/natural gas network must be less than the maximum bus/pipeline capacity, with the constraint expressed as follows:
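$$0 \le \left| P_t^{\mathrm{grid}} \right| \le P_{\max}^{\mathrm{grid}}, \qquad 0 \le V_t^{\mathrm{gas}} \le V_{\max}^{\mathrm{gas}} \tag{13}$$
where $P_{\max}^{\mathrm{grid}}$ and $V_{\max}^{\mathrm{gas}}$ represent the maximum transmission capacity of the interconnection bus and pipeline.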
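The balance and capacity constraints combine naturally: once the PV and CHP outputs are fixed, the grid exchange follows from the electricity balance and can then be checked against the bus capacity. The sketch below assumes a signed exchange (positive for purchases, negative for sales) limited in magnitude by the bus capacity; the function name is hypothetical.

```python
def grid_exchange(load_e, pv_output, chp_output, p_grid_max):
    """Grid exchange implied by the electricity balance at one time step.

    Returns (p_grid, feasible). p_grid > 0 is a purchase from the grid,
    p_grid < 0 is a sale; feasible is False when the bus capacity is exceeded.
    """
    p_grid = load_e - pv_output - chp_output
    feasible = abs(p_grid) <= p_grid_max
    return p_grid, feasible
```

An infeasible result signals that the chosen PV and CHP outputs must be adjusted, for example by curtailing PV or regulating elastic loads.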
  • PV output
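$$0 \le P_t^{\mathrm{PV}} \le P_{t,\max}^{\mathrm{PV}}, \qquad P_{t,\max}^{\mathrm{PV}} = \tilde{P}_t^{\mathrm{PV}} + \Delta P_t^{\mathrm{PV}} \tag{14}$$
$$f^{\mathrm{Gaussian}}\left( \Delta P_t^{\mathrm{PV}} \right) = \sum_{k \in K} \omega_k \, f_k^{\mathrm{Gaussian}}\left( \Delta P_t^{\mathrm{PV}} \mid \mu_k, \sigma_k^2 \right), \qquad 0 < \omega_k < 1, \qquad \sum_{k \in K} \omega_k = 1 \tag{15}$$
Here, $\tilde{P}_t^{\mathrm{PV}}$ represents the predicted PV output at time $t$. $\Delta P_t^{\mathrm{PV}}$ represents the prediction error of PV output, and the error follows a mixed Gaussian distribution $f^{\mathrm{Gaussian}}(\cdot)$. $\omega_k$ is the weight of the $k$-th Gaussian model. $\mu_k$ and $\sigma_k^2$ represent the mean and variance of the $k$-th Gaussian model.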
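One way to draw a PV output limit from this error model is to pick a mixture component according to its weight and then sample a Gaussian error from that component. The Python sketch below does this with the standard library only. Clipping the result at zero is an added assumption to keep the limit physically meaningful, and all names are hypothetical.

```python
import random


def sample_pv_limit(forecast, weights, means, stds, rng):
    """Sample the maximum available PV output at one time step.

    forecast: predicted PV output.
    weights, means, stds: parameters of the Gaussian mixture for the error.
    rng: a random.Random instance, passed in so results are reproducible.
    """
    # Choose a mixture component with probability equal to its weight.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Draw the forecast error from the chosen Gaussian component.
    error = rng.gauss(means[k], stds[k])
    # Assumption: available output cannot be negative.
    return max(0.0, forecast + error)
```

Calling the function repeatedly with the same `rng` yields scenarios for the available PV output that can serve as the state in a simulation of the dispatch day.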
  • CHP output
According to the assumptions, the CHP operates in the “power determined by heat” mode, where although there is no strict one-to-one coupling relationship between the electrical and heat power outputs of the CHP, a constraint relationship exists within a certain range of flexibility [39].
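$$\begin{aligned} & P_t^{\mathrm{CHP}} \ge \min\left\{ \lambda_1^{\mathrm{CHP}} H_t^{\mathrm{CHP}}, \; \lambda_2^{\mathrm{CHP}} \right\} \\ & P_{\min}^{\mathrm{CHP}} - \lambda_3^{\mathrm{CHP}} H_t^{\mathrm{CHP}} \le P_t^{\mathrm{CHP}} \le P_{\max}^{\mathrm{CHP}} - \lambda_4^{\mathrm{CHP}} H_t^{\mathrm{CHP}} \end{aligned} \tag{16}$$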
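$$H_{\min}^{\mathrm{CHP}} \le H_t^{\mathrm{CHP}} \le H_{\max}^{\mathrm{CHP}} \tag{17}$$
$$-\mathit{Ramp}_{\mathrm{CHP}}^{\mathrm{down}} \le P_t^{\mathrm{CHP}} - P_{t-1}^{\mathrm{CHP}} \le \mathit{Ramp}_{\mathrm{CHP}}^{\mathrm{up}} \tag{18}$$
Here, $\lambda_1^{\mathrm{CHP}}$, $\lambda_2^{\mathrm{CHP}}$, $\lambda_3^{\mathrm{CHP}}$, and $\lambda_4^{\mathrm{CHP}}$ are the thermoelectric coupling coefficients, which represent the technical characteristics of the CHP. $P_{\min}^{\mathrm{CHP}}$, $P_{\max}^{\mathrm{CHP}}$, $H_{\min}^{\mathrm{CHP}}$, and $H_{\max}^{\mathrm{CHP}}$ represent the minimum and maximum electrical and heat outputs of the CHP, respectively. $\mathit{Ramp}_{\mathrm{CHP}}^{\mathrm{up}}$ and $\mathit{Ramp}_{\mathrm{CHP}}^{\mathrm{down}}$ are the upward and downward power ramp rate limits for the CHP.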
  • Typical elastic load
Based on the assumptions, the temperature-controlled loads aggregated by the VPP are typical elastic loads, which could adjust their electricity consumption characteristics within a certain range in response to the retail price and temperature.
The regulation capability of temperature-controlled loads is shown below:
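$$\begin{cases} -P_{t,j}^{\mathrm{down},\theta} \le P_{t,j}^{\mathrm{DR}} \le 0, & 1 < \theta_t^{\mathrm{DR}} \\ 0 \le P_{t,j}^{\mathrm{DR}} < P_{t,j}^{\mathrm{up},\theta}, & 0 < \theta_t^{\mathrm{DR}} < 1 \end{cases} \tag{19}$$
where $P_{t,j}^{\mathrm{up},\theta}$ and $P_{t,j}^{\mathrm{down},\theta}$ represent the maximum upward and downward adjustable powers of elastic load $j$ when the retail electricity price discount factor is $\theta_t^{\mathrm{DR}}$.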
According to (19), the ability of the VPP to utilize temperature-controlled loads for operational optimization primarily hinges on the regulation potential of these loads under varying price discount factors. Because these loads are boundedly rational entities, their regulation potential is often influenced by expectations regarding energy economics and comfort. According to the reference point dependence principle in decision-making optimization problems, temperature-controlled loads can be characterized as follows [40]:
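$$\min f_j = c_j^{\mathrm{actual}} + c_j^{\mathrm{comfort}} + \mu^{\mathrm{cost}} \left( c_j^{\mathrm{actual}} - c_j^{\mathrm{base}} \right) \tag{20}$$
$$c_{t,j}^{\mathrm{actual}} = \theta_t^{\mathrm{DR}} p_t^{e} \left( P_{t,j}^{\mathrm{base}} + P_{t,j}^{\mathrm{DR}} \right), \qquad c_{t,j}^{\mathrm{comfort}} = \mu^{\mathrm{comfort}} \left| T_{t,j} - T_{\mathrm{set},j} \right|, \qquad c_{t,j}^{\mathrm{base}} = p_t^{e} P_{t,j}^{\mathrm{base}} \tag{21}$$
where $\min f_j$ is the objective function of temperature-controlled load $j$. $c_j^{\mathrm{actual}}$ represents the actual energy consumption cost. $c_j^{\mathrm{comfort}}$ represents the comfort deviation cost. $c_j^{\mathrm{base}}$ represents the energy consumption cost before participating in the VPP DR. $\mu^{\mathrm{comfort}}$ and $\mu^{\mathrm{cost}}$ are the deviation coefficients for the energy consumption comfort and economy. $\mu^{\mathrm{cost}} ( c_j^{\mathrm{actual}} - c_j^{\mathrm{base}} )$ represents the deviation cost of energy consumption economy. $T_{t,j}$ and $T_{\mathrm{set},j}$ represent the real-time temperature and the set baseline temperature of the load.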
Furthermore, regarding the mechanism of temperature control characteristics for elastic loads, using engineering thermodynamics, the temperature deviation of a single temperature-controlled load after participating in VPP regulation/DR can be described as follows [41]:
$$\begin{aligned} & T_{t,j} = T_{t-1,j} \, \gamma + (1 - \gamma) \left( T_{t,j}^{\mathrm{env}} - \omega_t \eta P_{t,j}^{\mathrm{elastic}} R \right) \\ & \gamma = e^{-\frac{\Delta t}{R c}} \\ & \omega_t P_{\min}^{\mathrm{elastic}} \le P_{t,j}^{\mathrm{elastic}} \le \omega_t P_{\max}^{\mathrm{elastic}} \end{aligned} \tag{22}$$
where $\gamma$ is the heat dissipation function. $T_{t,j}^{\mathrm{env}}$ represents the ambient temperature. $\omega_t$ represents the operating status of the elastic load, which is either 0 or 1. $\eta$ represents the energy efficiency ratio of the elastic load. $P_{t,j}^{\mathrm{elastic}}$ represents the power of the elastic load $j$ at time $t$. $R$ is the thermal resistance. $\Delta t$ is the time step for VPP regulation. $c$ represents the specific heat capacity of air. $P_{\max}^{\mathrm{elastic}}$ and $P_{\min}^{\mathrm{elastic}}$ represent the upper and lower power limits.
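The temperature recursion is easy to simulate step by step. The following Python sketch implements one update of the thermal model for a cooling load; the function and argument names are illustrative assumptions, and units are left to the caller as long as they are mutually consistent.

```python
import math


def next_temperature(t_prev, t_env, power, on, eta, r, c, dt):
    """One step of the first-order thermal model of a temperature-controlled load.

    t_prev: indoor temperature at the previous step.
    t_env: ambient temperature; power: electrical power of the load.
    on: operating status (0 or 1); eta: energy efficiency ratio.
    r: thermal resistance; c: heat capacity term; dt: time step.
    """
    # Heat dissipation factor: how much of the previous temperature persists.
    gamma = math.exp(-dt / (r * c))
    # Temperature the room would settle at if conditions stayed constant.
    steady_state = t_env - on * eta * power * r
    return gamma * t_prev + (1 - gamma) * steady_state
```

With the load switched off the temperature drifts toward the ambient value, and with the load on it drifts toward the lower steady-state value, which is what gives the VPP a bounded range of load regulation.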
It is important to note that for a cluster of temperature-controlled loads aggregated by the VPP, their regulation potential can be characterized as follows:
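$$\min F = \sum_{j \in J} f_j = \sum_{j \in J} \left( c_j^{\mathrm{actual}} + \mu^{\mathrm{comfort}} \left| T_{t,j} - T_{\mathrm{set},j} \right| + \mu^{\mathrm{cost}} \left( c_j^{\mathrm{actual}} - c_j^{\mathrm{base}} \right) \right) \tag{23}$$
where $J$ represents the cluster of temperature-controlled loads aggregated by the VPP.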

3.2. Solution

Based on the assumptions and optimization framework outlined in Section 2, under the multi-market, the VPP strategy optimization process essentially constitutes a mixed logical–continuous decision optimization problem across a time series. At any given time t within the scheduling period, the VPP formulates units, output plans, and load regulation plans based on factors such as the PV output, energy price levels, load levels, and RPS target progress. The implementation of these plans subsequently alters the load characteristics, RPS target achievement, and other factors in the subsequent time step, which, in turn, influences the formulation of future units, output plans, and load regulation plans.
Based on the preceding analysis, the core idea of strategy optimization for a VPP is consistent with the reinforcement learning approach of optimizing the mapping between “behavior policy” and “reward” through “planning–exploration–improvement.” The reinforcement learning concept can therefore be used to solve the VPP strategy optimization problem, which is formulated as an MDP, denoted as $M = \langle I, S, A, P, R, \lambda \rangle$ [42,43].
  • I represents the agent in the MDP, which is the VPP.
  • $S$ represents the state set, which reflects the decision-making environment faced by the VPP at each time step. $S = \{ S_1, S_2, \ldots, S_T \}$, where $S_t = ( s_t^1, s_t^2, \ldots, s_t^m )$, and $m$ describes the characteristic dimensions of the environment in which the VPP operates, namely the constraints mentioned in Section 3, such as the PV output, balancing market prices, the interconnection bus/pipeline capacity, etc.
  • $A$ is the action set, representing the possible decisions that the VPP can take at each time step. $A = \{ a_1, a_2, \ldots, a_T \}$, where $a_t = ( a_t^1, a_t^2, \ldots, a_t^n )$, and $n$ represents the dimensions of the VPP decision variables. Based on the assumptions, the decision variables include the CHP output $P_t^{\mathrm{CHP}}$, PV output $P_t^{\mathrm{PV}}$, electricity that the VPP declares to purchase/sell in the balancing market $P_t^{\mathrm{grid}}$, and elastic load scheduling $P_{t,j}^{\mathrm{DR}}$. In the MDP, the decision variables mentioned above can be denoted as $a_t^{\mathrm{CHP}}$, $a_t^{\mathrm{PV}}$, $a_t^{\mathrm{grid}}$, and $a_t^{\mathrm{DR}}$. Each decision $a_t^n$ must satisfy the constraints outlined in Section 3.
  • P represents the transition process to the next decision-making environment state after the VPP takes certain decisions (actions).
  • R is the immediate reward function, representing the reward that the VPP can obtain after taking certain decisions/actions. According to Section 3, the reward that the VPP can receive in the environmental state at time t is as follows:
    $$R(S_t, a_t) = \begin{cases} I_t^{\mathrm{sale}} - C_t^{\mathrm{CHP}} - C_t^{\mathrm{ele}} - C_t^{\mathrm{PV}}, & t \neq T \\ I_t^{\mathrm{sale}} - C_t^{\mathrm{CHP}} - C_t^{\mathrm{ele}} - C_t^{\mathrm{PV}} + C_t^{\mathrm{GEC}} + C_t^{\mathrm{CO_2}}, & t = T \end{cases}$$
It should be noted that the uncertainty of PVs has a direct impact on the VPP operation. Especially when the actual PV output is less than the forecasted output, the VPP needs to purchase electricity from the grid to ensure the energy balance. At this point, the VPP’s renewable energy consumption, carbon emissions, and RPS compliance progress will all change, thereby affecting related revenue and cost items.
In light of this, considering the uncertainty of PVs, from the perspective of MDP, the constraints in the model regarding PV costs need to be revised as follows:
$$C_t^{\mathrm{PV}} = \begin{cases} s^{\mathrm{PV},c}\, a_t^{\mathrm{PV}}, & a_t^{\mathrm{PV}} \le s_t^{\mathrm{PV}} \\ s^{\mathrm{PV},c}\, s_t^{\mathrm{PV}} + \left(a_t^{\mathrm{PV}} - s_t^{\mathrm{PV}}\right) s_t^{\mathrm{grid},p}, & a_t^{\mathrm{PV}} > s_t^{\mathrm{PV}} \end{cases}$$
where $s^{\mathrm{PV},c}$ represents the LCOE cost state coefficient for PVs, $a_t^{\mathrm{PV}}$ represents the PV output scheduled by the VPP, $s_t^{\mathrm{PV}}$ represents the maximum technical output state of PVs, and $s_t^{\mathrm{grid},p}$ represents the electricity price in the balancing market.
  • λ is the discount factor, representing the VPP’s preference for immediate rewards versus long-term rewards, with $\lambda \in (0, 1]$. The closer λ is to 0, the more the VPP values immediate rewards; the closer λ is to 1, the more it values long-term rewards.
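The reward and the revised PV cost can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the immediate reward is sale income minus the CHP, electricity purchase, and PV costs, with the GEC and carbon settlement terms added only at the final step t = T. The function names, argument names, and all numbers are hypothetical.

```python
def pv_cost(a_pv, s_pv, lcoe, grid_price):
    """PV cost for one time step, following the piecewise rule in the text.

    a_pv:       PV output scheduled by the VPP
    s_pv:       maximum technical PV output actually available
    lcoe:       LCOE cost coefficient for PV (cost per unit of energy)
    grid_price: balancing market price (cost per unit of energy)
    """
    if a_pv <= s_pv:
        return lcoe * a_pv
    # The shortfall (a_pv - s_pv) is covered at the balancing market price.
    return lcoe * s_pv + (a_pv - s_pv) * grid_price


def step_reward(sale_income, chp_cost, ele_cost, pv_cost_t,
                is_final=False, gec_term=0.0, co2_term=0.0):
    """Immediate reward; settlement terms enter only at the final step (assumed)."""
    reward = sale_income - chp_cost - ele_cost - pv_cost_t
    if is_final:
        reward += gec_term + co2_term
    return reward


# Hypothetical numbers: scheduling 6 units when only 5 are available
# forces 1 unit to be bought at the balancing market price.
cost_ok = pv_cost(a_pv=4, s_pv=5, lcoe=300, grid_price=500)
cost_short = pv_cost(a_pv=6, s_pv=5, lcoe=300, grid_price=500)
print(cost_ok, cost_short)
```

The second call shows why over-scheduling PV is risky: the marginal unit is charged at the balancing market price rather than at the PV cost coefficient.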
The key to solving the MDP lies in following Bellman’s optimality principle to find the sequence of actions (strategy π) that maximizes the expected cumulative rewards over the scheduling period T. According to Bellman’s optimality principle and the standard derivation in the reinforcement learning literature, the value function represents the mapping between the current state, actions, and the cumulative rewards under strategy π, as shown in (26) [44,45].
$$V_t^{\pi}(S) = R(S_t, a_t) + \lambda \sum_{S_{t+1} \in S} T(S_t, S_{t+1}, a_t)\, V_{t+1}^{\pi}(S_{t+1}) \tag{26}$$
Here, $V_t^{\pi}(S)$ is the state value function, representing the immediate reward plus the accumulated expected future rewards after taking action $a_t$ at time t; λ is the discount factor; and $T(S_t, S_{t+1}, a_t)$ represents the process of transitioning from state $S_t$ to state $S_{t+1}$ after taking action $a_t$.
Furthermore, the cumulative rewards of the VPP under the optimal strategy $\pi^{*}$ over the entire scheduling period can be characterized as follows:
$$F^{\pi^{*}} = V_t^{\pi^{*}}(S) = F_t\big(S_t, \pi^{*}(S_t)\big) + \lambda\, \mathbb{E}\big[V_{t+1}^{\pi^{*}}(S_{t+1}) \mid S_t\big]$$
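The optimal sequence of actions $\pi^{*}$ can be extracted from the optimal value function based on the greedy algorithm, that is, by selecting at each step the action with the highest value, as shown below.
$$\pi^{*} = \arg\max_{a_t \in A,\; S_t \in S} V_t(S_t, a_t)$$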
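The backward recursion and the greedy extraction of the policy can be illustrated on a toy finite-horizon MDP. The sketch below is not the VPP model: the two states, two actions, rewards, and transition probabilities are hypothetical, rewards are taken as time-independent for brevity, and only the discount factor of 0.7 is borrowed from the case study settings.

```python
def backward_induction(rewards, transitions, horizon, discount):
    """Finite-horizon value iteration.

    rewards[s][a]          immediate reward for action a in state s
    transitions[s][a][s2]  probability of moving from s to s2 under a
    Returns (values, policy): values[t][s] is the best achievable value
    from step t onward, policy[t][s] the greedy (maximizing) action.
    """
    n_states = len(rewards)
    n_actions = len(rewards[0])
    values = [[0.0] * n_states for _ in range(horizon + 1)]  # value after the last step is 0
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):  # iterate backward in time
        for s in range(n_states):
            q = [
                rewards[s][a] + discount * sum(
                    transitions[s][a][s2] * values[t + 1][s2]
                    for s2 in range(n_states)
                )
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=q.__getitem__)
            policy[t][s] = best
            values[t][s] = q[best]
    return values, policy


# Hypothetical toy problem: state 0 = low price, state 1 = high price;
# action 0 = buy from the grid, action 1 = run CHP. States alternate.
rewards = [[1.0, 0.0], [0.0, 2.0]]
transitions = [[[0.0, 1.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]]]
values, policy = backward_induction(rewards, transitions, horizon=2, discount=0.7)
print(values[0], policy[0])
```

In this toy setting the greedy step selects the grid in the low-price state and CHP in the high-price state, and the value at the first step adds the discounted value of the state that follows.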
Following Bellman’s optimality principle, the VPP strategy optimization problem can be solved as a backward iterative process, in which an evolutionary algorithm extracts the optimal actions for each scheduling period. The solution process is shown in Figure 3. Specifically, the evolutionary algorithm adopted is the genetic algorithm (GA). This algorithm treats each individual within the population as a feasible solution (a feasible decision in the MDP) within the solution space (the state constraints in the MDP). By simulating the biological evolution process, it applies genetic operations such as inheritance, mutation, crossover, and replication to search for the optimal solution (optimal actions) within the solution space, ultimately forming the optimal sequence of actions $\pi^{*}$ over time [46,47].
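A compact real-coded GA gives a feel for the search step. The text names only the operators, so the encoding, tournament selection, arithmetic crossover, and uniform-resampling mutation below are assumptions made for illustration; the default hyperparameters mirror those reported in Section 4.2 (population 80, 150 generations, crossover probability 0.7, mutation probability 0.01). The fitness function is a hypothetical stand-in for the per-period value.

```python
import random


def genetic_algorithm(fitness, bounds, pop_size=80, generations=150,
                      crossover_prob=0.7, mutation_prob=0.01, seed=0):
    """Maximize `fitness` over box-constrained real vectors."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(pop, scores):
        # Binary tournament: pick two individuals, keep the fitter one.
        i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
        return pop[i] if scores[i] >= scores[j] else pop[j]

    pop = [random_individual() for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        new_pop = [best[:]]  # replication: carry the best individual over unchanged
        while len(new_pop) < pop_size:
            p1, p2 = tournament(pop, scores), tournament(pop, scores)
            if rng.random() < crossover_prob:
                # Arithmetic crossover: blend the two parents.
                w = rng.random()
                child = [w * x + (1 - w) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            # Mutation: resample a gene uniformly within its bounds.
            child = [rng.uniform(lo, hi) if rng.random() < mutation_prob else g
                     for g, (lo, hi) in zip(child, bounds)]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


def toy_fitness(x):
    """Hypothetical objective with its maximum (0) at x = (3, -1)."""
    return -(x[0] - 3.0) ** 2 - (x[1] + 1.0) ** 2


best_x, best_score = genetic_algorithm(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_score)
```

Keeping the best individual in every generation means the returned fitness never falls below the best value found in the initial population. In a VPP setting, constraint handling (ramp rates, power balance) would additionally be needed, for example through repair or penalty terms, which this sketch omits.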

4. Results and Discussions

4.1. Basic Data

The case study selected in this manuscript is a park-level VPP project located in southern China. The architecture and operational model of the VPP are described in Section 2, and the relevant equipment parameters are specified in Table 1. Notably, the stepped carbon trading mechanism implemented has a maximum floating tier of ±3 levels. For quantities that exceed or fall short of the carbon quota by more than 30%, the trading will still be carried out at the price corresponding to the ±3rd tier.
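The capped tier rule can be made concrete with a short sketch. The tier width (10% of the quota, inferred from the 30% cap spread over three tiers), the linear price growth per tier, and the convention that the first tier trades at the base price are all assumptions for illustration; the exact pricing formula is the one given in Section 3. All numbers are hypothetical.

```python
def stepped_carbon_cost(emissions, quota, base_price,
                        growth=0.25, tier_width=0.10, max_tier=3):
    """Signed carbon settlement under a stepped price.

    Positive result: cost of buying allowances (emissions above quota).
    Negative result: income from selling surplus allowances.
    """
    deviation = emissions - quota
    remaining = abs(deviation)
    step = tier_width * quota
    total = 0.0
    for tier in range(1, max_tier + 1):
        if remaining <= 0:
            break
        # The last tier absorbs everything beyond its lower bound,
        # so deviations above 30% still trade at the third-tier price.
        block = remaining if tier == max_tier else min(remaining, step)
        total += block * base_price * (1 + growth * (tier - 1))
        remaining -= block
    return total if deviation > 0 else -total


# Hypothetical quota of 1000 t and base price of 100 per tonne.
print(stepped_carbon_cost(1050, 1000, 100))  # deviation within the first tier
print(stepped_carbon_cost(1350, 1000, 100))  # 35% over quota, capped at tier 3
print(stepped_carbon_cost(800, 1000, 100))   # 20% under quota, surplus sold
```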
The loads aggregated by the VPP are shown in Figure 4; the retail electricity and heat prices, as well as the transaction prices in the electricity balancing market and the natural gas market, are presented in Figure 5; and the forecasted PV output is illustrated in Figure 6. Additionally, for elastic load control, the retail electricity price discount coefficients are set at ±5%, ±10%, and ±15%, making a total of six levels. The regulation range of the load at each level is derived from the accumulated data in the VPP dispatching system.

4.2. Scenarios and Results

To analyze the VPP strategy optimization problem under the electricity–carbon–certificate multi-market, four scenarios are established, as outlined in Table 2. The parameter settings for RPS weighting, GEC price, and carbon trading price are based on data from China’s official institutions, including NDRC (https://www.ndrc.gov.cn/), China’s Green Certificate Trading Platform (https://www.greenenergy.org.cn/), and Shanghai Environment and Energy Exchange (the designated organization for carbon trading management in China by MEE) (https://www.cneeex.com/).
In addition, for each scenario in the solution, the discount factor λ is set to 0.7, with a maximum iteration count of 150. For the evolutionary algorithm, a genetic algorithm is employed, with a population size of 80, a maximum number of generations set to 150, a mutation probability of 0.01, and a crossover probability of 0.7. Based on the objective function outlined in Section 3, the results for each scenario are presented in Table 3.
To analyze the convergence characteristics of the model and solving method, we employed the “holistic modeling and solving” approach outlined in Section 1 and selected the GA for comparative analysis. The convergence values (measured by VPP revenue) and convergence times (computational duration) of both methods are presented in Table 4.
It can be observed that, due to differences in iteration patterns and problem complexity during the solving process, the proposed method requires a longer convergence time than GA, indicating its higher demand for and reliance on computational resources to some extent. However, the proposed method exhibits superior performance in terms of convergence values, particularly as the variety of market types increases. In such cases, its achieved convergence values significantly outperform those obtained by GA.

4.3. Discussions

To analyze the specific impacts of the multi-market on VPPs, the following discussion will focus on three aspects: VPP strategy, environmental performance, and economic performance.
(a) VPP Strategy
The VPP’s strategies in the four scenarios are shown in Figure 7.
In Scenario S1, firstly, from 1:00 to 10:00 and from 21:00 to 24:00, the electricity price in the balancing market is lower than or close to the CHP cost, so the VPP tends to minimize the CHP output and prioritizes purchasing electricity from the grid to meet the load demand. However, because the balancing market price rises rapidly at 11:00 and the CHP is subject to ramp rate constraints, the VPP notably increases the CHP output at 10:00, even though the balancing market price at that hour is significantly lower than the CHP cost, to ensure sufficient electrical output capability at 11:00. This may even require selling excess generated electricity to the balancing market at a price below cost.
Secondly, from 11:00 to 20:00, the VPP increases the output of both CHP and PV because the balancing market prices during this period are mostly significantly higher than the generation costs of CHP and PV. Specifically, from 11:00 to 15:00, the VPP even sells electricity to the grid through the balancing market to generate revenue. It is worth noting that from 13:00 to 14:00, even though the VPP is selling electricity to the balancing market and the CHP still has ample spare generation capacity, the VPP still adopts a strategy of reducing elastic load power through DR. The reason for this is that during this period, the balancing market price is significantly higher than the VPP’s retail electricity price, generation costs, and DR cost. Compared to supplying electricity to the load, selling as much electricity as possible in the balancing market brings more revenue to the VPP.
In S2, from 1:00 to 10:00 and from 16:00 to 24:00, although the VPP is constrained by the RPS, its strategies are relatively similar to those in S1, as these periods do not coincide with the peak output of PV. From 11:00 to 15:00, which is the peak period for PV output, under the RPS, influenced by the potential economic penalties and the potential income from GEC transactions, the VPP increases the PV output level, aiming to reduce the expected penalties and maximize the expected GEC transaction income at the end of the dispatch period.
In S3, from 1:00 to 10:00 and from 21:00 to 24:00, compared to S1 and S2, the VPP significantly reduces the amount of electricity purchased from the balancing market. This is because, under CA, both the operation of CHP and the purchase of electricity from the balancing market incur corresponding carbon emission costs. During these two periods, the average carbon emission factor of CHP is lower than that of grid electricity, making the total cost of supplying electricity through CHP (including fuel cost and carbon emission cost) lower than that of purchasing electricity from the balancing market (including transaction cost and carbon emission cost). Therefore, within the constraints of ramp rates, power balance, and other factors, the VPP tends to maximize CHP output to reduce comprehensive costs.
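The switch from grid purchases to CHP under CA comes down to comparing per-unit costs once carbon is priced in. The sketch below assumes a flat carbon price (the stepped mechanism is ignored for brevity), and every number in it is hypothetical rather than taken from the case study.

```python
def effective_cost(energy_cost, emission_factor, carbon_price):
    """Cost per MWh including the carbon cost of the associated emissions."""
    return energy_cost + emission_factor * carbon_price


def cheaper_source(chp_fuel_cost, chp_factor, grid_price, grid_factor, carbon_price):
    """Return which source has the lower carbon-inclusive cost per MWh."""
    chp = effective_cost(chp_fuel_cost, chp_factor, carbon_price)
    grid = effective_cost(grid_price, grid_factor, carbon_price)
    return "CHP" if chp < grid else "grid"


def break_even_carbon_price(chp_fuel_cost, chp_factor, grid_price, grid_factor):
    """Carbon price at which CHP and grid electricity cost the same.

    Only meaningful when CHP has the higher energy cost and the lower emission factor.
    """
    return (chp_fuel_cost - grid_price) / (grid_factor - chp_factor)


# Hypothetical off-peak hour: the grid is cheaper on energy cost alone,
# but CHP emits less per MWh (0.4 t/MWh versus 0.8 t/MWh).
print(cheaper_source(450, 0.4, 400, 0.8, carbon_price=0))
print(cheaper_source(450, 0.4, 400, 0.8, carbon_price=200))
print(break_even_carbon_price(450, 0.4, 400, 0.8))
```

Below the break-even carbon price the VPP would still favor grid purchases in such an hour; above it, CHP becomes the cheaper source, which matches the behavior described for S3.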
During the peak PV output period (11:00 to 16:00), compared to S2, the VPP further increases PV output and the amount of electricity sold in the balancing market. This is because the current unit carbon trading price in China is significantly higher than the unit GEC price, leading to a higher expected carbon trading income under CA compared to the expected GEC trading income under RPS.
In S4, the strategies of the VPP from 1:00 to 8:00 and from 20:00 to 24:00 are relatively similar to those in S3, both tending to maximize the CHP output to reduce the expected carbon emission costs at the end of the dispatch period (and increase the expected income from carbon trading). Different from S1, S2, and S3, during 9:00 to 18:00 when PVs have generating capacity, the VPP significantly increases PV output. This is due to the combined influence of RPS and CA, through which the VPP can obtain potential multiple benefits by consuming PV power. From 11:00 to 12:00 and from 15:00 to 16:00, the VPP further enhances its ability to absorb PV power by increasing the power of elastic loads. Correspondingly, from 11:00 to 14:00, the VPP also increases its electricity sales through the balancing market. In addition to being influenced by the “RPS + CA”, these strategies are also affected by the difference between the VPP’s retail electricity price, the balancing market price, and the DR cost.
(b) Environmental performance
Figure 8 presents the change rates of environmental indicators in S2, S3, and S4 compared to S1. In terms of renewable energy consumption, S2 shows an increase of 10.38%, S3 an increase of 24.31%, and S4 an increase of 30.21%. In terms of carbon emissions, S2 dropped to approximately 76,300 metric tons, a decrease of 17.16%; S3 fell to around 54,900 metric tons, a 33.37% reduction; and S4 decreased to roughly 46,900 metric tons, a 43.08% decline.
It is evident that participating solely in the GEC market or carbon trading market (that is, implementing either RPS or CA alone), or in combination, significantly enhances renewable energy consumption and reduces carbon emissions. Notably, compared to participating solely in the GEC market, participating solely in the carbon trading market can achieve more significant environmental benefits. One of the key reasons for this is that the price of GECs is significantly lower than that of carbon trading in China.
Additionally, compared to participating solely in the GEC market or the carbon trading market, participating in the electricity–carbon–certificate multi-market can further enhance environmental performance, but the marginal increase is diminishing. Apart from market price factors, the limitations in total installed PV capacity, as well as the scale and regulation capabilities of elastic loads, are important reasons for this phenomenon.
As illustrated in Figure 9, the description of PV output across the four scenarios can be summarized as follows:
As RPS, CA, and the combination of “RPS + CA” are implemented (that is, the multi-market), the VPP enhances the PV output level, particularly evident during the period from 11:00 to 15:00. The reason for this is that the enforcement of RPS and CA imposes mandatory constraints on the VPP to absorb renewable energy and reduce carbon emissions. Simultaneously, the resulting GEC trading and carbon trading broaden the VPP’s income channels, enhancing its willingness and ability to bear the risks associated with PV uncertainty. Especially during the PV output peak period from 11:00 to 15:00, when the VPP’s retail electricity price is significantly higher than the balancing market price and the PV cost, the VPP adopts a more aggressive PV output strategy.
However, it is important to note that in S4, at 13:00, the PV output does not show a significant increase compared to S3. This is because the VPP’s retail electricity price during this “mid-peak period” is lower than the balancing market price, limiting the VPP’s willingness to tolerate the PV uncertainties. To some extent, this demonstrates that relying solely on strong policy compliance constraints is not sufficient to continuously enhance the VPP’s capacity to absorb renewable energy. Other factors, such as market trading prices, also need to be taken into account.
(c) Economic performance
From an overall perspective, the growth rates of revenue among the four scenarios are illustrated in Figure 10. Compared to S1, the total revenues of the VPP in S2, S3, and S4 have increased by 3.91%, 34.75%, and 44.86%, respectively. This indicates that participating in the carbon market or the GEC market alone, or in the multi-market (that is, implementing RPS and CA either separately or in combination), can boost the total revenue of the VPP. Although this is closely related to the capacity and structural ratio of the PVs, CHP, and aggregated load operated by the VPP in the case, when combined with the discussions of environmental performance, it suggests, to a certain extent, that with reasonable planning and allocation of unit capacities and loads, participating in the multi-market can promote both economic and environmental benefits.
From a structural perspective, the cost and income structures for the four scenarios are as depicted in Figure 11.
In terms of cost structure, the primary cost for the VPP in all four scenarios is the CHP cost, accounting for more than 50% in each case. The proportion of CHP cost gradually increases from S1 to S4, whereas the proportions of PV cost and DR cost do not exhibit significant changes. The rising CHP share is attributed to the fact that the carbon emission factor of CHP is lower than that of electricity from the grid, together with the CA weight. As CA constraints tighten, the VPP tends to replace electricity purchased from the grid through the balancing market with CHP, which offers more comprehensive advantages.
In terms of income structure, under the multi-market, energy supply income remains the main source of income for the VPP in all four scenarios, accounting for more than 80% in each case. Notably, the income from GEC trading is significantly lower than that from carbon trading, and in S4, it is even less than the income from the balancing market. To some extent, this reflects the impact of the current multi-market mechanism and policy environment, where there exists a disparity and imbalance in prices between GECs and carbon trading.

5. Conclusions

This manuscript analyzes the potential impacts of the electricity–carbon–certificate multi-market and related mechanisms on the operation of VPPs. It proposes an optimization framework and model for VPPs under the multi-market. Furthermore, based on the concept of reinforcement learning, it characterizes the VPP optimization process as an MDP for solution. Finally, a case study is conducted on a VPP located in southern China, and the results are as follows:
  • Whether participating solely in the certificate market, the carbon trading market, or engaging in the electricity–carbon–certificate multi-market, VPPs can significantly enhance their environmental performance. This includes improving their renewable energy consumption capacity and reducing total carbon emissions. Additionally, participating in the multi-market can broaden VPPs’ income streams and notably increase their total revenues. However, compared to energy supply income, the proportion of income derived from carbon trading, electricity sales in the balancing market, and certificate trading within the total income remains relatively limited. In the four scenarios analyzed in the case study, energy supply revenue accounted for over 80% of the total income in each case. Notably, the income from certificate trading accounts for an extremely minor portion of VPPs’ total income. Specifically, in the case analysis, green certificate trading income accounted for less than 1%, significantly lower than both carbon trading income and electricity sales income in the balancing market. This is attributed to factors such as the currently low price of GECs in China.
  • Participating in the electricity–carbon–certificate multi-market can enhance VPPs’ willingness and ability to undertake the uncertainty risks associated with renewable energy, engage in DR programs, and participate in electricity sales in the balancing market. Specifically, in the case study, the VPP under the multi-market adopted more aggressive dispatch strategies during peak PV output periods, resulting in a 31.11% increase in renewable energy consumption compared to the scenario without multi-market participation. This is of great importance for tapping into demand-side resources, facilitating the source–grid–load interaction, enhancing the level of renewable energy integration, and improving the overall energy structure.
  • From a dispatch strategy perspective, under the multi-market, VPP entities can moderately increase the output plans for CHP and renewable energy generators within a certain range. This is because the potential revenues from certificate trading, carbon trading, and reduced electricity procurement costs can effectively offset, or even surpass, the risks and costs associated with renewable energy uncertainty. However, the timing and extent of output plan adjustments must be analyzed based on the differential margins among retail electricity price, balancing market price, and generation costs.
  • Compared to participating solely in the certificate market, engaging in the carbon trading market or the multi-market yields more significant improvements in both the economic and environmental performances of VPPs. Taking the case study as an example, when the VPP participated in the carbon trading market, the profits increased by 33.65%, renewable energy consumption rose by 11.71%, and carbon emissions decreased by 27.97%. When participating in the multi-market, profits increased by 38.49%, renewable energy consumption rose by 18.79%, and carbon emissions decreased by 38.47%. On one hand, this can be attributed to the relatively high prices in the carbon trading market and the stepped pricing mechanism. On the other hand, it is linked to the multiple benefits derived from consuming renewable energy electricity under the carbon market or the multi-market. These benefits include the increase in expected income from GECs and carbon trading at the end of the dispatch period, as well as a reduction in the expected penalties for non-compliance with RPS.

Author Contributions

Conceptualization, Y.X., Y.L. and S.K.; methodology, Y.X. and J.M.; software, Y.X.; validation, Y.X., Y.L. and T.W.; formal analysis, Y.X., Y.L. and S.K.; data curation, Y.L. and S.K.; writing—original draft preparation, Y.X. and J.M.; writing—review and editing, Y.X. and J.M.; funding acquisition, Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the China Postdoctoral Science Foundation, grant number 2024M753545.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Yanbin Xu, Yi Liao, Shifang Kuang, and Ting Wen were employed by China Energy Engineering Group Guangdong Electric Power Design Institute Co., Ltd. Author Jiaxin Ma was employed by Management Science Research Institute of Guangdong Power Grid Corporation. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Liu, Y.; Li, Y.; Du, E.; Zhang, N.; Kang, C.; Du, S. Preliminary Exploration of Carbon Emission Estimation Technology Based on Power Big Data. Autom. Electr. Power Syst. 2025, 1–14. Available online: https://www.cnki.com.cn/Article/CJFDTotal-DLXT20250331001.htm (accessed on 5 June 2025).
  2. Zhou, C.; Chen, Y.; Xiong, B.; Lin, H. Mission of New Energy under Carbon Neutrality Goal in China. Bull. Chin. Acad. Sci. 2023, 38, 48–58.
  3. Kida, Y.; Hara, R.; Kita, H. Microgrid introduction for stabilization of power grid. IEEE Trans. Power Energy 2023, 143, 157–164.
  4. Chen, W.; Zheng, X.; Yan, Q.; Wang, Y.; Liu, W.; Wang, X.; Kang, J. Optimal scheduling of virtual power plant under the green certificate carbon trading interaction mechanism. South. Power Syst. Technol. 2025, 1–12.
  5. Zeng, M.; Ma, J.; Xu, Y.; Li, R.; Bai, B.; Zhou, X. Mechanism and Path Exploration of Virtual Power Plant Participating in Peaking Product Trading under Carbon Peaking and Carbon Neutrality Goal. Price Theory Pract. 2022, 10, 9–14.
  6. Chen, Y.; Niu, Y.; Qu, C.; Du, M.; Liu, P. A pricing strategy based on bi-level stochastic optimization for virtual power plant trading in multi-market: Energy, ancillary services and carbon trading market. Electr. Power Syst. Res. 2024, 231, 110371.
  7. Zhang, N.; Jia, J.; Li, B.; Shi, Z. Study on optimization of operation strategy of electric-gas coupling virtual power plant considering carbon trading. Electr. Meas. Instrum. 2024, 61, 20–28.
  8. Ding, J.; Qin, H.; Su, P.; Zeng, X.; Li, J.; Hao, W. Optimal scheduling of virtual power plants based on improved Harris Hawk optimization algorithm. Renew. Energy Resour. 2025, 43, 6829–6838.
  9. Liao, Y.; Chen, J.; Yang, Y.; Ainiwaer, A. Optimal scheduling of virtual power plant with P2G and photo thermal power plant considering the flexible operation of carbon capture power plants. Electr. Power Constr. 2022, 43, 420–427.
  10. Elgamal, A.H.; Shahrestani, M.; Vahdati, M. Assessing and comparing a DDPG model and GA optimization for a heat and power virtual power plant operating in a power purchase agreement scheme. Heliyon 2024, 10, e24318.
  11. Liu, J.; Hu, H.; Yu, S.S.; Trinh, H. Virtual Power Plant with Renewable Energy Sources and Energy Storage Systems for Sustainable Power Grid-Formation, Control Techniques and Demand Response. Energies 2023, 16, 3705.
  12. Zhou, Y.; Wu, W.U.J.; Sun, G.; Han, H.; Zang, H. A Two-Stage Robust Trading Strategy for Virtual Power Plant in Multi-level Electricity-Carbon Market. Autom. Electr. Power Syst. 2024, 48, 1838–1846.
  13. Ju, L.; Zhe, Y.; Zhou, Q.; Li, Q.; Wang, P.; Tian, W.; Li, P.; Tan, Z. Nearly-zero carbon optimal operation model and benefit allocation strategy for a novel virtual power plant using carbon capture, power-to-gas, and waste incineration power in rural areas. Appl. Energy 2022, 310, 118618.
  14. Ge, C.; Lin, S.; Tan, J.; Yang, F.; Li, D. A day-ahead optimal coordination strategy for a VPP with multiple prosumers. Power Syst. Technol. 2024, 49, 62521–62532.
  15. Wu, H.; Liu, X.; Ye, B.; Xu, B. Optimal dispatch and operation strategy of a virtual power plant based on a Stackelberg game. IET Gener. Transm. Distrib. 2020, 14, 552–563.
  16. Song, J.; Yang, Y.; Xu, Q.; Liu, Z.; Zhang, X. Robust bidding game approach for multiple virtual power plants participating in day-ahead electricity market. Electr. Power Autom. Equip. 2023, 43, 77–85.
  17. Aldecheishem, A.; Bukhsh, R.; Airajeh, N.; Javaid, N. Faa vpp: Fog as a virtual power plant service for community energy management. Future Gener. Comput. Syst. 2020, 105, 675–683.
  18. Dong, J.; Guo, H.; Jiang, T.; Du, E.; Zhang, N.; Kang, C. Research on Electricity-Carbon Coupling Trading in the New Power System. Proc. CSEE 2025, 1–12.
  19. Wang, Y.; Li, J.; Liu, J.; Tang, C.; Wang, J.; Lu, J.; Ren, H. Comprehensive cost analysis of multiple entities under the coupling mechanism of electricity and carbon. Elect. Power 2025, 58, 180–189.
  20. Ren, Y.; Sun, F.; Yang, X.; Zhang, J. Analysis of the dynamic transmission efficiency of carbon prices from the perspective of the electricity-carbon market linkage. Stat. Decis. 2024, 40, 183–188.
  21. Zeng, H.; Wang, M.; Li, T.; Duan, L.; Sun, K.; Xia, T. P2P-based carbon market trading model and electric carbon joint clearing method. Power Demand. Side Manag. 2024, 26, 95–100.
  22. Wu, Q.; Zhao, X.; Zhang, J.; Qiu, Z. Electricity-Carbon Market Coupling Incentive Clearing Mechanism to Promote Consumption of New Energy. Electr. Power Constr. 2023, 44, 14–27.
  23. Sun, X.; Ding, Y.; Bao, M.; Guo, C.; Liang, Z. Carbon-electricity market equilibrium analysis considering multi-time coupling decision of power producers. Autom. Electr. Power Syst. 2023, 47, 1–11.
  24. Wu, C.; Yang, X.; Wan, Z.; Yang, F.; Wu, Z.; Zhou, Q.; Xu, J. Two-stage robust optimal scheduling for virtual power plants considering carbon-green certificate interlinked trading. Smart Power 2025, 53, 16–23.
  25. Zhang, L.; Bao, F. Study on coordinated operation of multiple virtual power plants integrated with distributed renewable energy. Electr. Meas. Instrum 2024. Available online: https://kns.cnki.net/kcms2/article/abstract?v=_ISxPpdig3z-JuBmma_KoxHQuj3yq9wOe1hSZAEpWXD5xiJgsqMEbJzS6Izp8GsvFiD3Qz89snvoOU3uhsLvrVunchrTeTWp_L0iacOcPFgJF5wcdsAx7gMwllngebhwZNQ_EKBZrMjo0q1hj9sCfnA0oIES6pXpAsi3h8FZPMT58LSLXyvqHA==&uniplatform=NZKPT&language=CHS (accessed on 5 June 2025).
  26. Zhang, L.; Luan, H.; Du, H.; Zheng, L.; Lv, L. Low-carbon economic dispatch of a virtual power plant with carbon capture and compressed liquid carbon dioxide storage. Power Syst. Technol. 2024; to be published.
  27. Yi, Z.; Xu, Y.; Zhou, J.; Wu, W.; Sun, H. Bilevel programming for optimal operation of an active distribution network with multiple virtual power plants. IEEE Trans. Sustain. Energy 2020, 11, 2855–2869.
  28. Yu, K.; Wang, L.; Zhang, R.; Wang, S. Optimal scheduling of virtual power plants considering carbon emission allowances and user satisfaction. Control. Eng. China 2025, 1–10.
  29. Liu, S.; Ai, Q.; Zheng, J.; Wu, R. Bi-level coordination mechanism and operation strategy of multi-time scale multiple virtual power plants. Proc. CSEE 2018, 38, 753–761.
  30. Li, X.; Zhao, D. Distributed coordinated optimal scheduling of multiple virtual power plants based on decentralized control structure. Trans. China Electro-Tech. Soc. 2023, 38, 71852–71863.
  31. Mao, T.; Li, J.; Zhou, B.; Cheng, R.; Zhao, W.; Wang, T. Research on Bidding Strategies of Virtual Power Plants in the Integrated Electricity and Carbon Market-Analysis based on uncertainty factors. Price Theory Pract. 2024, 12, 210–216+232.
  32. Xie, M.; Ma, G.; Liu, B.; Pan, Z.; Shang, Y. Virtual power plant quotation strategy based on information gap decision theory. Electr. Power 2024, 57, 40–50.
  33. Aghdam, F.H.; Javadim, S.; Catalao, J.P.S. Optimal stochastic operation of technical virtual power plants in reconfigurable distribution networks considering contingencies. Int. J. Elec. Power. 2023, 147, 108799.
  34. Shen, S.; Han, H.; Zhou, Y.; Sun, G.; Wei, Z. Electricity-carbon-reserve peer-to-peer trading model for multiple virtual power plants based on conditional value-at-risk. Autom. Electr. Power Syst. 2022, 46, 147–157.
  35. Liu, Y.; Lin, H. Bidding strategy of virtual power plant considering carbon trading and conditional value at risk. Electr. Power Eng. Technol. 2023, 42, 6179–6188.
  36. Zhao, A.P.; Li, S.; Xie, D.; Wang, Y.; Li, Z.; Hu, P.J.; Zhang, Q. Hydrogen as the nexus of future sustainable transport and energy systems. Nat. Rev. Electr. Eng. 2025. Available online: https://www.nature.com/articles/s44287-025-00178-2#citeas (accessed on 5 June 2025).
  37. Shang, N.; Chen, Z.; Leng, Y. Mutual Recognition Mechanism and Key Technologies of Typical Environmental Interest Products in Power and Carbon Markets. Proc. CSEE 2023, 44, 2558–2578.
  38. Wang, L.; Dong, H.; Lin, J.; Zeng, M. Multi-objective optimal scheduling model with IGDT method of integrated energy system considering ladder type carbon trading mechanism. Int. J. Electr. Power Energy Syst. 2022, 143, 143–108386.
  39. Lu, N. An evaluation of the HVAC load potential for providing load balancing service. IEEE Trans. Smart Grid 2012, 3, 1263–1270.
  40. Zhu, X.; Sun, Y.; Yang, B.; Yang, J.; Wu, B. Calculation method of EV cluster’s schedulable potential capacity considering uncertainties and bounded rational energy consumption behaviors. Electr. Power Autom. Equip. 2022, 42, 245–254.
  41. Huang, P.; Zhan, H.; Peng, G.; Zhang, X.; Zhang, N. An optimization model of promoting accommodation of wind power combining demand response and CHP with heat storage. Electr. Meas. Instrum. 2017, 54, 1–6.
  42. Keerthisinghe, C.; Chapman, A.C.; Verbic, G. PV and Demand Models for a Markov Decision Process Formulation of the Home Energy Management Problem. IEEE Trans. Ind. Electron. 2019, 66, 1424–1433.
  43. Li, T.; Wang, X.; Dou, J.; Liu, Z.; Liu, F.; Si, F.; He, J. Research on optimal dispatch of integrated energy systems based on constrained reinforcement learning. Power Syst. Prot. Control. 2024, 53, 1–14.
  44. Keerthisinghe, C.; Verbic, G.; Chapman, A.C. A fast technique for smart home management: ADP with temporal difference learning. IEEE Trans. Smart Grid 2016, 9, 3291–3303.
  45. Hansen, T.M.; Chong, E.K.P.; Suryanarayanan, S.; Maciejewski, A.A.; Siegel, H.J. A partially observable Markov decision process approach to residential home energy management. IEEE Trans. Smart Grid 2018, 9, 1271–1281.
  46. Fu, Z.; Li, X.; Zhu, J.; Yuan, Y. Intelligent optimization strategy of home energy management based on Markov decision process. Electr. Power Autom. Equip. 2020, 40, 141–148.
  47. Bi, C.; Tang, Y.; Luo, Y.; Lu, C. Review on critical problems in reinforcement learning methods applied in power system optimization and control scenarios. Proc. CSEE 2024, 44, 1–22.
Figure 1. VPP optimization framework under the multi-market.
Figure 2. Stepped carbon trading mechanism.
Figure 3. VPP strategy optimization process.
Figure 4. Loads aggregated by the VPP.
Figure 5. Energy price information.
Figure 6. PV output information.
Figure 7. VPP optimization strategies in Scenarios S1 to S4.
Figure 8. Change rate of environmental indicators compared to S1.
Figure 9. PV output in S1 to S4.
Figure 10. Growth rate of revenue across scenarios.
Figure 11. Cost and income structures of the VPP in S1 to S4.
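Table 1. Parameter information.

| Items | Value | Units |
|---|---|---|
| CHP capacity | 10 | MW |
| CHP ramp rate $Ramp_{\mathrm{CHP}}^{\mathrm{up}}$, $Ramp_{\mathrm{CHP}}^{\mathrm{down}}$ | ±2 | MW/h |
| CHP carbon emission coefficients | $\mu_{\mathrm{CHP},a} = 0.01$; $\mu_{\mathrm{CHP},b} = 0.2$; $\mu_{\mathrm{CHP},c} = 0.012$ | / |
| CHP thermoelectric coupling coefficients | $\lambda_{1}^{\mathrm{CHP}} = 0.44$; $\lambda_{2}^{\mathrm{CHP}} = 45.4$; $\lambda_{3}^{\mathrm{CHP}} = 0.23$; $\lambda_{4}^{\mathrm{CHP}} = 0.5$ | / |
| CHP efficiency coefficient | 0.9 | / |
| PV capacity | 10 | MW |
| PV LCOE cost $c_{\mathrm{PV}}$ | 250 | CNY/MWh |
| Bus capacity $P_{\max}^{\mathrm{grid}}$ | 5 | MW |
| Pipeline capacity $V_{\max}^{\mathrm{gas}}$ | 1500 | Nm³ |
| Grid carbon emission factor $\mu_{\mathrm{grid}}$ | 0.85 | tCO2e/MWh |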
Table 2. Scenario settings.

| Scenarios | GEC Market (RPS) | Carbon Market (CA) |
|---|---|---|
| S1 | Not implemented | Not implemented |
| S2 | RPS weighting $\omega$ = 15%; $p_{\mathrm{GEC}} = p_{\mathrm{penalty}}$ = 50 CNY/MWh | Not implemented |
| S3 | Not implemented | 100% free CA; benchmark price $p_{\mathrm{CO_2}}$ = 100 CNY/tCO2e; carbon emission interval in the stepped carbon trading $d$ = 10%; pricing adjustment factors $\alpha = \beta$ = 10%; CA weight $\omega_{\mathrm{CO_2}}$ = 0.7 |
| S4 | RPS weighting $\omega$ = 15%; $p_{\mathrm{GEC}} = p_{\mathrm{penalty}}$ = 50 CNY/MWh | 100% free CA; benchmark price $p_{\mathrm{CO_2}}$ = 100 CNY/tCO2e; carbon emission interval in the stepped carbon trading $d$ = 10%; pricing adjustment factors $\alpha = \beta$ = 10%; CA weight $\omega_{\mathrm{CO_2}}$ = 0.7 |
Table 3. Optimization results of objective function items (Units: CNY).

| Items | S1 | S2 | S3 | S4 |
|---|---|---|---|---|
| Energy supply income $I_{\mathrm{sale}}$ | 156,633.56 | 157,974.54 | 158,005.38 | 157,905.99 |
| CHP cost $C_{\mathrm{CHP}}$ | 44,285.50 | 52,524.36 | 58,362.51 | 63,983.05 |
| Balancing market cost $C_{\mathrm{ele}}$ | 20,669.13 | 10,708.52 | 305.43 | −4,325.41 |
| PV cost $C_{\mathrm{PV}}$ | 14,259.25 | 15,739.25 | 17,581.75 | 18,696.00 |
| GEC item $C_{\mathrm{GEC}}$ | / | 1,449.74 | / | 2,044.58 |
| Carbon trading item $C_{\mathrm{CO_2}}$ | / | / | 28,657.37 | 30,553.59 |
| VPP revenue | 77,419.68 | 80,452.19 | 108,413.06 | 112,150.51 |
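As a quick cross-check of Table 3, the following minimal sketch computes how much VPP revenue grows in S2 to S4 relative to S1, using only the revenue row of the table. The variable and function names are illustrative and do not come from the paper, and the growth-rate definition (difference from S1 divided by the S1 value) is an assumption about how the comparison in Figure 10 is made.

```python
# VPP revenue per scenario, copied from the last row of Table 3 (CNY).
revenue = {"S1": 77419.68, "S2": 80452.19, "S3": 108413.06, "S4": 112150.51}


def growth_vs_baseline(values, baseline="S1"):
    """Percentage change of each scenario relative to the baseline.

    Assumed definition: 100 * (value - baseline value) / baseline value.
    """
    base = values[baseline]
    return {
        name: 100.0 * (value - base) / base
        for name, value in values.items()
        if name != baseline
    }


growth = growth_vs_baseline(revenue)
for scenario, pct in growth.items():
    print(f"{scenario}: {pct:+.2f}% vs S1")
```

Under this assumed definition, revenue grows by roughly 3.9% in S2, 40.0% in S3, and 44.9% in S4 compared with S1.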
Table 4. Comparison of convergence characteristics.

| Scenario | Convergence characteristic | MDP | GA |
|---|---|---|---|
| S1 | Value (CNY) | 77,419.68 | 75,901.65 |
| S1 | Time (s) | 21.64 | 15.52 |
| S2 | Value (CNY) | 80,452.19 | 76,621.13 |
| S2 | Time (s) | 26.92 | 16.12 |
| S3 | Value (CNY) | 108,413.06 | 96,797.37 |
| S3 | Time (s) | 27.51 | 15.91 |
| S4 | Value (CNY) | 112,150.51 | 97,522.18 |
| S4 | Time (s) | 29.19 | 16.22 |
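To make the trade-off in Table 4 easier to read, the sketch below computes, for each scenario, the relative gain in objective value of the MDP-based method over GA and the ratio of their solve times. All names are illustrative rather than taken from the paper, and the two summary metrics are assumptions chosen for this comparison, not measures defined by the authors.

```python
# (objective value in CNY, solve time in s) per scenario and method, from Table 4.
results = {
    "S1": {"MDP": (77419.68, 21.64), "GA": (75901.65, 15.52)},
    "S2": {"MDP": (80452.19, 26.92), "GA": (76621.13, 16.12)},
    "S3": {"MDP": (108413.06, 27.51), "GA": (96797.37, 15.91)},
    "S4": {"MDP": (112150.51, 29.19), "GA": (97522.18, 16.22)},
}


def compare(mdp, ga):
    """Return (value gain of MDP over GA in percent, MDP/GA solve-time ratio)."""
    mdp_value, mdp_time = mdp
    ga_value, ga_time = ga
    value_gain_pct = 100.0 * (mdp_value - ga_value) / ga_value
    time_ratio = mdp_time / ga_time
    return value_gain_pct, time_ratio


summary = {name: compare(r["MDP"], r["GA"]) for name, r in results.items()}
for name, (gain, ratio) in summary.items():
    print(f"{name}: value gain {gain:.2f}%, time ratio {ratio:.2f}x")
```

On these figures, the MDP-based method reaches an objective value about 2% to 15% higher than GA, at roughly 1.4 to 1.8 times the solve time.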
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xu, Y.; Liao, Y.; Kuang, S.; Ma, J.; Wen, T. Virtual Power Plant Optimization Process Under the Electricity–Carbon–Certificate Multi-Market: A Case Study in Southern China. Processes 2025, 13, 2148. https://doi.org/10.3390/pr13072148