1. Introduction
Globally, achieving the United Nations Sustainable Development Goals (SDGs), particularly those on affordable clean energy and climate action, requires a fundamental low-carbon transformation of the existing energy system. Against this backdrop, nations worldwide have formulated low-carbon development strategies and accelerated the pace of energy transformation [1,2]. China is promoting its low-carbon development strategy by vigorously developing renewable energy and establishing a national carbon emissions trading market. Similarly, EU countries, exemplified by Germany’s energy transition, are actively phasing out fossil fuels and systematically promoting energy efficiency improvements and grid modernization through the EU Green Deal.
To achieve national emission reduction targets, the low-carbon transformation of the power system has become a key path. For example, references [3,4,5] provide detailed analyses of the policies and practices for promoting energy structure transformation under the “dual carbon” goal through large-scale grid-connected renewable energy and the establishment of a national carbon trading market. Demand response [6], as an important means of demand-side management, can not only improve the efficiency of power grid operation but also lower the carbon intensity of power consumption by guiding users to change their electricity consumption behavior. However, existing demand response mechanisms [7,8,9] mainly focus on power grid security and economy. For example, reference [9] minimizes users’ electricity costs by responding to dynamic electricity prices, a typical economically oriented study. Similarly, reference [7] focuses on utilizing demand-side resources to smooth grid frequency fluctuations and ensure system security. None of these works adequately considers carbon emissions, so their environmental benefits are not maximized.
Traditional carbon emission assessment methods mostly use static carbon emission factors (CEFs), assuming a fixed amount of carbon emissions per unit of electricity generated. In their research on carbon efficiency in the transportation sector, refs. [10,11,12] employed an average carbon emission factor to uniformly account for indirect emissions from electricity consumption. Although this approach is simple and easy to implement, it ignores the dynamic nature of real-time carbon intensity caused by changes in grid scheduling, so the assessment results can deviate from actual conditions. Some studies [13,14,15,16] have tried to improve the assessment accuracy by using the average carbon emission factor (ACEF) or the marginal carbon emission factor (MCEF). Dixit et al. [13] proposed an ACEF calculation method based on the power supply structure, and Li et al. [15] constructed a carbon emission assessment model that considers marginal units. However, each of these methods has its own limitations. Although the assessment based on ACEF is stable, it cannot reflect the marginal impact of load increases and decreases on system carbon emissions; although the assessment based on MCEF can reflect the marginal effect, it may overestimate the emission reduction potential of demand response because of local fluctuations. To overcome these limitations, this paper proposes a dynamic CEF (DCEF) assessment method, which establishes a dynamic mapping between load change and carbon emissions by combining MCEF and ACEF.
Accurate prediction technology [17,18,19] is the basis for achieving low-carbon demand response. Existing research has made progress in electricity consumption forecasting. Reference [20] uses an LSTM network for short-term load forecasting. Wan et al. [21] propose a load forecasting method based on a hybrid CNN-LSTM model, which combines the strength of CNNs in extracting deep data features with the ability of LSTM to process time series information, thereby improving the accuracy of short-term load forecasting. However, in the field of low-carbon demand response, there is a relative lack of research on the coordinated forecasting of CEF and power consumption. Existing methods mainly feed historical data directly into the forecasting model [20,21]. Although deep learning models such as LSTM can capture time series characteristics to a certain extent, they struggle to fully reflect the pattern differences between carbon emissions and electricity data in different periods. For example, factors such as working days and rest days, as well as seasonal changes, cause carbon emissions and power consumption behaviors to present different characteristic patterns. Although traditional clustering methods (such as K-means) can separate these patterns, they require the number of clusters to be specified in advance and adapt poorly to dynamic changes in carbon emission characteristics [22,23,24]. For example, the clustering algorithm used in [22] for hierarchical load forecasting requires the number of clusters to be set manually in advance. This approach adapts poorly to dynamic systems in which previously unseen patterns may appear, and it cannot guarantee the optimality of the clustering results. To improve the prediction accuracy, a method is needed that adaptively identifies the carbon emission–power consumption patterns and feeds the extracted pattern features into the prediction model together with historical data, enabling more accurate prediction of carbon emission factors and loads. Such joint optimization of feature extraction and prediction remains underexplored in the field of low-carbon demand response.
Existing demand response optimization studies mainly focus on economy and grid security [25,26,27,28]. Reference [25] uses demand response to enhance the security of wind power systems, while [27] designs an economic demand response strategy for data centers to optimize electricity procurement costs. Although some scholars [29,30,31] have begun to pay attention to the environmental benefits of demand response, current research still faces multiple challenges. For example, studies [29,30,31] introduce carbon trading mechanisms as part of system operating costs, but carbon emissions are still treated as an economic constraint or secondary objective in the model rather than as the core driver of optimization. First, constructing a demand response optimization framework centered on carbon emission reduction is difficult. The literature [32,33] tends to treat carbon emissions only as a secondary objective or auxiliary constraint and lacks a systematic approach that puts them at the center of decision-making. Second, accurately quantifying and maximizing the potential for carbon emission reduction under a fixed daily electricity consumption constraint is a complex problem. More importantly, effectively balancing system security and user experience while pursuing low-carbon goals remains an unresolved challenge.
In summary, this paper proposes a low-carbon demand response resource optimization method based on DCEFs and the DPMM-LSTM algorithm. The main contributions are as follows:
(1) A DCEF calculation method is proposed by integrating MCEF and ACEF, which overcomes the limitations of traditional static CEFs in reflecting real-time operating status. The method avoids the tendency of MCEF alone to amplify the impact of local fluctuations, and it also remedies the inability of ACEF alone to accurately characterize the marginal effect of load changes.
(2) A prediction framework based on DPMM clustering and a dual LSTM network is designed. DPMM realizes adaptive clustering of carbon emission and power consumption patterns. The clustering results and historical data are input into the dual LSTM network architecture to realize the coordinated prediction of power consumption and CEF.
(3) A low-carbon demand response optimization framework with carbon emission indicators at its core is constructed. The framework achieves a multi-objective balance between carbon emission reduction, system security, and user comfort through the coordinated configuration of rigid and flexible constraints.
Next, we give a description of the symbols used in the article. is carbon emission intensity of fuel f. is carbon emissions of power plant p. is total power consumption of the system at time t. is the amount of power generated by fuel type f at time t. is the total carbon emissions of the system at time t. P represents the set of power plants . is the power generation at power plant p for fuel f. is the capacity utilization coefficient for power plant p at time t. is original predicted load at time t. represents the increase in power consumption at time t. represents the decrease in power consumption at time t.
The paper is structured as follows: Section 2 presents the calculation method of DCEF. Section 3 introduces the DPMM-LSTM algorithm. Section 4 studies the optimization method to maximize carbon reduction. Section 5 presents a case analysis of carbon emission scheduling optimization in a certain region. Finally, Section 6 concludes the paper.
The framework of this article is shown in Figure 1. It mainly includes five parts: data collection and processing, DCEF assessment, load characteristic clustering, collaborative prediction, and low-carbon target optimization scheduling.
2. Dynamic Carbon Emission Factor Calculation
This section introduces the DCEF assessment module shown in Figure 1. This module is the foundation of the entire low-carbon optimization framework. Its core task is to construct a DCEF that accurately reflects the real-time carbon intensity of the power grid by integrating MCEF and ACEF.
The power system’s carbon emission characteristics are highly dynamic and complex, making traditional static calculation methods inadequate for capturing real-time emission variations. When determining the DCEF, neither the MCEF nor the ACEF alone suffices: the MCEF reflects marginal effects but can amplify local fluctuations, whereas the ACEF is stable but cannot capture the marginal impact of load changes.
To address this, we propose a weighted fusion of MCEF and ACEF for DCEF calculation. This hybrid approach preserves baseline emission levels while accurately characterizing load-induced marginal effects.
2.1. Marginal Carbon Emission Factors
We assume that the carbon emissions of a power plant are determined by the carbon emission intensity of the fuel and the power generation efficiency. The unit carbon emissions of power plant $p$ using a specific fuel $f$ are given by the following formula:
$$e_{p,f} = \frac{c_f}{\eta_{p,f}} \tag{1}$$
where $c_f$ is the carbon emission intensity of fuel $f$ and $\eta_{p,f}$ is the efficiency of power generation at power plant $p$ for fuel $f$. Formula (1) reflects the energy conversion relationship in the power generation process: the higher the efficiency, the lower the carbon emissions per unit of electricity generated.
To accurately assess the actual impact of changes in power demand on system carbon emissions, it is necessary to introduce the MCEF. The MCEF represents the additional carbon emissions caused by a one-unit increase in power demand. Considering the power system at time $t$, the MCEF is defined as
$$\mathrm{MCEF}_t = \frac{\Delta C_t}{\Delta E_t}$$
where $\Delta C_t$ is the incremental carbon emission of the marginal system at time $t$ and $\Delta E_t$ is the incremental power demand of the marginal system at time $t$.
According to Formula (
1), the MCEF can be further expressed as
where
is power transmission efficiency,
, and
is a binary variable indicating whether the generator unit participates in marginal power generation.
can be written as
where
is the residual power load at time
t and
is the installed capacity of the generator set.
The MCEF reflects the change in carbon emissions caused by the load increment. In the power system, the marginal units in different periods are often different, which results in the dynamic change in the MCEF over time.
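The marginal-unit idea can be illustrated with a minimal Python sketch. It is not the paper's exact formulation: it assumes the units are already listed in dispatch order, that the marginal unit is the first one whose capacity is not exhausted by the residual load, and that dividing by the transmission efficiency converts generation-side emissions into emissions per unit of delivered electricity. The `Unit` class, the field names, and the default efficiency of 0.95 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    capacity: float        # installed capacity (e.g., MW)
    fuel_intensity: float  # carbon emission intensity of the fuel
    efficiency: float      # generation efficiency in (0, 1]


def unit_emission(unit: Unit) -> float:
    """Emissions per unit of electricity generated: fuel intensity / efficiency."""
    return unit.fuel_intensity / unit.efficiency


def marginal_cef(units, load, transmission_eff=0.95):
    """MCEF for a user-side `load`, with `units` given in dispatch order (assumed)."""
    residual = load / transmission_eff  # generation needed to serve the load
    for unit in units:
        if residual < unit.capacity:
            # This unit still has headroom, so it serves the next increment.
            return unit_emission(unit) / transmission_eff
        residual -= unit.capacity       # unit is fully loaded; move to the next
    raise ValueError("load exceeds total installed capacity")


if __name__ == "__main__":
    fleet = [
        Unit("wind", 100.0, 0.0, 1.0),
        Unit("gas", 100.0, 0.2, 0.5),
        Unit("coal", 100.0, 0.36, 0.4),
    ]
    for load in (50.0, 150.0, 250.0):
        print(load, marginal_cef(fleet, load, transmission_eff=1.0))
```

Because the marginal unit changes as the load crosses each capacity boundary, the returned factor is a step function of the load, which is one way to see why the MCEF varies over time.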
2.2. Average Carbon Emission Factors
Unlike the MCEF, which reflects the carbon emission intensity of the incremental load of the power system, the ACEF reflects the overall carbon emission level of the power system over a given period.
The ACEF is defined as
$$\mathrm{ACEF}_t = \frac{C_t}{E_t}$$
where $C_t$ is the total carbon emissions of the system at time $t$ and $E_t$ is the total power consumption of the system at time $t$.
For the
pth power plant, its carbon emissions
can be expressed as
where
is the carbon emission intensity of power plant
p and
is its installed capacity.
is the capacity utilization coefficient, which can be expressed as
where
is a piecewise function used to determine the capacity utilization of a power plant. When
, it means that when the cumulative installed capacity is less than the reserved capacity, the power plant operates at full load. When
, it means that when the previous cumulative installed capacity has reached or exceeded the reserved capacity, the power plant stops operating. In other cases, it indicates partial load operation, which is operated according to the ratio of remaining demand to installed capacity.
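We sum the emissions of all power plants in the system to obtain the total carbon emissions of the system:
$$C_t = \sum_{p \in P} C_{p,t}$$
where $C_{p,t}$ is the carbon emissions of power plant $p$ at time $t$ and $P$ is the set of power plants.
Next, we need to calculate the actual power consumption $E_t$ of the system. The total power generated by the power generation side can be expressed as
$$E_t^{\mathrm{gen}} = \sum_{f \in F} E_{f,t}$$
where $E_{f,t}$ is the amount of power generated by each fuel type $f$ at time $t$ and $F$ is the set of all fuel types for power generation.
Taking into account the loss during transmission, the amount of power reaching the user side is
$$E_t = \eta \, E_t^{\mathrm{gen}}$$
where $\eta$ is the power transmission efficiency.
According to Formula (4), one can get
$$\mathrm{ACEF}_t = \frac{\sum_{p \in P} C_{p,t}}{\eta \sum_{f \in F} E_{f,t}}$$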
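The ACEF computation described in this subsection can be sketched as follows. This is an illustrative reading, not the authors' implementation: units are assumed to be loaded in a fixed dispatch order, the utilization rule follows the three cases described above (full load, stopped, partial load), and the function names and the default transmission efficiency are hypothetical.

```python
def capacity_utilization(capacities, required_generation):
    """Piecewise utilization coefficient for units listed in dispatch order."""
    utilization = []
    dispatched = 0.0  # cumulative capacity of the units ahead in the order
    for cap in capacities:
        remaining = required_generation - dispatched
        if remaining >= cap:
            utilization.append(1.0)              # full-load operation
        elif remaining <= 0.0:
            utilization.append(0.0)              # unit is not needed
        else:
            utilization.append(remaining / cap)  # partial-load operation
        dispatched += cap
    return utilization


def average_cef(capacities, intensities, required_generation, transmission_eff=0.95):
    """ACEF = total emissions / electricity delivered to the user side."""
    u = capacity_utilization(capacities, required_generation)
    generation = [cap * ui for cap, ui in zip(capacities, u)]
    emissions = sum(e * g for e, g in zip(intensities, generation))
    delivered = transmission_eff * sum(generation)
    if delivered <= 0.0:
        raise ValueError("no electricity delivered; ACEF is undefined")
    return emissions / delivered


if __name__ == "__main__":
    caps = [100.0, 100.0, 100.0]
    intens = [0.0, 0.4, 0.9]  # emissions per unit generated, per plant
    print(capacity_utilization(caps, 150.0))
    print(average_cef(caps, intens, 150.0, transmission_eff=1.0))
```

In this toy fleet the ACEF stays well below the intensity of the partially loaded unit because the zero-carbon unit dominates the generation mix, which illustrates why the ACEF responds slowly to changes at the margin.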
2.3. Dynamic Carbon Emission Factors
By introducing the weight coefficient
, MCEF and ACEF are weighted to calculate DCEF
where
. The calculation formula for the time-varying weight coefficient
is
where
indicates the load change rate at the current instants.
is the maximum load change observed in history.
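A minimal sketch of the weighted fusion is given below. Two points are assumptions rather than statements from the text: the weight is taken as the ratio of the absolute load change to the historical maximum (clipped to 1), and the weight is applied to the MCEF, with the complement applied to the ACEF, so that large load changes emphasize the marginal factor. The function names are hypothetical.

```python
def dynamic_weight(load_change, max_load_change):
    """Time-varying weight in [0, 1]; assumed form |dL_t| / dL_max, clipped at 1."""
    if max_load_change <= 0.0:
        raise ValueError("max_load_change must be positive")
    return min(abs(load_change) / max_load_change, 1.0)


def dynamic_cef(mcef, acef, load_change, max_load_change):
    """DCEF as a convex combination of MCEF and ACEF (weight placement assumed)."""
    w = dynamic_weight(load_change, max_load_change)
    return w * mcef + (1.0 - w) * acef


if __name__ == "__main__":
    # Flat load: DCEF equals ACEF. Largest historical swing: DCEF equals MCEF.
    for change in (0.0, 50.0, 100.0):
        print(change, dynamic_cef(0.8, 0.4, change, 100.0))
```

Because the result is a convex combination, the DCEF always lies between the ACEF and the MCEF, which bounds the influence of a noisy marginal signal.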
Remark 1. The proposed DCEF is derived from a weighted MCEF-ACEF fusion and is positioned as a more robust evaluation method than traditional static methods. Traditional static CEFs [10,11,12] cannot track real-time operational dynamics or marginal generation impacts. The MCEF methods [15] are sensitive to system noise and tend to overestimate the impact of demand response. The ACEF methods [13], although stable, are slow to respond to the key marginal effects that guide real-time optimization. Our DCEF model resolves this trade-off through a dynamic weight coefficient. The carbon signal it generates responds sensitively to marginal changes while remaining resilient to noise, thus providing a more accurate and robust basis for designing effective low-carbon demand response strategies.
3. Power Consumption and DCEF Clustering and Prediction Model
This section corresponds to the DPMM load characteristic clustering module and the dual DPMM-LSTM prediction module in Figure 1. These modules first adaptively identify latent patterns in electricity consumption and carbon emission data through a data-driven approach. Then, based on these pattern characteristics, they jointly predict electricity consumption and DCEF for the next 24 h, providing key input for subsequent optimized scheduling.
3.1. Dirichlet Process Mixture Model Principle
The DPMM [34] is a non-parametric Bayesian clustering approach that automatically infers the number of clusters. It models data as a mixture of an unknown number of distributions, with each observation generated from a distribution whose parameters are governed by a Dirichlet process.
We assume that the observed data point $x_i$ is generated from a distribution with parameters $\theta_i$:
$$x_i \mid \theta_i \sim F(\theta_i), \qquad \theta_i \mid G \sim G$$
where $F(\cdot)$ represents the probability density function. The random distribution $G$ follows a Dirichlet process
$$G \sim \mathrm{DP}(\alpha, G_0)$$
where $\alpha$ is the concentration parameter, which controls the probability of generating new categories, and $G_0$ is the reference distribution, which determines the parameter distribution of the newly generated categories.
To better understand the data generation process, the latent variable $z_i$ is introduced to represent the category to which each data point belongs. The model can be rewritten as
$$z_i \mid \pi \sim \mathrm{Mult}(\pi), \qquad x_i \mid z_i = k \sim F(\theta_k)$$
where $\mathrm{Mult}(\cdot)$ is a discrete probability distribution used to describe the probability of selecting among multiple categories and $\pi$ represents the probability of selecting each category.
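When classifying a new data point, we use Bayes’ theorem to calculate the posterior probability, which is proportional to the product of the prior probability and the likelihood function:
$$p(z_i = k \mid z_{-i}, x_i) \propto p(z_i = k \mid z_{-i}) \, p(x_i \mid \theta_k)$$
where $z_{-i}$ denotes the category assignments of all other data points. Next, we can obtain the specific probability of a new data point being assigned to each category:
$$p(z_i = k \mid z_{-i}, x_i) \propto \begin{cases} \dfrac{n_k}{n - 1 + \alpha} \, F(x_i \mid \theta_k), & \text{existing category } k \\[2ex] \dfrac{\alpha}{n - 1 + \alpha} \displaystyle\int F(x_i \mid \theta) \, \mathrm{d}G_0(\theta), & \text{new category} \end{cases}$$
where $n_k$ is the number of existing data points in category $k$ and $n$ is the total number of data points. This result shows that a data point may either join an existing category or create a new one, depending on the characteristics of the data itself and the existing category structure.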
To implement this category assignment mechanism, we must compute each data point’s probability of belonging to each category. This requires defining an observation model describing the data distribution per category. Given that real-world data often follow approximately normal distributions, and for computational convenience, we use a Gaussian distribution as the observation model:
$$x_i \mid z_i = k \sim \mathcal{N}(\mu_k, \Sigma_k)$$
where $\mu_k$ and $\Sigma_k$ are the mean vector and the covariance matrix of the $k$th category, and $\mathcal{N}(\cdot)$ represents a multi-dimensional Gaussian distribution. The corresponding conjugate prior distribution is
$$\mu_k \mid \Sigma_k \sim \mathcal{N}\!\left(\mu_0, \frac{\Sigma_k}{\kappa_0}\right), \qquad \Sigma_k \sim \mathcal{IW}(\nu_0, \Lambda_0)$$
where $\mathcal{IW}(\cdot)$ is the inverse Wishart distribution. The hyperparameters $\mu_0$, $\kappa_0$, $\nu_0$, and $\Lambda_0$ need to be set before the algorithm starts, and their choice affects the behavior of the model.
For a new data point, its predicted distribution can be expressed as
where
is the estimated parameter of the
kth existing category.
DPMM not only considers existing categories but also reserves the possibility of generating new categories, which enables the model to adjust the number of categories as the data increase naturally.
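The category assignment rule can be illustrated with a small Python sketch. It is deliberately simplified and should be read as an assumption-laden toy, not the model used in the paper: data are one-dimensional, the observation variance is known and shared, each existing category is summarized by a plug-in mean rather than a full posterior, and the base distribution is a Gaussian prior on the mean instead of the normal-inverse-Wishart prior described above. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def assignment_probabilities(x, clusters, alpha, obs_var, prior_mean, prior_var):
    """Posterior probabilities of joining each existing cluster or a new one.

    `clusters` is a list of (count, mean) pairs. The returned list has one
    entry per existing cluster plus a final entry for a new cluster.
    """
    # Existing clusters: weight proportional to size times likelihood.
    weights = [n_k * normal_pdf(x, mean_k, obs_var) for n_k, mean_k in clusters]
    # New cluster: alpha times the prior predictive density, which for a
    # Gaussian prior on the mean is N(prior_mean, prior_var + obs_var).
    weights.append(alpha * normal_pdf(x, prior_mean, prior_var + obs_var))
    total = sum(weights)  # the common factor 1 / (n - 1 + alpha) cancels here
    return [w / total for w in weights]


if __name__ == "__main__":
    existing = [(10, 0.0), (10, 5.0)]
    for point in (0.1, 30.0):
        probs = assignment_probabilities(point, existing, alpha=1.0, obs_var=1.0,
                                         prior_mean=0.0, prior_var=100.0)
        print(point, [round(p, 4) for p in probs])
```

A point close to an existing cluster mean is absorbed by that cluster, while a point far from every cluster receives most of its probability mass on the new-cluster option, which is how the number of categories can grow with the data.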
3.2. LSTM Network Principle
LSTM [20] is a specialized recurrent neural network featuring a cell state that propagates information throughout the network via gating mechanisms. Its architecture comprises three gates (forget, input, output) and a memory unit.
The forget gate determines the extent to which the information in the cell state at the previous moment needs to be retained or discarded. Its expression is as follows:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
where $\sigma$ represents the sigmoid activation function, $W_f$ is the weight matrix, $h_{t-1}$ is the hidden state of the previous moment, $x_t$ is the current input, and $b_f$ is the bias term. The output of the forget gate is a value between 0 and 1, where 1 means completely retained and 0 means completely discarded.
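The input gate controls the extent to which new information enters the cell state and consists of two parts, the input gate vector $i_t$ and the candidate memory cell $\tilde{C}_t$:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \qquad \tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
where $W_i$, $b_i$, $W_C$, and $b_C$ are the weight matrices and biases of the input gate and candidate memory, respectively, and $\tanh$ is the hyperbolic tangent activation function, which limits the output to $(-1, 1)$.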
The new cell state is updated as follows:
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
where $C_t$ and $C_{t-1}$ are the cell state at the current time step and the cell state at the previous time step, respectively, and $\odot$ represents element-wise multiplication.
The output gate regulates the extent to which the cell state information is propagated to the hidden state, as expressed by the following equations:
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(C_t)$$
where $o_t$ is the activation value of the output gate, $h_t$ is the hidden state of the current time step, and $W_o$ and $b_o$ are the weight matrix and bias of the output gate, respectively.
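The gate computations described in this subsection can be traced with a single forward step written in plain Python. This is a didactic sketch of the standard LSTM cell, not the network trained in the paper; the parameter dictionary layout (`W_f`, `b_f`, and so on) and the use of nested lists instead of a tensor library are choices made here for self-containment.

```python
import math


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0.0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def affine(W, b, z):
    """Compute W z + b for a weight matrix given as a list of rows."""
    return [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]        # forget gate
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]        # input gate
    c_hat = [math.tanh(v) for v in affine(params["W_c"], params["b_c"], z)]  # candidate
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]        # output gate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_hat)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


if __name__ == "__main__":
    # Hidden size 1, input size 1, all parameters zero: every gate equals 0.5.
    zero = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
    zero.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
    print(lstm_step([1.0], [0.0], [1.0], zero))
```

With all parameters set to zero, each gate outputs 0.5 and the candidate is 0, so the cell state is simply halved; this makes the role of the forget gate easy to verify by hand.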
3.3. Clustering and Prediction Model Design
We assume that the calculated hourly DCEF data set is and the power consumption data , where n is the number of samples.
This paper uses the DPMM model for cluster analysis, which can be expressed as
where
G is the random probability measure,
represents the Dirichlet process,
is the concentration parameter, and
is the benchmark measure.
represents the parameter of the
ith observation data, and
is the likelihood function of the observation data.
For DCEF data, the generation process can be described as
where
is the mixing weight satisfying
.
represents Gaussian distribution,
and
are the mean and covariance matrix of the
ith component, respectively.
Similarly, the generation process of power consumption data can be expressed as
where
is the mixing weight.
and
are the mean and covariance matrix of the
jth component, respectively.
The posterior distribution of the DPMM model can be inferred by Gibbs sampling
where
represents all parameters except
,
is the likelihood function, and
is the Dirac function.
Through this model, we can obtain the clustering results of DCEF data and power consumption data , where k and m are the optimal numbers of clusters determined adaptively.
When constructing the input features of the LSTM prediction model, we first need to determine the feature category of the historical data. To do so, the distance between each historical data point and each feature class center obtained by DPMM clustering is computed, and the point is assigned to the category of the nearest center.
For DCEF data, assuming that DPMM clustering obtains
k feature class centers
, the distance between the carbon emission factor data at each time
t and the
ith class center can be expressed as
where
is the CEF data at time
t. For the DCEF data at time
t, the characteristic category to which it belongs can be determined by the following method:
Similarly, for power consumption data, assuming that DPMM clustering obtains
m feature class centers
, the distance between the power consumption data at each time
t and the
jth class center is
The corresponding feature category is determined as
Through the above calculation, we can determine the feature category of the historical data at each moment. When constructing the input of the LSTM prediction model, this feature category information is fused with the original data. For the LSTM model for DCEF prediction, the input feature vector can be expressed as
Similarly, the input feature vector of the LSTM model for power consumption prediction is
We employ a three-layer LSTM architecture with 128, 64, and 32 units per layer for hierarchical feature extraction, fusion, and abstraction. Input data are standardized and preprocessed, with training samples generated via sliding time windows to enhance model performance.
Through the above model, the predicted DCEF and power consumption data are obtained, which serve as inputs to the subsequent low-carbon scheduling.
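The feature construction described in this subsection (nearest-center category assignment, standardization, and sliding windows) can be sketched as follows. The sketch rests on several assumptions not fixed by the text: data are scalar, the distance is the absolute difference, standardization uses the mean and population standard deviation of the whole series, and the category index is appended as a raw integer feature rather than a one-hot vector. The function names and the toy data are hypothetical.

```python
def nearest_center(value, centers):
    """Index of the closest cluster center (absolute distance for scalar data)."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def standardize(series):
    """Zero-mean, unit-variance scaling; fails on a constant series."""
    mean = sum(series) / len(series)
    std = (sum((v - mean) ** 2 for v in series) / len(series)) ** 0.5
    return [(v - mean) / std for v in series]


def make_windows(series, centers, window):
    """Build (X, y) training samples with the category fused into each step.

    X is a list of [scaled_value, category] pairs covering `window` steps,
    and y is the scaled value that immediately follows the window.
    """
    categories = [nearest_center(v, centers) for v in series]  # on raw values
    scaled = standardize(series)
    samples = []
    for start in range(len(series) - window):
        steps = range(start, start + window)
        X = [[scaled[t], categories[t]] for t in steps]
        y = scaled[start + window]
        samples.append((X, y))
    return samples


if __name__ == "__main__":
    dcef = [0.10, 0.20, 0.90, 1.00, 0.15, 0.95]  # toy hourly DCEF values
    centers = [0.15, 0.95]                        # toy cluster centers
    for X, y in make_windows(dcef, centers, window=3):
        print(X, y)
```

The same routine could be applied separately to the power consumption series with its own centers, giving one sample set per LSTM in the dual architecture.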
Remark 2. By combining the original data with the feature category information to which it belongs, the LSTM model [20,21] can simultaneously learn the temporal change characteristics and clustering characteristics of the data. Feature category information, as a high-level semantic feature, helps the model capture the periodic patterns and typical change laws of the data. For example, the carbon emission factor in certain periods may belong to the low-carbon feature class, while the power consumption may belong to the high-load feature class. This feature combination information is significant for the prediction model in understanding the system operation status.
Remark 3. Our framework combines adaptive DPMM clustering with a dual LSTM architecture to overcome limitations in low-carbon demand forecasting. Unlike standard LSTM models [20,21], which fail to distinguish operational patterns (weekdays/holidays, seasons), or traditional clustering [22,23,24], which requires preset cluster numbers, our DPMM automatically identifies the operational modes. These pattern features, combined with historical data, feed into a dual LSTM network for coordinated prediction of both key variables.