1. Introduction
Realizing rural modernization is the primary goal of comprehensively promoting the strategy of rural revitalization. In this process, promoting the decoupling of agricultural energy consumption from agricultural economic growth while ensuring agricultural economic growth and energy security is crucial for China to achieve its carbon peak and carbon neutrality goals in the agricultural sector [
1]. However, China’s current energy structure poses a serious challenge to achieving this goal. In the past thirty years, China’s energy consumption structure has been dominated by coal, which is closely related to the increasing carbon dioxide emissions in China [
2]. This high carbon energy structure is particularly evident in the agricultural sector, where carbon emissions from agriculture and rural areas are one of the main sources of greenhouse gas emissions in China, and have had a profound impact on the implementation of the rural revitalization strategy [
3]. The unprecedented socio-economic growth of rural parks in China has led to a significant increase in energy consumption and carbon emissions, posing a serious threat to the environment and the health of rural residents [
4]. Therefore, the importance of rural energy systems in energy transition is particularly prominent [
5]. It should be noted that compared with urban energy structures, rural energy supply systems exhibit significant structural shortcomings. Meanwhile, rural areas possess abundant local energy resources, including biomass, wind, and solar. Yet, the existing energy supply systems fail to effectively utilize these resources to meet the full range of rural energy demands. Furthermore, the large-scale deployment of user-side energy storage and the evolution of diversified demands in rural integrated energy systems present a new challenge. The key question now is: how can we achieve a coordinated improvement in both energy utilization efficiency and user economy while ensuring the safe and stable operation of the rural distribution network?
Against this backdrop, it is crucial to forecast the multi-coupled loads of rural integrated energy systems and to coordinate their dispatch in a reasonable, near-optimal way. Appropriate feature engineering can help models mine the intrinsic information within data, thereby enhancing predictive performance [
6]; this approach has been extensively studied. Novel composite LSTM models are designed to extract temporal patterns from time-series data, achieving outstanding forecasting capabilities [
6]. For instance, to handle noisy and non-stationary data, a hybrid model employing Modified Complete Ensemble Empirical Mode Decomposition with Adaptive Noise, Shannon Entropy, and Long Short-Term Memory techniques was proposed and validated on real load data [
7]. Another approach introduced a Two-Layer Joint Modal Decomposition method to break down nonlinear and non-stationary multi-energy loads into several Intrinsic Mode Functions, and utilized the Uniform Information Coefficient to select strongly correlated calendar, meteorological, and coupling features, which enhanced the model’s effectiveness and accuracy [
8]. Similarly, Complete Ensemble Empirical Mode Decomposition with Adaptive Noise was applied in [
9] to decompose load data into various frequency components, enabling the model to capture both short-term and long-term patterns. Further advancing this concept, a multi-component fusion LSTM Ridge Regression ensemble model was designed, which utilized the Seasonal-Trend decomposition using the Loess algorithm to decompose the trend and seasonal components of thermal load, thereby improving its predictive accuracy [
10]. After addressing feature-related challenges, model construction remains a critical factor. Modeling methodologies encompass time-series statistical methods, deep learning approaches, and machine learning techniques.
Common methods mainly rely on single-model structures such as time series models [
11,
12] and multiple linear regression [
13]. Notably, machine learning and deep learning can provide more powerful predictive capabilities [
14]. In recent years, an increasing number of scholars have applied machine learning to load forecasting. Several robust models have demonstrated excellent performance in this domain, such as Long Short-Term Memory networks [
15], and Recurrent Neural Networks [
16]. Composite models and Graph Neural Network models are also commonly employed load forecasting methods. To boost accuracy, a model was proposed that integrates optimal IMF decomposition, BiLSTM, an attention mechanism, and a DNN, with hyperparameters tuned via Bayesian Optimization [
17]. To improve generalization, a novel framework featuring a multi-channel parallel LSTM-BiLSTM sub-network and a Split Convolution module was introduced [
18]. To handle nonlinearity and non-stationarity, a Dynamic Attention-enhanced RVFL-LSTM was developed [
19]. To capture global periodic trends, a hybrid attention scheme with Multi-Feature Attention and Context-Awareness was designed [
20]. Additionally, the integration of CNNs with LSTMs has been successfully applied to enhance learning performance [
21].
Given that no single model has yet achieved optimal results for all engineering problems [
22], ensemble learning has emerged as a methodology that integrates multiple models to address this limitation. Commonly used ensemble learning techniques include Bagging, Boosting, and Stacking. The effectiveness of these methods is well-documented. Bagging has been applied to reintegrate predictions for improved accuracy and generalization [
23], while Stacked Generalization has garnered significant attention. Successful applications of Stacking include the powerful LSTM-XGBoost fusion framework [
6], a specialized XGBoost-RF ensemble for short-term load forecasting [
24], and an algorithm for net load interval prediction [
25]. Therefore, to overcome the limitations of single models and homogeneous ensembles when dealing with complex load characteristics, such as high dimensionality, strong nonlinearity, multi-periodicity, and susceptibility to noise, this paper adopts the Stacking ensemble method. Stacking utilizes heterogeneous base learners to capture different patterns in data, thereby improving prediction accuracy and model robustness, making it particularly suitable for the challenging environment of rural integrated energy systems.
Building upon accurate load forecasting, the next critical step is to utilize these predictions for optimizing energy dispatch and planning within the rural grid. Currently, extensive research has been conducted on energy planning for rural distribution networks in China. For example, one such model introduces an Integrated Energy Station (IES) that takes renewable sources like biogas, wind, and photovoltaic power as input, demonstrating improvements in energy efficiency while alleviating the operational load of distribution networks and narrowing the peak–valley difference [
26]. In the collaborative planning process of rural distribution networks and energy storage, the Interruptible Load (IL) scheme represents a crucial form with excellent Demand Response (DR) and peak shaving capabilities. Various modeling and application frameworks for IL have been developed. For example, a load-based interruption model was presented in [
27], where the probability of server interruption is expressed as an exponential function of the total computational load. This model utilizes Multi-Agent Advantage Actor Critic—a simple yet efficient method that enables decentralized decision-making while handling large action spaces, effectively reducing the average total service delay. In the context of microgrid scheduling, a two-layer active day-ahead scheduling model for TMCM was introduced in [
28], which adopts a Basic Interruptible Plan (BIP) to ensure a successful and cost-effective approach for managing outage conditions, serving as an emergency demand response resource. Furthermore, a broader framework that systematically integrates switchable and interruptible load management was proposed to enhance overall system flexibility [
29]. Therefore, introducing an economic incentive mechanism with interruption compensation fees as the core can effectively guide users to voluntarily interrupt loads during system peak or high-cost periods, thereby achieving a win-win situation between system economic operation and user revenue improvement.
However, the aforementioned studies mostly treat IL as an independent, passive scheduling resource and lack a feedback mechanism for closed-loop collaborative optimization with load forecasting results. This limitation is particularly pronounced in rural distribution networks, where such forecast-scheduling collaborative feedback mechanisms remain virtually unexplored.
A pivotal challenge in bridging this gap lies in the accurate and dynamic identification of load-shedding thresholds from the highly volatile and non-stationary load sequences typical of agricultural parks. Conventional static threshold methods often fail to adapt to the intrinsic multi-timescale fluctuations and long-range correlations present in such complex time-series data. To address this core challenge, our work introduces an improved Multifractal Detrended Fluctuation Analysis (MF-DFA) algorithm—a powerful tool proven effective in extracting robust features and quantifying complexity from non-stationary signals in fields like biomedical engineering [
30,
31]—as the cornerstone for dynamic threshold determination.
To this end, this study proposes a novel integrated framework that combines advanced load forecasting with an IL incentive mechanism. The main contributions of this work are summarized in three key aspects:
- (1) We develop a robust short-term load forecasting model based on a heterogeneous Stacking ensemble framework. This approach systematically integrates diverse base learners, including tree-based models for capturing nonlinear patterns and support vector regression (SVR) for handling noisy data, to effectively represent the complex characteristics of agricultural loads. Furthermore, time-series features can reflect long-term load trends and encompass periodic patterns. Through multi-dimensional time-series feature engineering and Bayesian hyperparameter optimization, the model achieves significantly enhanced predictive accuracy.
- (2) An innovative IL incentive mechanism is established, incorporating a dynamically determined threshold. This threshold is obtained by applying an improved multifractal detrended fluctuation analysis (MF-DFA) combined with inflection point detection, which enables proactive identification and management of high-cost risk periods. The resulting high-fidelity dispatch optimization model is efficiently solved using the particle swarm optimization (PSO) algorithm, thereby forming a responsive closed-loop structure of “forecast-warning-dispatch”.
- (3) Finally, a comprehensive case study based on an agricultural park in North China demonstrates the framework’s efficacy in optimizing synergistic grid operations and improving overall economic performance.
The remainder of this paper is structured as follows. Section 2 elaborates on the development of the robust short-term load forecasting model based on the Stacking ensemble framework. Section 3 details the load dispatch optimization model, including the IL incentive mechanism and the MF-DFA-PSO collaborative scheduling framework. Section 4 presents a comprehensive case study to validate the efficacy of the proposed method. Finally, Section 5 concludes the paper and discusses future research directions.
2. Time-Series Cross-Validation and Bayesian Optimization Integrated Ensemble Learning Prediction Model
To address the complex characteristics of agricultural park load sequences—such as high dimensionality, strong nonlinearity, multi-periodicity, and noise susceptibility—a single forecasting model is often inadequate. This paper therefore proposes a short-term load forecasting framework based on a stacked ensemble architecture, whose overall workflow is illustrated in
Figure 1. The core logic follows a structured three-stage strategy:
Feature Enhancement: First, multi-dimensional time-series feature engineering (Section 2.1) is applied to expand the original load data, explicitly extracting trends, periodic patterns, and dependency relationships.
Model Integration: Second, a Stacking ensemble learning framework (Section 2.2) is built. Its core is a set of seven complementary base learners, organized into groups dedicated to core nonlinear modeling, noise robustness, and feature stability, which capture the intrinsic patterns of the data from different perspectives. The predictions of these base learners are used as meta-features and fed into the meta-learner (XGBoost) for the final prediction.
Collaborative Hyperparameter Optimization: Finally, the Bayesian optimization-based Optuna framework (Section 2.3) is employed to jointly tune the hyperparameters of both the base learners and the meta-learner, so that the integrated ensemble model approaches its best predictive performance.
Furthermore, the resulting load forecasting outputs are delivered to the dispatch optimization model described in
Section 3, where they serve as inputs to the objective function and operational constraints, thereby supporting IL decision-making and enhancing the overall system economic operation.
2.1. Time-Series Features
Driven by the recent development of rural distribution power grids, user-side energy storage in rural integrated energy systems has achieved large-scale deployment. In this context, it is essential to accurately capture the dynamic variation patterns of electrical load, enhance the model’s ability to understand and predict inherent data patterns, and endow it with multi-dimensional pattern recognition capabilities. Therefore, this paper employs a time-series modeling approach for rural load preprocessing and proposes the following feature engineering strategies:
(1) Temporal Feature Extraction: To explicitly capture seasonal and periodic variations in load data, structured temporal information is extracted to help the model identify regular patterns across different time dimensions. Specifically, periodic variations refer to short-term cyclical behaviors. For example, rural greenhouses require supplemental lighting in the morning to regulate crop growth, while cooling equipment operates during high-temperature periods in the afternoon. Seasonal variations, on the other hand, manifest as temperature-driven changes: heating equipment operates continuously in winter, and refrigeration systems are frequently activated in summer.
(2) Lag Feature Incorporation: By introducing time-shifted historical load values as input features, the model can explicitly capture temporal dependencies in electrical load. Using past load values as features allows the model to perceive how load evolves over time.
(3) Rolling Statistical Feature Construction: To capture local volatility and trend tendencies in the time series and enhance the model’s perception of local load dynamics, this study introduces rolling window statistics as key temporal features. A 24 h sliding window is adopted. Specifically, the mean load over the past 24 h helps smooth short-term fluctuations and highlight the underlying trend, while the maximum load over the same period aids in identifying recent peak values.
(4) Adding exponentially weighted features: The Exponentially Weighted Moving Average (EWMA) is calculated based on the heat load data of the past 24 h. By assigning a weight to each historical data point, where the weight decreases exponentially with the elapsed time of the data point, recent load data points are granted greater influence. This characteristic enables the model to focus more on recent trend changes. The calculation formula is as follows (the detailed variable explanations for the following formulas can be found in the Nomenclature):
$$\mathrm{EWMA}_t = \alpha\, x_t + (1 - \alpha)\, \mathrm{EWMA}_{t-1}$$
In the formula: $x_t$ represents the actual observed value at time $t$, and $\alpha$ is the smoothing factor. Data is sampled at hourly intervals. After fully considering the daily cycle characteristics of the agricultural park load, the smoothing factor $\alpha$ is set to 0.08. This value was chosen so that the primary influence window of the EWMA (derived from its half-life and spanning approximately 26 h) fully encompasses the 24 h operational cycle. This ensures the model’s memory covers at least one complete daily period while smoothly attenuating the influence of data from previous days (see Appendix A for the detailed derivation). This configuration allows the model to focus on recent load trends while maintaining reasonable attenuation of historical information, effectively balancing the needs of trend capture and noise suppression. A larger $\alpha$ value indicates that more weight is assigned to recent load data.
(5) Time Series Decomposition: The complex load time series is decomposed into three constituent components: trend (reflecting long-term directional changes), seasonality (representing periodic fluctuations), and residual (capturing random noise). The trend component reveals the overall evolution of the series over extended periods, the seasonality component exposes recurring periodic patterns, and the residual component contains stochastic variations after removing the systematic components. By incorporating these three components as independent features, the model can separately learn and characterize distinct aspects of the temporal structure. This explicit decomposition enables the model to more accurately identify and discriminate between different types of temporal patterns, thus providing a more comprehensive understanding of the underlying dynamics in load data and contributing to improved predictive performance.
The overall architecture of the proposed multi-dimensional time-series feature engineering strategy, encompassing all five aspects detailed above, is visually summarized in
Figure 2.
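The feature strategies above can be illustrated with a short pandas sketch. It is a minimal example under stated assumptions, not the authors' implementation: the function name `build_time_series_features`, the column names, and the lag choices (1 h, 24 h, 168 h) are hypothetical, and the decomposition step (5) is omitted for brevity. Shifting the series by one step before computing rolling and exponentially weighted statistics is also an assumption, made so that each feature at time t uses only values observed before t.

```python
import numpy as np
import pandas as pd


def build_time_series_features(load: pd.Series, alpha: float = 0.08) -> pd.DataFrame:
    """Build illustrative features from an hourly load series with a DatetimeIndex."""
    feats = pd.DataFrame(index=load.index)

    # (1) Temporal features: position within the day, week, and year.
    feats["hour"] = load.index.hour
    feats["dayofweek"] = load.index.dayofweek
    feats["month"] = load.index.month

    # (2) Lag features: time-shifted historical load values
    #     (the specific lags are assumptions for illustration).
    for lag in (1, 24, 168):
        feats[f"lag_{lag}"] = load.shift(lag)

    # Use only values observed strictly before time t for the statistics below.
    past = load.shift(1)

    # (3) Rolling statistics over a 24 h sliding window.
    feats["roll_mean_24"] = past.rolling(24).mean()
    feats["roll_max_24"] = past.rolling(24).max()

    # (4) EWMA in recursive form: ewma_t = alpha * x_t + (1 - alpha) * ewma_{t-1}.
    feats["ewma"] = past.ewm(alpha=alpha, adjust=False).mean()

    return feats


# Usage on a synthetic hourly series (values are placeholders, not real load data).
index = pd.date_range("2024-01-01", periods=200, freq="h")
load = pd.Series(np.arange(200, dtype=float), index=index)
features = build_time_series_features(load).dropna()
```

Rows with incomplete history are dropped in the usage line; in practice the longest lag determines how many initial samples are lost.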
2.2. Ensemble Learning Models and Their Principles
This study proposes a heterogeneous stacked ensemble learning framework to generate predictions for all key uncertain variables: electricity load, heat load, photovoltaic power generation, and wind turbine power generation. A multi-level model fusion strategy is used to improve the robustness and accuracy of the forecasts. The core prediction method is the same for all of these targets; the only difference lies in the input data used to train each specific prediction model:
For electricity and heat load forecasting, this model utilizes historical load data and multi-dimensional time series features. For photovoltaic and wind power generation prediction, the model is trained based on their respective historical power generation data and combined with relevant temporal features.
To avoid redundancy in the method description, we will focus on providing a detailed explanation of the load forecasting process in
Section 2. The detailed architecture of the framework is shown in
Figure 3 (the base learners used are highlighted in yellow frames, and the gray boxes represent the five major features processed in
Section 2.1).
Collaborative Design of Base Learners and Selection of Meta Learners
This study moves beyond the limitations of single-model approaches by systematically constructing a combination of heterogeneous base learners following the principle of “functional complementarity and collaborative enhancement.” This combination comprises three functionally specialized groups, designed to capture intrinsic load patterns from multiple perspectives and jointly improve the robustness and predictive accuracy of the ensemble model.
- (1) Core Nonlinear Modeling Group
This group integrates four decision tree-based ensemble algorithms. Their key shared advantage lies in the ability to directly model complex nonlinear relationships between load and multi-dimensional features—such as weather and temporal factors—without requiring intricate feature engineering. Although all belong to the tree-model family, their internal mechanisms and strengths are effectively complementary.
Random Forest (RF): RF adopts the Bagging ensemble strategy. Its core is a “double random mechanism”: Bootstrap sampling perturbs the training data, and the random subspace method selects the features considered by each tree, jointly constructing a “forest” composed of a large number of differentiated decision trees [
32]. The core advantage of this mechanism is that it greatly enhances the diversity and robustness of the model, effectively suppresses overfitting, and exhibits good tolerance for noise and local fluctuations in the data. The final prediction is the mean output of all decision trees, which smooths out the prediction variance of individual trees through “collective decision-making”, enabling it to reliably identify common and stable nonlinear patterns in load data.
Gradient Boosting Machine (GBM): Utilizing a Boosting strategy, GBM serially constructs decision trees, where each subsequent tree is specifically designed to correct the prediction residuals of the preceding one. This mechanism enables GBM to perform refined residual learning and multi-scale feature capture—whereby shallow nodes learn intra-day high-frequency fluctuations, while deep nodes model seasonal and inter-annual long-term trends. Additionally, its proxy splitting mechanism and gradient descent optimization procedure enhance robustness to missing values and outliers. GBM progressively improves its accuracy by minimizing an objective function that incorporates regularization terms. The core formulation is as follows:
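$$\mathcal{L}(\Theta) = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{m=1}^{M} \Omega\left(f_m\right)$$
In the formula: $\mathcal{L}$ is the loss function, with $l$ denoting the per-sample loss; $\Theta$ represents the entire set of parameters of the GBM model; $\Omega(f_m)$ is a regularization term used to control the complexity of the $m$-th tree $f_m$ and prevent overfitting; $M$ is the number of trees; $y_i$ is the true load value of the $i$-th training sample and $\hat{y}_i$ is the corresponding prediction; $n$ is the total number of training samples.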
XGBoost and LightGBM: As efficient modern implementations of GBM, they achieve an excellent balance between accuracy and efficiency. XGBoost introduces both L1 and L2 regularization terms into the objective function to control model complexity and supports random subsampling of rows and columns, which improves the model’s generalization ability when capturing the complex nonlinear interactions between load and multi-dimensional features. LightGBM is optimized for large-scale data scenarios and uses two core techniques, Gradient-based One-Side Sampling (GOSS) and a leaf-wise growth algorithm with a depth limit, which significantly reduce the computational cost associated with data volume and feature dimensionality while retaining most of the information [
33]. Its unique “leaf-wise” growth strategy enables the construction of deep asymmetric decision trees at a lower cost, thereby capturing multi-scale load patterns from intra-day fluctuations to seasonal variations more finely than traditional “level-wise” approaches.
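The residual-correction idea behind boosting can be made concrete with a small NumPy sketch. It is a simplified illustration, not the GBM, XGBoost, or LightGBM implementations: it assumes squared-error loss, a single input feature, depth-one trees (stumps), and no regularization term, and the names `fit_stump` and `boost` are hypothetical.

```python
import numpy as np


def fit_stump(x, r):
    """Find the single split on 1-D feature x that best fits targets r (least squares)."""
    best = None
    # Exclude the largest value so the right-hand side of the split is never empty.
    for thr in np.unique(x)[:-1]:
        left, right = r[x <= thr], r[x > thr]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    return best[1:]  # (threshold, left value, right value)


def boost(x, y, n_rounds=50, lr=0.1):
    """Serially add stumps, each fitted to the residuals of the current ensemble."""
    pred = np.full_like(y, y.mean(), dtype=float)  # start from the mean prediction
    stumps = []
    for _ in range(n_rounds):
        residual = y - pred  # negative gradient of the squared-error loss
        thr, left_val, right_val = fit_stump(x, residual)
        pred += lr * np.where(x <= thr, left_val, right_val)
        stumps.append((thr, left_val, right_val))
    return pred, stumps


# Usage on a synthetic curve (placeholder data, not load measurements).
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x)
fitted, stumps = boost(x, y)
```

With a learning rate between 0 and 1, each round cannot increase the training error under squared loss, which is why the fit improves progressively as trees are added.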
- (2) Noise Robustness Group
To enhance the stability of the model against load outliers caused by abnormal weather and other factors, this group introduces Support Vector Regression (SVR). The basic idea of SVR is to map the sample data into a high-dimensional feature space and find an optimal hyperplane there that minimizes the error between predicted and actual values [
34]. This base learner utilizes the Radial Basis Function (RBF) kernel to achieve this high-dimensional mapping implicitly. Through the kernel trick, SVR can perform nonlinear regression in the original input space without the computational expense of explicitly calculating the coordinates in the high-dimensional feature space, thus flexibly fitting the nonlinear relationship between power load and temporal characteristics. To achieve this goal, SVR has a dual advantage:
Noise Robustness: By employing an ε-insensitive loss function, SVR treats data points with deviations within a predefined tolerance band as acceptable errors, imposing penalties only on samples falling outside this interval. This approach effectively mitigates overfitting to noisy data and substantially enhances the robustness of agricultural load forecasting under extreme weather and other anomalous conditions.
Nonlinear Processing Capability: This point is inherently achieved by the kernel trick described above, which allows SVR to capture periodic fluctuations and trend variations typical of agricultural loads.
These dual advantages render SVR particularly suitable for handling agricultural load data, which often exhibits significant noise and nonlinearity due to natural influences and production cycles, thereby offering reliable support for constructing robust forecasting models.
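The RBF kernel is defined as:
$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$
In the formula: $K(x_i, x_j)$ is the kernel function value, representing the similarity between two sample points in the feature space; $x_i$ and $x_j$ are the feature vectors of the $i$-th and $j$-th samples; $\lVert x_i - x_j \rVert$ is the Euclidean distance between two sample points; $\sigma$ is the width parameter of the kernel function, which controls the decay rate of the RBF.
Its optimization objective function is as follows:
$$\min_{w,\, b_1,\, \xi,\, \xi^*} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right)$$
This function must satisfy the following conditions:
$$\begin{cases} y_i - \left(w^{\top}\varphi(x_i) + b_1\right) \le \varepsilon + \xi_i \\ \left(w^{\top}\varphi(x_i) + b_1\right) - y_i \le \varepsilon + \xi_i^* \\ \xi_i \ge 0,\; \xi_i^* \ge 0 \end{cases}$$
In the formula: $\frac{1}{2}\lVert w \rVert^2$ is the regularization term; $C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$ is the loss function term; $w$ is the weight vector; $b_1$ is the bias term; $\varphi(\cdot)$ is a mapping function that maps the input data into a high-dimensional space; $C$ is the regularization parameter; $\varepsilon$ is the threshold of the insensitive loss function; $\xi_i$ and $\xi_i^*$ are slack variables, representing the degrees of violation of the upper and lower bounds of the $\varepsilon$-tube for the $i$-th sample, respectively; $y_i$ is the true value of the $i$-th training sample, with $1 \le i \le n$.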
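The two ingredients that give SVR its noise tolerance and nonlinear capacity can be checked numerically. The sketch below assumes the common Gaussian parameterization of the RBF kernel, exp(-||xi - xj||^2 / (2 sigma^2)), and the standard ε-insensitive loss max(0, |y - ŷ| - ε); the function names and default values are hypothetical.

```python
import numpy as np


def rbf_kernel(xi, xj, sigma=1.0):
    """Similarity of two samples; decays with squared Euclidean distance."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))


def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero penalty inside the eps-tube, linear penalty outside it."""
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, err - eps)


# Usage: deviations of 0.05, 0.3 and 0.5 from a true value of 1.0.
losses = eps_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.3, 0.5], eps=0.1)
```

In the usage line, the first deviation falls inside the tolerance band and incurs no loss, while the other two are penalized only by the amount that exceeds the band.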
- (3) Feature Stability Group
To address the issue of multicollinearity that may arise from temporal feature engineering, this group introduces two regularized linear models.
Ridge Regression: By introducing an L2 regularization term into the loss function, the regression coefficients are constrained within a reasonable range, effectively alleviating the multicollinearity caused by highly correlated temporal features in agricultural load data. The advantage of this method is that it accepts a small bias in exchange for lower estimation variance and shrinks coefficients toward zero without setting them exactly to zero, making it well suited to dealing with multicollinearity and overfitting problems [35]. Specifically, this mechanism brings three key effects: reducing sensitivity to specific features through coefficient shrinkage; maintaining the stability of parameter estimation in the presence of multicollinearity; and filtering out training noise, thereby improving the model’s generalization ability on complex agricultural data while maintaining prediction accuracy.
LASSO Regression: Employing an L1 regularization mechanism, LASSO automatically identifies and excludes redundant features by shrinking their coefficients to zero during training, thereby performing automatic feature selection. This characteristic directly addresses two core challenges in agricultural load data: first, it mitigates multicollinearity among meteorological and historical load features across different time scales; second, it constructs a sparse model that autonomously selects the most influential feature subset for load forecasting. This approach reduces computational complexity, improves training efficiency, and helps prevent overfitting, supporting stable predictive performance on agricultural load data characterized by significant seasonal fluctuations.
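The different effects of L2 and L1 penalties on coefficients are easiest to see in the special case of an orthonormal design matrix, where both estimators have closed forms. The sketch below assumes the objectives 1/2·||y − Xβ||² + (λ/2)·||β||² for ridge and 1/2·||y − Xβ||² + λ·||β||₁ for LASSO; the function names and the example coefficients are hypothetical.

```python
import numpy as np


def ridge_shrink(beta_ols, lam):
    """Ridge under an orthonormal design: every coefficient is scaled by 1 / (1 + lam)."""
    return np.asarray(beta_ols, dtype=float) / (1.0 + lam)


def lasso_shrink(beta_ols, lam):
    """LASSO under an orthonormal design: soft-thresholding at lam."""
    b = np.asarray(beta_ols, dtype=float)
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)


# Usage: one strong and two weak least-squares coefficients (placeholder values).
beta_ols = np.array([3.0, -0.5, 0.2])
ridge_coef = ridge_shrink(beta_ols, lam=1.0)
lasso_coef = lasso_shrink(beta_ols, lam=1.0)
```

In this example ridge keeps all three coefficients nonzero but smaller, whereas LASSO removes the two weak ones entirely, which is the sparsity that underlies its feature-selection behavior.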
The predictions generated by the base learners are assembled into a new feature matrix, referred to as meta-features, which are subsequently fed into the meta-learner for final training. In this study, XGBoost is employed as the meta-learner. In this capacity, the task of XGBoost shifts from direct modeling of raw input features to learning an optimal strategy for weighting and integrating the meta-features produced by the diverse base learners. Owing to its strong nonlinear fitting capability and built-in regularization mechanism, XGBoost effectively captures complex interactions among these predictive “opinions” while mitigating overfitting, thereby yielding a final decision that surpasses the performance of any individual base learner.
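The meta-feature construction described above can be sketched with scikit-learn. This is an illustration under several assumptions rather than the paper's configuration: XGBoost and LightGBM are left out of the base learners to avoid extra dependencies, scikit-learn's `GradientBoostingRegressor` stands in for the XGBoost meta-learner, all hyperparameter values are placeholders, and the helper names are hypothetical. Out-of-fold predictions are generated with `TimeSeriesSplit`, so each meta-feature row comes from models trained only on earlier samples.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import TimeSeriesSplit
from sklearn.svm import SVR


def make_base_learners():
    """Heterogeneous base learners (placeholder hyperparameters)."""
    return {
        "rf": RandomForestRegressor(n_estimators=50, random_state=0),
        "gbm": GradientBoostingRegressor(n_estimators=50, random_state=0),
        "svr": SVR(kernel="rbf", C=10.0, epsilon=0.1),
        "ridge": Ridge(alpha=1.0),
        "lasso": Lasso(alpha=0.01),
    }


def fit_stacking(X, y, n_splits=5):
    """Train base learners and a meta-learner on time-ordered out-of-fold predictions."""
    base = make_base_learners()
    oof = np.full((len(y), len(base)), np.nan)
    for train_idx, val_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        for j, model in enumerate(base.values()):
            model.fit(X[train_idx], y[train_idx])
            oof[val_idx, j] = model.predict(X[val_idx])

    # The earliest block is never used for validation, so it has no meta-features.
    covered = ~np.isnan(oof).any(axis=1)
    meta = GradientBoostingRegressor(n_estimators=100, random_state=0)
    meta.fit(oof[covered], y[covered])

    # Refit every base learner on the full training set for later prediction.
    for model in base.values():
        model.fit(X, y)
    return base, meta


def predict_stacking(base, meta, X):
    """Stack base-learner predictions column-wise and pass them to the meta-learner."""
    meta_features = np.column_stack([m.predict(X) for m in base.values()])
    return meta.predict(meta_features)
```

The library's `StackingRegressor` is not used here because its internal cross-validated predictions require every sample to appear in exactly one validation fold, which an expanding-window time-series split does not provide.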
2.3. Optuna-Based Bayesian Hyperparameter Optimization
After completing the construction of feature engineering and heterogeneous base learners, the final performance of the ensemble model largely depends on its hyperparameter configuration. To ensure that the Stacking framework designed in
Section 2.2 can achieve its theoretical synergistic gain and overcome the inefficiency and limitations of manual parameter tuning, this study adopts the Optuna framework based on the Tree-structured Parzen Estimator (TPE) algorithm for hyperparameter optimization [
32,
36]. The whole optimization process is implemented with the Optuna package in Python.
The principal advantage of Bayesian optimization lies in its ability to model the black-box relationship between hyperparameters and model performance (quantified by validation set loss) through probabilistic surrogate models. Compared to conventional methods such as grid search or random search, TPE can infer more promising hyperparameter configurations from historical evaluation results, thereby identifying superior solutions with significantly fewer iterations. In each iteration, the TPE sampler proposes a new hyperparameter combination based on all historical evaluations, and each combination is assessed using 5-fold time-series cross-validation. The performance (average MSE) of every evaluated combination is recorded, and all evaluated combinations are retained in a history of trials. The optimal hyperparameter combination is selected only after the stopping condition is met, by taking the best-performing set from the entire history of trials. This mechanism ensures that the sequential search is guided by comprehensive historical data, a standard practice in Bayesian optimization for efficient space exploration. This efficiency is particularly critical for computationally expensive ensemble models.
In this study, the optimization objective is defined as minimizing the generalization error of the meta-learner (XGBoost) under 5-fold time-series cross-validation. Using the Optuna framework, hyperparameters of all model components are synchronously optimized within a unified search space, ensuring an effective balance between computational efficiency and predictive performance.
The specific hyperparameter framework is shown in
Figure 4. In this flowchart, the light blue rounded rectangle denotes the initiation of the process, while the pink rectangles represent the core iterative steps of the Bayesian optimization loop. The gray diamond shape is used for the decision-making node that checks the convergence criterion. Finally, the output of the workflow—the discovery of the optimal hyperparameter set—is indicated by a green rounded rectangle. This structured visualization clarifies the sequence and nature of each operation within the Optuna-based tuning process.
3. Load Dispatch Optimization Model
For clarity, the key terms used in this optimization model are defined as follows:
Electricity Load: The total electricity required by an agricultural park at a specific time, mainly consumed by agricultural equipment, lighting, irrigation systems, and other electrical equipment.
Thermal Load: The total thermal energy demand for heating purposes within the park. It is measured in kilowatts (kW) and is primarily supplied by the combined heat and power (CHP) unit, gas boilers, and thermal storage tanks.
3.1. Interruptible Load Incentive Mechanism
Conventional load dispatch methods often rely on static models, which struggle to adapt to complex and volatile energy supply-demand environments. This section introduces an interruptible load (IL) dispatch framework that incorporates dynamic optimization algorithms, multi-energy complementation strategies, and user participation mechanisms. With the aim of minimizing system operational costs, the proposed model coordinates load forecasting and IL dispatch through integrated scheduling of generation, storage, and demand-side resources.
To meet the electricity demand of diverse users, rural parks typically increase their electricity procurement from the main grid. However, this leads to higher grid interaction costs and consequently increases the operational expenses of rural parks. To address this issue, users and utility companies can negotiate Interruptible Capacity, Interruption Frequency, allowable duration, and compensation prices, formalizing these terms through interruptible load agreements. Users may voluntarily participate in the IL program based on their operational needs and the attractiveness of the economic compensation. This flexibility enables users to obtain additional economic benefits without disrupting their production activities or daily routines. During peak demand periods, non-essential user loads can be curtailed, with participating users receiving corresponding compensation. Specifically, interruptible load compensation can be categorized into two types:
1. Capacity compensation cost $C_{\mathrm{cap}}$ (CNY): a fixed reservation fee paid to users for committing their maximum interruptible capacity $P_i^{\max}$ (kW) to the system, as stipulated in the IL agreement. This cost is incurred regardless of whether the load is actually curtailed, serving as a payment for the availability of this capacity. The unit capacity compensation price is denoted by $c_i^{\mathrm{cap}}$ (CNY/kW).
$$C_{\mathrm{cap}} = \sum_{i=1}^{N} c_i^{\mathrm{cap}} P_i^{\max}$$
2. Energy compensation cost $C_{\mathrm{en}}$ (CNY): a variable execution fee calculated after a dispatch event as the actual curtailed energy $P_{i,t}^{\mathrm{IL}} \Delta t$ (kWh) multiplied by the contracted energy compensation price $c_i^{\mathrm{en}}$ (CNY/kWh).
$$C_{\mathrm{en}} = \sum_{i=1}^{N} \sum_{t=1}^{T} u_{i,t} \, C_{i,t}^{\mathrm{en}}, \qquad C_{i,t}^{\mathrm{en}} = c_i^{\mathrm{en}} P_{i,t}^{\mathrm{IL}} \Delta t$$
where $N$ is the total number of users participating in interruption compensation; $C_{i,t}^{\mathrm{en}}$ is the energy compensation cost for user $i$ at time $t$; $T$ is the 24 h scheduling horizon, which includes the interruption periods; and $u_{i,t} \in \{0, 1\}$ is the IL status of the $i$-th user, with $u_{i,t} = 1$ indicating that the IL of user $i$ is called at time $t$ and $u_{i,t} = 0$ indicating that it is not.
3. Total compensation amount:
$$C_{\mathrm{IL}} = C_{\mathrm{cap}} + C_{\mathrm{en}}$$
where $C_{\mathrm{IL}}$ is the total cost of IL compensation.
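The two-part compensation scheme described above can be illustrated with a short Python sketch. The class `ILContract`, the function `il_compensation`, and all prices and capacities in the usage lines are hypothetical values chosen for illustration, not the contract data of this study.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ILContract:
    max_capacity_kw: float   # contracted maximum interruptible capacity (kW)
    capacity_price: float    # reservation fee per kW of committed capacity (CNY/kW)
    energy_price: float      # fee per kWh actually curtailed (CNY/kWh)


def il_compensation(contracts: Sequence[ILContract],
                    curtailed_kw: Sequence[Sequence[float]],
                    dt_h: float = 1.0) -> Tuple[float, float, float]:
    """Return (capacity_cost, energy_cost, total_cost) in CNY.

    curtailed_kw[i][t] is the power curtailed for user i in period t
    (0 when the user is not called).
    """
    # Capacity fee is paid whether or not the load is curtailed.
    capacity_cost = sum(c.max_capacity_kw * c.capacity_price for c in contracts)
    energy_cost = 0.0
    for contract, profile in zip(contracts, curtailed_kw):
        for power in profile:
            if power < 0 or power > contract.max_capacity_kw:
                raise ValueError("curtailment outside the contracted capacity")
            energy_cost += contract.energy_price * power * dt_h
    return capacity_cost, energy_cost, capacity_cost + energy_cost


# Hypothetical two-user, two-period example.
example_contracts: List[ILContract] = [ILContract(100.0, 0.5, 1.2), ILContract(200.0, 0.4, 1.0)]
print(il_compensation(example_contracts, [[0.0, 50.0], [100.0, 0.0]]))
```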
3.2. Objective Function
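Based on the IL dispatch framework, the optimization model aims to minimize the total operational cost of the rural integrated energy system. The model mainly considers variable operating costs: fuel costs, grid interaction costs, and interruption compensation costs. The maintenance cost of batteries, as a fixed cost, does not affect the relative economic comparison of the schemes in short-term scheduling optimization and is therefore excluded from the objective function. The objective function comprises three cost components: natural gas expenses $C_{\mathrm{gas}}$, grid interaction costs $C_{\mathrm{grid}}$, and IL compensation costs $C_{\mathrm{IL}}$, as formulated in Equations (10)–(12).
$$\min F = C_{\mathrm{gas}} + C_{\mathrm{grid}} + C_{\mathrm{IL}} \quad (10)$$
where $F$ is the total operating cost to be minimized.
$$C_{\mathrm{gas}} = c_{\mathrm{gas}} \sum_{t=1}^{T} \frac{1}{\mathrm{LHV}} \left( \frac{P_{\mathrm{CHP},t}}{\eta_{\mathrm{CHP}}} + \frac{Q_{\mathrm{GB},t}}{\eta_{\mathrm{GB}}} \right) \Delta t \quad (11)$$
where $c_{\mathrm{gas}}$ is the natural gas price; $P_{\mathrm{CHP},t}$ is the power generation of the CHP unit in the $t$-th hour; $\eta_{\mathrm{CHP}}$ is the power generation efficiency of the CHP unit; $Q_{\mathrm{GB},t}$ is the heat output power of the gas boiler in the $t$-th hour; $\eta_{\mathrm{GB}}$ is the gas boiler efficiency; LHV is the heat released when a unit volume of natural gas is completely burned; and $S$ is the set of users selected for interruption, over which $C_{\mathrm{IL}}$ is evaluated.
$$C_{\mathrm{grid}} = \sum_{t=1}^{T} \left( c_{\mathrm{buy},t} P_{\mathrm{buy},t} - c_{\mathrm{sell},t} P_{\mathrm{sell},t} \right) \Delta t \quad (12)$$
where $c_{\mathrm{buy},t}$ is the electricity purchase price at hour $t$ (CNY/kWh); $P_{\mathrm{buy},t}$ is the power purchased from the grid at hour $t$ (kW); $P_{\mathrm{sell},t}$ is the power sold to the grid at hour $t$ (kW); and $c_{\mathrm{sell},t}$ is the electricity sales price at hour $t$ (CNY/kWh).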
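As a rough illustration of how the three cost components combine, the sketch below evaluates a gas cost, a grid interaction cost, and their sum with an IL compensation term. The function names, the efficiencies, the heating value of 9.7 kWh/m³, and the tariffs in the usage lines are assumptions for illustration; only the gas price of 2.37 CNY/m³ is taken from the case study.

```python
from typing import Sequence


def gas_cost(chp_power_kw: Sequence[float], boiler_heat_kw: Sequence[float],
             gas_price: float, eta_chp: float, eta_gb: float,
             lhv_kwh_per_m3: float, dt_h: float = 1.0) -> float:
    """Fuel cost (CNY): fuel energy of CHP and boiler converted to gas volume."""
    fuel_kwh = sum((p / eta_chp + q / eta_gb) * dt_h
                   for p, q in zip(chp_power_kw, boiler_heat_kw))
    return gas_price * fuel_kwh / lhv_kwh_per_m3


def grid_cost(buy_kw: Sequence[float], sell_kw: Sequence[float],
              buy_price: Sequence[float], sell_price: Sequence[float],
              dt_h: float = 1.0) -> float:
    """Net grid interaction cost (CNY): purchases minus sales revenue."""
    return sum((cb * pb - cs * ps) * dt_h
               for pb, ps, cb, cs in zip(buy_kw, sell_kw, buy_price, sell_price))


def total_operating_cost(gas: float, grid: float, il_compensation: float) -> float:
    """Objective value: sum of the three variable cost components."""
    return gas + grid + il_compensation


# Hypothetical two-hour example (all values except the gas price are assumed).
g = gas_cost([350.0, 700.0], [90.0, 0.0], gas_price=2.37,
             eta_chp=0.35, eta_gb=0.9, lhv_kwh_per_m3=9.7)
e = grid_cost([100.0, 0.0], [0.0, 50.0], [0.5, 1.0], [0.3, 0.4])
print(total_operating_cost(g, e, il_compensation=0.0))
```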
3.3. Constraint Conditions
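- (1)
Electrical and thermal power balance constraints:
$$P_{\mathrm{WT},t} + P_{\mathrm{PV},t} + P_{\mathrm{CHP},t} + P_{\mathrm{dis},t} - P_{\mathrm{ch},t} + P_{\mathrm{buy},t} - P_{\mathrm{sell},t} = P_{\mathrm{load},t}$$
where $P_{\mathrm{WT},t}$, $P_{\mathrm{PV},t}$, and $P_{\mathrm{CHP},t}$ are, respectively, the output power of the wind turbine, photovoltaic (PV), and combined heat and power (CHP) units at time $t$; $P_{\mathrm{ch},t}$ and $P_{\mathrm{dis},t}$ are the charging and discharging power of the battery at time $t$; $P_{\mathrm{load},t}$ is the electrical load at time $t$; and $P_{\mathrm{buy},t}$ and $P_{\mathrm{sell},t}$, respectively, represent the power purchased from and sold to the grid.
$$Q_{\mathrm{CHP},t} = \alpha P_{\mathrm{CHP},t}$$
where $\alpha$ is the heat-to-power ratio of the CHP unit and $Q_{\mathrm{CHP},t}$ is the thermal output power of the CHP unit at time $t$.
$$Q_{\mathrm{load},t} = Q_{\mathrm{GB},t} + Q_{\mathrm{CHP},t} + Q_{\mathrm{TS},t}^{\mathrm{dis}} - Q_{\mathrm{TS},t}^{\mathrm{ch}}$$
where $Q_{\mathrm{load},t}$ is the thermal load at time $t$; $Q_{\mathrm{GB},t}$ and $Q_{\mathrm{CHP},t}$, respectively, represent the thermal output power of the gas-fired boiler and the CHP unit at time $t$; and $Q_{\mathrm{TS},t}^{\mathrm{dis}}$ and $Q_{\mathrm{TS},t}^{\mathrm{ch}}$ are, respectively, the heat discharge power and heat charge power of the heat storage tank at time $t$.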
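The balance constraints can be read as a bookkeeping rule for each period: whatever the local sources and storage do not cover must be exchanged with the grid, and the heat sources must match the thermal load. A minimal sketch follows; the function names and all numbers are illustrative assumptions.

```python
from typing import Tuple


def grid_exchange(wind: float, pv: float, chp: float,
                  discharge: float, charge: float, load: float) -> Tuple[float, float]:
    """Return (buy_kw, sell_kw) that closes the electrical balance for one period."""
    net = load + charge - (wind + pv + chp + discharge)
    # A positive net demand is purchased; a surplus is sold.
    return (net, 0.0) if net >= 0 else (0.0, -net)


def thermal_residual(heat_load: float, boiler: float, chp_power: float, alpha: float,
                     ts_discharge: float, ts_charge: float) -> float:
    """Heat supply minus heat load (kW); zero when the thermal balance holds.

    CHP heat output is taken as alpha times its electrical output.
    """
    return boiler + alpha * chp_power + ts_discharge - ts_charge - heat_load


# Hypothetical single-period values in kW.
print(grid_exchange(wind=100.0, pv=50.0, chp=200.0, discharge=0.0, charge=20.0, load=500.0))
print(thermal_residual(heat_load=400.0, boiler=100.0, chp_power=200.0,
                       alpha=1.2, ts_discharge=80.0, ts_charge=20.0))
```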
- (2)
Energy supply equipment output power constraint:
$$0 \le P_{k,t} \le P_k^{\max}$$
where $P_{k,t}$ is the output power of energy supply equipment $k$ at time $t$ and $P_k^{\max}$ is the upper limit of its power output.
- (3)
Energy storage battery:
To keep the computational focus on the interplay between short-term load forecasting and interruptible load dispatch, a simplified model is adopted for the battery energy storage system. This approach prioritizes capturing the essential energy arbitrage and power-limiting functionality of the storage unit over modeling its internal electrochemical dynamics.
$$\mathrm{SOC}_t = \mathrm{SOC}_{t-1} + \frac{\left( \eta_{\mathrm{ch}} P_{\mathrm{ch},t} - P_{\mathrm{dis},t} / \eta_{\mathrm{dis}} \right) \Delta t}{E_{\mathrm{rated}}}$$
where $\mathrm{SOC}_t$ is the state of charge of the battery at time $t$; $\eta_{\mathrm{ch}}$ and $\eta_{\mathrm{dis}}$ are the charging and discharging efficiencies; $E_{\mathrm{rated}}$ is the rated capacity of the battery; and $\Delta t$ is the scheduling time interval (1 h).
Overcharging and overdischarging degrade the working state of the battery. Therefore, the state of charge must remain within a safe and reliable range:
$$\mathrm{SOC}_{\min} \le \mathrm{SOC}_t \le \mathrm{SOC}_{\max}$$
where $\mathrm{SOC}_{\max}$ and $\mathrm{SOC}_{\min}$ are the maximum and minimum allowable state of charge of the battery, respectively.
A battery can only work in one state at a time:
$$x_{\mathrm{ch},t} + x_{\mathrm{dis},t} \le 1$$
where $x_{\mathrm{ch},t}$ and $x_{\mathrm{dis},t}$ are binary variables indicating the charging and discharging status of the battery at time $t$, respectively.
When the battery participates in the optimized operation of the integrated energy system, the state of charge must be equal at the beginning and end of the scheduling cycle:
$$\mathrm{SOC}_0 = \mathrm{SOC}_T$$
where $\mathrm{SOC}_0$ is the state of charge of the battery at the beginning of the scheduling cycle.
To protect the service life of the battery and improve the overall operational economy and reliability of the integrated energy system, the charging and discharging power must not exceed the rate limit:
$$0 \le P_{\mathrm{ch},t} \le x_{\mathrm{ch},t} P_{\mathrm{bat}}^{\max}, \qquad 0 \le P_{\mathrm{dis},t} \le x_{\mathrm{dis},t} P_{\mathrm{bat}}^{\max}$$
where $P_{\mathrm{ch},t}$ is the charging power, $P_{\mathrm{dis},t}$ is the discharging power, and $P_{\mathrm{bat}}^{\max}$ is the upper limit of the charge and discharge rate.
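The simplified battery model can be sketched as a state-of-charge recursion with feasibility checks. The function names, the efficiencies, and the state-of-charge limits below are assumed defaults for illustration; the 6500 kWh capacity in the usage lines is the value quoted for the case study.

```python
from typing import List, Sequence


def simulate_soc(soc0: float, charge_kw: Sequence[float], discharge_kw: Sequence[float],
                 capacity_kwh: float, eta_ch: float = 0.95, eta_dis: float = 0.95,
                 soc_min: float = 0.1, soc_max: float = 0.9,
                 p_max_kw: float = float("inf"), dt_h: float = 1.0) -> List[float]:
    """Return the state-of-charge trajectory, raising ValueError on any violation."""
    soc = [soc0]
    for p_ch, p_dis in zip(charge_kw, discharge_kw):
        if p_ch > 0 and p_dis > 0:
            raise ValueError("battery cannot charge and discharge in the same period")
        if not (0 <= p_ch <= p_max_kw and 0 <= p_dis <= p_max_kw):
            raise ValueError("charging or discharging power outside its limit")
        nxt = soc[-1] + (eta_ch * p_ch - p_dis / eta_dis) * dt_h / capacity_kwh
        if not soc_min - 1e-9 <= nxt <= soc_max + 1e-9:
            raise ValueError("state of charge outside the allowed range")
        soc.append(nxt)
    return soc


def is_cyclic(soc: Sequence[float], tol: float = 1e-6) -> bool:
    """Check that the state of charge returns to its initial value."""
    return abs(soc[-1] - soc[0]) <= tol


# Hypothetical schedule: charge for one hour, idle, then discharge.
trajectory = simulate_soc(0.5, [650.0, 0.0, 0.0], [0.0, 0.0, 500.0], capacity_kwh=6500.0)
print(trajectory, is_cyclic(trajectory))
```

Because of the charging and discharging losses, a schedule that charges and discharges the same energy does not return to the initial state of charge, which is why the end-of-cycle condition has to be imposed explicitly.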
- (4)
Gas boiler output constraint:
The output of the gas boiler must stay within its lower and upper limits:
$$y_t Q_{\mathrm{GB}}^{\min} \le Q_{\mathrm{GB},t} \le y_t Q_{\mathrm{GB}}^{\max}$$
where $Q_{\mathrm{GB},t}$ is the actual output power of the gas boiler at time $t$; $Q_{\mathrm{GB}}^{\min}$ and $Q_{\mathrm{GB}}^{\max}$ are the minimum and maximum output power allowed for the gas boiler; and $y_t$ is a binary variable representing the on/off status of the gas boiler at time $t$, where 1 indicates that the unit is on and 0 indicates that it is off.
- (5)
Interruptible load interruption capacity constraint:
$$0 \le P_{i,t}^{\mathrm{IL}} \le u_{i,t} P_i^{\max}$$
where $P_i^{\max}$ is the maximum interruptible capacity of the $i$-th user.
- (6)
Interruptible load interruption time constraint:
$$T_i^{\mathrm{IL}} \le T_i^{\max}$$
where $T_i^{\mathrm{IL}}$ is the interruption time of the $i$-th user and $T_i^{\max}$ is the maximum interruptible time.
- (7)
Interruption frequency constraint for IL:
$$n_i \le n_i^{\max}$$
where $n_i$ and $n_i^{\max}$ are the number of interruptions and the maximum number of interruptions allowed for the $i$-th user.
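The three IL constraints (capacity, duration, and frequency) can be checked for a single user's schedule as sketched below. Treating the interruption time as the total number of interrupted hours in the day, and counting each off-to-on transition as one interruption, are assumptions of this sketch, as are the function name and the example values.

```python
from typing import Sequence


def il_schedule_feasible(status: Sequence[int], curtailed_kw: Sequence[float],
                         max_capacity_kw: float, max_hours: float,
                         max_events: int, dt_h: float = 1.0) -> bool:
    """Check one user's IL schedule against capacity, duration, and frequency limits."""
    # Capacity: curtailment is allowed only when called, and only up to the contract.
    for u, p in zip(status, curtailed_kw):
        if p < 0 or p > u * max_capacity_kw:
            return False
    # Duration: total interrupted time over the horizon (assumed interpretation).
    if sum(status) * dt_h > max_hours:
        return False
    # Frequency: one event per transition from "not called" to "called".
    previous = [0] + list(status[:-1])
    events = sum(1 for prev, cur in zip(previous, status) if cur == 1 and prev == 0)
    return events <= max_events


# Hypothetical five-period schedule with two separate interruption events.
print(il_schedule_feasible([0, 1, 1, 0, 1], [0.0, 50.0, 40.0, 0.0, 30.0],
                           max_capacity_kw=60.0, max_hours=3.0, max_events=2))
```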
3.4. Collaborative Scheduling Integration Framework Based on MF-DFA Threshold Warning and PSO
3.4.1. MF-DFA Dynamic Threshold Detection Mechanism
The core of this section lies in establishing a closed-loop “prediction–warning–dispatch” framework. Conventional static thresholds cannot adapt to the non-stationary, multi-timescale fluctuations of agricultural loads, so this work introduces an improved Multifractal Detrended Fluctuation Analysis (MF-DFA) method for setting the warning threshold dynamically. The approach rests on the intrinsic multifractal characteristics of agricultural loads, manifested as long-range correlations across different temporal scales, which shift systematically when the system approaches a high-risk critical state. The improved MF-DFA captures such transitions through variations in the generalized Hurst exponent $h(q)$, a core measure in fractal analysis that quantifies the long-range memory and scaling properties of a time series [31]. Specifically, $h(q) > 0.5$ indicates persistent (long-memory) behavior, $h(q) < 0.5$ suggests anti-persistence, and $h(q) = 0.5$ corresponds to an uncorrelated process. This provides a more principled and proactive decision basis for interruptible load dispatch than static threshold methods.
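For readers unfamiliar with the generalized Hurst exponent, the following pure-Python sketch estimates it for a single moment order with linear detrending. It is a simplified illustration rather than the improved procedure of this study: it uses forward non-overlapping segments only, requires a non-zero moment order, and the function names, scales, and synthetic white-noise input are all assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def _linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def generalized_hurst(series: Sequence[float], scales: Sequence[int], q: float = 2.0) -> float:
    """Estimate the generalized Hurst exponent for moment order q (q != 0)."""
    mean = sum(series) / len(series)
    profile: List[float] = []
    running = 0.0
    for value in series:                      # step 1: cumulative profile
        running += value - mean
        profile.append(running)
    log_s, log_f = [], []
    for s in scales:
        xs = list(range(s))
        variances = []
        for k in range(len(profile) // s):    # step 2: non-overlapping segments
            segment = profile[k * s:(k + 1) * s]
            slope, intercept = _linear_fit(xs, segment)
            residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment))
            variances.append(residual / s)    # step 3: detrended variance
        fq = (sum(v ** (q / 2) for v in variances) / len(variances)) ** (1 / q)
        log_s.append(math.log(s))
        log_f.append(math.log(fq))
    return _linear_fit(log_s, log_f)[0]       # step 4: slope of log F_q(s) vs log s


rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = generalized_hurst(white_noise, scales=[16, 32, 64, 128, 256])
print(f"estimated exponent for white noise: {h_noise:.2f}")  # expected to be near 0.5
```

Repeating the estimate over a range of moment orders gives the curve whose shape reflects how strongly multifractal the series is.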
This study uses the standard MF-DFA algorithm [30,31] to analyze the short-term load sequence and obtain its generalized Hurst exponent $h(q)$. However, the core contribution of this work lies not in the MF-DFA algorithm itself but in the strategy for converting its output into forward-looking dynamic thresholds. This strategy comprises two improvement steps:
- (1)
Significance analysis based on surrogate data: To distinguish inherent temporal structure from random fluctuations in the load sequence, we generate a set of surrogate sequences using phase randomization. This operation destroys the temporal correlation of the original sequence while preserving its statistical distribution. By comparing $h(q)$ of the original sequence with the distribution of Hurst exponents of the surrogate sequences, we obtain an indicator of system persistence that excludes random interference.
- (2)
Dynamic threshold determination based on inflection point detection: The core improvement of this method lies in applying an improved inflection point detection strategy to the $h(q)$ curve. The strategy combines dual smoothing and multi-scale differential analysis to robustly identify the most abrupt change point in $h(q)$. This inflection point is interpreted as a signal of a critical transition in the multifractal state of the system. A preset mapping rule then converts the inflection point position into a specific quantile of the predicted load distribution, which defines the dynamic warning threshold.
The dynamic threshold calculated through the above process serves as the trigger for the entire collaborative scheduling framework. Its integration logic is as follows:
Trigger Mechanism: When the predicted load at a future time point t exceeds the dynamically determined warning threshold, the system identifies this condition as a “high-cost risk period” and automatically initiates the IL dispatch optimization module. The trigger signal and the magnitude of the exceedance are then passed to the PSO-based optimization model described in Section 3.4.2, where they define the load reduction target of the optimization problem.
The PSO model then determines an economically optimal combination of participating users and their corresponding load reduction capacities from the pool of contracted users. This strategy ensures that load interruptions are activated only when the system approaches a critical state, thereby avoiding frequent interventions caused by improperly set thresholds. Such a design prevents both the loss of user satisfaction caused by excessive interruptions and the higher system costs caused by missed intervention opportunities, balancing economic efficiency and operational safety.
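The trigger logic can be sketched as follows: a threshold is taken as a quantile of the forecast load distribution, and every period whose forecast exceeds it yields a load reduction target equal to the exceedance. Reading the reported 82.5% threshold as the 82.5th percentile of the forecast, together with the function names and the synthetic forecast, is an assumption of this sketch.

```python
from typing import Dict, Sequence, Tuple


def percentile(values: Sequence[float], pct: float) -> float:
    """Linearly interpolated percentile (pct in [0, 100]) of the values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def il_triggers(forecast_kw: Sequence[float],
                threshold_pct: float = 82.5) -> Tuple[float, Dict[int, float]]:
    """Return the threshold and {period: reduction target in kW} for exceedances."""
    threshold = percentile(forecast_kw, threshold_pct)
    targets = {t: load - threshold
               for t, load in enumerate(forecast_kw) if load > threshold}
    return threshold, targets


# Synthetic 24-hour forecast (kW) with an evening peak.
forecast = [300.0] * 18 + [420.0, 480.0, 520.0, 500.0, 380.0, 320.0]
threshold, targets = il_triggers(forecast)
print(threshold, targets)
```

The resulting targets would then be handed to the dispatch optimizer, which decides which contracted users to curtail and by how much.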
The MF-DFA dynamic threshold warning process is shown in Figure 5, where the main steps are drawn in blue and the sub-steps in gray.
3.4.2. PSO Scheduling Model
Given the high-dimensional combinatorial nature, multi-constraint coupling, and real-time decision-making requirements of the optimal interruptible load dispatch problem in agricultural parks, traditional mixed-integer linear programming (MILP) methods face considerable challenges. Because detailed energy system models are computationally expensive, conventional approaches often rely on simplified modeling or linearization to reduce solution complexity. Such simplifications have limitations when addressing nonlinear system characteristics and uncertainty: the modeling effort remains high, solution times are hard to control, and the optimized results may deviate significantly from actual operational performance.
To address these issues, this study develops an innovative joint optimization framework that tightly couples the particle swarm optimization (PSO) algorithm with an energy system simulation model. A key advantage of this framework is that it eliminates the need to linearize pronounced nonlinear characteristics in the interruptible load dispatch model, thereby preserving high model fidelity. By leveraging the stochastic search mechanism of PSO—which inherently accommodates load forecasting errors—and harnessing its strengths in global search capability, flexible constraint handling, and computational efficiency, the framework achieves intelligent optimal dispatch of interruptible loads.
The proposed framework employs a multi-coupling-driven fitness evaluation mechanism that directly invokes a full energy system simulation in each fitness evaluation, so that the optimization captures the system’s complex nonlinear behavior. The algorithm adjusts particle positions and velocities based on the simulation feedback, establishing a closed-loop optimization process. This mechanism lets the search respond to system nonlinearities and reduces the risk of convergence to local optima.
In terms of collaborative optimization, the framework overcomes the limitations of conventional sequential optimization methods through a swarm intelligence mechanism, enabling it to effectively uncover implicit synergies among electricity loads of different users. By strategically combining users with high and low compensation costs, the model achieves global total cost minimization, demonstrating its innovative capability in multi-user intelligent collaborative decision-making. Furthermore, the high computational efficiency of PSO allows it to converge rapidly to near-optimal solutions, making it particularly suitable for real-time dispatch applications in large-scale agricultural park integrated energy systems. Compared with traditional methods that require complex mathematical modeling and linearization, this framework handles the original nonlinear model directly, significantly improving solution efficiency while maintaining model accuracy.
The optimization settings follow Reference [37], and the particle swarm optimization (PSO) parameters were set as follows:
- (1)
Population size: set to 100.
- (2)
Maximum iterations: set to 200. In our trials, the algorithm achieves stable convergence within this number of iterations, after which the objective function value no longer changes significantly. This is consistent with the typical convergence characteristics of complex energy scheduling problems.
- (3)
Inertia weight: set to 0.8. This value is slightly higher than the conventional setting [37], aiming to enhance the algorithm’s global exploration ability in the early stage of the search and thereby find high-quality solutions more effectively in the complex space of user interruption combinations.
- (4)
Learning factors: the cognitive learning factor c1 and the social learning factor c2 are both set to 1.5, a standard configuration for balancing individual and social learning.
Fitness Calculation: The fitness function combines the system operation cost and the interruptible load compensation cost so that candidate solutions can be ranked quickly. It is formulated as follows:
$$\mathrm{Fitness} = C_{\mathrm{gas}} + C_{\mathrm{grid}} + C_{\mathrm{IL}}$$
where $C_{\mathrm{gas}}$ is the natural gas cost, $C_{\mathrm{grid}}$ is the cost of power exchange with the grid, and $C_{\mathrm{IL}}$ is the IL compensation cost.
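A compact PSO loop with the parameter values listed above (100 particles, 200 iterations, inertia weight 0.8, both learning factors 1.5) is sketched below. The toy fitness function, which prices curtailment per user and penalizes any shortfall against a reduction target, replaces the full energy system simulation used in the study; the function names, penalty weight, prices, and bounds are all illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(fitness: Callable[[Sequence[float]], float],
                 lower: Sequence[float], upper: Sequence[float],
                 n_particles: int = 100, n_iter: int = 200,
                 w: float = 0.8, c1: float = 1.5, c2: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float, List[float]]:
    """Return (best position, best cost, history of the best cost per iteration)."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the feasible box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower[d]), upper[d])
            cost = fitness(pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return gbest, gbest_cost, history


def make_il_fitness(prices: Sequence[float], target_kw: float,
                    penalty: float = 1e3) -> Callable[[Sequence[float]], float]:
    """Toy fitness: compensation cost plus a penalty for missing the reduction target."""
    def fitness(x: Sequence[float]) -> float:
        shortfall = max(0.0, target_kw - sum(x))
        return sum(p * xi for p, xi in zip(prices, x)) + penalty * shortfall
    return fitness


toy_fitness = make_il_fitness(prices=[1.0, 2.0, 3.0], target_kw=100.0)
best_x, best_cost, cost_history = pso_minimize(toy_fitness, [0.0] * 3, [80.0] * 3)
print(best_x, best_cost)
```

In this toy setting the swarm is expected to favor the cheaper users first, which mirrors the idea of combining users with high and low compensation costs.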
This optimization process is initiated when a high-risk period is identified by the dynamic threshold detection mechanism illustrated in Figure 5.
For the selected loads, the corresponding particles’ positions and velocities are dynamically adjusted in response to real-time load demand and electricity price fluctuations, ensuring the optimization results remain aligned with current dispatch requirements. Through such dynamic adjustment, the system can reduce electricity purchases during peak periods while increasing procurement during off-peak hours, thereby optimizing overall operational costs.
The flowchart of the PSO-based interruptible load optimization scheduling is shown in Figure 6 (main steps are shown in gray boxes with black text; sub-steps are indicated by blue text).
4. Numerical Example Analysis
4.1. Selection of Python Technology Libraries
This article constructs a power load forecasting model based on Python 3.12. In the data processing stage, Pandas is used for data loading, cleaning, and daily-hourly data reshaping, and NumPy is utilized to construct periodic features. To further reveal the inherent patterns of load, Statsmodels is introduced for STL decomposition to extract trend, seasonality, and residual components. At the model construction level, Scikit-learn is used to integrate multiple base learners, and LightGBM is introduced to efficiently handle time series features. The prediction results of each base learner are integrated by XGBoost as a meta-learner to fully leverage its regularization advantage. Finally, Optuna is employed to automatically optimize the hyperparameters of all models to ensure optimal prediction performance. The key Python libraries utilized in this workflow are summarized in
Table 1.
4.2. Numerical Example Selection
Following Reference [38], the integrated energy system microgrid of the large-scale agricultural park is configured with the following equipment: a JMS 416 GS-N.L gas-fired generator set with a rated power output of 80,000 kW; a gas-fired boiler with a heating capacity of 55,000 kW; a small-scale photovoltaic power generation system composed of photovoltaic arrays, with a total capacity of 5070 kW; a small-scale wind power generation system composed of wind turbines, with a total capacity of 3500 kW; an electrical energy storage system composed of lead–acid battery packs, with a total capacity of 6500 kWh; and a thermal energy storage system composed of heat storage tanks, with a total capacity of 15,000 kWh.
This study uses hourly historical load data from the Global Energy Forecasting Competition 2012 [39]. The training period for the microgrid of the large-scale agricultural park spans from the 1st hour of 1 January 2007 to the 24th hour of 1 January 2008, and the prediction period covers the 1st hour of 2 January 2008 to the 1st hour of 7 January 2008. The forecast for 2 January 2008 is used for the optimal scheduling of the IL incentive mechanism.
In the scheduling framework with the interruptible load incentive mechanism, the dynamic warning threshold calculated by the improved MF-DFA algorithm (see Section 3.4.1) is 82.5%. The natural gas price is set at 2.37 CNY/m³ based on local energy prices. Power exchange between the park’s microgrid and the main grid is priced at the time-of-use tariff shown in Table 2; the relevant parameters of the energy supply equipment are presented in Table 3.
Suppose there are 10 groups of users in the agricultural park who, after signing the contract, participate in the park’s IL incentive mechanism. Their quotation schemes are shown in Table 4, following Reference [40].
To verify the economic efficiency of this interruptible load incentive mechanism in improving the optimal dispatch of the agricultural park, four scenarios are set as the verification scenarios in this study:
Scenario 1: Economic operation results of the traditional model that sums separate electricity forecasting and heat forecasting.
Scenario 2: Economic operation results of the traditional model that sums separate electricity forecasting and heat forecasting + IL.
Scenario 3: Economic operation results considering the ensemble learning model (integrated total load of wind, PV, and load) and heat forecasting.
Scenario 4: Economic operation results considering the ensemble learning model (integrated total load of wind, PV, and load), heat forecasting results, and the IL mechanism.
4.3. Optimization Results of the Example
The following scenarios are solved in Python 3.12.
The photovoltaic and wind power forecasts are shown in Figure 7.
Scenario 1: Economic operation results of the traditional model that sums separate electricity forecasting and heat forecasting.
The traditional electricity and heat prediction models are shown in
Figure 8.
As shown in Table 5, the model achieves an RMSE of 669.75 and an MAE of 492.38 for electricity load forecasting. These values indicate relatively low average prediction errors and show that the model captures most load patterns. Furthermore, the R² value of 0.9973 means that the model explains 99.73% of the variance in the electricity load data.
Similarly, for heat load forecasting, the model attains an RMSE of 489.65 and an MAE of 242.25, with an R² value of 0.9967, indicating comparably reliable fitting of the thermal load characteristics.
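For reference, the three reported metrics can be computed as in the sketch below. The function name and the four-point example series are illustrative and unrelated to the data in Table 5.

```python
import math
from typing import Sequence, Tuple


def regression_metrics(actual: Sequence[float],
                       predicted: Sequence[float]) -> Tuple[float, float, float]:
    """Return (RMSE, MAE, R squared) for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot   # share of variance explained by the forecast
    return rmse, mae, r2


print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
```

RMSE and MAE carry the unit of the load itself, so they should be read against the typical load level, whereas R² is dimensionless.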
Scenario 2: Economic operation results of the traditional model that sums separate electricity forecasting and heat forecasting + IL.
The operating results of the traditional load forecasting model combined with IL are shown in Figure 9.
Economic analysis shows that the interruptible load (IL) strategy yields a modest but positive economic benefit. As summarized in Table 6, the total operating cost decreases from 1,500,299.42 CNY to 1,495,592.07 CNY after implementing IL, a reduction of 4707.35 CNY, or 0.31%. Although the saving rate is modest, it is achieved through a highly targeted intervention: load interruptions were applied only to User 9 and User 10 during the two-hour peak period (20:00–21:00), at a compensation cost of 4430.00 CNY. After deducting this compensation, the strategy still generates a net daily gain of 277.35 CNY, which supports its economic viability: the system savings outweigh the compensation payments. More importantly, the result illustrates the practical potential of IL as a precision tool for peak shaving that enhances grid flexibility and resource allocation efficiency with little impact on user satisfaction, since interventions are minimal and compensated.
Scenario 3: Economic operation results considering the ensemble learning model (integrated total load of wind, PV, and load) and heat forecasting.
Figure 10 illustrates the economically and technically optimized operation of the multi-energy system. The dispatch logic clearly follows an economic merit order: wind and photovoltaic (PV) generation, with near-zero marginal cost, are fully consumed, with PV output exhibiting its expected unimodal shape (11:00–19:00). The energy storage system actively arbitrages time-of-use electricity prices, charging during the lowest-price off-peak hours (00:00–02:00, 23:00) and discharging during the highest-price peak periods (08:00–10:00, 18:00–20:00), with its discharge power strategically peaking in the evening to offset the most expensive grid purchases.
The gas turbine maintains minimum output during low-load nighttime hours (00:00–05:00). As load increases, its output gradually rises, reaching peak levels during the evening peak period (18:00–23:00) to meet high demand. Meanwhile, grid interaction demonstrates an economically optimized pattern: the system purchases electricity from the grid during off-peak periods while minimizing purchase costs during peak hours through coordinated dispatch of internal generation resources.
Through the coordinated operation of multiple power sources, the system achieves multiple objectives including supply-demand balance, cost optimization, and operational stability. This integrated approach not only reduces expensive electricity purchases from the grid during peak periods but also contributes to peak shaving and valley filling of the grid load profile, thereby enhancing both economic efficiency and operational reliability. Consequently, the system realizes dual benefits: temporal electricity value transfer and effective load profile smoothing.
During the 00:00–05:00 period, the heat load demand is relatively low, and the CHP (Combined Heat and Power) unit (gas turbine) maintains a minimum output state. During this period, the heat storage tank absorbs and stores the excess thermal power in the system. This strategy not only improves energy utilization efficiency but also prepares for the subsequent heat supply demand during peak periods.
As illustrated in
Figure 11, the thermal energy stored during off-peak periods enables collaborative heat supply operation among the thermal storage tank, CHP unit, and gas-fired boiler during daytime peak hours (07:00–09:00). During the 17:00–21:00 period, when the heat output from the fully loaded CHP unit becomes insufficient to meet the total system heat load demand, the gas-fired boiler automatically activates based on thermal balance constraints to compensate for the supply shortage, ensuring complete fulfillment of heat load requirements. This establishes a multi-source collaborative heat supply mode characterized by “CHP-based baseload supply + thermal storage for peak shaving + gas boiler for emergency backup”, which effectively accommodates heat load fluctuations while maintaining system stability.
This coordinated multi-source operation strategy optimally balances the thermal power output within the system and significantly enhances the stability and reliability of heat supply. Additionally, the operational strategy of the CHP unit demonstrates strong coupling with heat load demand; through flexible output adjustment, it simultaneously satisfies thermal load requirements while avoiding energy waste.
Scenario 4: Economic operation results considering the ensemble learning model (integrated total load of wind, PV, and load), heat forecasting results, and the IL mechanism.
The operating results of the power supply equipment, including wind and solar generation, under the IL mechanism are shown in Figure 12.
With the implementation of the IL incentive mechanism, comparative analysis reveals a pronounced reduction in electrical load during the evening peak period from 20:00 to 21:00. This demonstrates that the incentive mechanism effectively curtails electricity consumption by pre-contracted users during critical peak hours, thereby flattening the overall load profile and improving its morphology. Meanwhile, in alignment with the power dispatch objective of minimizing electricity purchases from the grid, the gas turbine operates predominantly at full capacity. The system prioritizes utilizing the thermal output from the gas turbine to meet heat load demand, supplemented by thermal energy from the heat storage tank, thereby achieving effective peak shaving and valley filling for the thermal load through the coordinated use of thermal storage equipment.
The economic evaluation is favorable. As summarized in Table 7, the total operating cost decreases from 1,464,678.01 CNY to 1,449,674.26 CNY, a saving of 15,003.75 CNY (1.02%). The key insight is the cost–benefit structure: after a compensation payout of 7698.00 CNY, the system retains a net economic benefit of 7305.75 CNY, meaning that every 1 CNY spent on compensation returns approximately 1.95 CNY in operating cost savings.
Table 7 also shows that the total interruption capacity reached 9700 kW, which corresponds to a net economic contribution of about 0.75 CNY per kW of interrupted capacity and provides a useful benchmark for future planning and contract design with agricultural users. These results collectively support the core premise of the closed-loop “prediction–warning–dispatch” framework: intelligent, forecast-informed demand response, optimized via PSO, can improve economic efficiency and operational security simultaneously in agricultural integrated energy systems.
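The cost–benefit figures quoted for Scenarios 2 and 4 can be reproduced with a few lines of arithmetic. The sketch below uses only the totals reported in the text; the function name and dictionary keys are illustrative, and the net benefit is defined, as in the text, as the operating cost saving minus the compensation paid.

```python
from typing import Dict, Optional


def il_cost_benefit(cost_without_il: float, cost_with_il: float, compensation: float,
                    curtailed_kw: Optional[float] = None) -> Dict[str, float]:
    """Derive saving, saving rate, net benefit, and per-unit ratios (CNY)."""
    saving = cost_without_il - cost_with_il
    result = {
        "saving": saving,
        "saving_pct": 100.0 * saving / cost_without_il,
        "net_benefit": saving - compensation,
        "saving_per_cny": saving / compensation,
    }
    if curtailed_kw is not None:
        result["net_benefit_per_kw"] = (saving - compensation) / curtailed_kw
    return result


scenario_2 = il_cost_benefit(1_500_299.42, 1_495_592.07, 4_430.00)
scenario_4 = il_cost_benefit(1_464_678.01, 1_449_674.26, 7_698.00, curtailed_kw=9_700)
for name, metrics in (("Scenario 2", scenario_2), ("Scenario 4", scenario_4)):
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```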