Prediction of Residential Load Adjustable Capacity Considering User Profile Heterogeneity

Hu, Yi; Xu, Han; Han, Run; Li, Yuansheng; Long, Yang

doi:10.3390/su18136498

by Yi Hu ¹, Han Xu ¹,*, Run Han ², Yuansheng Li ² and Yang Long ³

¹ School of Electrical Engineering and Electronic Information, Xihua University, Chengdu 610039, China

² State Grid Ningxia Electric Power Co., Ltd. Yinchuan Power Supply Company, Yinchuan 750011, China

³ School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(13), 6498; https://doi.org/10.3390/su18136498 (registering DOI)

Submission received: 27 May 2026 / Revised: 21 June 2026 / Accepted: 23 June 2026 / Published: 25 June 2026

(This article belongs to the Topic AI-Enabled Operation and Control of Modern Power and Energy Systems)

Abstract

To address the issues of neglecting population heterogeneity and the difficulties in determining constraint parameters in residential load adjustable capacity forecasting, this paper proposes a data-driven forecasting method that considers profile heterogeneity. First, K-means++ is utilized to extract diverse user electricity consumption profiles. Second, to solve the problem of real response data scarcity, the difference-in-differences (DID) method is employed to empirically calibrate the true physical constraint boundaries of different clusters, and high-quality response samples are generated in batches based on an electricity cost minimization model. Finally, a Long Short-Term Memory (LSTM) time-series forecasting model is constructed to achieve the precise quantitative evaluation of adjustable capacity. Case studies demonstrate that after introducing user profile labels, the three accuracy metrics of the predictive model are improved by 16.29%, 24.52%, and 20.21%, respectively. Although the practical application of synthetic labels faces minor limitations caused by uncertain user behaviors, this scalable framework supports seamless incremental retraining using future empirical response data to realize continuous model evolution and persistent accuracy improvement, thereby providing technical support for load aggregators’ market bidding and the precise dispatch of power grid demand response.

Keywords: residential load adjustable capacity; user profile heterogeneity; K-means++; LSTM; residential flexible load

1. Introduction

The global energy transition is accelerating the evolution of power systems towards a high penetration of renewable energy. However, the intermittency and volatility of wind and solar power pose tremendous challenges to the supply–demand balance of the system [1,2,3,4]. Against the backdrop of the paradigm shift from “source following load” to “source-load interaction,” residential load has emerged as the largest and most widely distributed flexible resource [5,6,7]. According to statistics from the National Energy Administration, the domestic electricity consumption of urban and rural residents in China exceeded 1.588 trillion kWh in 2025. Although it accounts for only about 15% of the total electricity consumption across society, residential consumption behaviors exhibit high spatiotemporal concentration. Particularly during the evening peak, residential loads are highly prone to superimposing on the system’s peak load, imposing immense peak-shaving pressure on the power grid. Fully unlocking the flexible regulation potential of these massive resources will significantly enhance the operational efficiency of the novel power system, facilitate the efficient accommodation of renewable energy, reduce carbon emissions from fossil fuel generation, and provide crucial support for the green and low-carbon development of the power industry, thereby enhancing its overall sustainability.

Accurately predicting residential adjustable capacity is crucial for both market and operational decision-making. Load aggregators and virtual power plants (VPPs) require reliable flexibility estimations prior to submitting bids to energy or ancillary service markets [8,9,10]. Overestimation may lead to energy imbalance penalties and delivery failures, whereas underestimation can leave valuable demand-side resources underutilized. At the distribution system level, dispatchable residential flexibility can only be used effectively once its magnitude, its timing, and its profile-dependent uncertainty are well understood.

Methods for assessing demand response (DR) potential can be broadly categorized into physics-based optimization approaches and data-driven forecasting methods. Physics-driven models construct device-level load models based on the physical characteristics of appliances and energy conservation principles. Despite their clear physical interpretability, they suffer from inherent drawbacks such as severe parameter dependency and difficulties in large-scale applications. They require detailed parameters and behavioral habits for all appliances within each household, leading to prohibitively high data acquisition costs [11,12,13,14]. Furthermore, once fixed, model parameters are difficult to update, failing to capture the dynamic evolution of user response behaviors driven by time, electricity prices, and policies. Conversely, data-driven models, including transfer learning and deep learning [15,16,17,18,19], can learn nonlinear mapping relationships from historical data, price signals, and response outcomes. However, these models typically rely on high-quality response labels, which are exceedingly scarce in the residential DR domain. Although adopting optimization-generated labels as a “cold-start” strategy can effectively break this data barrier, the inherent gap between theoretical optimization and uncertain actual human behaviors inevitably introduces errors into practical forecasting. Moreover, lacking explicit modeling of the physical boundaries of DR, these models cannot perceive the rigid constraints of user consumption and the operational limits of appliances, potentially yielding physically unrealistic predictions. In addition, many data-driven approaches fail to adequately incorporate user behavioral heterogeneity into the model inputs or the response sample generation process.

User profiling offers a practical approach for characterizing user heterogeneity from smart meter data. Clustering-based profiling technologies have been widely applied in smart meter data analysis and residential load pattern recognition [20,21,22,23]. Currently, however, user profiles are predominantly utilized for descriptive analysis, and their integration with flexible constraint sample generation and time-series forecasting technologies remains underexplored. To bridge this research gap, this paper proposes a profile-aware framework for residential adjustable capacity forecasting. The main contributions are summarized as follows:

Breaking the traditional limitation of utilizing user profiles merely for descriptive classification, this study deeply embeds user heterogeneity information into the entire pipeline of constraint extraction, sample generation, and adjustable capacity forecasting, thereby transforming clustered profiles from isolated preliminary results into core inputs for all stages of demand response modeling.
A profile-constrained demand response sample pool generation method is established. By combining user profile constraints with a mathematical optimization model to generate large-scale and high-quality training samples, it provides a reliable data foundation featuring both heterogeneity and rationality for data-driven models.
A profile-aware LSTM model is constructed to predict response load curves and adjustable capacity, utilizing baseline load, price signals, and profile labels as inputs.

2. Residential User Profiling

2.1. Load Feature Extraction

Load features directly reflect the overall electricity consumption scale and basic electricity demand of users [24]. This paper constructs a load feature set comprising 7 physically meaningful indicators, which are summarized in Table 1.

Let P_{i, d, t} denote the power consumption of household i on day d and time interval t, where t = 1, …, T. For the half-hourly data used in this study, T = 48. Let D_{i} be the set of valid days for household i, T_{night} be the set of evening peak time intervals, D_{work} be the weekday set, and D_{weekend} be the weekend set.

2.2. K-Means++ Clustering

To extract typical residential electricity consumption behavior patterns, K-means++ is adopted for clustering analysis [24,25,26]. This algorithm uses a distance-based probability sampling strategy so that the initial clustering centers tend to be well dispersed in space, which reduces the risk of converging to a poor local optimum and improves the convergence speed and result stability of load clustering.

Let the historical load dataset be denoted as X = {x_{1}, x_{2}, \dots, x_{n}}, where x_{i} represents the load curve of the i-th user, and let the target number of typical load patterns be k. The mathematical modeling and specific steps of the K-means++ algorithm for initializing clustering centers are as follows:

Step 1: Select one load curve from the load dataset uniformly at random as the first clustering center, and add it to the selected center set C.

Step 2: For each sample point in the dataset, calculate the Euclidean distance between it and the nearest clustering center in the current selected center set C, denoted as D (x_{i}). Its mathematical expression is as follows:

D (x_{i}) = \min_{c \in C} {‖x_{i} - c‖}_{2}

(1)

Step 3: Calculate the probability that each load sample in the dataset is selected as the next clustering center. This probability is proportional to the square of its distance to the nearest center. The probability distribution formula is defined as follows:

P (x_{i}) = \frac{D {(x_{i})}^{2}}{\sum_{x_{j} \in X} D {(x_{j})}^{2}}

(2)

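Step 4: Draw the next clustering center according to the probability distribution in Equation (2), add it to the set C, and repeat Step 2 and Step 3 until C contains k clustering centers.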

Step 5: Take these k highly representative load curves as the determined initial clustering centers, and then start the K-means alternating optimization process until the clustering results converge.
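The seeding procedure in Steps 1 to 4 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation; the function name `kmeanspp_init` and the `seed` argument are assumptions introduced here.

```python
import math
import random


def kmeanspp_init(X, k, seed=0):
    """Pick k initial centers from X (a list of equal-length lists) by K-means++ seeding.

    Hypothetical helper for illustration; the name and signature are not from the paper.
    """
    rng = random.Random(seed)
    # Step 1: the first center is drawn uniformly at random.
    centers = [list(X[rng.randrange(len(X))])]
    while len(centers) < k:
        # Step 2: squared Euclidean distance to the nearest chosen center, D(x_i)^2.
        d2 = [min(math.dist(x, c) ** 2 for c in centers) for x in X]
        if sum(d2) == 0.0:
            # Every point coincides with a chosen center; fall back to uniform sampling.
            centers.append(list(X[rng.randrange(len(X))]))
            continue
        # Step 3: sample an index with probability D(x_i)^2 / sum_j D(x_j)^2 (Eq. 2).
        idx = rng.choices(range(len(X)), weights=d2, k=1)[0]
        centers.append(list(X[idx]))
    # Step 4 is the loop itself: it stops once k centers have been collected.
    return centers
```

Because a point that coincides with an already chosen center has zero weight, it is never drawn again, which is what pushes the initial centers apart.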

2.3. Construction Process of User Profile System

To accurately capture the heterogeneity of residential users’ electricity consumption behaviors, this paper constructs a user profile system driven by historical load data. Through multi-dimensional feature extraction and clustering analysis, the system aims to transform a large number of disordered load curves into group labels with typical response characteristics. The specific construction process is as follows:

1.: Standardization of load features

Since the dimensions and value ranges of the 7 load features differ significantly, direct clustering would allow features with large numerical ranges to dominate the clustering results. Therefore, this paper uses Z-score standardization to preprocess all features, rescaling each to a mean of 0 and a standard deviation of 1:

x^{'} = \frac{x - μ}{σ}

(3)

where x is the original feature value, μ is the mean of the feature in the control training set, σ is the standard deviation of the feature in the control training set, and x^{'} is the standardized feature value.
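A small sketch of Equation (3) is given below, with μ and σ fitted on a training set and then reused on other data. The helper names and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def fit_zscore(train_rows):
    """Return per-feature mean and standard deviation from the training rows."""
    columns = list(zip(*train_rows))
    mu = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation; the paper does not say which form is used.
    sigma = [statistics.pstdev(col) for col in columns]
    return mu, sigma


def apply_zscore(rows, mu, sigma):
    """Apply x' = (x - mu) / sigma feature by feature; constant features map to 0."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, mu, sigma)]
        for row in rows
    ]
```

Fitting the statistics once and reusing them keeps features from other households on the same scale as the data the clustering model was trained on.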

2.: Determination of the optimal number of clusters

To avoid the inherent limitations of a single clustering validity indicator and ensure the objectivity and reliability of the selection of the optimal number of clusters, this paper selects four clustering validity indicators, namely the elbow method, the silhouette coefficient, the Calinski–Harabasz (CH) index and the Davies–Bouldin (DB) index, to conduct a comprehensive quantitative evaluation of the candidate numbers of clusters and determine the optimal one.

3.: Clustering model training and profile label assignment

Based on the standardized full feature matrix and the optimal number of clusters, the K-means++ algorithm is used to train the clustering model, with the number of iterations set to 1000 and convergence declared when the maximum change in the clustering centers falls below 10⁻⁶. After the model training is completed, based on the feature differences among the k clustering centers and combined with the typical electricity consumption behavior rules of residents, each cluster is assigned a user profile label with clear physical meaning.

3. Prediction of Load Adjustable Capacity Considering User Profiles

3.1. Profile-Constrained Demand Response Sample Generation

3.1.1. Objective Function

The objective of the model is to minimize the daily electricity cost of users, and the calculation formula is as follows:

\min F = \sum_{t = 1}^{T} c_{t} \cdot P_{DR} (t) \cdot Δ t

(4)

where c_{t} is the electricity price at time t; P_{DR} (t) is the optimized electricity power of the user at time t, written as P_{opt} (t) in constraints (5)–(7); and Δ t is the length of the response time interval.

3.1.2. Constraint Conditions

To ensure that the demand response optimization results conform to the real physical conditions and behavioral elasticity of users, this paper constructs a constraint system including electricity conservation, physical upper and lower limits and profile elasticity boundaries on the basis of the traditional electricity consumption optimization model.

1.: Daily energy conservation constraint

Demand response is mainly realized through load shifting. To ensure the basic electricity demand of users, it is assumed that the total daily electricity consumption remains unchanged before and after the response.

\sum_{t = 1}^{T} P_{opt} (t) = \sum_{t = 1}^{T} P_{base} (t)

(5)

where P_{base} (t) is the original baseline load of the user at time t.

2.: Physical equipment upper and lower limit constraints

The optimized load cannot exceed the physical capacity upper limit of the household electricity meter, and cannot be lower than the rigid load that maintains the basic operation of the family.

P_{\min} \leq P_{opt} (t) \leq P_{\max}

(6)

where P_{\max} is the historical maximum electricity power of the user and P_{\min} is the historical minimum load of the user.

3.: Maximum adjustable capacity constraint of profiles

The maximum peak shaving ratio boundary for different clusters is introduced to constrain the load reduction depth of each profile user during peak hours.

P_{base} (t) - P_{opt} (t) \leq α_{k}^{\max} \cdot P_{base} (t), \forall t \in T_{peak}

(7)

where α_{k}^{\max} is the maximum adjustable elasticity coefficient of the k-th profile group under a specific time-of-use electricity price, which is directly calculated from the core load characteristics of the profile. Its calculation formula is as follows:

α_{k}^{\max} = γ \cdot \frac{Δ {\bar{p}}^{k}}{{\bar{p}}_{avg}^{k}} \cdot {\bar{p}}_{night}^{k}

(8)

where γ represents the global calibration coefficient, which is introduced to correct the discrepancy between the theoretical elasticity and the practically realizable elasticity. In this study, this coefficient is inversely derived from real-world electricity price regulation data using the DID method and is calibrated at 0.0546.
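To make the optimization in Equations (4)–(7) concrete, the sketch below solves one day for one user. It is not the Gurobi model used in the paper: because the problem has a linear cost, a single energy-balance equality and simple per-slot bounds, a cheapest-first filling rule reaches the same optimum, and that swapped-in technique is what is shown. The function name and all numeric inputs are hypothetical.

```python
def min_cost_response(p_base, price, peak_slots, alpha_max, p_min, p_max):
    """Minimize sum(price[t] * p[t]) subject to Eqs. (5)-(7) by cheapest-first filling.

    The constant interval length (delta t) scales the whole objective and is omitted.
    """
    T = len(p_base)
    # Lower bound per slot: rigid load everywhere (Eq. 6), and in peak slots the
    # reduction is capped at alpha_max * p_base[t] (Eq. 7).
    lower = [
        max(p_min, (1.0 - alpha_max) * p_base[t]) if t in peak_slots else p_min
        for t in range(T)
    ]
    energy = sum(p_base)  # Eq. (5): daily energy is unchanged by the response
    remaining = energy - sum(lower)
    if remaining < -1e-9 or energy > T * p_max + 1e-9:
        raise ValueError("constraints are infeasible for this day")
    remaining = max(remaining, 0.0)
    p_opt = list(lower)
    # Place the remaining energy in the cheapest slots first, up to p_max (Eq. 6).
    for t in sorted(range(T), key=lambda s: price[s]):
        add = min(p_max - p_opt[t], remaining)
        p_opt[t] += add
        remaining -= add
    return p_opt
```

For a four-slot day with baseline `[1.0, 1.0, 2.0, 2.0]`, prices `[0.3, 0.3, 0.8, 0.8]`, the last two slots as peak and an illustrative `alpha_max` of 0.25, the peak slots drop to 1.5 each and the shifted energy lands in the cheap slots.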

3.2. Calculation of Adjustable Capacity

To quantify the actual load reduction of users, it is necessary to accurately estimate the electricity consumption trajectory that users would follow in the non-response state, that is, the baseline load [27,28]. There are many existing baseline estimation strategies. To avoid additional uncertainty introduced by external environmental factors and to focus strictly on the time-domain evolution characteristics of the load sequence itself, this paper adopts the High X of Y method for baseline characterization [29]. Its specific mathematical expression is as follows:

P_{base} (t) = \frac{1}{X} \sum_{d \in D_{X}} P_{d} (t)

(9)

where X denotes the number of selected high-load sample days; D_{X} represents the set of the top X days with the highest total load selected from Y valid candidate reference days; and P_{d} (t) stands for the measured power of the d-th historical reference day within the set at time t. In this study, the “High 4 of 5” baseline method, which aligns well with residential load characteristics, is adopted for parameter configuration.

Considering the natural difference in residents’ daily routines between weekdays and weekends, their load curves usually show different characteristics. When applying the High X of Y method to weekend load data, Y valid reference days are drawn from historical weekends, and the average of the top X days with the highest load level is likewise used for calculation.

In summary, the adjustable capacity of the user can be calculated as follows:

P_{adj} (t) = P_{base} (t) - P_{DR} (t)

(10)
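Equations (9) and (10) can be illustrated with a short sketch. It assumes that the Y candidate days are the most recent valid days of the same day type (weekday or weekend), which the text does not state explicitly; the function names are hypothetical.

```python
def high_x_of_y_baseline(days, x=4, y=5):
    """Average the x highest-energy days among the last y candidate days (Eq. 9).

    `days` is a list of daily load curves of the same day type, most recent last.
    """
    candidates = days[-y:]
    if len(candidates) < x:
        raise ValueError("not enough valid reference days")
    # Rank candidate days by total daily load and keep the top x.
    top = sorted(candidates, key=sum, reverse=True)[:x]
    n_slots = len(top[0])
    return [sum(day[t] for day in top) / x for t in range(n_slots)]


def adjustable_capacity(p_base, p_dr):
    """Eq. (10): baseline load minus response load, slot by slot."""
    return [b - r for b, r in zip(p_base, p_dr)]
```

With the default arguments this corresponds to the “High 4 of 5” configuration; positive values of the returned adjustable capacity indicate a load reduction relative to the baseline.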

3.3. LSTM-Based Adjustable Capacity Prediction Network

3.3.1. Principle of LSTM

Considering that the daily load curve and time-of-use electricity price signal of residential users have strong time-series dependence, and the response transfer in different electricity price periods shows significant forward and backward memory and long-term evolution characteristics, this paper introduces a Long Short-Term Memory neural network for deep feature learning and prediction. LSTM is a neural network model specially designed to process sequence data with complex time spans [30,31,32,33], which is mainly composed of three gating mechanisms: forget gate, input gate and output gate.

1.: Forget gate

As the core unit controlling the retention degree of historical information, the forget gate determines which useless historical load or electricity price information to discard by reading the hidden state of the previous moment and the input feature of the current moment. Its mathematical expression is as follows:

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(11)

where f_{t} is the output vector of the forget gate; h_{t - 1} is the hidden state of the previous moment; x_{t} is the input sequence data of the current moment; W_{f} and b_{f} are the weight matrix and bias term of the forget gate, respectively; and σ is the Sigmoid activation function.

2.: Input gate and state update

To write new key response features into the model, the input gate determines the update ratio of the input information at the current moment and generates the candidate memory cell state. The two are combined with the previous moment cell state processed by the forget gate to complete the update of the current global cell state, thus retaining the most significant long-term demand elasticity characteristics. The calculation formulas are as follows:

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(12)

{\tilde{C}}_{t} = \tanh (W_{c} \cdot [h_{t - 1}, x_{t}] + b_{c})

(13)

C_{t} = f_{t} ⊙ C_{t - 1} + i_{t} ⊙ {\tilde{C}}_{t}

(14)

where i_{t} is the output vector of the input gate; {\tilde{C}}_{t} is the candidate memory cell state generated at the current moment; W_{i}, b_{i}, W_{c} and b_{c} are the corresponding weight matrices and bias terms; \tanh is the hyperbolic tangent activation function; C_{t} and C_{t - 1} are the memory cell states of the current moment and the previous moment, respectively; and ⊙ represents the Hadamard product.

3.: Output gate and fully connected layer

The output gate determines how much of the updated memory cell state is exposed as the hidden state at the current moment. The high-dimensional abstract time-series features extracted by the LSTM layer are then fed to the fully connected layer, which integrates the extracted global time-series response information and directly outputs the predicted response load curve. The relevant calculation formulas are as follows:

o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(15)

h_{t} = o_{t} ⊙ \tanh (C_{t})

(16)

Y = W_{y} \cdot H + b_{y}

(17)

where o_{t} is the control vector of the output gate; h_{t} is the hidden state output of the current moment; W_{o} and b_{o} are the weight matrix and bias of the output gate; Y is the final prediction output sequence of the fully connected layer; H is the flattened time-series hidden feature vector output by the LSTM layer; and W_{y} and b_{y} are the weight matrix and bias vector of the fully connected layer, respectively.
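The gate equations (11)–(16) amount to one recurrent update per time step. The following standard-library sketch spells out a single step; it is meant only to mirror the formulas, and the `params` dictionary layout and helper names are assumptions rather than the paper's code.

```python
import math


def _sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def _affine(W, z, b):
    """Compute W . z + b for a weight matrix given as a list of rows."""
    return [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; `params` holds W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = _sigmoid(_affine(params["W_f"], z, params["b_f"]))              # Eq. (11)
    i = _sigmoid(_affine(params["W_i"], z, params["b_i"]))              # Eq. (12)
    c_tilde = [math.tanh(v) for v in _affine(params["W_c"], z, params["b_c"])]  # Eq. (13)
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, c_tilde)]    # Eq. (14)
    o = _sigmoid(_affine(params["W_o"], z, params["b_o"]))              # Eq. (15)
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]                    # Eq. (16)
    return h, c
```

The fully connected layer of Equation (17) is the same affine map, `_affine(W_y, H, b_y)`, applied to the flattened hidden features.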

3.3.2. Prediction Process of Adjustable Capacity

Based on the above discussion, the prediction method of residential adjustable capacity considering user profile heterogeneity proposed in this paper can be mainly divided into three core parts: first, the construction and division of user profiles based on load features; second, the construction of user sample pool based on demand response; third, the prediction of adjustable capacity based on an LSTM deep learning network. The specific process is shown in Figure 1.

4. Experimental Results and Analysis

This paper employs a residential load dataset from a region in Northwest China, comprising half-hourly electricity consumption data of 2207 households from January 2024 to June 2025. Specifically, from 1 January to 31 May 2024, all users were subject to a uniform flat-rate tariff. From 1 June 2024, to 1 June 2025, the test groups (Groups A, B, and C) were subject to Time-of-Use (ToU) tariffs, while the control group (Group D) continued under the flat-rate tariff. The ToU pricing schemes in this dataset reflect actual implemented pricing mechanisms within the Northwest China regional power grid, including residential ToU tariffs, critical peak pricing, and flat-rate discounted tariffs. These schemes comprehensively cover the mainstream price incentives of current demand response programs in China. The specific time period divisions and corresponding prices are detailed in Table 2.
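The tariff design described above, with a common flat-rate period followed by ToU pricing for the test groups only, is the setting in which a difference-in-differences comparison applies. The sketch below shows the basic two-group, two-period estimator on mean peak-hour load; it is a generic illustration with a hypothetical function name, and the paper's actual calibration of γ may involve further steps that are not described in this section.

```python
from statistics import fmean


def did_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Two-group, two-period difference-in-differences on mean values.

    Each argument is a list of observations (e.g. mean evening-peak load in kW).
    A negative result means the treated group reduced load relative to the control trend.
    """
    treated_change = fmean(treat_post) - fmean(treat_pre)
    control_change = fmean(ctrl_post) - fmean(ctrl_pre)
    return treated_change - control_change
```

Subtracting the control group's change removes shifts that affect everyone, such as seasonal demand growth, so that the remainder can be attributed to the price signal under the usual parallel-trends assumption.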

4.1. Construction of User Profiles

To capture the authentic electricity consumption behaviors of the users, the actual historical load profiles of 518 households from Control Group E, which was unaffected by price interventions, were selected as the baseline.

1.: Correlation coefficient of load features

To verify the independence and information dimension coverage of the 7 constructed load feature indicators, this paper calculates the Pearson correlation coefficient between each feature, and the results are shown in Figure 2.

It can be seen from Figure 2 that there is a strong positive correlation between the load feature indicators representing electricity consumption scale, which conforms to the physical law that the larger the basic electricity consumption scale, the higher the extreme value. Overall, this feature set separates the scale attribute of users from their behavioral elasticity, and can comprehensively characterize the electricity consumption heterogeneity of groups with limited redundancy.

2.: Selection of the number of clusters

Based on the four clustering validity indicators mentioned above, the clustering effects of k = 3, 4, 5, 6, 7, 8 are quantitatively evaluated, where the weight of the silhouette coefficient is 0.4, and the weights of the CH index and DB index are both 0.3. The calculation results of each indicator and the comprehensive score are shown in Table 3.

The experimental results show that when k = 6, the within-cluster sum of squares (WCSS) curve has an inflection point, and the silhouette coefficient, CH index and DB index all reach their optimal values with the highest weighted comprehensive score. Therefore, this paper determines the optimal number of clusters as 6.
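The weighted comprehensive score can be sketched as below, using the stated weights of 0.4, 0.3 and 0.3. How the three indicators are brought onto a common scale is not specified in the text, so the min-max normalization, the inversion of the DB index (where lower is better), and the function names are all assumptions made for illustration.

```python
def minmax(values, higher_is_better=True):
    """Scale values to [0, 1]; flip the scale when lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def composite_scores(silhouette, ch, db, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of normalized silhouette, CH and (inverted) DB scores per candidate k."""
    s = minmax(silhouette)
    c = minmax(ch)
    d = minmax(db, higher_is_better=False)
    w_s, w_c, w_d = weights
    return [w_s * a + w_c * b + w_d * e for a, b, e in zip(s, c, d)]
```

The candidate k with the largest composite score is then selected, with the elbow of the WCSS curve serving as a separate visual check.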

3.: User profile labels

Based on the optimal number of clusters, the 6 clustering centers obtained from training and the core load characteristics of users in each cluster are statistically analyzed, and the results are shown in Table 4. By comparing the load characteristics of different clusters and combining with residential electricity consumption behaviors, user profile labels are assigned to each cluster.

It can be seen from Table 4 that the electricity consumption behaviors of the six types of user groups have significant heterogeneity, and the six profile labels can accurately distinguish the differences in residential users’ electricity consumption behaviors.

4.: Visual verification of clustering results

To further intuitively show the discrimination of clustering results, principal component analysis is used to reduce the 7-dimensional load features to a 3-dimensional feature space, and the clustering results are drawn as shown in Figure 3.

It can be clearly seen from Figure 3 that the six profile groups show significant internal aggregation and clear inter-cluster boundaries in space, indicating that the user profile system constructed in this paper can effectively distinguish user groups with different electricity consumption behaviors.

To more intuitively show the differences in electricity consumption patterns of each group, the average daily load curves of users with different profiles are drawn as shown in Figure 4.

It can be clearly seen from Figure 4 that the six types of users present completely different electricity consumption characteristics in the 24 h dimension. The summary is shown in Table 5.

The above group differences indicate that if a unified model is used to predict the adjustable capacity of all users, the regular deviation caused by profile heterogeneity will be ignored. Therefore, it is necessary to introduce profile label features into the model to improve prediction accuracy and physical rationality.

4.2. Robustness Validation of the User Profiling System

To ensure the proposed user profiling system possesses reliable classification capabilities and to mitigate the impacts of algorithmic randomness and temporal volatility on clustering outcomes, this study conducts systematic validation experiments from two dimensions: random seeds and temporal variations. These experiments quantify the consistency and stability of the clustering results under varying conditions.

1.: Validation of Clustering Stability Under Different Random Seeds

Although the K-means++ algorithm optimizes the initial center selection through probabilistic sampling, minor random fluctuations may still exist. To verify the reproducibility of the clustering results, 100 different random seeds are used to repeatedly execute the K-means++ clustering process on the complete dataset. The Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI) are employed to measure the consistency between any two clustering outcomes. NMI ranges over [0, 1], while ARI is bounded above by 1 and can take negative values when agreement is worse than chance; for both metrics, a value closer to 1 indicates a higher degree of clustering consistency. The results are presented in Table 6.

The experimental results demonstrate that the mean values of ARI and NMI over the 100 repeated experiments both approach the theoretical maximum of 1.0. Even in the worst-case scenario, the minimum ARI still reaches 0.9888, and the minimum NMI reaches 0.9833. This indicates that the probabilistic sampling initial center strategy of the K-means++ algorithm performs exceptionally well in this study. The clustering results are almost unaffected by the selection of initial random seeds, demonstrating the extremely strong robustness of the algorithm.
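For reference, the ARI between two label assignments can be computed from pair counts, as in the following standard-library sketch. It follows the usual contingency-table definition of the index and is not taken from the paper; the function name is chosen here for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same samples."""
    n = len(labels_a)
    # Count pairs that fall together in both clusterings, and in each one separately.
    joint = Counter(zip(labels_a, labels_b))
    sum_joint = sum(comb(v, 2) for v in joint.values())
    sum_a = sum(comb(v, 2) for v in Counter(labels_a).values())
    sum_b = sum(comb(v, 2) for v in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings put everything in one cluster
    return (sum_joint - expected) / (max_index - expected)
```

The index depends only on which samples are grouped together, so two runs that produce the same partition under different cluster numbering still score 1.0.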

2.: Validation of Clustering Stability Across Different Seasons

Residential electricity consumption behavior is significantly influenced by seasonal factors such as temperature, sunshine duration, and living habits. To verify the applicability of the user profile system across different seasons, the complete dataset is divided into four independent subsets according to natural seasons: Spring (March to May), Summer (June to August), Autumn (September to November), and Winter (December to February). The ARI and NMI metrics between the clustering results of each season and the annual baseline clustering results are then calculated. The results are presented in Table 7.

The experimental results exhibit a pattern of “higher in autumn and winter, lower in spring and summer,” which closely aligns with the climatic characteristics of the inland northwest region of China. Seasonal factors substantially change the shape of users’ daily load curves. Nevertheless, the ARI and NMI metrics for all seasons remain higher than random levels, indicating that the core electricity consumption patterns of users retain strong internal consistency despite seasonal variations.

3.: Validation of Clustering Stability Across Different Time Periods

There are significant differences in residents’ daily routines between weekdays and weekends. To verify the stability of the user profile system across different time periods within a week, the dataset is further divided into two independent subsets: weekdays and weekends. The consistency metrics between these subsets and the annual baseline clustering results are then calculated. The results are presented in Table 8.

The experimental results indicate that the consistency between the weekday clustering results and the annual baseline is significantly higher than that of the weekends, which is consistent with the more regular routines of residents on weekdays. On weekends, users have more flexible schedules, leading to a slight decrease in clustering consistency; nonetheless, core characteristics such as their basic electricity consumption scale and load volatility remain stable.

4.3. Construction of Sample Pool

The historical load curves and electricity price signals are input into the optimization model, and the optimal response curves of each group are solved in batches with the goal of minimizing the user’s daily electricity cost. These curves serve as the training labels for deep learning. The optimization is solved with the Gurobi 13.0.1 solver in a Python 3.9 environment on a machine with an Intel Core i7-9750H and 16 GB RAM. The results are shown in Figure 5.

4.4. Training of Deep Learning Model

After constructing the demand response sample pool, the generated sample set is divided into 70% training set and 30% test set. The input feature matrix of the model includes the user’s historical baseline load, time-of-use electricity price signal and user profile label, and the output is the ideal response load curve generated by the Gurobi solver.

To verify the capability of different network architectures in capturing the implicit mapping relationships between load and price, three mainstream deep learning models—namely, the One-Dimensional Convolutional Neural Network (1D-CNN), the Gated Recurrent Unit (GRU), and the proposed LSTM—are selected for benchmark comparison. Furthermore, three evaluation metrics—Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE)—are employed to assess the predictive performance of the adjustable capacity forecasting model.
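The three evaluation metrics follow their standard definitions, sketched below together with the relative-reduction calculation used when comparing models. The function names are chosen here for illustration, and MAPE assumes that no actual value is zero.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error
```

Since MAPE divides by the actual value, it becomes unstable for time slots where the load is close to zero, which is worth keeping in mind when reading it alongside MAE and RMSE.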

To ensure a fair comparison, uniform training hyperparameter configurations are applied across all deep learning models. Specifically, the batch size is set to 128, the Adam optimizer is employed for network weight updates, the initial learning rate is configured at 0.001, and the maximum number of training epochs is capped at 500. Furthermore, an early stopping mechanism utilizing the validation loss as the monitoring metric is incorporated.

Regarding the specific network architecture design, the 1D-CNN model is constructed with three one-dimensional convolutional layers, with kernel sizes of 5, 5, and 3, and output channels of 64, 128, and 128, respectively. A Batch Normalization layer and a Dropout regularization operation with a dropout rate of 0.2 are cascaded after each convolutional layer, and the results are ultimately output through two fully connected layers with 128 and 64 nodes, respectively, along with the ReLU activation function. For both the LSTM and GRU recurrent neural networks, a 2-layer stacked recurrent structure is adopted, where the input feature dimension is 2 and the hidden state dimension is uniformly set to 128. Ultimately, a single fully connected layer is utilized in both models to map the high-dimensional time-series features into continuous load predicted values.

The global prediction and accuracy results of the respective models on the identical test set are illustrated in Figure 6 and Figure 7.

Figure 6 and Figure 7 show that the LSTM model achieves better prediction performance on the test set than the other two benchmark models. Specifically, the RMSE of the LSTM model is 8.83% and 18.37% lower than that of the 1D-CNN and GRU models, respectively; the MAE is 7.2% and 55.2% lower, respectively; and the MAPE is 20.89% and 60.52% lower, respectively.

To further demonstrate the superiority of the proposed heterogeneity-handling method based on user profiling, a benchmark comparison is conducted against commonly used heterogeneity-aware approaches, namely K-means + SVM and K-means + XGBoost. Similarly, three evaluation metrics—MAE, RMSE, and MAPE—are employed to assess the predictive performance of each model. The experimental results are tabulated in Table 9.

As shown in Table 9, the proposed method outperforms both the K-means + SVM and K-means + XGBoost approaches across all three evaluation metrics, which supports its advantage in handling residential load time-series data.

4.5. Prediction and Analysis of Adjustable Capacity

Model Point Prediction Analysis

After the LSTM prediction model is trained, the heterogeneity of adjustable capacity across user profile groups is analyzed quantitatively. Six users, one from each profile group, are selected from the test set on a typical summer day. For each user, the historical baseline load of that day, the profile category, and three time-of-use electricity pricing schemes with different peak-valley price gradients are input into the trained LSTM model. The prediction results are shown in Figure 8.

Figure 8 shows that the evening peak load is reduced and shifted more strongly as the electricity price gradient increases. According to the predicted adjustable capacities in Table 10, the High-base continuous-consumption profile has the largest flexibility, with an adjustable capacity of 1.6018 kW under the Price C scheme, whereas the Low-base rigid-consumption profile has the smallest, only 0.2004 kW under the same scheme.

Interval Prediction and Uncertainty Analysis

To address the differentiated decision-making requirements across three major engineering scenarios—power grid dispatch, load aggregator bidding, and demand response (DR) potential assessment—this study constructs a Quantile Regression LSTM (QR-LSTM) interval prediction model based on the original LSTM point prediction framework. By selecting three quantiles (0.05, 0.5, and 0.95), the model synchronously outputs the upper and lower bounds of the 95% confidence interval (CI) alongside the median baseline predicted value. The model is trained utilizing the Pinball loss function. The uncertainty evaluation metrics of the model on the test set are presented in Table 11.
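The Pinball (quantile) loss mentioned above can be sketched in plain Python as follows. The standard definition is used; how the losses of the three quantiles are combined (summed here) is an assumption, since the text does not state it, and the function names are illustrative.

```python
QUANTILES = (0.05, 0.5, 0.95)  # quantile levels given in the text


def pinball_loss(actual, predicted, q):
    """Average pinball loss for quantile level q (0 < q < 1).

    Under-prediction (actual > predicted) is weighted by q and
    over-prediction by (1 - q).
    """
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        diff = y - y_hat
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(actual)


def multi_quantile_loss(actual, preds_by_quantile):
    """Assumed combination: sum of the pinball losses of all quantiles."""
    return sum(pinball_loss(actual, preds_by_quantile[q], q) for q in QUANTILES)


# Toy check: at q = 0.95, under-predicting by 1 kW costs far more
# than over-predicting by 1 kW.
print(pinball_loss([1.0], [0.0], 0.95), pinball_loss([0.0], [1.0], 0.95))
```

The asymmetry is what pushes the 0.05 output toward the lower edge of the response distribution and the 0.95 output toward the upper edge, while the 0.5 output reduces to half the absolute error.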

As shown in Table 11, the actual coverage rate of the 95% confidence interval (CI) is 91.21%, somewhat below the preset 95% confidence level but still encompassing the large majority of the actual adjustable capacity values. The mean interval width is only 0.0813 kW, avoiding excessively wide prediction intervals. The point prediction RMSEs at the three quantiles (0.0721 kW, 0.0657 kW, and 0.0789 kW) remain low, indicating that the model can meet the accuracy demands of the different engineering scenarios.
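The two interval metrics reported in Table 11 can be computed as in the following sketch. It assumes the usual definitions (share of actual values falling inside the interval, and average distance between the bounds); the function names and the toy values are illustrative and are not data from the study.

```python
def interval_coverage(actual, lower, upper):
    """Percentage of actual values that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return 100.0 * hits / len(actual)


def mean_interval_width(lower, upper):
    """Average width of the prediction interval, in the input unit (kW)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Toy values (kW): two of the four actual values lie inside the interval.
actual = [0.30, 0.50, 0.90, 0.20]
lower = [0.25, 0.45, 0.60, 0.25]
upper = [0.35, 0.55, 0.70, 0.35]
print(interval_coverage(actual, lower, upper), mean_interval_width(lower, upper))
```

The two metrics pull in opposite directions: widening the interval raises coverage, so a useful interval model is one that reaches the target coverage with a small mean width.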

Based on these quantitative findings, specific strategies are tailored: for the power grid dispatch scenario prioritizing safety and reliability, the conservative adjustable capacity at the 0.05 quantile is adopted as the baseline to avert over-dispatching risks from high point predictions. For the load aggregator bidding scenario aiming to ensure profitability while boosting bidding success rates, the neutral capacity at the 0.5 quantile serves as the reference. For the demand response potential assessment scenario seeking to fully grasp the maximum regional capacity, the optimistic capacity at the 0.95 quantile is set as the upper bound.
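The scenario-to-quantile assignment described above can be expressed as a small lookup. The scenario keys, the function name, and the example capacities are hypothetical; only the quantile levels and their assignment to scenarios come from the text.

```python
# Quantile used as the reference capacity in each engineering scenario.
SCENARIO_QUANTILE = {
    "grid_dispatch": 0.05,         # conservative baseline
    "aggregator_bidding": 0.5,     # neutral reference
    "potential_assessment": 0.95,  # optimistic upper bound
}


def reference_capacity(scenario, quantile_predictions):
    """Pick the predicted adjustable capacity (kW) for a scenario.

    `quantile_predictions` maps each quantile level to the QR-LSTM
    output for one user and one period.
    """
    return quantile_predictions[SCENARIO_QUANTILE[scenario]]


# Hypothetical quantile outputs for one user (kW).
example = {0.05: 1.21, 0.5: 1.45, 0.95: 1.68}
print(reference_capacity("grid_dispatch", example))
```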

4.6. Validation of the Necessity of Considering User Profile Heterogeneity and Analysis of Generalization Limitations

Validation of the Necessity of User Profile Heterogeneity

To quantify the value of user profile heterogeneity in adjustable capacity forecasting, the baseline-period load data of the test groups are used to conduct short-term rolling forecasts of the testing-period loads, and the forecasts are then compared against the actual loads during the testing period. An LSTM model without user profile labels serves as the baseline, and the forecasting accuracy comparison is presented in Table 12.

As shown in Table 12, the forecasting accuracy of the LSTM model drops markedly after the removal of user profile labels. Compared with this baseline model, the proposed model reduces the RMSE, MAE, and MAPE by 16.29%, 24.52%, and 20.21%, respectively. This indicates that user profile labels substantially enhance the ability of the time-series model to extract features of heterogeneous response behaviors.
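The reported improvements can be reproduced from the values in Table 12 as relative error reductions. The sketch below assumes that "improvement" means (baseline − proposed) / baseline, which matches the published percentages; the variable and function names are illustrative.

```python
# Values taken from Table 12.
BASELINE = {"RMSE": 0.10234, "MAE": 0.05478, "MAPE": 7.3862}
PROPOSED = {"RMSE": 0.08567, "MAE": 0.04135, "MAPE": 5.8932}


def relative_reduction(baseline, proposed):
    """Relative error reduction of the proposed model, in percent."""
    return 100.0 * (baseline - proposed) / baseline


improvements = {
    metric: round(relative_reduction(BASELINE[metric], PROPOSED[metric]), 2)
    for metric in BASELINE
}
print(improvements)  # {'RMSE': 16.29, 'MAE': 24.52, 'MAPE': 20.21}
```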

Limitations of the Synthetic Data-Based Model and Its Impact on Generalization Capability

This study utilizes synthetic samples generated via Gurobi optimization as the model training data, which effectively addresses the challenge of scarce real-world demand response labels. When applied to the real-world measurement dataset, the corresponding error metrics of the proposed model increased by 9.41%, 12.67%, and 15.12%, respectively, compared to its performance on the synthetic optimization test set.

The fundamental reason is that the synthetic samples strictly adhere to the ideal physical constraints of electricity cost minimization. Consequently, they cannot replicate the various non-ideal factors present in real-world grid scenarios, including the fluctuation of users’ subjective response willingness, random electricity consumption behaviors, and the influence of environmental factors. However, once a sufficient volume of real-world demand response measurement samples is accumulated in the future, the model can undergo seamless incremental retraining using these real samples without altering the underlying model architecture. With the increasing number of real samples and the continuous enrichment of user response behavior data, the model will progressively learn the non-ideal factors inherent in real-world scenarios, thereby achieving sustained improvements in both forecasting accuracy and generalization capability.

5. Conclusions

The prediction of residential adjustable capacity suffers from neglected group heterogeneity, subjectively set constraint parameters, a shortage of labeled real response samples, and the significant prediction deviation of the traditional fixed reduction coefficient method. To address these problems, this paper integrates K-means++ clustering with an LSTM time-series prediction model and constructs a prediction method for residential adjustable capacity that considers user profile heterogeneity. The main conclusions are as follows:

(1) Based on 7-dimensional load features, including the daily average load, peak-to-valley difference, and the proportion of evening peak load, the K-means++ algorithm is employed to divide residential users into 6 typical electricity consumption groups. The constructed user profile system can effectively characterize the heterogeneity of consumption behaviors, providing a foundation for differentiated forecasting.
(2) The proposed profile-specific constrained sample pool construction method generates high-quality response samples in batches with the objective of electricity cost minimization. This reduces the reliance of traditional models on empirically set constraint parameters and alleviates the shortage of response samples.
(3) The LSTM forecasting model integrated with user profile labels outperforms the commonly used heterogeneity-aware methods K-means + SVM and K-means + XGBoost in prediction accuracy. After the profile labels are introduced, the RMSE, MAE, and MAPE of the model are further improved by 16.29%, 24.52%, and 20.21%, respectively, validating the critical role of profile heterogeneity in enhancing prediction accuracy.

Ultimately, in real-world grid dispatch and market bidding, the practically realizable capacity is inherently constrained by practical engineering factors. Future research will focus on extending this fundamental framework toward practical implementation across three key dimensions: (1) designing privacy-preserving forecasting architectures based on federated learning to guarantee the security of smart meter data; (2) optimizing lightweight, distributed edge-computing algorithms to satisfy the real-time efficiency demands of massive residential scenarios; and (3) integrating behavioral economics models to account for dynamic user participation willingness, thereby bridging the gap between theoretically predicted capacities and actual dispatchable resources.

Author Contributions

Conceptualization, Y.H. and Y.L. (Yuansheng Li); methodology, H.X. and Y.L. (Yang Long); software, H.X.; validation, H.X., R.H. and Y.H.; formal analysis, H.X.; investigation, H.X. and R.H.; resources, Y.L. (Yuansheng Li); data curation, H.X. and Y.L. (Yang Long); writing—original draft preparation, H.X.; writing—review and editing, Y.H. and Y.L. (Yuansheng Li); visualization, H.X.; supervision, Y.H. and Y.L. (Yuansheng Li); project administration, Y.H. and Y.L. (Yang Long); funding acquisition, Y.L. (Yuansheng Li). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Ningxia, grant number 2024AAC03754.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are not publicly available. The residential smart meter data from a region in Northwest China are confidential under the project agreement, and cannot be shared publicly due to data privacy and security restrictions.

Conflicts of Interest

Authors Run Han and Yuansheng Li are employed by State Grid Ningxia Electric Power Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Prediction Process of Residential Adjustable Capacity Considering User Profiles.

Figure 2. Pearson correlation heat map of the seven load features.

Figure 3. Three-dimensional PCA projection of residential user clustering results.

Figure 4. Average Daily Load Curves of Users with Different Profiles.

Figure 5. Load curves of the same user profile under different time-of-use electricity prices: (a) Medium-load stable profile; (b) Low-base intermittent-consumption profile; (c) Medium-load strongly fluctuating profile; (d) High-base continuous-consumption profile; (e) High-base strongly fluctuating profile; (f) Low-base rigid-consumption profile.

Figure 6. Horizontal Comparison of Prediction Results of Three Methods.

Figure 7. Comparison of Prediction Accuracy among Three Methods.

Figure 8. Quantitative Comparison of Adjustable Capacity for Multi-stage Electricity Price Response on a Typical Day: (a) Medium-load stable profile; (b) Low-base intermittent-consumption profile; (c) Medium-load strongly fluctuating profile; (d) High-base continuous-consumption profile; (e) High-base strongly fluctuating profile; (f) Low-base rigid-consumption profile.

Table 1. Load features used for residential user profiling.

Feature	Mathematical Expression	Physical Meaning
Average daily load	$p_{avg} = \frac{1}{D_{i}} \sum_{d = 1}^{D_{i}} \left( \frac{1}{T} \sum_{t = 1}^{T} P_{i, d, t} \right)$	The average electricity consumption per unit time in the statistical cycle, serving as a core indicator of the overall consumption scale.
Maximum load	$p_{\max} = \max_{d \in [1, D_{i}], t \in [1, T]} P_{i, d, t}$	The maximum instantaneous load value across all periods in the statistical cycle, representing peak demand.
Minimum load	$p_{\min} = \min_{d \in [1, D_{i}], t \in [1, T]} P_{i, d, t}$	The minimum instantaneous load value across all periods in the statistical cycle, representing uninterruptible basic demand.
Peak-valley difference	$\Delta p = p_{\max} - p_{\min}$	The difference between maximum and minimum load, reflecting the fluctuation range of the daily load curve.
Load factor	$p_{load} = \frac{p_{avg}}{p_{\max}}$	The ratio of average daily load to maximum load, a comprehensive indicator measuring consumption stability.
Evening peak load ratio	$p_{night} = \frac{\sum_{d = 1}^{D_{i}} \sum_{t \in T_{night}} P_{i, d, t}}{\sum_{d = 1}^{D_{i}} \sum_{t = 1}^{T} P_{i, d, t}}$	The proportion of electricity consumed during evening peaks to the daily total, reflecting concentration during peak hours.
Weekday-weekend difference	$p_{diff} = \frac{| \bar{P}_{work} - \bar{P}_{weekend} |}{\bar{P}_{work}}$, where $\bar{P}_{work} = \frac{1}{| D_{work} |} \sum_{d \in D_{work}} \frac{1}{T} \sum_{t = 1}^{T} P_{i, d, t}$ and $\bar{P}_{weekend} = \frac{1}{| D_{weekend} |} \sum_{d \in D_{weekend}} \frac{1}{T} \sum_{t = 1}^{T} P_{i, d, t}$	The relative difference between average weekday load and weekend load, reflecting weekly patterns.
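The seven features in Table 1 can be computed for one user as in the sketch below. It follows the table's expressions term by term; the input layout (one list of T load samples per day), the choice of which sample indices form the evening set, and all names are assumptions made for illustration.

```python
def load_features(days, evening_slots, weekend_flags):
    """Compute the seven Table 1 features for a single user.

    days          -- list of D daily load curves, each a list of T values (kW)
    evening_slots -- sample indices t belonging to the evening peak set
                     (assumed to be supplied by the caller)
    weekend_flags -- list of D booleans, True when the day is a weekend day
    """
    all_values = [p for day in days for p in day]
    daily_means = [sum(day) / len(day) for day in days]

    p_avg = sum(daily_means) / len(days)
    p_max = max(all_values)
    p_min = min(all_values)

    evening_energy = sum(day[t] for day in days for t in evening_slots)

    work = [m for m, is_weekend in zip(daily_means, weekend_flags) if not is_weekend]
    weekend = [m for m, is_weekend in zip(daily_means, weekend_flags) if is_weekend]
    p_work = sum(work) / len(work)
    p_weekend = sum(weekend) / len(weekend)

    return {
        "p_avg": p_avg,
        "p_max": p_max,
        "p_min": p_min,
        "delta_p": p_max - p_min,
        "p_load": p_avg / p_max,
        "p_night": evening_energy / sum(all_values),
        "p_diff": abs(p_work - p_weekend) / p_work,
    }


# Toy user: one weekday and one weekend day with T = 4 samples each.
features = load_features(
    days=[[1.0, 1.0, 2.0, 4.0], [1.0, 1.0, 1.0, 1.0]],
    evening_slots=[3],
    weekend_flags=[False, True],
)
print(features)
```

Because the features have different scales, they would typically be standardized before being passed to K-means++ clustering.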

Table 2. Residential Time-of-Use Electricity Pricing Schemes.

Scheme	Valley Period (¥/kWh) 23:00–08:00	Flat Period (¥/kWh) 08:00–17:00, 19:00–23:00	Peak Period (¥/kWh) 17:00–19:00
D	0.49	0.49	0.49
A	0.38	0.45	0.65
B	0.32	0.41	0.72
C	0.20	0.40	0.81
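To illustrate how the schemes in Table 2 translate into a daily bill, the sketch below prices an hourly load curve under each scheme. The prices and period boundaries are taken from the table; the hourly resolution, the function names, and the toy load curves are assumptions.

```python
# (valley, flat, peak) prices in ¥/kWh from Table 2.
SCHEMES = {
    "D": (0.49, 0.49, 0.49),
    "A": (0.38, 0.45, 0.65),
    "B": (0.32, 0.41, 0.72),
    "C": (0.20, 0.40, 0.81),
}


def hourly_price(hour, scheme):
    """Price for the hour starting at `hour` (0-23)."""
    valley, flat, peak = SCHEMES[scheme]
    if hour >= 23 or hour < 8:   # valley: 23:00-08:00
        return valley
    if 17 <= hour < 19:          # peak: 17:00-19:00
        return peak
    return flat                  # flat: 08:00-17:00 and 19:00-23:00


def daily_cost(hourly_load_kw, scheme):
    """Daily electricity cost (¥) of a 24-value hourly load curve (kW)."""
    return sum(hourly_price(h, scheme) * load for h, load in enumerate(hourly_load_kw))


# Toy curves: a constant 1 kW load, and the same load with 0.5 kWh
# moved from the 17:00 peak hour to the 02:00 valley hour.
flat_load = [1.0] * 24
shifted_load = flat_load[:]
shifted_load[17] -= 0.5
shifted_load[2] += 0.5
print(daily_cost(flat_load, "C"), daily_cost(shifted_load, "C"))
```

Under Scheme C the shift saves 0.5 × (0.81 − 0.20) = 0.305 ¥, which is the kind of incentive a cost minimization model exploits when it generates response samples.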

Table 3. Comparison of Clustering Validity Indicators and Comprehensive Scores for Six Candidate Values of k.

k	WCSS	Silhouette	CH Index	DB Index	Comprehensive Score
3	4929.44	0.2512	399.9683	1.3195	0.6370
4	4627.95	0.2675	375.7744	1.1682	0.7934
5	4485.67	0.2454	366.3149	1.2645	0.4782
6	3263.55	0.2558	361.0529	1.1521	0.8783
7	2930.28	0.2469	357.1254	1.1340	0.6581
8	2736.86	0.2266	339.3152	1.1288	0.3000

Table 4. Statistics of Core Load Characteristics for Each Cluster.

Cluster	$p_{avg}$ (kW)	$p_{\max}$ (kW)	$p_{\min}$ (kW)	$\Delta p$ (kW)	$p_{load}$	$p_{night}$	$p_{diff}$	User Profile
1	0.7751	8.3367	0.0137	8.3229	0.0929	0.3487	0.0748	Medium-load stable profile
2	0.5172	7.0461	0.0059	7.0403	0.0731	0.3140	0.0879	Low-base intermittent-consumption profile
3	0.7753	8.3022	0.0112	8.2910	0.0928	0.4136	0.3071	Medium-load strongly fluctuating profile
4	1.4935	8.4416	0.0024	8.4392	0.1769	0.3576	0.0850	High-base continuous-consumption profile
5	1.2772	8.2787	0.1604	8.1183	0.1536	0.3328	0.0728	High-base strongly fluctuating profile
6	0.2774	4.5551	0.0071	4.5480	0.0667	0.2862	0.1255	Low-base rigid-consumption profile

Table 5. Electricity Consumption Characteristics of Different User Profiles.

User Profile	Electricity Consumption Characteristics
Medium-load stable profile	The overall power consumption scale is moderate, with relatively stable operation. As the base load remains at a moderate level, the absolute adjustable capacity is at an intermediate level.
Low-base intermittent-consumption profile	The overall electricity consumption base is relatively low, with almost zero baseline standby load. The electricity usage pattern exhibits strong intermittency, while the proportion of rigid baseline load remains relatively high, resulting in limited adjustable space.
Medium-load strongly fluctuating profile	Electricity consumption exhibits significant fluctuations, with a minimal baseline standby load. On weekdays, peak-hour electricity usage is highly concentrated, while there is a notable low-demand period during the day. The response to price signals demonstrates considerable flexibility.
High-base continuous-consumption profile	The proportion of rigid equipment with extremely high standby or uninterrupted operation is significant, with adjustable potential primarily concentrated in the 18:00–22:00 period, where both adjustable power and regulation capacity remain at high levels.
High-base strongly fluctuating profile	The base load level is relatively high, with notable fluctuations in electricity consumption behavior. Significant peak usage occurs during late evening hours, and the adjustable potential exhibits both scalability and volatility characteristics.
Low-base rigid-consumption profile	The overall electricity consumption base is extremely low, with a small peak-to-valley difference and highly intermittent characteristics. The rigid base load dominates the electricity consumption curve, resulting in an extremely limited absolute adjustable capacity.

Table 6. Clustering Stability Metrics Under Different Random Seeds.

Metric	Mean	Standard Deviation	Minimum
ARI	0.9935	0.0040	0.9888
NMI	0.9906	0.0057	0.9833
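For readers unfamiliar with the Adjusted Rand Index (ARI) used in the stability tables, the sketch below implements the standard pair-counting definition in plain Python. It is not the authors' code; it assumes at least two samples, and the function name is illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same samples.

    1.0 means identical partitions (up to relabeling); values near 0
    mean agreement no better than chance. Requires len(labels_a) >= 2.
    """
    n = len(labels_a)
    # Pairs of samples placed together in both clusterings.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# The same partition with swapped label names scores 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because ARI is invariant to label permutation, it suits comparisons between clustering runs with different random seeds, where cluster indices are arbitrary.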

Table 7. Clustering Stability Metrics Across Different Seasons.

Season	ARI	NMI
Spring	0.2012	0.2994
Summer	0.2159	0.3400
Autumn	0.3840	0.4650
Winter	0.4291	0.4626

Table 8. Clustering Stability Metrics Across Different Time Periods.

Time Period	ARI	NMI
Weekday	0.8211	0.8441
Weekend	0.5962	0.6101

Table 9. Comparison of prediction accuracy among different models.

Model	RMSE (kW)	MAE (kW)	MAPE (%)
K-means + SVM	0.08634	0.04305	5.8172
K-means + XGBoost	0.08269	0.04017	5.4683

Table 10. Comparison of Adjustable Capacity for Multi-stage Electricity Price Response on a Typical Day.

User Profile	Price A (kW)	Price B (kW)	Price C (kW)
Medium-load stable profile	0.1484	0.1749	0.2562
Low-base intermittent-consumption profile	0.2015	0.3043	0.3755
Medium-load strongly fluctuating profile	0.3895	0.6394	0.7843
High-base continuous-consumption profile	1.0030	1.2808	1.6018
High-base strongly fluctuating profile	0.9881	1.2005	1.4487
Low-base rigid-consumption profile	0.0644	0.1436	0.2004

Table 11. Uncertainty evaluation results of the QR-LSTM model.

Evaluation Metric	Result Value
Actual coverage rate of the 95% CI	91.21%
Mean interval width	0.0813 kW
RMSE of the 0.05 quantile point prediction	0.0721 kW
RMSE of the 0.5 quantile point prediction	0.0657 kW
RMSE of the 0.95 quantile point prediction	0.0789 kW

Table 12. Accuracy Comparison between the Model without User Profile Labels and the Proposed Model on a Real-World Dataset.

Model	RMSE (kW)	MAE (kW)	MAPE (%)
Model without user profile labels	0.10234	0.05478	7.3862
Proposed model	0.08567	0.04135	5.8932

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
