Baseline Load Estimation Using Intelligent Performance Quantification for Incentive-Based Demand Response Programs

Sajid, Suhaib; Li, Bin; Qi, Bing; Berehman, Badia; Guo, Qi; Athar, Muhammad; Muqtadir, Ali

doi:10.3390/en19081851

Open Access Review

by Suhaib Sajid ¹,*, Bin Li ¹, Bing Qi ¹, Badia Berehman ², Qi Guo ³, Muhammad Athar ¹ and Ali Muqtadir ¹,*

¹ School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China
² College of Telecommunications and Information Engineering, Nanjing University of Post and Telecommunications, Nanjing 210049, China
³ Digital Research Branch (Digital Research Institute), Inner Mongolia Power (Group) Company Limited, Hohhot 010010, China
* Authors to whom correspondence should be addressed.

Energies 2026, 19(8), 1851; https://doi.org/10.3390/en19081851

Submission received: 8 February 2026 / Revised: 25 March 2026 / Accepted: 5 April 2026 / Published: 9 April 2026

Abstract

Incentive-based demand response (DR) programs rely on accurate and trustworthy quantification of customer performance to ensure fair compensation and market efficiency. Estimating the customer baseline load is an important part of this process: the baseline represents how much electricity would have been used had no DR event occurred. Unlike conventional load forecasting, baseline modeling targets a quantity that is inherently unobservable, economically sensitive, and vulnerable to strategic manipulation. With the growing penetration of distributed energy resources, electric vehicles, and intelligent control technologies, traditional baseline estimation approaches face increasing limitations. This paper offers a thorough and future-oriented synthesis of baseline load estimation for incentive-based DR strategies. Current approaches are classified into rule-based, statistical, probabilistic, machine learning (ML), and hybrid intelligence techniques, and their suitability for various DR services and customer categories is evaluated. Beyond modeling accuracy, this paper emphasizes market-oriented requirements, including incentive compatibility, simplicity, transparency, privacy preservation, and deployment feasibility. Furthermore, emerging digital trust enablers such as blockchain and federated learning (FL) are reviewed, along with baseline-free and baseline-light alternatives for performance evaluation. Finally, open research challenges and future directions toward interpretable, robust, and market-ready baseline intelligence are discussed.

Keywords: baseline load estimation; demand response; deep learning; machine learning; energy markets

1. Introduction

The global transition toward low-carbon, renewable-dominated power systems has significantly increased the operational complexity of electricity grids. The widespread use of variable renewable energy sources, the electrification of transportation, and the growth of distributed energy resources create unpredictability and volatility that strain classic supply-side balancing strategies [1]. Integrating more renewable generation into electrical grids is an effective way to reduce greenhouse gas emissions and thus to advance carbon neutrality and environmental sustainability [2]. However, extensive use of renewables, which are marked by a high degree of unpredictability, fluctuation, and intermittency, can undermine grid reliability. Demand response (DR) is a stable, cost-efficient, and simple means of stabilizing the system during temporary supply deficits [3]. Today’s power markets feature two main types of DR programs, each driven by distinct objectives: price-based programs and incentive-based programs. In both, end users are encouraged to use their flexible consumption to deviate deliberately from their typical electricity usage, either by reacting to changes in rates or by earning financial rewards offered by power providers [4]. DR thus offers a cost-effective and flexible way to address these problems by enabling end users to adjust their electricity consumption as system or market conditions require. Among DR paradigms, incentive-based programs are of special significance because they can secure deterministic, contractually assured flexibility for both grid-economy and grid-reliability services [5]. In incentive-based DR programs, customer compensation is determined by comparing actual electricity consumption during a DR event with an estimated baseline load.
The baseline load is the amount of energy that would have been used if the DR event had not happened. This counterfactual quality is what makes baseline load estimation different from regular load forecasting. In incentive-based DR programs, the main market stakeholders are the energy supplier that provides electricity, the consumers who use it, and the load aggregator. The aggregator bridges the gap between them: smaller-scale electricity users, who usually cannot participate independently in wholesale power trading because of their size or resources, still have considerable potential to adjust their demand [6]. In such programs, participating consumers are compensated through incentives computed from their level of engagement in DR activities and the results actually achieved. For instance, this could involve cutting power usage to relieve peak-hour strain, known as peak shaving, or increasing consumption during low-demand periods to fill valleys in the grid’s load profile [7]. These outcomes are evaluated by comparing the estimated consumption that would have occurred without any DR intervention with the actual consumption recorded during the event. Consequently, the customer baseline load becomes a cornerstone quantity: it is an estimate of the electricity the user would have consumed over the same period had they not participated in the DR program at all [8].
Because the baseline is a key input to the measurement and verification of DR outcomes, it is frequently a source of concern and controversy among stakeholders, who question its accuracy and fairness, its effect on incentive compensation, and the overall credibility of the program [9].

Although calculating a baseline load might initially appear comparable to traditional load forecasting in electrical networks, the two are fundamentally different tasks with distinct purposes and methodologies. Traditional load forecasting primarily supports the efficient operation, coordination, and monitoring of both the power generation infrastructure (supply side) and end-user consumption (demand side), as represented in Figure 1, enabling planning and real-time adjustments that keep the system in balance. Forecasts fall into four classes according to the prediction horizon: long-, medium-, short-, and ultra-short-term [10]. On the supply side, long-term forecasts help utilities anticipate future demand, invest in new generation plants, modernize the transmission system, and develop decarbonization plans [11]. Medium-term forecasting assists with equipment maintenance, fuel procurement, rate setting, and budget management [12]. Short-term forecasting emphasizes economic operation, supporting energy management, resource allocation (encompassing flexible resources and renewables such as photovoltaics and wind turbines), and power dispatch across generators. Ultra-short-term forecasting supports both cost-efficiency and reliability through real-time adjustments, contingency handling, and transaction-based operations [13]. On the demand side, research has focused mainly on short- and ultra-short-term predictions. Short-term techniques are combined with storage management to reshape usage patterns and reduce costs.
Ultra-short-term solutions make it possible to control adjustable devices, such as cooling systems, circulators, and blowers, in real time to reduce energy consumption or emissions and to improve occupant health, living conditions, and hygiene [12,14]. Baseline estimation, by contrast, feeds directly into settlement: inaccurate or manipulable baselines can result in economic inefficiencies, including overcompensation, free-riding behavior, reduced customer trust, and undermined market credibility. Consequently, baseline modeling has evolved into a multidisciplinary problem involving data analytics, behavioral economics, game theory, and digital trust technologies [15,16].

While several existing review papers have explored the broader concepts of DR and load forecasting, they frequently approach the topic from isolated technical perspectives. Previous reviews have extensively covered smart grid architectures and the specific optimization techniques required for efficient residential appliance scheduling [17]. Similarly, recent literature has provided valuable insights into the advanced control and optimization of DR frameworks within industrial and commercial applications [18]. However, most existing reviews focus strictly on either the mathematical accuracy of load forecasting or the hardware implementation of DR, often overlooking the complex settlement constraints of electricity markets. There remains a distinct gap in the literature for a comprehensive review that connects statistical baseline estimation directly with market-oriented requirements such as incentive compatibility, fairness, and vulnerability to strategic manipulation. This paper addresses that gap by synthesizing intelligent baseline estimation methods with the economic realities of compensation mechanisms and the emerging role of digital trust technologies.

In incentive-based demand response, baseline load estimation should not be evaluated only as a forecasting exercise because its output is used directly in compensation, settlement, and performance verification. Unlike conventional load prediction, the baseline is a counterfactual quantity that determines how much flexibility is financially recognized and rewarded during a DR event. Therefore, even a model with good predictive accuracy may be unsuitable for practical deployment if it introduces systematic settlement bias, is difficult to interpret in disputes, or can be strategically exploited by market participants. This issue becomes especially important because different DR services rely on different compensation structures, such as energy-based settlement, capacity reservation, or mileage-based performance. As a result, baseline methods must be assessed not only by statistical error, but also by their compatibility with service-specific payment rules, fairness requirements, verification procedures, and transparency needs in real electricity markets.

Contributions of the Paper

The existing literature has largely focused on the statistical accuracy of baseline load estimation, often overlooking the economic and operational realities of modern energy markets. This paper addresses this disparity by offering a critical synthesis of baseline methodologies viewed through the lens of market deployability.

This review makes a conceptual and analytical contribution rather than proposing a new baseline prediction algorithm. Its novelty lies in introducing a market-oriented framework for evaluating baseline load estimation methods in incentive-based demand response. Instead of comparing methods only in terms of prediction accuracy, this paper assesses them through four interrelated dimensions: reliability, practicality, fairness, and transparency. Using this framework, this review links estimation techniques to demand response service classes, settlement logic, end-user characteristics, and verification requirements. As a result, this paper functions not only as a survey of the literature, but also as a decision-support taxonomy for baseline selection in real market settings. In addition, by integrating digital trust enablers and baseline-light alternatives into the same analytical structure, this review extends the field from method comparison toward deployable and trustworthy baseline intelligence. Specifically, the primary contributions of this work are as follows:

This paper reframes baseline load estimation as a performance quantification and economic settlement problem, rather than only a statistical prediction task.
It proposes a four-dimensional evaluation framework based on reliability, practicality, fairness, and transparency, enabling a more complete assessment of baseline methods in real DR settings.
It maps estimation approaches to DR service classes, settlement structures, and end-user types, thereby providing a decision-support taxonomy for method selection.
It extends the review beyond conventional model comparison by incorporating digital trust enablers, manipulation resistance, and baseline-light alternatives as part of future market-ready baseline intelligence.

2. Conceptual Foundations and Market-Oriented Requirements of Baseline Load Estimation

Baseline load is defined as the expected electricity consumption of a customer during a DR event if no external intervention had occurred. As a counterfactual quantity, it cannot be directly measured or validated using observed data. Unlike conventional load forecasting, which supports operational planning and control, baseline estimation serves as a performance standard for reconciling financial settlements and verifying market compliance. This difference raises particular issues of fairness, transparency, and strategic behavior. Baseline models should therefore balance predictive accuracy against manipulation resistance, stakeholder interpretability, and large-scale implementability in actual electricity markets [19]. Customer baseline estimation is also closely tied to how benefits are shared and how utilities, aggregators, and end users cooperate. Its successful application therefore relies on four core principles: reliability, practicality, fairness, and transparency. Figure 2 summarizes the major requirements, incentive mechanisms, estimation approaches, and digital trust enablers for customer baseline load estimation. These four guiding principles are outlined below.

2.1. Reliability

Reliability is essential for evaluating the true performance of DR participants, as customers should be rewarded strictly in proportion to the flexibility they actually deliver. In this setting, accurate baseline estimation requires low systematic error and limited divergence between the calculated baseline and actual electricity usage when flexibility is not being dispatched [20]. High reliability depends on advanced metering infrastructure as well as sophisticated estimation methods. In incentive-based DR systems, services differ in response time, duration, and operational focus, so flexibility is procured over different time scales and under different compensation structures. For example, day-ahead demand bidding schemes are usually settled through energy payments, whereas frequency regulation services rely on capacity- and mileage-based payments over very short time scales [21]. Consequently, different DR services demand different levels of estimation reliability and temporal resolution to ensure accurate performance evaluation. From an economic perspective, an acceptable level of reliability is usually achieved through compromise among all market participants.
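Because the true load is observable on non-event days, one way to make the "low systematic error" requirement concrete is to score a candidate baseline against metered data from such days. The following is a minimal sketch under that assumption; the function names and sample values are illustrative and do not come from any cited program rule.

```python
from typing import Sequence


def mean_bias_error(baseline: Sequence[float], actual: Sequence[float]) -> float:
    """Average signed error (baseline - actual); positive means over-estimation."""
    if len(baseline) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(b - a for b, a in zip(baseline, actual)) / len(actual)


def mean_absolute_percentage_error(
    baseline: Sequence[float], actual: Sequence[float]
) -> float:
    """Average of |baseline - actual| / |actual| in percent; actual must be non-zero."""
    if len(baseline) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual load must be non-zero for a percentage error")
    total = sum(abs(b - a) / abs(a) for b, a in zip(baseline, actual))
    return 100.0 * total / len(actual)


# Hypothetical hourly loads (kW) on a non-event day.
actual_kw = [100.0, 120.0, 80.0, 100.0]
baseline_kw = [110.0, 114.0, 84.0, 100.0]

mbe = mean_bias_error(baseline_kw, actual_kw)                  # signed bias in kW
mape = mean_absolute_percentage_error(baseline_kw, actual_kw)  # error in percent
```

A persistently positive bias would translate into systematic overpayment during events, which is why the signed metric is reported alongside the absolute one.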

2.2. Practicality

Practicality highlights the need for estimation approaches and compensation rules that are easy for stakeholders to understand and apply. Clear and straightforward mechanisms facilitate settlement procedures and promote efficient transactions between flexibility providers and buyers [16]. In practice, operational simplicity must be traded off against methodological sophistication so that implementation remains cost-effective. The payment mechanisms of the individual market-driven DR services should also be transparent. As an illustration, direct load control and emergency backup programs might pay customers energy payments when flexibility is dispatched and capacity payments when reserves are held but not dispatched [14]. Table 1 summarizes the alignment between DR service classes and baseline requirements, together with the key references for each DR service class.

2.3. Fairness

Fairness means treating all stakeholders equally, and it also means that baseline load assessment has to be sufficiently robust to resist strategic exploitation, often referred to as moral hazard. A common concern is the artificial inflation of baseline consumption by customers seeking higher compensation [34]. In market-based DR contexts, information asymmetry and the threat of adverse selection typically arise between customers and aggregators because customers usually know their own consumption behavior in more detail than aggregators do. This imbalance lets customers decide strategically whether to participate depending on whether past data favor higher baseline estimates [35]. Such information gaps may encourage gaming behavior and mutual distrust, with customers disputing baseline accuracy while aggregators suspect intentional manipulation for financial gain. These issues have been observed in real-world enforcement cases. Poorly designed incentives undermine trust and reduce participation, threatening the long-term sustainability of DR programs. Fairness-oriented arrangements therefore aim to deter opportunism through properly designed reward and penalty schemes that satisfy principles such as incentive compatibility, individual rationality, and budget balance [36].
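The baseline-inflation concern can be illustrated with a toy calculation. The sketch below assumes a simple "mean of the last Y non-event days" baseline and hypothetical daily energy values; it is not drawn from any cited enforcement case, and real programs typically add adjustments and caps that are omitted here.

```python
from typing import Sequence


def last_y_baseline(history_kwh: Sequence[float], y: int) -> float:
    """Mean daily energy over the last y non-event days."""
    if not 0 < y <= len(history_kwh):
        raise ValueError("require 0 < y <= number of available days")
    recent = history_kwh[-y:]
    return sum(recent) / len(recent)


# Hypothetical daily consumption (kWh) on the five days before an event.
honest_history = [50.0, 50.0, 50.0, 50.0, 50.0]
inflated_history = [50.0, 50.0, 60.0, 60.0, 60.0]  # deliberate over-consumption

event_day_kwh = 50.0  # the customer makes no real reduction on the event day

# Credited reduction = baseline - metered consumption, floored at zero.
honest_credit = max(last_y_baseline(honest_history, 5) - event_day_kwh, 0.0)
inflated_credit = max(last_y_baseline(inflated_history, 5) - event_day_kwh, 0.0)
```

In this toy case the customer who inflated consumption is credited with a reduction despite delivering no flexibility, which is the behavior that incentive-compatible designs aim to make unprofitable.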

2.4. Transparency

Transparency is about making sure that everyone involved in baseline load assessment has equal access to information. It also forms the foundation of fairness. Participants must be able to access information about market transactions so that they can build confidence in the process and so that misreporting of flexibility can be deterred, without sacrificing customer privacy. As smart meters and digital infrastructures become more common, new technologies like blockchain and federated learning (FL) offer promising ways to create secure, auditable, and open trading environments for certifying DR services [37].

3. Taxonomy of Baseline Load Estimation Approaches

Beyond a purely methodological taxonomy, baseline estimation approaches should also be evaluated from a market-oriented perspective. In incentive-based DR, the practical value of a baseline model depends not only on its statistical accuracy, but also on how well it aligns with the settlement logic of the service, the operational constraints of deployment, and the behavioral responses it may induce among market participants. From this perspective, three additional evaluation dimensions are important. First, settlement-rule compatibility concerns whether a method is suitable for energy-based, capacity-based, mileage-based, or score-based compensation. Second, operational compatibility concerns whether the method can satisfy the temporal resolution, data availability, interpretability, and verification requirements of the target DR service. Third, behavioral robustness concerns whether the method is resistant to baseline inflation, strategic participation, information asymmetry, and post-event disputes. Therefore, the taxonomy presented in this section should be read not only as a classification of modeling techniques, but also as a framework for selecting baseline approaches under real market conditions.

3.1. Rule-Based and Statistical Approach

Rule-based and statistical techniques, including methods such as rolling averages and similar-day matching, are commonly applied for baseline load estimation because of their simplicity and minimal implementation requirements [27]. The rolling-average approach, for instance, calculates the baseline by averaging a set of prior non-DR days, with variants like “High X of Y”, “Mid X of Y”, “Low X of Y”, and “Last Y” differing in which days are selected from the historical dataset. Weighted or exponentially weighted adaptations of this technique refine the estimate by assigning more weight to days that are closer to the DR event. Although these methods are simple and easy to adopt, their accuracy is limited by how representative the past load data are, especially for loads that vary considerably or are highly sensitive to the weather [38].
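The "High X of Y" rule described above can be sketched as follows. This is a minimal illustration that assumes days are ranked by total daily energy; actual program rules may rank on event-window energy instead and often apply day-type filters or same-day adjustments, none of which are modeled here.

```python
from typing import List, Sequence


def high_x_of_y(
    daily_profiles: Sequence[Sequence[float]], x: int, y: int
) -> List[float]:
    """Average the x highest-energy days among the last y non-DR days.

    daily_profiles is ordered oldest to newest; each entry holds one day's
    interval loads, and all days must use the same number of intervals.
    """
    if not 0 < x <= y <= len(daily_profiles):
        raise ValueError("require 0 < x <= y <= number of available days")
    window = daily_profiles[-y:]                    # the most recent y days
    ranked = sorted(window, key=sum, reverse=True)  # rank by daily energy (assumption)
    selected = ranked[:x]                           # keep the x highest days
    n_intervals = len(selected[0])
    return [sum(day[i] for day in selected) / x for i in range(n_intervals)]


# Hypothetical two-interval profiles (kW) for five non-DR days, oldest first.
history = [[10.0, 20.0], [12.0, 22.0], [8.0, 18.0], [14.0, 24.0], [11.0, 21.0]]
baseline = high_x_of_y(history, x=2, y=4)
```

Calling the function with `x == y` reproduces the "Last Y" variant, and sorting in ascending order instead would give "Low X of Y".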

The similar-day or day-matching method, as depicted in Figure 3, finds past days whose load patterns resemble those of the event day, based on attributes such as the type of day, the day of the week, and the weather. This approach uses only a customer’s own previous load data and does not employ any outside control groups [39]. Clustering methods such as K-means++ may improve accuracy: they can be used to select a set of similar days and to incorporate weather covariates, which reduces the need for further adjustments. In spite of these advances, similar-day methods still depend on the consistency of the selected days and are less effective for commercial or industrial customers whose consumption patterns are only moderately stable [40,41].
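A minimal sketch of day matching is given below. It assumes two matching attributes only (weekday versus weekend, and mean outdoor temperature) and a nearest-neighbor selection; the `Day` record, its field names, and the sample data are hypothetical, and the clustering-based variants mentioned above are not implemented.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Day:
    is_weekday: bool
    mean_temp_c: float
    load_kw: List[float]  # interval loads for the day


def similar_day_baseline(
    history: Sequence[Day], target_is_weekday: bool, target_temp_c: float, k: int
) -> List[float]:
    """Average the k same-type days whose mean temperature is closest to the target."""
    candidates = [d for d in history if d.is_weekday == target_is_weekday]
    if k <= 0 or len(candidates) < k:
        raise ValueError("not enough matching days for the requested k")
    nearest = sorted(candidates, key=lambda d: abs(d.mean_temp_c - target_temp_c))[:k]
    n_intervals = len(nearest[0].load_kw)
    return [sum(d.load_kw[i] for d in nearest) / k for i in range(n_intervals)]


# Hypothetical non-event history for one customer.
history = [
    Day(True, 30.0, [10.0, 20.0]),
    Day(True, 25.0, [8.0, 16.0]),
    Day(False, 31.0, [5.0, 9.0]),
    Day(True, 32.0, [12.0, 22.0]),
    Day(True, 18.0, [6.0, 12.0]),
]

# Event day: a weekday with a forecast mean temperature of 31 °C.
baseline = similar_day_baseline(history, target_is_weekday=True, target_temp_c=31.0, k=2)
```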

3.2. Regression-Based Approaches

Regression-based approaches model the relationship between the baseline power demand and the factors that affect it, which include past use, weather, daylight cycles, and utilization patterns [42]. These methods are ubiquitous in the literature because they are fairly easy to implement and their results are not difficult for stakeholders to interpret. They compute the baseline load by fitting observed data to an equation whose regression coefficients are chosen to minimize the error between the predicted and actual loads [43]. In practice, different regression techniques are applied depending on the complexity of the load behavior. For example, multiple linear regression has been employed by ERCOT for baseline estimation in emergency and interruptible DR programs. Quantile regression extends this approach by allowing the estimation of specific quantiles, which helps capture uncertainty and variability in load behavior [44]. Time-series regression models, particularly the autoregressive integrated moving average (ARIMA) model, are well suited to capturing temporal patterns in electricity consumption. ARIMA uses historical data to predict short-term load dynamics and is therefore applied extensively in forecasting problems [45]. The drawback of these regression models is that, although they are easy to implement, their accuracy drops sharply when the data are complex [46].
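The coefficient-fitting step can be illustrated with the simplest case, a one-variable least-squares fit of load against temperature. The sketch below is an assumption-laden illustration, not the ERCOT model or any cited method; the data are hypothetical and chosen to be exactly linear so the fit is easy to verify by hand.

```python
from typing import Sequence, Tuple


def fit_simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) minimizing the squared error of y ~ intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical non-event observations: temperature (°C) and load (kW).
temps_c = [20.0, 25.0, 30.0, 35.0]
loads_kw = [50.0, 60.0, 70.0, 80.0]

intercept, slope = fit_simple_ols(temps_c, loads_kw)

# Counterfactual load for an event hour with a temperature of 28 °C.
baseline_kw = intercept + slope * 28.0
```

The fitted coefficients are directly readable (here, kW per degree), which is the interpretability advantage the paragraph above attributes to regression methods.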

3.3. Probabilistic Approaches

For residential customers whose load patterns vary significantly and for households that use renewable energy sources, probabilistic estimation methods can provide more accurate baseline load predictions by representing outcomes as probability distributions that capture inherent uncertainties. Weng et al. developed a Gaussian-process-based approach to account for the significant variability in residential consumption behaviors [47], while a recurrent Bayesian approach for estimating aggregated household baselines was presented in [48]. Li et al. integrated dynamic density-based spatial clustering with K-means clustering to identify typical load patterns while excluding insignificant anomalies [49]. Sun et al. established a statistical model utilizing neural-network-based clustering to enhance the representation of residential load uncertainty [50].

Consumers, especially those offering contingency reserve services, benefit from probabilistic methodologies because weather strongly affects renewable energy generation. Researchers and practitioners often use Bayesian estimation and Monte Carlo simulation to quantify the available reserve capacity [51,52]. For instance, Jost et al. used kernel density estimation to estimate the wind and PV power available for next-day reserves, which yielded realistic output distributions [53]. Bapin et al. integrated a bivariate Farlie–Gumbel–Morgenstern likelihood density with a statistical approach to enhance allocation of rotating reserves for wind power [51]. Wang et al. created short-range probabilistic predictions for PV output that take into account extended loss variations [54]. He et al. integrated quantitative regression with online learning to provide statistical wind energy forecasts and detect output fluctuations [55].

Probabilistic approaches should not be judged only by whether they improve forecasting accuracy. In incentive-based DR, their practical value also depends on how their uncertainty representation affects economic settlement risk. A probabilistic model may describe load variability more realistically, but if its predictive intervals are poorly calibrated, too wide, or difficult to interpret in settlement practice, it can still create material risks of overpayment, underpayment, or post-event dispute. This concern is especially important in reserve-oriented and renewable-coupled DR services, where rare weather-driven deviations and tail events can strongly affect the estimated counterfactual. Therefore, the limitation of probabilistic methods is not only computational burden or modeling complexity but also the possibility that uncertainty is transferred into ambiguous settlement outcomes unless confidence bounds, validation rules, and ex-post verification procedures are clearly specified. This economic settlement perspective is fully consistent with the broader market-oriented requirements of reliability, fairness, and transparency discussed earlier in the study.
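Whether predictive intervals are "poorly calibrated" or "too wide" can be checked empirically on non-event data before a probabilistic baseline is used for settlement. The sketch below assumes the model outputs a lower and upper bound per interval at some nominal level (say 90%); the function names and numbers are illustrative only.

```python
from typing import Sequence


def interval_coverage(
    lower: Sequence[float], upper: Sequence[float], actual: Sequence[float]
) -> float:
    """Fraction of observations that fall inside their predicted interval."""
    if not (len(lower) == len(upper) == len(actual)) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for lo, hi, a in zip(lower, upper, actual) if lo <= a <= hi)
    return hits / len(actual)


def mean_interval_width(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Average width of the predicted intervals (same unit as the load)."""
    if len(lower) != len(upper) or len(lower) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical interval forecasts (kW) and metered loads on non-event hours.
lower_kw = [90.0, 95.0, 100.0, 85.0]
upper_kw = [110.0, 115.0, 120.0, 105.0]
actual_kw = [100.0, 118.0, 101.0, 104.0]

coverage = interval_coverage(lower_kw, upper_kw, actual_kw)
width_kw = mean_interval_width(lower_kw, upper_kw)
```

Empirical coverage well below the nominal level signals overconfident intervals, while very wide intervals at the right coverage leave the settlement quantity ambiguous; both numbers would need agreed thresholds in the validation rules.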

3.4. Machine Learning Approaches

Machine learning approaches are also ubiquitous in the baseline load estimation literature. They are mostly used for customer baseline load (CBL) estimation in combination with IoT-enabled devices such as smart meters and phasor measurement units [56]. Their strength is that they learn directly from past observations and therefore depend less on manual feature engineering. This is the main reason why researchers use sequence-based models such as RNNs, LSTMs, and CNNs for forecasting purposes [57]. For instance, Kim et al. used an LSTM network to anticipate the state of charge in a battery energy storage system that was taking part in frequency regulation [58]. Khosravi et al. utilized hierarchical recurrent CNNs to enhance voltage and frequency control in frequency regulation [59]. In addition, Yu et al. implemented deep learning and hybrid CNN-LSTM architectures to forecast renewable generation outputs, such as wind and photovoltaic power, in flexible ramping applications [49].

In addition to neural networks, tree-based methods such as decision trees, random forests, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) are widely used due to their effectiveness in handling categorical or discrete input features, including day type, weather conditions, and weekday/weekend distinctions [60]. For example, Gassar et al. created an XGBoost framework that improved the estimate of short-term baseline load in homes beyond what typical statistical and machine learning models could do [56]. Jo et al. used LightGBM to enhance ramp event forecasting for wind power [61]. Despite their high predictive accuracy, ML-based methods are mostly limited to research settings because of their black-box nature, which hinders interpretability and acceptance among stakeholders in practical electricity market operations [37].

3.5. Hybrid and Physics-Informed Approaches

To address the limitations of conventional ML methods, such as limited interpretability, low generalizability, and poor transparency, domain knowledge and physical principles are increasingly incorporated to provide informative priors. To improve model dependability and interpretability, this approach, also known as physics-based or information-guided ML, uses empirical, observational, physical, or mathematical knowledge about system behavior [62]. High-accuracy, fine-granularity forecasts are especially important for ancillary services like load following and frequency regulation, where accurate baseline estimates ensure that compensation is cost-effective. Hybrid and physics-informed methodologies have been proposed to achieve this. To forecast solar production, Huang et al. developed an interval-based prediction model that combines a constrained GAN with CNN and dual long-term memory networks [63]. Harikrishnan et al. developed an XGBoost-DNN composite to forecast short-range reference home demand; this ensemble produced lower errors than any of the individual ML approaches [64]. Wu et al. devised a probabilistic, physics-guided approach that combines Monte Carlo estimates with state-space transformations based on the Koopman operator to assess how well wind farms can provide frequency regulation [65].

Bao et al. [66] used a Kalman filter, as depicted in Figure 4, together with physics-based models to accurately estimate baseline power for HVAC systems. The approach exploits the fact that power changes due to weather and occupancy are slower than those caused by frequency control signals. Feature selection methods are also commonly paired with ML to improve accuracy and reduce computational cost. For instance, Babu et al. employed recursive Hilbert transform techniques to identify critical input characteristics for estimating reference household demand [67]. Wen et al. created a feature selection system based on LightGBM to make predictions of prosumer load and generation more accurate and easier to understand [68]. Table 2 presents a comparison of the different estimation techniques.
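The time-scale separation argument can be illustrated with a scalar Kalman filter. The sketch below is not the formulation of Bao et al. [66]: it assumes a plain random-walk model for the slow baseline with no physics-based HVAC model, treats the fast regulation response as measurement noise, and uses hypothetical noise variances `q` and `r`.

```python
from typing import List, Sequence


def kalman_random_walk(
    measurements: Sequence[float], q: float, r: float, x0: float, p0: float
) -> List[float]:
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, z_t = x_t + v_t.

    q is the process-noise variance (how fast the baseline may drift) and
    r is the measurement-noise variance (here, the fast regulation swing).
    """
    if q < 0 or r <= 0 or p0 < 0:
        raise ValueError("require q >= 0, r > 0 and p0 >= 0")
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: uncertainty grows by the drift variance
        gain = p / (p + r)       # Kalman gain: small when measurements are noisy
        x = x + gain * (z - x)   # update the baseline estimate
        p = (1.0 - gain) * p     # update the estimate variance
        estimates.append(x)
    return estimates


# Hypothetical metered power (kW): a constant 100 kW baseline plus a fast
# +/-10 kW swing standing in for the response to a regulation signal.
measured_kw = [100.0 + (10.0 if i % 2 == 0 else -10.0) for i in range(20)]

baseline_kw = kalman_random_walk(measured_kw, q=0.01, r=100.0, x0=100.0, p0=1.0)
```

With `q` much smaller than `r`, the filter follows only slow movements, so the estimated baseline stays close to 100 kW while the fast swing is attributed to the regulation response.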

In short, rolling averaging is still the easiest and most widely used method because it only needs historical load data, which makes it useful for many real-world applications. The control group approach combines group-based methodologies with extra static or dynamic inputs to improve accuracy, which makes it suitable for commercial and industrial customers with consistent load patterns in low-compensation, grid-economy-focused DR programs. For residential customers with highly variable loads, regression and probabilistic estimation approaches work well, especially when occupancy schedules are known. Prosumers that use renewable energy benefit from probabilistic approaches because these can account for changes in weather [69]. Lastly, participants in load following, such as energy storage systems and industrial equipment with adjustable-speed operation, may obtain very accurate estimates from ML and hybrid models, provided their transparency issues are resolved or accepted. In general, no single baseline estimation approach works for all market-based DR providers [70]. Tree-based models with hierarchical, rule-based designs are easier to understand and more transparent than deep learning models with complicated underlying structures, which makes them good candidates for practical baseline load estimation. Deep learning architectures, including long short-term memory networks, convolutional neural networks, and transformers, demonstrate superior performance in time-series modeling but often operate as black boxes. The lack of interpretability and explainability remains a major barrier to the adoption of deep learning models in market-based DR programs. Explainable AI techniques, such as SHAP values and attention mechanisms, are increasingly used to address this challenge. To provide a comprehensive evaluation, Table 3 summarizes these methods based on typical error metrics, data requirements, and computational scalability.

4. Incentive Mechanisms for Market-Oriented DR

Incentive mechanisms form the institutional and economic foundation of incentive-based DR programs. Their primary objective is to align the interests of grid operators, aggregators, and end users by encouraging flexible electricity consumption while ensuring fairness, transparency, and operational reliability. Because DR programs serve different system objectives, incentive mechanisms should be tailored to reflect the operational purpose of the flexibility, the response characteristics required, and the market structure in which the services are delivered. In practice, incentive mechanisms fall into two main groups: those oriented toward grid economy and those oriented toward grid reliability. The two classes differ in their operation, compensation schemes, and regulatory purposes, but both must deal with strategic behavior, baseline manipulation, and information asymmetry, as shown in Figure 5.

4.1. Grid-Economy-Oriented Incentive Mechanisms

Grid-economy-oriented DR programs are primarily designed to improve market efficiency and reduce system operating costs. Typical examples are demand bidding programs, interruptible load services, emergency response programs, and direct load control schemes. These programs aim to shift or reduce electricity demand during periods of high prices or system stress, which lowers peak demand and avoids the need for expensive generation reserves.

4.1.1. Energy-Based Compensation Structures

In economy-driven DR services, customer flexibility is commonly remunerated through energy-based payments measured in kilowatt-hours (kWh). Participants receive financial compensation based on the amount of load they reduce relative to their estimated baseline during DR events. Because the baseline cannot be observed directly, any error or manipulation in it translates directly into over- or under-payment. As a result, incentive mechanisms must incorporate verification and validation procedures to ensure that reported load reductions represent genuine flexibility contributions [22].
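The dependence of the payment on the baseline can be made concrete with a minimal sketch. The function below assumes a flat price per kWh and assumes that intervals in which metered load exceeds the baseline earn nothing rather than a negative amount; both are illustrative assumptions, and actual programs may settle differently.

```python
def energy_payment(baseline_kwh, metered_kwh, price_per_kwh):
    """Energy-based DR payment: price times the reduction below baseline.

    baseline_kwh, metered_kwh: per-interval values over the DR event.
    Intervals with metered load above baseline are clipped to zero
    (an assumption of this sketch).
    """
    if len(baseline_kwh) != len(metered_kwh):
        raise ValueError("baseline and metered series must align")
    reduction = sum(max(b - m, 0.0) for b, m in zip(baseline_kwh, metered_kwh))
    return price_per_kwh * reduction
```

With the same metered load, raising each baseline interval by 1 kWh raises the payment by the price of every added kWh, which is why an inflated baseline is paid as if it were delivered flexibility.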

4.1.2. Credit-Scoring and Performance Rating Mechanisms

Credit-scoring mechanisms evaluate customer performance over historical DR events and assign reliability scores accordingly. Customers with good performance records are given priority access to future DR contracts, higher compensation rates, or priority dispatch. This promotes long-term participation and discourages opportunistic behavior, because customers are motivated to maintain consistent and verifiable performance. The same mechanisms also allow aggregators to control portfolio risk by selecting participants with predictable and consistent response characteristics, which enhances the overall flexibility offered. Credit-scoring schemes, which are widely adopted in electricity markets, are typically developed within a repeated-game framework. In such schemes, the load aggregator uses indirect assessment and validation to evaluate and rank customers’ results, and then either allocates rewards or penalties directly or influences participation decisions indirectly. The primary objective is to reduce uncertainty in DR delivery while maintaining fairness and credibility in the market [23].
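One simple way to turn event-by-event performance into a reliability score is an exponentially weighted average of the delivered-to-committed ratio. The sketch below is a hypothetical scoring rule written for illustration; the smoothing factor `alpha`, the cap at 1.0, and the function names are assumptions and do not reproduce the schemes in the cited works.

```python
def update_credit_score(score, delivered_kw, committed_kw, alpha=0.3):
    """Blend the previous score with the latest event performance.

    Performance is delivered / committed, clipped to [0, 1], so that
    over-delivery is not rewarded beyond full compliance (assumption).
    """
    if committed_kw <= 0:
        raise ValueError("committed_kw must be positive")
    performance = min(max(delivered_kw / committed_kw, 0.0), 1.0)
    return (1.0 - alpha) * score + alpha * performance


def rank_customers(scores):
    """Return customer ids ordered from highest to lowest credit score."""
    return sorted(scores, key=scores.get, reverse=True)
```

An aggregator could use the ranking to decide dispatch priority, so that a single poor event lowers a customer's standing gradually rather than abruptly.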

Lv et al. suggested a credit-based incentive system in which consumer pay is directly tied to their credit ratings [24]. The credit-scoring-based market-clearing system is a good illustration of how to construct indirect incentives. This mechanism may lead to higher marginal pricing, but the consistent and predictable flexibility it provides may cut overall utility costs in the end. Also, a number of credit indicators have been used in practice, such as subscription of efficiency indices and market trust indices [25].

Effective credit scoring requires the accurate identification of baseline manipulation. Wang et al. used a Markov decision process to represent manipulation behavior in rolling-average baseline approaches and investigated how the inputs affect the reference computation [26]. Their investigation revealed that manipulation displays a unimodal relationship with respect to parameter X. Ellman et al. employed a sequential stochastic optimization framework to examine how strategic consumer behavior affects expected outcomes under unpredictable DR event schedules [71]. As computational intelligence and networked smart devices continue to improve, credit-scoring systems are likely to become more consistent and easier to use, because they will be able to detect manipulative behavior more accurately and automatically. In short, all of these methods help make baseline load assessment less vulnerable to manipulation. Still, no single incentive design can simultaneously meet all of the important requirements, such as incentive compatibility, individual rationality, and social welfare maximization.

4.1.3. Self-Reported Baseline Declaration

In self-reported baseline mechanisms, customers declare their expected electricity consumption prior to a DR event. The declared baseline is then used as the reference for settlement. Deviations between actual consumption and the declared baseline determine the delivered flexibility and the corresponding payment. This method removes the need for complex baseline estimation models, but it also opens the door to deliberate baseline inflation. To reduce this risk, self-reported baselines are usually paired with penalty clauses, auditing mechanisms, and historical benchmarking. Under the voluntary participation condition, consumers are assumed to behave in ways that maximize their economic gain [28]. The self-reported baseline technique, first proposed by Muthirayan et al., is formulated as a repeated game with incomplete information. The DR operation under this system is broken down into four steps: (i) notice and reporting, (ii) bidding and market clearing, (iii) dispatch and execution, and (iv) settlement with incentives and penalties [42].

With this method, customers voluntarily report their baseline load, available DR capacity, and expected compensation before a DR event happens. Only customers whose bids are accepted in market clearing are dispatched to deliver flexibility, and they are all paid the same marginal price for each unit delivered. Customers who are not dispatched are penalized if they consume more than they declared. For dispatched customers, payment is based on their DR performance. However, if their delivered capacity falls below the minimum threshold of the declared value, they may still have to pay penalties. Conversely, if the realized capacity exceeds the upper limit, the excess may not be rewarded or may be paid at a lower rate [72]. In 2018, Vuelvas et al. streamlined the practical implementation of this mechanism and showed that, in large-scale aggregation settings, a low probability of dispatch is more conducive to encouraging truthful reporting [73]. Building on this, Wang et al. demonstrated that near-incentive compatibility can be achieved by introducing an allowable threshold for baseline inflation [74].
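The settlement rules described above can be sketched as a single function. The threshold fractions `lower` and `upper`, the linear penalty, and the choice to leave excess delivery unpaid are assumptions made for illustration; the cited mechanisms define their own parameters.

```python
def settle(declared_baseline, actual_load, dispatched, committed=0.0,
           price=0.0, penalty_rate=0.0, lower=0.8, upper=1.2):
    """Net settlement for one customer and one event (positive = paid).

    Not dispatched: penalized for consumption above the declared baseline.
    Dispatched: paid the uniform price for delivered reduction, capped at
    upper * committed, and penalized for any shortfall below
    lower * committed. All thresholds are hypothetical.
    """
    if not dispatched:
        excess = max(actual_load - declared_baseline, 0.0)
        return -penalty_rate * excess
    delivered = max(declared_baseline - actual_load, 0.0)
    paid_quantity = min(delivered, upper * committed)
    shortfall = max(lower * committed - delivered, 0.0)
    return price * paid_quantity - penalty_rate * shortfall
```

The penalty on customers who are not dispatched is what discourages baseline inflation: a customer who declares more than they actually consume gains nothing unless dispatched, and one who declares less risks a penalty on ordinary consumption.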

Although this mechanism relies on repeated interactions to encourage voluntary participation and increasingly truthful reporting, the penalty scheme needs to be carefully designed. Penalties that are too harsh are likely to deter participation, whereas penalties that are too mild will not deter strategic manipulation [75]. In addition, although the self-reporting procedure simplifies operations for load aggregators, it places an additional reporting and compliance burden on customers.

4.1.4. Profit-Sharing and Revenue Allocation Models

Profit-sharing mechanisms distribute a portion of market revenues or system cost savings among participating customers. Instead of compensating flexibility solely based on energy reductions, these mechanisms link customer remuneration to the overall economic benefit generated by the DR program. Such an arrangement promotes collaboration among system operators, aggregators, and consumers because all parties benefit from a more efficient system. Profit-sharing programs are particularly attractive in wholesale power markets, where DR can be used to stabilize prices and reduce congestion. The profit-sharing scheme aligns consumer and aggregator objectives by fostering cooperative participation with mutual benefits [76].

This scheme is frequently formalized as a contract that specifies how incentives are paid: a combination of a direct payment to the customer for DR effort and a given percentage of the aggregator’s total profits. The contract is designed to reduce strategic manipulation by maximizing the aggregator’s expected value while accounting for both misreporting behavior and effort-related costs, under the assumption that each party acts rationally. When the total DR capacity deviates from the desired level, all participating customers share responsibility in proportion to their declared DR capabilities. For example, profit-maximizing consumers may be tempted to understate how much they can reduce their load by lowering their declared baseline consumption [77]. Whether a customer under-reports or over-reports depends on the size of the actual reduction they can provide. Although the profit-sharing framework is designed to achieve a balanced-budget, win–win outcome, it exhibits limited incentive compatibility and, therefore, provides only weak guarantees of fairness among all stakeholders.

4.1.5. Hierarchical Game-Theoretic Mechanisms

Hierarchical incentive mechanisms model the strategic interactions between aggregators and customers using game-theoretic frameworks. In these models, aggregators act as leaders who design contracts and pricing schemes, while customers respond optimally based on their operational constraints and profit objectives. Such mechanisms enable the design of optimal contracts under strategic behavior, information asymmetry, and uncertainty. They provide a rigorous analytical foundation for balancing profit maximization with system-level objectives [78]. In incentive-based DR programs, utilities want to lower the costs of paying for resources that are in short supply. Aggregators want to get the most money from utilities while paying the least to customers. Customers want to get the most money from incentives while keeping their costs of dissatisfaction low in order to find the best levels of DR provision [79].

Samadi et al. developed a stochastic DR model using a mixed-strategy Stackelberg game to analyze the interaction between aggregators and residential consumers [80]. Cui et al. used a Stackelberg framework to dynamically optimize the energy pricing strategies of power providers, thereby improving DR effectiveness [81]. Although the Stackelberg game framework applies to many situations, it has inherent limitations that make incentive compatibility hard to achieve. It is therefore typically combined with the Vickrey–Clarke–Groves (VCG) mechanism to form a hybrid Stackelberg–VCG strategy. In this hybrid structure, the Stackelberg game sets wholesale market pricing, while the VCG mechanism ensures that customers report honestly [82].
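The leader-follower logic of a Stackelberg game can be illustrated with a deliberately small toy model, which is not taken from the cited studies. It assumes a single customer with a quadratic discomfort cost `discomfort * q**2`, an aggregator that resells each unit of reduction to the utility at `resale_value`, and a grid search over the incentive price; all names and functional forms are assumptions.

```python
def follower_reduction(price, discomfort, q_max):
    """Customer's best response: maximize price*q - discomfort*q**2
    over 0 <= q <= q_max, which gives q = price / (2 * discomfort)."""
    return min(max(price / (2.0 * discomfort), 0.0), q_max)


def leader_best_price(resale_value, discomfort, q_max, steps=100):
    """Aggregator anticipates the follower's response and picks the
    incentive price on a grid that maximizes (resale_value - price) * q."""
    best_price, best_profit = 0.0, 0.0
    for i in range(steps + 1):
        price = resale_value * i / steps
        q = follower_reduction(price, discomfort, q_max)
        profit = (resale_value - price) * q
        if profit > best_profit:
            best_price, best_profit = price, profit
    return best_price, best_profit
```

When the capacity limit does not bind, the search recovers the analytical result for this toy model, in which the aggregator offers half of the resale value. The model takes the customer's discomfort parameter as known, which is exactly the information the aggregator lacks in practice and the reason truthful-reporting mechanisms such as VCG are added.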

4.2. Grid-Reliability-Oriented Incentive Mechanisms

Grid-reliability-oriented DR programs aim to stabilize the power system, regulate frequency, and ensure secure operation. Such services include contingency reserve, load following, and frequency regulation, all of which demand fast response, high availability, and precise controllability.

4.2.1. Capacity-Based Compensation Schemes

In reliability-oriented DR programs, customer remuneration is commonly structured around the reservation of flexible capacity, typically quantified in kilowatts (kW). In this paradigm, participants are paid for committing a certain amount of controllable load or generation capacity, regardless of whether that flexibility is ultimately activated in a DR event. Such schemes are especially common in ancillary service markets and capacity markets, where system operators focus on resource adequacy and operational security [29]. Capacity-based compensation schemes are designed to guarantee the availability of sufficient reserve resources to manage unexpected contingencies.

4.2.2. Mileage-Based Performance Payments

In fast-response DR services, particularly those focused on frequency regulation and grid balancing, customers are often compensated not only for reserving capacity but also for the actual work performed in adjusting their load or generation. This form of compensation, called mileage-based performance payment, scales with the amount of energy or power adjustment that participants make in line with real-time system control signals over a given time span [83]. For example, if a customer’s controllable load increases or decreases repeatedly in response to automatic generation control (AGC) signals, the total magnitude of these adjustments over time determines the payment. This approach ensures that participants are rewarded for both the accuracy and the speed of their response, rather than merely for committing capacity that may remain idle. As a result, resources that can track regulation signals more precisely and respond with minimal delay achieve higher revenues, reinforcing the economic value of superior control performance [31]. AGC-based demand response belongs to the category of fast balancing and regulation services, where flexible loads or aggregations are dispatched continuously to follow a regulation signal issued by the system operator. The role of AGC is to support secondary frequency control by correcting short-term imbalances between generation and demand and helping maintain scheduled interchange and system frequency. In this setting, DR resources are rewarded not only for being available, but also for how accurately and rapidly they track the dispatched signal over time.
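A minimal sketch of a mileage-based payment is given below. Mileage is computed as the summed absolute change of the resource's response between consecutive samples, and the accuracy term is a simple normalized tracking error; the accuracy formula, the multiplicative payment rule, and all names are assumptions for illustration, since system operators define their own performance scores.

```python
def regulation_mileage(power_kw):
    """Total absolute movement between consecutive power samples (kW)."""
    return sum(abs(b - a) for a, b in zip(power_kw, power_kw[1:]))


def tracking_score(signal_kw, response_kw):
    """Accuracy in [0, 1]: 1 minus the normalized absolute tracking error
    (hypothetical score; returns 0.0 for an all-zero signal)."""
    if len(signal_kw) != len(response_kw):
        raise ValueError("signal and response must align")
    total_signal = sum(abs(s) for s in signal_kw)
    if total_signal == 0:
        return 0.0
    error = sum(abs(s - r) for s, r in zip(signal_kw, response_kw))
    return max(0.0, 1.0 - error / total_signal)


def mileage_payment(signal_kw, response_kw, price_per_kw):
    """Payment grows with movement delivered and with tracking accuracy."""
    mileage = regulation_mileage(response_kw)
    return price_per_kw * mileage * tracking_score(signal_kw, response_kw)
```

Under this rule, a resource that sits idle earns no mileage payment even if its capacity was reserved, while a resource that moves a lot but tracks the signal poorly has its payment scaled down by the accuracy term.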

4.2.3. Incentive-Compatible Contract Design

Incentive-based DR programs are fundamentally designed to achieve incentive compatibility, which ensures that participants acting to maximize their own profits truthfully reveal their DR costs and preferences [84]. Among the various mechanisms developed, the Vickrey–Clarke–Groves (VCG) approach stands out as one of the most prominent auction-based mechanisms, as it simultaneously guarantees incentive compatibility and maximizes social welfare [85]. In the VCG model, the best strategy for every player is to declare their true utility function, and the resulting outcome is socially optimal. The resulting equilibrium is a Nash equilibrium, in which no player can increase their payoff by unilaterally changing strategy, given the strategies of the other players. Li et al. proposed a resource adequacy settlement model based on the VCG approach, in which generators are paid back their real capacity costs [86].
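The VCG payment rule can be illustrated with a small procurement auction. The sketch assumes that the operator needs `k` identical blocks of load reduction and that each customer offers exactly one block at a reported cost; this unit-supply setting and the function names are simplifying assumptions. Each winner is paid the externality they impose on the others, that is, the cost of serving the requirement without them minus the cost borne by the other winners when they are present.

```python
def cheapest_cost(costs, k):
    """Total cost of the k cheapest offers."""
    return sum(sorted(costs)[:k])


def vcg_procurement(bids, k):
    """bids: dict customer -> reported cost of one block; k blocks needed.

    Returns (winners, payments), with payments following the Clarke
    pivot rule. Requires more than k bidders so that every winner is
    replaceable.
    """
    if len(bids) <= k:
        raise ValueError("need more than k bidders")
    winners = sorted(bids, key=bids.get)[:k]
    total = sum(bids[w] for w in winners)
    payments = {}
    for w in winners:
        others = [c for name, c in bids.items() if name != w]
        cost_without_w = cheapest_cost(others, k)
        cost_of_others_with_w = total - bids[w]
        payments[w] = cost_without_w - cost_of_others_with_w
    return winners, payments
```

For example, with reported costs of 3, 5, 8, and 10 and two blocks required, the two cheapest customers win and each is paid 8, the first rejected offer. A winner's payment does not depend on their own report, which is why truthful reporting is optimal, but the total paid (16) exceeds the reported costs (8), which illustrates the budget pressure that VCG places on the buyer.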

Although the VCG mechanism is attractive in theory, it has several practical problems. It requires participants to share all of their preferences with the aggregator, which raises privacy concerns. It also has high computational complexity and does not by itself guarantee budget balance [87]. In practice, other game-theoretic mechanisms that work with incomplete information are therefore often preferred. The Arrow–d’Aspremont–Gerard-Varet (AGV) mechanism is one example; it provides Bayesian incentive compatibility under uncertainty [88]. Other examples are Bayesian-VCG and VCG-like mechanisms. Although these options may not be completely strategy-proof, they can better protect participants’ privacy, maintain budget balance, and reduce computational requirements [89]. Incentive compatibility is a top-down market design principle that focuses on improving overall social welfare rather than the profits of individual stakeholders. China is developing rules for operating an incentive-compatible electricity market. These rules are founded on the principle that the costs and payments from DR programs are evenly shared between utilities and participants [21]. Nonetheless, designing and implementing fully incentive-compatible mechanisms remains challenging. The VCG mechanism and similar near-incentive-compatible mechanisms are best suited to small, high-paying DR programs, such as those in ancillary services, where the advantages of truthful reporting and social welfare maximization matter most. Table 4 summarizes the incentive mechanisms used in DR markets, including their baseline dependence, vulnerabilities, and mitigation approaches.

5. Digital Trust Enablers in DR Market

Digital technologies play an increasingly important role in enhancing transparency and trust in incentive-based DR programs. The following subsections review technologies currently deployed to build trust in DR markets.

5.1. Blockchain Technology

Blockchain technology enables secure, immutable, and transparent recording of DR transactions and baseline calculations, facilitating automated settlement through smart contracts. As a distributed ledger technology, it enables secure and trustworthy information sharing among multiple participants. It is organized as a chain of data blocks, where each block is cryptographically linked to its predecessor through hash-based signatures [37]. In addition, it provides a technical foundation for smart contracts, which are digital agreements that are automatically executed once predefined conditions are satisfied [90]. Figure 6 illustrates a blockchain-based trading framework for incentive-driven DR programs. The cyber layer, represented by the smart contract, handles all system transactions and process security, whereas the physical layer is concerned with power distribution and the collection of customer data. Each customer submits their load data, DR capacity, expected remuneration, and credit rating, which are stored across sequential ledger blocks. Spot checks verify data integrity between blocks. Smart contracts perform matching and market clearing based on supply and demand, and update credit scores according to ex-post performance. Finally, payments are automatically executed by the blockchain according to verified results, culminating in the final validated ledger block. Throughout this procedure, all submitted and processed data are time-stamped and secured by cryptographic signatures, which makes the records practically impossible to alter and easy to trace.
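The hash-based linking of blocks can be sketched in a few lines with Python's standard `hashlib` and `json` modules. This is a simplified, single-node illustration: it omits consensus, digital signatures, and time stamps, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


def block_hash(body):
    """SHA-256 over a canonical JSON encoding of the block body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, record):
    """Append a record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"index": len(chain), "record": record, "prev_hash": prev_hash}
    chain.append({**body, "hash": block_hash(body)})


def verify_chain(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        body = {"index": block["index"], "record": block["record"],
                "prev_hash": block["prev_hash"]}
        if (block["index"] != i or block["prev_hash"] != prev_hash
                or block["hash"] != block_hash(body)):
            return False
        prev_hash = block["hash"]
    return True
```

If a declared baseline stored in an early block is edited after the fact, the recomputed hash no longer matches the stored one and `verify_chain` returns `False`, which is the property that makes ledger entries traceable and tamper-evident.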

Li et al. showed that blockchain-based self-reported baselines may improve consumer privacy and security at a minimal implementation cost. To further strengthen data security, each blockchain participant receives a unique identity certificate and a pair of public and private cryptographic keys [91]. The customer’s private key encrypts and signs the submitted information to prove its authenticity, while the load aggregator verifies the data with the associated public key. Wang et al. proposed a spot-check technique within this framework. This approach allows the load aggregator to randomly ask customers to provide their private keys so that settlement data and outcomes can be double-checked. If false information is found, the customer’s credit score is reduced. Random and regular spot checks can thus deter cheating, limit abuse, and build trust among market participants. Nevertheless, such procedures may compromise customers’ privacy and reduce their willingness to participate [92].

Furthermore, due to its decentralized computing structure, blockchain naturally supports peer-to-peer (P2P) transactions, allowing market participants to trade directly without relying on intermediaries. Blockchain-based trading systems are well suited to P2P energy markets because they are secure, reliable, and trustworthy at the peer level [93]. In general, smart contracts based on IoT and blockchain technology provide a high level of trust and security, which helps prevent baseline manipulation and makes the entire trading cycle of incentive-based DR programs more legitimate, transparent, reliable, and affordable.

5.2. Federated Learning

Federated learning (FL) lets participants train models collaboratively without sharing raw data, which protects customer privacy and improves model generalization. Combining blockchain with FL might lead to safe, private, and reliable baseline intelligence. FL has become a well-explored distributed learning paradigm for data security, privacy protection, and information credibility because it allows collaborative model training while keeping data private [94]. The two technologies protect data in different ways: FL minimizes data exchange, which lowers the risk of privacy leaks, whereas blockchain protects shared data using cryptographic methods and consensus protocols. Figure 7 shows how the reference load is estimated in an FL architecture. Under this framework, customers do not have to provide their raw consumption data. Instead, local models are trained at each customer’s site, and only the model parameters or updates are sent to a central server for global aggregation. In this way, the global model can learn common patterns across different datasets, which makes baseline load estimation more accurate and more widely applicable [95].
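The local-training and global-aggregation loop can be sketched with a federated averaging scheme on a deliberately simple model. The example assumes a one-feature linear baseline model (load as a linear function of, for example, temperature) trained by gradient descent, and a server that averages client parameters weighted by sample count; the model, learning rate, and function names are illustrative assumptions, not the architecture of Figure 7.

```python
def local_update(weights, data, lr=0.01, epochs=20):
    """Gradient descent on squared error using one client's private data.

    weights: (slope, intercept); data: list of (x, y) pairs that never
    leave the client. Only the updated weights are returned.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: sample-size-weighted mean of the client parameters."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=50, init=(0.0, 0.0)):
    """Alternate local training and global aggregation for several rounds."""
    global_weights = init
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights
```

The server only ever sees the pairs of parameters returned by `local_update`, never the `(x, y)` consumption records, which is the privacy property the text describes. Model parameters can still leak information in practice, so deployed systems often add further protection.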

Chen et al. proposed an FL method for reconstructing residential baseline loads while protecting privacy [96]. Shi et al. created an FL-based system with built-in privacy protection for predicting residential flexibility [97]. Cheng et al. proposed an optimization technique based on FL and a Stackelberg game to identify the best DR participants without putting users’ privacy at risk [98]. FL can also be combined with blockchain to keep immutable, traceable records of model updates. Danish et al. developed an FL method that can be paired with a blockchain to estimate baseline loads for electric vehicle charging while protecting users’ private and sensitive data [99].

Estimating customer baseline load in incentive-based DR programs requires not just sophisticated modeling approaches but also well-designed incentive mechanisms and robust information technologies. Baseline estimation therefore involves many trade-offs between competing requirements, such as accuracy and simplicity, budget balance and incentive compatibility, and privacy and transparency. Identifying the most appropriate estimation techniques, remuneration strategies, and digital infrastructures requires comprehensive techno-economic optimization that considers the speed, duration, and direction of the DR response as well as the effectiveness of customers’ response. Load aggregators want to maximize revenue from utilities while minimizing capital investment, operating and maintenance expenses, and customer compensation. The structure of local flexibility markets affects the relative importance of these competing goals. For instance, in developing countries with limited financial resources, maintaining budget balance should take priority to keep the program viable. In contrast, in established markets with substantial cash flow, it is important to focus on incentive compatibility to keep the market efficient and sustainable over the long term.

6. Open Challenges and Future Research Directions

With the rapid growth of distributed energy resources (DERs), an expanding pool of flexible assets is now being mobilized to participate in a wide range of market-oriented DR services. However, the diversity of regulatory frameworks and operational rules complicates the accurate quantification of DR performance, thereby introducing new challenges for baseline estimation. These challenges include the invisibility of behind-the-meter consumption, the participation of the same customers in multiple services, and the problem of limited or unavailable data. The following subsections elaborate on these emerging issues.

6.1. Invisible Behind-the-Meter Consumption Behaviors

Behind-the-meter DERs, which are located on the customer’s side of the utility meter and enhance demand-side flexibility, are becoming more common, as depicted in Figure 8. In these cases, aggregators can only observe the net load and must infer the actual generation and consumption behind the meter from limited metering infrastructure [100]. The intermittent availability of these hidden resources may alter load profiles and make baseline gaming more likely, for example by deliberately curtailing PV production or shifting battery discharge. End users increasingly adopt on-site generation technologies, including rooftop micro wind turbines and building-integrated PV systems, to develop low-carbon and sustainable buildings [101]. However, a high share of renewable generation makes customer load profiles more sensitive to weather conditions. To address this problem, Tian et al. proposed a robust architecture that uses control groups and meteorological data to separate PV production from net load [102,103].

At the same time, the rapid growth of electric vehicles (EVs) has led to the installation of many charging stations in homes and businesses, which supports private mobility. However, stochastic EV charging patterns make normal load profiles less predictable, so they need to be handled properly in baseline estimation. To solve this problem, Kammona et al. proposed a memory-based transformer that can detect behind-the-meter EV charging events without using preset charging signatures [104]. In addition, the widespread use of electrical and thermal energy storage systems offers more ways to exploit multi-time-scale flexibility, such as load shifting, peak shaving, and fast frequency control [105]. These technologies can reshape customers’ power consumption through flexible charging and discharging, but they also make baseline estimation harder.

While current research provides sophisticated methodologies for disaggregating behind-the-meter DERs from net load and identifying possible manipulation, the presence of many DER technologies may create intricate and bidirectional power flow patterns. As a result, it is still very hard to obtain accurate baseline loads from multiplex consumption profiles using the existing monitoring infrastructure. To tackle this problem, explainable artificial intelligence (XAI) has emerged as a potential way to obtain credible baseline estimates. XAI aims to make AI models more transparent and understandable by showing why they make their predictions. There are two main types of XAI methods: intrinsic and post hoc [106]. Intrinsic methods rely on models that are interpretable by design, such as decision trees or linear models. In contrast, post hoc methods provide explanations for black-box models without modifying their internal architectures, thereby offering model-agnostic interpretability. Common techniques such as SHAP, LIME, and ICE have been used to determine how much each input variable affects solar and wind power forecasts and to rank their relative importance [107].

6.2. Overlapping Consumers in Multi-Service Participation

As behind-the-meter DERs become more common, consumers may take part in more than one incentive-based DR program at the same time. Renewable energy sources are known to be important for services such as peak shaving and contingency reserve. Energy storage solutions, such as stationary batteries and electric vehicles, are typically used alongside intermittent renewables to give them more flexibility. Because they can charge and discharge and respond quickly, storage systems may take part in interruptible load programs, load following services, and frequency control. Active power storage, which is often used with building HVAC systems, may also provide flexibility across a wide range of time scales [104]. These technologies may help reduce load throughout the day and can support fast frequency response by shutting down equipment and intelligently controlling variable-speed devices. Some customers are signed up with more than one aggregator and may provide several services at the same time. This reduces transparency and makes it harder to set up fair settlement and compensation systems [4].

To address overlapping participation in EV aggregation, Ji et al. proposed Stackelberg game models in which aggregators act as leaders and EVs as followers to maximize the overall payoffs [103]. Fraija et al. created cooperative Stackelberg game frameworks for multi-aggregator settings in which residential consumers belong to more than one aggregation scheme. Their strategies seek to reduce uncertainty, lower utility costs, and distribute gains equitably via profit-sharing systems [32]. These findings suggest that cooperative game theory is a potential solution for multi-aggregator environments, since it makes it easier to build alliances for mutual benefit, although its applicability is limited to single-service interactions. Tang et al. proposed a Stackelberg-game-based optimization framework for prosumers providing different flexibility services to find the best transaction costs and flexibility capacities, taking into account the balance between customer satisfaction and profit [33].

Consequently, the distribution of benefits for consumers participating in various DR services presents a multi-party dilemma, including both aggregators and customers, as depicted in Figure 9. To stop double compensation and scenarios where utilities pay for flexibility that is not given while consumers take advantage of the system, strong incentive and settlement systems are needed. In this situation, blockchain-enabled FL frameworks that use cryptographic methods to make data unchangeable and trackable while preserving privacy via distributed learning are a potential way to increase trust and openness.

6.3. The Dilemma of Unavailable Data

Most methods for estimating baseline load require a large amount of historical non-DR consumption data. However, if a customer participates frequently in routine DR activities, such as demand bidding, and also in a few highly important emergency events, such as contingency reserve activation, clean non-DR data may become seriously scarce [108]. In these situations, baseline estimation approaches lack sufficient data for training, validating, and calculating models. To solve this problem, generative adversarial networks (GANs) and transfer learning have been used to deal with data scarcity and small sample sizes in rare-event situations. Li et al. [109] and Tian et al. [110] developed hybrid transfer-learning methods for buildings with limited information and for large EV fleets, which enable short-term cross-domain prediction of power consumption. Gao et al. devised a hybrid GAN model for predicting solar radiation that allows knowledge transfer from source systems to target systems without any labels [111]. Nevertheless, the acceptance of synthetically generated data for baseline load estimation by both aggregators and customers remains an open question, particularly given the limited transparency of many generative modeling approaches. A summary of open challenges in customer baseline load estimation is presented in Table 5 to provide a holistic overview.

6.4. Trustworthiness of Baseline Load Estimation Methods

A major future research challenge is not only improving the predictive performance of baseline load estimation methods, but also ensuring their trustworthiness in real market deployment. In incentive-based DR, a baseline is used as a financial reference for compensation, penalty assignment, and performance verification. Therefore, a method with high statistical accuracy may still be unsuitable if it is difficult to audit, sensitive to strategic manipulation, unstable under changing customer behavior, or unable to provide clear justification during settlement disputes. From this perspective, trustworthy baseline intelligence should satisfy several additional requirements, including transparency of model logic, reproducibility of results, robustness to gaming, calibrated uncertainty reporting, privacy-preserving verification, and compatibility with auditable digital infrastructures.
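One of the requirements listed above, calibrated uncertainty reporting, can be checked with a simple coverage test. The sketch below assumes the baseline method reports a lower and upper bound for each interval and compares the empirical coverage on non-event days with the nominal level. The function names and the `tolerance` parameter are illustrative assumptions, not part of any settlement standard.

```python
def interval_coverage(actual, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("all inputs must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actual, lower, upper))
    return hits / len(actual)


def is_calibrated(actual, lower, upper, nominal=0.9, tolerance=0.05):
    """True if empirical coverage is within tolerance of the nominal level.

    nominal is the coverage the intervals claim (e.g., 0.9 for 90% intervals);
    tolerance is an assumed acceptance band chosen for illustration.
    """
    return abs(interval_coverage(actual, lower, upper) - nominal) <= tolerance
```

A check of this kind is easy to audit because it needs only the metered load and the reported bounds, not access to the underlying model.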

7. Conclusions

Customer baseline load is a fundamental reference for evaluating DR performance and determining customer compensation in incentive-based programs. This paper reviews baseline load estimation for such programs, examining a broad spectrum of modeling approaches, market-compatible incentive structures, and advanced digital technologies that enable transparent and reliable DR transactions. Based on this review, several key findings are summarized below.

First, no universal baseline estimation approach can be applied across all market-based DR services in a cost-effective manner. For reliability-driven programs characterized by high remuneration levels and stringent performance requirements, ML models and hybrid data-driven approaches are frequently adopted to improve predictive precision, although this often comes at the expense of model simplicity and interpretability. Economy-oriented DR services, which focus on energy-based compensation, in turn favor simpler mechanisms such as self-declared baselines and reputation-based credit scoring, which are grounded in repeated-game theory and emphasize long-term participation incentives. Second, incentive designs based on non-cooperative game-theoretic principles offer promising tools for deterring baseline manipulation and strategic behavior. Third, emerging digital infrastructures, including Internet-of-Things-enabled blockchain platforms and privacy-preserving FL frameworks, are reshaping the architecture of DR markets. These technologies make smart contracts tamper-proof and settlement processes auditable, which makes it easier to verify customer contributions in a decentralized way, strengthens data integrity, and reduces the likelihood of deliberate baseline distortion.

The increasing penetration of distributed energy resources and the diversification of regulatory instruments are leading to more irregular and heterogeneous consumption patterns, which introduce additional complexity for baseline estimation. Several critical challenges arise in this evolving landscape, including the disaggregation of behind-the-meter resources with limited observability, the redesign of settlement rules for customers participating simultaneously in multiple services, and the validation of synthetic data generated by ML models to address data scarcity and rare-event scenarios. In addition to traditional baseline paradigms, two alternative approaches to measuring customer flexibility show potential for the future. One of them, the customer directrix load program, is a smart grid strategy in which utilities provide end users with a dynamically calculated target consumption profile known as the directrix. Rather than simply asking customers to reduce power during emergencies, the utility transmits this target curve to a home or building’s smart energy management system. The automated system then adjusts flexible loads, such as smart thermostats, electric vehicle chargers, and water heaters, to track the target curve closely, minute by minute, throughout the day. By keeping real-time consumption closely aligned with the directrix, this approach supports grid stability, eases the integration of fluctuating renewable energy sources, and rewards participating customers with financial incentives. It also simplifies the computations, lessens disputes over fairness, and reduces congestion at the distribution level. Despite their potential, these alternatives are currently constrained by incomplete incentive structures and limited service applicability.
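How closely a customer follows a directrix can be quantified without estimating a counterfactual baseline. The sketch below is only an illustration of that idea: the scoring rule (one minus the normalized absolute tracking error, floored at zero) and the name `tracking_score` are assumptions made here, not the metric defined in the directrix load literature.

```python
def tracking_score(actual_kw, directrix_kw):
    """Score in [0, 1] for how closely metered load follows the directrix.

    Uses 1 - sum(|actual - target|) / sum(target), floored at 0.
    This rule is an illustrative assumption, not an established standard.
    """
    if len(actual_kw) != len(directrix_kw):
        raise ValueError("actual and directrix series must have the same length")
    total_target = sum(directrix_kw)
    if total_target <= 0:
        raise ValueError("directrix must have positive total demand")
    abs_error = sum(abs(a - d) for a, d in zip(actual_kw, directrix_kw))
    return max(0.0, 1.0 - abs_error / total_target)
```

Because both inputs are directly observable (the published target and the metered load), a score of this type avoids disputes over an unobservable counterfactual, although it still requires an incentive rule that maps the score to payments.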

Looking forward, a central research direction lies in developing an integrated baseline intelligence framework that combines IoT, blockchain, and artificial intelligence to achieve accurate, cost-efficient, manipulation-resistant, and privacy-aware performance quantification. In this context, explainable and physics-informed ML techniques are a promising avenue for balancing predictive accuracy against transparency and interpretability. Similarly, hybrid architectures that combine blockchain with FL and cryptographic privacy-enhancing technologies are strong candidates for enabling trustworthy and confidential market transactions.

Author Contributions

Conceptualization, S.S., B.L., M.A. and A.M.; methodology, S.S., B.L. and Q.G.; software, S.S., B.Q. and B.B.; validation, Q.G., B.Q., M.A. and B.B.; formal analysis, M.A., B.L. and A.M.; investigation, S.S., B.L. and M.A.; resources, B.L., Q.G. and B.Q.; data curation, A.M. and B.Q.; writing—original draft preparation, S.S., B.L. and A.M.; writing—review and editing, A.M., Q.G., B.Q. and B.B.; visualization, B.B. and B.Q.; supervision, B.L., Q.G. and B.Q.; project administration, B.Q., B.B. and B.L.; funding acquisition, B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by Beijing Changping District’s Special Program for Science and Technology Deputy Chief under the title: “Construction of a Resource Pool for the New-Type Power Load Management System and Development of Interactive Simulation Software” (2023-806).

Data Availability Statement

Data will be available on demand from the corresponding authors.

Acknowledgments

The authors would like to thank Beijing Changping District’s Special Program for Science and Technology.

Conflicts of Interest

Author Qi Guo was employed by the Inner Mongolia Power (Group) Company Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Ejuh Che, E.; Roland Abeng, K.; Iweh, C.D.; Tsekouras, G.J.; Fopah-Lele, A. The impact of integrating variable renewable energy sources into grid-connected power systems: Challenges, mitigation strategies, and prospects. Energies 2025, 18, 689.
2. Kabeyi, M.J.B.; Olanrewaju, O.A. Sustainable energy transition for renewable and low carbon grid electricity generation and supply. Front. Energy Res. 2022, 9, 743114.
3. Islam, M.M.; Yu, T.; Giannoccaro, G.; Mi, Y.; La Scala, M.; Nasab, M.R.; Wang, J. Improving reliability and stability of the power systems: A comprehensive review on the role of energy storage systems to enhance flexibility. IEEE Access 2024, 12, 152738–152765.
4. Tang, H.; Wang, S.; Li, H. Flexibility categorization, sources, capabilities and technologies for energy-flexible and grid-responsive buildings: State-of-the-art and future perspective. Energy 2021, 219, 119598.
5. Siano, P. Demand response and smart grids—A survey. Renew. Sustain. Energy Rev. 2014, 30, 461–478.
6. Li, Y.; Yang, Y.; Zhang, F.; Li, Y. A Stackelberg game-based approach to load aggregator bidding strategies in electricity spot markets. J. Energy Storage 2024, 95, 112509.
7. Türkoğlu, A.S.; Erkmen, B.; Eren, Y.; Erdinç, O.; Küçükdemiral, İ. Integrated approaches in resilient hierarchical load forecasting via TCN and optimal valley filling based demand response application. Appl. Energy 2024, 360, 122722.
8. Li, K.; Wang, F.; Mi, Z.; Fotuhi-Firuzabad, M.; Duić, N.; Wang, T. Capacity and output power estimation approach of individual behind-the-meter distributed photovoltaic system for demand response baseline estimation. Appl. Energy 2019, 253, 113595.
9. Lee, H.; Jang, H.; Oh, S.H.; Kim, N.W.; Kim, S.; Lee, B.T. Novel single group-based indirect customer baseline load calculation method for residential demand response. IEEE Access 2021, 9, 140881–140895.
10. Muqtadir, A.; Li, B.; Ying, Z.; Songsong, C.; Kazmi, S.N. Nowcasting the next hour of residential load using boosting ensemble machines. Sci. Rep. 2025, 15, 7157.
11. Chandrasekaran, R.; Paramasivan, S.K. Advances in deep learning techniques for short-term energy load forecasting applications: A review. Arch. Comput. Methods Eng. 2025, 32, 663–692.
12. Klyuev, R.V.; Morgoev, I.D.; Morgoeva, A.D.; Gavrina, O.A.; Martyushev, N.V.; Efremenkov, E.A.; Mengxu, Q. Methods of forecasting electric energy consumption: A literature review. Energies 2022, 15, 8919.
13. Srinivasan, S.; Kumarasamy, S.; Andreadakis, Z.E.; Lind, P.G. Artificial intelligence and mathematical models of power grids driven by renewable energy sources: A survey. Energies 2023, 16, 5383.
14. Xu, T.; Wang, F. Review and prospect of power demand response implementation. Distrib. Energy 2024, 9, 1–11.
15. Gabaldón, A.; García-Garre, A.; Ruiz-Abellón, M.C.; Guillamón, A.; Álvarez Bel, C.; Fernandez-Jimenez, L.A. Improvement of customer baselines for the evaluation of demand response through the use of physically-based load models. Util. Policy 2021, 70, 101213.
16. Ziras, C.; Heinrich, C.; Bindner, H.W. Why baselines are not suited for local flexibility markets. Renew. Sustain. Energy Rev. 2021, 135, 110357.
17. Liaquat, S.; Zia, M.F.; Benbouzid, M. Modeling and formulation of optimization problems for optimal scheduling of multi-generation and hybrid energy systems: Review and recommendations. Electronics 2021, 10, 1688.
18. Zhang, Y.; Fan, S.; Meng, Y.; He, G. Payment and incentive allocation in demand response based on cost causation principle. IEEE Trans. Ind. Appl. 2025, 61, 8674–8687.
19. Valentini, O.; Andreadou, N.; Bertoldi, P.; Lucas, A.; Saviuc, I.; Kotsakis, E. Demand response impact evaluation: A review of methods for estimating the customer baseline load. Energies 2022, 15, 5259.
20. Lind, L.; Chaves-Ávila, J.P.; Valarezo, O.; Sanjab, A.; Olmos, L. Baseline methods for distributed flexibility in power systems considering resource, market, and product characteristics. Util. Policy 2024, 86, 101688.
21. Yan, X.; Gao, B.; Yu, Y.; Liu, N. Cooperated operation for renewable energy community with energy storage capacity rental in the frequency regulation market. IEEE Trans. Ind. Inform. 2023, 20, 5182–5192.
22. Chen, Y.; Liu, F.; Wei, W.; Mei, S. Energy sharing at demand side: Concept, mechanism and prospect. Autom. Electr. Power Syst. 2021, 45, 1–11.
23. Lee, K.; Lee, H.; Lee, H.; Yoon, Y.; Lee, E.; Rhee, W. Assuring explainability on demand response targeting via credit scoring. Energy 2018, 161, 670–679.
24. Lv, T.; Yan, Y.; Li, L.; Zhou, Z.; Zhang, Z.; Zhang, T.; Yang, L.; Lin, Z. Credit-based demand side incentive mechanism optimization for load aggregator. Energy Rep. 2022, 8, 227–234.
25. Anil, V.; Arun, S. Credit rating-based transactive energy system with uncertainties in energy behavior. IEEE Access 2023, 11, 132101–132118.
26. Wang, X.; Tang, W. Modeling and analysis of baseline manipulation in demand response programs. IEEE Trans. Smart Grid 2021, 13, 1178–1186.
27. Segovia, E.; Vukovic, V.; Bragatto, T. Comparison of baseline load forecasting methodologies for active and reactive power demand. Energies 2021, 14, 7533.
28. Haring, T.; Andersson, G. Contract design for demand response. In Proceedings of the IEEE PES Innovative Smart Grid Technologies, Europe; IEEE: New York, NY, USA, 2014; pp. 1–6.
29. Hurley, D.; Peterson, P.; Whited, M. Demand Response as a Power System Resource; Synapse Energy Economics Inc.: Cambridge, MA, USA, 2013.
30. Chapman, N. Techno-Economic Assessment of Business Cases for Multi-Energy Demand Response. Ph.D. Thesis, The University of Manchester, Manchester, UK, 2016.
31. Ullah, K.; Basit, A.; Ullah, Z.; Aslam, S.; Herodotou, H. Automatic generation control strategies in conventional and modern power systems: A comprehensive overview. Energies 2021, 14, 2376.
32. Fraija, A.; Henao, N.; Agbossou, K.; Kelouwani, S.; Fournier, M. Cooperative price-based demand response program for multiple aggregators based on multi-agent reinforcement learning and Shapley-value. Sustain. Energy Grids Netw. 2024, 40, 101560.
33. Tang, H.; Wang, S. Game-theoretic optimization of demand-side flexibility engagement considering the perspectives of different stakeholders and multiple flexibility services. Appl. Energy 2023, 332, 120550.
34. Chao, H.-P. Demand response in wholesale electricity markets: The choice of customer baseline. J. Regul. Econ. 2011, 39, 68–88.
35. Chao, H.-P.; DePillis, M. Incentive effects of paying demand response in wholesale electricity markets. J. Regul. Econ. 2013, 43, 265–283.
36. Zhang, L.; Zhu, T.; Xiong, P.; Zhou, W.; Yu, P.S. A robust game-theoretical federated learning framework with joint differential privacy. IEEE Trans. Knowl. Data Eng. 2022, 35, 3333–3346.
37. Sciume, G.; Palacios-Garcia, E.J.; Gallo, P.; Sanseverino, E.R.; Vasquez, J.C.; Guerrero, J.M. Demand response service certification and customer baseline evaluation using blockchain technology. IEEE Access 2020, 8, 139313–139331.
38. Ghasemi, A.; Hojjat, M.; Saebi, J.; Neisaz, H.R.; Hosseinzade, M.R. An investigation of the customer baseline load (CBL) calculation for industrial demand response participants–A regional case study from Iran. Sustain. Oper. Comput. 2023, 4, 88–95.
39. Lee, E.; Lee, K.; Lee, H.; Kim, E.; Rhee, W. Defining virtual control group to improve customer baseline load calculation of residential demand response. Appl. Energy 2019, 250, 946–958.
40. Wu, T.; Hu, R.; Zhu, H.; Jiang, M.; Lv, K.; Dong, Y.; Zhang, D. Combined IXGBoost-KELM short-term photovoltaic power prediction model based on multidimensional similar day clustering and dual decomposition. Energy 2024, 288, 129770.
41. Muqtadir, A.; Li, B.; Qi, B.; Chen, S.; Shi, K. DualDRNet: A Unified Deep Learning Framework for Customer Baseline Load Estimation and Demand Response Potential Forecasting for Load Aggregators. IEEE Access 2025, 13, 167280–167301.
42. Muthirayan, D.; Kalathil, D.; Poolla, K.; Varaiya, P. Baseline estimation and scheduling for demand response. In Proceedings of the 2018 IEEE Conference on Decision and Control (CDC); IEEE: New York, NY, USA, 2018; pp. 4857–4862.
43. Binz Varghese, K. Baseline Load Estimation of Residential Customers for Incentive Based Demand Response Programs. Ph.D. Thesis, Universitat Politècnica de Catalunya, Barcelona, Spain, 2024.
44. Jiang, H.; Kong, X.; Zhang, X.; Wang, Z.; Guo, M. Deep Learning-based Quantile Regression for Demand Response Potential Assessment. In Proceedings of the 2024 IEEE Power & Energy Society General Meeting (PESGM); IEEE: New York, NY, USA, 2024; pp. 1–5.
45. Zhang, X.; Hug, G.; Kolter, J.Z.; Harjunkoski, I. Demand response of ancillary service from industrial loads coordinated with energy storage. IEEE Trans. Power Syst. 2017, 33, 951–961.
46. Chen, Y.; Xu, P.; Chu, Y.; Li, W.; Wu, Y.; Ni, L.; Bao, Y.; Wang, K. Short-term electrical load forecasting using the Support Vector Regression (SVR) model to calculate the demand response baseline for office buildings. Appl. Energy 2017, 195, 659–670.
47. Weng, Y.; Yu, J.; Rajagopal, R. Probabilistic baseline estimation based on load patterns for better residential customer rewards. Int. J. Electr. Power Energy Syst. 2018, 100, 508–516.
48. Tehrani, N.H.; Khan, U.T.; Crawford, C. Baseline load forecasting using a Bayesian approach. In Proceedings of the 2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE); IEEE: New York, NY, USA, 2016; pp. 1–4.
49. Li, K.; Wang, B.; Wang, Z.; Wang, F.; Mi, Z.; Zhen, Z. A baseline load estimation approach for residential customer based on load pattern clustering. Energy Procedia 2017, 142, 2042–2049.
50. Sun, M.; Wang, Y.; Teng, F.; Ye, Y.; Strbac, G.; Kang, C. Clustering-based residential baseline estimation: A probabilistic perspective. IEEE Trans. Smart Grid 2019, 10, 6014–6028.
51. Bapin, Y.; Zarikas, V. Probabilistic estimation of spinning reserves in smart grids with Bayesian-driven reserve allocation adjustment algorithm. Int. J. Energy Sect. Manag. 2021, 15, 433–455.
52. Klem, A.; Stephen, G. Capacity valuation of demand response in the presence of variable generation through Monte Carlo analysis. In Proceedings of the 2019 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT); IEEE: New York, NY, USA, 2019; pp. 1–5.
53. Jost, D.; Speckmann, M.; Sandau, F.; Schwinn, R. A new method for day-ahead sizing of control reserve in Germany under a 100% renewable energy sources scenario. Electr. Power Syst. Res. 2015, 119, 485–491.
54. Wang, S.; Sun, Y.; Zhang, S.; Zhou, Y.; Hou, D.; Wang, J. Very short-term probabilistic prediction of PV based on multi-period error distribution. Electr. Power Syst. Res. 2023, 214, 108817.
55. He, Y.; Yu, N.; Wang, B. Online probability density prediction of wind power considering virtual and real concept drift detection. Appl. Energy 2025, 396, 126318.
56. Gassar, A.A.A. Short-term energy forecasting to improve the estimation of demand response baselines in residential neighborhoods: Deep learning vs. machine learning. Buildings 2024, 14, 2242.
57. Xiao, J.W.; Liu, P.; Fang, H.; Liu, X.K.; Wang, Y.W. Short-term residential load forecasting with baseline-refinement profiles and bi-attention mechanism. IEEE Trans. Smart Grid 2023, 15, 1052–1062.
58. Ardiansyah; Kim, Y.; Choi, D. Lstm-based multi-step soc forecasting of battery energy storage in grid ancillary services. In Proceedings of the 2021 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm); IEEE: New York, NY, USA, 2021; pp. 276–281.
59. Khosravi, N.; Dowlatabadi, M.; Sabzevari, K. A hierarchical deep learning approach to optimizing voltage and frequency control in networked microgrid systems. Appl. Energy 2025, 377, 124313.
60. Yao, X.; Fu, X.; Zong, C. Short-term load forecasting method based on feature preference strategy and LightGBM-XGboost. IEEE Access 2022, 10, 75257–75268.
61. Jo, Y.; Hur, J. An improved ramp events forecasting of wind generating resources using ensemble learning of numerical weather prediction: The case of Jeju Island’s wind farms. Therm. Sci. Eng. Prog. 2025, 66, 103936.
62. Hao, Z.; Liu, S.; Zhang, Y.; Ying, C.; Feng, Y.; Su, H.; Zhu, J. Physics-informed machine learning: A survey on problems, methods and applications. arXiv 2022, arXiv:2211.08064.
63. Huang, X.; Li, Q.; Tai, Y.; Chen, Z.; Liu, J.; Shi, J.; Liu, W. Time series forecasting for hourly photovoltaic power using conditional generative adversarial network and Bi-LSTM. Energy 2022, 246, 123403.
64. Harikrishnan, G.; Sreedharan, S. Advanced short-term load forecasting for residential demand response: An XGBoost-ANN ensemble approach. Electr. Power Syst. Res. 2025, 242, 111476.
65. Wu, W.; Wang, Z.; Li, X.; Guo, L.; Liu, Y.; Zhai, J.; Wang, C. Physics-informed probability distribution assessment for primary frequency regulation capability of wind farms considering wind speed uncertainty. J. Mod. Power Syst. Clean Energy 2025, 14, 541–551.
66. Bao, Y.Q.; Shen, C.; Wang, Q.; Zhang, J.L. Demand response based on Kalman-filtering for the frequency control. J. Electr. Eng. Technol. 2019, 14, 1087–1094.
67. Panda, S.K. Electrical load and solar power forecasting using machine learning techniques. J. King Saud Univ.–Eng. Sci. 2025, 37, 1–14.
68. Wen, Q.; Liu, Y. Feature engineering and selection for prosumer electricity consumption and production forecasting: A comprehensive framework. Appl. Energy 2025, 381, 125176.
69. Koltsaklis, N.; Panapakidis, I.P.; Pozo, D.; Christoforidis, G.C. A prosumer model based on smart home energy management and forecasting techniques. Energies 2021, 14, 1724.
70. AlHammadi, K.A. Analysis of Energy Storage Technologies in the United Arab Emirates: Current State and Future Needs. Ph.D. Thesis, Khalifa University of Science, Abu Dhabi, United Arab Emirates, 2024.
71. Ellman, D.; Xiao, Y. Incentives to manipulate demand response baselines with uncertain event schedules. IEEE Trans. Smart Grid 2020, 12, 1358–1369.
72. Zhen, C.; Niu, J.; Tian, Z.; Lu, Y.; Liang, C. Risk-averse transactions optimization strategy for building users participating in incentive-based demand response programs. Appl. Energy 2025, 380, 125009.
73. Vuelvas, J.; Ruiz, F.; Gruosso, G. Limiting gaming opportunities on incentive-based demand response programs. Appl. Energy 2018, 225, 668–681.
74. Wang, X.; Tang, W. A self-reported baseline demand response program for mitigation of baseline manipulation. In Proceedings of the 2021 IEEE Power & Energy Society General Meeting (PESGM); IEEE: New York, NY, USA, 2021; pp. 1–5.
75. Muthirayan, D.; Kalathil, D.; Poolla, K.; Varaiya, P. Mechanism design for demand response programs. IEEE Trans. Smart Grid 2019, 11, 61–73.
76. Wijaya, T.K.; Vasirani, M.; Aberer, K. When bias matters: An economic assessment of demand response baselines for residential customers. IEEE Trans. Smart Grid 2014, 5, 1755–1763.
77. Dobakhshari, D.G.; Gupta, V. A contract design approach for phantom demand response. IEEE Trans. Autom. Control 2018, 64, 1974–1988.
78. Dehghanpour, K.; Nehrir, H. An agent-based hierarchical bargaining framework for power management of multiple cooperative microgrids. IEEE Trans. Smart Grid 2017, 10, 514–522.
79. Yang, Y.; Zhao, Y.; Yan, G.; Mu, G.; Chen, Z. Real time aggregation control of P2H loads in a virtual power plant based on a multi-period Stackelberg game. Energy 2024, 303, 131484.
80. Samadi, M.; Kebriaei, H.; Schriemer, H.; Erol-Kantarci, M. Stochastic demand response management using mixed-strategy Stackelberg game. IEEE Syst. J. 2022, 16, 4708–4718.
81. Cui, F.; An, D.; Zhang, G. A game strategy for demand response based on load monitoring in smart grid. Front. Energy Res. 2023, 11, 1240542.
82. Nekouei, E.; Alpcan, T.; Chattopadhyay, D. Game-theoretic frameworks for demand response in electricity markets. IEEE Trans. Smart Grid 2014, 6, 748–758.
83. Stanelyte, D.; Radziukyniene, N.; Radziukynas, V. Overview of demand-response services: A review. Energies 2022, 15, 1659.
84. Aghamohamadi, M.; Hajiabadi, M.E.; Samadi, M. A novel approach to multi energy system operation in response to DR programs; an application to incentive-based and time-based schemes. Energy 2018, 156, 534–547.
85. Ma, K.; Kumar, P. Incentive compatibility in stochastic dynamic systems. IEEE Trans. Autom. Control 2020, 66, 651–666.
86. Li, Z.; Hu, P.; Li, S.; Zhao, W.; Wang, X. Research on Capacity Market Settlement Method Based on VCG Mechanism. In Proceedings of the 2024 3rd Asian Conference on Frontiers of Power and Energy (ACFPE); IEEE: New York, NY, USA, 2024; pp. 200–204.
87. Abedrabboh, K.; Al-Fagih, L. Applications of mechanism design in market-based demand-side management: A review. Renew. Sustain. Energy Rev. 2023, 171, 113016.
88. Wang, T.; Xu, Y.; Withanage, C.; Lan, L.; Ahipaşaoğlu, S.D.; Courcoubetis, C.A. A fair and budget-balanced incentive mechanism for energy management in buildings. IEEE Trans. Smart Grid 2016, 9, 3143–3153.
89. Yang, F.; Feng, J.; Li, D.; Zhang, B. Incentive-compatibility auction and renewable energy pricing strategy based on incomplete Information game of price cost declaration. Energy Sources Part B Econ. Plan. Policy 2023, 18, 2280880.
90. Koukaras, P.; Afentoulis, K.D.; Gkaidatzis, P.A.; Mystakidis, A.; Ioannidis, D.; Vagropoulos, S.I.; Tjortjis, C. Integrating blockchain in smart grids for enhanced demand response: Challenges, strategies, and future directions. Energies 2024, 17, 1007.
91. Li, B.; Banimenia, I.; Chuan, L.; Zhansheng, H.; Zhao, J. Incentive-based demand response program with self-reported baseline supported by blockchain technology. IET Smart Grid 2023, 6, 205–218.
92. Xi, L.; Wang, C.; Zheng, T.; Zhang, K. Baseline Load Estimation for Demand Response Based on Blockchain and Neural Networks. In Proceedings of the 2023 IEEE International Conference on Mechatronics and Automation (ICMA); IEEE: New York, NY, USA, 2023; pp. 2259–2264.
93. Junaidi, N.; Abdullah, M.P.; Alharbi, B.; Shaaban, M. Blockchain-based management of demand response in electric energy grids: A systematic review. Energy Rep. 2023, 9, 5075–5100.
94. Sarker, M.A.A.; Shanmugam, B.; Azam, S.; Thennadil, S. Enhancing smart grid load forecasting: An attention-based deep learning model integrated with federated learning and XAI for security and interpretability. Intell. Syst. Appl. 2024, 23, 200422.
95. Wang, R.; Qiu, H.; Gao, H.; Li, C.; Dong, Z.Y.; Liu, J. Adaptive horizontal federated learning-based demand response baseline load estimation. IEEE Trans. Smart Grid 2023, 15, 1659–1669.
96. Chen, Y.; Chen, C.; Zhang, X.; Cui, M.; Li, F.; Wang, X.; Yin, S. Privacy-preserving baseline load reconstruction for residential demand response considering distributed energy resources. IEEE Trans. Ind. Inform. 2021, 18, 3541–3550.
97. Shi, Y.; Li, Y. Demand Response Flexibility Prediction of Residents Based on Federated Learning. In Proceedings of the 2022 China International Conference on Electricity Distribution (CICED); IEEE: New York, NY, USA, 2022; pp. 70–74.
98. Cheng, H.; Lu, T.; Hao, R.; Li, J.; Ai, Q. Incentive-based demand response optimization method based on federated learning with a focus on user privacy protection. Appl. Energy 2024, 358, 122570.
99. Danish, S.M.; Hameed, A.; Ranjha, A.; Srivastava, G.; Zhang, K. Block-FeDL: Electric vehicle charging load forecasting using federated learning and blockchain. IEEE Trans. Veh. Technol. 2024, 74, 2048–2056.
100. Srivastava, A.; Zhao, J.; Zhu, H.; Ding, F.; Lei, S.; Zografopoulos, I.; Haider, R.; Vahedi, S.; Wang, W.; Valverde, G.; et al. Distribution system behind-the-meter ders: Estimation, uncertainty quantification, and control. IEEE Trans. Power Syst. 2024, 40, 1060–1077.
101. Li, H.; Wang, Z.; Hong, T.; Piette, M.A. Energy flexibility of residential buildings: A systematic review of characterization and quantification methods and applications. Adv. Appl. Energy 2021, 3, 100054.
102. Tian, J.; Gao, Y.; Wang, X.; Chen, Y. Behind-the-Meter PV Power Disaggregation via Ensemble Machine Learning Methods. In Proceedings of the 2024 9th Asia Conference on Power and Electrical Engineering (ACPEE); IEEE: New York, NY, USA, 2024; pp. 1170–1174.
103. Qu, Z.; Ge, X.; Lu, J.; Wang, F. Unsupervised disaggregation of aggregated net load considering behind-the-meter PV based on virtual PV sample construction. Appl. Energy 2025, 381, 125007.
104. Kamoona, A.; Song, H.; Jalili, M.; Wang, H.; Razzaghi, R.; Yu, X. Online electric vehicle charging detection based on memory-based transformer using smart meter data. Appl. Energy 2025, 398, 126353.
105. Zang, X.; Li, H.; Wang, S. Levelized cost quantification of energy flexibility in high-density cities and evaluation of demand-side technologies for providing grid services. Renew. Sustain. Energy Rev. 2025, 211, 115290.
106. Gopal, J.N.; Madhu, B.; Somu, S.; Anand, A.J. Explainable AI (XAI) for Energy Demand Forecasting. In Neural Networks and Graph Models for Traffic and Energy Systems; IGI Global Scientific Publishing: Hershey, PA, USA, 2025; pp. 115–138.
107. Nwakanma, C.I.; Ahakonye, L.A.C.; Njoku, J.N.; Gyawali, P.; Srivastava, A.K. Explainable AI for interpretable and model-agnostic energy consumption prediction. In Proceedings of the 2025 IEEE Industry Applications Society Annual Meeting (IAS); IEEE: New York, NY, USA, 2025; pp. 1–6.
108. Peivand, A.; Farsani, E.A.; Abdolmohammadi, H.R. Accelerating optimal scheduling prediction in power system: A multi-faceted GAN-assisted prediction framework. Renew. Energy 2024, 230, 120830.
109. Li, G.; Wu, Y.; Yan, C.; Fang, X.; Li, T.; Gao, J.; Xu, C.; Wang, Z. An improved transfer learning strategy for short-term cross-building energy prediction using data incremental. In Proceedings of the Building Simulation; Springer: Berlin/Heidelberg, Germany, 2024; Volume 17, pp. 165–183.
110. Tian, C.; Liu, Y.; Zhang, G.; Yang, Y.; Yan, Y.; Li, C. Transfer learning based hybrid model for power demand prediction of large-scale electric vehicles. Energy 2024, 300, 131461.
111. Gao, Y.; Hu, Z.; Shi, S.; Chen, W.A.; Liu, M. Adversarial discriminative domain adaptation for solar radiation prediction: A cross-regional study for zero-label transfer learning in Japan. Appl. Energy 2024, 359, 122685.

Figure 1. Applications of load forecasting across time horizons.

Figure 2. Conceptual structure of load baseline modeling.

Figure 3. Day-matching approach for baseline load estimation. Each colored line represents a different day.

Figure 4. Frequency control using Kalman filter.

Figure 5. Application of market-based DR incentives.

Figure 6. Incorporation of blockchain technology in DR markets.

Figure 7. Implementation of federated learning in DR markets.

Figure 8. Invisible behind-the-meter consumption behaviors.

Figure 9. Overlapping consumers in multi-service participation.

Table 1. Mapping between DR service classes and baseline, metering, and verification requirements.

Service Class	Typical Programs	Settlement Basis	Baseline and Measurement Requirements (Practical Implications)	Key Refs.
Grid-economy-oriented DR	Demand bidding, interruptible load, emergency response, direct load control	Energy deviation (kWh) relative to baseline during events	Sensitive to baseline bias and event-day abnormality. Needs transparent baseline rules. Needs bias control. Needs explicit validation. Ensures genuine flexibility is rewarded.	[19,20,22]
Reputation and performance rating (within economy-oriented DR)	Credit scoring, performance rating, priority dispatch	Long-run score derived from repeated events; affects access, price, and selection	Needs detection of baseline manipulation. Needs ex post performance validation. Uses repeated participation data. Ensures auditable scoring.	[23,24,25,26]
Self-reported baseline declaration (within economy-oriented DR)	Customer declares expected consumption; deviations settled	Declared baseline as settlement reference; penalties for strategic misreporting	Needs penalty clauses. Needs auditing. Needs historical benchmarking. Preserves fairness and credibility.	[27,28]
Grid-reliability-oriented DR	Contingency reserve, load following, frequency regulation	Capacity reservation (kW) and performance obligations; penalties for non-delivery	Baseline is secondary to availability. Needs qualification tests. Needs availability constraints. Needs verification procedures. Ensures contracted flexibility can be delivered when called.	[29,30]
Fast balancing and regulation services	Frequency regulation and real-time balancing	Mileage-based performance plus capacity components	Needs high time-resolution metering. Needs strict tracking metrics. Rewards fast and accurate response. Focuses on signal tracking and responsiveness. Not limited to event energy alone.	[3,31]
Local flexibility markets	Distribution-level congestion management and local services	Service-specific settlement; baseline disputes can dominate	Baselines may be unsuitable in some local settings. Baseline-light alternatives may be preferable. Can reduce fairness disputes. Can reduce settlement friction where applicable.	[16]
Multi-service and multi-aggregator participation	Storage and distributed energy resources (DERs) portfolios participating across services and aggregators	Mixed settlement across services; risk of double compensation	Requires coordinated accounting and settlement rules. Prevents double counting. Preserves stakeholder incentives. Requires strong governance and auditable coordination.	[5,32,33]
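The energy-deviation settlement basis in Table 1 can be made concrete with a small sketch showing why baseline bias matters. The function name, the half-hour interval, and all numeric values below are hypothetical; the sketch only assumes that the settled quantity is the baseline minus the metered load, summed over the event.

```python
def event_energy_reduction_kwh(baseline_kw, metered_kw, interval_hours):
    """Energy deviation (kWh) relative to the baseline during an event."""
    if len(baseline_kw) != len(metered_kw):
        raise ValueError("baseline and metered series must have equal length")
    return sum((b - m) * interval_hours for b, m in zip(baseline_kw, metered_kw))


# Hypothetical event of four half-hour intervals (values in kW).
baseline_kw = [10.0, 10.0, 10.0, 10.0]
metered_kw = [7.0, 6.0, 6.0, 7.0]

settled = event_energy_reduction_kwh(baseline_kw, metered_kw, 0.5)

# The same metered load settled against a baseline biased upward by 1 kW.
biased_baseline_kw = [b + 1.0 for b in baseline_kw]
settled_biased = event_energy_reduction_kwh(biased_baseline_kw, metered_kw, 0.5)

print(settled)                   # 7.0 kWh
print(settled_biased)            # 9.0 kWh
print(settled_biased - settled)  # 2.0 kWh credited without any extra response
```

In this toy case a 1 kW upward bias adds 2 kWh of credited reduction with no change in customer behavior, which is the overpayment risk the table associates with energy-based settlement.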

Table 2. Comparison of baseline estimation techniques for DR from operational and market-settlement perspectives.

Estimation Technique	Ease of Implementation	Estimation Reliability	Settlement Bias Risk	DER/EV Observability	Non-Stationarity Robustness	Applicable DR Schemes	Typical End-Users
Rule-based technique	Very High	Limited	High	Low	Low	Load curtailment programs	Small and commercial consumers
Reference group technique	High	Moderate	Medium	Medium	Medium	Market-based bidding	Commercial and light industrial users
Day matching	High	Moderate	Medium to High	Low	Low	Interruptible load contracts	Residential participants
Statistical curve-fitting	Moderate	Moderate	Medium	Low to Medium	Medium	Price-responsive DR	Residential customers
Probabilistic approaches	Moderate	Moderate	Medium	Medium	Medium	Capacity reserve, demand bidding	Prosumers, energy storage systems
ML algorithms	Low	High	Medium	Medium to High	Medium	Load following, frequency support	Industrial loads, aggregators
Hybrid and physics-informed techniques	Low	High	Low to Medium	High	High	Multi-service DR (ancillary and energy)	Prosumers, storage, industrial users

Table 3. Quantitative evaluation of baseline load estimation approaches.

Estimation Category	Typical Error (MAPE)	Data Requirements	Computational Scalability and Complexity
Rule-Based and Statistical	High (10–20%+)	Low: Requires only historical load data and basic calendar mapping.	Very High: Uses simple arithmetic operations and scales to millions of users.
Regression-Based	Moderate (8–15%)	Moderate: Requires historical load, weather data, and time-of-day features.	High: Fast training and inference; easy to deploy in cloud environments.
Probabilistic Approaches	Low to Moderate (5–12%)	High: Requires historical variance, distribution metrics, and stochastic variables.	Moderate: More computationally demanding due to simulation and parameter updating.
Machine Learning	Low (3–8%)	Very High: Requires large granular historical datasets and high-resolution metering.	Low to Moderate: High training time and possible need for GPUs.
Hybrid and Physics-Informed	Very Low (<5%)	Extensive: Requires multi-domain data, including physical parameters such as thermal inertia.	Low: Complex to tune and deploy; scalability is often limited to specific microgrids.
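Table 3 reports typical errors as mean absolute percentage error (MAPE). As a reminder of how that metric is computed, the following sketch evaluates MAPE for a baseline estimate against observed load on non-event intervals. The function name and the sample values are illustrative assumptions; skipping zero-load intervals is one common convention among several.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, in percent.

    Intervals with zero actual load are skipped because the percentage
    error is undefined there (an assumed convention for this sketch).
    """
    if len(actual) != len(estimated):
        raise ValueError("series must have equal length")
    pairs = [(a, e) for a, e in zip(actual, estimated) if a != 0]
    if not pairs:
        raise ValueError("no intervals with non-zero actual load")
    return 100.0 * sum(abs(a - e) / abs(a) for a, e in pairs) / len(pairs)


# Hypothetical observed load and baseline estimate (kW) on non-event intervals.
actual = [100.0, 200.0, 400.0]
estimated = [110.0, 180.0, 400.0]

error_pct = mape(actual, estimated)
print(round(error_pct, 2))  # 6.67
```

Because the true baseline is unobservable during events, a metric of this kind can only be computed on non-event periods or on synthetic events.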

Table 4. Incentive mechanisms in DR markets: baseline dependence, main vulnerabilities, and mitigation approaches.

Mechanism	Baseline Dependence	Typical Vulnerabilities	Recommended Mitigation and Verification	Key Refs.
Energy-based compensation	High	Overpayment or underpayment due to systematic bias. Reduced market confidence if reductions are not verifiable.	Add validation and verification procedures. Monitor bias. Enforce transparent baseline rules. Ensure reductions reflect genuine flexibility.	[22]
Credit scoring and performance rating	Medium	Baseline manipulation to improve apparent delivery. Scoring instability when detection is weak.	Use manipulation detection. Apply repeated-event scoring. Maintain auditable score and eligibility updates.	[24,25]
Self-reported baseline declaration	High	Intentional baseline inflation. Information asymmetry between customer and operator.	Use penalty clauses. Audit declared baselines. Benchmark against historical data. Enforce settlement rules for deviations and non-dispatched overuse.	[74]
Capacity-based compensation	Low to medium	Inflated capability claims. Non-delivery when called. Over-procurement risk.	Use qualification tests. Enforce availability obligations. Apply non-delivery penalties. Periodically verify technical readiness.	[86]
Mileage-based performance payments	Low to medium	Poor tracking quality hidden by coarse monitoring. Misalignment when performance metrics are weak.	Use high-resolution telemetry. Track response accuracy and speed. Settle based on performed regulation work.	[21]
Incentive-compatible contract design (auction and mechanism design)	Medium	High complexity. Privacy concerns. Budget-balance constraints in practice.	Use budget-balanced and privacy-aware variants. Simplify computation. Retain strategic robustness where feasible.	[84,85]
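For the mileage-based row in Table 4, a minimal sketch of the quantities involved may help. It assumes that mileage is the sum of absolute changes between consecutive samples and uses a simple mean absolute tracking error as the accuracy measure. Actual market rules define their own performance scores, so the function names, the error metric, and the sample signal here are illustrative only.

```python
def mileage(series):
    """Sum of absolute movements between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(series, series[1:]))


def mean_abs_tracking_error(signal, response):
    """Average absolute gap between the regulation signal and the response."""
    if not signal or len(signal) != len(response):
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(s - r) for s, r in zip(signal, response)) / len(signal)


# Hypothetical regulation signal and measured response (MW), equally spaced.
signal = [0.0, 2.0, -1.0, 1.0]
response = [0.0, 1.5, -1.0, 1.0]

print(mileage(signal))                            # 7.0
print(mileage(response))                          # 6.0
print(mean_abs_tracking_error(signal, response))  # 0.125
```

Coarser sampling would smooth out movements and understate both mileage and tracking error, which is why the table pairs this mechanism with high-resolution telemetry.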

Table 5. Open challenges for baseline load estimation and market settlement, with research directions.

Challenge	Why It Disrupts Baseline Estimation and Settlement	Research Directions and Practical Design Responses	Key Refs.
Invisible behind-the-meter DER behavior	Aggregators often see only net load. DER generation and flexible demand remain hidden. This increases uncertainty and gaming risk.	Net-load disaggregation. Context-aware modeling. Verification designs that reduce manipulation.	[101,102]
Overlapping participation across multiple services and aggregators	Concurrent participation creates attribution ambiguity. It may cause double compensation. It may also cause unpaid non-delivery.	Coordinated settlement rules. Cross-aggregator coordination. Auditable accounting infrastructures.	[4]
Lack of clean non-DR data for training and validation	Frequent DR participation reduces clean non-event data. Rare critical events further limit validation quality.	Synthetic data generation. Transfer learning. Validation protocols for synthetic data use.	[108,109]
Accuracy–interpretability trade-off	High-accuracy models may be hard to explain. This reduces trust. It also complicates settlement disputes.	Explainable AI. Physics-informed models. Balance accuracy, transparency, and credibility.	[106,107]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

