Uncertainty Cost Functions in Climate-Dependent Controllable Loads in Commercial Environments

This article presents the development, simulation and validation of uncertainty cost functions for a commercial building with climate-dependent controllable loads, located in Florida, USA. Statistical data on the energy consumption of the building in 2016 were used, together with a kernel density estimator, to characterize its probabilistic behavior. To validate the uncertainty cost functions, the Monte-Carlo simulation method was used to compare the analytical results with the simulated ones. The cost functions showed errors of less than 1% with respect to the Monte-Carlo simulation method. This provides an analytical approach to the uncertainty costs of the building that can be used in the development of optimal energy dispatches, as well as a complementary method for the probabilistic characterization of the stochastic behavior of agents in the electricity sector.


Introduction
The economic dispatch of electrical energy is an activity of great relevance within the sector. It seeks to fulfill the energy, technical and economic expectations of the different agents that participate in the production, transport and consumption of energy. For this reason, an energy dispatch should be optimal: it should meet the objectives of the different agents while making efficient use of infrastructure resources and keeping the associated costs to a minimum.
Due to the above, several efforts have been made around the world to achieve an optimal energy dispatch. However, the development of these methodologies faces several challenges associated with the different variables that must be taken into account in the dispatch. One of them concerns the inclusion of agents that introduce randomness and uncertainty through their behavior. This represents a challenge, since this type of behavior is incompatible with optimal dispatch methodologies based on analytical and deterministic approaches.
The agents that introduce uncertainty into the energy system are mainly renewable energy generators, such as solar and wind, and controllable loads such as electric vehicles or environmental thermal control devices. On the generation side, the randomness comes from the renewable resources themselves, which depend on solar irradiation [1] and wind speed [2], respectively. On the load side, the uncertainty is due to ambient temperature conditions [3][4][5] and transportation needs in a given place [6][7][8][9][10][11][12]. These phenomena are non-deterministic, so it is not possible to directly obtain a certain value and, therefore, to define the exact behavior of the agents that depend on them. With the inclusion of these agents in the activities of the energy sector, it is necessary that they also be taken into account when carrying out the economic dispatch of energy, and, moreover, that methodologies be developed so that this dispatch is optimal.
A recent methodology that seeks the inclusion of agents with random behavior in the economic dispatch is the development of uncertainty cost functions. This methodology allows us to obtain analytical cost equations, starting from probability density functions that describe the stochastic nature of these agents. With these equations, it is possible to carry out analytical economic dispatches that include all types of agents and that seek the optimization of both objectives and resources in all the main activities of the energy sector. The main characteristics of the uncertainty cost functions are detailed below.
In the past, different works have sought to obtain uncertainty cost functions for both generating agents and demand agents, as can be seen in [10][11][12][13][14][15]. Most share the characteristic that, to obtain the analytical equations, predetermined probability density functions were assumed or approximated to characterize the random behaviors of the different agents under study in each work [16]. Although equations obtained in this way are very useful when taking stochasticity into account in the economic dispatch, in several cases they tend to skew certain behaviors of the agents they represent. This is because assuming or approximating random consumption or generation curves with standard probability distributions can eliminate characteristics of the analyzed curves [16][17][18][19][20].
Therefore, the need to find another way to characterize the consumption or generation curves of the agents that give stochasticity to the energy system has become evident, in order to obtain more precise representations of these agents in the development of economic dispatches. In the next sections, the state of the art, the methodological development, the results obtained and the conclusions are described, about an alternative method found to obtain more representative probability distributions of stochastic agents (controllable loads in the case of this work) and, as an ultimate goal, the uncertainty cost functions associated with its stochastic behavior.

Uncertainty Cost
The costs of uncertainty associated with the economic dispatch of energy refer to the economic sanctions or penalties that may occur, due to the underestimation or overestimation of the energy that needs to be delivered or received at a specific point in the electrical system [12]. These situations of underestimation or overestimation are generally presented by the inclusion of agents of random behavior in the electrical network. As mentioned above, this type of behavior does not allow obtaining predetermined values of energy to be delivered, as in the case of non-conventional generators, or of energy to receive, such as controllable loads. For this reason, to dispatch these agents, it is necessary to obtain an estimate of the energy they can generate or consume. In an economic dispatch with these conditions, cases of non-compliance in the committed energy can occur, both on the generation side and on the demand side, which in turn will lead to economic sanctions.
In their basic form, the costs of under- or overestimation can be represented by the following equations:

C_{e,u,i}(P_{e,i}, P_{e,s,i}) = C_{e,u,i} (P_{e,i} − P_{e,s,i})

C_{e,o,i}(P_{e,i}, P_{e,s,i}) = C_{e,o,i} (P_{e,s,i} − P_{e,i})

where C_{e,u,i} on the right-hand side is the cost coefficient for underestimating the energy to be consumed or delivered at node i; C_{e,o,i} is the cost coefficient for overestimating the energy to be consumed or delivered at node i; P_{e,i} is the real power delivered or demanded at node i; and P_{e,s,i} is the power programmed to be delivered or consumed at node i. As can be seen from both equations, the costs are derived from the difference between the programmed energy and the actual energy dispatched.
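As a minimal sketch of the two basic penalty equations above (the article itself works in Matlab; the coefficient and power values here are illustrative assumptions, not data from the study):

```python
# Basic under/over-estimation penalty costs for node i.
# Coefficient and power values are illustrative, not from the article.

def underestimation_cost(p_real, p_scheduled, c_under):
    """Penalty when more power is consumed/delivered than was scheduled."""
    return c_under * max(p_real - p_scheduled, 0.0)

def overestimation_cost(p_real, p_scheduled, c_over):
    """Penalty when less power is consumed/delivered than was scheduled."""
    return c_over * max(p_scheduled - p_real, 0.0)

# Example: 120 kW consumed against 100 kW scheduled -> underestimation.
print(underestimation_cost(120.0, 100.0, 2.5))  # 50.0
print(overestimation_cost(80.0, 100.0, 1.5))    # 30.0
```

Only one of the two costs is nonzero for a given deviation, which is why the expected-value analysis later splits the integration range at P_{e,s,i}.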
When applying these equations to the agents with stochastic behavior, the probability density functions that describe this behavior must be taken into account and that allow obtaining expected values of generated or consumed energy [10,12]. Therefore, it is necessary to use integrals of expected value in the cost equations for its overestimation or underestimation, having as the main variable the real energy demanded or delivered by the random agents. Considering this, the uncertainty cost equations for agents with stochastic behavior are presented below.

Uncertainty Costs for Overestimating
The expected uncertainty cost for overestimation is obtained by weighting the basic cost by the probability density of the real power:

E[C_{e,o,i}(P_{e,i}, P_{e,s,i})] = ∫_0^{P_{e,s,i}} C_{e,o,i} (P_{e,s,i} − P_{e,i}) f_{P_{e,i}}(P_{e,i}) dP_{e,i}

where f_{P_{e,i}}(P_{e,i}) is the probability density function used to describe the behavior of the stochastic agent and P_{e,∞} is the maximum scheduled demand that can be delivered to the controllable load by the system; P_{e,∞} appears as the upper limit of the analogous underestimation integral, taken from P_{e,s,i} to P_{e,∞}.

Agents of the Electricity Sector with Stochastic Behavior
As mentioned in Section 2.1, the uncertainty cost functions appear for agents that participate in the electric power dispatch process and that follow a random generation or consumption behavior, that is, one that cannot be established in a deterministic way. This group of agents includes generators whose main source of energy production is a renewable resource. Renewable energy resources, such as the wind or the sun, are randomly time-varying, so anything directly related to them will have this same uncertainty characteristic. The same applies to another type of agent, the controllable loads. Electric vehicles connected within a city's distribution network, or automatic environmental conditioning devices in residential or commercial places, are examples of controllable loads that follow random behaviors associated with phenomena such as the need for mobility of people or the setting of the comfort temperature in a climate-dependent location, respectively.
As shown above, to obtain the uncertainty cost functions of these agents, it is necessary to use the probability density functions of the phenomena underlying the agents' behavior. In [12], it is possible to observe the procedure to obtain the uncertainty cost functions for photovoltaic and wind power generators. For wind generation, the probability density of wind speed [12] was taken into account, which in [11,15] appears as a Rayleigh distribution. Regarding photovoltaic generation, the behavior of solar irradiance was taken into account, which in [12] is assumed to follow a log-normal distribution. The above holds for specific regions, as specified in [21]. The process then focused on developing expected-value integrals of the uncertainty costs taking into account these probability distributions, as detailed in Section 2.1.
Continuing with the characterization of non-conventional generation agents, in [13], you can see the determination of the uncertainty costs of run-of-river plants operating in a microgrid environment. As in [12], the behavior of these plants is subject to their source of generation, which in this case is the flow of the river from which they are supplied. In the literature [22][23][24], it is seen that an approximation can be made to the flow behavior through a Gumbel probability density. The remaining part of the process to characterize the uncertainty costs is followed, as in [12].
In the case of controllable loads, in [10], an approximation is made to their uncertainty costs through the analysis of integrals of expected value, taking into account probability distributions inherent to the behavior of controllable demand. The authors of [10][11][12][13] considered that the controllable loads can follow normal or beta distributions, so, in [10], the mathematical analysis is performed with these distributions. As a result, analytical expressions were obtained that can be integrated in the development of an optimal energy dispatch.

Climate-Dependent Controllable Loads
The work carried out in [10] considers the uncertainty cost functions for controllable loads, particularly those associated with electric vehicles. As a complementary effort, the work detailed in this report sought to achieve the uncertainty costs for climate-dependent controllable loads found in commercial environments. Therefore, we proceed to detail this type of loads.
Over the years, several studies have tried to establish a consistent relationship between energy consumption and the ambient temperature of the place where people live. As can be intuitively understood, this relationship is non-linear, due to the search for an ambient temperature that offers comfort in the inhabited place, through the use of ambient conditioning devices. If the temperature is below or above the optimal comfort temperature, electrical devices such as air conditioners or heaters will generally be used to restore it. This is reflected in the energy consumption, which has a minimum point corresponding to the optimal comfort temperature and maximum points at temperatures far from the optimum [16].
Continuing with the work developed (see, e.g., [16][17][18]), it can be seen that this non-linear relationship between ambient temperature and energy consumption can be represented by a mathematical expression when statistical data for the two variables are available. This is possible thanks to techniques such as linearization of the relationship through variables exogenous to the model, multilinear regression and the use of dummy variables to weight the impact of different phenomena on the relationship being analyzed. With this, it is possible to show the existence of controllable loads that are closely related to climate and temperature, such as environmental conditioning devices, which act to maintain an ambient temperature that is considered comfortable.
The main contribution of the proposed study is its applicability to the calculation of uncertainty cost functions (UCF) and to the determination, using kernels, of the probability functions required to calculate the UCF. In this way, the uncertainty cost functions of controllable loads were determined through the investigation of probability distributions and the analytical development of expected-value integrals, for their application in an efficient energy dispatch.

Methodology for Probability Density Estimation
This section presents information on how to approximate or obtain the probability density functions that represent the behavior of stochastic agents. These are necessary to be able to develop the uncertainty cost functions of the latter.

Parametric Probability Density Functions
As mentioned in previous sections, to find the uncertainty costs of a random agent, it is necessary to have a probability density function that best describes the agent's behavior. Reviewing the different works that have addressed the issue of uncertainty costs [10][11][12][13][14][15], it can be seen that most of them have used standard parametric probability density functions. These can be normal, beta, Rayleigh or Gumbel distributions, among others. They are known as parametric because, with one to three parameters per function, it is possible to define them in their entirety.
It is important to mention that, to define the type of density function that best suits an agent and its base parameters, it is necessary to have statistical information on its behavior. Once this information is available, it is possible to fit probability density functions to the consumption or generation curve of the agent in question, through software tools such as Matlab's distribution fitting application, among others. These tools display, in a single graph, the histogram of the agent's statistical data together with the different probability density curves fitted to it, with their parameters calculated from those data. This makes it possible to identify the density function that most closely approximates the agent's behavior. Figure 1 shows the behavior over one year of the controllable loads of the building studied in this article (the data come from a random building from over 1000 buildings over a three-year timeframe given by the American Society of Heating, Refrigerating and Air Conditioning Engineers (ASHRAE) in the competition "ASHRAE-Great Energy Predictor" [25]; additionally, Section 6 provides more information on these loads) and the different probability densities that were chosen to represent it.
As shown in Figure 1, the density functions that would best represent the agent's behavior are the normal, Gamma or inverse Gaussian functions (the candidate distributions are those used in [10][11][12][13][14][15]). It is also seen that functions such as the Weibull or Rayleigh would not be very useful in this case. Following the methodology developed in [10][11][12][13][14][15], one could use one of the densities closest to the agent to calculate its uncertainty costs. Although this would give a good approximation of the costs, on closer inspection of Figure 1 it can be seen that some sections of the histogram are oversized or not covered by the closest density functions. Uncertainty costs calculated with these functions will therefore omit certain agent-specific details associated with these missing parts of its behavior.

Nonparametric Probability Density Functions
As mentioned above, although it is possible to approximate the behavior of a stochastic agent through parametric probability density functions, this causes certain statistical characteristics of the agent to be omitted. To obtain analytical uncertainty cost functions that can be used in the development of efficient economic dispatches, it is necessary to use a methodology that yields probability densities closer to the random behavior of the agent.
A methodology used to estimate probability densities, given a set of statistical data, is the application of a kernel density estimator. This methodology consists of approximating a probability density curve to the behavior data of a stochastic agent, through the use of a function composed of a summation of kernel functions, which seek to represent each datum in the final density function. The composite function is known as the kernel density estimator. Here is an overview of the methodology mentioned above.

Kernel Density Estimator and Kernel Distribution
A kernel density estimator is a function that represents a special type of probability density known as a kernel distribution. This distribution is nonparametric, that is, it is not defined by a small, fixed set of parameters. The kernel distribution is used when parametric density functions cannot properly describe the behavior of the statistical data of a random variable, or when we want to avoid assumptions about their distribution [19].
If we have a series of n observations of an independent and identically distributed (iid) random variable, of the form {x_1, x_2, ..., x_n}, whose probability density f(x) is unknown, a kernel density estimator f̂(x) can be used, which approximates f(x) by a sum of kernel functions K(x_i, t) assigned to each observation of the series. The above is summarized in the following expression:

f̂(t) = (1/n) ∑_{i=1}^{n} K(x_i, t)

In more detail, the estimator f̂ consists of an average of kernel functions K(x_i, t), each of which has as its main, individual parameter one observation of the series [20]. It is worth mentioning that the kernel function is symmetric and generally smooth, with an arbitrary parameter h of its own, known as the bandwidth, which controls the amount of smoothness desired in the final estimated probability density curve [26]. The expression for K(x_i, t) is shown below:

K(x_i, t) = (1/h) K_kernel((t − x_i)/h)

From this expression, it should be noted that K_kernel is a parametric probability density function that can be arbitrarily selected, as long as it meets the following characteristics:

K_kernel(x) ≥ 0 for all x,  and  ∫_{−∞}^{+∞} K_kernel(x) dx = 1

Reviewing these characteristics, especially the second one, it can be seen that K_kernel is itself a probability density function. It should be clarified that, in its most common practical applications, the estimator uses a symmetric K_kernel(x); however, asymmetric functions have recently been used as kernels for the recreation of probability density functions [20]. Qualitative characteristics of the use of one or the other are detailed below.
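The estimator above can be sketched directly in a few lines. This illustrative Python implementation (the article's own tooling is Matlab) uses a normal kernel and the toy dataset from Figure 2; the bandwidth value is an assumption for the example:

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as K_kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, h):
    """Kernel density estimate: f_hat(x) = (1/(n*h)) * sum K((x - x_i)/h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)

data = [0.1, 0.15, 0.2, 0.5, 0.7, 0.8]   # the example points from Figure 2
h = 0.1                                   # illustrative bandwidth

# Sanity check: the estimate should integrate to ~1 (trapezoidal rule).
grid = [i * 0.01 for i in range(-100, 200)]
vals = [kde(x, data, h) for x in grid]
area = sum(0.5 * (vals[i] + vals[i + 1]) * 0.01 for i in range(len(vals) - 1))
print(round(area, 3))
```

Because each kernel is itself a density and the estimator is an average of n of them, the resulting curve is automatically a valid probability density.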

How the Density Estimator Works
When there is a series of observations of a random variable, of the form {x_1, x_2, ..., x_n}, the density estimator takes each datum x_i as the main parameter of a small density function, graphically creating "bumps" with the shape of the K_kernel(x) function at each x_i. All these functions are then averaged and weighted to obtain the final density function of the set of observations [26]. A graphical example of the operation of the estimator, presented in [26], is shown next. Figure 2 shows a dataset located at the points x_i = {0.1, 0.15, 0.2, 0.5, 0.7, 0.8}, represented by the black bars. Each of these points is assigned its own density function (K_kernel(x)), which in this case follows the behavior of a normal distribution (purple curves). The sum of these per-point density functions leads to the estimated density curve for the dataset, represented by the brown curve.

Bandwidth h and Kernel Functions
As can be seen from Equation (6), the density estimator has two customization elements that can be adjusted as required in different applications. These elements are the bandwidth h and the kernel function K kernel (x).
In the configuration of a probability density estimator, the choice of the bandwidth parameter h plays an important role. This parameter is known by this name because it determines how much information of the base data series is reflected in the density function obtained from the estimator. It can be interpreted as a noise modulator in the final density curve [20].
A very small value of h can lead to a density curve that shows spurious, insignificant details. On the other hand, a very large value of h can oversmooth the final curve, which may obscure important characteristics contained in the data [20]. Figure 3 shows an example of the effect of choosing h. Regarding the kernel function K_kernel(x), it is worth mentioning that symmetric functions are the most used in common practical applications; however, there has recently been an increase in the use of asymmetric functions [20]. The main difference between them lies in the shape the kernel function takes depending on the location of each data point in the base series. As shown in Figure 2, symmetric functions, e.g., the normal distribution, have shapes that do not vary with location. Asymmetric functions, on the other hand, show different shapes at different points of the data series.
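The effect of the bandwidth choice can be seen numerically with the same toy data as in Figure 2 (this is an illustrative Python sketch; the h values are assumptions). With a very small h, the density keeps a bump at every point and collapses to near zero in the gap between the two data clusters; with a very large h, the gap is smoothed away:

```python
import math

def kde(x, data, h):
    """Gaussian-kernel density estimate at x."""
    n = len(data)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2)
               / (h * math.sqrt(2.0 * math.pi)) for xi in data) / n

data = [0.1, 0.15, 0.2, 0.5, 0.7, 0.8]

# x = 0.15 sits inside the left cluster; x = 0.35 sits in the gap.
for h in (0.02, 0.1, 0.5):
    print(h, round(kde(0.15, data, h), 2), round(kde(0.35, data, h), 2))
```

For h = 0.02 the gap density is essentially zero (undersmoothing), while for h = 0.5 the two clusters merge into a single broad mode (oversmoothing), illustrating why h must be tuned to the data.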
The most common symmetric kernel functions are:

Uniform: K(u) = 1/2 for |u| ≤ 1

Triangular: K(u) = 1 − |u| for |u| ≤ 1

Epanechnikov: K(u) = (3/4)(1 − u²) for |u| ≤ 1

Normal (Gaussian): K(u) = (1/√(2π)) e^{−u²/2}

Apart from these functions, there are other symmetric and asymmetric functions that can be used as kernels of the density estimator; Weglarczyk [20] presented several of them.

Methodology for Cost of Uncertainty from Underestimating Electricity Demand
Looking at the underestimation uncertainty cost expression, it can be seen that it is quite similar to the overestimation cost expression, except for the order of the terms P_{e,i} and P_{e,s,i} and the integration limits:

E[C_{e,u,i}(P_{e,i}, P_{e,s,i})] = ∫_{P_{e,s,i}}^{P_{e,∞}} C_{e,u,i} (P_{e,i} − P_{e,s,i}) f_{P_{e,i}}(P_{e,i}) dP_{e,i}

Therefore, the methodology for obtaining the underestimation cost function is the same as that used for overestimation. Again, the density estimator with normal kernel function detailed in Section 6.2 is used:

f_{P_{e,i}}(P_{e,i}) = (1/n) ∑_{j=1}^{n} (1/(h√(2π))) e^{−(P_{e,i} − x_j)²/(2h²)}

With this, the process of determining the underestimation cost function is as follows. The density is substituted into the expected-value integral and the integral is separated over the addends. The resulting indefinite integrals are then calculated; solving them through the error function identities of Appendix A, the expected value of the cost function is obtained. Finally, taking into account that all the terms of the expression carry the sum ∑_{j=1}^{n}, they can be combined to obtain the uncertainty cost function for underestimation.

Uncertainty Cost for Overestimating Electricity Demand
As mentioned in Section 2.1, the analytical cost function for overestimating the electricity demand can be calculated by means of the following equation:

E[C_{e,o,i}(P_{e,i}, P_{e,s,i})] = ∫_0^{P_{e,s,i}} C_{e,o,i} (P_{e,s,i} − P_{e,i}) f_{P_{e,i}}(P_{e,i}) dP_{e,i}

With the above, we proceed to calculate the expected value of the overestimation cost. First, the integral is separated over the addends of the expression. Next, the indefinite integrals are calculated and solved through the error function identities of Appendix A, after which a reorganization of terms is carried out. Finally, taking into account that all the terms of the expression carry the sum ∑_{j=1}^{n}, they can be combined to obtain the uncertainty cost function for overestimation. Please note that the analytical uncertainty cost functions do not depend on P_{e,i}, since in the mathematical development P_{e,i} is integrated out against f_{P_{e,i}}(P_{e,i}), the probability density function used to describe its stochastic behavior.
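The two expected-value integrals above can be checked numerically. This illustrative Python sketch evaluates both costs by quadrature against a Gaussian-kernel density; the data samples, bandwidth and cost coefficients are assumptions for the example, not the article's Building 944 values:

```python
import math

def kde(p, data, h):
    """Gaussian-kernel density estimate of the demand at power p."""
    n = len(data)
    return sum(math.exp(-0.5 * ((p - xj) / h) ** 2)
               / (h * math.sqrt(2.0 * math.pi)) for xj in data) / n

def expected_costs(data, h, p_sched, p_max, c_under, c_over, steps=20000):
    """Trapezoidal approximation of the two expected-value integrals."""
    def trap(f, a, b):
        dx = (b - a) / steps
        s = 0.5 * (f(a) + f(b)) + sum(f(a + i * dx) for i in range(1, steps))
        return s * dx
    under = trap(lambda p: c_under * (p - p_sched) * kde(p, data, h),
                 p_sched, p_max)          # integral from P_s to P_inf
    over = trap(lambda p: c_over * (p_sched - p) * kde(p, data, h),
                0.0, p_sched)             # integral from 0 to P_s
    return under, over

data = [90, 95, 100, 105, 110, 120]       # assumed consumption samples (kWh)
u, o = expected_costs(data, h=3.0, p_sched=100.0, p_max=200.0,
                      c_under=2.0, c_over=1.0)
print(round(u, 2), round(o, 2))
```

Because the density is a sum of Gaussians, each integral splits into per-sample terms solvable in closed form with error functions, which is exactly the role of the Appendix A identities; the quadrature here just reproduces those values numerically.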
In this way, the main contributions of the proposed methodology are:
• Probability distributions able to handle the behavior of controllable loads in a residential, industrial or commercial context are found, by means of the applications proposed in the literature review.
• The characteristic equations of the uncertainty costs of controllable loads are obtained by means of the analytical development of expected-value integrals.
• An efficient energy dispatch that considers optimization of the uncertainty costs of controllable loads is performed.

Simulation and Validation of Uncertainty Cost Functions
The validation of the cost functions (12) and (14), obtained in the previous section, was carried out with the methodology used in [10], with the exception of obtaining the probability density function, which here is the result of applying the density estimator to the series of statistical data of Building 944, as mentioned above.

Matlab Kernel Density Estimator
The numerical simulation software Matlab has an application with its own interface for the analysis of statistical data through the fitting of probability curves. This application is known as the Distribution Fitter, and it is capable of fitting parametric or nonparametric density curves to the statistical dataset supplied as input. Information about the use of this tool can be found in [27].
Using the Distribution Fitter, the probability density function based on the statistical data of Building 944 was obtained, using the density estimator with a normal kernel function. Figure 4 shows the density curve obtained through the estimator and its comparison with parametric density functions. As can be seen, the estimator curve (red) fits the data of Building 944 more faithfully than the other functions, even the normal and gamma functions. This makes it the best option to represent the building's probability density function.
In addition, it should be mentioned that the Distribution Fitter automatically provided the value of the bandwidth h, which was 3.08079. As mentioned above, this value can be selected arbitrarily, so that a more or less smooth curve is obtained, as required. With this value of h, together with the dataset for Building 944, it is possible to simulate the cost functions (12) and (14).

Monte-Carlo Simulation
Similar to the methodology used in [10], to validate the uncertainty cost functions found, simulations were carried out with the Monte-Carlo simulation method. With this, a comparison could be made between the costs obtained from the analytical functions and those obtained from the simulations.
We next describe the validation algorithm. In a first stage, the random obtaining of the energy consumption values with nonparametric probability was carried out using the Monte-Carlo simulations. Then, we proceeded to determine if the consumption values fall into the categories of overestimation or underestimation. In case they did not enter either of the two categories, we went back to the first stage. Then, the uncertainty cost vector was formed, from which its mean value was obtained. In the final stage, this mean value was compared with the analytical calculation from the cost functions found, and the percentage error was found.
It is worth mentioning that Matlab's Distribution Fitter application offers the possibility of exporting the data of its density curve fits to the workspace, with which the characteristics of the nonparametric probability density function of Building 944 could be included in the simulation code.
In this way, the main steps used to perform the Monte-Carlo simulation are as follows:
1. A demand value is established that represents the power programmed by a network operator, normally from an economic dispatch model.
2. A Monte-Carlo scenario is generated through a random value drawn according to the probability distribution (Section 2).
3. Given the value of the previous point, the value of the available demand is determined.
4. In this Monte-Carlo scenario, the cost is evaluated: if the demand is underestimated, the corresponding equation is used; similarly, if it is overestimated, the corresponding equation is used.
5. Steps 2-4 are repeated several times.
6. A cost histogram is obtained in which all Monte-Carlo scenarios are considered.
7. The expected value of the total accumulated cost is calculated; this quantity is the expected value of the uncertainty cost function, and it is compared with the analytical value.
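The steps above can be sketched as follows. Sampling from a Gaussian-kernel density amounts to picking one observation at random and adding normal noise with standard deviation h. This is an illustrative Python sketch, with the same assumed data and coefficients as before rather than the Building 944 values:

```python
import random

random.seed(1)
data = [90, 95, 100, 105, 110, 120]   # assumed consumption samples (kWh)
h = 3.0                                # bandwidth of the kernel density
p_sched = 100.0                        # step 1: scheduled demand
c_under, c_over = 2.0, 1.0             # assumed penalty coefficients

costs = []
for _ in range(200_000):               # steps 2-5: repeated scenarios
    # Steps 2-3: draw a realized demand from the kernel density.
    p_real = random.choice(data) + random.gauss(0.0, h)
    # Step 4: evaluate the corresponding penalty.
    if p_real > p_sched:
        costs.append(c_under * (p_real - p_sched))   # underestimation
    else:
        costs.append(c_over * (p_sched - p_real))    # overestimation

# Steps 6-7: expected value over all scenarios.
expected_cost = sum(costs) / len(costs)
print(round(expected_cost, 2))
```

The mean of the simulated costs converges to the sum of the two analytical expected-value integrals, which is the comparison used to compute the percentage error of the analytical functions.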

Data of the Controllable Load in Study
For the development of this work, statistical data containing hourly records of the energy consumption and ambient temperature of buildings of different uses, in Florida, USA, for the year 2016, were used. In a first step, the data were processed to obtain daily consumption and temperature records, in order to facilitate their manipulation. For this article, the data of a building identified with the number 944 (hereinafter, Building 944) were used (a random building from over 1000 buildings over a three-year timeframe given by the American Society of Heating, Refrigerating and Air Conditioning Engineers (ASHRAE) in the competition "ASHRAE-Great Energy Predictor"), due to its commercial use and the dependency relationship between its energy consumption and ambient temperature, which is detailed below.

Climate Dependence
First, the dependence of the energy consumption data of the load under study on its ambient temperature was determined. For Building 944, plotting its energy consumption data against its ambient temperature data revealed, in the first instance, the non-linear "v"-shaped relationship that is characteristic of loads whose consumption depends on ambient temperature. As mentioned in [17], this type of behavior results from the use of climate-dependent controllable loads such as heaters or air conditioners (room temperature control devices), which activate to reach the comfort temperature. Figure 5 presents the graph.
Then, as mentioned in Section 2.3, it is possible to obtain a mathematical expression that reflects the dependence of electricity demand on the ambient temperature for a controllable load. The necessary procedures that were carried out for the development of the expression for Building 944 are detailed below.

Multivariate regression
To obtain the mathematical expression of the dependence of energy consumption on ambient temperature, the multivariate regression method was used. This method consists of expressing one variable (energy consumption in this case) as the weighted sum of the different variables that may be related to it [17]. The variables used to carry out the multivariate regression are detailed next.

Exogenous temperature variables
After visualizing the non-linear behavior of the relationship between energy consumption and ambient temperature in the building, we proceeded to linearize the curve by associating exogenous temperature variables to the expression. These variables are the Cooling Degree Days and the Heating Degree Days (CDD and HDD, respectively) [17]. They are related to temperature through the following expressions:

CDD = max(T_amb − T_opt, 0)

HDD = max(T_opt − T_amb, 0)

Reviewing these equations, it can be seen that, when one "acts", the other will be 0, because the ambient temperature T_amb will be on one side or the other of the optimal temperature T_opt. With this, the points of the curve below T_opt are covered by HDD and the points of the curve above T_opt are covered by CDD.
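The two exogenous variables can be computed directly from a temperature record. In this illustrative Python sketch, the comfort temperature T_opt = 20 °C is an assumption for the example, not the value used for Building 944:

```python
# CDD/HDD linearization variables; T_OPT is an assumed comfort temperature.
T_OPT = 20.0

def cdd(t_amb):
    """Cooling Degree Days term: positive only above the comfort point."""
    return max(t_amb - T_OPT, 0.0)

def hdd(t_amb):
    """Heating Degree Days term: positive only below the comfort point."""
    return max(T_OPT - t_amb, 0.0)

for t in (12.0, 20.0, 31.0):
    print(t, cdd(t), hdd(t))   # at most one of the two is nonzero
```

Splitting the temperature this way turns the "v"-shaped consumption curve into two linear branches, one weighted by a (CDD) and one by b (HDD) in the regression.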

Trend and seasonality variables
As can be seen in previous works [16,17], the behavior of energy consumption can present trends and seasonality related to the time series. The above refers to repetitive patterns or increases and decreases in the time series, sustained over time. These effects are reflected in the electricity demand due to certain behaviors that occur in consumption and that depend on the day of the week or the time of the year in which a certain demand record occurs. To take these effects into account, dummy variables (variables that take values of 1 when a condition is met and 0 otherwise) were introduced in the expression that represent the days of the week, the months of the year, holidays and the days immediately preceding holidays. As explained in previous works [16,17], these variables are taken into account to see the impact of both trend and seasonality on the expression.
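A hedged sketch of how these dummy variables can be generated for a given day is shown below; the holiday set is an illustrative assumption, not the calendar used in the actual study.

```python
from datetime import date, timedelta

# Illustrative holiday set (assumed, not the study's calendar).
HOLIDAYS = {date(2016, 7, 4), date(2016, 12, 25)}

def calendar_dummies(d):
    """Return 0/1 indicators: day of week (7), month (12), holiday, pre-holiday."""
    day_of_week = [1 if d.weekday() == i else 0 for i in range(7)]
    month = [1 if d.month == j else 0 for j in range(1, 13)]
    holiday = 1 if d in HOLIDAYS else 0
    # Day immediately preceding a holiday.
    pre_holiday = 1 if d + timedelta(days=1) in HOLIDAYS else 0
    return day_of_week + month + [holiday, pre_holiday]

# July 3rd, 2016: a Sunday immediately preceding an (assumed) holiday.
print(calendar_dummies(date(2016, 7, 3)))
```

Each row of such indicators becomes a set of regressors in the multivariate regression, so the fitted coefficients capture day-of-week and seasonal effects.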

Autoregression
Another effect that can occur in the treatment and analysis of time series is autoregression, as mentioned in [18]. This refers to the relationship that a datum in the series may have with previously recorded data; in this work, it is the dependence of the energy consumption of one day on the consumption of previous days. In [28], the presence of autoregression in time series related to electricity demand is evidenced, for which "lag" variables were introduced in order to observe the dependence of the series up to three days back.
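Building lag features up to three days back can be sketched as follows (the daily consumption series shown is an assumed placeholder, not the Building 944 data):

```python
# Assumed daily consumption series (kWh), for illustration only.
consumption = [120.0, 131.5, 128.2, 140.7, 135.1, 129.9]

def lag_features(series, max_lag=3):
    """Return rows [y_t, y_{t-1}, ..., y_{t-max_lag}] for each usable day t."""
    rows = []
    for t in range(max_lag, len(series)):
        rows.append([series[t]] + [series[t - k] for k in range(1, max_lag + 1)])
    return rows

for row in lag_features(consumption):
    print(row)
```

The lagged columns then enter the regression as additional explanatory variables, so the fitted coefficients l_k quantify the autoregressive dependence.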

Structure of the regression expression
Taking into account the above variables, the equation used to characterize the relationship between consumption and temperature of Building 944 has the following structure:

E_t = C + e·t + a·CDD_t + b·HDD_t + Σ_i d_i·Day_{i,t} + Σ_j m_j·Mes_{j,t} + h_1·Hol_t + h_2·PreHol_t + Σ_{k=1}^{3} l_k·E_{t−k}

where E_t is the energy consumption on day t; C is a constant; t is the time variable; e is the coefficient of the time variable; Day_i is the dummy variable for day i of the week; Mes_j is the dummy variable for month j of the year; Hol and PreHol are the dummy variables for holidays and days immediately preceding holidays; a is the coefficient of the variable CDD; b is the coefficient of the variable HDD; d_i, m_j, h_1 and h_2 are the coefficients of the corresponding dummy variables; and l_k is the coefficient of lag k, i.e., of the consumption E_{t−k}.
With the expression complete, it is possible to construct the regression equation using the Matlab multiple linear regression function "regress". In terms of the similarity between the data replicated from the found expression and the base data of Building 944, a correlation coefficient R² of 0.9154 was obtained. Nevertheless, a high correlation coefficient of 0.9154 does not by itself prove the adequacy of the regression model; indeed, both the F-test and the t-tests yielded p-values < 0.05 for several buildings (Table 1), which means the null hypotheses are rejected. Thus, the results of the multidimensional regression model alone do not justify conclusions about the influence of temperature on electricity demand (consumption); rather, this study assumes that, for the buildings of the ASHRAE "Great Energy Predictor" dataset [25], there is a direct influence [29-33].
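An ordinary-least-squares analogue of Matlab's "regress" can be sketched in Python with NumPy. The data below are synthetic, and the coefficients and R² it produces are illustrative, not the Building 944 results reported in the text; dummy and lag variables would simply be appended as extra columns of the design matrix.

```python
import numpy as np

# Synthetic daily data: trend, CDD, HDD, plus noise (illustrative only).
rng = np.random.default_rng(0)
n = 200
t = np.arange(n, dtype=float)
cdd = rng.uniform(0, 10, n)
hdd = rng.uniform(0, 10, n)
y = 50.0 + 0.05 * t + 3.0 * cdd + 2.0 * hdd + rng.normal(0, 1.0, n)

# Design matrix: constant, trend, CDD, HDD.
X = np.column_stack([np.ones(n), t, cdd, hdd])

# Least-squares fit, equivalent in role to Matlab's "regress".
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coefficient of determination R^2.
y_hat = X @ beta
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print("coefficients:", beta)
print("R^2:", r2)
```

The fitted coefficients recover the generating values closely because the synthetic noise is small relative to the signal.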
Over the years, several studies have attempted to determine a consistent relationship between demand and ambient temperature, in order to establish the former as a dependent variable. Taking into account characteristics of the behavior of ambient temperature such as randomness, seasonality and trends, the methods used to find its relationship with electricity demand have relied on statistical (regression) and probabilistic (expected value function) tools. The proposal here is a multiple linear regression procedure to validate this assumption for the set of ASHRAE buildings. Initially, hourly data were available for both the demand and the ambient temperature of the building; these were averaged to obtain daily data. With the daily data, the characteristic curve of the behavior between electricity demand and temperature was obtained. The curve has a "V" shape. Intuitively, when a place is inhabited there is an optimal comfort temperature (the minimum point), and, when the temperature moves away from the optimum, electrical devices (heaters or air conditioners, depending on the case) are used to return to the comfort temperature, thus increasing demand at the temperature extremes.
Additionally, this assumption is supported by the comparison of the adjusted models found for the analyzed ASHRAE buildings with and without the autoregression effect (Table 1).
As shown in Table 1, in most cases the models found through the multiple regressions have a high degree of similarity to the data examined for each building. Most of the models have an adjusted R² above 0.8, so the results simulated from them match more than 80% of the input data of the regressions. Furthermore, it is noteworthy that several models (5) have an R² close to or above 0.9 (90%). The impact of the autoregression phenomenon on the models is also worth mentioning: a first comparison reveals that, in all the analyzed cases, the similarity of the models improves when autoregression is considered as a descriptive variable of energy consumption in buildings. This shows, from the statistical data alone, a dependence of energy consumption on the consumption of previous days.
In particular, the variation in Buildings 984 and 1197 when the effect of autoregression is considered stands out. As shown in Table 1, the first goes from an R² of 0.529493 to one of 0.717893, and the second from an R² of 0.477236 to one of 0.901832. This implies that the behavior of these buildings is strongly influenced by autoregression, to the point that the models initially found have a low similarity relationship and, once autoregression is included, a high descriptive capacity for their behavior. This confirmed the relevance of the developed expression and, therefore, the dependence of the consumption data on the ambient temperature of Building 944.

Building 944 Probability Density Function
After characterizing the dependence of energy consumption on ambient temperature, the work focused on selecting probability distributions capable of representing the statistical energy consumption data of Building 944. After a literature review, the kernel density estimator described in Section 3.2.1 was selected. Thus, to represent the probability density of Building 944, the Gaussian kernel function of Equation (7) was chosen.
To proceed with the calculation of the expected value integral of the uncertainty cost functions, it is first necessary to establish the probability density function f_{P_e,i}(P_e,i) that describes the behavior of Building 944. As mentioned above, this density function uses the density estimator with a normal (Gaussian) kernel function, applied to the statistical data of Building 944. The expression of the estimator is obtained by substituting function (7) into Equation (6) and finally into expression (5). This leads to the probability density function for Building 944, represented by the following equation:

f_{P_e,i}(P_e,i) = (1 / (n h √(2π))) Σ_{j=1}^{n} exp(−(P_e,i − μ_j)² / (2h²))

where μ_j is the j-th datum of the consumption series for Building 944; n is the total number of data in the series; h is the bandwidth of the estimator; and P_e,i is the real energy demand, used as the integration variable.
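A numerical sketch of evaluating such a Gaussian kernel density estimate is shown below. The data sample is an assumed placeholder (not the Building 944 series), and the bandwidth follows Silverman's rule of thumb, which may differ from the bandwidth used in the study.

```python
import numpy as np

# Assumed consumption sample (kWh), for illustration only.
mu = np.array([110.0, 125.0, 128.0, 131.0, 140.0, 95.0, 160.0])
n = mu.size
h = 1.06 * mu.std(ddof=1) * n ** (-1 / 5)  # Silverman's rule-of-thumb bandwidth

def f_hat(p):
    """Gaussian-kernel density estimate of the real demand at the point p."""
    z = (p - mu) / h
    return np.sum(np.exp(-0.5 * z ** 2)) / (n * h * np.sqrt(2 * np.pi))

# Numerical check: the estimated density integrates to ~1 over a wide grid.
grid = np.linspace(0.0, 300.0, 3001)
dx = grid[1] - grid[0]
mass = sum(f_hat(p) for p in grid) * dx
print("total probability mass ~", mass)
```

Because the estimate is a mixture of Gaussians centered on the data, it integrates to one and peaks near the bulk of the sample.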

Results and Discussion
In this section, the results obtained from simulation and validation are shown for different conditions of dispatched power and maximum dispatchable power, in order to check the behavior of the equations obtained under different conditions. An analysis of the results is also developed. It should be remembered that the dispatched or scheduled power is P_s and the maximum power to dispatch is P_e,∞. For these cases, the minimum, average and maximum consumption values of Building 944 are used, identified as min944 = 8 kWh (Case 1), mean944 = 128.2749 kWh (Case 2) and max944 = 577 kWh (Case 3), respectively.
The results shown below are from 25 simulations for each case. It is worth mentioning that, for each simulation, 10,000 random consumption values for Building 944 were generated as part of the Monte-Carlo simulation method.

Case 1
Table 2 shows Case 1, where P_s = min944 and P_e,∞ = max944.

Case 2
Table 3 shows Case 2, where P_s = mean944 and P_e,∞ = max944.

Case 3
Table 4 shows Case 3, where P_s = max944 and P_e,∞ = max944. The cost coefficient for underestimating the energy, c_e,u,i, is 30 USD/MW, and the cost coefficient for overestimating the energy, c_e,o,i, is 70 USD/MW, using data from previous studies [10-12]. The reason for the large differences in costs for Cases 1-3 (USD 36,072.6, 7115.2 and 314,107.4, respectively) is that, for the extreme cases (Cases 1 and 3), the uncertainty cost increases due to the probability distribution of the real demand (P_e,i: the real power delivered or demanded at node i). P_e,s,i (the power scheduled to be delivered or consumed at node i) is the set value for the controllable load, to be determined by an economic dispatch optimization problem whose decision variables include not only the power to be injected by the generation agents but also the controllable demand values to be scheduled; that is to say, it is an exogenous variable assumed in the regression model.
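A hedged Monte-Carlo sketch of this validation procedure is shown below: 10,000 random consumption draws are costed against the scheduled power P_s with the coefficients above. The consumption sample and kernel bandwidth are illustrative placeholders, not the Building 944 data, so the printed costs will not match the paper's figures; the sketch only illustrates why scheduling at the extremes inflates the expected uncertainty cost.

```python
import random

# Assumed consumption sample (kWh) and kernel bandwidth (placeholders).
mu = [110.0, 125.0, 128.0, 131.0, 140.0, 95.0, 160.0]
h = 14.9
c_u, c_o = 30.0, 70.0  # under/overestimation cost coefficients [10-12]

def expected_cost(p_s, n=10_000, seed=1):
    """Monte-Carlo estimate of the expected uncertainty cost at schedule p_s."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # Sample from the Gaussian-kernel density: pick a datum, add noise.
        p_real = rng.gauss(rng.choice(mu), h)
        # Underestimation cost if demand exceeds p_s, overestimation otherwise.
        total += c_u * (p_real - p_s) if p_real > p_s else c_o * (p_s - p_real)
    return total / n

for p_s in (8.0, 128.27, 577.0):  # min944, mean944, max944
    print(f"P_s = {p_s}: expected uncertainty cost ~ {expected_cost(p_s):.1f}")
```

As in Cases 1-3, scheduling at the minimum or maximum consumption yields a much larger expected cost than scheduling at the mean, because almost every draw then deviates from the schedule.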

Analysis of Results
The results obtained for Cases 1-3 show that the percentage differentials are minimal. Figures 6-8 present detailed graphical comparisons between the analytical functions and the Monte-Carlo method for these cases. Visually, the values tend to be quite similar, and in some simulations almost identical. It should be mentioned that Cases 1-3 have in common that the maximum energy to dispatch is equal to the maximum energy consumption recorded for Building 944. Table 5 presents the average error over the 25 simulations for Cases 1-3. These errors do not exceed 1%, which confirms that the analytical functions found are a good approximation to the uncertainty costs modeled with the Monte-Carlo simulation method. In general, the previous simulation cases show that the equations found behave well as long as the maximum energy to be dispatched, P_e,∞, is large enough to cover the demand, in this case, that of Building 944. This is related to the uncertainty cost function for underestimation, which depends on the value of P_e,∞, as can be seen in Equation (12).
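The error metric used for this comparison can be sketched as the percentage differential between the analytical cost and each Monte-Carlo estimate, averaged over the simulations. The analytical value below is the Case 2 figure from the text; the Monte-Carlo run values are assumed placeholders.

```python
# Analytical uncertainty cost for Case 2 (USD), from the text.
analytical = 7115.2
# Assumed Monte-Carlo estimates from individual simulations (placeholders).
mc_runs = [7101.4, 7140.9, 7098.3]

# Percentage differential of each run versus the analytical value.
errors = [abs(analytical - mc) / analytical * 100 for mc in mc_runs]
avg_error = sum(errors) / len(errors)
print(f"average percentage error: {avg_error:.3f}%")
```

With differentials of this size, the average error stays well under the 1% bound reported in Table 5.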

Conclusions
It is possible to determine uncertainty cost functions for agents in the electricity sector with specific characteristics from a probability density function that describes their behavior and the analytical development of integrals of expected value. The probability density estimator represents a very useful tool for approximating probability density functions for future use in analytical procedures. In this paper, the relationship between electrical demand and room temperature is considered, which is susceptible to linearization based on the inclusion of exogenous variables such as cooling degree days (CDD) and heating degree days (HDD).
Furthermore, the phenomenon of autoregression has a considerable effect on the time series used in this study. Its inclusion in the models obtained from the multiple linear regressions makes it possible to obtain models closer to the data recorded for the different cases.
Through the uncertainty cost functions found, it is possible to obtain the costs of an agent in the electricity sector, as long as statistical data on their behavior are available. Additionally, the functions found represent a good approximation to the behavior of the uncertainty costs of the analyzed controllable load, since they present errors of less than 1%, compared to the Monte-Carlo simulations.
There is an intrinsic dependence between the analytical functions found and the maximum demand value that could be scheduled for the controllable load. Additionally, the inclusion of "dummy" variables representing conditions external to the relationship under analysis allows a model adjustment more faithful to the reality of the recorded data.