Interval Optimization to Schedule a Multi-Energy System with Data-Driven PV Uncertainty Representation †

: Recently, multi-energy systems (MESs), whereby different energy carriers are coupled together, have become popular. For a more efﬁcient use of MESs, the optimal operation of these systems needs to be considered. This paper focuses on the day-ahead optimal schedule of an MES, including a combined heat and electricity (CHP) unit, a gas boiler, a PV system, and energy storage devices. Starting from a day-ahead PV point forecast, a non-parametric probabilistic forecast method is proposed to build the predicted interval and represent the uncertainty of PV generation. Afterwards, the MES is modeled as mixed-integer linear programming (MILP), and the scheduling problem is solved by interval optimization. To demonstrate the effectiveness of the proposed method, a case study is performed on a real industrial MES. The simulation results show that, by using only historical PV measurement data, the point forecaster reaches a normalized root-mean square error (NRMSE) of 14.24%, and the calibration of probabilistic forecast is improved by 10% compared to building distributions around point forecast. Moreover, the results of interval optimization show that the uncertainty of the PV system not only has an inﬂuence on the electrical part of the MES, but also causes a shift in the behavior of the thermal system.


Motivation and Background
The past decade has witnessed an increasing interest in multi-energy systems (MESs), where different energy carriers are coupled together to supply the energy demands of the customers (from electrical demand to thermal demand) [1]. Combined heat and power (CHP) and electric heat pumps are examples where electricity, heating/cooling, and a gas network are grouped together. However, the general definition of an MES is not only limited to these energy carriers (electricity and gas) and the mentioned services (electricity and heating/cooling). According to [2], an MES can be defined as a system with "multifuel inputs" and "multi-service outputs". Moreover, MESs can provide different benefits, such as a synergy effect, a greater integration of renewable energy sources and therefore greenhouse gas emission reduction, cost and energy saving, more flexibility in the energy system, a flexibility of supply, and so on [3,4].
In spite of the above advantages, a high number of decision variables and the combined carriers make the MES a more complex system compared to the de-coupled systems. Therefore, this complexity makes decision making and optimal operation more difficult [5]. Requiring more data, involving more sources of uncertainty and having more degrees of freedom are the examples that make the operation of MESs a daunting task [1]. Beyond the general complexity, local energy storage devices, such as electrical energy storage (EES) and thermal energy storage (TES), increase complexity due to the time dependency [6]. These energy storage devices can help not only to compensate the mismatch between local energy generation and local demand, but also to reduce the operational cost of the system by providing flexibility and grid balancing servicing. Therefore, obtaining the optimal scheduling of energy storage devices can result in improving self-consumption of the system and reducing the energy cost. On the other hand, the optimal scheduling of energy storage devices depends on the future energy generated by local distributed generation units, the future load of the system, and the prices for each energy carrier. All of the mentioned items are variable and uncertain. As a result, in order to obtain a realistic optimal day-ahead decision to schedule different energy devices, it is necessary to represent uncertainty in the system.

Relevant Literature
A large number of research projects have been conducted on the optimal scheduling of MESs in order to reduce the operational cost. An input-output model was introduced by Geidl et al., which links the input energy carriers to output energy carriers by a coupling matrix and so-called "energy hub" [7]. The energy hub model was used in [5,8,9] to optimally design and operate an MES, including some energy generation units and energy conversion units. In fact, the energy hub concept can be counted as an aggregation model of an MES, which can be useful in some applications [2]. However, in the energy hub approach, the model is based on aggregating different energy flows in a coupling matrix, which brings non-linearity to the model, because of the multiplication of variables, and increases the complexity of the problem. Therefore, a disaggregated model of the MES can lead to a more tractable problem [10]. A deterministic optimization approach was proposed to optimally schedule a battery and manage energy in a residential MES [11]. A day-ahead deterministic optimization was run in [12] to obtain an optimal schedule of an MES under different scenarios for wind power generation. In these models, the uncertainty of stochastic variables, such as the generation of renewable energy sources, energy carrier prices, and thermal and electrical loads, is neglected. Ignoring the available uncertainty in the system can result in non-realistic and non-optimal solutions.
In general, there are three approaches that can help to make optimal decisions under uncertainty: stochastic optimization, robust optimization, and interval optimization. A multi-stage stochastic optimization was proposed in [13,14] to optimally schedule an MES under the uncertainty of wind power generation. However, the main issue with stochastic optimization is the high computational time, which makes this method more difficult to apply to a complex system such as an MES. In spite of the stochastic optimization, robust optimization has a lower computation time and addresses the uncertainty by taking into account the worst-case scenario. Robust optimization was proposed in [15][16][17][18][19] to schedule the MECs, while different sources of uncertainty (such as the power generated by renewable resources, and electrical and thermal consumption) were taken into account. Although robust optimization presents a lower computational burden, it solves the problem pessimistically, which adds greater costs to the operation of the system.
Recently, interval optimization has become popular for addressing the uncertainty in MESs. In interval optimization, the stochastic parameter is represented by a level of uncertainty (uncertainty interval). In fact, interval optimization has the advantage over robust optimization of minimizing the objective function interval instead of the worst case scenario [20]. Moreover, interval optimization has a lower computation burden compared to stochastic optimization [21]. In [1], the certainty of wind power was addressed by interval optimization to optimally operate a gas-electricity system. In another research project, the uncertainties of PV and wind turbines were considered by interval mathematics to optimally operate an MES [22]. Yuan et al. proposed a method to model the correlation between irradiation and the output of PV and solar collectors, and the uncertainty of light intensity was then addressed via an interval optimization to operate an MES [23]. The uncertainties of household load consumption and PV production were modeled in [24]. The interval optimization was converted to a deterministic optimization, and a genetic algorithm was applied to solve it and optimally schedule the household appliances. In [25], an interval optimization was converted to a multi-objective optimization to schedule an MES's components while taking into account the uncertainty of the electricity price.
Although interval optimization seems to be a proper approach to address the uncertainty in a complex system such as the MES, the uncertainty interval is required as the input of this optimization. This uncertainty interval can be provided by a forecasting method or can be obtained by using a percentage of the point forecast value around itself [21]. In other words, this uncertainty interval is derived from the probabilistic distribution of forecast error. In all of the above literature, there is no good explanation about obtaining this uncertainty interval, and most of the time it was assumed that this interval was provided by another source. However, this assumption brings serious challenges when smaller-scale systems (e.g., a single MES) are considered, where neither the point forecast nor the probabilistic forecast are provided to the energy manager of a building. Therefore, it is worth focusing on obtaining a predicted distribution of stochastic variables in energy systems.
In general, there are two different approaches for obtaining probabilistic forecasts and generating uncertainty intervals afterwards: the parametric approach and the nonparametric approach. In the parametric approach, either a distribution function is fitted to the data, or the distribution is built around the point forecast. In [13], a t−location-scale distribution was fitted to obtain a probabilistic forecast of the wind power generation in an MES. However, in [1], the uncertainty interval was obtained from a building distribution function around the point forecast of wind power to be used in the scheduling of MES. In the second approach (non-parametric), no assumption is required on the type of the distribution of the stochastic variable, and the predictive distribution is obtained via a quantile function. A deep-learning-based probabilistic forecast was proposed in [26] to predict the intervals of wind power and the PV system. However, most of the forecasting techniques for renewable energy systems depend on weather forecast data, whereas for the scale of one system, weather forecast data are not freely and easily available [27]. As a result, a method that can predict interval numbers (predicted interval) and that is independent from the weather forecast needs to be proposed.

Contributions and Organization
As a result, in this paper, an interval optimization is applied to address the challenge of day-ahead scheduling of an industrial MES under the uncertainty of PV energy generation. An overview of the proposed method is shown in Figure 1. In the proposed interval optimization, the predicted intervals of day-ahead PV generation is used as the input to minimize the daily operational cost interval. The main novelties of this paper can be listed as follows: • The proposed method to forecast and represent the uncertainty of a PV system requires minimum data, which is an advantage over most current forecasting methods, which depend on weather prediction. • The reliability of the proposed probabilistic forecast method is evaluated and compared with a popular method as benchmark, where the distribution is built around the point forecast. • Not only is the impact of the uncertainty of the PV energy generation on both the electrical and thermal systems analyzed, but so is its impact on day-ahead and realtime operational cost. The rest of the paper is organized as follows. Section 2 introduces the non-parametric method to obtain the predicted interval for day-ahead PV generation. Section 3 models the proposed MES and formulates the interval optimization for the day-ahead scheduling problem. Section 4 discusses the case study and the simulation results. Finally, the paper is concluded in Section 5.

Point Forecast
In this paper, five different regression methods, including the persistence method, the simple moving average (SMA), the least absolute shrinkage and selection operator (LASSO), the feed-forward neural network (FFNN), and support vector regression (SVR), are implemented, and their performances are evaluated. In order to obtain the most accurate day-ahead point forecast of PV energy generation with a 15 min resolution, the results of all the methods are compared, and the most accurate method is selected. We would like to emphasize that this paper attempts to forecast day-ahead PV generation only based on historically measured PV generation data, without using any weather forecast information. This section briefly introduces the proposed regression methods as well as the metrics performance to evaluate the accuracy of the point forecast model.

Persistence Method
The persistence model is a basic method for representing PV generation in the future. The persistence model can be defined as a 'naive predictor' [28], which means that the output of the PV system at each time step on the next day is the same as its previous day at that time step [29]. For short-term predictions (less than 6 h), the persistence model is a strong competitor, while by increasing the time horizon, the accuracy declines [29]. The general formula for the persistence prediction is given in Equation (1): wherep d+1 (t) is the predicted PV energy generation at time t in day d + 1, and p d (t) is the measured value of the previous day.

SMA
In SMA, the future PV production is approximated by the average of the previous days measured data. The SMA can be formulated aŝ where l is the number of days ago or lags, which is a key element in SMA. The proper selection of l improves the accuracy of the regression method compared to the persistence method [30].

LASSO
We can formulate the forecast problem as a linear regression model: where the β values are the coefficients. Different methods have been proposed to estimate β values. LASSO is one of these techniques, which is a regularization method (which shrinks the coefficients towards zero) to avoid overfitting and is penalized by the L 1 penalty [31]. In order to obtain optimal coefficients, one needs to solve the following problem: where Y represents the observations, X is the features vector, which in this case is the previous observations, and η LASSO is the LASSO positive tuning parameter.

FFNN
The FFNN model is a type of artificial neural network (ANN), which is a common method for solving the regression problem [32]. In an FFNN, the input layer is connected to the output layer through (multiple) hidden layer(s), while each hidden layer includes a different number of neurons and an activation function. In this paper, the inputs of the FFNN are the historical measurements of PV generation, and the outputs comprise the day-ahead PV forecast at each time step t. In order to obtain the most accurate forecast, the hyper-parameters of the FFNN should be tuned, e.g., the number of neurons in each hidden layer as well as the type of activation functions in each layer.

SVR
Support Vector Machines (SVM) are well-known for complex classification problems [33]. The use of a SVM in a regression problem is known as Support Vector Regression (SVR), which was introduced by Drucker et al. [34]. SVR non-linearly maps the input vector into high-dimensional feature space and then performs a linear regression in that feature space. The SVR model can be formulated as a quadratic optimization, and a kernel function is needed to non-linearly map the data.

Evaluating Point Forecast
To evaluate the performance of the forecasting method, three different statistical metrics are defined [35]. To select the most accurate model to obtain day-ahead PV generation forecast, these metrics are calculated for each model and are compared.

1.
Normalized root-mean square error (NRMSE): The root-mean square error (RMSE) is one of the most commonly used evaluation metrics for time series point forecasts.
In general, the RMSE calculates the root-mean square of the error of the regression model. The NRMSE is calculated (see Equation (5)) when the real measurement and their forecast values are divided by the capacity of the PV installation.
where D is the total number of days in the data set, T is the time horizon, and P PV is the installed capacity of PV. Therefore, the NRMSE is reported in %.

2.
Normalized mean absolute error (NMAE): Another well-known performance metric is the mean absolute error (MAE), which considers the average of the forecast error over the whole data set. Here, if the point forecast and real measurement are divided by the capacity of the PV system, then the NMAE is obtained: 3.
R 2 : This metric shows the correlation between the predicted value and the real measured value: wherep is the mean value of measured data. A lower error of the forecast indicates a higher correlation.

Probabilistic Forecast
In this section, the methodology used to obtain a probabilistic forecast for day-ahead PV generation is explained. Assume a stochastic variable P with its cumulative distribution function (CDF) (F(P )). The quantile q α of P can be defined such that where α is the probability value between [0, 1]. According to (8), the CDF of the stochastic variable P can be determined via the quantiles.
In this paper, the historical error is used to determine the quantiles of predictive CDF. This method is based on the quantiles of the error e d (t) of the historical point forecast: After calculating the error of the point forecast for all the available data points in the data set, the empirical quantile distribution of the errors (q α e,d (t)) can be calculated. The challenge of applying the above approach to a PV system is that the generation of PV system is following a diurnal and seasonal pattern, which means that there is more production in the middle of the day, compared to the morning and evening. Moreover, the PV system generates more energy during the summer compared to the wintertime. Consequently, the absolute value of the forecast error is also higher during the middle of the day and the summer time, compared to the morning or evening and the wintertime. This pattern brings a challenge when the empirical quantile is calculated from historical errors. If the historical errors for all the previous days and all the hours are considered to calculate q α e,d (t), the irrelevant errors (Equation (10)) are also included in the calculations, which result in decreasing the performance of the probabilistic forecast. Therefore, in this paper, in order to improve the accuracy of the probabilistic forecast, only the errors of the important lags, which are selected from the point forecast, are taken into account to obtain the empirical quantile of the error. In other words, to calculate the quantiles of error for day d + 1, at each time step, the (q α e,d+1 (t)) can be calculated by the empirical quantile of each column: After obtaining the quantiles of the error, the day-ahead quantile function of PV energy generation at time step t can be calculated by To evaluate the performance of the proposed probabilistic forecast, the calibration (reliability) is used [36].

Obtain Predicted Interval
The quantiles of the PV forecast can be obtained from the previous part. However, as discussed before, the PV forecast should be presented in the form of a predicted interval as it is the input of interval optimization. It is important to make it clear that the predicted interval is different from the confidence interval, although in both cases the stochastic variable P is shown as an interval [P − , P + ] with confidence α%. In both cases, the probability that the stochastic variable P is within interval [P − , P + ] is equal to α%. In the confidence interval, this range is calculated based on previous observations, while this range is calculated based on the forecast values in the predicted interval. We can assume α 1 = Prob(P ≤ q α 1 ) and α 2 = Prob(P ≤ q α 2 ), where α 1 ≤ α 2 and q α 1 ≤ q α 2 and q α 1 , q α 2 are calculated from the probabilistic forecast. The predicted interval can be obtained by where α 2 − α 1 shows the probability (uncertainty level) that the random variable P is between P − and P + .

Scheduling an MES
In this section, first the components of an industrial MES, including a PV system, a CHP unit, two gas boilers, EES, and TES, are modeled. The optimal day-ahead schedule of the MES is formulated in the form of the interval optimization to address the uncertainty of PV generation. It should be mentioned here, since the data for PV production is reported on the AC side of the system, there is no need to model the converter's behavior.

CHP Model
The CHP unit in this study generates heat (h CHP ) and electricity (e CHP ) by consuming natural gas. Both generated electricity and heating by CHP should remain between their maximum and minimum values: where γ CHP (t) is a binary variable that indicates if CHP is on/off (1/0). The gas consumption by CHP to generate electricity (g CHP e ) is calculated by ∀t (15) where COP e2g is the coefficient of performance to convert kWh to m 3 . η CHP e (t) is the electrical efficiency of the CHP, which can be vary between 25% and 40% depending on the output of CHP. In this paper, the efficiency of the CHP is approximated by a linear function of its output value, which means η CHP e (t) = Ae CHP (t) + B ∀t (16) where A and B are calculated based on a linear interpolation between the full load efficiency and the efficiency at minimum load. The thermal part of the CHP unit is modeled in Equations (17) and (18). The thermal efficiency of the CHP (η CHP th (t)) is also approximated by a linear function of its heat generation h CHP .
As can be seen, the gas consumption of CHP at each time step is a non-linear function of the generated heat and electricity at that time step. In order to linearize these two equations (Equations (15) and (17)), McCormick Envelopes (see Appendix A) are applied, which is a convex relaxation method in general. It results in a linear problem in this case.

Boiler Model
There are two gas boilers available in the studied MES. Each of them is modeled by Equations (19) and (20): where h boi (t) is the thermal energy generated by each boiler at each time step, and H max,boi is the maximum range for the output of each boiler. The gas consumption (g boi (t)) for each boiler can be obtained by where η boi is the efficiency of the boiler.

EES Model
As proposed in [37], the state of energy (SoE) is used in this paper to model the EES. At each time step t, the SoE of the EES (soe EES (t)) is equal to where soe EES (t − 1) is the SoE at the previous time step, η EES is the charging efficiency, e ch,EES and e dis,EES are the amount of energy that is charged and discharged the battery, respectively. Moreover, the SoE should always stay between the minimum (SOE EES,min ) and the maximum value (SOE EES,max ): There is also a limit to transfer energy from and to the EES: where E ch,max is the maximum EES energy charging, and E dis,max is the maximum energy discharging. In order to avoid simultaneously charging and discharging, the binary variable γ EES (t) is defined.

TES Model
The model for TES is similar to the model for the EES. Here also, the SoE is used: where soe TES (t − 1) is the SoE of the TES from time step t − 1, h ch,TES is the amount of thermal energy that goes to TES, h dis,TES is the amount of thermal energy discharged from the TES, and η TES is the charging efficiency of the TES. At each time step, the SoE of the TES cannot exceed its capacity (C TES ), and it should always be a non-negative value: The total thermal charging and discharging of the TES cannot exceed the total heat generation and the thermal demand (H l (t)) for that time period t, respectively.

MES Optimal Scheduling
In order to obtain the optimal day-ahead schedule of the MES, the problem is formulated as MILP, while the controlled variables are energy generated by CHP and boilers, the charging and discharging schedule of EES, and the TES and optimal trading with the grid. The electrical and thermal load profiles, the PV generation including its uncertainty, and the prices of electricity and gas are provided to the optimization as inputs. In this section, first the model for deterministic optimization is proposed. The general formulation for interval optimization is introduced to address the uncertainty of the PV energy generation.

Deterministic Optimization
The main goal of this paper is to find the optimal day-ahead schedule of the units in an MES, which leads to a minimum operational cost. Therefore, the objective function (obj) of the day-ahead optimization problem can be written by where e pur is the total energy purchased from main grid, e sold is the amount of energy injected/sold to the grid, g pur is the gas consumption of the system, C pur e (t), C sold e (t), and C pur g are the price of buying electricity from the grid, the price of injecting electricity to the grid, and the price of buying natural gas, respectively. It is worth mentioning that it is assumed that the consumer is only a day-ahead price-taker and does not participate in the day-ahead market. C CHP , C boi,1 , and C boi,2 are the start-up costs of the units.
It should be mentioned that the production cost of the PV system, the maintenance cost of the system, and the CO 2 emission costs are ignored. Moreover, the optimal sizing of the units is beyond the scope of this paper, since it is a known real-life industrial MES (see Section 4.1).
The constraints of this optimization problem are the model constraints of each unit, i.e., Equations (13)- (28). However, there are some other equations that should be considered: grid limitation and energy balance constraints. The proposed MES can exchange electricity bidirectionally (buying and selling). Due to the size of the grid connection, there is a limit for the amount of energy that is purchased (E pur,max )/injected(E sold,max ) from/into the grid: 0 ≤ e pur (t) ≤ E pur,max ∀t (30) 0 ≤ e sold (t) ≤ E sold,max ∀t Equation (32) maintains the electrical energy balance, where E l (t) is the load of the system at each time step, and E PV (t) is the energy generated by PV at that time step. e sold (t) + E l (t) + e ch,EES (t) = E PV (t) + e CHP (t) + e pur (t) + e dis,EES (t) ∀t (32) Equations (33) and (34) show the energy balance at the thermal side of the system and in the gas part of the system, respectively. H l (t) in Equation (33) stands for the total thermal consumption of the system at time step t.

Interval Optimization
Interval optimization can deal with the uncertainty of the system where the uncertain parameter is reported by its lower bound and upper bound. According to the provided interval of uncertain parameters (Equation (12)), the interval optimization finds the optimal output bounds. In this paper, interval optimization is used to obtain the optimal day-ahead schedule of an MES, while the PV production is the source of uncertainty. It should be pointed out that, in this paper, it is assumed that the rest of parameters of the MES, such as energy carrier prices, and thermal and electrical load, are deterministic, and only the uncertainty of PV generation is considered.
We can assume the day-ahead PV energy generation is represented by where the bounds are calculated based on the proposed algorithm in Section 2. Therefore, all the decision variables and the constraints introduced in the previous part are also converted to an interval form. This interval optimization brings non-linearity to the problem, which is difficult to solve. However, according to [38], this original interval optimization can be transformed to two different deterministic problems. This means that, in order to find the interval of objective function ([obj] = [obj − , obj + ]), we need to solve the following problems: In the case of positive electricity prices, the minimum of the objective function is obtained when the PV generation is at its maximum value of the intervals (E PV,+ ), and the minimum value for the objective function is the time when the PV is represented by its lower bound (E PV,− ). This situation is reversed when the price for electricity is negative.

Simulation Results
In this section, the simulation results are presented and discussed. First, the case study and the parameters of the system are explained. The proposed method for data-driven interval prediction is then applied on the available data set. Finally, the simulation results for optimal day-ahead scheduling are discussed to show the effectiveness of the proposed algorithm.
All the simulations have been done in Python (Python Software Foundation, Fredericksburg, VA, USA) on an Intel(R) Core i7-7500U CPU @2.70 GHz computer (Intel Corporation, Santa Clara, CA, USA) with an installed memory of 16 GB. In order to implement the point forecast, the models from scikit-learn (Oakland, CA, USA) and Keras (Cambridge, MA, USA) are used. The optimization framework has been modeled in PYOMO (Oakland, CA, USA) [39], while GUROBI 9.0.2 (Gurobi Optimization, Beaverton, OR, USA) [40] is used as the solver.

Case Study
The proposed algorithm is evaluated on the data set available for an industrial MES in Germany. As shown in Figure 2, this industrial site includes a natural gas CHP system, two natural gas boilers, a local PV installation, an EES device to store electricity, and a TES device to store the generated heat. The parameters of each unit are summarized in Table 1. The local electrical consumption, thermal consumption, and PV energy generation were measured by a local measurement system. The data available are from 8 March 2019 to 17 March 2020 with a 15 min resolution, while the measurements for October 2019 and November 2019 are missing. Therefore, the data are available for 315 days in total. The quality of the local measurements were checked in advance. Missing values were replaced by linear interpolation, except the mentioned two months, which have been excluded from further analysis. The installed capacity of the PV system was upgraded in September 2019. Therefore, all the data are normalized regarding the capacity of PV at the moment of measurement.  The German day-ahead electricity prices were downloaded from the platform of the European Network of Transmission System Operators (ENTSO-E) [41]. In order to include taxation in the electricity price, the price for buying electricity from the grid at each time is assumed to be twice as high as the reported price by ENTSO-E at that time step. Moreover, the day-ahead price for selling electricity is assumed to be half of the price of buying electricity at that time step. Moreover, in order to analyse the real-time cost, the imbalance electricity price delta was downloaded via [42]. It is assumed that the real-time prices of buying and selling electricity are equal to each other. The price of natural gas is set to 0.4 e/m 3 (equal to 0.034 e/kWh) [43]. The electricity demand, the thermal demand, and the prices for energy carriers are shown in Figure 3 for 27 September 2019, which was randomly selected from our data set.

PV Uncertainty Representation
The first step to represent the uncertainty of PV energy generation is to obtain a point forecast. All the models proposed in Section 2.1 were applied on the available data set for PV generation, and the best model was selected based on the accuracy and the computational time. It should be mentioned that the PV data set is zero-inflated, which means that certain time periods are excluded (from 7 in the evening to 7 in the morning, since the PV production is (almost) zero). In order to obtain a good generalization performance, a 10-fold cross validation was used. Moreover, the data set was divided by 80% and 20% for the training set and the validation set, respectively. In order to obtain the optimal hyperparameters of each model, Grid Search was used to find the best performance (except for persistence method). The maximum lag for SMA and LASSO was set between two days ago and two weeks ago. It was determined that the best performance was obtained at 10 days ago (l = 10) for both methods. The FFNN also showed the best performance, with two hidden layers, each including 100 neurons, while the sigmoid function was selected as the activation function. Moreover, the best hyper-parameters of SVR are obtained with radial basis function (RBF) as the kernel function.
In Table 2, the results of different evaluation metrics are presented for all the point forecast models (the optimal hyper-parameter(s) of each model is(are) used). As shown, the LASSO performance is the best for all metrics, while SVR has the worst. The other models (persistence, SMA, and FFNN) perform similarly. More complicated models, such as FFNN and SVR, are not able to outperform LASSO because of the small size of the data set to train complicated models. Normally, these methods, especially the FFNN, require more data to be trained. Moreover, in the last column of Table 2, the computational time of each prediction model is shown. Since the persistence model only takes the last day values as the predicted energy production, it can be done very fast, while the FFNN takes the most time to provide the prediction values. To conclude, since LASSO outperforms the other models, it was selected as the day-ahead point forecaster. The next step was to calculate the quantiles of the forecast to build the predicted interval for day-ahead PV energy generation. A non-parametric approach based on historical quantiles of the error is proposed in Section 2.2. The quantile function of the errors are described by 99 quantiles with a probability value from 0.01 to 0.99 with a step of 0.01. According to the proposed method, only the historical errors from one day ago to the optimal (maximum) lag are considered (see Equation (10)), which turns out to be l = 10. Moreover, the forecast errors for the last 10 days at the same time step as well as the errors from 1 h before and 1 h after the previous days are considered to calculate the targeted time step quantiles of error.
Two probabilistic forecast benchmark methods are also implemented to be compared with the performance of the proposed method. In the first benchmark, the quantiles for each time step are calculated based on the historical errors from previous days, i.e., the seasonal behavior of the PV system is ignored. The second benchmark is based on building quantiles around the point forecast (the method used in most of the literature).
To evaluate the performance of the probabilistic forecast, the calibration diagram is used. It compares the similarity of predicted distribution (y-axis) with the distribution of the original data set (x-axis). A reliable probabilistic forecast should be well-calibrated, i.e., close to the diagonal. Figure 4 shows the average calibration diagram of the proposed methods versus two benchmark methods for all time horizons, over all days available in the data set. The solid black line represents the ideal case. The greater the similarity to the diagonal, the more reliable the method is.  As shown in Figure 4, the proposed method and the quantiles benchmark method perform quite similarly, while the proposed method is closer to the diagonal and is in the lead. On the other hand, the second benchmark, which is based on building distribution around the point forecast, has an uncalibrated reliability diagram. This means that the second benchmark is not able to provide a reliable result, and the predicted distribution is far from the original data. To quantify the above comparison, the maximum deviation and the mean absolute error with regard to the ideal case are calculated for each probabilistic method. The maximum deviation is 0.02%, 0.05%, and 0.3% for the proposed method, the first benchmark method, and the second benchmark method, respectively. Moreover, the proposed method ends up with a 0.01% mean absolute error, while the first and the second benchmark methods result in 0.02% and 0.12% of the mean absolute error. As a result, the proposed method, which is a combination of LASSO and quantile function based on historical errors, outperforms the two benchmarks. After obtaining the quantiles, the predicted interval can be obtained for any uncertainty level. Figure 5 illustrates the predicted interval for four different uncertainty levels (10%, 20%, 30%, and 40%) for PV generation together with real measurements and point forecast for the randomly selected day (27 September 2019). For instance, the interval for the 10% uncertainty is calculated based on the 45th and 55th quantiles.

Interval Optimization
The simulation results regarding the optimal day-ahead scheduling of the proposed MES are discussed in this section. In order to address the uncertainty of PV generation, interval optimization is proposed. In this part, by varying the uncertainty level for PV generation, the sensitivity of the operation cost of the system as well as the behavior of all the units are analyzed.
According to the formulation in Section 3, the interval optimization is applied to the proposed case study. Four different uncertainty levels are taken into account to represent the uncertainty of PV: 10%, 20%, 30%, and 40%. By solving the optimization model, the intervals for average daily operation cost are obtained and shown in Table 3, where obj − and obj + are the day-ahead scheduling cost interval. RT − and RT + are the real-time cost interval to compensate the representation of PV uncertainty. The values in Table 3 are reported in average of 315 days available in the data set. The 0% is simulated when only day-ahead point forecast values are used to estimate PV energy generation. As shown in Table 3, by increasing the level of uncertainty, not only does the width (obj + − obj − ) of day-ahead cost interval increase, but the width of the total operational cost (day-ahead and real-time) increases as well. Moreover, it can be seen that the interval optimization solution is sensitive to the PV uncertainty level, although the deviations are small. When the PV uncertainty level is changed from 10% to 40%, the lower bound and upper bound of the day-ahead cost would change by (1822.24 − 1852.14)/1852.14 = −1.61% and (1899.01 − 1870.61)/1870.61 = 1.52%, respectively. It can be concluded that the main part of the total cost (obj + RT) belongs to the day-ahead cost, and the real-time compensation costs are small compared to the day-ahead costs (obj − /(obj − + RT − ) = 98.98% and obj + /(obj + + RT + ) = 98.93% with a 10% uncertainty level). These cost intervals provide ideas for the energy management system of the MES regarding the uncertainty level of PV generation.
In order to analyze the effect of uncertainty on the behavior of the units and on the input energy carriers (electricity and gas), the average daily optimal values of these parameters are shown in Table 4 for two uncertainty levels (10% and 30%). As can be seen, the amount of purchased electricity and sold electricity are sensitive to the level of PV uncertainty, while the behavior of CHP and boilers do not change over different levels of uncertainty. Therefore, the gas consumption is approximately constant for these two levels of uncertainty. This behavior can be explained by the prices for electricity and gas as well as the limited capacity of TES. As formulated in Equation (29), the main objective of the proposed optimization is to minimize the operational cost of the MES. However, another factor that could be considered in MESs with local energy generation resources is self-reliance (see [44] for more information). Table 5 shows the values of the self-consumption (the ratio between the self-consumed local electricity generation and the total local electricity generation) and self-sufficiency (the ratio between the load supplied by local electricity generation and the total load) of the system for two uncertainty levels. As can be seen in Table 5, there are not many changes in self-consumption by switching from one uncertainty level to another. The reason is that the optimization, in order to minimize the cost, is more sensitive to the price of electricity, while maximizing self-consumption is beyond the scope of this paper. Moreover, in the daily average, there is more PV production in the lower bound analysis. Therefore, self-sufficiency is higher in lower bound analysis for both uncertain levels shown in Table 5. Moreover, in order to analyze the behavior of different units under PV uncertainty, their optimal operation under 30% uncertainty level over the representative day are shown in Figures 6-8. It should be mentioned that, in these figures, the dashed lines belong to the upper bound (shown by + ) analysis of the day-ahead objective function, while the solid lines represent the lower bound (shown by − ) analysis of the day-ahead objective function.  Figure 6 shows the optimal electricity interaction with the main grid. In day-ahead optimization, the injected energy is zero, while the real-time injected energy is shown in Figure 6a. As can be seen, the lower bound and the upper bound of real-time injection follow the day-ahead PV uncertainty. Due to the error in forecasting PV generation, there is more PV energy generation in the afternoon in real time. The bounds of the real-time purchased energy (Figure 6b) also depend on the uncertainty of the PV energy system. However, as shown in Figure 6b, the bounds for day-ahead purchased electricity do not only depend on PV uncertainty but also are affected by electricity price. The difference between lower and upper bounds of this parameter can be seen early in the morning when the electricity price is low. As the uncertainty of the PV system has an influence on the interaction of energy with the grid, the EES behavior is also affected by this issue (see Figure 8a).    The behavior of boilers and the thermal part of CHP is shown in Figure 7. These two units should supply the thermal demand of the site. As can be seen, CHP behavior is affected by PV uncertainty during the day. These changes also have an influence on the behavior of the two boilers and the TES (see Figure 8b). Since, in lower bound analysis, there is more PV generation compared to the upper bound analysis during the day, the CHP unit generates less energy (both thermal and electrical) during the day (also less energy stored in the TES in Figure 8b). However, in the evening, the CHP generates more energy in the lower bound analysis to supply the demand. Therefore, the uncertainty of the PV system shifts the generation of the CHP and boilers. However, the total generation of these units are constant in the lower and upper bounds analysis during one day.

Conclusions
This paper proposes a day-ahead scheduling model for the industrial MES, including CHP, PV, boilers, EES, and TES. The interval optimization is used to address the uncertainty of PV generation, while the uncertainty of PV generation is described by the predicted interval. A data-based forecasting algorithm is proposed to obtain the day-ahead point forecast as well as the predicted interval, which is only based on historical PV measurements. Moreover, the MES is modeled as MILP, while the non-linear behavior of CHP is approximated by a linear relaxation technique. In the simulation results, the effect of the uncertainty of PV is analyzed on the interval optimization solution and on the behavior of different components of the system. The results lead to the following conclusions:

1.
A combination of LASSO and historical quantiles of error is proposed to obtain a predicted interval, which only uses historical PV generation data. The proposed method results in a reliable solution. It is also shown that building the distribution around the point forecast, which is a popular method in some research papers, ends up with an uncalibrated reliability diagram.

2.
A sensitivity analysis is done for different uncertainty levels. It is determined that the interactions with the grid (both buying and injecting energy) are the most sensitive variables to the PV uncertainty level.

3.
Uncertainty of the PV system does not affect the total daily energy generated by CHP and boilers. The uncertainty only causes a shift in the production of these units.

4.
The whole uncertainty representation and optimization framework proposed in this paper require minimum data with low computational time, which make this approach feasible for implementation in real-life applications.
In our future work, the uncertainty of the loads will be added to optimal day-ahead scheduling.

Data Availability Statement:
The data is not openly available due to restrictions associated with the company under study. Please contact Glenn Ceusters (glenn.ceusters@be.abb.com) in order to have access to the data.
w ≤ x U y + xy L − x U y L (A4) w ≤ xy U + x L y − x L y U (A5)