Applications of Probabilistic Forecasting in Smart Grids: A Review

This paper reviews the recent studies and works dealing with probabilistic forecasting models and their applications in smart grids. According to these studies, this paper tries to introduce a roadmap towards decision-making under uncertainty in a smart grid environment. In this way, it firstly discusses the common methods employed to predict the distribution of variables. Then, it reviews how the recent literature used these forecasting methods and for which uncertain parameters they wanted to obtain distributions. Unlike the existing reviews, this paper assesses several uncertain parameters for which probabilistic forecasting models have been developed. In the next stage, this paper provides an overview related to scenario generation of uncertain parameters using their distributions and how these scenarios are adopted for optimal decision-making. In this regard, this paper discusses three types of optimization problems aiming to capture uncertainties and reviews the related papers. Finally, we propose some future applications of probabilistic forecasting based on the flexibility challenges of power systems in the near future.


Motivation and Contribution
Smart grids' operation and planning deal with different types of forecasts. The recent advances in information and communication technology (ICT) facilitate the real-time control of devices and resources based on the real-time system states. However, to establish an effective control, the mass of data received from smart meters should be processed and analyzed. These data are utilized to predict future system states. The results of this prediction are then employed to find the optimal control strategy. For example, in many studies, electricity consumption and renewable generation are forecasted and the results are used to determine the optimal commitment of the other conventional generation units and resources.
In this regard, well-trained data-driven forecasting models often give forecasters overconfident point forecasts [1]. This means that data-driven algorithms producing point forecasts cannot model the uncertainties and errors of those forecasts. In addition, the decision-making process may be subject to an information gap, that is, a disparity between the information that a decision maker has and the information that could be known. The information gap produces a range of possibilities, and this range increases as the gap grows. Decision makers may then base their decisions on the best-informed available model and disregard the uncertainties, which results in inadequate decision-making [2]. To resolve this issue, probabilistic forecasting is suggested.
A probabilistic forecast produces a predictive distribution of values rather than a single value. In general, there are two techniques to generate probabilistic forecasts, named parametric and non-parametric methods. Parametric methods associate a probability distribution with an uncertain variable and then try to estimate the parameters of this distribution.

Table 1. Comparison of this paper with the literature reviewing probabilistic forecasting models and applications in smart grids and power systems.

Table 1 compares refs. [4]-[11] with this paper; its recoverable column headings include "Price Probabilistic Forecast" and "Pathway towards Decision Making under Uncertainties".

To the best of the authors' knowledge, there is no comprehensive review paper proposing a framework for smart grid decision-makers and stakeholders based on the probabilistic forecasting of all uncertain inputs (wind, PV, price, load, etc.). In addition, the applications of probabilistic forecasts and the decision-making pathway were not fully assessed in the previous literature. The decision-making process should have three main stages: forecasting, scenario generation, and developing optimization problems. This paper addresses this gap as follows:
1. It aims to introduce a roadmap and a pathway towards decision making under uncertainty in a smart grid environment. The roadmap includes probabilistic forecasting models of uncertain parameters, scenario generation based on probabilistic forecasts, and solving stochastic, robust, and chance-constrained optimization problems according to the results of the previous steps.
2. It tries to guide upcoming similar works by introducing the smart grid's needs in the future. In this regard, probabilistic forecasting models should be developed for a wide range of uncertain parameters and not be limited to loads, prices, and renewable generation.

Paper Framework and Organization
The main goal of this paper is to introduce the roadmap and direction of making decisions in a smart grid environment, using probabilistic forecasting models. Figure 1 shows the general framework of this paper.

This paper first introduces probabilistic forecasting models and reviews how the distributions of uncertain parameters are predicted based on previous works. It then presents the most common methods utilizing the predicted distributions to generate scenarios. After that, we will discuss how the scenarios generated in the previous stage can help smart grid management systems to make decisions considering different scenarios of uncertain inputs. Finally, this paper discusses two state-of-the-art applications of probabilistic forecasting that can be extended in future works. In the conclusion and discussion section, some future applications based on the future requirements of power systems are also proposed.

Examples of Parametric Probabilistic Forecasting Models
Each forecast has an error, obtained in the post-processing stage by subtracting the forecasted value from the actual value. At the time of the forecast, a point forecast does not give forecasters any information about this error. Consider a linear regression model as an example. It is a forecasting model that aims to predict the response variable $Y_t$ based on some explanatory variables $(x_{t,1}, \ldots, x_{t,m})$ as follows:

$$\hat{Y}_t = \sum_{j=1}^{m} b_j x_{t,j} \qquad (2)$$

where $\hat{Y}_t$ is the point forecast in the deterministic model and $b_j$ denotes the weight associated with each explanatory variable. As can be seen in (2), the deterministic linear regression model does not consider the forecast error. Probabilistic forecasts, on the contrary, assume that the forecasts follow a known probability density function (PDF) and then forecast the parameters of that distribution. For instance, a probabilistic linear regression forecast assumes that the error term is normally distributed:

$$Y_t \sim \mathcal{N}(\mu_t, \sigma^2), \qquad \mu_t = \sum_{j=1}^{m} b_j x_{t,j} \qquad (3)$$

where (3) states that the forecasted values are normally distributed in this model, $\mu_t$ denotes the mean and $\sigma^2$ is the variance. The generalized linear models, however, consider exponential family distributions (Exp) of the response variables and utilize a link function $g(\cdot)$ to model the mean value in terms of explanatory variables [12]:

$$Y_t \sim \mathrm{Exp}(\mu_t, \varphi_t), \qquad g(\mu_t) = \sum_{j=1}^{m} b_j x_{t,j}$$

where $\varphi_t$ denotes the variance associated with the exponential family distributions. There is a more advanced model, called the Generalized Additive Model for Location, Scale and Shape (GAMLSS), that considers a very large set of distributions and is able to model the location, scale and shape parameters of these distributions [12]:

$$Y_t \sim \mathcal{D}(\mu_t, \sigma_t, \upsilon_t, \tau_t)$$

where $\mathcal{D}$ is the chosen distribution family, $\upsilon_t$ describes the skewness and $\tau_t$ the kurtosis of the distribution function. This means that the GAMLSS model can describe the shape of the forecast distribution, and not only its mean and variance.
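The difference between a point forecast and the Gaussian linear regression forecast described above can be illustrated with a minimal sketch. It assumes a single explanatory variable plus an intercept, toy temperature and load values, and a plug-in residual standard deviation that ignores parameter uncertainty; the function names and data are hypothetical and are not taken from the reviewed papers.

```python
from statistics import NormalDist, fmean

def fit_gaussian_linear(xs, ys):
    """Least-squares fit of y = b0 + b1 * x plus the residual standard deviation."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    # Residual variance with two fitted coefficients (n - 2 degrees of freedom).
    sigma = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return b0, b1, sigma

def predictive_distribution(x_new, b0, b1, sigma):
    """Normal predictive distribution: the mean is the point forecast."""
    return NormalDist(mu=b0 + b1 * x_new, sigma=sigma)

# Toy data (assumed): temperature in degrees C and load in MW.
temps = [10.0, 12.0, 15.0, 18.0, 21.0, 25.0]
loads = [52.0, 55.5, 61.0, 66.5, 73.0, 80.0]

b0, b1, sigma = fit_gaussian_linear(temps, loads)
dist = predictive_distribution(20.0, b0, b1, sigma)

point_forecast = dist.mean                             # deterministic forecast
lower, upper = dist.inv_cdf(0.05), dist.inv_cdf(0.95)  # 90% central interval
print(f"point: {point_forecast:.2f}, 90% interval: [{lower:.2f}, {upper:.2f}]")
```

The point forecast is only the mean of the predictive distribution; the interval is the additional information that a probabilistic forecast makes available to the decision maker.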

Examples of Non-Parametric Probabilistic Forecasting Models
A wide range of probabilistic forecasting models fall into the non-parametric category. The non-parametric probabilistic forecasting models do not use the existing known PDFs. Instead, they build predictive distributions of the response variable based on different factors or construct quantiles/ensembles considering historical data. Since they do not limit probability distributions to specifically known distributions, the non-parametric methods are more flexible. However, compared to parametric models, they often need larger datasets to be able to estimate the response variable distributions [13].
Random forest (RF) and quantile regression forest (QRF) models are two examples. RF models aim to forecast the conditional mean of the response variable given the input data, without associating any known distribution function with the variables [14]. Similarly, QRF models predict quantiles of the response variable regardless of any known parametric distribution. To obtain the quantiles, quantile-regression-based methods minimize a sum of absolute errors weighted asymmetrically around the target quantile [15,16]. Kernel density estimation (KDE) is another non-parametric method, which estimates the probability density of the response variable using kernel functions. The kernel density function can be mathematically written as follows [17,18]:

$$\hat{f}(y) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left(\frac{y - y_i}{h}\right)$$

where $K$ indicates the kernel function, $y_i$ are the sample points, $N$ refers to the total number of sample points and $h$ is the bandwidth, i.e., the smoothing parameter.
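As a minimal sketch of the kernel density estimate described above, the following code evaluates the estimator with a Gaussian kernel. The kernel choice, the bandwidth value and the normalized sample values are assumptions made for illustration only.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(y, samples, h):
    """Kernel density estimate at y from sample points y_i with bandwidth h."""
    n = len(samples)
    return sum(gaussian_kernel((y - y_i) / h) for y_i in samples) / (n * h)

# Toy normalized power observations (assumed values).
samples = [0.12, 0.18, 0.25, 0.31, 0.33, 0.47, 0.52, 0.58, 0.71, 0.80]
h = 0.1  # bandwidth (smoothing parameter), chosen by hand here

step = 0.01
grid = [-0.5 + step * i for i in range(201)]   # covers [-0.5, 1.5]
density = [kde(y, samples, h) for y in grid]
area = sum(density) * step                     # Riemann sum of the density
print(f"integral of the estimated density: {area:.4f}")
```

Note that the Gaussian kernel places some probability mass outside the physical range [0, 1] of a normalized power variable, which is one reason why the kernel must be chosen according to the type of response variable.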
In the KDE method, a proper kernel function should be selected according to the type of response variable. Another way of obtaining ensemble forecasts is to consider various initial states or different boundary conditions for the response variable, as in the lower upper bound estimation (LUBE) method [19]. This method provides forecasters with prediction intervals (PIs) and employs artificial intelligence tools to build them. In addition to LUBE, the bootstrap method can fall into this category, since it resamples the data and constructs a distribution of residuals accordingly [20]. The short range ensemble forecast (SREF), as another example of non-parametric forecasts, takes into account the uncertainties of initial states [21].
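The bootstrap idea mentioned above can be sketched as follows: past forecast residuals are resampled with replacement and added to a new point forecast to obtain an empirical prediction interval. This is a simplified residual bootstrap under the assumption of exchangeable residuals; the residual values, the function name and the parameters are hypothetical and do not reproduce the method of [20].

```python
import random

def bootstrap_interval(point_forecast, residuals, alpha=0.1, n_boot=2000, seed=0):
    """Empirical (1 - alpha) prediction interval from resampled residuals."""
    rng = random.Random(seed)
    # Each draw is the point forecast plus one residual sampled with replacement.
    draws = sorted(point_forecast + rng.choice(residuals) for _ in range(n_boot))
    lo_idx = int((alpha / 2) * (n_boot - 1))
    hi_idx = int((1 - alpha / 2) * (n_boot - 1))
    return draws[lo_idx], draws[hi_idx]

# Assumed past residuals (actual minus forecast) of some point forecasting model.
residuals = [-3.1, -1.2, -0.4, 0.3, 0.9, 1.5, 2.8, -2.0, 0.1, 1.1]

lower, upper = bootstrap_interval(100.0, residuals)
print(f"90% prediction interval: [{lower:.1f}, {upper:.1f}]")
```

Because the interval is built from the empirical residuals, it does not have to be symmetric around the point forecast, unlike the Gaussian case.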
A set of different machine learning-based and numerical forecasts can build a non-parametric ensemble forecast. This method uses $N$ different data-driven and numerical forecasting models to predict the response variable. These forecasts are then considered as quantiles. The corresponding distribution $F_t(y)$ can be mathematically modeled as follows [2]:

$$F_t(y) = \frac{1}{N} \sum_{i=1}^{N} l\!\left(y - \hat{y}_{i,t}\right)$$

where $l(y - \hat{y}_{i,t})$ denotes a Heaviside step function shifted to the $i$th ensemble member $\hat{y}_{i,t}$.
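The ensemble distribution described above can be sketched as an empirical cumulative distribution function in which every member contributes one equally weighted Heaviside step. The member values below are assumed, and the convention that the step equals one at zero is a choice made for this sketch.

```python
def heaviside(z):
    """Heaviside step function; the value at z == 0 is taken as 1 here."""
    return 1.0 if z >= 0 else 0.0

def ensemble_cdf(y, members):
    """Empirical CDF built from equally weighted ensemble members."""
    return sum(heaviside(y - m) for m in members) / len(members)

# Forecasts of N = 5 hypothetical models for the same time step.
members = [3.2, 3.9, 4.1, 4.4, 5.0]

for y in (3.0, 4.1, 6.0):
    print(y, ensemble_cdf(y, members))
```

With only a few members the resulting distribution is a coarse staircase, which is why ensemble forecasts are often post-processed or smoothed before being used for scenario generation.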

Artificial Neural Network-Based Probabilistic Forecasting
Artificial neural network (ANN)-based models can also be used to develop probabilistic forecasting models. NN architectures consist of a network of neurons as processing units. The neurons connect to each other through synapses, which are weighted connections. In the training stage, the optimal weights are determined.
In general, an NN has three types of layers: an input layer, an output layer and one or several hidden layers. A feedforward ANN passes the data forward from input to output. On the other hand, a recurrent NN (RNN) can connect some neurons in a backward direction as well as the forward direction for further processing. In this way, RNNs have the ability to consider the autocorrelation or time dependencies of data [22]. Figure 2 compares the architecture of a feedforward NN with that of an RNN. The literature has proposed various architectures and developed probabilistic forecasting models for both feedforward and recurrent ANN-based models. In the following, some important models are reviewed.

Appl. Sci. 2022, 12, x FOR PEER REVIEW

Examples of Probabilistic Forecasting Models Using Feedforward ANNs
Mixture density networks (MDN) and softmax regression networks (SRN) are two of the main feedforward ANN-based models aiming to obtain the distribution of uncertain parameters [23]. In an MDN, which is a parametric model, the associated probability density is obtained from a linear combination of kernel functions [24]:

$$p(y \mid x) = \sum_{i} a_i(x)\, K_i(y \mid x) \qquad (14)$$

where $x$ represents the input vector of the forecasting model, $y$ is the output vector, $K_i(y \mid x)$ is the kernel function selected for the model, and $a_i(x)$ represents the mixing coefficients, which depend on the inputs. In MDN models, the output neurons are the parameters of the distribution functions as well as the mixing coefficients. For example, the outputs can be the parameters of Gaussian distribution functions, including the mean values and the standard deviations, as well as the mixing coefficients.
In comparison, in an SRN model, which is a non-parametric probabilistic forecasting model, each output neuron associates a probability fraction with a value of $y$. For this purpose, the possible values of $y$ should be restricted to an interval. Hence, the output of an SRN represents the probability related to each member within the interval. Figure 3 compares the architectures of MDN with SRN, two ANN-based approaches utilized for probabilistic forecasts.
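How the output neurons of the two architectures are turned into a distribution can be sketched as follows. The sketch assumes Gaussian kernels, a softmax transform for the mixing coefficients and an exponential transform for the standard deviations; these are common choices rather than details stated in the source, and the raw output values stand in for a trained network.

```python
import math

def softmax(zs):
    """Map raw scores to positive weights that sum to one."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def gaussian_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mdn_density(y, raw_alpha, mus, raw_sigma):
    """Mixture density p(y|x) = sum_i a_i(x) K_i(y|x) with Gaussian kernels."""
    alphas = softmax(raw_alpha)                 # mixing coefficients a_i(x)
    sigmas = [math.exp(s) for s in raw_sigma]   # keeps standard deviations positive
    return sum(a * gaussian_pdf(y, mu, s) for a, mu, s in zip(alphas, mus, sigmas))

def srn_probabilities(raw_scores):
    """SRN output: one probability per discretized value (bin) of y."""
    return softmax(raw_scores)

# Hypothetical raw outputs of a trained MDN for one input vector x.
raw_alpha, mus, raw_sigma = [0.3, -0.2], [0.2, 0.6], [-3.0, -2.5]
step = 0.005
grid = [-1.0 + step * i for i in range(601)]    # covers [-1, 2]
area = sum(mdn_density(y, raw_alpha, mus, raw_sigma) for y in grid) * step

# Hypothetical raw outputs of an SRN with four bins of y.
bin_probs = srn_probabilities([0.1, 1.2, 0.7, -0.5])
print(f"MDN density integrates to {area:.4f}; SRN bin probabilities: {bin_probs}")
```

The MDN returns a continuous density defined by a few parameters, whereas the SRN returns a discrete probability for each value in the predefined interval, which reflects the parametric versus non-parametric distinction made above.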

RNN can also be combined with long short-term memory (LSTM) units. LSTM consists of some memory blocks that are recurrently connected. Each block consists of three multiplicative units including input gate, output gate and forget gate. The input gate memorizes either the new information or the previous states of the network. The forget gate disregards the irrelevant and unnecessary information obtained from the past. The output gate extracts the important information from the memory. In this way, unnecessary information is forgotten and only necessary ones are kept within the network [25].
Reference [25] is an example of research developing two RNN-LSTM-based probabilistic forecasting models. The first model is a parametric model and is quite similar to MDN. It first tries to extract a PDF of the uncertain parameter. Then, an RNN-LSTM network is trained to find the optimal values of the PDF parameters. The other model is, however, a non-parametric probabilistic forecasting model that is integrated with RNN-LSTM. It employs the QR method with the objective of predicting the quantiles of the uncertain parameter. The network is trained by minimizing the pinball (quantile) loss.
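The pinball loss used to train quantile models can be sketched as follows. The example searches a grid of constant candidate forecasts instead of training a network, which is a deliberate simplification; the observations and function names are assumed for illustration.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss of one forecast y_pred at quantile level q."""
    diff = y_true - y_pred
    # Under-forecasts are weighted by q, over-forecasts by (1 - q).
    return q * diff if diff >= 0 else (q - 1.0) * diff

def mean_pinball(observations, candidate, q):
    return sum(pinball_loss(y, candidate, q) for y in observations) / len(observations)

# Assumed observations of the uncertain parameter.
observations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
candidates = [c / 10 for c in range(0, 121)]   # constant forecasts 0.0 ... 12.0

best_q90 = min(candidates, key=lambda c: mean_pinball(observations, c, 0.9))
best_q10 = min(candidates, key=lambda c: mean_pinball(observations, c, 0.1))
print(f"forecast minimizing the 0.9 loss: {best_q90}, the 0.1 loss: {best_q10}")
```

The asymmetric weights push the minimizer towards the upper tail for high quantile levels and towards the lower tail for low ones, so a model trained with this loss estimates a quantile rather than the mean.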

Renewable Generation and Load Probabilistic Forecasting
In general, probabilistic forecasting methods have mainly been adopted to forecast the probability distributions of renewable-based power generation and/or load in smart grids. The following sub-sections aim to review important studies that proposed parametric and non-parametric probabilistic forecasts for energy demand and/or generation in a smart grid environment. Additionally, Tables 2-4 review some selected papers that tried to develop probabilistic forecasting models for solar generation, wind generation and loads, respectively.

Solar Probabilistic Forecasting
Network operators, generation agents and premises need PV generation to be forecasted at various horizons, including very short-term, intra-hour, hourly, intra-day and day-ahead forecasts [2]. In this regard, future PV generation is forecasted based either on solar irradiance or on the PV generation of previous times. A forecast based on solar irradiance builds a model from past meteorological data or present observations. To construct the model, it utilizes data from weather stations, satellites, local sensors and images as inputs [6]. It then maps these inputs to the solar generation.

Table 2. Selected papers developing probabilistic forecasting models for solar generation.

Ref. | Forecast Horizon_Forecast Resolution | Forecasting Methods | Advantages of the Work
 |  |  | The proposed probabilistic forecasting model could reflect the effects of atmospheric conditions on forecast errors
[27] | Two-day-ahead_30 min | NWP-based solar irradiance forecast | The paper used a post-processing method that was able to improve the performance of the forecasting algorithm
[28] | Day-ahead_1 h | NWP-based solar irradiance forecast | It determined several confidence intervals for each region
[29] | Day-ahead and hour-ahead_10 min | Different parametric and non-parametric models | It assessed the effects of reconciliation on the improvement of forecasts
[30] | Long-term_1 h | Three parametric models | The model could describe the stochastic characteristics and features of solar irradiance
[31] | Intra-day (1-6 h)_1 s | Two models developed using quantile regression | It conducted graphical analysis of reliability to compare the performance of the forecasts
[32] | Three-day-ahead_1 h | ANN-based combined with Analog Ensemble | The combination of these two methods yielded the best results compared to the individual models
[33] | Day-ahead_1 h | LSTM-based | The proposed model performed better than the simpler models but got the same results as the fully connected ANN-based model

In order to probabilistically forecast PV generation, some works proposed using numerical weather prediction (NWP) models under various scenarios. References [26-28] are some examples that employed NWP to obtain ensemble forecasts for solar irradiance and build an uncertainty model for PV generation. An NWP adopts some measured meteorological inputs, such as temperature, humidity and pressure. It then solves a set of partial differential equations to simulate the process through which solar irradiance is obtained [34]. To obtain a probabilistic forecast, an NWP method uses a training set consisting of the NWP ensemble members at timeslot t. After that, each member is weighted equally and utilized to make an empirical cumulative distribution function. NWP methods are usually straightforward to apply. However, it was argued that NWP models are computationally expensive to run, and thus it would be better to run them only a few times a day [35].

In the short run, such as real-time and near real-time forecast horizons, probabilistic forecasters mainly utilize data-driven models, including statistical and machine-learning ones [36]. For example, ref. [29] aimed to make short-horizon multistep-ahead forecasts (e.g., six forecasts with 10-min time slots), considering 11 data-driven machine-learning and parametric methods: least angle regression, least angle regression with elastic-net regularization, lasso regression, generalized linear models, generalized linear model with elastic-net regularization, Bayesian generalized linear model, gradient boosting machines, linear regression, boosted generalized linear model, multivariate adaptive regression splines, and projection pursuit regression. These models tried to probabilistically predict solar generation. The paper also discussed how the so-called "reconciliation techniques" proposed by [37] can improve the probabilistic forecasts. Additionally, ref. [30] compared three parametric probabilistic forecasting models for solar irradiance. The first model uses a Beta distribution with several shape factors, the second model utilizes a generalized triangular distribution and the third one combines multiple probabilistic forecasting models to construct the probability distribution of solar irradiance. Regarding non-parametric approaches, ref. [31] compared two 1-to-6-h-ahead probabilistic models for predicting global horizontal irradiance. The first model directly produces a set of quantiles for each time slot using one of three regression methods: the linear model in quantile regression (LMQR), the quantile regression forest (QRF), or the gradient boosting machine (GBM). The second model, however, consists of two stages. The first stage deploys the recursive least square autoregressive and moving average (ARMArls) method to make a point forecast for the irradiance. The outputs are used in the next step to estimate quantiles for each time horizon using the same methods. As mentioned earlier, the reviewed models first forecast irradiance and then predict PV generation accordingly, using a physical model and relationships. These models are often called white-box models.
By contrast, non-physical or so-called "black-box" forecasting approaches predict PV generation purely based on historical data and do not deal with the physical process and the meteorological data. For instance, ref. [32] developed a probabilistic forecasting model using ANN and an analog ensemble to produce 72-h forecasts of PV power. Additionally, ref. [33] utilized a more complicated approach, long short-term memory (LSTM), to probabilistically predict solar power and compared it with simpler models. As with NWP models, physical models may be more accurate over day-ahead and long-term horizons, whereas data-driven black-box models work better over short-term horizons such as intra-hour ones [38,39]. Figure 4 summarizes the probabilistic forecasting models for PV generation considering the forecasting horizon.

Wind Probabilistic Forecasting
Several studies relate a known PDF to wind generation. They mainly adopted Gaussian and beta distributions to probabilistically forecast wind power using parametric methods [40]. Some research also tried other types of distribution functions. For example, ref. [41] proposed a generalized version of the logit-normal distribution for wind power. Reference [42] considered the same distribution for wind generation. The authors of [43] took wind power as a double-bounded variable and obtained the appropriate distribution based on this assumption. In another study, ref. [44] solved an economic dispatch problem using a versatile probability distribution for wind generation. However, it has also been argued that assigning a specific distribution to wind power is not always appropriate, since in some cases the distribution of the predictive error of wind power changes over the prediction horizon [45].
Many papers have utilized non-parametric probabilistic forecasting models for wind generation. For instance, considering QRF-based methods, ref. [46] proposed a novel direct quantile regression (DQR) method to probabilistically predict wind power, generating quantiles without using statistical inference. The prediction was based on multi-step 10-min forecasts that combined the extreme learning machine (ELM) and QRF models to make the non-parametric probabilistic forecast and use it in a linear programming problem. The proposed approach was compared to four different forecasting techniques: the bootstrap-based ELM (BELM) with normal distribution, the BELM with beta distribution, the persistence model, and the radial basis function neural network (RBF-NN) model. The final results demonstrate that the model proposed by [46] performs better in terms of the sharpness criterion: the DQR model performed 25% better than the persistence model and 20% better than the RBF-NN probabilistic forecasting model. In addition to its sharpness, the DQR model presented a more acceptable computational time of 63.89 s, according to the results section of [46]. As another example, ref. [47] presented a joint quantile regression (JQR) model based on reproducing kernel Hilbert spaces for wind power probabilistic forecasting, utilizing the primal-dual coordinate descent technique. The work then employed the multi-objective salp swarm algorithm (MSSA) to optimize the final results. It then tested the forecasting model and compared it with other models on a one-step-ahead and a multi-step-ahead basis.
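Sharpness is usually reported together with the empirical coverage of the prediction intervals. As a minimal sketch, and assuming that sharpness is measured as the average interval width (the exact definition used in [46] is not given in this review), both quantities can be computed as follows; the interval bounds and observations are hypothetical.

```python
def interval_scores(lowers, uppers, actuals):
    """Average interval width (sharpness) and empirical coverage of the intervals."""
    n = len(actuals)
    sharpness = sum(u - l for l, u in zip(lowers, uppers)) / n
    coverage = sum(1 for l, u, y in zip(lowers, uppers, actuals) if l <= y <= u) / n
    return sharpness, coverage

# Hypothetical 90% prediction intervals and observed wind power (MW).
lowers = [1.0, 2.0, 3.0, 4.0]
uppers = [3.0, 4.0, 5.0, 6.0]
actuals = [2.0, 5.0, 4.0, 4.5]

sharpness, coverage = interval_scores(lowers, uppers, actuals)
print(f"sharpness (mean width): {sharpness}, coverage: {coverage}")
```

Narrower intervals are sharper, but sharpness is only meaningful when the coverage stays close to the nominal level of the intervals, so the two measures should be read together.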

Table 3. An overview of selected papers developing probabilistic forecasting models for wind generation.

Ref. | Forecast Horizon_Forecast Resolution | Forecasting Methods | Advantages of the Work
[41] | 10-min-ahead_1 min | Parametric (mixtures of generalized version of logit-normal distributions) | The work considered the non-linear nature and double-bounded characteristics of wind power forecast
[42] | 8-h-ahead_15 min | Parametric (censored normal distribution) | The work considered spatio-temporal effects on wind power forecast
[43] | Two-day-ahead_1 h | Several parametric probabilistic forecasts | It tested several distribution functions and found the Beta distribution function to be the most appropriate distribution for wind power
[44] | Day-ahead and week-ahead_1 min | Parametric (versatile distribution) | The model was integrated with the economic dispatch problems, which could simplify uncertainties of wind power
[46] | Hour-ahead_10 min | Direct quantile regression combined with machine learning methods | The model was shown to be highly efficient, reliable, and flexible for probabilistically forecasting wind power
[47] | One-hour-ahead and several-hour-ahead_15 min and 1 h | Joint quantile regression | The forecasting model was improved by a meta-heuristic algorithm
[17] | Different horizons from days to hours_30 and 60 min | A tri-level adaptation function integrated with a fuzzy inference system | The model outperformed other similar approaches in terms of computational efficiency and practicality since it avoided any pre-assumptions about forecast errors and data noises, and considered cost-based optimization problems in the model
[48] | 1-6-h ahead and day-ahead_different time steps | Multi-distribution ensemble (MDE) forecasting model integrated with competitive and cooperative strategies | The work tried to explore the best probabilistic forecasting accuracies by considering different forecasting horizons
[49] | Not specified | Three neural network-based models | Reinforcement learning was also utilized to combine and improve three kinds of deep learning networks
[50] | Different horizons from 1-24 h_1 h | Kernel density estimation with regular vine copulas and ensemble learning | The work showed that multi-distribution mega-trend-diffusion can improve the forecast when there are insufficient data
[51] | Hour-ahead_1 h | Data processing techniques integrated with ensemble NWPs | The work showed that data processing techniques can improve the probabilistic forecast of wind power considerably

Another method for obtaining the non-parametric distribution of wind power is the use of KDE. However, the model is highly impacted by the choice of kernel function. In other words, the appropriate kernel function should be selected based on the type of the random variable so as to avoid boundary effects on the PDF of wind generation [52]. To resolve this issue, ref. [17] proposed applying a tri-level adaptation function integrated with a fuzzy inference system.
As an example of ensemble forecasting models, the authors of [48] suggested a multi-distribution ensemble (MDE) forecasting model that is integrated with competitive and cooperative strategies. The work tried three different distributions as the ensemble members. Based on the comparison results presented by the paper, the MDE integrated with the cooperative strategy performed better in an hour-ahead forecast. However, the MDE integrated with the competitive strategy performed better for longer horizons, including two-to-six-hour-ahead and 24 h-ahead forecasts. Three different neural network-based probabilistic forecasting models were also presented by [49]. The work combined the ensemble deep learning method with empirical wavelet transform (EWT) decomposition, which outperformed the other models considered in the paper. Ref. [50] also combined improved kernel density estimation with regular vine copulas and ensemble learning to obtain an advanced probabilistic forecasting model. Similar to solar forecasting, there are some studies proposing NWP models. For instance, ref. [51] suggested wind power probabilistic forecasting using data processing techniques and ensemble NWPs. This methodology comprises data preprocessing techniques, an adaptive-network-based fuzzy inference system (ANFIS) integrated with a fuzzy c-means (FCM) clustering model, and LUBE for the prediction of forecast intervals. The work tried to show that data preprocessing and post-processing are very important for improving forecast models. For this purpose, it compared the model with an ANFIS model that disregards data preprocessing. The paper concluded that the model utilizing preprocessing outperformed the other models not deploying this technique.

Load Probabilistic Forecasting
Similar to renewable generation, probabilistic forecasting of loads varies widely with the type of forecast and the forecasting horizon [8]. In terms of forecasting models, different methods have been adopted, such as hybrid Kalman filters [53], Gaussian and lognormal processes [54,55], artificial neural networks [56,57], QR [58,59], RF [60], and stochastic time-series combined with Bayesian inference (BI) [61].

The method can be adopted for security analysis of power systems since it was able to generate demand scenarios at a specified risk level
[63] | Day-ahead_1 h | Partially linear additive quantile regression | The work combined the forecasting model with the unit commitment problem
[64] | Different horizons including 30-min-, one-hour-, two-hours-, and four-hours-ahead_30 min | LSTM-based | The model was designed to probabilistically forecast the individual consumer's load
[65] | Day-ahead_1 h | Probabilistic methods | The novel work focused on determining the reserves based on forecasting net loads. The work demonstrated that the method can decrease the amount of reserves bought for the system

In terms of very short-term (real-time or near-real-time) probabilistic load forecasting, reference [53] is one of the early works proposing hybrid Kalman filters to probabilistically forecast demand at a 5 min temporal resolution. The authors of [58] performed half-hour-resolution probabilistic forecasting of electricity consumption using QR integrated with gradient boosting. In [55], the authors used Gaussian processes to develop a probabilistic forecasting model for residential load at half-hour resolution. In [54], the authors performed half-hour-resolution forecasts of residential loads utilizing a lognormal process. In a recent study [62], the Markov-chain mixture distribution model (MCM) was employed for very short-term load forecasting of residential households in Australia. The model forecasts on a half-hour-ahead basis. The authors then demonstrated the high computational speed as well as the competitive performance of their proposed model.
Regarding interval forecasting on a day-ahead basis, the probabilistic forecasting NWP models can be adopted. For instance, the forecasting model can be constructed according to the weather ensemble prediction taking into account the consumption under different weather scenarios [63] or be built according to the quantiles of forecasting errors of the historical data [64]. In addition, a partially linear additive quantile regression model was proposed by [65] to develop a probabilistic forecast of day-ahead hourly electricity loads with a focus on the demand of peak hours in South Africa. As an example of data-driven models, the long short-term memory (LSTM) model was used to forecast the quantiles of electricity loads aiming to minimize the quantile pinball loss function [66].
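The quantile pinball loss mentioned above can be sketched as follows. This is a minimal illustration with hypothetical load values and a hypothetical constant 0.9-quantile forecast; it shows only the loss itself, not the LSTM model of the cited work.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss of forecasts y_pred for quantile level tau."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-forecasts (y > q) are weighted by tau, over-forecasts by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# Hypothetical hourly loads (MW) and a candidate 0.9-quantile forecast.
loads = [10.0, 12.0, 11.0, 15.0]
q90 = [12.0, 12.0, 12.0, 12.0]
loss = pinball_loss(loads, q90, tau=0.9)
```

Because the loss penalizes under-forecasts more heavily than over-forecasts when tau is close to 1, minimizing it pushes the forecast towards the upper quantile of the load distribution.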
Finally, ref. [67] assessed the benefits of using probabilistic methods to estimate the required day-ahead reserves. In this regard, the authors developed two probabilistic methods to forecast the system net loads. The results of probabilistic forecasts are then utilized to quantify reserves that are required to compensate for the intermittency and uncertainties of renewable generation.

Electricity Price Probabilistic Forecasting
In electricity markets, electricity prices are affected by the total system's supply and demand. However, each participant playing in these markets needs to schedule their resources according to the predicted values of the market prices, in short-term horizons. Regarding future power systems, renewable-based generation will be the main source of electricity. Their intermittent nature and real-time fluctuations increase the electricity price fluctuations [68]. Accordingly, electricity price forecasting models need to be improved to actively follow the prices' fluctuations. In addition, it can be argued that probabilistic forecasting models are attracting increasing attention since they are able to consider various uncertainties and possibilities [10].
Regarding studies that proposed parametric models, the authors of [69] proposed a first-order vector autoregressive (VAR) model considering exogenous effects and using a skew t distribution in a Bayesian framework. Markov chain Monte Carlo was then applied to the model for uncertainty analysis. Reference [70] adopted the GAMLSS method and estimated the PIs as time-varying quantiles of the acquired density forecasts. In [71], the authors developed generalized autoregressive conditional heteroskedasticity (GARCH)-based time-varying models to estimate the density function of the variable. Ref. [72] introduced a semi-parametric model combined with time-adaptive quantile regression [34] in order to predict day-ahead market price densities. The proposed model was then compared to four well-known GARCH models over a three-year time span. The authors showed that their proposed model is more reliable in terms of generating quantile estimates.
Recently, non-parametric data-driven probabilistic forecasting models have been employed more often due to their flexibility. For instance, ref. [73] introduced a deep neural network-based method for the probabilistic prediction of electricity prices. In this model, the authors first constructed the price distributions from historical data. Afterward, they employed a deep convolutional neural network (DCNN) to extract high-level features. The obtained high-level features were sent to label distribution learning forests (LDLF) in order to construct probabilistic price forecasts. In another work, ref. [74] developed a two-step model to probabilistically forecast German-Austrian day-ahead prices. In the first step, the authors estimated the mean of the correlated price time series using ordinary least squares and the elastic net method. In the second step, they estimated the residuals using the maximum likelihood method.

Uncertainty Modeling
Uncertainty modeling itself needs comprehensive study. In general, uncertainty modeling aims to capture the dynamics of the uncertain input data and generate scenarios based on their probability distributions [75]. The results are then used as the input of stochastic programming. Figure 5 overviews some uncertainty modelling techniques utilized for stochastic programming. In addition, Table 5 states some selected works using these techniques.
Appl. Sci. 2022, 12, x FOR PEER REVIEW

Figure 5. Uncertainty modelling techniques utilized in a smart grid environment.

Table 5. An overview of selected papers using scenario generation techniques in a smart grid environment.

Ref. | Uncertain Parameters | Uncertainty Generation Technique | General Objective of the Work
[76] | Wind generation, generators' reliability | Sequential MCS | Minimizing the total energy generation costs
[77] | Wind generation, PV generation, battery storage charging/discharging output, biomass combined with heat generation, and thermal energy storage output | Sequential MCS | Minimizing the total energy costs + minimizing economic risks
[78] | Renewable generation, electricity demand, household hot water, and space heating and cooling parameters | Pseudo-sequential MCS | Minimizing energy generation environmental impacts + Minimizing energy costs
[79] | Renewable generations, electricity demand, and water inflow | Pseudo-sequential MCS | Maximizing financial profits + Minimizing economic risks
[80] | risks | Non sequential | Optimizing net present value and analyzing geothermal energy life-cycle for power and transportation sectors
[81] | Renewable generations | GAN | Generate samples for renewable generations based on historical data

Monte Carlo simulation (MCS) is one of the most popular methods for scenario generation purposes. The PDFs of the random variables, forecasting errors of the data, and market variability, in general, are employed by the Monte Carlo simulation method in order to learn the uncertain data and generate their associated scenarios, accordingly [82]. Figure 6 illustrates the steps of achieving output from uncertain inputs, using the Monte Carlo simulation method.

The advantages of the Monte Carlo simulation method can be described as follows [83]:

• The method is able to sample from random processes and supports all of the distribution functions.
• A transfer function is not required.
• It does not need a mathematical formulation since it can model a problem in the form of a black box system and can obtain output considering samples of inputs.
• The method is relatively easy to implement.
• It is able to model both non-differentiable and non-convex problems.
Recently, Monte Carlo simulation algorithms have evolved and improved. For example, sequential MCS is able to model uncertainties of inputs in chronological order. With the help of the sequential method, uncertainties of time-series inputs (such as the variable generation of renewable energy resources and electricity demand) are represented more faithfully [75]. Pseudo-sequential MCS is another extension of the MCS method, which converges faster than the sequential version. The pseudo-sequential method samples states through its non-sequential capability and uses chronological simulation for the failed states [75]. Finally, the non-sequential MCS method is another member of the Monte Carlo family, which cannot model uncertainties chronologically and requires high computational costs [75].
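The difference between chronological and non-chronological sampling can be sketched with a small example. It assumes a Gaussian marginal for the uncertain input and, for the sequential case, a first-order autoregressive (AR(1)) process with a hypothetical persistence parameter phi; both are illustrative assumptions rather than the models used in the cited works.

```python
import random
import statistics


def non_sequential_scenario(rng, horizon, mean, std):
    """Draw each time step independently from the marginal distribution."""
    return [rng.gauss(mean, std) for _ in range(horizon)]


def sequential_scenario(rng, horizon, mean, std, phi):
    """Draw a chronological path from an AR(1) process with the same marginal."""
    # Scaling the innovation keeps the stationary standard deviation equal to std.
    innovation_std = std * (1.0 - phi ** 2) ** 0.5
    value = rng.gauss(mean, std)
    path = [value]
    for _ in range(horizon - 1):
        value = mean + phi * (value - mean) + rng.gauss(0.0, innovation_std)
        path.append(value)
    return path


def lag1_autocorrelation(path):
    """Sample correlation between consecutive time steps."""
    m = statistics.fmean(path)
    num = sum((a - m) * (b - m) for a, b in zip(path[:-1], path[1:]))
    den = sum((a - m) ** 2 for a in path)
    return num / den


rng = random.Random(42)
independent = non_sequential_scenario(rng, 5000, mean=50.0, std=10.0)
chronological = sequential_scenario(rng, 5000, mean=50.0, std=10.0, phi=0.9)
rho_independent = lag1_autocorrelation(independent)
rho_chronological = lag1_autocorrelation(chronological)
```

Both paths share the same marginal distribution, but only the chronological path retains the dependence between consecutive time steps, which matters for time-coupled resources such as storage.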
Recently, a model-free and data-driven method, called the "Generative Adversarial Network (GAN)", has attracted more attention for scenario generation purposes. The model employs artificial neural networks (ANNs) and aims to synthesize some understanding from the training of real data. The notable advantage of GAN scenario generation is that this model does not need distribution functions of uncertain variables [81].

Decision Making under Uncertainties
The future power system is heading towards being smart and decentralized. In a new smart grid system, there will be a number of agents and stakeholders that face uncertainties in their decision-making problems. Here are some examples:

• A generation company that has renewable resources needs to submit offers to energy markets before knowing the resources' exact generation and market prices.
• Every management system in smart grids (such as microgrid energy management systems and energy community management systems) needs to deal with its intermittent renewable energy resources' output as well as the uncertain resources' behavior (such as EV charging behavior) to come up with the optimal scheduling and management of resources.
• Strategic agents try to deal with uncertain market prices and their competitors' strategies beforehand when they construct their optimal bidding strategies.
• Retailers should buy electricity based on their customers' uncertain demand.
• Balancing responsible parties need to schedule their flexible energy resources, such as their energy storage systems, based on the uncertain generation so as to maintain the balance between their generation and their demand.
• Transmission system operators (TSOs) and distribution system operators (DSOs) must decide on the amount of reserves and flexibility as well as the methods to operate their networks and keep the security of supply and reliability of the system within the specified limits, in spite of different uncertainties and the intermittency of renewables.
Hence, the lack of perfect information affects optimal decision making. In this regard, stochastic programming, robust programming, and chance-constrained models offer ways to solve optimization problems with uncertain input data.

Stochastic Programming
A stochastic optimization problem models uncertainties of input data by weighting the decision-making solutions. The weights are selected based on the probabilities of occurrence, considering that each set of input data leads to a single solution. In this way, one captures the effects of uncertain input data on the decision-making solution [84]. A simple stochastic program can be formulated as [6]: where x is the decision variable and ω denotes the scenario. As the formulation states, uncertainties should be modeled in terms of different scenarios. The most common and simple techniques for generating scenarios use the inverse of the cumulative distribution function (CDF) or the PDFs of the uncertain parameters. Hence, the probabilistic forecasts of the inputs are necessary to develop stochastic programming. However, it is easier if the random input data have a known parametric distribution [6]. This means that parametric probabilistic forecasts are favored in this sense. Here, if one knows the probability distributions of the inputs, one can obtain the probability distributions of the output data.
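A minimal sketch of this idea is given below. It assumes a hypothetical normally distributed demand forecast, equiprobable scenarios drawn from the inverse CDF, and a toy cost with hypothetical day-ahead and real-time prices (c_da and c_rt); the decision is found by a plain grid search rather than by an optimization solver.

```python
from statistics import NormalDist


def generate_scenarios(dist, n):
    """Equiprobable scenarios from the inverse CDF at mid-point probability levels."""
    probs = [(k + 0.5) / n for k in range(n)]
    return [(dist.inv_cdf(p), 1.0 / n) for p in probs]


def expected_cost(x, scenarios, c_da, c_rt):
    """Probability-weighted cost: day-ahead purchase plus real-time shortfall."""
    return sum(w * (c_da * x + c_rt * max(omega - x, 0.0)) for omega, w in scenarios)


# Hypothetical parametric forecast of demand (MW).
demand = NormalDist(mu=100.0, sigma=15.0)
scenarios = generate_scenarios(demand, n=200)

# Candidate day-ahead purchases from 80 to 130 MW in 0.5 MW steps.
candidates = [80.0 + 0.5 * i for i in range(101)]
best_x = min(candidates, key=lambda x: expected_cost(x, scenarios, c_da=30.0, c_rt=90.0))
```

With these assumed prices, the selected purchase lies above the mean demand because a real-time shortfall costs more than buying day-ahead; a known parametric distribution makes the inverse-CDF step straightforward.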
In power systems, it has been proposed that an independent system operator or a generation company conduct stochastic unit commitment by using stochastic programming [85]. In unit commitment applications, stochastic programming is divided into two-stage and multi-stage problems. Two-stage models consider both day-ahead (here-and-now) and real-time (wait-and-see) commitment decisions. In the day-ahead stage, the conventional generators are dispatched and their commitment decisions are determined, while in real time, uncertain renewable resources and flexible energy resources are dispatched. Regarding the second (real-time) stage, to develop the two-stage problem, one needs the PDFs of renewable generation as well as those of flexible energy resources to build a large number of relevant scenarios for the PDFs of the outputs. Table 6 summarizes some of the most recent studies that considered stochastic programming to find the optimal commitment decisions for their resources.

Table 6. An overview of recent papers using stochastic programming for decision making in a smart grid environment.

Ref. | Uncertain Parameters | Methods to Capture Uncertainties | Objective
[86] | Prices (day-ahead market and balancing market), renewable generations, loads, driving requirement and the availability of electric vehicles (EV) | Two-stage stochastic programming (day-ahead and real-time scheduling) | Proposing a system for microgrid support by maximizing the expected profit of a microgrid aggregator
[87] | EVs' arrival and departure time, as well as EVs' daily traveled miles and types, solar irradiation and wind speed, loads | Two-stage stochastic programming | Maximizing the retailer's profit (first stage: fuel cell scheduling; second stage: distributed generation scheduling)
[88] | Electricity demand, wind speed | Two-stage stochastic programming | Minimizing day-ahead dispatch costs of the wind-thermal-hydropower-pumped storage system along with the system's expected balancing costs
[89] | Renewable generations and loads | Two-stage stochastic programming | Minimizing the costs of reserving flexibility services in day-ahead forecasting and their real-time activation
[90] | Wind generation, demand and market prices | Two-stage stochastic programming | Maximizing microgrid's profits participating in day-ahead and real-time markets taking into account the microgrid's reconfiguration
[91] | Wind generation, demand and market prices | Simple scenario-based stochastic programming | Obtaining coordinated network expansion planning by minimizing the operation cost of generation + Minimizing the annual investment cost of expanding the transmission networks + Minimizing the renewable resources' annual investment costs + Minimizing the annual investment and operation costs of energy storage systems + Maximizing the flexibility index of the system
[92] | Renewable generations | Simple scenario-based stochastic programming | Procuring ancillary services from flexible distributed energy resources in a day-ahead operational planning by minimizing network's costs
[93] | Electricity demand, renewable generations, market prices | Two-stage stochastic programming | Supplying the aggregated demand through their participation in the day-ahead market and maximizing their total expected profits
[94] | Electricity demand and PV generation | Multi-stage stochastic programming | Operating an energy community with PV-storage system by minimizing the electricity purchased from the grid
[95] | Wind generation | Two-stage stochastic programming | Economic dispatch for a hybrid distribution system based on active-reactive power coordination by minimizing the cost of gas-fired operation, power purchasing from the upstream grid, penalty costs related to substations' power fluctuations, network losses, and the costs of wind curtailment and load shedding
[96] | Renewable generations, electric vehicles, loads, and market prices | Simple scenario-based stochastic programming | Optimal energy management of microgrids by minimizing operational costs of distributed energy resources, the costs of purchasing power from the upstream network, the costs incurred from the energy not served, and those related to EV batteries' degradation costs

Once a decision is made for the day-ahead stage, decomposition methods are employed to treat the real-time stage scenarios independently. This approach leads to considerably fewer scenarios compared to the non-decomposed method [75].
Multi-stage stochastic programming models construct a scenario tree and accordingly try to capture uncertainties in a dynamical way. In multi-stage models, the uncertainties are treated chronologically, meaning that the uncertainties at time t affect those at t + 1, . . . , t + m. However, the method comes at huge computational costs.

Robust Programming
The robust optimization approach aims to obtain a problem solution that is always feasible under different realizations of the uncertain inputs. In other words, the robust approach seeks optimal solutions in the worst-case realizations or worst-case scenarios [97]. Robust optimization deals with the uncertainty sets, and in the first stage aims to optimize the problem considering the scenarios under which the worst solutions are obtained. A simple robust optimization is formulated as follows: In (15), all possible sets of ω (scenarios) are included in the uncertainty set, i.e., W. In this way, the uncertainty sets of the uncertain parameters need to be defined adequately. The uncertainty sets are also required to cover the uncertain phenomena that can happen for the uncertain parameters. Naive and inappropriate definitions of the uncertainty sets may result in either too conservative or very risky solutions. This means that the appropriate uncertainty sets should comprise risky and conservative decisions [7]. There are other extensions of the robust optimization such as robust stochastic optimization, adaptive robust optimization and distributionally robust optimization. Although they all have the same concept, they try to keep the balance between the risky and conservative solutions in different ways and under various assumptions on uncertain parameters.
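The effect of the uncertainty set on the solution can be sketched with a toy worst-case (min-max) problem. The cost function, its prices, and the two finite demand sets below are hypothetical and serve only to illustrate how a wider set leads to a more conservative decision.

```python
def cost(x, omega, c_da=30.0, c_rt=90.0):
    """Day-ahead purchase x plus real-time shortfall cost for realized demand omega."""
    return c_da * x + c_rt * max(omega - x, 0.0)


def robust_decision(candidates, uncertainty_set):
    """Pick the candidate whose worst-case cost over the uncertainty set is smallest."""
    return min(candidates, key=lambda x: max(cost(x, w) for w in uncertainty_set))


# Candidate day-ahead purchases from 80 to 130 MW in 0.5 MW steps.
candidates = [80.0 + 0.5 * i for i in range(101)]

# Two hypothetical uncertainty sets for demand (MW).
wide_set = [70.0 + 1.0 * i for i in range(61)]    # 70 .. 130
narrow_set = [90.0 + 1.0 * i for i in range(21)]  # 90 .. 110

x_wide = robust_decision(candidates, wide_set)
x_narrow = robust_decision(candidates, narrow_set)
```

In this sketch, the robust decision always covers the largest demand in the set, so an overly wide set yields an expensive, conservative purchase, while an overly narrow set leaves the decision exposed to realizations outside it.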
The applications of robust optimizations in smart grids and power systems can fall into one of these categories:

• Robust network and generation planning and expansion (e.g., [98,99]

A number of studies and papers utilized robust optimization or its extensions in order to solve their decision-making problems. Table 7 summarizes these papers.

Table 7. An overview of recent papers using robust programming for decision making in a smart grid environment.

Ref. | Uncertain Parameters | Methods to Capture Uncertainties | Objective
[108] | Inflow and PV generation | Robust optimization | Maximizing the minimum power generated within the operation interval + Maximizing the operational interval + Maximizing the feasible solutions obtained by the operation interval
[109] | Electricity demand, generation capacity, as well as uncertain economic, environmental, and social parameters for customers | Robust fuzzy multi-objective optimization programming | Maximizing the total profits of the whole system + Maximizing the social benefits of the system consumers + Minimizing the total negative environmental impacts
[110] | Energy price | Robust optimization | Minimizing the net costs of a smart home
[104] | Wind and PV generations | Adaptive robust optimization | Minimizing the operating costs of an isolated microgrid
[111] | Wind speed, demand, and solar irradiance | Conditional value at risk (CVaR) combined with robust optimization | Minimizing the costs of an energy hub participating in energy and reserve markets + Minimizing the emissions of pollution
[105] | Wind and photovoltaic generations | Two-stage adjustable robust optimization | Minimizing the costs of multi-energy system that supplies both electricity and heat loads
[102] | Load and energy price | Hybrid stochastic/robust optimization | Minimizing planning, operation and resilience costs of the distribution networks considering earthquake and flood situations
[112] | Energy price and PV generations | Hybrid stochastic/robust optimization | Maximizing the profits of a household customer
[113] | Wind and PV generations, loads, and market-clearing prices | P-robust (a combination of robust and stochastic programming) | Minimizing the operating costs of diesel engine, micro turbine, procurement costs, costs of pollutant treatment, and costs of reimbursing incentive-based demand response programs
[114] | Availability of microgrid equipment, active and reactive loads, parameters of EVs, energy price, wind and PV generations | Hybrid stochastic/robust | Minimizing the microgrid's costs including the cost of buying energy, the operation cost of non-renewable energy sources, the reliability costs in terms of non-supplied loads
[115] | Renewable distributed generations | Robust model combined with prediction control | Maximizing the amount of load restoration that is controlled by the output of the power units and remote-controlled switches
[116] | Electricity demand and facility installation costs | Robust optimization | Minimizing the installation costs of power plants, high-voltage/low-voltage substations, and feeders, feeders' power transmission costs + Minimizing the storage power cost, power losses' costs in feeders, feeder failures' costs
[98] | PV and wind generations | Robust optimization | Minimizing the annual costs of the regional distribution networks
[103] | Wind generation, outages, La Niña and El Niño events (a long-term warming happening for the central and eastern Pacific and vice versa) | Robust optimization | Minimizing investments' and operations' costs of the system
[99] | Unbalanced power percentage | Robust optimization | Minimizing the annual investment costs of transmission network lines as well as the costs related to battery superconducting magnetic hybrid energy storage system under the maximum load-shedding conditions + Reducing the insufficient supply if N-k faults happen
[117] | Random N-K contingency, wind and PV generation | Robust optimization | Minimizing the investment costs of building candidate lines, the generation costs of conventional generators, the costs related to scheduling downward and upward spinning-reserves, costs of renewable generation curtailment and load-shedding
[106] | Market participants' offers and bids as well as real-time market prices | Robust optimization | Maximizing profits of a virtual bidder (optimal bidding strategy)
[107] | Wind generation and electricity prices | Robust MPC-based optimization | Maximizing the profits and minimizing the operating costs of a wind-storage system (optimal bidding strategy)
[118] | Renewable generations and electric vehicle charging behavior | Robust optimization | Optimal location and sizing of renewable distributed generation and the charging stations based on maximizing the station's total payoff
[119] | Loads and renewable generations | Robust optimization | Designing generation resources for a microgrid to meet its demand by minimizing the total generation costs of the resources
[100] | Loads, wind and PV generations | Robust optimization | Positioning and sizing of the energy storage system by minimizing the operation costs of the flexible energy resources as well as those of the network
[101] | PV generation | Robust optimization | Planning of distributed battery energy storage from a DSO viewpoint by minimizing the batteries' degradation and operation costs

Chance-Constrained Programming
Chance-constrained optimization problems require an optimization constraint to be satisfied with at least a specified probability. The constraint can be formulated as follows:

$$\Pr\big(h(x, \omega) \le 0\big) \ge \eta \qquad (16)$$

where Pr refers to probability and η indicates the confidence level [120]. In this regard, η should be selected between 0 and 1. According to (16), the constraint h(x, ω) ≤ 0 needs to be satisfied up to the η level. In other words, operators or designers that use the chance-constrained method ensure that h(x, ω) ≤ 0 will be satisfied in (η × 100)% of scenarios.
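A scenario-based reading of this constraint can be sketched as follows. The example assumes a hypothetical reserve-sizing problem with h(x, ω) = ω − x, where ω is a net-load imbalance drawn from an assumed Gaussian distribution and x is the procured reserve; the smallest feasible reserve is then the empirical η-quantile of the samples.

```python
import math
import random


def satisfaction_probability(x, samples):
    """Empirical Pr(h(x, omega) <= 0) with h(x, omega) = omega - x."""
    return sum(1 for omega in samples if omega - x <= 0.0) / len(samples)


def smallest_feasible_reserve(samples, eta):
    """Smallest sampled reserve level that meets the chance constraint empirically."""
    ordered = sorted(samples)
    index = math.ceil(eta * len(ordered)) - 1
    return ordered[index]


rng = random.Random(7)
# Hypothetical net-load imbalance scenarios (MW), e.g., drawn from a forecast error model.
imbalances = [rng.gauss(0.0, 20.0) for _ in range(10000)]

eta = 0.95
reserve = smallest_feasible_reserve(imbalances, eta)
coverage = satisfaction_probability(reserve, imbalances)
```

Raising η towards 1 moves the reserve further into the tail of the imbalance distribution, which is how the confidence level trades cost against reliability in this sketch.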
Chance-constrained programming has a wide range of applications in smart grids. Here are some examples:

1.
Operations of renewable-based systems to guarantee a certain level of reliability/ security/flexibility 2.
Planning of distribution/transmission networks in a way to guarantee a certain level of reliability/security/flexibility 3.
Determining system reserves to guarantee a certain level of reliability/security/flexibility Table 8 summarizes some recent works that adopt chance-constrained programming methods to capture uncertainties. Table 8. An overview of recent papers using chance-constrained programming for decision making in a smart grid environment.

Ref.
Uncertain Parameters Methods to Capture Uncertainties Objective [121] Electricity prices and PV generation Chance-constrained programming Minimizing the costs of energy trading between the power grid and microgrids + Minimizing the fuel costs of fuel-based power generation and boiler units [122] PV generation Chance-constrained programming The study aims to integrate renewable energy as much as possible by minimizing the hybrid system's power curtailment + Maximizing the renewable power generation injected into the system [123] Loads, market prices, renewable generation

Chance-constrained programming | Minimizing the total operation costs of a combined, power-based, cooling, and heating microgrid: Minimizing the costs of power and gas purchased + Minimizing the operation costs of microgrid's CHP units and micro turbine + Minimizing batteries' degradation costs + Maximizing the revenues obtained from selling electricity and heat to the upstream grids
[124] | Loads, wind generation | Chance-constrained programming | Minimizing the costs of buying power from thermal units + Minimizing the costs of buying spinning reserves from generators as well as demand-response resources
[125] | Renewable generations | Chance-constrained programming | Microgrid management by minimizing the operation costs of its units + Minimizing the costs of buying power from the upstream grid + Maximizing the revenue obtained from selling electricity to the upstream grid
[126] | Operational modes of the microgrids | Chance-constrained stochastic conic programming | Solving multi-site microgrids' investment problem and microgrid dual-mode network operations by minimizing microgrid operation and maintenance costs, microgrid electricity transaction costs, its network loss costs, and microgrid load curtailment costs
[128] | Renewable generation and loads | Chance-constrained programming | Finding potential self-sufficient sub-networks within the existing electrical distribution grid by maximizing average load served in each sub-network
[129] | PV generation | Chance-constrained programming | Designing solar-based microgrid and solving related dispatch problem by minimizing the capital costs of PV panels and the capacity costs and installation costs of energy storage system, the expected costs of multi-year operation of the microgrid which include load shedding penalty costs and wind micro-turbine generation costs
[130] | Renewable generation | Chance-constrained programming | Minimizing the total cost of generating power and gas
[131] | Loads and system frequency | Chance-constrained programming | Optimal scheduling of grid-connected batteries providing frequency-related services by minimizing the cost of purchasing energy from the grid as well as the system costs
[132] | Renewable generation | Mixed integer second order cone chance-constrained programming | Controlled islanding strategy
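The formulations listed above share one building block: a constraint that must hold with a prescribed probability. The following minimal sketch, which is not taken from any of the reviewed papers, illustrates this for a single constraint. It sizes a reserve r so that P(forecast error <= r) >= 1 - epsilon, using the empirical quantile of sampled forecast errors. The function name, the Gaussian error model, and all numbers are illustrative assumptions.

```python
import math
import random


def reserve_for_chance_constraint(errors, epsilon):
    """Smallest sampled value r such that at least a (1 - epsilon) share
    of the sampled forecast errors is less than or equal to r."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    ordered = sorted(errors)
    # Number of samples that the reserve has to cover.
    k = math.ceil((1.0 - epsilon) * len(ordered))
    return ordered[k - 1]


# Hypothetical net-load forecast errors in MW (assumed Gaussian, std 5 MW).
rng = random.Random(0)
errors = [rng.gauss(0.0, 5.0) for _ in range(10_000)]

epsilon = 0.05  # accepted violation probability (assumption)
reserve = reserve_for_chance_constraint(errors, epsilon)

# Empirical probability that the reserve covers the forecast error.
coverage = sum(e <= reserve for e in errors) / len(errors)
print(f"reserve = {reserve:.2f} MW, empirical coverage = {coverage:.3f}")
```

In a full chance-constrained program, such a quantile-based bound would replace the probabilistic constraint inside the optimization problem; here it is evaluated in isolation only to show the idea.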

Further Probabilistic Forecasting and Applications
As can be seen in Tables 6-8, which review studies using stochastic, robust, and chance-constrained programming, smart grids need more probabilistic forecasts [124] for uncertain parameters other than loads, renewable generation, and prices. EV charging-related behaviors, battery state of charge (SOC), dynamic line rating (DLR), and network states are some examples. A few papers have developed probabilistic forecasts of such parameters, which are important for decision-making in smart grids. This section reviews the papers that employ probabilistic forecasting models for uncertain parameters other than the popular ones introduced earlier.

Probabilistic Forecast of BESS SOC
The authors of [133] developed a probabilistic forecasting model that analyzes the uncertainties of the battery energy storage system's state of charge (BESS SOC) when providing primary frequency control. The results were then used as inputs to a predictive optimization that schedules the BESS to provide multiple flexibility services. To develop the probabilistic forecasts, the authors applied a multi-attention recurrent neural network (MARNN) to extract the most important contextual information for time-series forecasting. They then proposed a robust forecast that combines mixture density networks (MDNs) with Monte Carlo dropout (MCD). Finally, the proposed model was tested under different regulatory frameworks for primary frequency control services, using frequency datasets from real-world power grids.
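To illustrate how an MDN output and Monte Carlo dropout can be combined into one predictive distribution, a minimal sketch is given below. It is not the model of [133]: the attention-based recurrent network is omitted, the tiny one-hidden-layer network is untrained (random weights), and all names and sizes are assumptions made for illustration. Each stochastic forward pass keeps dropout active and returns the parameters of a two-component Gaussian mixture. One sample is drawn per pass, and quantiles are read from the pooled samples.

```python
import math
import random


def make_weights(n_hidden=8, seed=1):
    """Random, untrained weights (a stand-in for a trained network)."""
    rng = random.Random(seed)
    w_hidden = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_hidden)]
    # Six outputs: 2 mixture logits, 2 means, 2 log standard deviations.
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(6)]
    return w_hidden, w_out


def mdn_forward(x, weights, rng, drop_prob=0.2):
    """One stochastic pass: dropout on hidden units, then the mixture head."""
    w_hidden, w_out = weights
    hidden = []
    for w, b in w_hidden:
        keep = rng.random() >= drop_prob  # dropout stays active at inference
        hidden.append(math.tanh(w * x + b) / (1.0 - drop_prob) if keep else 0.0)
    raw = [sum(wj * hj for wj, hj in zip(row, hidden)) for row in w_out]
    top = max(raw[0], raw[1])  # stabilized softmax over the two logits
    exps = [math.exp(raw[0] - top), math.exp(raw[1] - top)]
    pis = [e / sum(exps) for e in exps]
    mus = raw[2:4]
    sigmas = [math.exp(v) for v in raw[4:6]]  # exp keeps std devs positive
    return pis, mus, sigmas


def mc_dropout_samples(x, weights, n_passes=500, seed=0):
    """Draw one sample from the mixture of every dropout pass and pool them."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_passes):
        pis, mus, sigmas = mdn_forward(x, weights, rng)
        k = 0 if rng.random() < pis[0] else 1
        samples.append(rng.gauss(mus[k], sigmas[k]))
    return sorted(samples)


def empirical_quantile(sorted_samples, tau):
    k = max(1, math.ceil(tau * len(sorted_samples)))
    return sorted_samples[k - 1]


weights = make_weights()
samples = mc_dropout_samples(0.3, weights)  # 0.3 is an arbitrary input feature
q05, q50, q95 = (empirical_quantile(samples, t) for t in (0.05, 0.5, 0.95))
print(f"5%: {q05:.3f}, median: {q50:.3f}, 95%: {q95:.3f}")
```

The spread of the pooled samples reflects both the mixture variance of each pass and the variation between dropout passes, which is the reason for combining the two techniques in this sketch.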

Probabilistic Forecast of Time and Flexibility of EV Charging
In [134], the authors employed quantile forecasting to probabilistically predict EVs' parking duration as well as their upcoming trip distance, using the forecast framework introduced in [135]. German travel-log datasets were adopted to develop the model. The authors first determined the requirements that serve as inputs to smart charging systems. In the second stage, they extracted features from the travel logs. The paper then compared the charging station's current information with that of historical parking events; if EV users grant permission, the travel data are extracted from smartphone applications. Next, the authors proposed a forecasting model based on cross-validation performance. Finally, the results demonstrated that a charging station operator using the proposed forecasting model can profit by selling flexibility services. The model was also shown to resolve the congestion issue within the station.
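Quantile forecasts such as these are commonly assessed with the pinball (quantile) loss, which penalizes under- and over-forecasting asymmetrically according to the quantile level. The sketch below shows the loss and uses it to compare a few candidate forecasts of the 90% quantile of parking duration. The data and candidate values are hypothetical, and the source does not state that [134] used exactly this procedure.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball loss of a single forecast y_pred for quantile level tau."""
    diff = y_true - y_pred
    # Under-forecasts are weighted by tau, over-forecasts by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_pinball_loss(observations, forecast, tau):
    """Average pinball loss of one constant forecast over all observations."""
    return sum(pinball_loss(y, forecast, tau) for y in observations) / len(observations)


# Hypothetical parking durations in hours (not data from [134]).
durations = [1.0, 2.0, 2.5, 3.0, 4.0, 6.0, 8.0, 9.0, 10.0, 12.0]
tau = 0.9

# Candidate forecasts for the 90% quantile of parking duration.
candidates = [2.0, 6.0, 10.0]
losses = {c: mean_pinball_loss(durations, c, tau) for c in candidates}
best = min(losses, key=losses.get)
print(f"best candidate for tau={tau}: {best} h")
```

A forecast near the upper end of the observed durations scores best here because, at a quantile level of 0.9, under-forecasting is penalized nine times as heavily as over-forecasting.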

Probabilistic Forecast of Other Uncertain Parameters
In [136], the authors developed a probabilistic forecasting model for the current rating of transmission lines using QRF in order to solve the dynamic line rating problem, with a focus on the reliability of the distribution network's lower part. In the second stage, the results were employed to conduct a cost-benefit analysis using a bi-level stochastic problem. The problem considered two cost aspects: (1) the reduced generation costs due to the higher power transfer capacity and (2) the increased use of reserves resulting from forecast errors.
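The trade-off between these two cost aspects can be illustrated with a strongly simplified, single-level sketch. It is not the bi-level stochastic problem of [136]. A line rating is chosen as a quantile of sampled ampacity: a higher rating yields a generation-cost saving, whereas any shortfall of the actual ampacity below the chosen rating must be covered by reserves. The Gaussian ampacity samples, the unit costs, and the function names are assumptions made for illustration.

```python
import math
import random


def empirical_quantile(samples, tau):
    ordered = sorted(samples)
    k = max(1, math.ceil(tau * len(ordered)))
    return ordered[k - 1]


def expected_net_benefit(rating, ampacity_samples, gen_saving, reserve_cost):
    """Saving from the chosen rating minus the expected cost of reserves
    needed when the actual ampacity falls below that rating."""
    shortfall = sum(max(rating - a, 0.0) for a in ampacity_samples)
    shortfall /= len(ampacity_samples)
    return gen_saving * rating - reserve_cost * shortfall


# Hypothetical ampacity scenarios in amperes (assumed Gaussian).
rng = random.Random(42)
ampacity = [rng.gauss(800.0, 60.0) for _ in range(1000)]

gen_saving = 1.0     # benefit per ampere of rating (assumption)
reserve_cost = 10.0  # cost per ampere of shortfall (assumption)

taus = [0.01, 0.05, 0.10, 0.20, 0.50]
benefit = {
    t: expected_net_benefit(empirical_quantile(ampacity, t), ampacity,
                            gen_saving, reserve_cost)
    for t in taus
}
best_tau = max(benefit, key=benefit.get)
print(f"quantile level with the highest expected net benefit: {best_tau}")
```

In this simplified setting, the best quantile level equals the ratio gen_saving / reserve_cost (here 0.1), so the more expensive the reserves are relative to the saving, the lower the quantile that should be used as the rating.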
As another application, ref. [137] investigated the probabilistic forecasting of low-voltage network states (voltages as well as active and reactive power) for the effective operation of distribution networks. It tested two quantile forecasting methods under different levels of distributed renewable generation injection. The probabilistic forecasting results were then integrated into an optimization problem to avoid overvoltages within the networks.

Conclusions and Future Direction
This paper discussed a roadmap towards decision-making under uncertainty in a smart grid environment. The roadmap started by introducing different types of probabilistic forecasting and continued by discussing the uncertain variables for which the literature uses probabilistic forecasting. The literature has mainly focused on probabilistic forecasting models for renewable generation (both wind and PV generation), electricity loads, and electricity prices. Afterward, the paper described how probabilistic forecasting models were applied in the literature. For this purpose, it reviewed papers adopting scenario generation techniques in smart grids. Two important scenario generation methods, i.e., Monte Carlo simulation and generative adversarial networks, were introduced, and the paper explained how they relate to decision-making. In addition, a limited number of papers applying these scenario generation methods in smart grids were reviewed.
In the next step, decision-making under uncertainty was discussed. Energy management systems of smart grids need to solve optimization problems in order to operate and plan their resources, and uncertainties should be taken into account in these problems. Accordingly, stochastic, robust, and chance-constrained programming, which consider uncertainties in the optimization problem, were briefly introduced. Their use in smart grid decision-making was discussed by reviewing the most recent papers.
Furthermore, further applications of probabilistic forecasting were reviewed. Although a wide range of uncertain parameters exists in smart grids, probabilistic forecasting models have been developed for only a limited number of them. For example, very few papers have proposed probabilistic forecasting models for the uncertain parameters of BESSs and EVs (such as their SOC, charging power, etc.). However, decision-makers need the distributions of these variables along with those of loads and renewable output. Thus, future studies are needed to obtain the distributions of EV and BESS parameters.
Finally, it should be mentioned that future power systems are heading toward hosting a high share of renewable generation. However, at the moment, power grids are not flexible enough to tolerate this situation and need more flexibility from different flexible energy resources and flexibility solutions (e.g., related to active network management). As a result, operators need to know:

• What are the available flexible energy resources in the current system?
• What are the potential flexible energy resources that should be activated?
• How much flexibility is needed for the future power system?

To answer these questions, the system operators need to forecast:

• The flexibility required to deal with a high share of renewables in the future, such as the reserves
• The available flexibility (related to active power P and reactive power Q) of the system at different levels of the system (flexibility at TSO, DSO, and customer levels)
• Potential congestions (voltage and/or thermal limit violations) of lines and other passive power system key components

Point forecasts of flexibility do not give operators a comprehensive insight in order to deal with the uncertainties of the future. However, with the help of probabilistic forecasts, decision-makers can assess different operational and planning decisions considering different renewable injection scenarios.

Conflicts of Interest:
The authors declare no conflict of interest.

ANFIS
link function for modeling the mean value in terms of explanatory variables
K  kernel function
l  Heaviside step function
N  total number of sample points
h  bandwidth referring to the smoothing parameter
Y_t, y_t  response variable at t
Ŷ_t, ŷ_t  point forecast at t
P  active power
Q  reactive power