Locational Marginal Price Forecasting Using SVR-Based Multi-Output Regression in Electricity Markets

Abstract: Electricity markets provide valuable data for regulators, operators, and investors. Applying machine learning methods to electricity market data could provide new insights about the market, and this information could be used for decision-making. This paper proposes a tool based on a multi-output regression method using support vector regression (SVR) for locational marginal price (LMP) forecasting. The input corresponds to the active power load of each bus, in this case obtained through Monte Carlo simulations, and the output is the LMP of each bus. The LMPs provide market signals for investors and regulators. The results showed the high performance of the proposed model, since the average prediction error on both the fitting and testing datasets was less than 1%. This provides insights into the application of machine learning methods to electricity markets given the context of uncertainty and volatility in both real-time and ahead markets.


Introduction
Forecasting LMPs in electricity markets provides valuable information for decision-making. Real-time and ahead electricity markets generate a large amount of information that can be used for decision-making purposes. This paper proposes a practical tool to forecast LMPs. The LMPs reflect the cost of supplying an additional unit of power at a given location considering the power system operation [1,2].
From an economic perspective, the LMPs represent the value of electricity at each location, and they are used to settle payments between buyers and sellers of electricity [3]. From an operational perspective, the LMPs reflect the restrictions in the transmission system according to the availability of supply resources and demand. Therefore, this value tends to fluctuate over an operating horizon [4]. LMPs have been widely adopted in many countries due to their proven incentive compatibility and cost tracking, and they can thus facilitate decision-making for different scenarios and tasks [5].
Likewise, the LMPs are the shadow prices for real power restrictions for the optimal power flow (OPF) [6]. The OPF is an optimization model to find the most economic dispatch of real power considering the network characteristics, the available generation, demand, and operational restrictions [7,8]. The OPF could be expressed as DCOPF or ACOPF [9] for a single period and multi-period evaluation [10].
Accordingly, the agents participating in electricity markets make decisions both ahead of time and in real time. The LMPs are economic signals that reflect the price of electricity, and the agents make decisions on investment, trading, buying, and selling of electricity based on this valuable information. However, because the LMPs depend on operational conditions, they change with variations in the demand patterns, generation resources, and network restrictions.
On the other hand, machine learning techniques can handle complex, nonlinear relationships, and their use in data-driven energy market applications has produced significant results in complex tasks such as forecasting. This has generated growing interest, especially for energy-related economic decision-making applications.

Literature Review
Due to the availability of data in electricity markets, considerable effort has recently been devoted to forecasting with various approaches and targets [11]. Usually, price is the forecast target [12]. A price prediction methodology using a Kalman filter approach is proposed in [13] to provide information for investment. In [14], a comparison between support vector regression (SVR) and random forest (RF) shows that the SVR approach is slightly better for price forecasting. Similarly, hybrid approaches that use several processing stages to obtain price forecasts are proposed in [15,16]. In [17], a hybrid learning method based on a neural network with particle swarm optimization is applied to load forecasting. Other forecast targets include the power generation of gas turbines [18]. In addition, deep neural networks have been used to forecast electricity prices [19,20].
LMP forecasting has recently been explored in [21], where the authors present a methodology based on processing data through various algorithms. In [22], the authors present an ensemble approach based on the components of energy, congestion, and losses to forecast LMPs using three separate modules. Some electricity markets use price zones to remunerate electricity, so, in [22], the authors propose a non-parametric model to predict price and demand simultaneously. On the other hand, clustering approaches have been used to find groups of LMPs as price zones given the relevance of LMPs in electricity markets [23,24].
In this paper, we propose a direct method to forecast LMPs from the active power demand of each bus in a power system, using a tool based on multi-output support vector regression. Because our method uses only one stage, it can be applied straightforwardly to datasets for real-time and offline applications with different purposes. We generate all required data with a stochastic approach using an ACOPF model in order to test the proposed tool. This paper is organized as follows: a brief introduction of the proposed tool development, including the background for SVR for single and multiple outputs, is presented in Section 2. Section 3 presents and discusses the forecasting results of the tool for a test power system. Section 4 draws the conclusions.

Materials and Methods
The scope of this article deals with the development of a tool based on support vector machines regression models, with special focus on a multi-output approach, all fed with data obtained from stochastic processes coming from an AC-OPF model. A detailed description of the main features of this approach is given below.

Support Vector Machine: Regression (SVR)
Support vector machines can also be used as a regression method while retaining the main features that characterize the algorithm. Unlike in the classification variant, the output can take infinitely many values (since it is a real number), so the support vectors now establish a tolerance margin, denoted by ε and shown in Figure 1, that allows the algorithm to contain the data between its margins [25]. The main idea is to minimize the error while identifying the hyperplane that maximizes the margin, with part of the error being tolerated [26] in order to achieve better generalization.
Single-output regression is the search for a mapping between an input vector $x \in \mathbb{R}^d$ (where d is the dimension of x) and an observable output $y \in \mathbb{R}$, given a set of N independent and identically distributed samples, based on statistical learning theory [27] and through a regression function f(x). For this search, the technique minimizes a structural risk function, presented in Equations (1) and (2), where w (weight vector) and b (bias constant) are the fitting parameters of the regression function f(x); φ(x) denotes a nonlinear transformation mapping the model inputs into a higher-dimensional space (needed in this case because of the nonlinearity of the ACOPF solution); $\xi_i$ and $\xi_i^{*}$ are the slack variables that measure the error committed by the regression function when approximating the i-th sample; and C determines the balance between the regularity of f(x) and the tolerance to deviations larger than ε. In practice, both ε and C are hyperparameters that must be tuned before the method is deployed, since their values are not obtained by solving the optimization problem.
The expressions in (2) are the constraints of the optimization problem. They define the support vectors and the associated margin (i.e., the dotted lines in Figure 1). Likewise, the slack variables $\xi_i$ and $\xi_i^{*}$ cannot be less than zero, since they express distances between actual and predicted values.
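Equations (1) and (2) are not reproduced in this text. As a reference point, the standard ε-insensitive SVR primal problem, written with the symbols defined above, is sketched below. It is the textbook formulation and is assumed, not confirmed, to match the form used by the authors.

```latex
\begin{align}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad
  & \frac{1}{2}\lVert w \rVert^{2}
    + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right) \tag{1} \\
\text{s.t.} \quad
  & y_i - \left( w^{\top}\varphi(x_i) + b \right) \le \varepsilon + \xi_i, \nonumber \\
  & \left( w^{\top}\varphi(x_i) + b \right) - y_i \le \varepsilon + \xi_i^{*}, \nonumber \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, N. \tag{2}
\end{align}
```

In this form, samples that fall inside the ε-tube have both slack variables equal to zero and do not contribute to the penalty term.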
Since this is an optimization problem, solving the primal problem in (1) and (2) also yields the solution of a dual problem associated with the constraints, which is of great interest for this method because the constraints involve the support vectors and the slack variables. In this way, the desired function is obtained without depending on the dimension d of the input examples x, and depends only on the support vectors [28]. The dual problem, obtained by means of the Lagrangian method and the Karush-Kuhn-Tucker (KKT) conditions, is expressed in Equations (3) and (4), where $\alpha_i$ and $\alpha_i^{*}$ are the dual variables associated with the constraints, which take values between zero and the penalty hyperparameter C, and $K(x_i, x_j)$ is a kernel function (the well-known kernel trick) that satisfies Mercer's conditions and implicitly maps the nonlinear data into a higher-dimensional feature space where a linear model can be fitted [29]. For this case, a Gaussian radial basis function (G-RBF) [30], expressed in Equation (5), was used. From the dual problem, it is possible to extract an expression for the prediction function that depends only on the support vectors and the kernel, as presented in Equation (6).
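The support vector and kernel dependence of the prediction function can be checked numerically. The sketch below assumes the textbook form f(x) = Σ_i (α_i − α_i*) K(x_i, x) + b with the RBF kernel K(x_i, x_j) = exp(−γ‖x_i − x_j‖²). The γ parameterization is the one used by scikit-learn and may differ from the paper's Equation (5). The toy data and hyperparameter values are hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

# Hypothetical toy data: a noisy nonlinear curve (not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(X).ravel() + 0.05 * rng.normal(size=80)

# Illustrative hyperparameter values (assumptions, not the paper's Table 2).
gamma = 0.5
model = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=gamma).fit(X, y)

# Rebuild f(x) from the dual solution: only the support vectors contribute.
# In scikit-learn, dual_coef_ stores the differences (alpha_i - alpha_i*).
X_new = np.linspace(-3.0, 3.0, 25).reshape(-1, 1)
K = rbf_kernel(model.support_vectors_, X_new, gamma=gamma)  # (n_SV, 25)
manual = (model.dual_coef_ @ K).ravel() + model.intercept_[0]

# Compare with the library prediction.
library = model.predict(X_new)
max_gap = float(np.max(np.abs(manual - library)))
```

If the reconstruction is right, `max_gap` is at the level of floating-point noise, and the number of rows in `model.support_vectors_` shows how many training samples actually enter the prediction.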

Multi-Output Regression with SVR
Multi-output regression (regardless of the individual machine learning technique involved) aims at fitting a mapping from an input space x with multiple variables or patterns to an equally multivariate output space y, typically by means of several individual regressors, where both variable spaces must be numerical.
Therefore, when performing multi-output regression with support vector machines, the single-output SVR definition can be extended. For this purpose, the regression output is now considered to have M dimensions, such that $y \in \mathbb{R}^M$. This modifies the objective function of the single-output SVR optimization problem, as seen in Equation (7), since the parameters w, ξ, and b increase in dimension. It is important to emphasize that the duality, nonlinearity (kernel trick), and support vector dependence of the single-output method apply in the same way to the SVR-based multi-output regressor, although the dimensions of the variables must be adjusted accordingly. A further examination reveals that this approach is equivalent to M single-output optimization problems. In other words, the solution of the regression problem can be decoupled among the different output variables.
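The decoupling among output variables can be illustrated with scikit-learn, where `MultiOutputRegressor` fits one independent copy of the base estimator per output column. The following is a minimal sketch on hypothetical synthetic data (4 inputs, 3 outputs) with illustrative hyperparameters. It compares the wrapper against M separately fitted SVRs.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Hypothetical data: 4 inputs and M = 3 outputs (not the paper's dataset).
rng = np.random.default_rng(1)
X = rng.uniform(0.5, 1.5, size=(120, 4))
Y = np.column_stack([X @ rng.normal(size=4) for _ in range(3)])
Y += 0.01 * rng.normal(size=Y.shape)

# Illustrative hyperparameters shared by every regressor (assumed values).
params = dict(kernel="rbf", C=10.0, epsilon=0.01, gamma=0.5)

# Multi-output wrapper: one SVR is fitted internally per output column.
joint = MultiOutputRegressor(SVR(**params)).fit(X, Y)

# The same M problems solved explicitly, one output at a time.
separate = [SVR(**params).fit(X, Y[:, j]) for j in range(Y.shape[1])]

Y_joint = joint.predict(X)
Y_sep = np.column_stack([m.predict(X) for m in separate])
same = bool(np.allclose(Y_joint, Y_sep))
```

With identical hyperparameters the two sets of predictions coincide, which is the practical meaning of the equivalence to M single-output problems. A consequence of this design is that correlations between outputs are not exploited during fitting.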

AC-OPF Synthetic Data
The development of the proposed model required a dataset related to different operating scenarios of any power system. This means that the solution of the optimization problem corresponding to the single-period AC optimal power flow is used to obtain the LMP values from its dual problem.
Therefore, in order to build different power system operating scenarios (i.e., a dataset), different active power load patterns have been generated using Monte Carlo simulations with uniform distribution functions bounded around the base value of each bus. It is important to highlight that buses that initially have no active power load (i.e., 0 MW) remain at zero, since a uniform distribution bounded around a zero base value does not modify this condition.
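The load sampling step can be sketched as follows. The paper performs this step in MATLAB with MATPOWER. The NumPy version below is only an illustration, and it assumes a continuous uniform scaling factor, a spread of 0.5 that mirrors the ±50% bound reported for the case study, and hypothetical base loads. The function name `sample_loads` is also hypothetical.

```python
import numpy as np

def sample_loads(base_mw, n_scenarios, spread=0.5, seed=0):
    """Draw load scenarios uniformly within +/- spread of each bus base load."""
    rng = np.random.default_rng(seed)
    base_mw = np.asarray(base_mw, dtype=float)
    # One multiplicative factor per scenario and per bus.
    factors = rng.uniform(1.0 - spread, 1.0 + spread,
                          size=(n_scenarios, base_mw.size))
    # Scaling a zero base load leaves it at exactly 0 MW.
    return factors * base_mw

# Hypothetical base loads in MW; the third bus has no load.
base_mw = np.array([51.0, 20.0, 0.0, 39.0])
loads = sample_loads(base_mw, n_scenarios=1000)
```

Each row of `loads` is one operating scenario that would then be passed to the AC-OPF solver to obtain the corresponding LMPs.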
The obtained active power load data have been incorporated into the power system in order to solve its AC-OPF and thus obtain the LMP value of each bus for each operating scenario. All operating scenarios have been solved in MATLAB in conjunction with the MATPOWER platform to create the dataset, as shown in Figure 2. Hence, n operating scenarios of an i-bus power system produce a dataset with n rows and 2i columns (i.e., bus active power load and LMP data).

Tool Development
In order to perform the multiple and simultaneous bus LMP predictions, we develop a tool that uses a multi-output regressor based on support vector machines (multi-output SVR) implemented through the scikit-learn Python library. In this model, the input vector x corresponds to the different active power load values of each power system bus, and the output vector y refers to the LMP value corresponding to each bus. Figure 3 shows the model scheme and its interaction with all data. All AC-OPF synthetic data have been randomly divided into two sets, one of them with 85% for hyperparameter tuning with cross-validation and model fitting tasks, and the other set with the remaining 15% only for testing. This approach is well-known as a validation set method [31].
After splitting, a randomized search with cross-validation is used to identify and select proper hyperparameters (i.e., hyperparameter tuning) for the SVR. Let $C \in \{C_1, C_2, C_3, \dots, C_m\}$ and $\varepsilon \in \{\varepsilon_1, \varepsilon_2, \varepsilon_3, \dots, \varepsilon_n\}$, where m and n are the numbers of C and ε values to be considered. For different random combinations {C, ε}, the mean absolute error (chosen because it is less sensitive to atypical data) on the training dataset is calculated to determine the best hyperparameter combination, which is then used for model fitting with these same data.
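The split and tuning procedure described above can be sketched with scikit-learn as follows. The data are hypothetical stand-ins for the load and LMP columns, and the candidate grids for C and ε, the number of sampled combinations, and the number of folds are illustrative assumptions. The paper does not state the values it searched over.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

# Hypothetical stand-in data: 4 "bus loads" and 3 "LMPs" per scenario.
rng = np.random.default_rng(2)
X = rng.uniform(0.5, 1.5, size=(200, 4))
Y = np.column_stack([30.0 + X @ rng.uniform(0.5, 2.0, size=4)
                     for _ in range(3)])

# 85% for tuning and fitting, 15% held out for testing only.
X_fit, X_test, Y_fit, Y_test = train_test_split(
    X, Y, test_size=0.15, random_state=0)

# Candidate values for C and epsilon (assumed grids).
param_distributions = {
    "estimator__C": [0.1, 1.0, 10.0, 100.0],
    "estimator__epsilon": [0.001, 0.01, 0.1],
}

# Random combinations {C, epsilon} scored by cross-validated MAE.
search = RandomizedSearchCV(
    MultiOutputRegressor(SVR(kernel="rbf")),
    param_distributions=param_distributions,
    n_iter=6,
    scoring="neg_mean_absolute_error",
    cv=5,
    random_state=0,
)
search.fit(X_fit, Y_fit)

best_params = search.best_params_
cv_mae = -search.best_score_            # scikit-learn maximizes the negated MAE
test_predictions = search.predict(X_test)  # refitted best model on unseen data
```

The `estimator__` prefix routes each hyperparameter to the SVR wrapped by `MultiOutputRegressor`, so the same {C, ε} pair is shared by all per-bus regressors in this sketch.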
Following model fitting, the forecasting performance of the proposed tool under new power system operating scenarios is assessed with test data (i.e., unseen data). For this assessment, the following regression performance metrics, all of which support multi-output problems, were used: R-squared ($R^2$) [20], mean square error (MSE) [32], mean absolute error (MAE) [2,20], and mean absolute percentage error (MAPE) [33], which are defined in Equations (8)-(11), where $y$ is the actual (real) value, $\hat{y}$ is the forecasted value, and $\bar{y}$ is the mean of the actual values. The R-squared metric quantifies the agreement between the data predicted by the proposed model and the actual data. The ideal value is 1, so an R-squared close to 1 indicates an accurate estimate, which makes it an intuitive metric. The other performance metrics (MSE, MAE, and MAPE) capture different dimensions of the model error (relative and absolute) for the LMP prediction and allow a more complete analysis. Since the results can be viewed individually (all operational scenarios for one bus) or collectively (one operational scenario for all buses), both views were analyzed separately for better comprehension. A summary of this entire development process is shown in Figure 4.
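Equations (8)-(11) are not reproduced in this text. The standard definitions of the four metrics, in the order in which they are listed above, are sketched below for N samples. They are assumed to match the forms used by the authors. In the standard definition of $R^2$, $\bar{y}$ denotes the mean of the actual values.

```latex
\begin{align}
R^{2} &= 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2}}
                  {\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^{2}} \tag{8} \\
\mathrm{MSE} &= \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^{2} \tag{9} \\
\mathrm{MAE} &= \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{10} \\
\mathrm{MAPE} &= \frac{100}{N} \sum_{i=1}^{N}
                 \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{11}
\end{align}
```

MSE and MAE carry the units of the LMP (squared and unsquared, respectively), while $R^2$ and MAPE are dimensionless.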

Results and Discussion
In order to test the proposed machine learning tool, built on an SVM-based multi-output regressor, an IEEE test power system is used to generate different operational scenarios through a stochastic simulation of the bus loads; the OPF is then solved for each scenario to obtain the corresponding LMPs. All this information (i.e., bus loads and LMPs) forms the dataset used for fitting and testing the proposed tool. The following subsections analyze each result in order to observe the tool performance in different operating scenarios.

Case Description
The power system used for the deployment of this tool is the IEEE RTS 118-bus power system according to [37]. Thus, all described power system parameters belong to the test case included in MATPOWER. The one-line diagram for this system is shown in Figure 5 (see also Table 1).

Active Power Load Data
Since the power system chosen for this test has 118 buses and 1000 different operating scenarios have been considered, Monte Carlo simulations fed by discrete uniform functions bounded by ±50% of the base active power load value of each bus were used to create the corresponding scenarios. After solving the AC-OPF with these bus load data to obtain the LMPs for each scenario, the resulting dataset contained 1000 rows and 236 columns (i.e., 118 active power load and 118 LMP values per scenario). The results of this process are shown in Figure 6. The dataset has been randomly divided into two sets, one with 85% of the data (i.e., 850 samples) for hyperparameter tuning and model fitting, and the other with the remaining 15% (i.e., 150 samples) only for testing.

Tuning Model Hyper-Parameters
The tuning parameters of the proposed tool, as mentioned above, were the C and ε values, since it is based on an SVR algorithm. Therefore, we used a randomized search with cross-validation to find the hyperparameter combination with the best performance on the training data (i.e., the lowest mean absolute error, which in this case was 0.0791). The selected hyperparameter set for this tool is presented in Table 2.

Table 2. SVR-based multi-output regressor hyperparameters (columns: Hyperparameter, Value).

After model hyperparameter tuning and fitting, trials were performed with test data in order to assess the forecasting performance of the proposed tool under new operating scenarios. As mentioned above, the results of these tests are presented from two perspectives (i.e., one operating scenario with all system buses, and one system bus with all proposed operating scenarios), which are described below.

Results: All-Bus Approach
The LMP forecast (red bars) produced by the proposed tool for all power system buses in a single test operating scenario, presented in Figure 7, is close to the real LMP values (blue bars), with a tendency to be lower than the real ones (e.g., buses 1-19, 21-39, among others). Some buses go against this trend, mostly those where it is comparatively cheap to transmit energy (e.g., buses 81, 92-96, among others). In any case, these deviations do not exceed ±1.4% of the real LMP value (approximately ±0.25 USD/kWh).
It is important to highlight that forecasts below the real LMP are slightly more frequent across the proposed operational scenarios (i.e., more than 54% of all buses for all samples). In other words, the proposed tool tends to indicate somewhat cheaper LMPs than the real ones. This tendency can be clearly seen in the relative error percentage computed without taking the absolute value, since the sign then indicates the direction of the prediction error. Figure 8 shows this signed relative error for the same operating scenario. In this case, the error was negative for 100 out of 118 buses, which means that the forecast is smaller than the real LMP.
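The signed relative error used to expose the direction of the forecast bias can be sketched as below. The sign convention (forecast minus actual, divided by actual) is an assumption chosen so that negative values correspond to forecasts below the real LMP, as described above. It also assumes positive LMPs. The five LMP values are hypothetical, and so is the function name `signed_relative_error_pct`.

```python
import numpy as np

def signed_relative_error_pct(actual, forecast):
    """Signed relative error in percent; negative means under-forecast."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * (forecast - actual) / actual

# Hypothetical LMPs for five buses in one scenario (illustrative numbers).
actual = np.array([40.0, 38.0, 41.0, 39.5, 36.0])
forecast = np.array([39.8, 37.9, 40.7, 39.6, 35.9])

errors = signed_relative_error_pct(actual, forecast)
# Number of buses where the forecast is below the real LMP.
n_under = int(np.sum(errors < 0))
```

Counting the negative entries per scenario, as `n_under` does here, gives the kind of tally reported for Figure 8.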

Results: One-Bus Approach
On the other hand, in the one-bus approach, which considers a single power system bus across all test operating scenarios, the LMPs of the individual buses are predicted closely. Therefore, the individual SVR components of the multi-output regressor perform well for the LMP values of each bus (consistent with the all-bus approach findings), as shown in Figure 9.
In this approach, the individual SVR modules show some lags at extreme LMP values for all buses (a common drawback in regression tasks with or without time dependence), which have a negative impact on the performance metrics. However, these regressors (red line) are able to replicate both the increasing and decreasing behavior of the LMP (blue line) according to the active load variation at each bus. Hence, these lags do not mean that the predicted LMP values are far from the actual power system operation.

Evaluation of Forecasting Accuracy
For the proposed LMP forecasting tool, the regression metrics for the training and test datasets are shown in Table 3. The metrics show that the proposed tool performs well, as indicated by the coefficient of determination $R^2$, which is quite close to 1 (greater than 0.9 for both training and test sets). In other words, the relationship between LMP and active power load captured by the proposed model explains more than 90% of the total variation of the data, so the forecasts obtained with this model are very close to the real values.
This observation is consistent with the MSE metric, which is close to zero for both training and test datasets (0.006 and 0.008, respectively). Therefore, the variance and bias of the proposed multi-output model are small, which indicates a fairly accurate prediction. On the other hand, the MAE and MAPE metrics directly indicate the gap between the predictions and the data. In this case, the MAE indicates an average absolute mismatch of about 0.06 USD/kWh, which corresponds to an average error of about 0.16% of the LMP values, as shown by the MAPE metric. This evidence is consistent with the analysis of the other metrics and supports the high performance of the proposed regression model on the LMP data.
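As an illustration of how such metrics can be computed for a multi-output forecast, the sketch below evaluates them per bus (across scenarios) and per scenario (across buses), mirroring the one-bus and all-bus views. The arrays are hypothetical random data, not the paper's results, and the helper name `regression_metrics` is an assumption.

```python
import numpy as np

def regression_metrics(actual, forecast, axis):
    """R2, MSE, MAE and MAPE (%) computed along the given axis."""
    err = actual - forecast
    ss_res = np.sum(err ** 2, axis=axis)
    centered = actual - actual.mean(axis=axis, keepdims=True)
    ss_tot = np.sum(centered ** 2, axis=axis)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": np.mean(err ** 2, axis=axis),
        "mae": np.mean(np.abs(err), axis=axis),
        "mape": 100.0 * np.mean(np.abs(err / actual), axis=axis),
    }

# Hypothetical test set: 150 scenarios (rows) by 6 buses (columns).
rng = np.random.default_rng(3)
actual = 40.0 + rng.normal(scale=1.0, size=(150, 6))
forecast = actual + rng.normal(scale=0.05, size=actual.shape)

per_bus = regression_metrics(actual, forecast, axis=0)       # one value per bus
per_scenario = regression_metrics(actual, forecast, axis=1)  # one per scenario
overall_mae = float(per_bus["mae"].mean())
```

Averaging the per-bus values, as done for `overall_mae`, yields a single summary figure of the kind reported in a metrics table, while the per-bus and per-scenario arrays show where the error concentrates.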
These results show the high performance of machine learning regression models for some key tasks in the energy sector, so that different research scenarios can be generated according to the technique used or the specific application (e.g., other market approaches, economic variables, among others). In addition, this type of tool helps different stakeholders in the sector make energy-related decisions (buying, selling, and investing) with greater certainty.

Conclusions
This paper has presented a tool based on machine learning techniques such as SVR for forecasting the LMP of each bus of a power system using only its active power load. In other words, the forecast is obtained in a straightforward way, unlike other methods specialized in this task (e.g., hybrid approaches). This feature can make this information easier to access.
The low absolute and relative errors reported in the metrics above support the high performance of this tool and give more confidence for scaling it to other regression machine learning or deep learning techniques, as well as to other power system topologies. Likewise, the agents interested in this information could have a framework that allows them to make bidding decisions with less uncertainty, especially in scenarios of energy decentralization.
On the other hand, it is important to mention that this tool has been developed and tested on immediate operational scenarios that do not depend on preconditions (i.e., single-period ACOPF). Therefore, it is recommended to expand the scope to longer-term and time-dependent operational scenarios (i.e., multi-period ACOPF) as input to this type of tool in order to test its robustness and effectiveness in other contexts.

Data Availability Statement:
The data that support the tool development are available from the corresponding author, by prior request.