Exploiting Artificial Neural Networks for the Prediction of Ancillary Energy Market Prices

The increase of distributed energy resources in the smart grid calls for new ways to profitably exploit these resources, which can participate in day-ahead ancillary energy markets by providing flexibility. Higher profits are available for resource owners that are able to anticipate price peaks and hours of low or zero prices, as well as to control the resource in a way that exploits the price fluctuations. Thus, this study presents a solution in which artificial neural networks are exploited to predict the day-ahead ancillary energy market prices. The study employs the frequency containment reserve for the normal operations market as a case study and presents the methodology utilized for the prediction of the case study ancillary market prices. The relevant data sources for predicting the market prices are identified, then the frequency containment reserve market prices are analyzed and compared with the spot market prices. In addition, the methodology describes the choices behind the definition of the model validation method and the performance evaluation coefficient utilized in the study. Moreover, the empirical processes for designing an artificial neural network model are presented. The performance of the artificial neural network model is evaluated in detail by means of several experiments, showing robustness and adaptiveness to the fast-changing price behaviors. Finally, the developed artificial neural network model is shown to have better performance than two state-of-the-art models, namely support vector regression and ARIMA.


Introduction
The emergence of the smart grid has resulted in the development of numerous smart Distributed Energy Resources (DER) such as battery storage, adjustable loads and electric vehicles with two-way charging capability. There are several possible ways to exploit these resources profitably. Some of these involve the use of the resources to reduce electricity bills, and comparing the profitability of these approaches can be straightforward, especially if experiments are performed in the same region and a fixed electricity price is used. However, the flexible capacity of the resources can also be traded on various markets, in which case the owner of the resource is required to enter into some business relationship with a party that is willing to pay for the possibility of using the resource. One possibility is that the owner participates in markets such as ancillary services. Alternatively, the owner can make an agreement with an aggregator [1] that will participate in the market on his/her behalf.
Several works published in leading journals have addressed this problem, but on a limited scale, so it is unclear how the research could be exploited for financial gain on existing or emerging
1. Identify what sources of data are relevant and openly available for the predictions of the FCR-N ancillary service market.
2. Identify and present a methodology that can be utilized for the prediction of ancillary market prices and the key design decisions to be made, highlighting the differences between ancillary market (such as FCR-N) and spot market price prediction, as well as employing the Artificial Neural Network (ANN) model, in which numerous hyper-parameters are to be tuned, with no prior work existing for ancillary service price prediction with ANN.
3. Evaluate the prediction performance of the FCR-N price.
The experimental results show that the proposed ANN model was capable of adapting to the fast-changing price patterns of the FCR-N market. Moreover, the ANN outperforms the two state-of-the-art models, Support Vector Regression (SVR) and the ARIMA model, in the prediction of the FCR-N prices.
This paper is structured as follows. Section 2 presents the related work, while Section 3 describes the problem analysis. Then, Section 4 proposes the methodology utilized for the prediction of the ancillary market prices. Section 5 presents and discusses the experimental results. Finally, Section 6 draws the conclusions of the paper.

Related Work
Relying solely on electricity markets is not sufficient to deliver a reliable power grid. Therefore, besides electricity markets, ancillary service markets are used as supporting mechanisms to ensure the continuous power balance in the grid [32,33]. The increase of distributed and variable renewable energy resources combined with new smart grid technologies has contributed to the expansion of ancillary markets [34]. However, ancillary service markets have been developed in various ways among different countries [35], i.e., with different regulations, as well as technical requirements. This study focuses on predicting the prices of a Demand Response (DR) ancillary market, the Frequency Containment Reserve for Normal operations (FCR-N) [36]. The FCR-N market consists of operating reserves that constantly maintain the power balance in the power system based on the frequency deviations from the nominal frequency [37]. Hence, the FCR-N provides a primary frequency control service by supplying power reserves to the power grid when frequency deviations occur. To participate in the FCR-N market, the participants need to be able to meet the technical requirements specified by the transmission system operator [36]. The main requirements for the FCR-N provision are specified in Table 1.
The price prediction of ancillary markets can enable the market participation of smart grid stakeholders, such as aggregators [1] and electricity retailers [38]. The prediction of ancillary market prices can improve the decision-making strategies of the stakeholders, while reducing the risks involved in the market participation [39]. So far, few studies have investigated the prediction of ancillary market prices [25], even though ancillary markets typically present significant differences in terms of characteristics and patterns compared with the commonly-studied day-ahead markets [40].
The FCR-N market is chosen among other ancillary services for the following reasons. Firstly, the FCR-N ancillary service can be delivered by DR programs [41], through the engagement of the demand side. In addition, the FCR-N does not have hard real-time constraints for the participation. In fact, the required activation time is within three minutes [36]. Moreover, the minimum power bid for participating is small (i.e., 0.1 MW) compared to other markets. Consequently, all these aspects of the FCR-N market are raising the interest of new stakeholders (e.g., aggregators, electricity retailers), making it a promising market for their DR participation. Therefore, this work attempts to predict the FCR-N prices, contributing to enhancing the decision-making strategies of such stakeholders [38], thus encouraging their market participation.

Table 1. FCR-N market technical requirements as specified in [36].
Market | Minimum Bid | Activation Time | Activation Frequency | How Often It Is Activated
In recent years, as a result of the deregulation process and the introduction of competitive energy markets [42], the prediction of energy market prices has seen a fast-growing interest. Several modeling approaches have been developed for the analysis and prediction of energy prices [43,44]. According to [43], modeling approaches can be classified into five categories: game theoretic models, fundamental methods, reduced-form methods, statistical approaches and computational intelligence. Among the different approaches, some of the most common and traditional methods are the statistical models for time series prediction, named Autoregressive Integrated Moving Average (ARIMA) models [45]. Furthermore, ARIMA models are commonly employed as benchmark models to compare the predictions' performance [46,47].
More recent attention in the prediction of energy prices has focused on computational intelligence methods [43], where SVR and ANNs have played a major role. SVR has been introduced for regression analysis in [48]. Since then, SVR has been used in several domains to predict time series, including the prediction of day-ahead electricity prices [49,50]. In [51], SVR prediction performance has been compared with ANN performance for the prediction of energy prices. Moreover, SVR has been used in [52] as the first attempt to predict day-ahead ancillary market prices by means of regression analysis methods.
ANNs are the most commonly-used computational intelligence methods for energy price prediction. One of the possible ANN classifications is based on their architecture [43], in which two major ANN categories can be identified, namely feed-forward neural networks and Recurrent Neural Networks (RNNs). Feed-forward networks have been demonstrated to perform well for the day-ahead prediction of spot market prices, such as in [46,53,54]. On the other hand, RNNs have been shown to predict the spikes of the energy prices better [55]. However, no studies were found where ANNs were employed to predict ancillary market prices in order to analyze the day-ahead prediction performance and the possible advantages or disadvantages of such an approach.

Problem Analysis
This section aims at analyzing the problem of predicting ancillary service market prices for the case of the FCR-N market in Finland. The first step of the analysis consisted of collecting and selecting the data sources that would be used for predicting the prices (Section 3.1). A second step consisted of analyzing the FCR-N prices to be predicted in order to understand the main statistical properties, as well as the differences between the ancillary and the spot market prices (Section 3.2). Furthermore, the autocorrelation of the analyzed prices is investigated in Section 3.3.

Data Collection
A crucial aspect of the prediction of time series is the selection of meaningful variables to be used as the input for the forecasting model [56]. Since, in recent years, energy markets have faced major changes in the fast-evolving power grid [57], in this work, only two years of data were collected. In fact, adding earlier years to the collected data would add noise to the prediction models, worsening the prediction performance. Thus, two years of data, namely 2015 and 2016, were collected for the selected variables from several data sources, such as Fingrid [58], Energia.fi [59], Nord Pool [60] and the Finnish Meteorological Institute [61]. Table 2 provides an overview of the categories of the data collected with the respective variables and data sources. The first category consists of variables associated with the FCR market, among which there are the FCR-N prices to be predicted. The second category is related to the import and export of electricity in Finland from and to the neighboring countries. Then, variables were collected for the electricity generation in Finland. Some examples of electricity generation variables are the total generation, nuclear generation and wind generation. The total electricity load was also considered as a variable, as well as the Elspot prices for the Nord Pool day-ahead market and the oil prices. Another set of variables consisted of weather variables that can affect the production and consumption of electricity. Temperature, wind speed, solar radiation and humidity data were collected from several locations in Finland. Moreover, calendar variables have been used to take into account seasons, holidays, weekdays and weekends.

FCR-N Price Analysis
The FCR-N prices have been analyzed for the two years of data that were collected. Figure 1 shows the 17,544 observations of the FCR-N prices; the maximum price on the vertical axis was limited to 160 €/MW for clarity, due to the small number of occurrences above this threshold. The statistics of the FCR-N prices are presented in Table 3. The skewness value of the data was large, demonstrating that the time series had a non-symmetric distribution. Furthermore, the kurtosis value shows that the data followed a leptokurtic distribution [54], characterized by heavy tails and a sharp peak around the mean. The large value of the Jarque-Bera test statistic was a further indication that the FCR-N price data were far from being normally distributed. Moreover, comparing the distribution of the FCR-N prices (Figure 2) with the distribution of the Elspot market prices presented in Figure 4, it can be observed that the FCR-N prices behaved differently, with the data less concentrated around the median and more towards the minimum and maximum values. A further difference with spot market prices was that the FCR-N prices were zero for around 30% of the time (Figure 2). Thus, any insights gained from predicting spot market prices could not be assumed to apply to this market.
Furthermore, several studies on spot market price prediction employ seasonal adjustment methods to remove the spot market seasonal component [54,63,64]. In fact, spot market prices present a strong seasonal behavior [43,63]. On the other hand, the FCR-N market presented a significantly weaker seasonality, with price patterns that can change considerably in a short period of time, as also shown by Figures 2 and 3. Therefore, in contrast to the spot market prices, seasonal adjustment methods could not be employed in the methodology for the prediction of the FCR-N market prices.

Autocorrelation and Variable Lag
The autocorrelation of the FCR-N prices has been investigated for the data of 2016. Figure 5 shows the average autocorrelation of the FCR-N prices. It can be observed that the FCR-N prices presented a daily correlation (24-h lag), as well as a subordinate weekly correlation (168-h lag). Due to the daily correlation, a lag of 24 h was added to the variables for which no predictions were accessible for future times. Thus, the lag was added to all the variables except those belonging to the weather and calendar categories.
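The two operations described in this section, measuring the autocorrelation at a given lag and shifting a variable by 24 h, can be illustrated with a minimal sketch. The function names `autocorrelation` and `add_lag` and the synthetic sinusoidal series are assumptions made for illustration; they are not taken from the study and the series is not FCR-N data.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at a positive integer lag."""
    mean = sum(series) / len(series)
    centred = [v - mean for v in series]
    numerator = sum(a * b for a, b in zip(centred, centred[lag:]))
    denominator = sum(c * c for c in centred)
    return numerator / denominator

def add_lag(values, lag=24):
    """Shift a series by `lag` hours; the first `lag` entries have no value."""
    return [None] * lag + list(values[:-lag])

# Synthetic hourly series with a daily cycle (illustration only, not FCR-N data).
prices = [20.0 + 10.0 * math.sin(2 * math.pi * h / 24) for h in range(24 * 28)]

acf_daily = autocorrelation(prices, 24)     # strong positive daily correlation
acf_half_day = autocorrelation(prices, 12)  # negative for this synthetic cycle
lagged = add_lag(prices, 24)                # value at hour h is the price at h - 24
```

In this sketch, the lagged series at hour h carries the value observed 24 h earlier, which is the only information available at prediction time for variables without a forecast.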

Methodology
The methodology that aimed at predicting the FCR-N prices was decomposed into five steps. In Section 4.1, a formulation of the prediction model is made, including the formulation for the employed ANN. A normalization preprocessing step of the input data is described in Section 4.2. Then, the employed validation method for the ANN prediction model is presented in Section 4.3, followed by the introduction of the performance measures utilized to evaluate the prediction performance of the ANN model (Section 4.4). In addition, the ANN model is empirically configured through the tuning of its hyper-parameters in Section 4.5.

Prediction Model Formulation
The primary objective of the study is to provide a solution for the prediction of the 24 distinct hourly prices of the day-ahead FCR-N market. This means that the prediction horizon for our solution consisted of 24 distinct prices to be predicted. Thus, from the various machine learning strategies for time series forecasting [65], the Multi-Input Multi-Output (MIMO) strategy was selected [66]. The reason for using the MIMO strategy was that MIMO would return a vectorial forecast of the 24 h by modeling the time series in a multiple-input multiple-output regression model. Therefore, through the MIMO strategy, it was possible to define a prediction model, which, given as input n features, denoted by the matrix X, learned a function f(X) that generated as the output the prediction of the following ts = 24 time steps, denoted by the vector Ŷ.
Usually, in machine learning, the input matrix X has the variables in rows and the training samples in columns. Similarly, each element of the vector Ŷ is the prediction for one training sample. However, with the MIMO strategy with ts outputs, each training sample is a matrix X, and the prediction for each training sample is a vector Ŷ with ts elements. Thus, the training sample X must have ts values for each variable. For one training sample at time t, the input matrix X and the output prediction vector Ŷ are defined as:
$$X = \begin{bmatrix} x_1^{t} & x_1^{t+1} & \cdots & x_1^{t+ts-1} \\ x_2^{t} & x_2^{t+1} & \cdots & x_2^{t+ts-1} \\ \vdots & \vdots & \ddots & \vdots \\ x_n^{t} & x_n^{t+1} & \cdots & x_n^{t+ts-1} \end{bmatrix}, \qquad \hat{Y} = \left[ \hat{y}^{t}, \hat{y}^{t+1}, \ldots, \hat{y}^{t+ts-1} \right]$$
where t represents the time in hours, x_1^t represents the variable 1 at time t and m = n * ts refers to the total number of elements in the matrix X.
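To make the shape of one MIMO training sample concrete, the following minimal sketch builds an n × ts input matrix and a ts-element target vector from hourly series. The helper names `make_mimo_sample` and `flatten`, the toy data and the choice of aligning the input and target windows on the same hours (with lagged variables already shifted) are assumptions for illustration only.

```python
def make_mimo_sample(features, prices, t, ts=24):
    """Build one MIMO training sample starting at hour index t.

    features: list of n hourly series (already lagged where needed).
    prices:   hourly target series.
    Returns (X, Y): X has n rows of ts values, Y has ts target prices.
    """
    X = [series[t:t + ts] for series in features]
    Y = prices[t:t + ts]
    return X, Y

def flatten(X):
    """Flatten the n x ts matrix into the m = n * ts inputs fed to the network."""
    return [value for row in X for value in row]

# Toy data: n = 3 variables over 72 hours (values are arbitrary placeholders).
features = [[float(v * 100 + h) for h in range(72)] for v in range(3)]
prices = [float(h) for h in range(72)]

X, Y = make_mimo_sample(features, prices, t=24)
inputs = flatten(X)
```

With n = 3 variables and ts = 24, the flattened input has m = 72 elements, matching the relation m = n * ts.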
The model employed for the prediction of the FCR-N prices consisted of an ANN. An ANN is a computing system that mimics the function of the human brain and nerves, organized in such a way that the structure simulates a network. As shown in Figure 6, an ANN can be composed of a certain number of layers, which can be divided into three main categories: the input layer, the hidden layers and the output layer. The input layer consists of input nodes where the X matrix will be represented. The hidden layers consist of any layer in between the input and the output layer, while the output layer consists of the output variable to be predicted, i.e., Ŷ. The key elements of an ANN are the neurons, which are depicted in Figure 7. Each neuron i receives as input the output values y_j (j = 1, 2, ..., p) from each neuron of the previous layer with their respective weights w_ij and calculates a linear function as follows:
$$W_i = \sum_{j=1}^{p} w_{ij} \, y_j + b_i$$
where p is the number of neurons of the previous layer and b_i is a bias term. Then, in order to introduce nonlinearity into the ANN, thus allowing the ANN to better learn f(X), each neuron applies a non-linear function to W_i, also called the activation function ϕ(), as follows:
$$y_i = \phi(W_i)$$
where y_i is the output that neuron i passes on to the next layer.
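The computation performed by a single neuron, a weighted sum of the previous layer's outputs plus a bias followed by an activation function, can be sketched as follows. The numeric values are hypothetical and the sigmoid is used only as one possible choice of ϕ.

```python
import math

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the previous layer's outputs plus a bias, then activation."""
    weighted_sum = sum(w * y for w, y in zip(weights, inputs)) + bias
    return activation(weighted_sum)

# Hypothetical values for a neuron fed by p = 3 neurons of the previous layer.
previous_outputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.5

# The weighted sum is approximately 0 here, so the sigmoid output is close to 0.5.
out = neuron_output(previous_outputs, weights, bias)
```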

Data Preprocessing
Each variable in the input matrix X needs to be normalized, so that all variables share a comparable scale. In this study, a normalization method was chosen that rescales the features towards a standard normal distribution [67]. The normalization method utilized was the unit variance method [67], which rescales each variable to zero mean and unit standard deviation (so that values within one standard deviation of the mean fall in the range [−1,1]). It is specified as follows:
$$z = \frac{x - \mu}{\sigma}$$
where x is the original value of the variable, z is the normalized value, µ is the mean of the distribution and σ is the standard deviation.
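A minimal sketch of this normalization, subtracting the mean µ and dividing by the standard deviation σ, is given below. The function name `standardize`, the use of the population standard deviation and the temperature values are assumptions for illustration.

```python
import statistics

def standardize(values):
    """Rescale a variable to zero mean and unit standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        return [0.0 for _ in values]   # constant variable: nothing to scale
    return [(v - mu) / sigma for v in values]

# Illustrative temperatures in degrees Celsius (made-up numbers).
temperatures = [-5.0, 0.0, 5.0, 10.0, 15.0]
scaled = standardize(temperatures)
```

In a sliding-window setting, µ and σ would presumably be estimated on the training window only and then reused for the test day, although the text does not state this explicitly.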

Model Validation
A walk forward validation method was employed to validate and select the best-performing ANN models [68]. Figure 8 shows the employed walk forward validation method, where an in-sample training dataset was utilized to train the prediction model, while an out-of-sample testing dataset was reserved for evaluating the prediction performance. The size of the sliding in-sample training dataset was determined by means of an experiment in Section 5.1. Moreover, since the aim of this study is to predict the day-ahead price, the size of the testing dataset was set to 24 h of out-of-sample data. Finally, in Section 4.5, several datasets are formed and utilized to evaluate and select the best-performing prediction model. In time series prediction, contiguous subsets of data (in time) are segmented into two fractions (Figure 8) [69]. The first fraction is used for training the model and precedes the second fraction in time, while the second fraction is used for testing. The performance is then evaluated by averaging the testing error over the several datasets generated with the walk forward validation method.
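The sliding train/test segmentation can be sketched as a generator of index ranges. The function name `walk_forward_splits`, the step of 24 h between consecutive windows and the small window sizes in the example are assumptions chosen for illustration, not values taken from the study.

```python
def walk_forward_splits(n_hours, train_hours, test_hours=24, step_hours=24):
    """Yield (train_range, test_range) pairs for a sliding training window.

    Each training window is immediately followed in time by its test window,
    so no future observation is ever used for training.
    """
    start = 0
    while start + train_hours + test_hours <= n_hours:
        train = range(start, start + train_hours)
        test = range(start + train_hours, start + train_hours + test_hours)
        yield train, test
        start += step_hours

# Small illustration: 10 days of hourly data, 7-day training window.
splits = list(walk_forward_splits(n_hours=240, train_hours=168))
first_train, first_test = splits[0]
last_train, last_test = splits[-1]
```

The overall score would then be the average of the test errors obtained on each of the generated splits.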

Prediction Performance Evaluation
Since the design of a neural network is an empirical process, where several models are tested and handcrafted, we need to define performance measures to evaluate which is the best-performing ANN model. In this study, of the several measures of prediction accuracy presented in the literature [43,70], we utilized the Mean Squared Error (MSE) to evaluate the prediction performance. The MSE is defined as follows:
$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$
where N is the number of predicted prices, y_i is the price data obtained from Fingrid and ŷ_i is the predicted price. The decision to employ the MSE is mainly related to one key feature of the FCR-N prices, which present several instances of zero values (Table 3). In fact, due to the large number of zero values in the FCR-N price data, several state-of-the-art performance measures would produce a division by zero or an undefined division of zero by zero [70].
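The following sketch computes the MSE on a few made-up prices and shows why a percentage-based measure breaks down when the actual price is zero. The mean absolute percentage error is used here only as one familiar example of a measure that divides by the observed value; the text does not name the specific measures it rules out.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted prices."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """MAPE, shown only to illustrate why it fails on zero prices."""
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Made-up hourly prices containing zeros, as is common in the FCR-N market.
actual = [0.0, 0.0, 12.0, 30.0]
predicted = [1.0, 0.0, 10.0, 33.0]

mse = mean_squared_error(actual, predicted)   # (1 + 0 + 4 + 9) / 4 = 3.5

try:
    mean_absolute_percentage_error(actual, predicted)
    mape_defined = True
except ZeroDivisionError:
    mape_defined = False                      # division by a zero price
```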

Empirical Configuration of an Artificial Neural Network
One of the first design decisions is the selection of the best-performing ANN architecture, which includes the identification of the activation functions, the number of hidden layers and the number of neurons for each layer. Currently, the selection of ANN architectures is executed by means of an empirical process [71], in which various architectures are tested, handcrafted and adjusted. General guidelines for designing an ANN architecture are well known among practitioners, as described in [72]. Following those guidelines, several architectures have been tested for the prediction of the FCR-N prices. Among those, in this study, we show the results of the two best-performing architectures, with the final aim of selecting the one that provides the best prediction performance. The two architectures compared in this study are presented in Figures 9 and 10. The architecture in Figure 9 is composed of 2 hidden layers, containing n = 64 and 24 neurons, respectively. The architecture in Figure 10 has 3 hidden layers, with n = 64, n/2 = 32 and 24 neurons, respectively. In the following, the architecture of Figure 9 will be referred to as the 3-layer architecture, due to the fact that its depth is 3, while the architecture in Figure 10 will be called the 4-layer architecture, since its depth is 4.
In order to compare the performance of the two ANN architectures, namely the 3-layer and the 4-layer, a set of gradient descent optimization algorithms has been employed. The set included the Stochastic Gradient Descent (SGD) [73], RMSProp [74], Adam [75] and Nadam [76]. The performance of the two architectures for the different gradient descent optimization algorithms is presented in Figure 11 for the 3-layer architecture and Figure 12 for the 4-layer architecture.
The two figures show that the 3-layer architecture had better performance than the 4-layer, while the difference in performance among the four gradient descent algorithms was less significant. The best performance was obtained by training the 3-layer architecture with 750 epochs. Thus, in the following, the 3-layer architecture was chosen for the FCR-N price predictions, with the Adam algorithm utilized as the optimizer for 750 epochs, due to the fact that Adam provided the best MSE performance among the examined gradient descent algorithms.
One of the primary tasks when designing an ANN architecture is the identification of the best-performing activation functions. This identification is executed through an empirical process. Figure 13 compares the performance of the 3-layer architecture using different activation functions (ϕ) for the hidden layers [77], namely tanh, softmax, ReLU and sigmoid. For this experiment, the ReLU function was used for the output layer, due to the fact that the ReLU is defined as ReLU(x) = max(0, x) and the FCR-N prices are non-negative numbers. The box plot in Figure 13 shows the MSE prediction performance for a sample of thirty days uniformly distributed in the year 2016. This result shows that the best activation function for the hidden layers in terms of MSE performance is the sigmoid function. Thus, in the following, ϕ = sigmoid will be used as the activation function for the hidden layers, while for the output layer, ϕ = ReLU will be used as the activation function.
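A forward pass through a network with sigmoid hidden layers and a ReLU output layer can be sketched as below. This is only an untrained illustration: the layer sizes (hidden layers of 64 and 24 neurons followed by 24 outputs, one per hour), the input size of 48 and the random initialisation are assumptions, and training with Adam is not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(inputs, layers):
    """Sigmoid on the hidden layers, ReLU on the output layer."""
    hidden = inputs
    for weights, biases in layers[:-1]:
        hidden = dense(hidden, weights, biases, sigmoid)
    out_weights, out_biases = layers[-1]
    return dense(hidden, out_weights, out_biases, relu)

rng = random.Random(0)
n_inputs = 48                      # hypothetical m = n * ts (2 variables x 24 h)
sizes = [n_inputs, 64, 24, 24]     # assumed: hidden 64 and 24, then 24 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

prediction = forward([rng.uniform(-1, 1) for _ in range(n_inputs)], layers)
```

Because the output activation is ReLU, every element of `prediction` is non-negative, which matches the non-negativity of the FCR-N prices.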
In order to prevent the ANN model from overfitting, dropout regularization was utilized for the FCR-N price predictions [78]. Dropout is a regularization technique that temporarily and randomly drops neurons from the ANN model during the training phase, each with a probability p. For the selected 3-layer ANN model, a validation set of sixty days uniformly distributed in the year 2016 was utilized to identify the best dropout probability value p (expressed as a percentage) for the FCR-N price predictions. The results of the experiment are presented in Figure 14, which shows the average MSE performance for each of the tested dropout probability values p. The results indicated that the best values of p were in the range of [40,60] %, with p = 40% being the best-performing value. Thus, a dropout probability value of p = 40% was chosen for the 3-layer ANN model.
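The effect of dropout on a layer's activations can be sketched as follows. The sketch uses the common "inverted" variant, in which surviving activations are rescaled by 1 / (1 − p) during training; this rescaling convention and the function name `dropout` are assumptions, since the text does not specify the implementation.

```python
import random

def dropout(activations, p, rng, training=True):
    """Inverted dropout: zero each activation with probability p during training.

    Surviving activations are scaled by 1 / (1 - p) so that their expected
    value is unchanged; at prediction time the layer is left untouched.
    Requires 0 <= p < 1.
    """
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)
hidden = [1.0] * 10_000

dropped = dropout(hidden, p=0.4, rng=rng)
fraction_zeroed = dropped.count(0.0) / len(dropped)   # close to 0.4
unchanged = dropout(hidden, p=0.4, rng=rng, training=False)
```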

Empirical Results and Discussion
The selected ANN model for the experiments consisted of a three-layer model, as in Figure 9, where the gradient optimizer utilized was Adam with 750 epochs. Moreover, for the hidden layers, the selected activation function was the sigmoid, while for the output layer, the ReLU function was selected. Finally, a dropout value of 40% was utilized in order to prevent the ANN model from overfitting. In the following section, at first, the size of the training window for the ANN model is identified through an experiment (Section 5.1), then the prediction performance of the ANN model is extensively analyzed in Section 5.2. Section 5.3 completes the empirical experiments by comparing the prediction performance of the ANN model to two state of the art models.

Determining the Training Window Size
Since the FCR-N market is subject to rapid changes [57], it is necessary to establish a suitable window size to train the model. Therefore, an experiment was executed in order to define the size of the training window for the ANN model. Figure 15 shows the MSE for the 3-layer model for a sample of thirty uniformly distributed days, where it can be noted that the prediction performance started to stabilize for training window sizes from 180 days onwards. Since no fundamental improvements or degradations of the performance were registered with training window sizes larger than 360 days, the training window size chosen for the following experiments was 360 days. This is a good trade-off between a window that is too small, which starts to degrade the prediction performance, and one that is too large, which slows the training of the ANN model. Moreover, further enlarging the training dataset could add noise to the prediction model, thus worsening the prediction performance. Hence, to summarize, in the following, the training window size is 360 days of data, while the testing window size corresponds to the 24 h ahead unless otherwise stated.

Prediction Performance Analysis
The prediction performance of the FCR-N prices was analyzed for the entirety of 2016 using the ANN model. For each day of 2016, one ANN model was trained for predicting the day-ahead FCR-N prices. The first experiment aimed at analyzing the prediction performance for each month of the year 2016. As shown in Figures 2 and 3, the FCR-N market prices have a fast-changing behavior even between consecutive months. Thus, this fast-changing behavior was also expected to affect the prediction performance. The prediction performance for each month of 2016 is presented in Figure 16 in the form of a box plot, where the median value is marked with a line within the boxes. As can be observed, the worst prediction performance occurred during the months of May and July. This result reflects the analysis presented in Section 3.2. In addition, the months of May and July presented a distribution of the price data that differed considerably from their respective previous months (Figure 2). However, it can be noted that the median values in Figure 16 lie in the lower part of the respective boxes, demonstrating that the prediction performance was affected by a small number of outlier days. This can be observed by looking at the prediction performance in more detail. As examples, we analyzed the performance for the months of July and August, respectively the month with the higher variance and the month that follows it. The prediction performance for each day of July and August is presented in Figures 17 and 18, respectively, where the monthly mean and median of the daily MSE are also represented. The figures show that the difference between the mean and the median of the performance was due to a small set of outlier days. The four main outlier days of July and August are presented in Figure 19.
All the outlier days present very unusual price patterns, which were not registered in the past observations, making them exceptionally difficult to predict. On the other hand, Figure 20 shows the prediction of the FCR-N prices for four days in the time span of two weeks between July and August 2016. As can be noted, the pattern of the FCR-N prices could vary quite rapidly in a short time period, making the FCR-N price prediction more challenging than that of the spot prices, which presented an evident seasonality between contiguous days. Moreover, the first row in Figure 20 presents two days with an outlier behavior (i.e., price patterns specific to the analyzed period of time and not commonly observed in other periods), which were adequately predicted by the ANN model. The second row of Figure 20 shows two price patterns with their respective predictions, which recurred regularly along the entire period of time analyzed (i.e., 2015-2016), where the ANN model succeeded in providing accurate predictions.
Part of the methodology involves the decision on how often the computationally-intensive retraining of the ANN needs to be performed. Thus, the second experiment aimed at analyzing how the performance of the ANN model degraded when the ANN model was utilized for predicting several days in the future without a retraining of the model. Figure 21 presents the prediction performance (in terms of MSE) of the ANN model up to seven days in the future for the year 2016, where the median of the daily performance is represented with a line, while the daily mean performance is represented as a triangle. The prediction performance degraded linearly with respect to the number of days in the future to be predicted, going from an MSE of 115.8 for the first day to 139.6 for the seventh day after training the ANN model.
At the same time, the distance between the median and the mean, as well as the variance of the prediction performance increased, indicating that a short retraining period of the ANN model was desirable to reduce the prediction errors on outlier price patterns.
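As a rough reading of the two MSE values reported for this experiment, and assuming the degradation is indeed linear over the six one-day steps between the first and the seventh day, the average increase in MSE per additional day without retraining and the total relative increase can be worked out as:

```latex
\frac{139.6 - 115.8}{6} \approx 3.97 \ \text{MSE units per day},
\qquad
\frac{139.6 - 115.8}{115.8} \approx 20.6\%
```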

Comparison with the State of the Art
The last experiment aimed at comparing the prediction performance of the ANN model with two state-of-the-art methods. The two selected methods were SVR and ARIMA, which were employed as benchmarks of the state of the art for predicting the FCR-N prices. An SVR model was trained with the same training data as for the ANN model, i.e., with the X matrix as the input. For the ARIMA model, one year of FCR-N price data prior to the day to be predicted was used as the training data, with the parameters set to ARIMA(p = 1, d = 1, q = 1), where p, d and q are the autoregressive, differencing and moving average parameters, respectively. Similarly to the ANN model, the test window size was the 24 h of the day-ahead for both selected methods (i.e., SVR and ARIMA). Figure 22 compares the FCR-N price prediction performance of the ANN model with the two selected benchmark models: SVR and ARIMA(1, 1, 1). The results revealed that the ANN model outperformed both the SVR and the ARIMA. During the first four months of the year, when the prices were distributed around the median and had a low variance, the three models had similar performance, even if the ANN model always outperformed the other two models. However, once the prices experienced an increase in the variance and a dispersion from the median value (i.e., between May and August), the ARIMA failed to adapt to the fast changes of the FCR-N prices, while the ANN model provided predictions with a superior accuracy even compared to the SVR model. Further, the ARIMA predictions for the last few months of 2016 (i.e., September-December), where the FCR-N prices have a similar behavior, were largely affected by the previous period (i.e., May-August), showing that the ARIMA failed to adapt to the new changes of the prices.
On the contrary, the ANN model was also capable of adapting in the third period of the year under study and significantly outperformed the ARIMA benchmark, without being affected by the noise of the prices from the preceding period (i.e., May-August). In addition, for the same period, the ANN model performed better than the SVR model, which in turn proved more robust than ARIMA to the noise introduced by the previous prices. Thus, the ANN was shown to largely outperform the ARIMA benchmark model, being more capable of adapting to new price patterns, while maintaining memory of the patterns occurring regularly and being more robust to the fast changes occurring in the FCR-N market prices. Moreover, the ANN also outperformed the SVR model.

Conclusions
The objective of this study was to predict the day-ahead prices for ancillary markets, with the FCR-N market as a case study. The study outlined the problem analysis and methodology used to predict the FCR-N market prices. The problem analysis consisted of identifying the relevant data sources available for the predictions and then continued by analyzing the FCR-N market prices and their substantial differences from the spot market prices. Moreover, the methodology outlined the selected MIMO strategy for time series forecasting. Then, the methodology defined the data preprocessing and utilized the model validation method and the performance evaluation coefficient. Finally, the methodology introduced the main design decisions that were made for the configuration of the ANN model, which consisted of tuning several hyper-parameters through an empirical process.
In Section 5, the performance of the developed ANN model for the prediction of FCR-N market prices was analyzed. At first, the size of the training window was determined through an experiment. Then, the predictions for the entire year 2016 were analyzed in detail, demonstrating that the performance was affected by a small set of outlier days. Moreover, a further experiment showed that a frequent retraining of the ANN model can mitigate the effect of the outlier days on the prediction performance. Finally, the last experiment showed that the developed ANN model outperformed two benchmark state-of-the-art models, namely the SVR and ARIMA models, in the prediction of the FCR-N market prices.
Author Contributions: C.G. conceived, designed and performed the experiments; C.G. analyzed the data; C.G. and S.S. wrote the paper; R.I., V.V. and S.S. supervised the work and helped to improve the quality of the manuscript.

Funding: The research has been funded by the project "Harnessing the consumer for a flexible energy system architecture" (285029) of the Academy of Finland and by the Finnish Society of Automation.
Acknowledgments: The calculations presented above were performed using computer resources within the Aalto University School of Science "Science-IT" project and the GPU servers provided by the Ichise laboratory at the National Institute of Informatics.

Conflicts of Interest: The authors declare no conflict of interest.
