Data-Driven Trajectory Prediction of Grid Power Frequency Based on Neural Models

Abstract: Frequency in power systems is a real-time indicator of the balance between generation and demand. Accurate observation of the system frequency is vital for system security and protection. This paper analyses the system frequency response following disturbances and proposes a data-driven approach for predicting it using machine learning techniques, namely Nonlinear Autoregressive (NAR) Neural Networks (NN) and Long Short Term Memory (LSTM) networks, trained on simulated and measured Phasor Measurement Unit (PMU) data. The proposed method uses a horizon window that reconstructs the frequency input time-series data in order to predict frequency features such as the Nadir. The simulated scenarios are based on a gradual inertia reduction obtained by including non-synchronous generation in the Nordic 32 test system, whereas the PMU data are collected from different locations in the Nordic Power System (NPS). Several horizon windows are tested in order to identify an adequate prediction margin. Scenarios with noisy signals are also evaluated in order to provide a robustness index of predictability. The results show that the method performs properly and reaches an adequate level of prediction according to the Root Mean Squared Error (RMSE) index.


Introduction
The digitalization era is pushing the power generation sector to adapt rapidly, driven by megatrends such as large-scale renewable energy integration, which is motivated by zero-emissions policies, and the continuous need for increased energy supply and use of sustainable sources. To respond to these megatrends and challenges, cleaner, smarter and more efficient solutions for energy production must be developed in parallel with smart grid technologies and innovative applications [1].
System operators face numerous and increasing challenges in maintaining and optimizing their grids, since power networks are being stressed by the adoption of renewables [2]. A natural consequence is that the timescales for operational decision making have decreased [3]. It is therefore essential to foresee undesirable situations rapidly and to have forecasting analysis of upcoming grid contingencies as a functionality that helps Transmission System Operators (TSOs) to resolve risks and secure continuous grid stability for current and future operation [4,5].
With increasingly complex operation, data-driven methods in power systems have opened the possibility of incorporating system dynamics analysis and control mechanisms based on measurements and realistic scenarios rather than on model-based power planning stages [6]. Power systems data analysis is a current frontier of innovation, exploration and productivity, and research on it is on the rise. Such data analysis has addressed diverse areas such as coherency group identification [7], trajectory prediction to identify system dynamics from noisy measurements [8], short-horizon wind power forecasting [9], online prediction of transient stability with renewables [10], photovoltaic power production nowcasting in microgrids [11], power system collapse prediction [12], wide area control [13], load frequency control [14], and renewable integration impact assessment [15].
Machine Learning (ML), as a subset of data-driven methods and computational intelligence, has been in continuous development during recent decades. It makes it possible to extract critical information from collected data sets and to provide accurate predictions and insights that can bring more flexibility and resilience to power generation in the coming decades [16]. Artificial Neural Networks (ANN), as a branch of Machine Learning, have been applied to power systems in a variety of areas such as wind power forecasting [17], photovoltaic day-ahead forecasting [18], coherency clustering [19], inertia estimation [20], and system state estimation [21].
Equally, time-series forecasting based on Neural Networks (NN) has been shown to help in understanding large data sets that require look-ahead visualization of non-linear problems [22,23], and has been successfully used in power systems for load forecasting [24] and wind power pattern prediction [25], for instance. Additionally, ML models, specifically those based on NN, have shown better results in time-series forecasting applications than the traditional linear models employed in forecasting, such as Auto-Regressive (AR), Moving Average (MA) or combined ARMA models. Other strategies based on seasonal or integrated variants, such as SARMA, ARIMA and similar methods, produce results with lower performance indexes, as seen in [26][27][28]. Nonlinearity is observed in the behavior of the grid frequency time-series. The aim of the present document is therefore to analyze the use of basic NN architectures for frequency forecasting, as a first approximation, on the particular time series given by frequency deviations in a power system.
Frequency monitoring has always been necessary as an indicator of the power balance in the grid, and to activate protection and controllers in a power system [29]. Several approaches have been proposed to predict the grid power frequency. For instance, a conceptual look-ahead frequency dynamics tool is presented in [30] to highlight the importance of a correct detection of frequency contingencies for control adequacy and further monitoring calibration. Meanwhile, a study of the dynamics and statistics of power grid frequency fluctuations, considering frequency time-series with arbitrary noise distributions, is given in [31].
The authors in [32] propose forecasting the minimum frequency value by applying a Newton-method-based approximation and the interpolation of the frequency second derivative for Under-Frequency Load Shedding (UFLS) scenarios. A data-driven approach using the second derivative of the system frequency is proposed in [33], where the initial time of the disturbance and the response time are estimated. A real-time predictor of the frequency evolution for active network management is presented in [34], where auto-regressive estimation is applied to frequency observations with a prediction horizon of one time-step. A dynamic frequency prediction methodology based on an adaptive neuro-fuzzy inference system (ANFIS) is proposed in [35]. An analytical model for frequency nadir prediction with polynomial fitting of the parabolic trajectories of the overall governor responses is given in [36].
A power system frequency prediction using ANNs has been presented in [37], using training data sets for treatment of significant outages of generators. Such approach uses the NN to approximate the entire frequency time-series data. Another approach that utilizes ANNs for system frequency behavior prediction is given in [38]. The system used in this case is the Terceira Island system. ANNs, whose weights are learnt using Genetic Algorithms (GA) are used for estimation of power system sudden change in load and noise conditions in [39]. Again, the entire frequency data is used.
Grid power-frequency prediction has several applications, such as the online estimation of the power system inertia [40,41], the improvement of frequency containment reserves [42], the implementation of advanced network management schemes and protection architectures [43], the improvement of synchrophasor measurement accuracy [44], the improvement of Under Frequency Load Shedding (UFLS) schemes [45], steady-state security assessment [46], the handling of emergency conditions by disconnecting feeders [38], further renewable connection assessments [38], HVDC-AC interaction studies [47], and anticipatory secondary controllers [48]. Additionally, entities like the European Network of Transmission System Operators (ENTSO-e) are working on more accurate frequency estimation, since several interconnections among countries are expected along with an increasing deployment of renewables [49].
Typically, studies on frequency prediction have used time-series analysis in which the whole data set is entirely reconstructed by the NN model, and they have a very short estimation time. From the aforementioned contributions, it is seen that active power network management requires further research on frequency prediction. This motivates the present document, which proposes the NN as a frequency-event forecaster that uses only a minor portion of the data. A previous contribution of this work was presented in [50], where frequency nadir forecasting under a considerable inertia reduction was analysed. However, only a few cases were analysed in [50]. This paper is an extension of that previous work: it adds more cases to the simulated inertia reduction studies and includes PMU data collected from the Nordic Power System (NPS). Additionally, a noise performance indicator is shown for the simulated cases. Moreover, as its major contribution, the paper proposes a new time-horizon window for the data, in which the time evolution of the measured frequency after a large disturbance is predicted. Nonlinear Auto-Regressive (NAR) and Long Short Term Memory (LSTM) methods, based on NN architectures, are compared using the same time-horizon percentage of data.
The rest of this paper is organized as follows: Section 2 describes the theoretical framework of the frequency response and shows the theoretical behaviour of the grid system frequency. Section 4 briefly explains the theoretical background of machine learning and its relationship with time-series forecasting. Section 4.2 explains the methodology used for the time-series frequency trajectory forecasting. Section 3 introduces the system data studied, which comprises two cases: simulated data and real measurements. The simulation results and discussions are presented in Section 6. Finally, the conclusions are given in Section 8.

Frequency Response Preliminaries
The frequency deviations in a power system after a disturbance caused by either a loss of generation or a load event, can be categorized into different control zones: namely inertial frequency response zone, Frequency Containment Reserve (FCR) (primary control) and Frequency Restoration Reserves (FRR) (secondary control) zones.
Inertial response is inherent in the power system due to the synchronizing torque present in the rotating masses of machines, which provides a counter response within seconds to oppose the frequency deviations following the disturbance. In the case of a sudden disconnection of a generating unit, the system frequency drops because of the imbalance between generation and load, as the kinetic energy stored in the rotating machine masses is converted into electrical power in the system. During the inertial response period, the rotating machines in the power system either release or store kinetic energy, tending to reduce the frequency deviation. The system frequency response therefore reflects the total amount of kinetic energy stored in all rotating masses.
The inertial time constant of an individual generator takes typical values varying between 2-9 s and can be interpreted as the time that the generator can provide full output power from its stored kinetic energy. Beyond the inertial response, the frequency is first stabilized and then restored to its nominal frequency by the FCR (governor action) and secondary controllers, respectively. FCR acts as a proportional controller to counteract large frequency deviations, and the response of this control is typically given in <30 s. However, since the FCR is a proportional controller, a steady-state error would still remain. An additional controller such as the FRR then corrects this steady state error over time. Figure 1 shows the frequency response including the time-frame control reactions. Following (1), the dynamics of the system frequency are written as: where ω i (in rad/s) is the frequency of generator i, P m i is the mechanical power (p.u.), P e i is the electric power (p.u.), H i ∈ R > 0 is the inertia constant and D i ∈ R > 0 is the droop damping.
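For reference, one common per-unit form of the swing equation that uses only the symbols defined above is sketched here. The factor 2 on H i and the damping term acting directly on ω i (rather than on its deviation from the nominal value) are assumptions, since conventions differ between references.

```latex
2 H_i \, \frac{d\omega_i}{dt} = P_{m_i} - P_{e_i} - D_i \, \omega_i
```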

Performance Metrics
Following a disturbance in the system, in particular given a negative step disturbance such as a sudden load increase or generation drop at t = t 1 , the following metrics are defined for quantifying the performance of the forecasting.

• Nadir is the maximum dynamic frequency deviation following an active power disturbance/contingency [51].
• Nadir time is the time t = t 2 at which the nadir occurs.
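As a minimal sketch of how these two metrics could be extracted from a sampled frequency trace, the following Python function returns the lowest frequency after the disturbance instant t 1 and the time t 2 at which it occurs. It assumes an under-frequency event, so that the largest deviation is the minimum value; the function name `find_nadir` and the sample values are illustrative and are not taken from the study.

```python
def find_nadir(times, freqs, t_disturbance):
    """Return (nadir_frequency, nadir_time) for samples at or after t_disturbance.

    times and freqs are equal-length sequences; t_disturbance plays the role of t1.
    Assumes an under-frequency event, so the nadir is the minimum frequency.
    """
    candidates = [(f, t) for t, f in zip(times, freqs) if t >= t_disturbance]
    if not candidates:
        raise ValueError("no samples at or after the disturbance time")
    # min() on (frequency, time) tuples picks the lowest frequency,
    # and the earliest time if that frequency is reached more than once.
    f_nadir, t_nadir = min(candidates)
    return f_nadir, t_nadir


# Illustrative trace (hypothetical values), sampled once per second.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
freqs = [50.00, 50.00, 49.80, 49.62, 49.70, 49.85]
f_nadir, t_nadir = find_nadir(times, freqs, t_disturbance=1.0)
```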

Measurement Metrics
In order to have an aggregated frequency measurement of an entire interconnected power system, the Center of Inertia (CoI) is used, which is computed based on the individual speeds ω i and the inertia constants of the synchronous generators H i .
Assuming the set G of synchronous generators, the expression to compute the CoI is:
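The CoI frequency is usually defined as the inertia-weighted average of the generator speeds, that is, the sum of H i ω i over G divided by the sum of H i over G. A minimal Python sketch of that weighted average, assuming this standard definition and using hypothetical values for three generators, is:

```python
def coi_frequency(speeds, inertias):
    """Inertia-weighted average of generator speeds (Center of Inertia).

    speeds:   omega_i for each synchronous generator in the set G
    inertias: inertia constants H_i in the same order
    """
    if len(speeds) != len(inertias) or not speeds:
        raise ValueError("speeds and inertias must be non-empty and equal in length")
    total_inertia = sum(inertias)
    return sum(h * w for h, w in zip(inertias, speeds)) / total_inertia


# Hypothetical per-unit speeds and inertia constants (not from the test system).
omega = [1.000, 0.998, 0.996]
H = [6.0, 3.0, 3.0]
omega_coi = coi_frequency(omega, H)
```

Generators with larger inertia constants pull the aggregate value towards their own speed, which is why the heaviest machine dominates in this example.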

Nordic Test System
The single-line diagram of the Nordic test system is shown in Figure 2. This system contains 32 high-voltage buses and 20 synchronous generators with different types of generation, located in four identified geographical areas. The North and External areas are hydro-dominated, while the South and Central areas have a mixture of nuclear, thermal and coal power plants. The Central area has the highest consumption level, whereas the North area has the lowest. The transmission system is designed for 400 kV (19 buses), with some regional systems at 220 kV (2 buses) and 130 kV (11 buses).
To model system scenarios with large amounts of non-synchronous generation, some of the synchronous generators are replaced with non-synchronous generation connected through back-to-back Full Rated Converters (FRC), with the same active and reactive power outputs. Power outputs are fixed throughout all simulations. The different cases, Case 1 (C 1 ), Case 2 (C 2 ), Case 3 (C 3 ) and so on, represent the replacement of synchronous generation by different amounts of FRC-based non-synchronous generation. C 1 considers the replacement of one generator only, C 2 considers the replacement of two generators including the one in C 1 , and so on. It is assumed that the dispersed generation is connected to one established substation [52]. The five scenarios are summarised in Table 1 and visualized in Figure 3.
Table 1. Generator Replacement for Each Case.

PMU Measurements
In order to have a more realistic observation of the frequency behavior and of the forecasting performance of the neural models, a dataset taken from several PMUs installed in the Nordic Power System is studied [53].
The frequency data used is from 2012, sampled at 1 Hz using an ABB RES521 in agreement with the 2005 IEEE standard [54]. A frequency event is considered a large disturbance if the frequency goes below 49.80 Hz. It is ensured that the events are at least 15 min apart, i.e., the time within which a frequency disturbance should return to normal according to the balancing agreement in the Nordic Region. This separation of events also removes any cascading frequency events, thus ensuring that the system is at a steady operating point prior to the disturbance, with sufficient primary reserve. Furthermore, it is important that the initial frequency is within [49.9, 50.1] Hz, in order to ensure that the system is not already responding to a frequency deviation outside the normal operating range.
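The three selection criteria above (threshold crossing, minimum separation and initial-frequency band) can be sketched as a simple filter over a 1 Hz trace. This is only an illustration under stated assumptions: the name `select_events`, the 30 s look-back used to read the "initial" frequency, and the choice to measure the 15 min gap from the last accepted event are all hypothetical and are not specified in the text.

```python
def select_events(freqs, threshold=49.80, min_gap_s=15 * 60,
                  pre_event_s=30, band=(49.9, 50.1), fs_hz=1):
    """Return sample indices at which a qualifying large disturbance starts."""
    min_gap = int(min_gap_s * fs_hz)      # minimum separation in samples
    lookback = int(pre_event_s * fs_hz)   # assumed look-back for the initial frequency
    events = []
    for k in range(1, len(freqs)):
        # An event starts when the frequency crosses below the threshold.
        crossed = freqs[k] < threshold <= freqs[k - 1]
        if not crossed:
            continue
        # Enforce the minimum gap, measured from the last accepted event.
        if events and k - events[-1] < min_gap:
            continue
        # Not enough history to read the pre-event frequency.
        if k < lookback:
            continue
        initial = freqs[k - lookback]
        if band[0] <= initial <= band[1]:
            events.append(k)
    return events


# Synthetic 1 Hz trace with three dips; the second is too close to the first.
trace = ([50.0] * 100 + [49.7] * 10 + [50.0] * 200 + [49.7] * 10
         + [50.0] * 900 + [49.7] * 10 + [50.0] * 50)
events = select_events(trace)
```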

Neural Models
Neural Network (NN) models are a set of architectures and algorithms inspired by the behaviour of the brain. The objective behind these models is to learn from examples by adjusting the connections between basic units known as neurons, in a way similar to human brain cells. NNs have numerous applications due to their capacity to find patterns in data and their adaptability in determining nonlinear relationships in an input-output mapping [55]. This process is carried out with a supervised learning approach, in which the network must be provided with a set of input-output pairs on which the model is trained. Depending on the application, supervised learning performs classification or regression tasks. In the present study, the NN is used as a regressor. Time-series forecasting is then implemented, introducing non-linearity and the ability to learn from data through a horizon window methodology. In the present work, two main NN models were employed: Nonlinear Autoregressive (NAR) and Long Short Term Memory (LSTM).

Nonlinear Auto-Regressive Model
NAR models employ an architecture based on inputs and two layers (see Figure 4), equivalent to multilayer perceptron architectures [56]. This approach works like the classical auto-regressive (AR) model, where an output is determined by a linear computation from its own previous values, according to an order p and the minimization of the error, which is white noise in most cases [57]. In the NAR case, a nonlinear mapping is provided by the NN model through its activation function, which is a hyperbolic tangent or logistic function. Equation (3) represents the NAR model, in which the a i coefficients are adjusted as in a classical AR model but are renamed as synaptic weights (w ij ). The order p of a NAR process determines the number of lags, and hence the number of input nodes. To evaluate the performance of the NAR architecture, different models were trained with a varying number of past (input) values (order). The model is built on past values of the time series y i , minimizing the error between the original time series and the estimated one, as: where y i is the time series to be modeled and a i are the coefficients of the model (called synaptic weights (w ij ) in other applications). Parameter b is a bias value used to adjust the function to be found.
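To make the structure concrete, the sketch below builds the lagged inputs for an order p and evaluates a one-step NAR prediction with a single tanh hidden layer and a linear output. It illustrates the general idea only: the helper names `lagged_inputs` and `nar_predict`, the weight values and the linear output layer are assumptions, and no training is performed here.

```python
import math


def lagged_inputs(series, p):
    """Build (inputs, targets): each input holds the p values preceding its target."""
    inputs = [list(series[t - p:t]) for t in range(p, len(series))]
    targets = list(series[p:])
    return inputs, targets


def nar_predict(window, W_hidden, b_hidden, w_out, b_out):
    """One-step NAR prediction: tanh hidden layer followed by a linear output.

    window:   the p most recent values of the series
    W_hidden: one row of p synaptic weights per hidden neuron
    b_hidden: one bias per hidden neuron
    w_out:    one output weight per hidden neuron
    b_out:    output bias
    """
    hidden = [math.tanh(sum(w * y for w, y in zip(row, window)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical series and weights: order p = 2, one hidden neuron.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
inputs, targets = lagged_inputs(series, p=2)
y_hat = nar_predict(inputs[0], W_hidden=[[0.5, -0.5]], b_hidden=[0.0],
                    w_out=[1.0], b_out=0.0)
```

Replacing `math.tanh` with the identity would reduce the prediction to a weighted sum of past values plus a bias, which is the classical AR form that the NAR model generalizes.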

Long Short Term Memory Model
Alternative NN architectures, known as recurrent neural networks (RNN), are also employed for forecasting, where the connectivity information forming the input-output mapping is stored as hidden states. The recursive structure of an RNN implies that fewer parameters than in NAR models are needed to learn the input-output relation in the data [58]. LSTMs are architectures based on RNNs, in which part of the information belonging to the hidden layer can be preserved and used at specific moments for forecasting [59]. These feedback connections inside the LSTM architecture are the main difference compared to NAR models. Figure 5 shows the typical LSTM structure.
This NN model has an architecture composed of gated memory units, or cells, which regulate three components related to the forecasting: the input, the output and a forgetting component. The function of the latter is to determine what must be remembered by the network. In this way, three gates are used to control the flow of information into and out of the cell [60]. The expressions for LSTM training can be described for the input (4), forget (5) and output (6) gates. The cell state, which is the long-term memory, is given by c(t) in (7). Equation (8) represents the output or hidden state, which is the memory that is put to use.
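For orientation, the standard LSTM formulation can be written in the notation of this section as sketched below, in the order input gate, forget gate, output gate, cell state and hidden state. The input x(t), the concatenation [h(t-1), x(t)], the bias terms b and the use of tanh for the cell update are assumptions taken from the usual textbook form; in particular, the recurrent weights are folded into the concatenated input here rather than written with a separate matrix W h .

```latex
\begin{aligned}
i(t) &= \theta\big(W_i\,[h(t-1),\, x(t)] + b_i\big) \\
f(t) &= \theta\big(W_f\,[h(t-1),\, x(t)] + b_f\big) \\
o(t) &= \theta\big(W_q\,[h(t-1),\, x(t)] + b_q\big) \\
c(t) &= f(t) \odot c(t-1) + i(t) \odot \tanh\big(W_c\,[h(t-1),\, x(t)] + b_c\big) \\
h(t) &= o(t) \odot \tanh\big(c(t)\big)
\end{aligned}
```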
where i(t), f(t) and o(t) correspond to the input, forget and output gates, and c(t) is the cell activation vector, which has the same size as the hidden vector h(t). Activation functions are denoted by the symbol θ, and W i , W f , W c , W q , and W h are the weight matrices related to the input, forgetting, cell (long-term memory), output and hidden layer, respectively [61]. The advantage of the LSTM over common RNN models is its improved behavior with respect to the vanishing gradient problem, which is a difficulty in the traditional back-propagation algorithm employed for training. The LSTM alleviates this problem through the connections that its cell maintains over extended periods of time [62], so that the structure based on control gates does not affect the training [60,63].
Figure 6 includes the steps of the employed methodology. The process begins with a normalization of the time series to avoid exploding values in the NN parameters. In the development of machine learning models such as NN, it is necessary to divide the data into at least training and validation sets. This division was implemented according to a horizon window approach. Training follows, searching for the best model by varying a minimal set of specifications and comparing performance metrics. Finally, the results are analyzed and compared to find the best strategy for forecasting.

Horizon-Window Approach
For the validation of models in ML techniques, different strategies can be employed [64]. In the present case, the forecasting of frequency time-series is evaluated out-of-sample (OOS). For this, a section at the end of the series is reserved for evaluation, due to the size of the datasets. In this way, only one evaluation on a test set is considered, and the evaluation metrics are then computed. The first portion of the time-series is called the training set, on which the parameters of the NN models are adjusted in search of error minimization. Then, the generalization of the models is analyzed on a test or evaluation set by comparing the forecast to the original time series. Even though the Nadir prediction is the first target of this document, the training section was selected so that it ends before the Nadir occurs, simulating a real context in which the model never knows when the Nadir will happen. For both scenarios, three lengths of the training section were employed. Table 2 shows the portion of the time-series used for training the models.
Figure 7 visualizes the described methodology, where segments of three different sizes are taken from the time-series for training. These three portions were selected before the frequency Nadir happened. Based on this data and the trained NN model, the next samples of the series are estimated as a forecast. In Figure 7, the output of the already-trained model is given by the blue line, whereas the forecast is represented by the red line. The green dashed line represents the nominal frequency reference value.
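The horizon-window idea can be sketched as an out-of-sample split followed by a multi-step forecast. In this illustration the names `horizon_split` and `recursive_forecast` are hypothetical, the recursive strategy (feeding each prediction back as an input) is an assumption about how the remaining samples are produced, and a simple linear extrapolator stands in for a trained NN.

```python
def horizon_split(series, train_fraction):
    """Out-of-sample split: the first fraction trains, the remainder evaluates."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    # int() truncates, so the training window never exceeds the requested share.
    n_train = max(1, int(len(series) * train_fraction))
    return series[:n_train], series[n_train:]


def recursive_forecast(train, steps, p, predict_one):
    """Forecast `steps` samples, feeding each prediction back as a new input.

    predict_one maps the p most recent values to the next value; in practice
    it would wrap a trained model.
    """
    window = list(train[-p:])
    forecast = []
    for _ in range(steps):
        y_next = predict_one(window)
        forecast.append(y_next)
        window = window[1:] + [y_next]  # slide the window forward by one sample
    return forecast


# Toy ramp signal; 6.25% of 160 samples gives a 10-sample training window.
series = [float(k) for k in range(160)]
train, test = horizon_split(series, 0.0625)
# Placeholder predictor (linear extrapolation), standing in for a trained NN.
forecast = recursive_forecast(train, steps=5, p=2,
                              predict_one=lambda w: 2 * w[-1] - w[-2])
```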

Models Training
According to the previously explained models, a comparison was carried out to analyze the differences between the NN models and to explore which is the more appropriate for the present application. For the simulated and PMU scenarios, three aspects were considered in the comparison: For the last two aspects, where parameters related to the architectures of the neural models were analyzed, an approach based on equivalence between the NAR and LSTM networks was used. The number of inputs was varied from one to five, in the same way as the number of neurons in the hidden layer. For the same comparison, the number of training epochs was set to 50, and the data was similar to that case. The computational experiments were carried out using Python 3.7 and the scikit-learn library.
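The search over one to five inputs and one to five hidden neurons can be sketched as an exhaustive grid. This is a hedged illustration rather than the procedure actually used: `select_model` is a hypothetical helper, and `score` is assumed to be any callable that trains a model with the given settings and returns its RMSE; a toy function replaces it here so the example is self-contained.

```python
from itertools import product


def select_model(score, max_lags=5, max_neurons=5):
    """Return (best_rmse, lags, neurons) over the 1..max grid of both settings."""
    best = None
    for lags, neurons in product(range(1, max_lags + 1),
                                 range(1, max_neurons + 1)):
        rmse = score(lags, neurons)
        # Keep the first configuration that reaches the lowest error.
        if best is None or rmse < best[0]:
            best = (rmse, lags, neurons)
    return best


# Toy scoring function standing in for "train the model and return its RMSE".
toy_score = lambda lags, neurons: abs(lags - 3) + 0.1 * abs(neurons - 2)
best = select_model(toy_score)
```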

Prediction Validation
In time series forecasting, there are different measures to quantify the performance of a model. For the present work, the Root Mean Square Error (RMSE) was employed for this task because it is widely used in this kind of application. Expression (9) describes the computation of the error as: where e i is the error between samples, given by the value of the original time series y and the forecasted value y k , and N is the number of samples.
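The RMSE is commonly computed as the square root of the mean of the squared errors e i over the N samples. A minimal Python sketch under that standard definition follows; the function name `rmse` and the sample values are illustrative.

```python
import math


def rmse(actual, forecast):
    """Root Mean Square Error between an original and a forecasted series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and equal in length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical frequency samples in Hz.
error = rmse([50.0, 49.8, 49.6], [50.0, 49.9, 49.5])
```

Because the errors are squared before averaging, a few large deviations (for example around the Nadir) weigh more heavily than many small ones.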

Results and Analysis
This section presents the results of the test system data simulations and real-time measurements. Both scenarios, the non-synchronous generation inclusion cases and PMU measurements are evaluated with NAR and LSTM models. The neural models of the simulated data have been evaluated with noise addition in order to test the trajectories forecast.

Non-Synchronous Generation Integration Cases
The data set of frequency observables is extracted as time-series with distinct initial conditions, e.g., gradual inclusion of non-synchronous generation, which produces different time-response conditions due to the inertia reduction in the system. Note that inertia estimation is out of the scope of this document. The frequency measurement for these cases uses Equation (2), which quantifies the frequency of the entire system. We initially evaluate the performance of the models on each of the simulated cases. Note that all the cases are evaluated with the same time-series future-trend percentages in order to compare the prediction performance.
Figure 8 shows the forecasting results for the cases using the NAR models with the percentage proportions shown in Table 2. Case C 1 shows that with the 6.25% margin the NAR model is unable to reconstruct the signal and produce an appropriate forecast, whereas with the 7.5% margin the trajectory is well reconstructed. In case C 2 , the 6.25% margin is sufficient to forecast the rest of the signal. In case C 3 , the forecast is achieved only with the 6.25% margin. For cases C 4 and C 5 , only the 7.5% margin achieves an appropriate prediction model of the frequency time-series.
Figure 9 shows the cases using the LSTM models. In general, the forecast is partially achieved only with the 7.5% margin. The performance of both methods is quantified in Figure 10, which reports, for all five cases, the number of lags, the number of neurons in the hidden layer and the RMSE. There, it can be seen that smaller portions needed more lags to make the prediction, whereas the number of neurons in the hidden layer increased as the window size grew.
Figure 11 shows the forecasting results for the cases using the NAR models with the same percentage proportions and added white noise.
In this case, the effect of noise is evident for all time-series, and the models performed worst for cases C 4 and C 5 . The first three cases exhibited better results with a window size of 6.25% (case C 1 ) and 7.5% (cases C 2 and C 3 ). Figure 12 shows the cases using the LSTM models and the time-series affected by white noise. As with the NAR models, the first three cases obtained better results with a training window size of 7.5%, while the last two cases did not reach a good prediction quality. As in the previous scenario, a comparison for the simulated cases with added white noise is shown in Figure 13. There, an increase in the RMSE values can be seen, but a different effect is found: the number of lags rises as the window size grows. Regarding the number of neurons in the hidden layer, the model trained with the 6.25% portion demanded more units on average in the NAR case.
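For readers who wish to reproduce a comparable noisy scenario, additive white Gaussian noise can be applied to a frequency trace as sketched below. The noise level used in the study is not stated, so `sigma` here is a purely hypothetical value, as are the function name `add_white_noise` and the sample trace.

```python
import random


def add_white_noise(series, sigma, seed=None):
    """Return a copy of the series with zero-mean Gaussian noise added."""
    rng = random.Random(seed)  # a local generator keeps results reproducible
    return [v + rng.gauss(0.0, sigma) for v in series]


# Hypothetical frequency trace in Hz and an assumed noise level of 5 mHz.
clean = [50.00, 49.95, 49.80, 49.70, 49.75]
noisy = add_white_noise(clean, sigma=0.005, seed=42)
```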

PMU Measurement Cases
Ten time-series cases are used for testing the neural methods. Note that the nature of the load, the control settings and the inertia conditions are unknown. Most of the cases have an initial decay due to a disturbance and subsequently show frequency oscillations after the re-establishment of the load and/or the control actions. Additionally, in most of the cases it is possible to observe the Nadir and the frequency overshoot. The exception is Case 3, which represents the stochastic frequency behavior under normal operation, which typically stays within a band of 50 ± 0.1 Hz. Since the PMU measurements are real frequency behaviours, the response contains oscillations after the frequency dip. This data is evaluated with the different margins listed in Table 2. Figures 14 and 15 show the results for the NAR and LSTM models, respectively. In general, the NAR models exhibit an adequate trajectory forecast of the given data. The results obtained for the LSTM models show performance comparable to the NAR ones. An exception is case 9, where none of the margins yields a forecast close enough to the original time-series. Finally, Figure 16 summarizes the comparison of the two NN strategies on the PMU cases. In general, it can be seen that the NAR models required more lags (inputs) on average than the LSTM models. Similarities were found in the number of neurons in the hidden layer. The RMSE presented higher values for the LSTM models.

Discussion
As mentioned before, forecasting the grid power frequency can be of use to TSO planners because of the primary frequency control regulations concerning Nadir adequacy [65], which prevent the minimum frequency (Nadir) from dropping below the UFLS relay settings, as the TSOs require. The nadir limit established by NORDEL, the joint body of the TSOs of the interconnected power systems of Finland, Norway, Sweden and parts of Denmark, is 49.4 Hz. Additionally, a frequency security assessment can provide operative regulations to keep the frequency inside the statutory limits for different values of aggregated system inertia and disturbances of different magnitudes [66].
Digital twin models of frequency behaviour can be used for diagnosing the physical system and assessing its health in advance [67]. The use of data collected from a system can be enhanced with ML techniques to help guarantee the correct behaviour of the system and foresee undesired events.
Studies on the use of windows to predict the next samples were reported in [68]. However, few works have treated this aspect, because the forecasting performance deteriorates as the number of time steps increases. This was taken into account in the present study by developing models with a maximum of five samples at the input [68].
The training error is used to choose the model. Accordingly, the values shown in Figures 10, 13 and 16 are associated with the training error. For this reason, the error for the first window (5% and 18%) belongs to the segment of the time-series in which the behavior is stable, without variations, before the Nadir point.
Currently, ML techniques are employed to provide decision support in many fields related to power systems [69,70]. In the present case, it is noticeable that the NN models could learn the behaviour of the time-series. Despite the fact that information related to the Nadir was not included in the training of the models, the generalization reached a performance that indicates an ability to represent the physical system. This can be useful for power system operators, who gain additional help in determining how to act. Even though the conventional LSTM model is the current state of the art for time-series forecasting, when the two methods were implemented, the NAR showed a better forecast approximation of the data after training. This is because the LSTM approach can be an overly sophisticated model for the present application. Comparisons between traditional forecasting methods and LSTM can be found in [71,72], and specifically for power system applications in [73,74]. However, a direct comparison between NAR and LSTM has not been reported for time-series in power systems. The closest case was reported for a traffic application in [75], where LSTM models worked better. Furthermore, the results depend on the specific application of the ML method and on the data, which can change the analysis completely, as seen in the present study.
One of the limitations of this study is the unidimensional analysis of the time-series: the present proposal seeks to assess the performance obtained using values of the same time series only. The inclusion of more variables in the forecasting could be explored in future work by employing vector auto-regressive models. Another aspect to be analyzed is the use of many more models for the forecasting. Deep learning approaches can be studied for the present application, including architectures with more layers. However, the need for large amounts of data to produce satisfactory results is a requirement that must be considered. As a first step in applying NN models, the present work analyzed smaller architectures. Finally, regarding the development of machine learning models, the exploration of the NN hyper-parameters could be more comprehensive, making use of more sophisticated techniques. Despite this, the obtained results made it possible to observe the effect of the chosen parameters on the frequency time-series forecasting.
Data-driven control systems are another possible use of data-trajectory models, as previously mentioned in [6]. Applications of data-driven or model-free controllers require the development of advanced control techniques, as proposed in [76], where optimal parameters were found while counteracting the disturbances in the system. Another example of a model-free approach is shown in [77], where stochastic adaptive control is applied. Since large power systems present several stochastic conditions to be analyzed in such a fashion, a data-driven approach is used in [15] for non-synchronous generation impact analysis. Data-driven controllers are also used to detect anomalies in power systems data caused by false injection attacks, as mentioned in [78].
Data-driven electromechanical oscillations controllers are part of such development [79], where oscillations-data paths can be identified for power systems monitoring [80]. In general, machine learning data models can provide strategies for further power systems data interpretation and control. From these applications, further developments are expected in this area.

Conclusions
This paper has shown the application of a strategy for time series forecasting based on NAR and LSTM neural models to the case of the power grid frequency. Simulations with various scenarios were analyzed, including non-synchronous generation with changes in the inertia and frequency response of the system, as well as cases in which noise was added. Additionally, PMU measurements were analysed to confirm the possibility of using the approach for field applications or power system planning.
The power grid frequency forecasting presented specific particularities that can be learnt by the classical NAR models. Although the LSTMs reached comparable prediction performance, the trained models based on the more straightforward NAR technique were slightly better for this particular application.
Future work will need to involve a larger amount of data and more cases in order to test the performance of the neural methods. In light of the limitations described above, a reasonable next step would be to try NAR models with exogenous inputs (NARX) in search of improved results. Only one univariate time-series data set was used for each training, since each frequency response and its grid conditions are unique. Additionally, Vector Autoregressive (VAR) models with nonlinearity based on NN models will also be considered in further studies, given the possibility of analyzing and correlating simultaneous measurements such as inertia or active power. Another aspect to explore is the use of deep learning models with more layers or cells, specifically for the LSTM models.