1. Introduction
The share of renewable energy sources (RES) in the energy mix of most European countries is rising steadily, and the process is irreversible. One reason is that the cost of energy generated from RES is becoming increasingly competitive with the price of energy from conventional sources, such as fossil fuels, gas, or nuclear energy [
1]. Photovoltaic and wind sources take a special place among renewable energy sources on account of their potential and availability. For example, the energy potential of wind is estimated at 52 TW [
2], and solar energy at over 1.94 TW [
3]. Primary energy resources from those sources strongly depend on the existing weather conditions. That, combined with a growing share of renewable energy sources in the overall energy balance of countries/regions, forces energy producers and electric power system managers to collect increasingly more data, including, in particular, data on the current and predicted energy production from those sources.
There are two main factors that necessitate energy generation forecasting:
The proposed models for forecasting the electricity production of wind farms are the answer to problems in balancing the energy market from the day-ahead and intra-day perspectives. They make it possible to forecast energy production based on meteorological conditions, and thus allow the accuracy of the forecasting models of energy prices described in [
4,
5,
6,
7] to increase. The authors [
4] emphasize the stochastic nature of wind farm production and its negative impact on the predictability of energy prices on the market. In an article [
6] on the management of the multi-energy microgrids market, they consider five cases where the variability of wind farm production is predicted based on wind speed recording. Similarly, in [
7], it is the only factor determining the amount of energy produced from wind farms. The analysis of the impact of weather factors carried out in this article, and the proposed method may have a positive impact on improving the credibility of microgrid and energy price management models. Quickly developing IT technologies allow the implementation of forecasting models that are complex in computational terms. They help to minimize forecasting errors to an increasing extent, thus enabling effective optimization of the load distribution and energy outflows, increasing energy supply security, and reducing the cost of its production and transmission. In addition, the adverse impact of the energy sector on the natural environment is reduced. A quick development of energy production volume forecasting can also be observed for wind power plants, which are the focus of the discussion in this paper. This area is being actively developed by research units from across the world. That is demonstrated by numerous papers devoted to this topic published by first-rate publishing houses [
8,
9,
10,
11,
12,
13,
14,
15,
16]. The forecasting models they describe can be broken down into [
16]:
Physical models (e.g., [
17,
18]).
Statistical models.
- Based on data sets:
  * Naive models (persistence method) (e.g., [19]);
  * Grey forecasting models (e.g., [20]);
  * Models based on Kalman filters (e.g., [21]);
  * ARMA (autoregressive moving average) models (e.g., [22]);
  * ARIMA (autoregressive integrated moving average) models (e.g., [19,23,24]).
- Based on “artificial intelligence”:
  * Artificial neural networks (e.g., [25,26,27]);
  * Support vector machines (e.g., [28]);
  * Wavelet analysis (e.g., [29,30]).
Hybrid models (being a combination of the above) (e.g., [
26,
29,
31]).
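As a point of reference for the simplest class listed above, the naive (persistence) method can be sketched as follows. This is a minimal illustration, not the implementation used in the cited papers; the function name and the sample values are hypothetical.

```python
def persistence_forecast(history, horizon):
    """Naive (persistence) forecast: repeat the last observed value."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical hourly farm output in MW (illustrative values only).
measured_mw = [4.2, 5.0, 5.6, 6.1]
forecast_mw = persistence_forecast(measured_mw, horizon=3)
print(forecast_mw)
```

Because it needs no training, such a forecast is commonly used as a baseline against which more complex models are compared.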
Below in this paper, a review of the energy production forecasting methods is provided, with a particular focus on those that represent an alternative to the neural networks proposed in this article or methods employing neural networks in hybrid configurations. Those solutions have many proponents, but also opponents, who argue that the method uses fuzzy definitions, and hence its results are ambiguous and imprecise. It was long believed that they were not suitable for multi-stage computations (i.e., when the results of the first stage are inputs in the next one) [
32]; however, deep learning networks come in handy here. Many years ago, George Box, a British statistician who worked on quality control, time series analysis, and Bayesian inference, titled one of the sections of his paper [
33] “All models are wrong, but some are useful”. The article emphasizes that mapping reality with a simple model is a very difficult task. Linear models are well known and well described by mathematicians, but they rarely reflect reality accurately. Non-linear problems, which are closer to reality, are more difficult to define and solve. In those cases, artificial intelligence-based models come in handy: even though they are unable to provide exact solutions, with properly defined attributes of the identified facilities they can find the relations underlying a given phenomenon and solve the problem with satisfactory accuracy.
Many statistical models can be distinguished, e.g., WPPT, FUGS, and other models described below. They were selected on account of a different approach to the data processing and forecasting process. Authors of the papers referred to above proposed their own forecasting methods based on individual input and verification data sets. The WPPT (Wind Power Prediction Tool) model [
34] allows the forecasting of energy for a time horizon of up to 48 h, and a resolution of 30 min. Numerical weather forecasts are used for the forecasting, as well as wind farm measurement data updated on an ongoing basis, which can update the non-linear model on a continuous basis. The model combines two approaches by forecasting the output of the individual turbines within the farm and the output of the entire wind farm.
The FUGS (Forecasting Using Gaussian Processes) model [
14] is dedicated to short-term (h-24 h) forecasting of wind energy production. It comprises two models: GP-CSpeed and GP-Direct. GP-Direct is built on data coming directly from numerical weather forecasts (without their additional adjustment). The second model, named GP-CSpeed, first adjusts the input data, and then the model itself is built on the basis of already adjusted values. Wind velocity adjustment involves filtering out the set of data that goes beyond the permitted limits. The same forecasting model is used in both cases.
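The data adjustment step described for GP-CSpeed (filtering out values beyond permitted limits) might look like the sketch below. The limit values, the function name, and the sample records are assumptions made for illustration; the source does not state the limits actually used.

```python
def filter_wind_speeds(samples, v_min=0.0, v_max=40.0):
    """Keep only (speed, power) pairs whose speed lies within [v_min, v_max].

    v_min and v_max are hypothetical limits in m/s.
    """
    return [(v, p) for v, p in samples if v_min <= v <= v_max]


# Hypothetical records: (wind speed in m/s, power in MW).
raw = [(5.2, 0.4), (-1.0, 0.0), (12.3, 1.9), (97.0, 0.0)]
clean = filter_wind_speeds(raw)
print(clean)
```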
The SVM (Support Vector Machine) [
15] is an effective statistical tool that can solve multi-parameter non-linear problems. SVMs are learning machines based on support vectors. An important advantage of the SVM over MLP (Multilayer Perceptron) neural networks is that it turns learning into a task that typically has a single minimum of the objective function. A disadvantage of such solutions is the dependence of the results on the adopted values of constant parameters, such as the width of the Gaussian function, the factor for a polynomial kernel, the regularization constant C (reducing network complexity), or the tolerance [35].
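To make the role of two of these constants concrete, the sketch below evaluates a Gaussian kernel for several widths and the tolerance-based (epsilon-insensitive) loss commonly used in support vector regression. The symbol names `sigma` and `epsilon` and the sample points are assumptions for illustration, not values from the cited papers.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def epsilon_insensitive_loss(target, prediction, epsilon):
    """Errors smaller than the tolerance epsilon are not penalized."""
    return max(0.0, abs(target - prediction) - epsilon)


# Two hypothetical inputs: (wind speed in m/s, wind direction in degrees).
x1, x2 = (6.0, 270.0), (7.0, 280.0)
for sigma in (1.0, 10.0, 100.0):
    # A wider kernel treats the same two points as more similar.
    print(sigma, gaussian_kernel(x1, x2, sigma))
```

The same pair of points moves from "unrelated" to "nearly identical" as the width grows, which is why the fitted model is sensitive to this choice.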
The topics of network load and generation forecasting are very similar and do not differ in terms of the structure of the applied model. Paper [
36] describes topics relating to the load forecasting of a specific part of the power supply system, being the effect of energy consumption and generation, paying particular attention to the challenge of RES generation forecasting. Paper [
37] presents the options for load forecasting using a hybrid method based on artificial neural networks and Mixed Integer Linear Programming techniques. In the latter method, models that use historical and current electrical and forecasting data, combined with weather forecasts, are capable of effectively forecasting the system load. As noted by the authors, hybrid models that employ artificial neural networks and linear algorithms are mutually complementary, ensuring redundancy. The forecasts are generated by linear algorithms, whereas the neural network “fine-tunes” them. That makes it possible to incorporate the non-linearity of the predicted load and to achieve high accuracy.
To improve the forecasting accuracy, hybrid models are used that combine the advantages of both model types. A practical example of a hybrid forecasting model is described in paper [
8]. The model presented there is dedicated to energy production prediction for fourteen wind farms. The proposed hybrid model, combining neural networks and support vector machines, was compared with other models based on either neural networks or support vector machines only. Paper [
8] describes ANN (Artificial Neural Networks) learning on the basis of historical data covering one year. The model considers wind velocity and direction as parameters underpinning the volume of energy production. A simple neural network was proposed, consisting of a single hidden layer, because adding new layers did not improve the results. There were from 15 to 25 neurons in the hidden layer, depending on the wind farm for which the forecasts are made. It follows from the observation of the authors of [
8] that the qualitatively worst forecasts covered the power plants located in rough terrain.
In the hybrid model [
29], the data were divided into four intervals on the basis of IEWT (improved empirical wavelet transform), and then processed by a support vector network, using the LS-SVM (least-squares support vector machines) method to find its parameters. The least-squares (LS) method is a tool that minimizes the mean squared error when determining a linear regression or estimating non-linear model parameters [
35]. At the same time, the BSA (Bird Swarm Algorithm) was used to achieve more precise forecasts [
38]. As noted by the authors, that algorithm ensures better forecast accuracy than other models based on biological systems, such as Differential Evolution (DE), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO). BSA was used to select the LS-SVM network parameters, which, as reported by the authors, markedly reduced the complexity of computations and improved forecasting accuracy.
An example of a neural network-based hybrid model with error backpropagation is described in paper [
39]. The authors of the paper effectively used that type of network in combination with NARX (Non-linear Autoregressive with External Input). NARX network models perform better with short-term forecasts, whereas backpropagation network models ensure better results for longer forecasting horizons. The following inputs were used for the forecasting: real-time production, structural information, NWP (Numerical Weather Prediction), and power forecasts from various providers.
Many articles have been published referring to the prediction of energy production from wind farms using NWP data for forecasting. Unfortunately, these models are also burdened with an error, which can be minimized with the knowledge of the specificity of the operation of the wind farm. The authors [
40] propose the use of additional data streams for forecasting. In addition to the classic Deterministic weather research and forecasting (WRFD), they also use Radar weather research and forecasting (RWRF) to reduce the horizontal resolution grid to 2 km. Increasing the amount of data allowed them to obtain better predictions; in most cases the MAPE error was below 8%. Promising results were obtained for the XGBoost (Extreme gradient boosting) and ANN models. In this case, only the wind speed was used for the forecasts, which seems to be a weakness of the models. In paper [
41], long-term data from 2009 to 2011 were used to train the model. Wind direction was taken into account as an additional factor. Data from 12 points of the NWP forecasting grid were used to improve forecasts. Model verification was performed only for data from two months. For forecasting, support vector regression (SVR) was used, extended with stacked-denoising-autoencoder (SDAE) and bat algorithm (BA) optimization. The reduction in the NRMSE error to 11.6% is visible for the three hidden layers of the SDAE model. Larger errors were obtained for a smaller and larger number of layers (with one, two, four, or five hidden layers), which may indicate the need to select the appropriate number of layers. The combination of the SDAE-SVR-BA model allowed for a 1–2% increase in accuracy. The authors of [
42] proposed alternative models combining wave division (WD), an improved gray wolf optimizer based on fuzzy C-means clusters (IGFCM), and a Seq2Seq model with an attention mechanism based on the long short-term memory model (LSTMS). The advantage of LSTMS models is that they have feedback in their structure. As a result of the analysis, two models were selected for which the NMAE and NRMSE errors were the smallest in a one-day perspective. The first is the IGFCM-LSTMS model (Seq2Seq model with IGFCM) and the second is its modification, WD-IGFCM-LSTMS, which takes the wave division into account. The models generated forecasts with an NMAE error of 10.28–10.32% for Wind Farm A and 9.53–10.18% for Wind Farm B. Additional input data streams to the model are included in the paper [
43]. Data from three NWP suppliers and many points located in the vicinity of the power plant were used. The power plant consisted of 68 wind turbines. Combining the data streams in the MIX model reduced the NMAE error of the wind speed forecast from 6.7–8% to 6.1% and that of the wind power forecast to 6.91%. The authors of the article confirm the benefit of using data from different providers of NWP models. In [
44], the LSTM model with the modification of wind power ramp events (WPREs) was proposed. The method was tested for three wind farms. As a result of the prediction, the NMAPE error was obtained in the range of 9.4–15% (depending on the tested object). As the authors emphasize, better results were obtained using the WPRE functionality. The model better identifies changes in the type of ramp events, which has increased its accuracy. The authors of [
45] used alternative machine learning methods such as support vector regression (SVR), random forests, and artificial neural networks to forecast farm power. Data with hourly resolution were used, such as wind speed at 10 and 80 m above sea level. As the authors emphasize, in most cases, the highest accuracy was obtained for classic artificial neural networks. Only for one power plant was the SVR method better.
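The error measures quoted in this review (MAPE, NMAE, NRMSE) can be computed as in the sketch below. Exact definitions differ between papers; here the normalization by installed capacity and the skipping of zero measurements in MAPE are assumptions, and the sample values are hypothetical.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error in %, skipping zero actual values."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


def nmae(actual, forecast, capacity):
    """Mean absolute error normalized by installed capacity, in %."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) for a, f in zip(actual, forecast)) / (n * capacity)


def nrmse(actual, forecast, capacity):
    """Root mean squared error normalized by installed capacity, in %."""
    n = len(actual)
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n
    return 100.0 * math.sqrt(mse) / capacity


# Hypothetical hourly values in MW for a farm with 30 MW installed capacity.
actual = [10.0, 20.0, 0.0, 5.0]
forecast = [12.0, 18.0, 1.0, 5.0]
print(mape(actual, forecast), nmae(actual, forecast, 30.0), nrmse(actual, forecast, 30.0))
```

Normalizing by capacity keeps the measure defined when production is zero, which is one reason NMAE and NRMSE are often preferred over MAPE for wind power.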
Artificial neural networks were selected for this study because this method
is easy to implement, verify and improve;
enables the development of models for various objects, as presented in this article;
allows the use of data of various natures (meteorological, electrical);
makes it relatively easy to take into account additional factors affecting the amount of energy produced;
enables the continuous learning of the neural network. As a result, the quality of the forecast improves over time.
The purpose of this paper is to present new short-term (24 h) models for wind power plant generation forecasting. Forecast sensitivity to the specific weather factors (
Section 2.2 and
Section 2.3) was tested on the basis of actual measurements performed at four wind farms (
Section 2.1). The analysis allowed us to select the most advantageous input data set structures and improve forecasting accuracy. The forecasting models proposed in this paper (
Section 2.4) combine a statistical model using artificial neural networks with numerical weather forecasts. To improve the forecasting effectiveness, several versions of the model were proposed and subsequently tested.
2. Data Set and Methods
The basic object for which energy production forecasts are performed is a single wind turbine. Forecasting for an entire farm comprising many turbines is more complex. In that case, additional factors should be considered, such as the number of turbines currently working or the shape of the terrain around each turbine in the farm. A change in wind velocity close to the turbine activation threshold may result in the deactivation of some turbines. Start-up is time-consuming, which affects the volume of actual production and the forecasting error.
On the basis of data from four wind farms, factors that impact the energy production value were analyzed, and a sensitivity analysis was performed. Only data that can be monitored for a given object and that are potentially significant for forecasting were selected for the analysis. The descriptions of the respective facilities are heterogeneous because the sets of recorded values and the available information differ. The collected database and the analysis based on it are intended to simplify the future development of forecasting models for new facilities.
In the case of the proposed forecasting methods that are based on artificial intelligence, data from the power plant provide information about key factors influencing its operation.
2.1. Analyzed Wind Power Plants
Figure 2 shows the location of the contemplated power plants on the map of Poland.
Data come from the following facilities:
Wind power plant No. 1 (FW1), located in the central area of the Baltic Sea coast, whose total installed capacity of wind turbine units equals 8 × 2.5 [MW];
Wind power plant No. 2 (FW2), located in the northwestern part of the country, whose total installed capacity of wind turbine units equals 60 × 2.0 [MW];
Wind power plant No. 3 (FW3), located in the south-central part of the country, whose total installed capacity of wind turbine units equals 15 × 2.0 [MW];
Wind power plant No. 4 (FW4), located in the central-western part of the country, with a total installed capacity of 1.0 [MW]. It comprises a single turbine, so local wind variability directly affects changes in the generating unit’s output. That hinders the forecasting process, which is why it was decided to include this facility in the analysis.
Facilities were selected so that conclusions and results of the analysis could enable a proposal for a universal model that is scalable for other wind power plants located in the country.
The facilities identified above were selected on account of their variability in terms of the following:
Number and unit output of the turbines;
Location;
Type of generators/gears;
Elevation of the hub’s axis;
Rotor’s revolving surface area.
2.2. Dependence of Output on Wind Velocity
Figure 3 presents the declared power characteristics of an example wind turbine of power plant FW3 prepared on the basis of the manufacturer’s documentation [
46] and as-measured historical data.
When analyzing
Figure 3, the following can be identified:
The theoretical power characteristics are within the area demarcated by the measurements. That is because the wind turbine operates under variable weather conditions, different than those adopted when the theoretical power characteristics were developed;
There is a set of points on the actual power characteristic for the wind velocity from 4 [m/s] to 17 [m/s] for which power P = 0. This may be caused by an emergency or operating shutdown of the turbine, for example, if the wind velocity is too high, or due to icing or maintenance.
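For orientation, a theoretical power characteristic of the kind compared with the measurements can be approximated by a piecewise function. The sketch below is an idealization: the cut-in, rated, and cut-out speeds and the cubic ramp are assumptions chosen for illustration, not the manufacturer's data for FW3; only the 2.0 MW unit rating comes from the facility description.

```python
def idealized_power_curve(v, cut_in=3.0, rated_speed=12.0, cut_out=25.0, rated_power=2.0):
    """Idealized turbine output in MW for wind speed v in m/s.

    All speed thresholds are hypothetical defaults.
    """
    if v < cut_in or v >= cut_out:
        return 0.0  # below start-up speed or shut down at high wind
    if v >= rated_speed:
        return rated_power
    # Cubic ramp between cut-in and rated speed (a common simplification).
    fraction = (v ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_power * fraction


for speed in (2.0, 6.0, 9.4, 15.0, 26.0):
    print(speed, round(idealized_power_curve(speed), 3))
```

Measured points with P = 0 at speeds inside the operating range, such as those noted above, would fall below this curve and indicate shutdowns rather than a lack of wind.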
The wind velocity distribution within the wind farm, even though it is located in a relatively small area (270 ha), is not even. This is shown by
Figure 4, presenting the annual deviation from the average electricity production for the respective turbines E1–E15 comprising wind power plant FW3.
Figure 5 presents wind power plant output variability as a function of wind velocity value. The subsequent figures present similar characteristics for the selected turbines within power plant FW3 (
Figure 6).
The volume of electricity production from wind power plants depends on many factors relating to both the construction and the type of generation, as well as the interconnected power equipment. Of significance are also weather factors, marked by strong variability and largely independent of humans. As time passes, power plant parameters are also subject to change as a result of the equipment aging process and reduced efficiency, which impacts the volume of energy production by the power plant.
2.3. Impact of Weather Factors on the Wind Power Plant Operation
The volume of electricity generated by the wind power plant strongly depends on weather factors, or the locations of the respective turbines within the farm. In the first case, it follows from the laws of physics governing the turbine’s operation, and in the second, from the shape of the terrain and mutual impact of the turbines, the so-called “overshadowing”. Wind power plants are typically located over a large area, and in effect, wind velocity distribution is uneven.
Wind velocity clearly has the largest impact on the volume of energy generation, but other impact factors were also analyzed, such as temperature, pressure, and wind direction.
2.3.1. Impact of Ambient Temperature on the Wind Power Plant Operation
The analysis of the temperature impact, on account of the collected data set, was performed for wind power plants FW1, FW3, and FW4. The source data set is, in each case, divided into two subsets. One comprises data related to ambient temperature T
C, and the other for temperature T
C. The selection of the ranges was arbitrary, so that the adopted temperature limit values enabled reconstruction of the wind power plant power characteristics, meaning that a sufficient quantity of data was present in the respective subsets. Temperature in the set ranged from −17 °C to 39 °C. The selection of limit temperatures allowed us to obtain sets in a duplicated number (relative to the temperatures). Neural network models were developed for the two data sets received (for each power plant) (
Figure 7).
Two prepared data sets (set A—T C, set B—T C) were used in the process of organizing the learning set and in the process of network learning. Wind velocity V and power P were provided for the neural network input and output.
After the end of the optimization process, understood as minimization of the mean squared error, networks were obtained that allowed us to reconstruct the wind power plant characteristic P = f(V) for the highlighted subsets.
In both cases, the learning covered a neural network of identical topology. The network comprised five neurons with a logistic activation function in each of the first two layers and a single linear neuron in the output layer. The results of the wind power plant power curve reconstruction by the network are shown in
Figure 8,
Figure 9 and
Figure 10.
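The network structure described above (one input V, two layers of five logistic neurons, one linear output P) can be sketched as a forward pass. This is an illustrative sketch only: the weights below are random and untrained, the bias terms are an assumption, and all function names are hypothetical.

```python
import math
import random


def logistic(s):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-s))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random initial weights and biases (untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def forward(v, layers):
    """Map wind velocity v to the network's power estimate."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = dense([v], w1, b1, logistic)      # first layer: 5 logistic neurons
    h2 = dense(h1, w2, b2, logistic)       # second layer: 5 logistic neurons
    return dense(h2, w3, b3, lambda s: s)[0]  # single linear output neuron


rng = random.Random(0)
layers = [init_layer(1, 5, rng), init_layer(5, 5, rng), init_layer(5, 1, rng)]
print(forward(9.4, layers))
```

Training one such network per subset and evaluating it over a grid of wind velocities would yield one reconstructed curve P = f(V) per subset.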
Figure 8 shows two power characteristics, which are the neural network’s response to low (T
C) and high temperatures (T
C).
Figure 9 and
Figure 10 include, as in the previous case, two power characteristics for power plants FW3 and FW4, respectively.
It follows from
Figure 8 to
Figure 10 that temperature affects the shape of the wind power plant power characteristic. In the case of wind power plant FW3, the largest difference is for the wind velocity of 9.4 m/s and is equal to 1.63 MW. That represents over 8% of the wind power plant’s rated output. The differences for wind power plants FW1 and FW4 grow bigger as the wind velocity increases, reaching as much as 10–15% of the power plant’s rated output.
2.3.2. Impact of Atmospheric Pressure on the Power Plant Operation
The impact of atmospheric pressure was analyzed in the same way as for the temperature. On account of the available data, the analysis was performed for one wind farm, FW1. Two data sets were separated. One included data for pressure lower than 997 hPa, and the other data for pressure greater than 1018 hPa. The limit pressures were selected so as to obtain sets of similar size, enabling reconstruction of the power characteristic. Neural network models were developed for the two data sets received (
Figure 7). Neural models were proposed having the same structure as for the analysis of the impact of the ambient temperature on wind power plant operation, reconstructing the wind power plant’s power characteristic. Results of network learning are shown in
Figure 11.
The pressure and power correlation coefficient for the wind power plant equals −0.24, which may suggest that this factor has two times the impact of temperature. In reality, the impact may not be that significant due to proportionately small pressure changes over the year (pressure varied from 982 to 1035 hPa). The average pressure in the analyzed period of nearly two years was 1012 hPa, which, considering the 53 hPa difference between the highest and lowest pressure, gives value changes equal to only about 5% of the average. The results shown do not confirm a significant impact of atmospheric pressure on the power characteristic curve of wind farm FW1.
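The relative pressure variation quoted above follows directly from the reported extremes and average:

$$\frac{p_{\max} - p_{\min}}{p_{\mathrm{avg}}} = \frac{1035 - 982}{1012} = \frac{53}{1012} \approx 0.052 \approx 5\%$$

The symbols $p_{\max}$, $p_{\min}$, and $p_{\mathrm{avg}}$ are introduced here only for this calculation.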
2.3.3. Impact of Wind Direction on the Wind Farm Operation
Eight data sets were selected, where the wind direction changed every 45°. The analysis was performed for the cases discussed above. Network learning results for wind farms FW1 and FW3 are shown in
Figure 12 and
Figure 13.
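Dividing the measurements into eight 45° direction sets can be sketched as follows. The source does not state where the sector boundaries lie; centering the sectors on the compass points (N covering 337.5° to 22.5°, and so on) is an assumption, and the sample records are hypothetical.

```python
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def direction_sector(direction_deg):
    """Return the 45-degree sector label for a wind direction in degrees."""
    index = int(((direction_deg % 360.0) + 22.5) // 45.0) % 8
    return SECTORS[index]


def group_by_sector(samples):
    """Split (direction, speed, power) records into eight (speed, power) sets."""
    groups = {label: [] for label in SECTORS}
    for direction_deg, speed, power in samples:
        groups[direction_sector(direction_deg)].append((speed, power))
    return groups


# Hypothetical records: (direction in degrees, speed in m/s, power in MW).
records = [(350.0, 6.0, 4.1), (10.0, 7.5, 6.0), (300.0, 5.0, 2.2), (180.0, 4.0, 1.0)]
groups = group_by_sector(records)
print({label: len(rows) for label, rows in groups.items()})
```

Counting the records per sector in this way also shows quickly whether a given direction has too few samples to cover the full range of wind velocities.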
In analyzing the charts from
Figure 12, one can notice that the differences in the velocity range from 3 to 8 m/s are insignificant, not exceeding 0.5 MW. At higher velocities, they grow to over 1 MW. There are slightly greater differences for wind power plant FW3 (
Figure 13). However, this may be caused by the learning set being too small for the selected directions, which prevented the achievement of the full range of wind velocity and power variability (for power characteristic-building purposes).
Figure 14 shows that the dominant directions are northern and northwestern, and the wind velocity variability for the remaining directions is small.
It is difficult to differentiate a trend in the analyzed data set that would indicate that wind direction impacts the power characteristic curves. The curves intertwine in the entire selected range, and no single or outlier curve can be singled out.
Similar information is supplied by the correlation table, where the factors for wind direction are 0.05–0.14.
Table 1 shows the factors broken down into the respective wind directions. The correlation coefficient for wind direction and velocity broken down into 8 angular ranges varies from −0.19 to 0.1, which may show that there is no correlation between those parameters.
The volume of wind farm energy production strongly depends on weather factors that vary over time and exist in the power plant’s immediate vicinity. The more precise the estimation of the value, the less expensive its balancing in the power system. However, these are not the only factors that affect energy production by those plants.
2.3.4. Summary of the Impact of Weather Factors
Table 2 presents the Pearson correlation coefficient among the analyzed factors in relation to the power plant’s output.
The strongest correlation is with wind velocity, which is expected for wind farms. The correlation with the number of working turbines is slightly weaker. This may be due to production variability over time and scheduled maintenance. The number of working turbines provides information about the potential maximum output of the power plant, while the variability itself depends on weather factors. For wind direction, a noticeable positive correlation appears only for power plant FW1. Pressure and temperature are also interesting factors, showing negative correlation coefficients.
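For reference, the Pearson correlation coefficient between a weather factor and the output can be computed as in the sketch below. The function name and the sample series are hypothetical and are not taken from the measured data sets.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical series: wind velocity in m/s and farm output in MW.
velocity = [3.0, 5.0, 7.0, 9.0, 11.0]
output = [0.5, 2.0, 6.0, 12.0, 18.0]
print(pearson(velocity, output))
```

The coefficient captures only linear dependence, so a strongly non-linear relation such as the power curve can give a value below 1 even when the output is fully determined by wind velocity.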
Other factors that impact instantaneous active power generated by the power plant include, for example:
Instantaneous power factor;
Auxiliary energy consumption;
Wear of elements over time;
Scheduled maintenance and overhauls.
In the case of wind power plants, only some of that information is recorded and entered into the installed SCADA system. Forecasting models currently in development, in most cases, leave out the impact of those factors on production volume. Such data are available only to turbine manufacturers/maintenance contractors and are not made available to the user.
2.4. Forecasting Models Dedicated to Wind Power Plants
Various neural network structures were proposed and tested to arrive at the most advantageous option independently for each analyzed object. The proposed forecasting models are simple in design, and demonstrate quick learning process and network adaptation for power plants differing from each other in terms of the number of turbines (installed capacity), their location, and in technical terms. The selected models were tested on data recorded by the analyzed facilities. Long-term measurement data were used for a period of over 1 year, as a result of which it was possible to consider the variability of weather factors characteristic of the respective seasons of the year. Automation of the process enabled the models to learn multiple times, with the weights of the neural networks set so as to minimize the average absolute forecasting error.
The quality of energy generation forecasts strongly depends on the accuracy of numerical weather forecasts and the capacity, for example, of neural networks, to adjust them. The development of technologies allows the fitting of wind turbines with measurement and control devices that are more functional. In effect, ever more precise data are obtained. Bigger disk arrays of servers that support SCADA systems enable an increase in the data sampling frequency. All those factors make it possible to obtain more accurate forecasts in the future. The introduction of signals from those devices to the neural network models does not require a significant workload. In effect, it is possible to improve the model in the short term and with low financial expenditures through the interpretation of phenomena that could not be identified earlier.
The capacity to generate forecasts for a selected time horizon strongly depends on the available input data used to develop them. Considering the collected data set, the analysis presented here proposes models dedicated to short-term forecasts with a horizon of a few to several hours.
Neural Network Models
Modeled on biological systems, artificial neural networks (ANN) are used to process information. The authors of papers such as [
8,
48,
49] and books such as [
50,
51,
52] show that they are effective tools for solving forecasting problems. That happens mainly because the models have the capacity to approximate any multi-dimensional, non-linear function (
Figure 15). The approximated function is obtained in the network learning process. This sets the model apart from the other models. The learning process consists of the creation of the so-called learning set, i.e., a set of input and output parameter values of an object. In the learning process, the neural network modifies its parameters to find their relations. In the learning process, the model is optimized on the basis of historical information from the object.
Artificial neural networks are information processing systems based on the concurrent work of many neurons. A neuron (similar to its biological counterpart) is an element with many inputs and one output. A weight is associated with each input and is modified in the learning process. The model of a single artificial neuron, together with the method of learning and calculating responses, is shown in
Figure 16.
Neuron weights are modified in the learning process so as to minimize the quality factor (1):
$$Q = \frac{1}{2}\left(z - y\right)^{2}, \qquad y = \sum_{i} w_i x_i \tag{1}$$
where $w_i$ is the weight of the $i$-th input, $x_i$ is the value of the $i$-th input, $z$ is the actual/measured output value, and $y$ is the learning result.
The learning process minimizes function (1) iteratively using the steepest descent algorithm. For a simple linear neuron, the process boils down to adding to the weight vector (or subtracting from it, if the error is negative) a part of the input signal vector (2):
$$w_i^{(t)} = w_i^{(t-1)} + \eta\,\delta\,x_i \tag{2}$$
where $w_i^{(t)}$ is the new, adjusted value of the $i$-th weight, $w_i^{(t-1)}$ is the value of the $i$-th weight for iteration $t-1$, $\eta$ is the network learning speed factor, $\delta$ is the learning error (the difference between the measured output value and the learning result), and $x_i$ is the value of the $i$-th input.
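The update rule (2) for a single linear neuron can be illustrated with a short sketch. It assumes a neuron without a bias term whose response is the weighted sum of its inputs; the learning speed factor, the number of epochs, and the training data are hypothetical.

```python
def train_linear_neuron(samples, n_inputs, eta=0.05, epochs=500):
    """Fit the weights of y = sum(w_i * x_i) with the delta rule."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for x, z in samples:
            y = sum(w * xi for w, xi in zip(weights, x))  # neuron response
            delta = z - y                                 # learning error
            # Add a part of the input vector, scaled by the error, to the weights.
            weights = [w + eta * delta * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical noise-free data generated from z = 2*x1 - 1*x2.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0)]
weights = train_linear_neuron(samples, n_inputs=2)
print(weights)  # the weights should approach (2, -1)
```

If the learning speed factor is chosen too large relative to the magnitude of the inputs, the same loop diverges, which is why this factor is usually tuned or scheduled.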
The regular structure of neural networks makes it possible to generalize the learning algorithm used for a single neuron and apply it to the entire network. The network learning method that is most often used is the error backpropagation method (BP) [
54]. The Levenberg–Marquardt algorithm, a non-linear optimization algorithm, is also used. It is an iterative algorithm that combines the steepest descent method and the Gauss–Newton method. Compared with the BP method, it is faster, but at the expense of a higher memory requirement for the computations [
55]. It is based on the non-linear least-square problem-solving algorithm written by Marquardt in paper [
56]. The Levenberg–Marquardt regularization method boils down to replacing the Hessian matrix (in Newton optimization) with its approximation based on gradient calculations, along with a suitable regularization factor [
55].
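In its commonly stated textbook form, which may differ in detail from the variant used in the cited works, the Levenberg–Marquardt weight update can be written as

$$\Delta\mathbf{w} = -\left(\mathbf{J}^{\mathsf{T}}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{\mathsf{T}}\mathbf{e}$$

where the notation is introduced here for illustration: $\mathbf{e}$ is the vector of network errors over the learning set, $\mathbf{J}$ is the Jacobian of $\mathbf{e}$ with respect to the weights, $\mathbf{J}^{\mathsf{T}}\mathbf{J}$ approximates the Hessian, and $\mu > 0$ is the regularization factor. For large $\mu$ the step approaches a short steepest descent step, and for small $\mu$ it approaches the Gauss–Newton step. The need to store $\mathbf{J}$ and to solve a linear system involving $\mathbf{J}^{\mathsf{T}}\mathbf{J}$ accounts for the higher memory requirement mentioned above.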