ARIMA Models in Electrical Load Forecasting and Their Robustness to Noise

The paper addresses the problem of insufficient knowledge on the impact of noise on the auto-regressive integrated moving average (ARIMA) model identification. The work offers a simulation-based solution to the analysis of the tolerance to noise of ARIMA models in electrical load forecasting. In the study, an idealized ARIMA model obtained from real load data of the Polish power system was disturbed by noise of different levels. The model was then re-identified, its parameters were estimated, and new forecasts were calculated. The experiment allowed us to evaluate the robustness of ARIMA models to noise in their ability to predict electrical load time series. It could be concluded that the reaction of the ARIMA model to random disturbances of the modeled time series was relatively weak. The limiting noise level at which the forecasting ability of the model collapsed was determined. The results highlight the key role of the data preprocessing stage in data mining and learning. They contribute to more accurate decision making in an uncertain environment, help to shape energy policy, and have implications for the sustainability and reliability of power systems.


Introduction
Electrical load forecasting plays a key role in the management and control of a power system. Electricity is a peculiar product: there is currently no practical way to store it on a large scale for use at a desired time, so energy supply and demand must be balanced in real time. Imbalance may cause problems with the stability of the power system. Breakdowns resulting from power system instability have serious implications for the sustainability of regional, national, and international energy systems. They may cause failures of many human systems and serious environmental disasters. Precise analysis and forecasting of electric load are necessary to make rational decisions at all levels of energy sector control, management, and policy (technical, managerial, regulatory), as they are closely linked with countries' energy security, resources, and natural environment. Effective forecasting decreases uncertainty, thus allowing for more accurate decisions at the operational, strategic, and policy levels.
The importance of modelling and forecasting of electricity consumption is reflected in numerous studies [1]. There are many modelling approaches dedicated to the volume of demand/consumption/load. Depending on the time horizon, forecasts may be divided into: (i) very short-term load forecasting (VSTLF, up to one hour ahead), (ii) short-term load forecasting (STLF, from one hour to one month ahead), (iii) medium-term load forecasting (MTLF, from one week to one year ahead), and (iv) long-term load forecasting (LTLF, more than one year ahead) [2,3]. Diverse methods are applied depending on the time horizon and the aim of the forecasts. Three main classes of forecasting methods may be distinguished: (i) statistical methods, e.g., exponential smoothing models (ESM), multiple linear regression (MLR), and autoregressive and moving average (ARMA) models; (ii) artificial intelligence methods, e.g., artificial neural networks (ANN), fuzzy regression models (FRM), and support vector machines (SVM); and (iii) hybrid methods comprising two or more methods from one or both classes [4,5]. A comprehensive review of models and techniques used in load forecasting is presented in [1-6].
Auto-regressive integrated moving average (ARIMA) models are among the most popular approaches in the statistical methods class and have been successfully applied in electrical load forecasting in VSTLF, STLF, and MTLF tasks [1-6]. ARIMA model identification means specifying the class, i.e., moving average (MA), auto-regressive (AR), or mixed (ARMA), and the order. Model identification is guided by general rules on the patterns of the autocorrelation function (ACF) and the partial autocorrelation function (PACF); therefore, much depends on experts' knowledge and experience. Factors shaping the load process in the power system are to a large extent random in nature. Electric load is influenced by highly diverse and non-deterministic human activity, numerous technological processes and their random alterations, changing weather conditions, and other stimuli that are non-deterministic or difficult to include in the modeling. Random noise is an inherent component of all measurement data. It may come from various sources: measuring and transmission instruments as well as factors external to the process. Growth of the noise in the measured signal (observed load time series) significantly impacts the capabilities of forecasting models [7]. Given the specificity of ARIMA models, the level of the random component in a time series (measured by its amplitude) significantly impacts the possibility of correct model identification [8].
The above considerations have led to the formulation of the following research problem: insufficient knowledge on the impact of noise on the ARIMA model identification in electrical load forecasting. Authors' motivation for this research is two-fold: exploratory and pragmatic. First, the authors desire to fill the knowledge gap that exists in the research on noise laden electrical load time series forecasting with ARIMA models. Second, they wish to provide a robust methodology for the evaluation of the tolerance of ARIMA models to unavoidable noise occurring in the electric load time series. Additional motivation for this research is to provide practical guidelines for the use of ARIMA models in noise laden electrical load forecasting. Achieving those goals would imply more effective load forecasting, thus better-informed decisions in managing power systems.
The contribution of this paper to forecasting theory and practice is threefold:
• Development of studies on the impact of noise in time series on the identifiability and robustness of forecasting models,
• Assessment of the tolerance to noise of ARIMA models,
• Formulation of practical guidelines for the use of ARIMA models in noise laden electrical load forecasting.
The paper has the following structure. In this section (Section 1), the research problem is formulated and the motivation for the study is offered. Section 2 presents the essence of ARIMA models and their identification. Section 3 provides a review of the publications on ARIMA models in electrical load forecasting. Sections 2 and 3 are the basis for the formulation of the problem solution. Section 4 describes the adopted research methodology and the simulation experiment design. Section 5 presents the results of the experiment. The subsequent sections explore the ARIMA model identification for real load data and impact of random noise on ARIMA models identification and their ability to predict electrical load time series. Section 6 presents the discussion on the contribution of the results to the development of forecasting methodology and practice of load forecasting. The article ends with conclusions.

ARIMA Models and Their Identification
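The ARIMA class of models, also referred to in the literature as the Box-Jenkins models due to the ground-breaking contribution of G.E.P. Box and G.M. Jenkins (1970) [9], integrates the autoregressive AR(p) and moving average MA(q) components so that:
$$y_t = \varphi_1 y_{t-1} + \dots + \varphi_p y_{t-p} + e_t - \theta_1 e_{t-1} - \dots - \theta_q e_{t-q} \tag{1}$$
where: $y_t$ is the value of the series at time $t$; $\varphi_i$ is the coefficient at $y_{t-i}$; $e_t$ is the error, distributed as white noise; $p$ is the autoregressive order, i.e., the earliest past value included; $\theta_i$ is the coefficient at error $e_{t-i}$; and $q$ is the moving average order, i.e., the earliest past error included.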
Using the backshift operator $B^j y_t = y_{t-j}$, Equation (1) may be expressed by Equation (2):
$$\varphi_p(B)\, y_t = \theta_q(B)\, e_t \tag{2}$$
where: $\varphi_p(B) = 1 - \varphi_1 B - \dots - \varphi_p B^p$ is the autoregressive operator and $\theta_q(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is the moving average operator, each represented as a polynomial in the backshift operator.
The ARMA model assumes that the process is stationary. This means that the time series has at least a constant mean and variance, and its covariance function depends only on the time difference. When nonstationarity is observed, data transformations are needed. For nonstationarity in variance, the logarithmic transformation is the most popular solution, whereas nonstationarity in mean is commonly removed by differencing. Differenced processes are modelled by the auto-regressive integrated moving average model, ARIMA(p,d,q), whose general form is:
$$\varphi_p(B)\,(1 - B)^d\, y_t = \theta_q(B)\, e_t \tag{3}$$
where: $d$ is the order of integration (differencing).
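The effect of differencing on a trend can be checked on a toy series. The sketch below is illustrative only: the helper name `difference` and the linear series are assumptions made for the example and are not taken from the study.

```python
def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series with a linear trend: y_t = 5 + 2 t
trend = [5 + 2 * t for t in range(10)]

# One pass of lag-1 differencing (d = 1) turns the linear trend into a constant.
first_diff = difference(trend)

# A second pass (d = 2) would be needed only for a quadratic trend;
# applied here it leaves nothing but zeros.
second_diff = difference(first_diff)

print(first_diff)
print(second_diff)
```

Each pass shortens the series by `lag` observations, which matters when seasonal lags are long relative to the sample.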
The standard process of ARIMA modelling covers six consecutive phases: (i) preliminary analysis, (ii) transformation to stationary, (iii) identification of the components, (iv) parameters estimation, (v) testing, and (vi) application.
Identification of an adequate ARIMA model depends on the autocorrelation and partial autocorrelation pattern. The general idea is that the ACF of an AR process of order p decays gradually (exponentially), whereas its PACF cuts off after lag p. In contrast, the ACF of an MA process of order q cuts off after lag q, whereas its PACF decays gradually. If both the ACF and the PACF decay exponentially, this suggests a mixed ARMA process. A popular approach to the determination of the appropriate order of ARIMA is based on fitting [10]. A common approach is the automation of model identification. It relies on the algorithmic comparison of models with different parameters and results in the choice of the model that best fulfils the fit criteria [11][12][13]. Models that pass the Ljung-Box test are accepted as statistically adequate. To compare and determine the fitting accuracy, several other criteria are used: simple statistical metrics, such as mean absolute error (MAE), mean absolute percentage error (MAPE), final prediction error (FPE), the Akaike information criterion (AIC), or the Bayesian information criterion (BIC) [14][15][16].
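The cut-off behaviour of the ACF can be illustrated with a simulated MA(1) process. This is a minimal sketch under stated assumptions: the coefficient 0.6, the sample size, the seed, and the helper `sample_acf` are all hypothetical choices for the example, and the sign convention follows Equation (1).

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    c0 = sum(v * v for v in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


rng = random.Random(42)          # fixed seed so the run is repeatable
theta = 0.6                      # assumed MA(1) coefficient
n = 5000                         # assumed sample size

# MA(1) with the sign convention of Equation (1): y_t = e_t - theta * e_{t-1}
e = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
ma1 = [e[t] - theta * e[t - 1] for t in range(1, n + 1)]

acf = sample_acf(ma1, 5)
theoretical_rho1 = -theta / (1 + theta ** 2)   # lag-1 autocorrelation of MA(1)

for lag, value in enumerate(acf):
    print(lag, round(value, 3))
```

Only the lag-1 value should stand clearly apart from zero; the remaining lags should stay close to the sampling band of roughly ±2/√n, which is the visual cue used to read q from an ACF plot.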
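There is a large variety of ARIMA models. In the case of series with a seasonal component, the seasonal ARIMA (SARIMA) model may be used. The general notation of the SARIMA seasonal model is ARIMA(p,d,q)(P,D,Q)s, where s is the number of periods in the seasonal cycle:
$$\varphi_p(B)\,\Phi_P(B^s)\,(1 - B)^d (1 - B^s)^D\, y_t = \theta_q(B)\,\Theta_Q(B^s)\, e_t$$
where: $\Phi_P(B^s) = 1 - \Phi_1 B^s - \dots - \Phi_P B^{Ps}$ is the seasonal autoregressive operator, $\Theta_Q(B^s) = 1 - \Theta_1 B^s - \dots - \Theta_Q B^{Qs}$ is the seasonal moving average operator, and $D$ is the order of seasonal differencing.
It is worth emphasizing that ARIMA models can also accommodate multiple seasonality, which is especially useful in the case of energy forecasting. For two seasonal cycles:
$$\varphi_p(B)\,\Phi_{P_1}(B^{s_1})\,\Phi_{P_2}(B^{s_2})\,(1 - B)^d (1 - B^{s_1})^{D_1} (1 - B^{s_2})^{D_2}\, y_t = \theta_q(B)\,\Theta_{Q_1}(B^{s_1})\,\Theta_{Q_2}(B^{s_2})\, e_t$$
where: $s_1$ and $s_2$ are the lengths of the two seasonal cycles.
Further extensions and variations of the classic ARIMA are models that include exogenous series as input variables, referred to as ARIMAX. For an appropriately differenced series:
$$y_t = \sum_i \frac{\omega_i(B)}{\delta_i(B)}\, B^{k_i}\, x_{i,t} + \frac{\theta_q(B)}{\varphi_p(B)}\, e_t$$
where: $\omega_i(B)$ is the numerator polynomial of the transfer function for the ith input series, $\delta_i(B)$ is the denominator polynomial of the transfer function for the ith input series, and $k_i$ is the pure delay for the effect of $x_{i,t}$ at time $t$.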
Other variants include the multivariate approach, e.g., vector ARIMA (VARMA). When d is a fraction rather than an integer, the process is called fractionally integrated ARMA (ARFIMA or FARIMA). There are also models like ARCH/GARCH (generalized auto-regressive conditional heteroskedasticity) to deal with data with nonconstant autocorrelated variance. Commonly applied ARIMA based time series modeling approaches are hybrids, either derived from models of the same family, e.g., AR-GARCH (AR models with GARCH residuals), or based on models with dissimilar assumptions, e.g., ARIMA and neural networks. In the case of energy load prediction, it is justified to use Reg-ARIMA, a compound of regression and ARIMA time series errors, with, for example, hourly temperature data as a regressor [17].
Glossary of all abbreviations and acronyms used in this article can be found in Table 1.

Background Literature
In the last dozen or so years, Clarivate WoS, the Scopus database, and IEEE Xplore have recorded several hundred publications each year, in which key words include the terms ARIMA and "electricity" or "energy" and "volume" or "demand", "consumption", "power", "load". The number of relevant publications from the last 10 years reported by different scientific databases is presented in Figure 1.

Figure 1. Number of publications with keywords "ARIMA" and "electricity" or "energy" and "volume" or "demand", "consumption", "power", "load" in Clarivate WoS, Scopus, and IEEE Xplore databases.
ARIMA models are among the most popular forecasting techniques in the energy sector alongside artificial neural networks (ANN), support vector machines (SVM), and approaches that handle uncertainty under discrete data, such as grey [18], fuzzy [18], or rough [19] methods, and are often employed in hybrid approaches [20]. A large group of articles focuses on comparing or combining approaches. The models used include, among others, the previously mentioned neural networks [21][22][23][24], linear regression with ARIMA [21,25], Holt-Winters or exponential smoothing [21,26], seasonal and trend decomposition using loess [27], the metabolic grey model [28], and data mining [29]. Hybrid approaches are often indicated as providing better accuracy. However, approaches to increase predictability also include, for example, innovative data filtering methods [30].
Considering application areas, ARIMA is used to forecast energy issues at various levels of data aggregation, based on time series or panel data of a group of countries, e.g., the EU [31], individual countries [26], sectors of the economy [31][32][33], institutions [25,34], or production processes [22]. Among the articles that forecast the demand/consumption/load of energy in general, a popular subject is modelling the generation/consumption of energy from renewable sources. Forecasts address the development of renewable energy consumption in total [35] or from particular sources, i.e., wind [36][37][38], hydropower [39][40][41], solar [42], thermal [43], or biogas [44]. ARIMA models are also used for modelling and forecasting CO2 emissions, which are inextricably linked with energy generation [45][46][47]. Other areas are price forecasting [48] and volatility index forecasting [49].
The review of ARIMA models employed in electric load forecasting points to the importance of proper model identification. Unfortunately, none of the cited studies has examined the impact of noise (and its level) in the load time series on the adequacy of model identification and its predictive capacity. The authors have encountered only a few studies indirectly related to this problem [7,8,73,74]. In paper [7], a pattern recognition technique was used to examine the influence of noise on one-step ahead time-series forecasting in the case of exponential smoothing and non-linear neural network methods. Results of studying different forecasting techniques (nearest neighbors, artificial neural networks, ARIMA, fuzzy neural networks, and nearest neighbors combined with differential evolution) from the perspective of their susceptibility to random fluctuation were presented in work [8]. Paper [73] analyzes the possibility of reconstructing the attractor of a noise affected time series using a hybrid approach of nonparametric regression and optimal transformations. Two algorithms that estimate the noise level in a time series are presented in article [74]. In work [75], a data-filtering method for short-term load forecasting was proposed. It was demonstrated that statistical data-prefiltering improved the efficiency of STLF forecasting in the case of ARIMA models as well as artificial neural networks.
In this paper, the authors take up the unexplored problem of the robustness of ARIMA model identification in the case of time series affected by white noise of different noise to signal ratios (NSR) [8].

Research Methodology and Experiment Design
A dedicated research process was designed to study the tolerance to noise of ARIMA models in electrical load forecasting. Its logic and main stages are presented in Figure 2.
The designed research process consists of the following stages: (i) review of the scientific literature related to the methods in electrical load forecasting, which resulted in (ii) the identification of methods used in electrical load forecasting; (iii) review of the scientific literature related to the applications of ARIMA method in load forecasting, which resulted in (iv) the specification of ARIMA models employed in electrical load forecasting; (v) review of the scientific literature related to the noise impact on time-series forecasting.
The literature review fed into the (vi) experiment design, which was the basis of the conducted (vii) simulation study that concluded with (viii) the final research report.
The simulation experiments play a key role in the study. The flow diagram of the experiment process is also presented in Figure 2. The starting point is (1) the graphic analysis of the electric load time series obtained from measurement data. It allows (2) an assessment of the time series with a view to the occurrence of trend and seasonal components, and consequently a decision on the needed transformations. The length of the seasonal periods is determined by (3) analyzing the periodogram and the time series attractors. The next step is (4) the determination of the ACF and PACF functions of the time series. This allows us to (5) specify the ARIMA model class and (6) estimate its parameters. The next step is (7) the analysis of model fit with the MSE and AIC criteria. The ARIMA model constructed in this process is used to (8) generate a clean model time series, which is treated as a reference in further study. In the following steps, (9) the reference time series is additively disturbed with noise of different levels, measured by the ratio of the standard deviation of the noise to the standard deviation of the signal (NSR, noise to signal ratio). For each noise level, the ARIMA model of the disturbed time series is identified and its predictive capacity is assessed by setting the 95% confidence interval.
The designed experiment allows us to evaluate the stability of an identified ARIMA model class and to assess the changes in the model's predictive capacity in relation to the occurrence of different levels of white noise in the time series.

ARIMA Model Identification
Energy load is a stochastic data series with values that depend on many factors: type of receivers; atmospheric conditions; time of the day, month, and year; sports and cultural events; and many other random events affecting the operation of receivers. In this paper, an hourly load time series registered in the Polish Power System (PPS) between 6 July 2020 and 27 September 2020 (12 weeks, 2016 observations) is considered. The data were collected from Polish Power System Operation-Load of Polish Power System (https://www.pse.pl/ (accessed on 1 June 2021)). Basic characteristics of the data used in this study are presented in Table 3.
Table 3. Basic characteristics of data used in the study.

Characteristic | Value
Period of load data collection | 6 July 2020-27 September 2020 (12 weeks)

The plot of the studied time series is presented in Figure 3. Two seasonal components (daily and weekly; lower load on weekends) as well as a slight linear trend are clearly visible in the time series. A periodogram was used to illustrate the harmonic structure of the data in more detail [76].
Two dominant periods, 24 h (1 day) and 168 h (1 week), are clearly visible in the periodogram (Figure 4), which indicates the daily and weekly seasonality of the load time series. This is quite a typical pattern in European countries [77,78].
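How a periodogram exposes the two seasonal periods can be sketched on a synthetic hourly series. The example below is an assumption-laden illustration: the series, its amplitudes, the four-week length, and the helper `periodogram` are hypothetical and do not reproduce the PPS data or the exact estimator behind Figure 4.

```python
import cmath
import math


def periodogram(series):
    """Squared DFT magnitude (divided by n) at Fourier frequencies k/n, k = 1..n//2."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    power = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        power[k] = abs(coeff) ** 2 / n
    return power


hours = 4 * 168  # four synthetic weeks of hourly values (assumed length)

# Hypothetical load: base level plus a daily and a weaker weekly sinusoid.
load = [
    100.0
    + 10.0 * math.sin(2 * math.pi * t / 24)
    + 5.0 * math.sin(2 * math.pi * t / 168)
    for t in range(hours)
]

power = periodogram(load)
top_two = sorted(power, key=power.get, reverse=True)[:2]
dominant_periods = sorted(hours / k for k in top_two)

print(dominant_periods)
```

The index k of a peak converts to a period of n/k hours, which is how peaks in a periodogram are read as daily and weekly cycles.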
Reconstructions of the studied load time series in two-dimensional phase-spaces $(Y_t, Y_{t-1})$, $(Y_t, Y_{t-24})$, and $(Y_t, Y_{t-168})$ are presented in Figure 5a-c, respectively. It may be noticed that the attractor is quite easy to distinguish in all cases. This implies good forecastability of the time series [79].
Time series differencing is a standard procedure to remove the nonstationary components (trend and seasonality) from data. Trend is removed by single differencing (linear trend) or multiple differencing (equal to the degree of the polynomial describing the trend) with lag 1. Seasonal components are eliminated through seasonal differencing with the lag corresponding to the number of observations in the seasonal cycle [9]. In the case of the analyzed load time series, differencing with lag 1, 24, and 168 was carried out. ACF and PACF function plots for the differenced time series ($d = 1$, $D_{24} = 1$, $D_{168} = 1$) are presented in Figure 6. The ACF function plot (significant values for delays 1, 24, and 168) and the PACF function plot (combination of exponential decays starting from delays 1, 24, and 168) indicate the ARIMA(0,1,1)(0,1,1)24(0,1,1)168 model.
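The three differencing passes with lags 1, 24, and 168 can be sketched as follows. The series here is a purely deterministic stand-in (linear trend plus daily and weekly sinusoids) chosen as an assumption for the example; real load data also carry a stochastic part, so their differenced values would not vanish.

```python
import math


def difference(series, lag):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


hours = 12 * 168  # 2016 hourly observations, matching the length of the studied series

# Hypothetical deterministic series: slight linear trend + daily + weekly cycle.
load = [
    0.01 * t
    + 10.0 * math.sin(2 * math.pi * t / 24)
    + 5.0 * math.sin(2 * math.pi * t / 168)
    for t in range(hours)
]

step1 = difference(load, 1)      # d = 1: removes the linear trend
step2 = difference(step1, 24)    # D_24 = 1: removes the daily cycle
step3 = difference(step2, 168)   # D_168 = 1: removes the weekly cycle

remaining = len(step3)
largest_residual = max(abs(v) for v in step3)

print(remaining, largest_residual)
```

The three passes consume 1 + 24 + 168 = 193 observations, so 1823 of the 2016 values remain for ACF and PACF analysis.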
The model described by Equation (9) was used to generate a reference (clean) load time series that was the basis for further simulations.

Energies 2021, 14, 7952
The reference (clean) time series of hourly load values is presented in Figure 7.
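Written out in backshift notation, the identified class ARIMA(0,1,1)(0,1,1)24(0,1,1)168 has the general form sketched below. This is a symbolic sketch only: the coefficient values estimated by the authors are not reproduced here, and the assignment of the symbols $\theta_1$, $\Theta_1$, and $\vartheta_1$ to the non-seasonal, daily, and weekly moving average factors is an assumption made for illustration.

```latex
(1 - B)\,(1 - B^{24})\,(1 - B^{168})\, y_t
  = (1 - \theta_1 B)\,(1 - \Theta_1 B^{24})\,(1 - \vartheta_1 B^{168})\, e_t
```

The left-hand side collects the three differencing operations ($d = 1$, $D_{24} = 1$, $D_{168} = 1$), and each factor on the right-hand side is a first-order moving average term at the corresponding lag.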

Simulation of the Impact of Random Noise
In the next step, the time series (signal) generated with the use of Equation (9) was additively disturbed by white noise with zero mean ($\mu = 0$) and standard deviation equal to the product of the NSR and the signal standard deviation:
$$\sigma_{noise} = NSR \cdot \sigma_{signal} \tag{10}$$
where: $\sigma_{noise}$ is the standard deviation of the interfering noise, $\sigma_{signal}$ is the signal standard deviation, and NSR is the noise to signal ratio.
In Figure 8, the weekly load time series repeatedly disturbed with white noise with standard deviation (Equation (10)) determined for various NSR values from 10% to 500% is presented.
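The disturbance step defined by Equation (10) can be sketched in a few lines. The function name `add_white_noise`, the synthetic signal, the seed, and the use of the population standard deviation are assumptions for the example; the study does not state which estimator of the standard deviation was used.

```python
import math
import random
import statistics


def add_white_noise(signal, nsr, rng):
    """Add zero-mean Gaussian noise with sigma_noise = nsr * sigma_signal."""
    sigma_signal = statistics.pstdev(signal)   # population std (assumed choice)
    sigma_noise = nsr * sigma_signal           # Equation (10)
    noisy = [v + rng.gauss(0.0, sigma_noise) for v in signal]
    return noisy, sigma_noise


# Hypothetical clean signal: 12 weeks of hourly values with a daily cycle.
signal = [100.0 + 10.0 * math.sin(2 * math.pi * t / 24) for t in range(2016)]

rng = random.Random(2021)                      # fixed seed for repeatability
noisy, sigma_noise = add_white_noise(signal, nsr=0.5, rng=rng)

# Empirical check: the realised noise-to-signal ratio should be close to 0.5.
noise = [n - s for n, s in zip(noisy, signal)]
empirical_nsr = statistics.pstdev(noise) / statistics.pstdev(signal)

print(round(sigma_noise, 3), round(empirical_nsr, 3))
```

Repeating the call for NSR values such as 0.1, 0.2, 0.3, 0.5, 1.0, and 2.0 yields the family of disturbed series on which the model is re-estimated.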
After the time series was additively disturbed, the ARIMA model parameters were re-estimated and the values of residual mean square (RMS) and the Akaike information criterion (AIC) were recalculated. Results of the calculations for NSR = 10%, 20%, 30%, 50%, 100%, and 200% are compiled in Table 4.
Simulation results indicate that, in the case of the analyzed time series, disturbances not exceeding NSR = 20% do not cause significant alterations in parameter estimation. The observed changes in the values of parameters Θ 1 and ϑ 1 do not exceed the 95% confidence interval of the reference model parameter estimation. Parameter θ 1 only slightly exceeds that interval. RMS and AIC values do not change significantly, either. Increasing the disturbance level above 30% causes more significant changes in the values of the estimated parameters and in the RMS and AIC values. Parameters were not estimated for noise levels higher than NSR = 200% because the ϑ 1 parameter value exceeded the invertibility boundary of the model in that case.
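The invertibility condition referred to above can be checked directly for a multiplicative model built from first-order moving average factors: each factor of the form (1 − c·B^s) is invertible only when |c| < 1. The sketch below assumes that structure; the function name and the coefficient values are hypothetical and are not the estimates reported in Table 4.

```python
def is_invertible(ma_coefficients):
    """True when every first-order MA factor (1 - c * B^s) has |c| < 1."""
    return all(abs(c) < 1.0 for c in ma_coefficients)


# Hypothetical coefficients for the non-seasonal, daily, and weekly MA factors.
moderate_noise = (0.30, 0.80, 0.90)
heavy_noise = (0.30, 0.80, 1.05)   # one coefficient beyond the boundary

print(is_invertible(moderate_noise))
print(is_invertible(heavy_noise))
```

A coefficient at or beyond the boundary means the model can no longer be expressed as a convergent autoregression on past observations, which is why estimation was stopped at that noise level.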
The behavior of ACF and PACF functions is worth attention. Their plots for different noise levels are presented in Figure 9. As can be seen, the patterns still suggest an MA seasonal process rather than AR. The obtained research results lead to the conclusion that the reaction of the load time series to random disturbances is relatively small.
In the next step, the models developed for the reference time series and the disturbed time series were used to calculate forecasts. A multi-step forecast 6 h ahead was prepared for each model. The obtained forecasts with the 95% confidence interval are compiled in Table 5 and illustrated in Figure 10.
Increasing the noise level enlarges the forecast confidence interval (lowers the accuracy), but the obtained results are quite surprising. Forecasts for all noise levels are fairly consistent. Practically all forecasts made on the basis of the models derived from the disturbed time series fit within the 95% confidence interval of the forecast made on the basis of the reference model.
It may be assumed that up to NSR = 30%, the model and its estimation are not very noise sensitive. Increasing the noise beyond this level significantly increases the width of the forecast confidence interval. Only increasing the noise level to NSR = 200% makes the model non-invertible. This level of disturbance limits the possibility of discovering the patterns of the energy load time series.
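Why a larger residual standard deviation widens the forecast interval can be illustrated with a deliberately simplified model. The sketch below assumes a non-seasonal ARIMA(0,1,1), not the triple-seasonal model of the study; for that simple case the h-step forecast error variance is σ²(1 + (h − 1)(1 − θ)²). The function name and all numerical values are hypothetical.

```python
import math


def forecast_half_widths(theta, sigma, horizon, z=1.96):
    """Half-widths of the 95% forecast interval for ARIMA(0,1,1), h = 1..horizon."""
    return [
        z * sigma * math.sqrt(1.0 + (h - 1) * (1.0 - theta) ** 2)
        for h in range(1, horizon + 1)
    ]


# Hypothetical residual standard deviations for a clean and a noisier series.
clean = forecast_half_widths(theta=0.4, sigma=100.0, horizon=6)
noisy = forecast_half_widths(theta=0.4, sigma=200.0, horizon=6)

for h, (c, n) in enumerate(zip(clean, noisy), start=1):
    print(h, round(c, 1), round(n, 1))
```

In this simplified setting the interval grows with the horizon and scales linearly with the residual standard deviation, which is consistent with the qualitative pattern described for the 6 h ahead forecasts.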

Discussion
The main problem of time series modelling with ARIMA models is specifying the class (autoregressive and moving average) and the order of non-seasonal differencing, as well as the number of seasons and the order of seasonal differencing. Identification of ARIMA models is not strictly codified, and it depends to a large extent on the empirical knowledge and the intuition of a researcher and on the quality of fit of the tested models. The basis for the identification is the analysis of ACF and PACF plots. In many cases, a given time series may be described by different ARIMA models. The autoregressive and moving average components may cancel each other's effect. There is also a relationship between the degree of differencing and the order of autoregression and moving average. Over-differencing of the series can be compensated for by including an additional moving average term in the model, and under-differencing by an additional autoregressive term. The possibility of correct identification and estimation always depends on the presence and variance of random noise. For this reason, it is important to define the disturbance level that determines whether specified models can be applied.
The simulation experiment allowed us to evaluate the robustness of ARIMA models to noise in terms of their ability to predict electrical load time series. This approach follows an established research practice of simulating various aspects of power system performance under changing noise intensity [80]. In the study, an idealized ARIMA model of electrical loads was disturbed by noise of different levels. The model parameters were then re-estimated and new forecasts were calculated. It may be concluded that the reaction of the ARIMA model to random disturbances of the modeled time series is relatively weak. The ACF and PACF do not change significantly at any of the tested disturbance levels and generally indicate the original type of model. However, the changing values of the estimated parameters indicate that the series is recognized as being of the same type but with different parameter values. The correctness of the estimation stage for a given type of ARIMA model depends to a large extent on the level of random disturbances present in the series. Disturbances exceeding 30% of the standard deviation, and especially those exceeding 100%, significantly affect the RMSE, the AIC, and the width of the forecast confidence interval.
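The logic of such an experiment can be sketched in a few lines. The example below is a simplified stand-in, not the procedure used in the study: it replaces the seasonal ARIMA model of the Polish load data with a synthetic AR(1) process, estimates the coefficient by least squares, and assumes that the noise level is expressed as a fraction of the standard deviation of the clean series. The names `simulate_ar1` and `fit_ar1`, the coefficient 0.8, and the tested levels are illustrative assumptions.

```python
import numpy as np

def simulate_ar1(phi, n, rng):
    """Generate x[t] = phi * x[t-1] + eps[t] with unit-variance innovations."""
    x = np.zeros(n)
    eps = rng.normal(size=n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

def fit_ar1(x):
    """Least-squares estimate of phi for the mean-centered series."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1]))

rng = np.random.default_rng(42)
clean = simulate_ar1(phi=0.8, n=20000, rng=rng)   # idealized, noise-free series

results = {}
for level in (0.0, 0.3, 1.0, 2.0):                # noise std as a fraction of series std
    noisy = clean + rng.normal(scale=level * clean.std(), size=clean.size)
    phi_hat = fit_ar1(noisy)                      # re-estimate on the disturbed series
    # One-step-ahead forecast from the disturbed data, scored against the clean series.
    pred = phi_hat * (noisy[:-1] - noisy.mean()) + noisy.mean()
    rmse = float(np.sqrt(np.mean((clean[1:] - pred) ** 2)))
    results[level] = (phi_hat, rmse)

for level, (phi_hat, rmse) in results.items():
    print(f"noise level {level:.1f}: phi_hat = {phi_hat:.3f}, RMSE = {rmse:.3f}")
```

In this toy setting the model type stays the same while the estimated coefficient shrinks as the noise grows: for an AR(1) signal with independent additive white noise at relative level k, the lag-1 autocorrelation is attenuated by a factor of 1/(1 + k²). This mirrors, in a much simpler model, the observation that the series is recognized as being of the same type but with different parameter values.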
ARIMA models are frequently used in load forecasting. They are flexible and readily interpretable. The obtained results offer valuable guidance for practical applications of ARIMA in load modeling and forecasting. They reaffirm the key importance of the data preprocessing stage in ARIMA model implementation. It is also recommended to carry out a preliminary evaluation of the time series with regard to the presence of noise, and to consider noise filtering, before ARIMA model identification and estimation. The authors consider it reasonable to introduce two additional phases into the standard ARIMA model development process: noise level identification and signal filtering. The process of ARIMA modelling would thus cover eight consecutive phases: (i) preliminary analysis, (ii) noise level identification, (iii) signal filtering, (iv) transformation, (v) identification, (vi) estimation, (vii) testing, and (viii) application.
An excessively high noise-to-signal ratio may justify the choice of other forecasting methods, e.g., those based on machine learning or other artificial intelligence techniques.
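The two proposed phases, noise level identification and signal filtering, can be sketched as follows. The paper does not prescribe a specific estimator or filter in this section, so everything in the example is an assumption made for illustration: the centered moving-average filter, the crude residual-based ratio computed by the hypothetical helper `noise_to_signal`, the window length, and the synthetic daily cycle standing in for a load profile.

```python
import numpy as np

def moving_average(x, window):
    """Centered moving average; the output is shorter by window - 1 samples."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

def noise_to_signal(x, window):
    """Crude ratio: std of the smoothing residual over std of the smoothed series."""
    smooth = moving_average(x, window)
    half = (window - 1) // 2                      # alignment offset for an odd window
    residual = x[half : half + smooth.size] - smooth
    return float(residual.std() / smooth.std())

rng = np.random.default_rng(1)
t = np.arange(24 * 60)                            # 60 days of hourly samples
signal = 10.0 + 3.0 * np.sin(2 * np.pi * t / 24)  # synthetic daily cycle

ratios = {}
for level in (0.1, 0.3, 1.0):                     # noise std as a fraction of signal std
    noisy = signal + rng.normal(scale=level * signal.std(), size=t.size)
    ratios[level] = noise_to_signal(noisy, window=5)

# Filtering check at the highest tested level (noise std equal to signal std).
smooth = moving_average(noisy, window=5)
core = slice(2, 2 + smooth.size)                  # samples aligned with the filter output
rmse_raw = float(np.sqrt(np.mean((noisy[core] - signal[core]) ** 2)))
rmse_filtered = float(np.sqrt(np.mean((smooth - signal[core]) ** 2)))

print("estimated noise-to-signal ratios:", {k: round(v, 3) for k, v in ratios.items()})
print(f"RMSE vs clean signal: raw = {rmse_raw:.3f}, filtered = {rmse_filtered:.3f}")
```

A moving average also attenuates the systematic component slightly, so the estimated ratio is only a rough indicator. In practice, the filter and its window would need to be chosen with the seasonality of the load series in mind.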
Certain limitations of the presented results must also be acknowledged. First, only one load time series, describing the whole power system, was analyzed. Consequently, the time series was characterized by a large share of systematic components with well-specified features and parameters. Second, simulations were carried out for only a single class of ARIMA model. Third, the considerations were limited to STLF. The identified limitations point to possible directions for further research. Future studies should concern the load of various elements (fragments) of the power system at different hierarchy levels. Different ARIMA model classes should be considered. Calculations of forecasts with various time horizons would also be valuable. It would be desirable to compare the results obtained in this study with other simulations based on data from different time periods, different forecast horizons, different power systems (and their sections), and different ARIMA model classes. In this paper, the authors focus on electric load processes, but the proposed methodology may equally be applied to time series of observable data of other origins.

Conclusions
The obtained simulation results presented in this paper lead to the following conclusions:
• Noise loading of the signal significantly affects the identification of the time series ARIMA model type and the estimation of its parameters,
• The accuracy of the prediction of electrical loads strongly depends on the noise level in the observed signal,
• The observed time series of the electrical load should be carefully examined for the presence and the level of noise in the signal before the prediction is performed,
• Usefulness of extending the classic Box-Jenkins approach by the preliminary time series filtration is proven.
Despite the identified limitations, which partly result from the size constraints of this paper, it is justified to claim that the presented research contributes to the theory and practice of electric load forecasting, allowing for the preparation of more precise forecasts. In effect, better forecasting decreases uncertainty and leads to better informed decisions at different hierarchical management levels of the power system, thus making energy policy more robust to uncertainty, better aligned with Goal 7 of the Sustainable Development Goals [81], and more environmentally viable.
Author Contributions: J.N. and E.C. were responsible for the study conception and research design; J.N. and Ł.N. developed the concept; E.C. performed computation and analysis; J.N. and E.C. were responsible for data interpretation; J.N. and Ł.N. discussed the results and contributed to the final manuscript; J.N., E.C., and Ł.N. wrote and edited the text. All authors have read and agreed to the published version of the manuscript.
Funding: The publication of the article for the 11th International Conference on Engineering, Project, and Production Management (EPPM2021) was financed in the framework of contract no. DNK/SN/465770/2020 by the Ministry of Science and Higher Education within the "Excellent Science" programme.

Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://www.pse.pl/web/pse-eng/areas-of-activity/polish-power-system/systemload (accessed on 17 November 2021).

Conflicts of Interest: The authors declare no conflict of interest.