A Review of Auto-Regressive Methods Applications to Short-Term Demand Forecasting in Power Systems

Abstract: The paper conducts a literature review of applications of autoregressive methods to short-term forecasting of power demand. This need is dictated by the advancement of modern forecasting methods and, in particular, by the good forecasting efficiency they achieve. The annual forecasting error of power demand for the Polish National Power System is approximately 1%. The paper also discusses the classical methods of Artificial Intelligence, Data Mining, Big Data, and the state of research in short-term power demand forecasting in power systems using autoregressive and non-autoregressive methods and models.


Overview
The economic development of countries is inextricably linked with the functioning of their power systems. Due to the development of power grids and the growing access to them, electricity is now indispensable for the proper functioning of the economy and the population, and the demand for it is systematically growing. Rising electricity prices in recent years and their fluctuations, in addition to insufficient development of the manufacturing sector, make it difficult to optimally meet the growing demand for electricity. Unfortunately, storage of electricity on a large scale and in the long term is a complex and very expensive issue. Thus, at any time in the operation of power systems, it is necessary to maintain a balance between the generation of electricity and its consumption, taking into account the technical limitations of electricity networks, in order to maintain continuity and security of power and electricity supplies while maintaining the optimal operating costs of the power system. In this context, forecasting the load of power systems is an essential element of planning their work in the short, medium, and long term, and is one of the greatest challenges faced by the power industry in every country. Electricity demand forecasting is a basic element of planning electricity generation, participation in electricity markets, and the development of the power grid. Short-term forecasting of the power system load, performed, inter alia, by operators of power systems, requires ensuring the highest possible accuracy for each hour of the day while maintaining the lowest computational cost at an appropriate time. Forecasting the load on systems with the use of prognostic models using explanatory variables is costly and time-consuming, in contrast to autoregressive methods which use only information about the earlier development of the analyzed parameter in the forecasting process. 
Thus, along with the observed trend indicating the reduction in forecast horizons from hours to minutes, and even seconds, it is necessary to search for cheap and quick forecasting methods that will allow the current forecasting effectiveness to be maintained at lower costs of their development and with a comparable or shorter development time.
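As a minimal illustration of the autoregressive idea described above (the forecast uses only earlier values of the series itself), the sketch below fits a first-order model y[t] = c + phi * y[t-1] by ordinary least squares and iterates it forward. The function names and the load values are hypothetical and chosen only for illustration; the reviewed publications use considerably richer model structures.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x = series[:-1]          # lagged values y[t-1]
    y = series[1:]           # current values y[t]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    return c, phi


def forecast_ar1(series, c, phi, steps):
    """Iterate the fitted recursion forward from the last observation."""
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Hypothetical hourly load in MW, generated by y[t] = 60 + 0.5 * y[t-1].
load_mw = [100.0, 110.0, 115.0, 117.5, 118.75]
c, phi = fit_ar1(load_mw)
two_step = forecast_ar1(load_mw, c, phi, steps=2)
print(c, phi, two_step)  # approximately 60.0, 0.5, [119.375, 119.6875]
```

No explanatory variables (weather, calendar) enter the model, which is what keeps this class of methods cheap to develop and fast to run.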

Literature Survey
In short-term electrical power demand forecasting, both autoregressive methods using the properties of moving averages and exponential smoothing, and methods using machine learning [1][2][3][4][5][6], Support Vector Machines and Particle Swarms, and artificial intelligence [7], including Artificial Neural Networks, have been used for years. Many research centers worldwide have developed more accurate forecasting methods and models, especially for short-term forecasting. Several teams have conducted research at the academic level, perfecting the methods and models they have developed. The analyses and simulations are usually conducted in the STATISTICA®, SAS/ETS, and SPSS environments [8], in GRETL [9,10], and in the R and Python programming languages, among others.
The demand for electrical power is characterized by large fluctuations [11]. In this case, the key factors exhibit daily, weekly, annual, and multi-year variability [12]. Moreover, the seasonal variability (which results in annual variability), quarterly variability (seasons), and monthly variability (part of the seasons) are distinguished. Continuity of power demand and the still "insufficient" (in the sense of high power/capacity) development of energy storage result in the inability to store electricity in large quantities, which makes it necessary to cover the demand for power at the moment this demand occurs [13].
Other factors, apart from the passage of time (consecutive days, weeks, etc.), that influence the variability in the power system load [14,15] are the variability in weather conditions and the resulting variability in the ambient temperature, in addition to the transition from winter to summer time [16,17] and from summer to winter time (introduced to flatten the evening peak of power demand in the summer half of the year) [12]. Other weather factors influencing the level of demand in the power system include, among others, cloudiness, air humidity, and wind speed [12]. The ambient temperature significantly affects the load in the power system. A change in weather conditions directly impacts the behavior of consumers (municipal and industrial), who increase their power consumption for lighting and heating devices (convector heating and electric heating).

Motivation and Incitement
Individual areas of the Polish Power System have a different share in shaping the domestic demand for electrical power. Naturally, areas with significant industrialization and, therefore, a significant population in Poland, translate into greater demand for electrical power (and, consequently, electrical power consumption), and thus, to a greater extent, changes in the weather (atmospheric conditions) affect these areas. The yearly demand forecasting error for the Polish National Power System is approximately 1%, which shows a high level of accuracy; thus, there is a need to search for the potential in well-known methods and models, including autoregressive models, to reduce the error below this level. In this context, this paper aims to review auto-regressive methods applied to short-term power demand forecasting in power systems.

Research Gaps
The conducted review of articles describing the methods and forecasting models used in short-term forecasting of electric power demand shows a great variety. Autoregressive methods are still an attractive and effective tool for forecasting. Their unquestionable advantages are the low financial outlay and the speed with which forecast results are obtained. Following current scientific reports in the form of literature reviews is time-consuming. Therefore, it is important to develop rankings of forecasting models, taking into account their forecasting effectiveness. While preparing this review, the authors identified a gap in presenting the results of valuable research in this aspect, and thus attempted to develop such a ranking. The Mean Absolute Percentage Error (MAPE) was adopted as a measure for assessing the quality of forecasts developed with autoregressive methods. From the prepared ranking of 264 autoregressive models, a set of Top 10 models was distinguished, which can be a significant aid for researchers and scientists dealing with short-term forecasting of electricity demand in power systems.
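For reference, the MAPE adopted above is the mean of the absolute errors divided by the realized values, expressed in percent. The short sketch below shows one way to compute it; the function name and the sample values are hypothetical, and the treatment of zero realized values (raising an error) is an assumption, because the measure is undefined in that case.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a realized value is zero")
    relative_errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical realized and forecast loads (MW): errors of 10%, 5%, and 0%.
actual_mw = [100.0, 200.0, 400.0]
forecast_mw = [110.0, 190.0, 400.0]
print(mape(actual_mw, forecast_mw))  # approximately 5.0
```

Because every error is scaled by the realized value, results from power systems of very different sizes can be compared directly, which is what makes a cross-publication ranking feasible.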

Major Contributions
The main contribution of the authors is to present an overview of methods in the field of artificial intelligence, Data Mining (now often associated with Big Data issues), and Big Data. In addition, the state of research in short-term power demand forecasting for power systems using autoregressive and non-autoregressive methods and models is presented, along with a detailed table that describes the results of the review of 47 articles describing 264 forecasting models (Table 1, where MAPE denotes an ex post approach and MAPE(ea) an ex ante approach). Additionally, the authors present a new way to develop literature reviews in the context of selecting the most promising prognostic models. In the proposed new approach (explained in the flowchart in Figure 1), a ranking of forecasting models (Tables 2 and 3 and Figure 2) is built according to the selected measure of forecast quality (Mean Absolute Percentage Error). The results of literature reviews developed with this approach are a valuable source of knowledge for scientists, experts, and analysts, supporting the preparation of forecasts for power system operators, with particular emphasis on transmission system operators.
The indication of the greater effectiveness of Artificial Neural Networks over the improvement of traditional methods in short-term forecasting of power system loads, presented in [72], does not always translate into short-term forecasting of energy prices on Polish and foreign electricity trading floors [94]. In this context, it is possible to obtain an inverse relationship. For example, the multiple regression method gives significantly greater forecasting efficiency when compared to the models of Artificial Neural Networks [95]. Artificial Neural Networks are highly effective not only in the short term, but also in long-term forecasting [96,97].
Evolutionary algorithms are used, among other applications [84], in: forecasting daily loads of electric power systems [46,67], optimizing the configuration of power grids, optimizing voltage levels in power grids, designing power grids, planning power plant operation, creating an economical distribution of loads, planning power grid development, supporting regulatory activities in power systems, and protection automatics [83,98]. Expert systems are used, among other applications [99], in: designing power grids and stations, and reconstruction of power systems in post-emergency states [100,101].
Additional information on the application of artificial intelligence methods, taking into account the studied subject of the variability in power system loads and their forecasting, can be found in [81,84,85,102,103].

Data Mining Methods
In the literature focusing on the analysis of large data sets and forecasting using Data Mining methods, there are many definitions of these methods and ideas [104].
The main definitions of Data Mining are:
1. An interdisciplinary approach using techniques from machine learning, image recognition, statistics, databases and visualization to extract information from large databases [42,105,106];
2. An analysis of large, previously collected data sets to discover new regularities and describe the data in a new way that is understandable and useful for the data owner [107].
The first definition comes from 1998, while the second comes from 2001; thus, their evolution is noticeable.
Further definitions of Data Mining methods are:
3. The process of searching for valuable information (knowledge) when the researcher is dealing with a large amount of data [108][109][110][111];
4. The process of examining and analyzing large amounts of data by automated or semi-automatic methods to discover meaningful patterns and rules [112,113];
5. Methods of broadly understood data analysis aimed at identifying previously unknown regularities occurring in large data sets, from which the results are in a form that is easy to interpret by the researcher [109].
At the beginning of their development, Data Mining methods were accused of being unscientific, assuming no theory, having no elegance or formal evidence, and being primitive and for application only [114].
The classical approach to data analysis uses the scheme [115,116] from defining the problem through creating a mathematical model, preparing the input data, and analyzing the problem, to interpreting the obtained results. The Data Mining approach uses a scheme from problem definition through preparing input data, problem analysis, and creating a mathematical model, to interpreting the obtained results. The algorithms used in the field of Data Mining are divided into supervised learning and non-supervised learning [104]. In the supervised learning methods, the main goal is to recreate the value of the examined parameter. In the non-supervised learning methods, the aim is to detect structures or hidden patterns in the analyzed data due to the lack of distinguishing a single feature. Teaching forecasting models using a supervised learning approach can be conducted as an implementation of a classification or regression problem. In classification problems, the analyzed parameter is qualitative, and in regression problems, this parameter is quantitative.
The knowledge derived from empirical research is proven, and due to the collection of larger and larger sets of data, it is beneficial for further research, both empirical and forecasting (in a certain sense speculative); it is useful to analyze these sets and draw additional conclusions. Additional research, including experimental studies, may result in obtaining a greater number of answers than the questions posed by the researcher [117][118][119]. The classification indicated in [118] of problem types and their respective Data Mining methods concerning time series analysis notes the inclusion of MultiLayer Perceptron (MLP) and Radial Basis Function (RBF) Artificial Neural Networks in this method. It must be concluded that the classifications of methods overlap and do not function as hermetic.
The group of Data Mining methods and models also includes forecasting problems, which are divided into two groups. The first group includes regression and classification trees, and the second group includes advanced machine learning methods. Classification and regression trees include Classification and Regression Trees (C&RT) and Chi-Square Automatic Interaction Detection (CHAID) trees [96,120]. The advanced machine learning group consists of the methods Multivariate Adaptive Regression Splines (MARSplines), Support Vector Machines (SVMs), k Nearest Neighbors [121,122], k-Means [123,124], Naive Bayes Classifier (only applicable to classification problems), Random Forest [125], and Boosted Trees [96]. The use of Data Mining methods in forecasting regression problems consists of evaluating many models, comparing their effectiveness results, and creating hybrid systems, due to which it is possible to maintain the smallest deviations in the forecasted values from the realized values of the analyzed parameters. The distinguishing feature of Data Mining methods is the speed of their creation. The MARSplines and Boosted Tree methods are among the most effective predictive models from the group of Data Mining methods for forecasting power demand in power systems.
The MARSplines method occupies a niche among practical applications to forecasting problems in large-scale power engineering. MARSplines is a non-parametric method belonging to the group of supervised learning methods, in which the co-variability of features is used to predict the value of a selected feature; it can also be used in classification problems [126,127]. This convenience removes the need to analyze the correlations between the independent variables, which in many cases may correlate with the predicted variable without actually affecting it.
The Multivariate Adaptive Regression Splines (MARSplines) method [128][129][130] uses the method of recursive division of the feature space to build a regression model in the form of spline curves [131][132][133] and is an extension of the methods of regression trees and multiple regression [105]. Due to the above properties, the MARSplines [131][132][133] is an effective tool for Big Data applications [134,135].
The MARSplines method also enables the automatic selection of explanatory variables for forecasting models. The efficiency of this selection is in many cases greater than that for classical methods of selecting variables [30,31,136,137,138]. Thus, the method can be successfully used, in addition to the multiple regression method, in selecting input variables for forecasting models and short-term forecasting of time series, including power demand in power systems [31,32,139].
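To make the spline form of a MARSplines model more concrete, the sketch below evaluates a model built from hinge basis functions max(0, x - knot) and max(0, knot - x), which are the building blocks of this family of methods. It only evaluates a given model and does not perform the fitting (knot search and pruning). The knot at 18 °C, the coefficients, and all names are hypothetical and chosen purely for illustration.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) or, for direction=-1, max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_like_predict(x, intercept, terms):
    """Evaluate intercept + sum(coef * hinge); terms are (coef, knot, direction)."""
    return intercept + sum(coef * hinge(x, knot, direction)
                           for coef, knot, direction in terms)


# Hypothetical load-temperature relation: demand rises on both sides of 18 degrees C.
terms = [
    (20.0, 18.0, -1),  # heating effect below the knot (MW per degree)
    (15.0, 18.0, 1),   # cooling effect above the knot (MW per degree)
]
for temperature in (10.0, 18.0, 28.0):
    print(temperature, mars_like_predict(temperature, 1000.0, terms))
```

Each hinge term is active on only one side of its knot, so the model is piecewise linear and can represent different slopes in different regions of the feature space.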
The principal components method is an alternative to those analyzing the correlations between the explanatory variables in the forecasting process. It not only allows the removal of variables that are overly correlated with each other, but also the acquisition of uncorrelated variables that are responsible for part of the variability in groups of variables or even for the variability in entire groups of variables [140]. The application of the method creates new variables, which are linear combinations of the original variables, and the following components capture as much information contained in the original data as possible. The disadvantage of the method is the difficulty in interpreting the meaning of principal components [140].
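The idea of replacing correlated explanatory variables with uncorrelated linear combinations can be illustrated for the simplest case of two variables, where the eigen-decomposition of the covariance matrix has a closed form. The sketch below is an illustration under that two-variable assumption only; the function name and the sample data are hypothetical.

```python
import math
from statistics import mean


def pca_2d(xs, ys):
    """Principal components of two variables via the 2x2 sample covariance matrix."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] (variances of the components).
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    # Direction of the first component (axis of largest variance).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    # New variables: linear combinations of the centered originals.
    pc1 = [a * v1[0] + b * v1[1] for a, b in zip(dx, dy)]
    pc2 = [a * v2[0] + b * v2[1] for a, b in zip(dx, dy)]
    return (lam1, lam2), v1, pc1, pc2


# Two strongly correlated hypothetical explanatory variables.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
(lam1, lam2), v1, pc1, pc2 = pca_2d(xs, ys)
explained = lam1 / (lam1 + lam2)
print(explained)  # share of total variance carried by the first component
```

The loadings in v1 also illustrate the interpretation difficulty mentioned above: the first component is a mixture of both original variables rather than a directly measurable quantity.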

Big Data
Big Data is a term that describes, on a very general level, exceptionally large data sets. These collections are characterized by a diversified structure of high complexity. The main difficulties are data storage, real-time analysis, and data visualization and analysis results [141,142]. The process of examining massive amounts of data to reveal hidden patterns and secret correlations is called Big Data analysis. In the 1990s and the first decade of the 21st century, Big Data analysis was understood as Data Mining. Big Data sets are characterized by: high volume (Volume) [98,141,143,144], high growth rate (Velocity) [98,141,143,144], reliability and accuracy (Veracity) [141,142], great variety (Variety) [98,144], and value for decision making processes (Value) [98,141,144,145].
The use of Big Data analysis for data sets containing electrical measurements, including the load of power systems, covers practical techniques such as correlation analysis and machine learning (including deep learning: Multilevel Deep Learning [146], Pooling Deep Recurrent Neural Network [147], Convolutional Neural Network Based Bagging Learning Approach [148], TensorFlow Deep Learning Framework and Clustering-regression [149], Long Short-Term Memory Neural Network [150], Scikit-Learn and TensorFlow [151] with the Keras library [152], and Deep Neural Networks [43,153]), as well as databases [114]. Processing of electrical measurement data includes distributed processing (data storage and processing: Distributed Computing), memory processing (data reading and processing: Memory Computing), and stream processing (real-time data processing: Stream Processing) [141,154].
The use of Big Data techniques in the energy sector [155][156][157] and in the field of Smart Grids [1,80,154,158] includes the use of RBF Artificial Neural Networks [159] and of a Convolutional Neural Network Based Bagging Learning Approach [148]. It also encompasses support for technical measures concerning the integration of generating sources [160], with special regard to renewable sources [161,162], and the creation of backup data sets that can be used in situations of information and communication disruptions [163].
The use of sets, techniques, and processes concerning Big Data for the power industry is inextricably linked with the security of the stored data. The security of this type of data can be increased through its location dispersion (e.g., SCOOP system) [144].
Data streams supplying Big Data sets in transmission and distribution power systems come from [164][165][166]: Supervisory Control And Data Acquisition (SCADA) systems [167], phasor measurement systems in Wide Area Management System (WAMS) technology [168], Intelligent Electronic Devices (IEDs), network asset management systems, conventional and smart meters [147,169,170,171], information exchange systems with electricity market participants, seismic and meteorological institutes, Global Positioning System (GPS) systems, and Geographic Information System (GIS) systems.
The practical method of the similarity of days [172][173][174][175][176] allows a forecasting error of power demand below 3.00% per day, and the error achieved by the Polish Transmission System Operator (PSE S.A.) is approx. 1.00%. In the first step, similar days are selected based on the most recent demand factor forecasts. In the second step, a weighted average is calculated for each hour of the day from the historical values. In the classical approach, the individual weights vary only slightly. By weighting the most similar days, it is possible to obtain minimum, maximum, and average errors for the entire day below 2% [176]. The method of self-adaptive weighting is successfully used in forecasting the demand for electric power in microgrids. Compared to the standard methods of dynamic demand profiles, multiple regression, and Artificial Neural Networks, it almost doubles forecasting effectiveness (error of approx. 3.5%) [177]. A similar error level (3.99%) obtained with the multiple regression method for the power system shows that, despite the longer computation time (for a seven-day horizon), its classical version [178], which uses forecasts of weather parameters as input data (explanatory variables), gives a similar quality.
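The two-step similar-day procedure described above can be sketched as follows. The similarity criterion (absolute difference of a single day-level feature, such as a forecast mean temperature), the inverse-distance weights, and all names and values are assumptions made for this illustration; the cited works define their own similarity measures and weighting schemes.

```python
def similar_day_forecast(history, target_feature, k=2):
    """Forecast an hourly profile as a weighted average of the k most similar days.

    history: list of (feature, hourly_loads) pairs for past days.
    """
    if k < 1 or k > len(history):
        raise ValueError("k must be between 1 and the number of historical days")
    # Step 1: select the k historical days closest to the target feature.
    nearest = sorted(history, key=lambda day: abs(day[0] - target_feature))[:k]
    # Step 2: weight each selected day by inverse distance (normalized to sum to 1).
    raw = [1.0 / (abs(feature - target_feature) + 1e-6) for feature, _ in nearest]
    total = sum(raw)
    weights = [w / total for w in raw]
    hours = len(nearest[0][1])
    return [
        sum(w * loads[h] for w, (_, loads) in zip(weights, nearest))
        for h in range(hours)
    ]


# Hypothetical history: (mean temperature in degrees C, load in MW for three hours).
history = [
    (10.0, [100.0, 120.0, 110.0]),
    (14.0, [90.0, 105.0, 100.0]),
    (25.0, [70.0, 80.0, 75.0]),
]
profile = similar_day_forecast(history, target_feature=12.0, k=2)
print(profile)  # the two nearest days are equally distant, so they are averaged
```

A self-adaptive variant would update the weights from recent forecast errors instead of fixing them by distance.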
The use of Artificial Neural Networks in short-term forecasting of electrical power demand in power systems does not always give exceptionally effective forecasting results compared to other methods. Artificial Neural Networks require significant research experience, and the results, even using efficient network learning methods [147], rarely give errors below 1.00% per day. Often, advanced Artificial Neural Networks provide forecasting efficiency expressed by values of the Mean Absolute Percentage Error (MAPE) from approx. 3.00% to even approx. 13.00% (in the 20-day horizon) [5]. The knowledge of electrical power quality parameters is one of the key elements for entities operating in the electricity market [179]. Cyclical measurements of these parameters (including the assessment of the condition of electrical apparatus and devices [180]), and their transmission and collection, in addition to the conducted analyses, may affect the medium-term planning of outages of individual elements of the transmission network and, thus, indirectly, short-term forecasting of power demand.

The State of Research in Short-Term Power Demand Forecasting for Power Systems Using Autoregressive and Non-Autoregressive Methods and Models
The study (Figure 1) was planned in such a way as to answer the question of whether the use of autoregressive methods in short-term forecasting of electricity demand in power systems can be even more effective and, at the same time, inexpensive and quick to implement. In order to answer this question, scientific articles presenting the effectiveness of autoregressive forecasting models determined by the MAPE were analyzed. The result of the review is Table 1 and a ranking of forecasting models (Tables 2 and 3), and the Top 10 collection of the ten most effective forecasting models. As a result of the review and development of the ranking of forecasting models, it was confirmed that the use of autoregressive models may support the transmission system operator to achieve better forecasting efficiency.
The literature review (Table 1) included 47 unique items and titles, several dozen forecasting methods, and 264 forecasting models. The scientific papers were published in the period from 1997 to 2018 and concerned short-term forecasting of power demand. The source data used by the authors of the analyzed publications, constituting the input for the forecasting models, covered the period from 1998 to 2014. Diverse and international teams of authors conducted their research based on data on the functioning of power systems in 25 countries located on four continents: in the countries of the Near and the Far East, Western Europe (including the British Isles), Central Europe (including Poland), North America (USA), and Australia. The publications indicated were compiled by 44 different teams of authors and published by 23 publishing houses. The analysis concerning the nomenclature of forecasting models covers a set of 185 unique items. Diversifying the observed relationships in individual forecasting models results in identifying 197 unique abbreviations assigned to forecasting models. The MAPE(ea) in Table 1 means that the accuracy results are measured in ex ante mode.
All the reviewed references describe the effectiveness of the presented forecasting models, in terms of the MAPE measure, to assess the accuracy of the forecasts. To analyze the collected forecasting results, 27 unique names of MAPE errors were distinguished for this analysis, reflecting the forecasting models used in the analysis. Some of the forecast results described by the MAPE index, contained in selected publications, are presented from the lowest value (MAPE min) to the highest value (MAPE max). In contrast, the remaining part of the results is described by one value.
Results reported as a single value were assigned to both the minimum and the maximum category, to match the dominant approach used in the selected publications. The MAPE min values range from 0.01% to 21.18%, while the MAPE max values range from 0.01% to 33.45%. The MAPE min category includes 196 unique items from a set of 264 models, while the MAPE max category includes 212 unique items from the same set.
Further analysis of the effectiveness of the forecasts obtained, described by the MAPE forecasting quality measure, concerns the MAPE min category. A set of the ten smallest results expressed as percentages was selected in this category. In the Top 10 set, only the analytical studies on the GRM forecasting model are performed ex ante (ea). In the case of this model, the efficiency obtained in the third position should be considered very high. The GRM model uses information about the ambient temperature as an input variable. The second model that uses input variables is the FGRM model, which considers both the variability in the ambient temperature and the wind speed. The FGRM model ranks seventh in the Top 10 ranking in the MAPE min category.
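The ranking step described above can be sketched in a few lines: single-valued results are expanded into identical minimum and maximum values, the models are sorted by MAPE min, and the first ten positions form the Top 10 set. The model names and MAPE values below are hypothetical placeholders (they are not taken from Table 1), and breaking ties by MAPE max is an assumption of this sketch.

```python
def normalize(entry):
    """Return (name, mape_min, mape_max); a single value fills both categories."""
    name, value = entry
    if isinstance(value, tuple):
        mape_min, mape_max = value
    else:
        mape_min = mape_max = value
    return name, mape_min, mape_max


def rank_by_mape_min(entries, top_n=10):
    """Rank models by ascending MAPE min (ties broken by MAPE max); 1 is best."""
    rows = sorted((normalize(e) for e in entries), key=lambda r: (r[1], r[2]))
    return [(position, name, lo, hi)
            for position, (name, lo, hi) in enumerate(rows[:top_n], start=1)]


# Hypothetical review entries: (model name, MAPE in % or a (min, max) pair in %).
entries = [
    ("model_A", (0.8, 2.5)),
    ("model_B", 1.2),
    ("model_C", (0.3, 4.0)),
    ("model_D", 0.3),
]
top3 = rank_by_mape_min(entries, top_n=3)
for row in top3:
    print(row)
```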
The forecasting effectiveness described by the lowest value of the MAPE min has an ambiguous effect on high forecasting efficiency. The power systems subject to forecast analysis in the Top 10 list are (in ascending order) the systems of Iran (two items), USA (one item), Iran (three items), USA (one item), and Australia (three items).
The length of the analyzed period significantly affects the quality of forecasting obtained. Along with the extension of the analysis period, including the natural impact of non-working days and holidays, both cyclical and non-cyclical, there is a decline in the effectiveness of the obtained forecasts of the load on power systems. The full forecasting model ranking is presented in Tables 2 and 3, where the column Model No. represents the model number from Table 1 (the last column on the right), and the column Ranking shows the position in the model ranking (1 equals the first position and 264 equals the last position). Table 2 consists of the models from Table 1 from 1 to 132 (in four pairs of Ranking and Model Number), and Table 3 shows the same scheme for the models from 133 to 264. Tables 2 and 3 present four sets of Ranking and Model Number. Articles [183][184][185] from 2019 to 2021 indicate that analysis and research are being continued, including with the use of some of the analyzed methods.

Conclusions
The 47 publications describing 264 models published from 1997 to 2018 were analyzed in detail, including methods that use explanatory variables in order to broaden the background of the analyses. Some relevant publications from 2019 to 2021 were also included to determine if autoregressive methods are still of interest. The results of the review confirm the significant potential of the autoregressive approach to power demand forecasting. The analyzed methods enable very high accuracy to be achieved in short-term forecasting with the resolution of one hour (accuracy measured in terms of MAPE is below 1%). The methods whose effectiveness was classified in the Top 10 set are Fuzzy [...]. This shows the potential of the autoregressive prediction approach used in the models for short-term power demand forecasting in power systems.

Critical Discussion, Major Findings and Future Scope of Research
The results of the review show that short-term forecasting of electric power demand with hourly resolution enables an error (MAPE) of below 1% to be achieved. It should be borne in mind that such effectiveness should apply to the entire calendar year. In the analyzed collection of 47 articles from all over the world, the analysis period ranges from several months to several years, which indicates that the research covers significant periods of time, and the analyzed models are stable and resistant to changes in external conditions (economic and climatic conditions). The group of the most effective prognostic models includes models using artificial intelligence techniques (e.g., Artificial Neural Networks, Fuzzy Logic, and Genetic Algorithms). The effective methods also include classic forecasting methods (e.g., ARIMA, Multiple Regression, Exponential Smoothing) and methods from the Data Mining group (e.g., Support Vector Machines, Nearest Neighbors, Random Forest).
The article confirms the authors' thesis about the enormous potential inherent in the use of the autoregressive approach for short-term forecasting of electricity demand. The results of the review (the prepared ranking of prognostic models and the knowledge from the analyzed articles) constitute an excellent starting point for further tests and pave the way for future research in this area.
The future research of the authors will focus on the first step of testing the prognostic models from the Top 10 set. The tests will take into account both the achieved effectiveness and the necessary financial costs and time consumption of the process. In the next step, the most effective prognostic methods selected in the first step will be tested, including individual testing in off-line mode. In the third step of further research, prognostic model committees will be established. The developed committees will assign weights to the participation of individual models (step 1) and test the suitability of individual models for forecasting individual hours of the day or periods of the day (step 2). The MAPE selected by the authors for the review analysis, despite the undoubted advantage of being able to be used to easily compare the effectiveness between forecasting models, has a tendency to average forecasts. Therefore, in future studies, the authors will also use other measures to assess the quality of forecasts, such as Mean Absolute Error, Mean Absolute Scaled Error, and Root Mean Square Error, and others as needed. The usefulness of the tested forecasting models will be assessed, taking into account the seasonality, periodicity, and ranges of hours during the day. The developed review encompasses an excellent range of forecasting methods and models that can be used at any time, and the usefulness of each of them may prove invaluable from the point of view of the needs of the Polish Transmission System Operator.
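The planned committee of models and the additional quality measures named above can be illustrated together. In the sketch below, a committee forecast is a weighted average of the individual model forecasts, and it is evaluated with the Mean Absolute Error, the Root Mean Square Error, and the Mean Absolute Scaled Error (here scaled by the in-sample error of a naive forecast). The weights, the data, and all names are hypothetical; how the weights would actually be assigned is left open in the text.

```python
import math


def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mase(actual, forecast, training, season=1):
    """MAE scaled by the in-sample MAE of a (seasonal) naive forecast."""
    naive_errors = [abs(training[t] - training[t - season])
                    for t in range(season, len(training))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale


def committee_forecast(forecasts, weights):
    """Weighted average of several model forecasts, hour by hour."""
    total = sum(weights)
    return [sum(w * f[h] for w, f in zip(weights, forecasts)) / total
            for h in range(len(forecasts[0]))]


# Hypothetical data in MW.
training = [100.0, 104.0, 102.0, 106.0]
actual = [108.0, 110.0]
model_a = [106.0, 112.0]
model_b = [110.0, 112.0]
combined = committee_forecast([model_a, model_b], weights=[1.0, 1.0])
print(combined, mae(actual, combined), rmse(actual, combined),
      mase(actual, combined, training))
```

Unlike the MAPE, these measures are either expressed in the units of the load (MAE, RMSE) or relative to a naive benchmark (MASE), so they respond differently to large individual errors.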

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
