Analysis of Natural Gas Consumption Interdependence for Polish Industrial Consumers on the Basis of an Econometric Model

Abstract: The transmission of natural gas is a key element of the Polish energy system. Data published by the Polish distribution system operators and the transmission system operator on the volume of gaseous fuel transmitted indicate a growing trend in the consumption of energy produced from natural gas. In connection with the energy transformation, which involves switching energy generation sources from hard coal to natural gas in Poland, it is important for transmission operators to know the future demand for gaseous fuel. The aim of the article is to develop an econometric model of the consumption of gaseous fuel by Polish entrepreneurs. The resulting knowledge may be useful for making business decisions related to the possible expansion of the transmission system, and thus to investing financial resources for this purpose. It will also provide quantitative information on the interest in gaseous fuel among industrial consumers and on the trend of natural gas consumption in Poland in the context of the energy transition. The intention of the publication was to determine the macroeconomic indicators that strongly affect natural gas consumption by Polish industry and the quantitative growth of consumption depending on changes in these indicators. The results showed that the growth of natural gas consumption is most strongly correlated with the production of chemicals, the chemical industry, and the power industry.


Introduction
The motivation for the article was to present a cause-and-effect analysis of the influence of external factors on the consumption of natural gas by Polish industry. The research was based on the estimation method most frequently used in economics, i.e., the least squares method. Estimating the unknown model parameters with this method yields the estimates for which the model best describes the observed data. In recent years there has been a lack of research on this topic; the results of the analysis may help to illustrate the role of natural gas in the energy transition. The transmission of natural gas is one of the main components of the country's energy system [1]. The data published by the operators of the national distribution system and the transmission system on the volume of gaseous fuel transmitted point to an upward trend in the consumption of energy produced from natural gas [2]. The basic task of the operators transmitting natural gas to final consumers is its safe and uninterrupted delivery; ensuring the continuity of supplies has a decisive impact on maintaining energy security and the stability of an economy based on natural gas sources [3].
In connection with the energy transformation, i.e., the switch of energy production sources from hard coal to natural gas, it is crucial for transmission operators to predict the future demand for gaseous fuel [4]. Knowledge of future gaseous fuel consumption supports business decisions on the possible expansion of the transmission system and thus on investing financial resources for this purpose. The prediction also provides quantitative information on the interest in gaseous fuel among industrial consumers and makes it possible to check the trend of natural gas consumption in Poland in the context of the energy transition.
It should be emphasized in this context that the current energy transformation is significantly influenced by the European Union's climate and energy policy, including its long-term vision of achieving climate neutrality by 2050. With reference to this policy, Poland has developed its energy policy until 2040 (PEP2040). PEP2040 contributes to the implementation of the Paris Agreement concluded in December 2015 during the 21st Conference of the Parties to the United Nations Framework Convention on Climate Change (COP21) [5]. A critical element of PEP2040 is natural gas, which is expected to be the bridge fuel in the energy transition. The national resource potential makes it possible to cover the demand for coal and biomass independently, but most of the demand for natural gas has to be covered by imports.
Despite the year-on-year increase in the number of recognized hydrocarbon fields (Table 1), the volume of natural gas from national production delivered to the national transmission system is decreasing (Figure 1). This confirms that diversification of supplies from various sources guarantees security of supply. Fossil energy resources (coal, oil, and natural gas) currently have no substitutes able to meet the required energy demand. Poland has no chance of being self-sufficient in covering the country's demand for oil and natural gas [6]. For this reason, it is important to diversify the supply routes [7]. The production of electricity from coal remains an important element of energy production in Poland.
Table 1. Volume of recoverable natural gas reserves from gas, oil, and condensate fields in mln m3 (own elaboration based on data from Polish Geological Institute).

Many factors may influence the consumption of natural gas by final consumers (industrial consumers, households) [8]. It is important to note that interest in gaseous fuel is growing in Poland (Figure 2). Based on the literature on the subject, several potential factors that can influence the consumption of natural gas in Poland can be identified [9,10]. These are described in the following section. It should be mentioned that the influence of the coronavirus pandemic on the volumes of transmitted natural gas has also been noticed recently [11]. Consequently, no attempt is made to build a model for the years 2020-2021, as they are affected by the distortion of the time series related to the supply of gaseous fuels.
This article can also fill the research gap concerning the use of econometric models to describe the consumption of natural gas by industrial consumers in Poland, a country that aspires to create a gas hub in the future and is switching from coal-based to gas-based power generation. Furthermore, the Polish transmission network is currently being extended, with interconnections with Lithuania, Slovakia, and the Czech Republic and the North-South Gas Corridor being developed. An analysis of recent publications has not revealed any material that attempts to analyze the interdependence of natural gas consumption by Polish industry; such publications are practically nonexistent. There are, of course, scientific articles that present forecasts using mechanical forecasting techniques, but without a thorough analysis of the external factors that could affect the consumption of natural gas. The article analyzes the possibility of using a few dozen macroeconomic indicators that may describe the phenomenon under study. Finally, 12 factors were selected for the article, as preliminary analysis showed that they have the greatest influence on consumption. The article presents their potential and the possibility of using them in the final econometric model.
The contribution of the paper is the collection of macroeconomic indicators describing the current reality of the Polish natural gas market, together with their interpretation. In particular, attention was paid to the most important determinants of natural gas consumption by the Polish economy, so that the continuity of supply is maintained and the prospect of connecting new customers to the gas network is created. This knowledge may be useful for gas operators and energy companies.

Review of the Literature
One of the recent scientific publications that attempted to analyze the interdependence of energy consumption by end users is the article by Muglabeh et al., 2021 [12]. The results showed that energy consumption significantly affected economic growth and that there is a common causal relationship. Radmeher et al. 2021 show, by means of interdependence analysis, that the relationships between economic growth and carbon emissions and between economic growth and renewable energy consumption are bidirectional [13]. Abbasi 2021, also using econometric modeling, indicated that the industrial sector is a key factor in overall energy demand, closely related to the economy. Empirical analysis shows that these factors are cointegrated: a 1% increase in electricity consumption causes the price of electricity to fall by 0.19%, while a 1% increase in GDP causes it to fall by 0.16% [14]. Furthermore, Gerhson et al. 2021 indicated that fossil fuels are significant drivers of real GDP, or economic growth, for Nigeria [15].
An interesting publication in the area under review was the "Macroeconomic Short-Term High-Precision Combination Forecasting Algorithm" paper, which indicated that to calculate the potential growth rate, the three factors of potential total factor productivity, labor, and capital stock must be estimated and then the existing growth accounting model must be used to calculate the potential output [15].
Econometric modeling can be used to describe the energy balancing structure. In one publication, the authors studied the relationship between electricity consumption and economic reform. A 1% increase in economic output increases electricity consumption by 0.22% (income elasticity of electricity demand) [16].
Using econometric modeling, Ghosh et al. 2021 developed a model that optimizes the retail input, the wholesale price demanded by the producer, the environmental performance of the product and the selling price charged by retailers [17].
The coefficients in the econometric model can be interpreted as long-term elasticities, as the variables are expressed in natural logarithms. Thus, CO2 emissions increase with the level of production and with the consumption of fossil fuel-based energy. For example, a 3% increase in economic activity increases emissions by 1.27%, while a 3% increase in fossil fuel energy consumption increases total emissions by 1.69%, assuming other factors remain constant. Research on data for the Mexican economy found evidence of a bidirectional causal relationship between energy and production, indicating that the two variables form a complementary process, that is, an increase in GDP is accompanied by an increase in energy demand [18].
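The elasticity reading of a log-log specification can be made explicit. The following is only a sketch: the symbols (E for emissions, Q for economic activity, F for fossil fuel energy consumption, β for the coefficients, u for the error term) are notation assumed here and are not taken from the cited study [18].

```latex
\ln E_t = \beta_0 + \beta_1 \ln Q_t + \beta_2 \ln F_t + u_t,
\qquad
\beta_1 = \frac{\partial \ln E}{\partial \ln Q} \approx \frac{\%\Delta E}{\%\Delta Q},
\qquad
\beta_2 = \frac{\partial \ln E}{\partial \ln F} \approx \frac{\%\Delta E}{\%\Delta F}
```

Under this reading, the figures quoted above would correspond to elasticities of roughly 1.27/3 ≈ 0.42 for economic activity and 1.69/3 ≈ 0.56 for fossil fuel energy consumption.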
The causal analysis [19] based on an empirical model indicated that natural gas consumption in Indonesia boosts the welfare of the country. Other authors make the same observation and, based on an econometric model, confirm the hypothesis that natural gas consumption is induced by economic growth [20]. They recommend that policy makers intensify efforts to increase the accumulation of physical and human capital to offset industrialization, which will result in an increase in natural gas consumption in Malaysia. Another article indicated, using an econometric model, that an increase in natural gas consumption in the Mediterranean region leads to industrial growth [21]. Makala [22] analyzed the relationship between industrial growth and natural gas consumption, focusing on the relationship between natural gas consumption and economic growth in Tanzania. Another notable publication is the analysis of the natural gas market in Germany [23], in which the authors examine the factors used in the construction of a statistical model of natural gas consumption. The main factors indicated are population and outdoor temperature; the relationship with these explanatory variables was established by linear regression analysis. Such analyses were based on econometric models built from the perspective of domestic consumption. However, econometric models can be applied in various contexts, including natural gas consumption in buildings such as homes and offices. The method can therefore be used successfully from the point of view of the local consumer. The linear regression method can also be used in hybrid combination with other methods such as artificial neural networks, random forests, and support vector machines [24]. Econometric models can also be built successfully for renewable energy sources.
The application of such a model to hydroelectric plants is a promising approach [25]. The authors argue that these models can estimate the construction time of hydroelectric plants, which will help support environmental protection projects. The literature review shows that authors who build econometric models using correlation analysis of natural gas consumption conclude that economic development stimulates natural gas consumption.

Material and Methods
The objective of the paper is to determine the nature and type of cause-and-effect relationships for the consumption of natural gas by Polish end consumers. An econometric model estimated by the classical least squares method can be used for this purpose. The classical least squares method rests on several assumptions, among others: the estimated econometric model is linear with respect to the parameters, and the explanatory variables x_i are nonrandom quantities with fixed elements.
Consequently, it is reasonable to attempt to estimate a model that characterizes the dependence of gaseous fuel consumption on selected factors.
In general, the analytical form of such a model can be written as [26]:
$$Y = \alpha_0 + \alpha_1 x_1 + \dots + \alpha_n x_n + \varepsilon$$
The purpose of the model was to present the relationship between the explained variable Y and the explanatory variables x_n. It was assumed that the consumption of natural gas by consumers constitutes the explained (dependent) variable Y, and that the potential explanatory (independent) variables, characterizing the factors influencing natural gas consumption, are denoted x_n; α_0, ..., α_n are the parameters to be determined and ε is the random component. The following section presents the basic sequence of econometric modeling procedures. Figure 3 illustrates the procedure for the construction of the econometric model.
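Least squares estimation of a linear model of this kind can be sketched as follows. This is a minimal illustration on made-up data that solves the normal equations directly; the function names are hypothetical, and the calculations in the paper itself were carried out in GRETL.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_fit(y, columns):
    """Return [alpha_0, alpha_1, ..., alpha_n] minimizing the sum of squared residuals."""
    rows = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations (X'X) alpha = X'y


# Made-up data generated exactly by Y = 2 + 3*x1 - 1*x2 (no noise)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2 + 3 * a - b for a, b in zip(x1, x2)]
alphas = ols_fit(y, [x1, x2])
print(alphas)
```

Because the illustrative data contain no noise, the estimates recover the generating coefficients up to rounding error; with real series the residuals would be nonzero and would feed the diagnostic tests described later.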

Principal Characteristics of the Variables
The main part of constructing an econometric model with the least squares method is the specification of the variables, including data collection. The selection of explanatory variables from the originally adopted set was preceded by checking for significant relationships between the potential independent variables and the dependent variable. The hypotheses of stochastic independence for the examined characteristics were verified using Pearson's chi-square independence test. The literature review suggested a relationship between natural gas consumption and GDP [27]. A comprehensive review of the macroeconomic indicators published by the Central Statistical Office was conducted at monthly frequency. Analysis of the literature indicated that the three factors most highly correlated with natural gas demand are, in order, the structure of energy consumption, GDP, and the urbanization rate [28]. Results published for Indonesia, a country rich in natural resources that plays an active role in increasing international trade flows, showed that in the short term variables such as domestic consumption, the exchange rate, natural gas prices, and GDP per capita all have a significant impact on the volume of natural gas exports and imports [29]. Other studies indicate that factors such as total population, gross domestic product, urbanization rate, industrial structure, energy consumption structure, and carbon dioxide emissions are collinear with natural gas consumption. They also point to a strong correlation between natural gas consumption and outside air temperature. Natural gas demand generally increases in the coldest months and decreases in the warmest months, with a slight increase in the summer to meet the demand of electric generators [30].
The data on macroeconomic indicators were all taken from EUROSTAT.

Propose Optimal Macroeconomic Indicators
The first step is to compile a set of potential explanatory variables based on knowledge of the phenomenon under analysis. These variables follow from the theory of the given field. The selection criterion should be substantive knowledge of the studied phenomena: the factors (explanatory variables) selected should have a significant impact on the phenomenon researched. The variables collected in this manner are called the set of potential explanatory variables. The most important formal and statistical criteria in variable selection are that the variables should be characterized by high variability, that the correlation of the explained variable with the explanatory variables should be as high as possible, and that the explanatory variables should not be significantly correlated among themselves. The maximum degree of fit of the model to the actual economic relations should be sought, which is expressed in maximizing the coefficient of determination R²; this is done in the following section. Subsequently, the set of explanatory variables is reduced, guided by statistical criteria. The article relies on the graph method, the study of correlation significance, and information capacity.
The first step in building the econometric model was to specify the potential macroeconomic indicators that affect the consumption of gaseous fuel by industrial consumers [31]. The selection of explanatory variables for a causal model should be based on the theory of the relevant field, on practical experience and statistics, and on trial-and-error (intuitive) methods. Based on a general analysis of the macroeconomic indicators made available by EUROSTAT, the study selected those with the greatest potential (Table 2).
Table 2. The preliminary macroeconomic indicators proposed.

Y     Natural gas consumption by end consumers
X1    Primary cereal production (yields affect potassium phosphate production)
X2    Production of foods (consumers of gaseous fuel)
X3    Paper production (consumers of gaseous fuel)
X4    Production of refined products (consumers of gaseous fuel)
X5    Manufacture of chemicals and chemical products (consumers of gaseous fuel)
X6    Manufacture of nonmetallic mineral products (consumers of gaseous fuel)
X6    Metal production (consumers of gaseous fuel)
X7    Manufacture of metal products (consumers of gaseous fuel)
X8    Electricity, gas, steam, and air conditioning supply
X9    Heating days
X10   Contracted firm capacity in the National Natural Gas Transmission System
X11   Building production

After analyzing the proposed macroeconomic indicators in terms of constructing time series that can be used to build an econometric model (Table 3), it was decided to select the indicators with the highest p-values. Table 4 presents the essential statistics needed to develop the model, calculated with the GRETL software [32], an open-source statistical package used mainly in econometrics. These results are not discussed here because the non-stationarity of individual macroeconomic indicators first had to be removed. The results are discussed in detail later in the article.

Checking the Seasonality and Removal of Nonstationarity
The next fundamental step is to check the seasonality of the time series of the variables (Table 5). The variables can be used in the model, but there is a risk of spurious (apparent) regression. This occurs if the analyst includes the variables in the model and ignores the fact that they are non-stationary. Non-stationarity is removed so that the series become stationary. The analysis found trends in the time series of variables X3, X4, X6, X8, X10; these series are trend-stationary. Removing non-stationarity is a basic element in building an econometric model [33]. For the indicated variables, the trend was removed by differencing (Table 6).
The above results are reasonably satisfactory (Table 7), but the objective was to achieve even better results, with a Durbin-Watson statistic close to 2.0 [34]. The results closest to these assumptions were obtained after removing the variables X7, X8, X12 (Table 8).
The p-value for the F-test was 5.33 × 10^−8 (Table 9), which is a satisfactory result: the proposed set of variables has explanatory power [35]. The coefficient of determination R-square was 0.87, which is sufficient to analyze the relationship. The Akaike, Hannan-Quinn, and Bayes-Schwarz information criteria are close to each other, which is a satisfactory result [36]. Furthermore, the value of the Durbin-Watson statistic is greater than the coefficient of determination R-square, a satisfactory result [37]. Another important piece of information is that the Durbin-Watson statistic equals 2, which according to the distribution table is a good result. Furthermore, the standard error of the residuals was 244,692.9, which is small in relation to the constant −9.80418 × 10^6, a good result.
The analysis showed that the adjusted R-square and the coefficient of determination R-square differ by less than 5%, which is a good result. It should be added that the results presented suggest eliminating the variable X5 in the further construction of the econometric model.
The correlation matrix (Table 10) [38] showed that the correlation coefficients of the variables X9, X10, and X11 exceed the critical value, i.e., 0.333; these variables are potentially strongly related to the explained variable Y and explain it well.
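The differencing step and the Durbin-Watson statistic used in this section can be illustrated with a short sketch. The series below are made up for illustration and the function names are hypothetical; the results in the paper were obtained with GRETL.

```python
def first_difference(series):
    """Return the first differences x_t - x_(t-1) of a series."""
    return [cur - prev for prev, cur in zip(series, series[1:])]


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by their sum of squares (values near 2 suggest no
    first-order autocorrelation)."""
    numerator = sum((cur - prev) ** 2 for prev, cur in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# A made-up series with a linear trend: differencing leaves a constant
trend = [10.0 + 2.0 * t for t in range(6)]
print(first_difference(trend))

# Made-up residuals that alternate in sign (negative autocorrelation)
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

The statistic ranges from 0 (strong positive autocorrelation) to 4 (strong negative autocorrelation), which is why a value close to 2 is the target mentioned above.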

The Hellwig Method
The Hellwig method is a formal method of selecting explanatory variables for a statistical model. The GRETL software provides a ready-to-use package for selecting variables with this method [39]. Hellwig's method indicated that the variables X10_filter and X11 are the most significant. The integral capacity was 0.42. Given the correlation between the variables Y and X9, the variable X9 was also included in the model.
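The idea behind Hellwig's information capacity method can be sketched as follows. This is a minimal illustration assuming the standard textbook formulation (individual capacity of a variable equals its squared correlation with Y divided by the sum of absolute correlations with all variables in the combination); the correlation values are made up and the function name is hypothetical, so this is not the GRETL package itself.

```python
from itertools import combinations


def hellwig(r0, r):
    """Return the combination of candidate variables with the highest integral capacity.

    r0[j]   -- correlation of candidate j with the explained variable Y
    r[i][j] -- correlation between candidates i and j (r[j][j] == 1)
    """
    best_combo, best_capacity = None, -1.0
    m = len(r0)
    for size in range(1, m + 1):
        for combo in combinations(range(m), size):
            # integral capacity = sum of individual capacities within the combination
            capacity = sum(
                r0[j] ** 2 / sum(abs(r[i][j]) for i in combo) for j in combo
            )
            if capacity > best_capacity:
                best_combo, best_capacity = combo, capacity
    return best_combo, best_capacity


# Made-up correlations: candidates 0 and 1 are strongly correlated with each other
r0 = [0.8, 0.6, 0.1]
r = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
combo, capacity = hellwig(r0, r)
print(combo, round(capacity, 2))
```

In this toy example the method avoids combining the two mutually correlated candidates, which mirrors the selection criterion stated earlier that explanatory variables should not be significantly correlated among themselves.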

Stepwise Regression Method
Stepwise regression is a method that iteratively examines the statistical significance of each independent variable in a linear regression model. It is a popular data mining tool that uses statistical significance to select the explanatory variables to be used in a multiple regression model [40]. The GRETL software also provides ready-made packages for calculations using this method (Tables 11 and 12). A significance level of 5% was assumed for the Student's t-test. The stepwise regression method showed that the significant explanatory variables of Y are X6, X10, and X11 (Table 13). To summarize, the variables indicated by the Hellwig method and by stepwise regression were combined, and the following were used to build the model: X6, X9, X10, X11.
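One common variant of stepwise regression, backward elimination, can be sketched as below. This is an illustration under stated assumptions, not the GRETL procedure: the data are made up, the function names are hypothetical, and a fixed threshold `t_crit=2.0` stands in for the exact 5% critical value of the Student's t-distribution.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_statistics(y, columns):
    """Fit OLS with an intercept and return the t statistics of the slopes."""
    rows = [[1.0] + [c[t] for c in columns] for t in range(len(y))]
    k = len(rows[0])
    xtx_inv = invert([[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)])
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yt - sum(b * v for b, v in zip(beta, r)) for r, yt in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)  # residual variance
    return [beta[i] / (sigma2 * xtx_inv[i][i]) ** 0.5 for i in range(1, k)]


def backward_elimination(y, candidates, t_crit=2.0):
    """Drop the least significant variable until every |t| reaches t_crit."""
    kept = dict(candidates)
    while kept:
        names = list(kept)
        t_vals = t_statistics(y, [kept[name] for name in names])
        weakest = min(range(len(names)), key=lambda i: abs(t_vals[i]))
        if abs(t_vals[weakest]) >= t_crit:
            break
        del kept[names[weakest]]
    return list(kept)


# Made-up data: y depends on xa only; xb is unrelated to y by construction
xa = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
xb = [7.0, 1.0, -3.0, -5.0, -5.0, -3.0, 1.0, 7.0]
noise = [-0.7, 0.5, 0.7, 0.3, -0.3, -0.7, -0.5, 0.7]
y = [5.0 + 2.0 * a + e for a, e in zip(xa, noise)]
selected = backward_elimination(y, {"xa": xa, "xb": xb})
print(selected)
```

Forward selection or a bidirectional procedure would follow the same pattern, adding or removing one variable per iteration according to its significance.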

Building a Model That Includes Relevant Variables
A new least squares model with four variables was estimated and finally adopted (Tables 14 and 15) as the outcome of the previous models. On the basis of these results, it can be concluded that the parameters of interest have improved.

Testing the Normality Distribution of the Residuals
Null hypothesis: the empirical distribution of the residuals is normal (Figure 6). The Doornik-Hansen test, based on transformed skewness and kurtosis, gave Chi-square(2) = 1.060 with a p-value of 0.58863 (Table 16). Since the p-value exceeds 5%, there are no grounds to reject the null hypothesis.
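The reported p-value can be checked by hand, because the upper-tail probability of a chi-square distribution with two degrees of freedom has the closed form exp(−x/2). The short sketch below applies this to the reported statistic; the function name is hypothetical, and the small difference from the published value comes from the statistic being rounded to 1.060.

```python
import math


def chi2_df2_p_value(statistic):
    """Upper-tail probability of a chi-square(2) variable: P(X > x) = exp(-x / 2)."""
    return math.exp(-statistic / 2.0)


p_value = chi2_df2_p_value(1.060)  # statistic reported for the Doornik-Hansen test
print(round(p_value, 4))
print(p_value > 0.05)  # True means normality of the residuals is not rejected at 5%
```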

Testing for Autocorrelation
To test for autocorrelation, the Breusch-Godfrey test based on Lagrange multipliers was performed [41]. The null hypothesis of this test is the absence of autocorrelation.
Coefficient of determination: R-square = 0.387876. For all test statistics, the p-values are greater than 5%, and therefore there are no grounds to reject the null hypothesis, i.e., the model shows no autocorrelation. The residuals do not depend on the lagged residuals (Figure 7).

Removal of Autocorrelation (Cochrane-Orcutt Method)
Autocorrelation is the dependence between the random components of different periods. It occurs when the effects of a random disturbance do not expire within a given period. For all test statistics, the p-values are large (greater than 5%), so the null hypothesis cannot be rejected, i.e., the model shows no autocorrelation. The residuals do not depend on the lagged residuals.

Heteroskedasticity Test (Breusch-Pagan Test)
Heteroskedasticity occurs when the variance of the random component is not constant. It can be caused both by an incorrect functional form of the model and by the omission of relevant variables. To test for heteroskedasticity, the White test and the Breusch-Pagan test were performed [42]. The null hypothesis of both tests is that there is no heteroskedasticity. Constant variance means that the variance of the random component is the same at each measurement point. This is a very important assumption. Both tests check whether the squared residuals depend on the explanatory variables.

Stability Test for Coefficients
Variance inflation factor (VIF) is a measure of the amount of multicollinearity in a set of multiple regression variables. Mathematically, the VIF for a regression model variable is equal to the ratio of the overall model variance to the variance of a model that includes only that single independent variable. The minimum possible value of VIF(j) is 1.0, and values > 10.0 indicate a collinearity problem. It is computed as $\mathrm{VIF}(j) = 1/(1 - R(j)^2)$, where R(j) is the multiple correlation coefficient between variable j and the other independent variables of the model (Table 19). In the Belsley-Kuh-Welsch (BKW) collinearity diagnostics, lambda denotes the eigenvalues of the inverse covariance matrix (the smallest is 0.00082038) and cond denotes the condition index (note: the variance proportions in each column sum to 1.0). According to BKW, cond >= 30 indicates a "strong" near-linear dependence, and cond between 10 and 30 indicates a "moderately strong" one. Estimated parameters whose variance is mostly associated with high values of the condition index may themselves be considered problematic.
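The VIF definition above can be illustrated with a minimal sketch in Python. It relies on the standard identity that VIF(j) equals the j-th diagonal element of the inverse of the correlation matrix of the explanatory variables. The function names and the two short series are hypothetical and serve only as an illustration; they are not the data or the GRETL routine used in the article.

```python
from statistics import mean


def correlation(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """Return {name: VIF} from the inverse correlation matrix of the regressors."""
    names = list(columns)
    corr = [[correlation(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical regressors (illustration only); their correlation is 0.8.
regressors = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0], "x2": [2.0, 1.0, 4.0, 3.0, 5.0]}
vifs = variance_inflation_factors(regressors)
flagged = [name for name, v in vifs.items() if v > 10.0]
print(vifs, flagged)
```

With only two regressors the result reduces to 1/(1 - r^2), where r is their correlation, which makes the example easy to check by hand.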
The test was also repeated for the variables X10_filter, X11, and X6_filter (Table 21) and for a different set of parameters (Table 22). The Belsley-Kuh-Welsch collinearity diagnostics (variance proportions) are given in Table 23 and were likewise repeated for a different set of parameters (Table 24). X10 and X11 were also indicated by stepwise regression and by Hellwig's method as the best variables for explaining Y. The collinearity has been removed. The new model appears to retain its positive properties. To confirm this, the tests in the previous subsections were repeated.

Final Form of Macroeconomic Indicators
Least squares estimation used observations 2013:02-2015:01 (N = 24). The dependent variable (Y) is Y_filter (Tables 27 and 28). A CUSUM test shows whether the measure of the model's sensitivity stays within the confidence interval, i.e., whether the coefficients remain unchanged over time (Figure 9) [44].
To improve the stability of the run, a modification was made to account for the structural change in variable X11. The run as a whole then formally falls within the confidence interval.

Coincidence Test
The term coincidence test originally denotes comparing the frequencies of two periodic phenomena (such as the ticking of two clocks) by observing the interval between their successive coincidences, the interval being the least common multiple of the periods compared [45]. In a regression model, coincidence means that the sign of each estimated coefficient agrees with the sign of the correlation between the corresponding explanatory variable and the dependent variable.
The coefficients of X11_pre_shock and X10_filter have the same positive signs, i.e., the model has coincidence. For X6_filter the signs are opposite. This does not exclude the variable from the model but may introduce some disturbance (Tables 29 and 30). Therefore, the following variables are taken for the final model: X11_before-shock (continuous power) and X10_filter (heating days) (Table 31).
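The sign comparison described above can be sketched in a few lines of Python. The sketch assumes the usual econometric reading of coincidence, namely that the sign of an estimated coefficient should match the sign of the correlation between that variable and Y. The helper names and all numbers are hypothetical and are not the estimates from Tables 29-31.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def coincidence_report(coefficients, correlations_with_y):
    """Map each variable name to True when both signs agree."""
    return {
        name: sign(coefficients[name]) == sign(correlations_with_y[name])
        for name in coefficients
    }


# Hypothetical numbers chosen only to illustrate the check; they are NOT
# the estimates reported in the article.
coefficients = {"X10_filter": 0.8, "X11_pre_shock": 1.5, "X6_filter": -0.3}
correlations_with_y = {"X10_filter": 0.6, "X11_pre_shock": 0.7, "X6_filter": 0.2}

report = coincidence_report(coefficients, correlations_with_y)
for name, coincident in report.items():
    print(name, "coincident" if coincident else "not coincident")
```

In this made-up example only X6_filter fails the check, which mirrors the qualitative situation described in the text.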

Experimental Results and Discussion
Econometric modeling has shown that for the proposed macroeconomic indicators, the natural gas consumption by Polish industrial consumers is determined to the greatest extent by the heat and power industry and the chemical industry. A significant role is also played by increasing the contracting of firm capacity provided by the Polish gas transmission pipelines operator, which proves an increased interest in gaseous fuel by the industry. This interest is related to the ongoing transformation process, in which natural gas will constitute a bridge fuel and an important factor in ensuring energy security.
To build the model, historical data on the supply of gaseous fuel were necessary. The analysis covered the years 2011-2021. In the analysed time interval, the potential macroeconomic indicators and the natural gas volumes shipped were given at monthly resolution. It was found that the natural gas supply series contains structural changes that make it necessary to analyse the time series separately in the periods January 2015-March 2018, March 2018-January 2020, and January 2020-December 2020. The selected macroeconomic indicators confirmed that structural changes occurred during the pandemic period. Therefore, a period of stable supply, 2011-2015, was chosen for the study, as required by the model.
Initially, variables such as the share of Property Rights to Certificates of Origin for energy produced from RES (in order to present them as a new source in the ongoing energy transition) in session transactions on the Polish Power Exchange and hard coal production in Poland (energy transition process) were proposed. Other variables proposed are energy-related goods, construction and assembly production (constant prices), dwelling or house occupancy and energy carriers, price index of industrial output sold, weighted average gas prices.
Based on the descriptive statistics, it was found that the coefficients of variation for the indicators "coal production [thousand tonnes]", "energy-related goods", "dwelling or house use and energy carriers", and "price indices of industrial output sold" are low, that is, less than 0.1. For this reason, these indices were not taken into account in further analysis. The article attempted to capture the impact of energy produced from hard coal, but the low variability of the index did not allow it. Therefore, an additional variable was introduced, namely PSCMI 2 (Polish Steam Coal Market Index). Moreover, because the above-mentioned indices could not be used in further analysis, additional indices were introduced: the price of Brent crude oil, CO2 emissions trading (EU ETS carbon market price in euros), the index of the number of heating days, the index of electricity/gas/steam/hot water generation and supply, and new orders in industry. An attempt was also made to introduce variables related to the length of available infrastructure and the number of new customers (connected and in the process of being connected), but without success because such data are protected and not available. For these additional variables, the coefficients of variation were no longer low, and they were taken for further analysis. After analysing the time series graphs of the proposed variables, it was concluded that the variable "coal production" should be eliminated because of its too frequent structural changes.
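The screening rule described above (discard indicators whose coefficient of variation is below 0.1) can be sketched as follows. This is only an illustration: the function names and the two short series are hypothetical, and the use of the population standard deviation is an assumption, since the source does not say which variant was used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the absolute value of the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Assumption: population standard deviation; the article does not specify.
    return pstdev(values) / abs(m)


def screen_by_variation(series_by_name, threshold=0.1):
    """Split candidate variables into kept and dropped lists by the threshold."""
    kept, dropped = [], []
    for name, values in series_by_name.items():
        cv = coefficient_of_variation(values)
        (kept if cv >= threshold else dropped).append(name)
    return kept, dropped


# Hypothetical monthly values, for illustration only.
candidates = {
    "nearly_constant_index": [100.0, 101.0, 99.0, 100.5, 99.5, 100.0],
    "heating_days": [20.0, 15.0, 8.0, 2.0, 0.0, 12.0],
}
kept, dropped = screen_by_variation(candidates)
print("kept:", kept)
print("dropped:", dropped)
```

A quasi-constant indicator carries almost no information about changes in Y, which is why such a threshold is commonly applied before estimation.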
Subsequently, a preliminary graphical analysis was carried out of the dependence of the explained variable (gas supply to end users) on the explanatory variables, as well as of the dependence between the explanatory variables themselves. It showed that the explained variable depends on all variables except "price index of industrial production sold". The preliminary analysis also revealed many unfavourable correlations between the explanatory variables.
Another important aspect was to assess the stationarity of the time series. The variables initially proposed can be used to build the model, but there is a risk of spurious regression. Therefore, both the explained variable and the explanatory variables were tested for stationarity. The Dickey-Fuller test for the Y variable showed that it is a series with an intercept, a linear trend, and a quadratic trend. The KPSS test confirmed this. The Dickey-Fuller and KPSS tests were also used to check the remaining variables for trend stationarity, stationarity, and stochastic non-stationarity. The results showed that the variables X1, X11, X12, and X13 are stationary. For variable X8, the series is stationary in the period 2016-2018, but a sharp increase since the beginning of 2018 causes a structural change, so these observations were not taken into account.
After examining the series for stationarity, non-stationarity was removed from the variables in which it was found. A preliminary analysis of the model was then carried out. The model was estimated using the classical least squares method. However, the very high p-value of the F-test (0.919) showed that the variables proposed so far could not produce a strong relationship in this system, so other candidate variables had to be considered. In addition, an attempt was made to take logarithms of the variables, but this did not significantly change the p-value of the F-test. In the search for additional variables, the list of entities classified as final customers was analysed in terms of their business profile. Furthermore, the set of customers that hold available transmission capacity in the national transmission system was analysed. The input data were then analysed again. Based on the analysis of the transmission customers, the entities can be divided into groups engaged in: (1) the production of basic chemicals, fertilisers and nitrogen compounds, plastics and synthetic rubber in primary forms, (2) the sale of heat and natural gas, (3) the production of building ceramics and table glass, (4) the manufacture of products for the automotive, engineering, and mining industries, (5) the manufacture of steel, (6) the generation of electricity and heat, (7) the manufacture of household chemicals, (8) retail and wholesale trade, (9) other activities. No information is available on the volume of gaseous fuel consumed by each group, but the available literature indicates that the largest amounts are consumed by industrial customers associated with the production of chemicals, fertilisers, electricity generation, building ceramics and other materials.
The time series graphs were analysed again to check for structural changes. Descriptive statistics showed that the variable X1 had a variance of less than 0.1, so this variable could not be taken for further analysis.
Next, graphs of the relationship between the explained variable and the explanatory variables were constructed. Stationarity was then reassessed, and non-stationarity was removed from the newly proposed variables.
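One simple way to remove a deterministic trend from a series is to regress it on a time index and keep the residuals. The sketch below does this for a linear trend only. It is an assumption made for illustration: the source does not specify the filtering procedure behind the "_filter" variables, and the function name and sample series are hypothetical.

```python
def remove_linear_trend(series):
    """Return the residuals of a least squares fit of the series on time.

    The time index is 0, 1, ..., n - 1. Only a linear trend is removed;
    a quadratic trend would need an additional squared-time regressor.
    """
    n = len(series)
    if n < 2:
        raise ValueError("at least two observations are required")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    slope = numerator / denominator
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


# Hypothetical trending series (illustration only).
series = [3.0, 4.5, 7.0, 8.5, 11.0, 12.5]
detrended = remove_linear_trend(series)
print([round(v, 3) for v in detrended])
```

The detrended values fluctuate around zero, and it is a series of this kind that would then be passed to the stationarity tests and to the regression.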
In the second analysis, based on the newly proposed variables, the p-value of the F-test was 0.28. This result could be acceptable given the value of the Durbin-Watson statistic, but the coefficient of determination (R-square) was 0.53. This led the author to combine the variables from model 1 and model 2 that have the strongest association with the variable under study.
It is important to note that both analyses carried out for the period January 2016-March 2018 revealed a problem in establishing a strong relationship with the Y variable.
The third analysis covered the period January 2012-2015, in which the greatest stability was observed. For this third attempt, the following indicators were adopted: production of staple cereals (yields affect the consumption of phosphates, potassium salt, and fertilizers), food production (consumers of gaseous fuels), paper production (consumers of gaseous fuels), refined products production (consumers of gaseous fuels), chemicals and chemical products production (consumers of gaseous fuels), manufacture of nonmetallic mineral products (consumers of gaseous fuels), manufacture of basic metals (consumers of gaseous fuels), manufacture of metal products (consumers of gaseous fuels), electricity, gas and steam production and supply, heating days, continuous power, and construction output.
Stationarity assessment and removal of non-stationarity were again performed for the proposed variables. The least squares estimation suggested removing variables X7, X8, and X12, which was done. With a satisfactory set of variables, the correlation between the variables was studied. This was done using a correlation matrix, which shows that variables X9, X10, and X11 are potentially strongly related to the explained variable Y and describe it well. Furthermore, an additional analysis was performed using the Hellwig method, which indicated that variables X10 (after removing trend stationarity) and X11 are the most significant. The integral capacity of information for this combination was 0.42. Due to the correlation between the variables Y and X9, X9 was also included in the model. In addition, the variables were examined using the stepwise regression method (an alternative to Hellwig's method), which assumed a significance level of 5% for the Student's t-test. The stepwise regression method showed that the significant explanatory variables for the Y variable were X6, X10, and X11. To sum up, the combined set of variables X6, X9, X10, X11 was used to build the further model. Based on the tests so far, the input variables were found to be good.
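Hellwig's method, mentioned above, ranks every combination of candidate regressors by its integral capacity of information. For a combination K, each variable j contributes h_j = r_0j^2 / sum over i in K of |r_ij|, and the integral capacity H is the sum of the h_j. A minimal sketch follows; the function names and the correlation values are hypothetical and do not reproduce the article's correlation matrix or its reported capacity of 0.42.

```python
from itertools import combinations


def hellwig_capacity(combo, r0, r):
    """Integral information capacity H of one combination of regressors.

    r0[j] is the correlation of regressor j with Y; r[i][j] is the
    correlation between regressors i and j (with r[j][j] == 1).
    """
    total = 0.0
    for j in combo:
        denominator = sum(abs(r[i][j]) for i in combo)  # includes |r_jj| = 1
        total += r0[j] ** 2 / denominator
    return total


def best_hellwig_combination(r0, r):
    """Evaluate every non-empty subset and return (best_combo, best_H)."""
    best_combo, best_h = None, -1.0
    for size in range(1, len(r0) + 1):
        for combo in combinations(range(len(r0)), size):
            h = hellwig_capacity(combo, r0, r)
            if h > best_h:
                best_combo, best_h = combo, h
    return best_combo, best_h


# Hypothetical correlations (illustration only).
r0 = [0.8, 0.6, 0.1]          # correlation of each regressor with Y
r = [[1.0, 0.5, 0.2],         # correlations between the regressors
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]
best_combo, best_h = best_hellwig_combination(r0, r)
print(best_combo, round(best_h, 4))
```

The method rewards variables that are strongly correlated with Y and penalises those that are strongly correlated with each other, which is why the weakly related third regressor is left out of the best combination in this example.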
The model building was then carried out with the relevant variables. A new least squares model was estimated with only four variables already included.
For the new least squares model, the normality distribution of the residuals was checked. The Chi-square test for normality of the distribution and the Doornik-Hansen test showed that the distribution is normal.
To check for autocorrelation, the Breusch-Godfrey test based on Lagrange multipliers was performed; its null hypothesis is the absence of autocorrelation. The test showed no autocorrelation of the random component, which may indicate that the input data to the model were analysed well. The next step in constructing the model was to test for heteroskedasticity using the White test and the Breusch-Pagan test. The null hypothesis of both tests is the absence of heteroskedasticity. Both tests indicated no heteroskedasticity problem (p = 0.74).
To further test the fit of the data to the model, the Ramsey RESET test was performed. Its null hypothesis is that the model is correctly specified (linearity of the model). The p-value (0.249) indicates that the model is correctly specified. The collinearity of the variables was then assessed with the VIF. The test was initially conducted for the variables X6 (after removing trend stationarity), X9, X10 (after removing trend stationarity), and X11. It showed that the highest collinearity occurred for variable X9. The test was therefore repeated for variables X10 (after removing trend stationarity), X11, and X6 (after removing trend stationarity). Stepwise regression and the Hellwig method had also indicated these as the best variables. Collinearity was thus removed. For the new set of variables, the normality test of the residuals, the autocorrelation test, and the heteroskedasticity test were performed again. These tests did not show any problems.
To check the stability of the model parameters, the CUSUM (cumulative sum of residuals) test was performed. This test shows whether the measure of the model's sensitivity stays within the confidence interval, that is, whether the coefficients remain unchanged over time. For three observations, a structural change is visible (not significant, so the model remains unchanged).
In order to improve the stability of the run, a modification taking structural changes into account was introduced for variable X11. The whole run then formally falls within the confidence interval.
The last step was to conduct a coincidence test; a lack of coincidence indicates collinearity of the variables. The coefficients of X11_before_shock and X10_filter have the same positive signs, i.e., there is coincidence in the model. For X6_filter the signs are opposite. This does not remove it from the model but may introduce some disturbance. The heating days index (heating degree days, HDD) measures the severity of the cold over a specified period of time, taking into account the outdoor temperature and the average indoor temperature (in other words, the need for heating). The HDD calculation is based on a base temperature, defined as the lowest daily average air temperature that does not lead to space heating. The value of the base temperature depends, in principle, on several factors related to the building and its surrounding environment. Using a general climatological approach, the base temperature is set at a constant value of 15 °C in the HDD calculation.
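The HDD calculation described above can be sketched as follows. The sketch uses the common simple definition, in which each day contributes the amount by which its mean temperature falls below the base temperature. That definition is an assumption here, since the source does not give the exact formula behind its heating days index. The function name and the sample temperatures are hypothetical.

```python
def heating_degree_days(daily_mean_temps_c, base_temp_c=15.0):
    """Sum of daily shortfalls of the mean temperature below the base.

    Days at or above the base temperature contribute nothing, because
    no space heating is assumed to be needed on those days.
    """
    return sum(max(0.0, base_temp_c - t) for t in daily_mean_temps_c)


# Hypothetical daily mean outdoor temperatures in degrees Celsius.
temperatures = [-2.0, 5.0, 14.0, 15.0, 20.0]
hdd = heating_degree_days(temperatures)
print(hdd)  # 17 + 10 + 1 + 0 + 0 = 28.0
```

Summing such daily values over a month gives a monthly index that can be aligned with monthly gas volumes.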

Conclusions
(c) Will increase by 11,989 MWh if the contracted capacity increases by a unit compared to the previous year.
On the basis of the obtained model of the interdependence of natural gas supply to customers, it may be concluded that the highest interdependence occurs for the production of mineral fertilizers, the index of the number of heating days (i.e., the consumption of gaseous fuel by industrial and commercial heat and power plants), and the capacity contracted by the remaining final customers. The other end users are a group with a wide range of production types, which may explain why the other proposed variables did not show a significant impact on the model.
In relation to the actual situation, the model is consistent with the expected significant increase in the share of gas units forecast by Polskie Sieci Elektroenergetyczne. The analyses prepared by Polskie Sieci Elektroenergetyczne to determine the future structure of electricity generation for the update of the transmission network development plan showed a possible significant increase in the number of gas units in the National Power Grid. In addition, the activity of entities from the electricity sector may result from the emerging power market (the Act of 8 December 2017 on the power market, Dz.U. of 2018, item 9) as well as from the necessity or willingness to replace, in the next few years, highly emitting energy carriers (coal) in view of increasing electricity demand.
Charges for CO2 of infrastructure and other business operations. Econometric models can be successfully applied in energy companies engaged in energy trading. Such models will allow them to understand market phenomena and investigate the reasons behind the industry's interest in energy.
This model can be used by energy companies involved in trading and in supplying gaseous fuels to end consumers. It directs the attention of energy operators to the most important external indicators influencing natural gas off-take, and thus identifies the parameters to which particular attention should be paid when planning the supply of natural gas to consumers. The article outlines the essence of macroeconomic modeling as an approach that can complement mechanical forecasting based on historical data. The model clearly indicated that the largest consumption of natural gas in Poland is related to the commercial power industry, heat engineering, and the switching of economies from coal to, among others, natural gas.
The selection of potential macroeconomic indicators should be critically assessed. The pre-selection of potential explanatory variables is time-consuming and the analyst must have a working knowledge of the market under study. Macroeconomic modeling is also unable to build a model capable of signaling the accumulation of negative economic phenomena.
Future research should identify the most significant factors influencing the consumption of natural gas by the Polish economy as a result of the ongoing energy transition towards carbon neutrality. It could also cover the period of the coronavirus pandemic, which reduced energy consumption by end users, and indicate how the pandemic directly affected the analyzed macroeconomic indicators, which subsequently affected energy consumption. Furthermore, the research procedure presented can also be applied to other energy sectors. A current challenge may be to apply this method in the area of renewable energy sources. Another interesting challenge is the development of an econometric model covering all available energy sources and its combination with the development of a forecast. It is also worth undertaking studies that analyze macroeconomic indicators for different countries of the European Union and develop comparable models, checking which indicators affect which countries. The results of such research could contribute to the discussion on a just energy transition.