1. Introduction
The oil shocks and the market supply instabilities and uncertainties that have occurred from 1973 until today have led to an increase in energy costs, with negative consequences for the growth of economies in different parts of the world. In addition, the 1997 Kyoto Protocol operationalized the United Nations Framework Convention on Climate Change by committing industrialized countries and economies in transition to limiting and reducing greenhouse gas (GHG) emissions in accordance with agreed-upon individual goals. These two developments at the end of the last century jointly forced public decision-makers to initiate energy- and carbon-saving policies. Consequently, Europe defined its goals for 2030 [1]: a 40% reduction in greenhouse gas emissions with respect to the 1990 values and an improvement in energy efficiency of at least 32.5%.
According to IRENA [2], energy efficiency and renewable energy sources may provide more than 80% of the required emissions savings [3]. These saving-oriented objectives require a management system enabling the evaluation of progress with respect to the defined goals on different time and space scales. Many authors are aware of the complexity of this problem, as highlighted in recent publications from either a descriptive or prospective point of view (e.g., [4,5,6,7]) or a methodological point of view (e.g., [8,9]). One of the issues is the lack of statistical data on a disaggregated scale, which contrasts with the pressing need for information at that level for such a strategic sector of the economy.
This paper therefore tries to address one of these issues. We will present and apply a recent statistical approach to recover statistical information in conditions where traditional mathematical or statistical techniques fall short. This approach is related to the principle of maximum entropy, which is known to deal with ill-posed inverse problems such as the one to be solved in this article. The same approach has been applied in the recent past; the most noteworthy work is one that assesses the interregional distribution of greenhouse gas emissions in Poland by industry [10]. In this paper, we try to recover the estimates for a dashboard of energy efficiency scores at a disaggregated subnational level where the statistical data do not exist. The case of the Polish provinces serves as a methodological illustration for many similar countries, particularly those within the European Union. Many developed countries publish annually (e.g., [11]) sectoral energy efficiency scores aggregated at the national level. However, such aggregated information makes it very difficult to assess and plan energy policies whose scope of operation lies at the sub-regional levels. To overcome this problem, statistical institutes instead calculate energy intensity scores, which are easier to estimate. The first question is whether energy efficiency coefficients at a disaggregated level (in our case, the NUTS 2 level) deserve to be produced alongside the energy intensity coefficients. The counter-argument is that the available energy intensity coefficients should be sufficient to provide information on efficiency. Nevertheless, the correlation between energy intensity and energy efficiency is generally far from perfect. For instance, a small service-based economy in a mild climate region will be characterized by a lower intensity than a large industry-based economy in a colder climate region, even though the latter may use energy more efficiently. In addition, other elements also play a role in defining the efficiency levels and trends. Among these are the regional economic structure (share of large energy-consuming industries), geographical characteristics (e.g., longer distances leading to higher demand for transport), and climatic and weather conditions (changing demand for heating or cooling) (e.g., [11]).
Based on what has just been presented, let us now concretize the problem targeted by this paper as follows: we need to estimate the energy efficiency coefficients for the sixteen provinces of Poland. The energy efficiency in question concerns the following four sectors: industry, transport, households, and services. The Odyssee-Mure project (e.g., [11]) publishes national statistics on the sectoral averages of these coefficients. Next, some institutes, such as the Polish Institute of Statistics [12], publish the energy intensity coefficients at a disaggregated level, e.g., at the province level. The question posed in this article is, therefore, how to reconcile these two sources of information to forecast the sectoral energy efficiency ratios at the province level. Mathematically, the problem as presented in a rectangular table (see Table 1) is ill-posed. On the one hand, we only have information on the averages of the energy intensity coefficients for each of the 16 provinces and on the averages of energy efficiency for each of the four sectors, without, in either case, any detail at the individual province-sector level. On the other hand, it follows that the total of the rows need not correspond to the total of the columns. Finally, if we additionally take into consideration the fact that the systems generating these two sources of information are not known with precision, the problem to be solved turns out to be an ill-posed stochastic inverse problem whose solution goes beyond traditional mathematical techniques.
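To make this inconsistency concrete, the following minimal sketch (in Python, with invented numbers rather than the actual Polish data) shows two sets of marginal totals that no single table can reproduce exactly:

```python
import numpy as np

# Hypothetical, invented numbers (not the actual Polish data): aggregated
# energy intensity totals per province (rows) and aggregated energy
# efficiency totals per sector (columns), coming from two separate sources.
row_totals = np.array([0.9, 1.4, 0.7])   # e.g., 3 provinces
col_totals = np.array([1.2, 1.1])        # e.g., 2 sectors

# The two sources disagree: the grand totals do not match, so no exact
# table can reproduce both margins at once. The problem is ill-posed
# even before we ask for a unique interior solution.
print(row_totals.sum(), col_totals.sum())  # 3.0 vs 2.3
```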
The remaining part of this paper is organized as follows. Section 2 introduces the concepts and definitions related to optimal energy use and saving through recent literature on the subject. Section 3 presents the mathematical and statistical insights of the model in the context of the problem to be solved; the concept of an inverse problem is defined, followed by a detailed presentation of non-extensive cross-entropy econometrics as the main technique applied by the model. Section 4 presents the model outputs and comments; at the end of that section, a sub-section is devoted to the main limitations of the proposed approach and areas for further research. Section 5 draws conclusions and highlights potential outcomes related to the application of the presented method.
2. Energy Efficiency and Its Measurement
This section provides basic concepts to enable a good understanding and interpretation of the computed outputs. There exists a vast literature on the definition and measurement of energy efficiency (e.g., [13,14]). The reason may reside in the complexity of the technical evaluation of energy efficiency [15]. The issues are the choice of the right aggregation level, the appropriate variables for constructing a reference energy consumption trend, the energy units to be applied, and the interaction between various effects. Most presentations also lack uncertainty margins for the results. Among this vast literature, we underscore some works that deserve particular attention in the context of this paper. In particular, the authors of [16] present two approaches to measuring energy efficiency. The bottom-up approach, to which the “Odex” index is linked, has been developed under the EU Odyssee-Mure programme; this is the approach developed in this article. The second is a top-down approach, which brings together the “Decomposition” methodologies as used by, e.g., the Netherlands, Canada, and New Zealand. Likewise, the authors of [17] propose the calculation of total factor energy efficiency (TFEE) using the concept of global production instead of domestic product, which excludes intermediate consumption. Another paper worth citing is [18], in which a literature review was carried out and the authors found that the definitions of total-factor energy efficiency and total-factor carbon emissions efficiency currently in use are confusing and misleading.
Regarding institutional research, the Environmental and Energy Study Institute [19] uses a definition of energy efficiency emphasizing the use of less energy to perform the same task, i.e., eliminating energy waste. The European Parliamentary Research Service (EPRS) definition [20] highlights the fact that energy efficiency should refer, in general terms, to the amount of output that can be produced with a given energy input. Most commonly, energy efficiency is measured as the amount of energy output for a given energy input, although other kinds of output can also be used. The EU Energy Efficiency Directive uses a very broad definition: “energy efficiency” means the ratio of output of performance, service, goods, or energy to input of energy. Following the International Energy Agency [21], the energy efficiency ratio is a ratio between energy consumption (measured in energy units) and activity data (measured in physical units).
Finally, the UNDP, as an organ of the United Nations, has published a summary work that takes up different approaches to calculating energy efficiency. In the third chapter [22], this institution begins by identifying the methodological challenges associated with defining and measuring energy efficiency. It then proposes a framework for understanding energy efficiency trends, integrating the current UNDP approach to energy efficiency developed by various international agencies and national institutions, and establishing a methodology to identify a starting point in relation to which future improvements in energy efficiency can be measured globally and at national levels.
It is worthwhile to pay attention to three related concepts used in the next part of the paper. For an economy-wide measure, GDP is often compared to energy use to give the energy intensity (measured, for example, in kilowatt-hours per euro). Next, energy savings are the reduction of energy use, without reference to output produced. Finally, as far as energy efficiency assessment is concerned, it can be done at different levels and according to different techniques; these levels range from economy-wide and sectoral energy intensity down to individual units of activity.
We now present some details regarding the ODEX energy efficiency index, from which the data to be used in our model are extracted a priori. This index is published by the Odyssee-Mure project [23] to measure the energy efficiency progress of the main sectors (industry, transport, households, services) and for the whole economy (all final consumers). The ODEX composite indicator is calculated as a weighted average of sectoral indices.
For each sector, the index is calculated as a weighted average of sub-sectoral indices of energy efficiency progress. The sub-sectors stand for industrial branches, service sector branches, end-uses for households, or transport modes.
This project calculates the scores by including the following energy efficiency components:
the energy efficiency level,
the energy efficiency trends,
the energy efficiency policies, and
the overall energy efficiency.
The first three criteria are scored between 0 and 1 on the basis of a variety of indicators (extracted from the Odyssee Database) and of energy policies (extracted from the Mure Database). The overall energy efficiency score is obtained as an average of the three scores obtained for “energy efficiency level”, “energy efficiency progress”, and “energy efficiency policies” (i.e., one-third weighting). This work will use data representing the overall energy efficiency. Following [23], the energy efficiency scoring technique is based on the OECD Composite Indicator methodology. This method allows countries or regions to be compared in a relevant range, where the minimum and maximum indicator values define the best and worst scores and countries or regions are ranked between these two extrema. The indicators are calculated and normalized so that they range between 0 and 1, following this formula:

$$\text{Score} = \begin{cases} \dfrac{\text{Indicator} - \text{Min indicator}}{\text{Max indicator} - \text{Min indicator}}, & \text{Direction} = 1, \\[6pt] \dfrac{\text{Max indicator} - \text{Indicator}}{\text{Max indicator} - \text{Min indicator}}, & \text{Direction} = -1, \end{cases}$$

where
Indicator: The indicator value of the country/region.
Min indicator: The minimum indicator value across all countries/regions.
Max indicator: The maximum indicator value across all countries/regions.
Direction: The favored direction in the level of the indicator; −1 if a decline is favored, 1 if an incline is favored.
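As an illustration, here is a minimal Python sketch of this min-max normalization, assuming the piecewise form given above (the function name and the numbers are ours, not Odyssee-Mure's):

```python
def oecd_score(indicator: float, min_ind: float, max_ind: float,
               direction: int) -> float:
    """Min-max normalization with a favored direction, as described above.

    direction = 1 if an incline is favored, -1 if a decline is favored;
    the result always lies in [0, 1], with 1 being the best score.
    """
    span = max_ind - min_ind
    if direction == 1:
        return (indicator - min_ind) / span
    return (max_ind - indicator) / span

# Example: energy consumption per dwelling, where a decline is favored.
print(oecd_score(140.0, min_ind=100.0, max_ind=200.0, direction=-1))  # 0.6
```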
Despite the “one-third weighting”, the most influential score is that of the energy efficiency level, to which the two remaining scores are related. Its scoring, according to Odyssee-Mure practice, is done as follows:
Scoring is done separately for the four considered sectors (households, transport, industry, and services) and for all sectors together.
The score by sector is based on scores computed for statistically selected indicators of end uses in buildings or modes in transport. For the industry sector, an aggregate score is obtained from various industrial branch scores that account for the energy efficiency characteristics of each of them.
The score by sector is calculated as a weighted score of each indicator. The weights correspond to the average shares over the last 3 years of each end use or transport mode in the sector consumption.
Finally, for comparative reasons, sector score values are normalized into the interval 0–1 according to the next formula:

$$\text{Score} = \frac{\text{Indicator} - \text{Min indicator}}{\text{Max indicator} - \text{Min indicator}},$$

where
Indicator: The indicator value of the sector.
Min indicator: The minimum indicator value across all sectors.
Max indicator: The maximum indicator value across all sectors.
To close this section, it is worth noticing that the energy efficiency measurements presented above will not apply in all cases. To illustrate this issue, one can mention the difficulties of estimating energy efficiency in the presence of a recent type of energy system known as the integrated energy system (IES). In the context of the energy crisis and environmental degradation, an IES based on the complementarity of multiple energy sources and the cascading use of energy is considered an effective way to mitigate these problems. Due to the different forms of energy and the different characteristics of IESs, the interrelationships between the different forms of energy are complicated, which increases the difficulty of assessing the energy efficiency of IESs. A limited number of techniques exist; we refer interested readers to the authors of [15], who have proposed a technique mixing energy use efficiency (EUE) and exergy efficiency (EXE) based on the first and second laws of thermodynamics [4].
3. Mathematical Problem Setting
- (a)
Inverse problem and the maximum entropy principle
In many real-world situations, theorists and empiricists observe at a given time two or more quantifiable multivariate stochastic systems and want to infer an unknown cross-correlation between their random elements. To illustrate this, we implement a cross-entropy formalism to forecast an interprovincial sectoral energy efficiency score matrix based on imperfect and contradictory information from province and sector aggregates.
The basic model for dealing with ill-posed inverse problems is to solve an integral equation of the first kind (for example, in ordinary signal or imaging settings, the basic equation (Equation (1)) can be extended to the impulse response of a measurement system). We formulate this, in the context of the model developed later, as follows:

$$G(s) = \int_{D} K(s,t)\, f(t)\, dt + \varepsilon(s), \quad (1)$$

where:
- G is the vector of amounts observed in rows or columns;
- f is the unknown regional cross-sectoral energy efficiency coefficient matrix;
- D defines the model Hilbert support space;
- K(s,t) is the transformation kernel associating the measures G and f; and
- ε(s) explains the random components.
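A discrete analogue of Equation (1) makes the ill-posedness tangible. The following sketch (NumPy, with hypothetical numbers) builds a system with more unknowns than observations and exhibits two different solutions that fit the data equally well:

```python
import numpy as np

# Discrete analogue of Equation (1): G = K f + eps, with fewer
# observations than unknowns (2 equations, 4 unknowns), so many vectors
# f reproduce G exactly -- the hallmark of an ill-posed inverse problem.
rng = np.random.default_rng(0)
K = rng.random((2, 4))                  # discretized transformation kernel
f_true = np.array([0.1, 0.3, 0.4, 0.2])
G = K @ f_true                          # noiseless observations, for simplicity

# Least-norm solution via the Moore-Penrose pseudo-inverse...
f_pinv = np.linalg.pinv(K) @ G
# ...and a second, equally data-consistent solution shifted along the
# null space of K: both satisfy K f = G, so the data alone cannot decide.
null_dir = np.linalg.svd(K)[2][-1]
f_alt = f_pinv + 0.1 * null_dir
print(np.allclose(K @ f_pinv, G), np.allclose(K @ f_alt, G))  # True True
```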
The literature on methodologies devoted to the recovery of ill-posed inverse problems [24] is expansive when dealing with empirical problems (e.g., [25,26,27]). In addition to the well-known Tikhonov regularization theory [28], the Gibbs–Shannon–Jaynes principle of maximum (minimum) entropy [29,30] and its recent extensions [31,32] have, until recently, remained the most commonly used techniques to solve this class of problems. The general rule that applies to both approaches is to link the linear or nonlinear least squares problem with a regularization rule (a priori or additional information) to arrive at a well-posed problem. Moreover, the Gibbs–Shannon–Jaynes maximum (minimum) entropy formalism searches for global regularity, related to the second law of thermodynamics, while producing the smoothest reconstructions consistent with the available data, in the Bayesian spirit. In this research, as is often the case in many empirical applications, the discrete form of Equation (1) was implemented.
Focusing on social science research, a number of other techniques have been tried for this class of inverse problems. Examples include the Moore–Penrose pseudo-inverse approach and the bi-proportional RAS approach and its extension [33]. Although the latter technique requires an initial transaction matrix, it offers a worse solution when the model studied is stochastic. The authors of [32] showed the poorer performance of the Markov chain model compared to the generalized Gibbs–Shannon entropy for this class of inverse problems. Subsequently, the Bayesian approach has shown its relative superiority, particularly the variant associated with the principle of maximum entropy. A neural-network class of models can also be proposed; however, it is not based on a compact theory, its application takes time, and the results are not always guaranteed. Nevertheless, recent promising research studies have proposed new algorithms to solve multicriterial, dynamic inverse problems (e.g., [26,34,35]). The entropy model has been successfully applied to update and balance social accounting matrices [33]. However, on theoretical grounds, this assumes that entropy is a positive linear function of the number of possible states and therefore sets aside the possibility of interdependencies between states and their influence. In a recent paper, the authors of [36] demonstrated the convergence of two standard regularization techniques to two special values of the power law (PL)-related Tsallis parameter q: for q = 2, a Tikhonov regularization is obtained, and for q = 1, the classical formulation of the Boltzmann–Gibbs–Shannon entropy is obtained. The central point is that, in addition to the well-known scaling law, the PL exhibits a series of interesting characteristics related to its aggregation properties, in that it is preserved under minimum and maximum, addition, multiplication, and polynomial transformation ([37,38]).
Since we are dealing with a poorly conditioned inverse problem, we must meet all three conditions of regularity (existence, uniqueness, and stability of the solution) at the same time.
While the conditions of existence and uniqueness are generally met thanks to regular a priori constraints, the stability of the optimal solution, threatened by random or systematic errors, is much more difficult to reach. In short, the problem is, among an infinite number of distributions that meet all the imposed restrictions, to find the one that best replicates the data generation system (DGS). In the maximum entropy formalism, thanks to Jaynes' contribution [5], the reasonable candidate is the one that reduces uncertainty about the system the most. Similarly, according to the Kullback-Leibler information divergence metrics [29,39], the best candidate is the posterior that meets all the binding conditions and deviates the least from the priors.
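As a minimal numerical illustration of this selection rule (a toy sketch of the Kullback-Leibler principle, not the article's full model), one can pick, among all distributions satisfying a given moment constraint, the one that deviates least from a uniform prior:

```python
import numpy as np
from scipy.optimize import minimize

# Among all distributions meeting a moment constraint, select the one
# closest to the prior in Kullback-Leibler divergence (minimum
# discrimination information).
x = np.array([1.0, 2.0, 3.0, 4.0])
prior = np.full(4, 0.25)
target_mean = 3.0                      # the binding piece of information

def kl(p):
    return np.sum(p * np.log(p / prior))

cons = ({"type": "eq", "fun": lambda p: p.sum() - 1.0},
        {"type": "eq", "fun": lambda p: p @ x - target_mean})
res = minimize(kl, prior, constraints=cons,
               bounds=[(1e-9, 1.0)] * 4, method="SLSQP")
print(res.x)  # tilted away from the uniform prior toward larger outcomes
```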
- (b)
Non-extensive cross-entropy energy model and confidence interval area
The Kullback-Leibler information divergence metrics are best known when applied to Gaussian-attractor phenomena for parameter estimation while dealing with an insufficient sample. In the case of PL phenomena, of which the Gaussian law is a particular case, the divergence metric applied is the q-Generalized Kullback-Leibler relative entropy ([40,41]). Formally, this metric constitutes a junction of the power law and the Kullback-Leibler relative entropy formalisms. The first is known to analytically solve non-linear and/or fractal systems [42], and the second is known to extract discriminative information from two or more hypotheses in the context of insufficient information. Accordingly, let us present below the main conditions from which the competing superiority of the q-Generalized Kullback-Leibler relative entropy with respect to traditional techniques emerges. Following (e.g., [32,43,44]), insufficient information means that we are trying to solve an ill-posed problem, which is likely to arise in the following cases:
- sample statistics are linear or collinear for various reasons;
- non-stationary or non-cointegrating variables result from poor model specification;
- data from the sampling plan are insufficient and/or incomplete due to technical or financial constraints; official statistics on small areas illustrate this situation;
- the Gaussian properties of random disturbances are questioned, among others, due to systematic errors resulting from the research process;
- the model is not linear, and the last resort is an approximate linearization; and
- observations of aggregated data (in time or space) may hide a very complex system represented, for example, by a PL distribution, and there may be multifractal properties of the system.
Thus, it results from the above that the q-Generalized Kullback-Leibler relative entropy is conceptually free from the traditional hypotheses, mainly those related to the least squares method, among which are the sphericity of disturbances and the absence of collinearity in the model.
Regarding the model specification, we estimate, using the proposed cross-entropy technique, the random parameters of the model from cross-sectional data [45] for two distinct periods (the years 2020 and 2021). In the context of that technique, the authors of [32] proposed solutions to panel models based on various statistical hypotheses; the outputs of those models revealed higher precision in comparison with traditional techniques. Nevertheless, since we deal with a stochastic inverse problem, treating the problem as one with a panel structure could lead to important theoretical and computational problems related to the hypothesis of the non-extensivity of entropy.
The model related to that metric has been extensively presented in different publications (e.g., [44]) in the context of macroeconomic analysis. Since this work deals with energy management, it is worthwhile to reformulate the model in the energy management context to enable interpretation of the outputs. We implement the usual discrete form of the q-Generalized Kullback-Leibler relative entropy (Equation (5)); the generalized Bregman Kullback-Leibler divergence may be an alternative version of this model. With the new constraining cross-entropy data, the model updates the initial information (the priors in Table 1) and provides new outputs (the posteriors).
It is necessary to redefine the parameterization of the generalized linear model (Equation (2)), which plays the role of constraints. The inside table elements to be forecasted can be meaningfully presented, by columns, as discrete Bayesian joint probabilities explaining each region's average cross-sector weight, or probabilities corresponding to the individual energy efficiency ratios. The ratio totals per column sum to unity. We recall that the energy efficiency coefficients to be forecasted are in the form of normalized indicators and thus lie between zero and one. In this case, the parameter processing space coincides with the probability space. Under these conditions, the accuracy of the estimated parameters is greater, since a priori there is no loss of information from these data [46]. In any case, let us briefly present the general procedure of reparameterization in the case of a general linear inverse model:

$$y = X\beta + \varepsilon, \quad (2)$$

where the values of the unknown parameters $\beta$ are not necessarily bounded between 0 and 1, indicating the need for reparameterization. The term $\varepsilon$ is an unobservable random term for perturbations, plausibly with finite variance, exhibiting observation errors from empirical measurement or random shocks that may be driven by a PL. The variable $y$ consists of data observed, with errors, from an unknown data-generating system of energy efficiency coefficients by sector, and $X$ may represent the known average regional energy intensity coefficients, known with uncertainty, linked to $y$ through the relational parameter matrix $\beta$ and the unobservable disturbance $\varepsilon$ to be estimated through the observable error components $e$
. Unlike classical econometric models, no binding assumptions are required, for example, regarding the distribution of the random errors. In particular, as we deal with an ill-behaved inverse problem, the number of parameters to be estimated may be greater than the number of observed data points, and the quality of the informative data collected may be low. The process of recovering the true system requires the entropy objective function to include all of the interacting constraining consistency moments. Thus, referring to the properties of the relative entropy principle, each new piece of constraining information will reduce the entropy level of the system in accordance with the degree of consistency of the data with the system. For this multi-dimensional inverse problem, among an unlimited number of candidate solutions, the best solution results from identifying the one that, in terms of probability, best simulates the data-generating system. By taking each $\beta_k$ as a discrete random variable with a compact support (e.g., [32]) and $M$ possible outcomes $z_{k1}, \dots, z_{kM}$, it can be estimated using the probabilities $p_{km}$, i.e.,

$$\beta_k = \sum_{m=1}^{M} p_{km}\, z_{km}, \quad (3)$$

where $p_{km}$ is the probability of the outcome $z_{km}$, and the probabilities must be non-negative and sum to one. Similarly, by treating each element $\varepsilon_j$ (which affects the total uncertainty of the sector efficiency) as a finite, discrete random variable with a compact support and $J$ possible results $v_{j1}, \dots, v_{jJ}$ centred at zero, we can express $\varepsilon_j$ as

$$\varepsilon_j = \sum_{i=1}^{J} w_{ji}\, v_{ji}. \quad (4)$$

As mentioned, it can be assumed that each entire row of errors has been evaluated, and a similar support space should be constructed as follows:

$$\epsilon_k = \sum_{i=1}^{J} W_{ki}\, V_{ki},$$

where $w_{ji}$ and $W_{ki}$ are the outcome probabilities in the respective support spaces
. Therefore, $k$ and $j$ represent, respectively, the indexes of the rows and columns whose coefficient sums were estimated with errors. Moreover, the error term supports are empirically set around the empirical standard error of the stated variables and a priori represent the Bayesian hypothesis. The choice of the error limits, of course, depends on their own properties. In this study, the error sets were determined by Chebyshev's inequality [47], with the boundaries of the support space ranging from −3 to +3 standard errors. Notice that in spite of this Gaussian property of the priors, the posterior probabilities in the support space may represent a class of non-Gaussian distributions, in particular a PL.
The element $p^0_{kj}$ constitutes the a priori information provided by the researcher, while $p_{kj}$ is an unknown probability generating the true parameter, the value of which must be determined by solving a non-extensive cross-entropy econometrics problem. In matrix notation, let us rewrite $\beta = Zp$, with $Z$ the matrix of support points and $p$ the vector of unknown probabilities, and $\varepsilon = Vw$, with $V$ the error support matrix and $w$ the corresponding probability vector, for $K$ and $L$ the numbers of rows and columns and $J$ the number of data points over the support space for the error terms inside the regional cross-sector matrix. The same conditions of normality can easily be formulated for any vector of column sums. Next, the cross-entropy econometric estimator of the Tsallis entropy can be presented as follows:

$$\min_{p,W,w} H_q(p,W,w) = \alpha_1 \sum_{k,j} p_{kj}\,\frac{\left(p_{kj}/p^0_{kj}\right)^{q-1}-1}{q-1} + \alpha_2 \sum_{k,i} W_{ki}\,\frac{\left(W_{ki}/W^0_{ki}\right)^{q-1}-1}{q-1} + \alpha_3 \sum_{j,i} w_{ji}\,\frac{\left(w_{ji}/w^0_{ji}\right)^{q-1}-1}{q-1}, \quad (5)$$

subject to the consistency conditions

$$cc_k\,\tilde{x}_{k.} + \sum_{i} V_{ki}\, W_{ki} = \sum_{j} \beta_{kj}, \qquad k = 1, \dots, K, \quad (6)$$

$$\lambda\,\tilde{y}_{.j} + \sum_{i} v_{ji}\, w_{ji} = \sum_{k} \beta_{kj}, \qquad j = 1, \dots, L, \quad (7)$$

and the normality (adding-up) conditions

$$\beta_{kj} = \tilde{y}_{.j}\, p_{kj} \;\text{ with }\; \sum_{k} p_{kj} = 1 \;\;\forall j, \quad (8) \qquad \sum_{i} W_{ki} = 1 \;\;\forall k, \quad (9) \qquad \sum_{i} w_{ji} = 1 \;\;\forall j, \quad (10)$$

where:
- $\tilde{y}_{.j}$ indicates each sum per column (values observed by energy sector $j$, including unknown errors);
- $\sum_j \beta_{kj}$ is each row total (observed values per province $k$) corrected for errors;
- $\tilde{x}_{k.}$ denotes the total energy intensity indicators by region, affected by unknown errors;
- $p_{kj}$ is the probabilistic structure of the energy efficiency ratios by sector and region;
- $cc_k$ is a positive scaling factor related to the province climatic factor (average annual temperature by province) that serves to match the totals of the energy intensity and energy efficiency indicator levels; and
- $\lambda$ is an arbitrary additional scaling factor representing other random variables that complementarily balances the row and column sums; a dot in a subscript indicates summation over the corresponding row or column index.
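To fix ideas, the following compressed Python sketch minimizes the q-divergence of Equation (5) for a tiny three-province by two-sector table; the toy constraints (column adding-up plus a loose row-consistency bound) merely stand in for the full system of Equations (6)–(10), and all numbers are invented:

```python
import numpy as np
from scipy.optimize import minimize

q = 1.5                                  # Tsallis parameter (assumed)
K_, L_ = 3, 2                            # toy size: 3 provinces, 2 sectors

# Prior probabilities of the efficiency ratios, one column per sector
# (invented; in the paper these derive from energy intensity shares).
p0 = np.array([[0.2, 0.3],
               [0.5, 0.4],
               [0.3, 0.3]])
row_info = np.array([0.7, 0.9, 0.4])     # noisy province-level aggregates

def d_q(p, p_ref):
    """Discrete q-generalized Kullback-Leibler divergence."""
    return np.sum(p * ((p / p_ref) ** (q - 1.0) - 1.0)) / (q - 1.0)

def objective(flat):
    return d_q(flat.reshape(K_, L_), p0)

cons = [{"type": "eq",                   # each sector column sums to one
         "fun": lambda f, j=j: f.reshape(K_, L_)[:, j].sum() - 1.0}
        for j in range(L_)]
cons.append({"type": "ineq",  # rows must not undershoot the scaled aggregates
             "fun": lambda f: f.reshape(K_, L_).sum(axis=1) - 0.9 * row_info})

res = minimize(objective, p0.ravel(), constraints=cons,
               bounds=[(1e-9, 1.0)] * (K_ * L_), method="SLSQP")
print(res.x.reshape(K_, L_))             # posterior probability table
```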
Non-extensive statistics use a number of binding forms in which expectations can be set. The above model uses Curado-Tsallis (C-T) constraints [40,48], the general form of which is as follows:

$$\langle A \rangle_q = \sum_{i} p_i^{\,q}\, A_i,$$

where $A_i$ is the value of the constrained quantity in state $i$.
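A one-line implementation of this q-expectation may help (a sketch; the quantity and the probabilities are illustrative):

```python
import numpy as np

def ct_expectation(p: np.ndarray, a: np.ndarray, q: float) -> float:
    """Curado-Tsallis (unnormalized) q-expectation of a quantity a under p."""
    return float(np.sum(p ** q * a))

p = np.array([0.5, 0.3, 0.2])
a = np.array([1.0, 2.0, 3.0])
print(ct_expectation(p, a, q=1.0))  # classical expectation: 1.7
print(ct_expectation(p, a, q=1.5))  # q-deformed expectation
```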
The parameter $q$, as already mentioned, represents the Tsallis parameter. Theoretically, the values of this parameter vary between 0 and 3 [49]. Once again, a value equal to 1 corresponds to the Gaussian attractor. In the real world, the values of this parameter should evolve between 1 and 5/3, thus covering spontaneous or man-made stable fractal structures such as most financial or economic market indices ([50,51]). As Table 1 suggests, $cc_k$ remains a scaling factor related to the province climatic factor, contributing to matching the totals of the energy intensity and energy efficiency indicator levels, which are known with uncertainty. Unlike the factor $cc_k$, whose role has just been expressed, the factor $\lambda$ is an arbitrary additional scaling factor representing other random variables that complementarily contributes to balancing the row and column sums. This is because the climatic differences between provinces, captured by $cc_k$, could not play the balancing role alone, since other factors may explain the difference between energy intensity and energy efficiency. Among these, as already said in the introductory section, are the regional economic structure (e.g., the share of large energy-consuming industries) and the geographical characteristics of the provinces (e.g., longer distances leading to higher demand for transport). Still, the sum per column (i.e., per sector) of the posterior probabilities is constrained to unity, given $\sum_k p_{kj} = 1$. In the above model (Equations (5)–(10)), these values play the role of the new Bayesian data discriminating in favor of the new inferential evidence. We must recall that when no new data are included in the cross-entropy model, the results correspond to those from a maximum entropy principle formulation, without additional conditions except those of normality.
The above objective function $H_q$ is nonlinear and measures the entropy in the model. The relative entropies of its three independent terms (respectively, the three posteriors $p$, $W$, and $w$ and the corresponding priors $p^0$, $W^0$, and $w^0$) are added together using the weights $\alpha_1$, $\alpha_2$, and $\alpha_3$. These are positive reals that sum to unity within the mentioned constraints. The first term, known as the “parameter precision” term, takes into account the discrepancies between the estimated parameters and the prior parameters (usually defined in the support space). The second and third terms, the “ex-post prediction” terms, include the empirical error terms as the differences between the predicted and observed data values (see the last row and column in Table 1) in the model. Thus, the first component of the criterion function relates to the structure of the table parameters, the second component to the errors in the row totals, and the last component to the errors in the column totals.
It should be noted that the estimates of the model and their variances are influenced not only by the length of the support space but also by the spatial scale effect, i.e., the number of affected point values [32]. The greater the number of these points, the better the prior information, i.e., the nonlinear starting points, about the system.
Next, the random errors (see Equation (7)) explain the errors in data collection and processing and are not necessarily related to the Gaussian distribution, which is itself a particular case of a PL. Traditionally, regarding Bayesian formulations and relative entropy, it should be noted that both models will lead to similar results if and only if the real expected errors associated with the data generation system are zero in the symmetric support space around zero (see, e.g., [44]). Similarly, the results of the Gibbs-Shannon cross-entropy and Tsallis' non-extensive cross-entropy will match when the errors included in the model are not correlated, and the system distribution then evolves towards the Gaussian attractor (e.g., [52,53,54]).
With respect to the confidence interval of the parameters, Equation (11) shows the non-additivity of the Tsallis entropy for two (probably) independent systems, one related to the probability distribution of the parameters and the other to the probability distribution of the error perturbations:

$$S_q(p, w) = S_q(p) + S_q(w) + (1 - q)\, S_q(p)\, S_q(w), \quad (11)$$

where $S_q(p, w)$ is the sum of the normalized entropy associated with the parameters of the model and that associated with the perturbation term $w$, plus their interaction. The latter value is obtained over all $N$ observations, with the number of data points over the support of the estimated probabilities related to the time length of the errors.
The values of these normalized entropy indices range from zero to one. The values close to one indicate a weak information variable, while lower values indicate a parameter that is estimated to be more informative in the model.
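The following short sketch computes such a normalized entropy index (the helper names are ours); note how a concentrated, informative distribution scores near zero while the uniform one scores one:

```python
import numpy as np

def tsallis_entropy(p: np.ndarray, q: float) -> float:
    """Tsallis entropy S_q(p); recovers Shannon entropy as q -> 1."""
    if np.isclose(q, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def normalized_index(p: np.ndarray, q: float) -> float:
    """Entropy of p divided by the maximal (uniform) entropy: lies in [0, 1]."""
    uniform = np.full(len(p), 1.0 / len(p))
    return tsallis_entropy(p, q) / tsallis_entropy(uniform, q)

print(normalized_index(np.array([0.25, 0.25, 0.25, 0.25]), q=1.5))  # 1.0
print(normalized_index(np.array([0.97, 0.01, 0.01, 0.01]), q=1.5))  # ~0.08
```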
4. Outputs and Comment
We will illustrate the empirical basis of the theoretical model developed in the above sections by applying it to the case of the Polish energy efficiency coefficients. Table 1, whose row and column symbols correspond to Equation (6), illustrates the extent to which a researcher can have limited information, in quantity and quality, before solving an ill-posed inverse problem such as the one in this article.
We have two sets of aggregated energy indicators: one for the Polish energy demand sectors and one for the provinces. The problem at hand stands as [29] a discrete problem in a multidimensional space, leading to $(K - 1) \times (L - 1)$ degrees of freedom, which illustrates the case of a standard inverse problem. In this problem, as illustrated in Table 1 ([55,56]), we have 3 × 15 = 45 degrees of freedom related to the number of energy efficiency indicators per sector and province. Moreover, as alluded to before, the row total and the column total are different, probably as a result of the two separate data sources having different natures and scales.
As the preceding pages have made clear, this kind of problem corresponds best to the philosophy naturally contained in the principle of maximum entropy, which we have implemented to forecast the energy efficiency ratios displayed in Table 2.
Let us now comment on the model outputs presenting the final solution for the cross “energy efficiency ratios” (column 5 of Table 2) by sector and province. As presented in the theoretical model (Equations (5)–(10)), in addition to the classical normality conditions and moment consistency, the earlier explained random factors $\lambda$ and $cc_k$ allowed the system to balance. The first factor is related to the various variables differentiating energy intensity from energy efficiency, except the climatic factor, which is represented by $cc_k$. As already said, the calculated ratios in the above table are not deterministic, since the priors are not known with certainty. Therefore, we treat them as random variables, from which the posteriors result through the optimization process. It is worthwhile to recall that the presented energy efficiency scores stand for the overall energy efficiency scores, i.e., a combination of the three components already presented: the energy efficiency level, the energy efficiency progress (i.e., energy efficiency trends), and the energy efficiency policies. The prior matrix of energy efficiency ratios, which is not presented in this paper, was initialized on the basis of knowledge of the province energy intensity index and the sector energy efficiency ratio averages. Next, using the proportions of the energy intensity index (the last column of Table 1) of the different provinces, we computed the initial energy efficiency ratios by sector so as to sum to the total of the energy efficiency ratios, known with error, available from the last row of the same Table 1. In probabilistic terms, we have assumed the energy efficiency ratio to be uniformly distributed across the provinces, so that the last column of the energy intensity is a marginal probability. The assumption behind this procedure is that a province with a lower or higher average energy intensity index will have, respectively, a lower or higher average energy efficiency ratio, irrespective of the considered sector. In doing so, we enabled the nonlinear mathematical system to start the search for the global optimal solution from the best starting points, leading to quick convergence. The post-entropy posterior ratios in Table 2 are normalised, and the higher the value, the lower the energy efficiency level for a given sector and/or province. The obtained forecasts are empirically close to real-world expectations of the energy efficiency ranking within Polish provinces. We notice that in the industry sector, the provinces Mazowieckie (including Warsaw) and Wielkopolskie (including Poznan) have the lowest energy efficiency ratios, around 0.320 and 0.333, respectively. The highest ratios are shown in the provinces Swietokrzyskie and Opole, around 0.445 and 0.482, respectively. Globally, we notice that Mazowieckie displays the highest efficiency in all sectors, while Opole displays the lowest efficiency. We notice, too, that the industry sector globally remains the most efficient among all sectors.
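For concreteness, here is a small sketch of the prior construction just described, with invented numbers standing in for the Table 1 data:

```python
import numpy as np

# Spread each sector's aggregate efficiency score over provinces in
# proportion to the provinces' energy intensity shares, i.e., treat the
# intensity shares as a marginal distribution common to all sectors.
intensity = np.array([1.2, 0.8, 1.5])    # per-province intensity index (invented)
sector_totals = np.array([0.9, 1.1])     # per-sector aggregate scores (invented)

share = intensity / intensity.sum()      # marginal probabilities over provinces
prior = np.outer(share, sector_totals)   # provinces x sectors prior matrix
print(prior)
print(np.allclose(prior.sum(axis=0), sector_totals))  # columns reproduce totals
```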
Table 3 presents the level at which the model has discriminated the priors in favor of the post-cross-entropy solution, given the new model consistency moments and normality conditions. Consequently, we notice in that table the highest post-cross-entropy discriminating values across all sectors in the case of Opole in comparison with the remaining provinces. It is worth recalling that, as described in the preceding paragraphs, the prior values presented the same structure for the different sectors. The cross-entropy formalism has discounted the non-valuable information to retain just the information fitting the corresponding sectors and provinces, given the initial information on the energy intensity indicator and the climatic factor.
Following the formulation of Equation (11), the confidence measure of the model is a normalized value ranging between zero and one. In the present case, this value, comparable to the corrected classical coefficient of determination, is equal to 0.011, much closer to zero (lowest entropy) than to one (highest entropy). Thus, the proposed model system has discriminated optimally from the priors, given all the constraints defining the energy efficiency system through the above equations. Nevertheless, it is important to mention that the cross-entropy information metric does not conform to the triangular inequality property inherent in a Euclidean distance.
Limitations of the Study and Prospective Research Area
Let us start with the strong side of the model and then discuss its limitations. We point out here the specificity of the non-extensive relative entropy econometrics approach in the context of solving a stochastic inverse problem consisting of the estimation of energy efficiency coefficients by sector and province in Poland. As explained in the article, this approach inherits its relative strength and originality from the combination of the following three attributes:
- The power law distribution generalizes most of the known statistical laws and has proven able to analytically resolve non-stationary functions; its application leads to analytical closed-form outputs of the model.
- The Kullback-Leibler relative entropy, combining the properties of entropy and the Bayesian formalism, is a strong information metric, particularly for inverse problem modelling; thanks to this formalism, several hypotheses required by the least squares method become obsolete.
- While these two scientific sub-disciplines are based on solid hypotheses, joining traditional econometrics to them leads to the model proposed in this paper.
As a matter of fact, this approach combines the advantages listed above: a minimum of hypotheses, the generalization of the normal law, and high precision of the estimated parameters.
Now let us turn to the limitations. The major limitation of the approach is owing to the computational complications that may arise while solving nonlinear systems. We used GAMS (General Algebraic Modelling System) software (version 43.4.1) to perform the computations. Based on our own experience, the solution to this problem was to use one of the recent solvers (connected to GAMS) designed to compute the global optimum of nonlinear systems and to find the most informative initial points of the solution (the priors); in this case, we used “Knitro”. The point concerning the choice of the appropriate starting solution was explained when we presented the model.
In recent years, the non-extensive entropy approach has been extended to two parameters (in addition to q), which opens up greater ease of modelling as well as faster convergence towards the global optimum from any point of the space and with a minimum of additional information about the system. This line of research could make the approach easier to apply to models hitherto considered insoluble, or soluble only under non-realistic or unverifiable hypotheses.
The next point of improvement of the model should be to analyze, by simulation techniques, the individual impact of the main factors affecting the energy efficiency coefficients. Once again, these are the climatic conditions, the regional economic structure, and the geographical characteristics of the provinces. The model presented in this paper is limited to estimating the sectoral coefficients by province; therefore, certain factors have been grouped into the same variable “cc.j”.
Next, it would be worthwhile in the future to compare the relative entropy approach presented here with other powerful forecasting techniques on the same specific case. We could cite here, among others, grey systems ([8,57]), based on solid theoretical foundations; the neural network approach, successfully used in many fields of science; or the Fuzzy Time Series technique [58] and its different versions, known for relatively good precision in terms of predictions. The comparison of all these techniques requires a good understanding of the theoretical background of each of them and of the conceptually targeted area of their applications.
Finally, thanks to this approach, this problem could also be solved in the case of further disaggregation, for example, to the NUTS 3 level. Once again, this is of capital importance because it is at the more disaggregated levels that economic actors act.