An Alternative for Indicators that Characterize the Structure of Economic Systems

Studies on the structure of economic systems are most frequently carried out using the methods of informational statistics. These methods, accompanied by a broad range of indicators (Shannon entropy, Balassa coefficient, Herfindahl specialization index, Gini coefficient, Theil index, etc.) around which a wide literature has grown over time, share a major disadvantage: they impose the system condition, that is, all components of the system must be known (as absolute values or as weights). This restriction is difficult to satisfy in some situations, while in others the full information is irrelevant, especially when the interest lies in structural changes in only some components of the economic system (whether by the typology of economic activities, NACE, or of territorial units, the Nomenclature of territorial units for statistics, NUTS). This article presents a procedure for characterizing the structure of a system and for comparing its evolution over time in the case of incomplete information, thus removing the restriction present in the classical methods. The proposed methodological alternative uses a parametric distribution for a variable with sub-unit values. The application refers to the Gross Domestic Product of the five European Union countries (out of 28) with annual values of over 1000 billion Euros (Germany, Spain, France, Italy, and United Kingdom) for the years 2003 and 2015. A form of the Wald sequential test is applied to measure changes in the structure of this group of countries between the two years. The results of this application validate the proposed method.


Introduction
Alongside globalization, the transfer of knowledge among different areas is a characteristic of our times, resulting in the emergence and strengthening of border domains. Among the most interesting such domains is econophysics. Our field of interest includes the second law of thermodynamics and its further developments, generated by the dispute between Max Planck and C. Caratheodory [1,2], which led to the crystallization of the entropy concept. The entropy concept was introduced by L. Boltzmann, and the formula $S = k \log W$, relating the entropy $S$ to the probability $W$, which was engraved on his gravestone [3], requires a macro-system with known state probabilities $w_i$, subject to the restriction $\sum_i w_i = 1$. The debates and emulation inspired by the Gibbs paradox, which showed a deviation from the second law of thermodynamics, generated a different approach to entropy [4].
Considered the father of information theory, Claude Shannon proposed a new version of the entropy calculus [5][6][7]:
$$H = -\sum_{i} p_i \log p_i, \qquad \sum_{i} p_i = 1.$$
Over time, Shannon entropy has diversified in its types and fields of application, such as relative entropy, entropy of a composed system, conditional entropy, entropy of a correlated system, Tsallis entropy, Sharma-Taneja-Mittal entropy, Kaniadakis entropy, and Abe entropy [2,[8][9][10].
For a while, the two fields evolved separately, but more recently papers have been published that aim to harmonize and combine econophysics with different areas of economics. The most spectacular evolution was in the financial field. In this regard, Jovanovic and Schinckus [11] developed some relevant features regarding the interference of econophysics and finance, using accessible language and presenting a combination of methods that help the development of both domains. Thus, they combine the theory of probabilities with empirical data processing in the financial field. In the same sense, Ausloos, Jovanovic and Schinckus [12] conclude that econophysics and finance are not irreconcilable domains, but that both can progress by identifying procedures and models for the benefit of economic sciences, remarking that there are similar approaches, with positive results, in handling financial phenomena. The authors also bring a number of methodological clarifications.
In parallel with the development of the entropy indicator, a new group of such indicators was used in the analysis of the structure of economic systems. One category of methods is econometric, for example the Cobb-Douglas production function [13]:
$$Q_t = A\,K^{\alpha} L^{\beta}.$$
In this equation, $Q_t$ is the total production output (the real value of all goods produced in a year); $L$ is the labor input (the total number of person-hours worked in a year); $K$ is the capital input (the real value of all machinery, equipment, and buildings); $A$ is the total factor productivity; $t$ is time; $\alpha$ and $\beta$ are the output elasticities of capital and labor, respectively. The Input-Output model proposed by W. Leontief also belongs to this category of methods [14][15][16].
Vitanov [17] discusses the complex dynamics of science structures and systems, arguing that the evaluation of research productivity requires a combination of qualitative and quantitative methods and measures. Dimitrova and Vitanov [18] investigate how the adaptation of the competition coefficients of populations competing for the same limited resource influences the system dynamics in the regions of the parameter space where chaotic motion of Shilnikov kind exists. Vitanov and Ausloos [19] present a rich inventory of dynamic models based on the behavior of groups of scientists and suitable for describing the emergence and spreading of new ideas in a competitive process. The authors also discuss the role of fluctuations during the emergence of innovation and when best to turn from deterministic models to more complex stochastic models.
Among the other specific methods used in the analysis of the structure of macroeconomic systems, some are quite significant. For example, the Herfindahl index of regional specialization [20,21] is useful in analyzing the geographical distribution of territorial-administrative indicators, or the specializations in the economic sector. The Krugman index [22], called the K-spec. index in the economic literature, assumes dividing a country into geographical regions, or macro neighborhoods, in the border areas of the European Union. For a region, the K-spec. index characterizes the contrasts between the structure of the workload in that region and that of the economy of the defined area. The converted Gini index [23,24] is a statistical measure used for analyzing the concentration among the values of a frequency distribution. The benefit of this index is that it also applies to a qualitative series (for example, production distribution by NACE activity sectors, income distribution by administrative subdivisions, etc.). This coefficient is calculated as the ratio between the average of absolute deviations and the arithmetic mean of the items. The Gini index may also be computed graphically, as twice the concentration area. The Gini index calculated for per capita income is used to define types of countries, such as the Organization for Economic Co-operation and Development (OECD) countries, the countries of Latin America, and the countries of Eastern Europe. The Theil index [25][26][27] is a statistical indicator inspired by the entropy measurement, calculated for an uncertain event characterized by a probability vector, and defined as the difference between the maximum value of the event's entropy and its actual entropy. The value of the Theil index increases with the concentration of the distribution values; if the distribution values are equiprobable, which means the concentration is at its minimum, then the value of this index is zero.
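To make the system condition concrete, the following minimal Python sketch computes three of the indicators mentioned above for a vector of weights. The function names are illustrative, the natural logarithm is an assumed convention, and the Theil index follows the verbal definition given in the text (maximum entropy minus actual entropy). The helper `check_complete` shows why these indicators presuppose weights that sum to 1.

```python
import math

def check_complete(weights, tol=1e-9):
    """Raise if the weights do not describe the whole system (sum != 1)."""
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1 (system condition)")

def shannon_entropy(weights):
    """Shannon entropy with natural logarithm; zero weights contribute nothing."""
    return -sum(w * math.log(w) for w in weights if w > 0)

def herfindahl(weights):
    """Herfindahl index: sum of squared weights."""
    return sum(w * w for w in weights)

def theil(weights):
    """Theil index as maximum entropy (ln n) minus the actual entropy."""
    return math.log(len(weights)) - shannon_entropy(weights)

if __name__ == "__main__":
    equal = [0.25, 0.25, 0.25, 0.25]
    check_complete(equal)
    print(shannon_entropy(equal), herfindahl(equal), theil(equal))
```

For a group of weights that covers only part of a system, `check_complete` fails, which is exactly the situation the proposed method is meant to handle.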
The length of the structural vector $x$ [28] is defined by $\|x\| = \sqrt{\sum_i x_i^2}$, with the upper limit $\|x\|_{\max} = \sum_i x_i$ and a corresponding lower limit. The concentration is considered to be directly proportional to the value of the indicator, but two observations should be taken into account: first, the size of the indicator is strongly influenced both by the number of groups into which the population is divided and by the amplitude of the distribution; second, the indicator is expressed in the measuring unit of the characteristic. For this reason, it cannot be used to compare the concentration of the population units with respect to different characteristics.
The Lorenz curve [29,30] is used to characterize the diversification or concentration of information in an economic system. The concentration curve was used in economic analysis, for the first time by Atkinson, to measure income distribution and redistribution, and in time became recognized as "the golden standard". It is constructed differently for discrete and continuous data. Consider a quantitative characteristic $X$ defined by the distribution $(x_i, n_i)_{i=1,\dots,k}$, where $x_i$ is a value, a group, or an interval of values of the characteristic $X$, and $n_i$ is the absolute frequency. In order to analyze the concentration of the values of $X$ starting from this distribution, two sets of relative frequencies are calculated. The processing of a distribution for the analysis of the degree of concentration thus sets the original distribution against the transformed distribution. To construct the curve from a known data series $(x_i, y_i)_{i=1,\dots,m}$, giving the values of each characteristic, which are then cumulated, the following steps are taken:

• The geographic regions are arranged in increasing order of the ratio $y_i/x_i$;
• For each variable, the relative frequencies and the cumulative relative frequencies are calculated;
• The Lorenz curve is drawn by joining the points defined by the cumulative relative frequencies of the two variables.
The concentration area, between the first bisector and the concentration curve, is a much more effective measure than a simple graphical representation for characterizing revenue distribution. The proposed procedure seeks to highlight changes in the structure of an ensemble, measured by weights with sub-unit values, the sum of which is not restricted to be equal to 1, as is the case for other indicators with the same purpose, such as entropy or the Gini coefficient.
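The Lorenz-curve steps listed above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `x` and `y` stand for the two characteristics of the regions (for example population and income), and the concentration area is approximated by the trapezoid rule on the polygonal curve.

```python
def lorenz_points(x, y):
    """Return the Lorenz curve vertices, starting at (0, 0).

    Regions are sorted by increasing ratio y_i / x_i, then the relative
    frequencies of both variables are cumulated.
    """
    order = sorted(range(len(x)), key=lambda i: y[i] / x[i])
    total_x, total_y = sum(x), sum(y)
    points = [(0.0, 0.0)]
    cum_x = cum_y = 0.0
    for i in order:
        cum_x += x[i] / total_x
        cum_y += y[i] / total_y
        points.append((cum_x, cum_y))
    return points

def concentration_area(points):
    """Area between the first bisector and the polygonal Lorenz curve."""
    under_curve = sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 0.5 - under_curve

if __name__ == "__main__":
    pts = lorenz_points([1.0, 1.0], [1.0, 3.0])
    area = concentration_area(pts)
    print(pts, area, 2 * area)  # the last value is the Gini index (twice the area)
```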

Statistical Distribution
Statistical distributions are usually defined on the interval $(0, \infty)$ or $[a, b]$, with $a, b > 0$, and their applications to economic studies are extremely wide [31,32]. Except for the Beta distribution, defined on $[0, 1]$, few other distributions defined on the same interval have been applied to the analysis of sub-unit economic indicators. For the analysis of economic phenomena characterized by sub-unit values (for example, weights), we propose the probability density:
$$f(x; \theta) = \frac{\theta^{3}}{2}\, x^{\theta - 1} \ln^{2} x, \qquad 0 < x \le 1,\ \theta > 1.$$
The restriction $\theta > 1$ is necessary in order to have $\lim_{x \to 0} f(x; \theta) = 0$ and to ensure the existence of the modal value $x_{m0}$ on the stated interval $(0, 1]$. Thus, the equation $f'(x; \theta) = 0$, i.e.:
$$\frac{\theta^{3}}{2}\, x^{\theta - 2} \ln x \left[(\theta - 1) \ln x + 2\right] = 0,$$
which reduces to $(\theta - 1) \ln x + 2 = 0$, gives the solution
$$x_{m0} = e^{-2/(\theta - 1)}.$$
If $\theta$ were positive but smaller than 1, for instance $\theta = 1/2$, we would obtain $x_{m0} = e^{4} > 1$. So, the mode exists on $(0, 1]$ only if $\theta > 1$ (for instance, for $\theta = 2$, $x_{m0} = 1/e^{2}$ and $f(1/e^{2}) = 16/e^{2}$).
The function graph is illustrated in Figure 1.
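The $n$-th non-central moment is:
$$M_n = E\left[X^{n}\right] = \left(\frac{\theta}{\theta + n}\right)^{3},$$
providing the mean and dispersion under the forms:
$$E[X] = \left(\frac{\theta}{\theta + 1}\right)^{3}, \tag{3}$$
$$\operatorname{Var}[X] = \left(\frac{\theta}{\theta + 2}\right)^{3} - \left(\frac{\theta}{\theta + 1}\right)^{6} \tag{4}$$
(the fact that this expression is strictly positive can be easily verified).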
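As a numerical sanity check, the sketch below evaluates the density, its mode, and its moments in Python. It assumes the density $f(x; \theta) = (\theta^{3}/2)\, x^{\theta-1} \ln^{2} x$ on $(0, 1]$, which is the form consistent with the values $x_{m0} = 1/e^{2}$ and $f(1/e^{2}) = 16/e^{2}$ quoted for $\theta = 2$ and with the Gamma transformation used later in the text. The function names and the midpoint-rule integrator are illustrative choices, not part of the original method.

```python
import math

def density(x, theta):
    """Assumed density f(x; theta) = theta^3 / 2 * x^(theta - 1) * ln(x)^2 on (0, 1]."""
    return 0.5 * theta**3 * x ** (theta - 1.0) * math.log(x) ** 2

def mode(theta):
    """Modal value x_m0 = exp(-2 / (theta - 1)), valid for theta > 1."""
    return math.exp(-2.0 / (theta - 1.0))

def raw_moment(n, theta):
    """n-th non-central moment (theta / (theta + n))^3."""
    return (theta / (theta + n)) ** 3

def variance(theta):
    """Variance as second raw moment minus squared mean."""
    return raw_moment(2, theta) - raw_moment(1, theta) ** 2

def midpoint_integral(func, steps=100_000):
    """Midpoint rule on (0, 1); avoids evaluating the logarithm at 0."""
    h = 1.0 / steps
    return h * sum(func((i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    theta = 2.0
    print(mode(theta), density(mode(theta), theta))
    print(midpoint_integral(lambda x: density(x, theta)))      # close to 1
    print(midpoint_integral(lambda x: x * density(x, theta)))  # close to the mean
    print(raw_moment(1, theta), variance(theta))
```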
Entropy 2017, 19, 346

Estimating the Distribution Parameter
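The estimation can be made by using the method of moments, according to Pearson [33], or by maximum likelihood [34]. By applying the method of moments, we first evaluate the theoretical $n$-th non-central moment. Consequently,
$$M_n = \int_0^1 x^{n} f(x; \theta)\, dx = \frac{\theta^{3}}{2} \int_0^1 x^{\theta + n - 1} \ln^{2} x \, dx.$$
According to the calculations made in order to show that $\int_0^1 f(x; \theta)\, dx = 1$, we obtain:
$$M_n = \left(\frac{\theta}{\theta + n}\right)^{3}.$$
Hence, the mean and variance follow immediately:
$$E[X] = M_1 = \left(\frac{\theta}{\theta + 1}\right)^{3},$$
respectively:
$$\operatorname{Var}[X] = M_2 - M_1^{2} = \left(\frac{\theta}{\theta + 2}\right)^{3} - \left(\frac{\theta}{\theta + 1}\right)^{6}.$$
Let $x_i$, $i = 1, \dots, n$, be a random sample from the population $\{X\}$. As $0 \le x_i \le 1$ for any $i$, we also have $\bar{x} = n^{-1} \sum_i x_i \le 1$. The estimation equation using the method of moments is therefore:
$$\left(\frac{\theta}{\theta + 1}\right)^{3} = \bar{x}, \tag{9}$$
from which:
$$\hat{\theta}_{MM} = \frac{\bar{x}^{1/3}}{1 - \bar{x}^{1/3}}. \tag{10}$$
The condition $\theta > 1$ implies $\bar{x} > 1/8 = 0.125$. By using the maximum likelihood method, we immediately obtain:
$$\hat{\theta}_{ML} = -\frac{3n}{\sum_{i=1}^{n} \ln x_i},$$
whose distribution is known. Indeed, by making the transformation $x = e^{-u}$, $u \ge 0$, our density becomes:
$$g(u; \theta) = \frac{\theta^{3}}{2}\, u^{2} e^{-\theta u},$$
namely, a particular case of the Gamma density [32,33,35]:
$$g(u; \theta, k) = \frac{\theta^{k}}{\Gamma(k)}\, u^{k - 1} e^{-\theta u},$$
for $k = 3$. If the parameter $k$ is known, then:
$$\hat{\theta} = \frac{nk - 1}{\sum_{i=1}^{n} u_i}, \qquad u_i = -\ln x_i,$$
which is an unbiased estimator of minimum variance, which is not the case for $\hat{\theta}_{MM}$.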
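The estimators discussed above can be illustrated with a short Python sketch. It assumes the density $f(x; \theta) = (\theta^{3}/2)\, x^{\theta-1} \ln^{2} x$, for which the mean is $(\theta/(\theta+1))^{3}$ and $-\ln X$ follows a Gamma law with shape $k = 3$. The function names are hypothetical, and the unbiased variant uses the standard result for a Gamma rate parameter with known shape.

```python
import math

def theta_moments(sample):
    """Method-of-moments estimate: solve (theta / (theta + 1))^3 = sample mean."""
    mean = sum(sample) / len(sample)
    if not 0.125 < mean < 1.0:
        raise ValueError("sample mean must exceed 0.125 so that theta > 1")
    root = mean ** (1.0 / 3.0)
    return root / (1.0 - root)

def theta_max_likelihood(sample):
    """Maximum likelihood estimate: -3n / sum(ln x_i)."""
    return -3.0 * len(sample) / sum(math.log(x) for x in sample)

def theta_unbiased(sample, k=3):
    """Unbiased estimate (n*k - 1) / sum(-ln x_i) for a Gamma rate with known shape k."""
    total_u = -sum(math.log(x) for x in sample)
    return (len(sample) * k - 1) / total_u

if __name__ == "__main__":
    weights = [0.35, 0.22, 0.41, 0.18, 0.30]  # made-up sub-unit values, for illustration only
    print(theta_moments(weights))
    print(theta_max_likelihood(weights))
    print(theta_unbiased(weights))
```

The maximum likelihood and unbiased variants differ only by the factor $(3n - 1)/(3n)$, so they converge as the sample grows.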

A Particular Case
We will analyze the particular case where $\theta = 2$ (hence, the distribution is completely specified), in order to outline the difficulties met when calculating the distribution function in the general case. Indeed:
$$F(x; \theta) = \int_0^{x} f(t; \theta)\, dt = \frac{\theta^{3}}{2} \int_0^{x} t^{\theta - 1} \ln^{2} t \, dt,$$
and taking into account that ([36], p. 133):
$$\int x^{\alpha} \ln^{2} x \, dx = x^{\alpha + 1} \left[\frac{\ln^{2} x}{\alpha + 1} - \frac{2 \ln x}{(\alpha + 1)^{2}} + \frac{2}{(\alpha + 1)^{3}}\right],$$
it follows for $\theta = 2$ (that is, $\alpha = 1$):
$$F(x; 2) = x^{2} \left(2 \ln^{2} x - 2 \ln x + 1\right).$$
Even for this simple case, the calculation of the theoretical median, for instance, leads to the transcendental equation $F(x; 2) = 1/2$, i.e., $\phi(x) = x^{2}\left(4 \ln^{2} x - 4 \ln x + 2\right) - 1 = 0$, which does have a solution on the interval $(0, 1)$, as $\phi(0) = -1$, $\phi(1) = 1$, and $\phi(0) \cdot \phi(1) < 0$. The solution is unique, as we can prove that the graphs of the curves $\phi_1(x) = 4 \ln^{2} x - 4 \ln x + 2$ and $\phi_2(x) = 1/x^{2}$ intersect at a single point on the interval $(0, 1)$. The parameter value $\theta = 2$ is obtained, for instance, if $\bar{x} \approx 0.296$ (from the estimation relation of the method of moments), a value higher than the "critical level 0.125", in accordance [37] with Equations (9) and (10).
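Since the median equation has no closed-form solution, it can be solved numerically. The following Python sketch uses plain bisection on the distribution function, assuming $F(x; \theta) = x^{\theta}\left(\tfrac{1}{2}\theta^{2} \ln^{2} x - \theta \ln x + 1\right)$, which is what integrating the assumed density $(\theta^{3}/2)\, x^{\theta-1} \ln^{2} x$ gives and which reduces to the $\theta = 2$ expression in the text. The function names and tolerances are illustrative.

```python
import math

def cdf(x, theta):
    """Distribution function F(x; theta) on (0, 1] for the assumed density."""
    log_x = math.log(x)
    return x**theta * (0.5 * theta**2 * log_x**2 - theta * log_x + 1.0)

def quantile(p, theta, tol=1e-12):
    """Solve F(x; theta) = p by bisection; F is increasing on (0, 1]."""
    low, high = 1e-12, 1.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if cdf(mid, theta) < p:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

if __name__ == "__main__":
    median = quantile(0.5, 2.0)
    print(median, cdf(median, 2.0))
```

The same routine gives any other quantile, for example tolerance limits, for a chosen value of $\theta$.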
Consequently, it is clear that the various problems related to the direct implication of the distribution function (quantiles assessment, natural tolerance, etc.) must be considered according to various particular values of θ in order to finalize the computations.In the next sections, we proceed to develop the analysis using the sequential method.

Verifying Certain Statistical Hypotheses
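Verifying a simple statistical hypothesis:
$$H_0: \theta = \theta_0 \tag{16a}$$
with the alternative:
$$H_1: \theta = \theta_1 \tag{16b}$$
represents a hypothesis on the stability of the system structure at a given moment [33,37]. By using the likelihood ratio test for a single observation, we deduced the equation providing the decision constant $x_d$ as a function of $\theta_0$ and of the significance level $\varepsilon$ of the test:
$$x_d^{\theta_0} \left(\theta_0^{2} \ln^{2} x_d - 2 \theta_0 \ln x_d + 2\right) = 2 \varepsilon.$$
In this equation, $\ln x_d$ can be approximated either by a first-degree polynomial ($\ln x_d \approx x_d - 1$), leading on the left side to a parabola:
$$\theta_0^{2} (x_d - 1)^{2} - 2 \theta_0 (x_d - 1) + 2 = \frac{2 \varepsilon}{x_d^{\theta_0}},$$
or by a second-degree polynomial, $\ln x_d \approx (x_d - 1) - \tfrac{1}{2}(x_d - 1)^{2} = -\tfrac{1}{2}\left(x_d^{2} - 4 x_d + 3\right)$, leading on the left side to a fourth-degree polynomial:
$$\frac{\theta_0^{2}}{4} \left(x_d^{2} - 4 x_d + 3\right)^{2} + \theta_0 \left(x_d^{2} - 4 x_d + 3\right) + 2 = \frac{2 \varepsilon}{x_d^{\theta_0}}.$$
Again, the case where $\theta = 2$ proves to be interesting, since we have to show that the polynomial $y = (x - 2)^{4} + 1$ intersects the hyperbola $y = 2\varepsilon / x^{2}$ only once on the interval $(0, 1)$. Indeed, if we denote by $\phi$ …
In this case, we would like to use a sequence of observations, therefore it is better to use the sequential probability ratio test (SPRT), that is, the Wald procedure (the developments in the area of theoretical and applied sequential analysis led to the publication of a dedicated journal since 1984: Sequential Analysis: Design Methods and Applications):
$$\frac{\prod_{i=1}^{n} f(x_i; \theta_1)}{\prod_{i=1}^{n} f(x_i; \theta_0)} = \left(\frac{\theta_1}{\theta_0}\right)^{3n} \prod_{i=1}^{n} x_i^{\theta_1 - \theta_0},$$
which, by taking logarithms, leads to:
$$3n \ln \frac{\theta_1}{\theta_0} + (\theta_1 - \theta_0) \sum_{i=1}^{n} \ln x_i.$$
If $\alpha$ and $\beta$ are the risks associated with the two hypotheses, and $A$ and $B$ are the decision constants of the sequential test, $A \approx (1 - \beta)/\alpha$ and $B \approx \beta/(1 - \alpha)$, then the region in which the experiment continues is given by the double inequality:
$$\ln B < 3n \ln \frac{\theta_1}{\theta_0} + (\theta_1 - \theta_0) \sum_{i=1}^{n} \ln x_i < \ln A. \tag{22}$$
Here, $x_1, x_2, \dots, x_n, \dots$ is the sequential sample.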
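A minimal Python sketch of the Wald procedure described above is given here as an illustration. It assumes the density $f(x; \theta) = (\theta^{3}/2)\, x^{\theta-1} \ln^{2} x$, so that each observation adds $3 \ln(\theta_1/\theta_0) + (\theta_1 - \theta_0) \ln x_i$ to the log-likelihood ratio. The function name, the return format, and the decision labels are hypothetical.

```python
import math

def wald_sprt(observations, theta0, theta1, alpha=0.05, beta=0.10):
    """Sequential probability ratio test of H0: theta = theta0 against H1: theta = theta1.

    Returns a (decision, number_of_observations_used) pair.
    """
    ln_a = math.log((1.0 - beta) / alpha)  # upper decision constant ln A
    ln_b = math.log(beta / (1.0 - alpha))  # lower decision constant ln B
    log_ratio = 0.0
    used = 0
    for x in observations:
        used += 1
        # Per-observation contribution to the log-likelihood ratio.
        log_ratio += 3.0 * math.log(theta1 / theta0) + (theta1 - theta0) * math.log(x)
        if log_ratio >= ln_a:
            return "accept H1", used
        if log_ratio <= ln_b:
            return "accept H0", used
    return "continue sampling", used

if __name__ == "__main__":
    print(wald_sprt([0.9, 0.9, 0.9], theta0=2.0, theta1=4.0))
    print(wald_sprt([0.05], theta0=2.0, theta1=4.0))
```

As long as the accumulated value stays strictly between $\ln B$ and $\ln A$, the function reports that sampling should continue, which corresponds to the uncertainty area of the test.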

The Sequential Comparison
If we have two systems characterized by the parameters $\theta$ and $\omega$, then, from a practical point of view, it is interesting to compare the levels of the respective parameters. This reduces to verifying the compound hypothesis $H: \theta \le \omega$ versus the alternative $H': \theta > \omega$.
Girshick [38] proposed an SPRT test as follows: let $X$ and $Y$ be the two systems, or the same system in two periods (in our case), characterized by the densities $f(x; \theta)$ and $f(y; \omega)$, respectively. We choose two values, $\theta_0$ and $\omega_0$ ($\theta_0 < \omega_0$), and let $H_0$ be the statistical hypothesis that the joint distribution of the variables $X$ and $Y$ has the form $f(x; \theta_0) f(y; \omega_0)$, with the alternative $H_1$ that the joint distribution is $f(x; \omega_0) f(y; \theta_0)$. In other words, verifying $H$ versus $H'$ is reduced to:
$$H_0: \theta = \theta_0,\ \omega = \omega_0 \quad \text{versus} \quad H_1: \theta = \omega_0,\ \omega = \theta_0.$$
Using the notations $f_0(x; y) = f(x; \theta_0) \cdot f(y; \omega_0)$ and $f_1(x; y) = f(x; \omega_0) \cdot f(y; \theta_0)$, the likelihood ratio associated with the observation pairs $(x_1; y_1), (x_2; y_2), \dots, (x_n; y_n), \dots$ is, in our case:
$$\prod_{i=1}^{n} \frac{f_1(x_i; y_i)}{f_0(x_i; y_i)} = \prod_{i=1}^{n} \left(\frac{x_i}{y_i}\right)^{\omega_0 - \theta_0}.$$
So, the uncertainty area is given by the relations:
$$\frac{\ln B}{\omega_0 - \theta_0} < \sum_{i=1}^{n} \ln \frac{x_i}{y_i} < \frac{\ln A}{\omega_0 - \theta_0}. \tag{24}$$
Girshick showed that the verification of $H$ versus $H'$ can be reduced to that of $H_0$ versus $H_1$ if there is a function $v(\theta; \omega)$ with suitable properties. This function, also called the Girshick function, can be considered as a measure of the difference between $\theta$ and $\omega$.
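The paired comparison can be sketched in Python in the same spirit. This illustration assumes the density $f(x; \theta) = (\theta^{3}/2)\, x^{\theta-1} \ln^{2} x$, under which the normalizing constants cancel and the log-likelihood ratio of $H_1$ to $H_0$ reduces to $(\omega_0 - \theta_0) \sum \ln(x_i / y_i)$. The function name and return format are hypothetical.

```python
import math

def girshick_sprt(pairs, theta0, omega0, alpha=0.05, beta=0.10):
    """Sequential comparison of two systems from paired observations (x_i, y_i).

    Requires theta0 < omega0. Returns (decision, pairs_used, statistic), where
    the statistic is the running sum of ln(x_i / y_i).
    """
    if not theta0 < omega0:
        raise ValueError("theta0 must be smaller than omega0")
    gap = omega0 - theta0
    lower = math.log(beta / (1.0 - alpha)) / gap  # ln B / (omega0 - theta0)
    upper = math.log((1.0 - beta) / alpha) / gap  # ln A / (omega0 - theta0)
    statistic = 0.0
    used = 0
    for x, y in pairs:
        used += 1
        statistic += math.log(x / y)
        if statistic >= upper:
            return "accept H1", used, statistic
        if statistic <= lower:
            return "accept H0", used, statistic
    return "continue sampling", used, statistic

if __name__ == "__main__":
    print(girshick_sprt([(0.5, 0.25)] * 6, theta0=2.0, omega0=3.0))
    print(girshick_sprt([(0.30, 0.29), (0.21, 0.22)], theta0=2.0, omega0=3.0))
```

A statistic that remains inside the two limits after all available pairs, as in the second call, corresponds to the "no decision yet" outcome that the uncertainty area allows.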

Application and Conclusions
The European Union's economy has been growing stronger, year after year, as a leading world economic player. In 2015, the total gross domestic product (GDP) of the 28 member states exceeded 14,600 billion Euros, and the Union accounted for around 20% of world trade, making it, next to the US and China, the third world economic power.
We examine whether there have been changes in the structure of the group formed by the five largest economies in the European Union (Germany, Spain, France, Italy, United Kingdom), those with annual GDP values of more than 1000 billion Euros. Table 1 shows the data for the years 2003 and 2015. In 2015, this group of the most powerful economies in the European Union held about 68% of the Union's GDP, a decrease compared with 2003, and there were also structural movements within the group. For example, the weight of Spain increased (+52.05% in 2015 compared to 2003), while for the rest of the group members the weight decreased, despite the increase of GDP in absolute value (+27.29% for the group of the five most developed countries, +39.51% across the European Union). The characterization of the structure using the methods of informational statistics (entropy, Gini coefficients, structure vectors, etc.) is not possible, because $\sum_{i=1}^{n} f_i \le 1$, but the method outlined above can be applied. The mean values determined by Equation (3), the variances calculated by Equation (4), the parameters computed by Equation (10), and the other intermediate elements necessary for testing the hypotheses (16a) and (16b), and for determining the uncertainty intervals by Equations (22) and (24), are presented in Table 2.
Based on the information processed in Table 2, for the usual values of the type I and type II risks, $\alpha = 0.05$ and $\beta = 0.10$, with $\ln B = \ln(0.10/(1 - 0.05))$ and $\ln A = \ln((1 - 0.10)/0.05)$, the Wald test applied to the hypotheses (16a) and (16b) gives the following results. By the interval established by Equation (22), $\sum_i \ln x_i \in (-46.1298;\ 0.1344)$. In the version given by Equation (24), the quantity $\sum_{i=1}^{n} \ln (x_i / y_i)$ lies between the limits of uncertainty ($-2.136 < 0.4020 < 2.74$). The conclusion drawn from this is that, in the group of the five developed European Union countries, there were no major changes in the structure between 2003 and 2015. This decision is made at a type I risk of 5% and a type II risk of 10%.
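For reference, the two decision constants quoted above evaluate as follows (a worked arithmetic check using only the stated risks $\alpha = 0.05$ and $\beta = 0.10$; values are rounded to four decimals):

$$\ln B = \ln \frac{0.10}{0.95} \approx \ln 0.1053 \approx -2.2513, \qquad \ln A = \ln \frac{0.90}{0.05} = \ln 18 \approx 2.8904.$$

The reported statistic $0.4020$ lies strictly inside the reported limits $(-2.136;\ 2.74)$, so neither boundary of the uncertainty area is reached.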
The sequential test, which we chose in order to test the hypothesis on the movement of the GDP structure for the group of developed countries in the European Union, offers, unlike other possible testing variants, the advantage of an uncertainty area. This is useful because GDP is also determined for sub-annual periods and for national territorial divisions (nomenclature of territorial units for statistics), as well as for neighboring regions (Euro-regions), and it is interesting to follow the structural changes over time. The proposed method can also be used beyond the suggested application, for example in characterizing the structure of economic activities and its changes (European Classification of Economic Activities), to highlight an economic upturn through the increasing share of high added value activities, or the changes over time in the occupational structure (International Standard Classification of Occupations). Likewise, population structures can be analyzed by nationality, ethnic group, age group, and many other categories.

Table 1. Gross domestic product (GDP) values for the analyzed group and the whole European Union.

Table 2. Results of data processing.