A New and Stable Estimation Method of Country Economic Fitness and Product Complexity

We present a new metric estimating the fitness of countries and the complexity of products by exploiting a non-linear, non-homogeneous map applied to publicly available information on the goods exported by each country. The non-homogeneous terms guarantee both convergence and stability. After a suitable rescaling of the relevant quantities, the non-homogeneous terms are set to zero, so that the new metric is parameter free. The new map closely reproduces the results of the original homogeneous metrics already defined in the literature and allows for an approximate analytic solution in the case of actual binarized matrices based on the Revealed Comparative Advantage (RCA) indicator. This solution is connected with a new quantity describing the neighborhood of nodes in bipartite graphs, which in this work represent the relations between countries and exported products. Moreover, we define a new indicator, the country net-efficiency, which quantifies how efficiently a country invests in capabilities able to generate innovative, complex, high-quality products. Finally, we demonstrate analytically the local convergence of the algorithm involved.


Introduction
In the last decade a new approach to macroeconomics has been developed to better understand the growth of countries [1,2]. The key idea is to consider the international trade of countries as a proxy of their internal production system. By describing international trade as a bipartite network, where countries and products are the nodes of the two layers, new metrics for the economy of countries and the quality of products can be constructed by leveraging the network structure only [1]. These metrics quantify the fitness of countries, i.e., the quality of their industrial system, and the complexity of commodities by indirectly inferring the technological requirements needed to produce them. The mathematical properties of the algorithm involved in the evaluation of the metrics, as well as the economic meaning of the metrics and their possible applications, have been discussed in previous papers [3][4][5]. Moreover, these two metrics have been successfully used to develop state-of-the-art forecasting approaches for economic growth [6][7][8].
The very same approach has been applied to different social and ecological systems presenting a bipartite network structure and a competition between the components of the system [9,10]. Thus, it is natural to interpret fitness and complexity as properties of the network underlying those systems. The revised version of the fitness-complexity estimation metric that we show here has a clear and natural interpretation in terms of network properties and helps to better understand the different components that contribute to the fitness of countries.
In the following, we first describe the original metrics of fitness and complexity and their properties, underlining some critical issues that we solve with this new version. Then, we define the new procedure step by step and highlight its advantages in the case of countries-products bipartite networks. Finally, we devise an approximate solution and discuss its interpretation. In Appendix B we list the main quantities appearing in the text.

The Original Metric
The object of this work is the network of countries and their exported goods. This network is bipartite (countries and products are mutually linked, but no link exists between two countries or between two products) and weighted (each link carries a weight $s_{cp}$, i.e., the exported volume of product $p$ by country $c$, measured in US dollars). Data ranging from year 1995 to year 2015 can be freely retrieved from the Web [11], though we use them after a procedure to enhance their consistency [8]. We end up with data about 161 countries and more than 4000 products, which are categorized according to the Harmonized System 2007 coding system at the 6-digit level of coarse-graining. The weighted bipartite network of countries and products can be projected onto an unweighted network described solely by the matrix $M_{cp}$, with elements set to unity when a given country $c$ meaningfully exports a good $p$ and to zero otherwise (see Methods).
The original metric estimating the fitness of countries and the complexity of products was defined by the following non-linear iterative map:

$$F_c^{(n)} = \sum_{p=1}^{P} M_{cp}\, Q_p^{(n-1)}, \qquad Q_p^{(n)} = \left[\sum_{c=1}^{C} \frac{M_{cp}}{F_c^{(n-1)}}\right]^{-1}. \tag{1}$$

In the previous expression, $F_c$ and $Q_p$ stand for the fitness of a country $c$ and the quality (complexity) of a product $p$; $C$ and $P$ are the total number of countries and exported products, respectively; and from the dataset we have that $C \ll P$. By multiplying all $F_c$ and $Q_p$ by the same numerical factor $k$, the map remains unaltered, so that the fixed point of the map (as $n \to \infty$) is defined up to a normalization constant. In the original method this constant is chosen at each iteration $n$ such that fitness and complexity are constrained to lie on the double simplex defined by:

$$\sum_{c} F_c^{(n)} = C, \qquad \sum_{p} Q_p^{(n)} = P. \tag{2}$$

The metric defined in Equations (1) and (2) successfully ranks the countries of the world according to their potential technological development and, when applied to different yearly time intervals, can be used to suggest precise strategies to improve country economies. It has also been proved to give the correct ranking of importance of species in a complex ecological system [9]. Despite its success, some points can still be improved:

i. Convergence issues: As stated in a recent paper dealing with the stability of calculating this metric [12]: "If the belly of the matrix [$M_{cp}$] is outward, all the fitnesses and complexities converge to numbers greater than zero. If the belly is inward, some of the fitnesses will converge to zero." Since an inward belly is the rule rather than the exception, some countries will have zero fitness and, as a result, all the products exported by them get zero complexity (quality). This is mathematically acceptable, though it heavily underestimates the quality of such products: even natural resources need the right know-how to be extracted, so that their quality would be better represented by a positive quantity. To cure this issue one has to introduce the notion of "rank convergence" rather than absolute convergence, i.e., the fixed point is considered achieved when the ranking of countries stays unaltered from one step to the next.

ii. Zero exports: The countries that do not export any good have zero fitness, independently of their finite capabilities.

iii. Specialized world: In a hypothetical world where each country exports only one product, different from all the products exported by the other countries, this metric would assign unit fitness and quality to all countries and products. Though mathematically acceptable, this solution does not take into account the intrinsic complexity of products.

iv. Equation symmetry: This is rather an aesthetic point, in that Equations (1) are not cast in a symmetric form.
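To make the discussion of the original metric concrete, the following is a minimal sketch of the iteration, assuming the standard form of the map (fitness as the sum of the complexities of the exported products, complexity as the reciprocal of the sum of the inverse fitnesses of the exporters) with both vectors initialised to one. The function name and the two toy matrices are hypothetical and serve only as illustration.

```python
def original_fitness_complexity(M, n_iter=50):
    """Iterate the homogeneous map and renormalise at every step."""
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c  # assumed initial condition: all fitnesses equal to one
    Q = [1.0] * n_p  # assumed initial condition: all complexities equal to one
    for _ in range(n_iter):
        # Both updates use the values of the previous iteration.
        F_new = [sum(M[c][p] * Q[p] for p in range(n_p)) for c in range(n_c)]
        Q_new = [1.0 / sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        # Projection on the double simplex: sum_c F_c = C and sum_p Q_p = P.
        f_sum, q_sum = sum(F_new), sum(Q_new)
        F = [n_c * f / f_sum for f in F_new]
        Q = [n_p * q / q_sum for q in Q_new]
    return F, Q


# Hypothetical toy matrices (rows: countries, columns: products).
nested = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
specialized = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

F_nested, Q_nested = original_fitness_complexity(nested)
F_spec, Q_spec = original_fitness_complexity(specialized)
print("nested:", F_nested, Q_nested)
print("specialized:", F_spec, Q_spec)
```

In the specialized matrix every fitness and quality stays at one, which is the situation described in point iii. The sketch requires every country to export at least one product and every product to have at least one exporter, otherwise a division by zero occurs (compare point ii).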

The New Metric
First, we reshape Equation (1) into a symmetric form by introducing the variable $P_p = Q_p^{-1}$, i.e.:

$$F_c^{(n)} = \sum_{p} \frac{M_{cp}}{P_p^{(n-1)}}, \qquad P_p^{(n)} = \sum_{c} \frac{M_{cp}}{F_c^{(n-1)}}. \tag{3}$$

Now the qualities of products are given by the quantities $P_p^{-1}$, and the metric is trivially equivalent to the original one provided one uses the normalization conditions $\sum_c F_c^{(n)} = C$ and $\sum_p (P_p^{(n)})^{-1} = P$. Next, we introduce two sets of quantities $\phi_c > 0$ and $\pi_p > 0$ and consider the inhomogeneous non-linear map defined as:

$$F_c^{(n)} = \phi_c + \sum_{p} \frac{M_{cp}}{P_p^{(n-1)}}, \qquad P_p^{(n)} = \pi_p + \sum_{c} \frac{M_{cp}}{F_c^{(n-1)}}. \tag{4}$$

Since the map is no longer defined up to a multiplicative constant, the normalization condition is not required anymore, while the initial condition can be set as in the original metric, $F_c^{(0)} = P_p^{(0)} = 1\ \forall c, p$. The fixed point of the transformation is now trivially characterized by the conditions:

$$F_c = \phi_c + \sum_{p} \frac{M_{cp}}{P_p}, \qquad P_p = \pi_p + \sum_{c} \frac{M_{cp}}{F_c}. \tag{5}$$

The parameters $\phi_c$ and $\pi_p$ can be interpreted as follows. The parameter $\phi_c$ represents the intrinsic fitness of a country. In fact, for a country $k$ that does not export any good we have $M_{kp} = 0\ \forall p$, so that its fitness is simply equal to $\phi_k$: irrespective of its exports, any country has a set of capabilities that characterize it.
The parameter $\pi_p$ is more intriguing. If no country exports a product $q$ (probably because no country produces it), the product has not been invented yet and its quality lies at its maximum value $\pi_q^{-1}$, since $M_{cq} = 0\ \forall c$. Therefore, the inverse of $\pi_q$ may be interpreted as a sort of innovation threshold: the smaller the parameter, the higher the quality of the product at its outset and the more sophisticated the capabilities necessary to produce it. On the other hand, products like natural resources may be associated with a larger value of the parameter, since they require less complex capabilities for their extraction.
In order to keep the algorithm evaluating the metric simple and parameter free, as in the original case, we set a common value $\phi_c = \pi_p = \delta$, then study the dependence of the metric on $\delta$, and finally set $\delta = 0$ (thereby renouncing a cure for issue iii listed above).
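The inhomogeneous map can be sketched in a few lines. The snippet below assumes the map has the form "fitness equals the intrinsic term plus the sum of the inverse P values of the exported products" and symmetrically for the products, with all variables initialised to one; the function name and the toy matrix are hypothetical. It illustrates the interpretation given above: a country without exports keeps its intrinsic fitness, and a product without exporters keeps its maximum quality.

```python
def inhomogeneous_map(M, phi, pi, n_iter=200):
    """Iterate F_c = phi_c + sum_p M_cp / P_p and P_p = pi_p + sum_c M_cp / F_c."""
    n_c, n_p = len(M), len(M[0])
    F = [1.0] * n_c  # fitness of countries
    P = [1.0] * n_p  # P_p, the reciprocal of the quality of product p
    for _ in range(n_iter):
        F_new = [phi[c] + sum(M[c][p] / P[p] for p in range(n_p)) for c in range(n_c)]
        P_new = [pi[p] + sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        F, P = F_new, P_new
    return F, P


delta = 0.1  # illustrative common value phi_c = pi_p = delta
M = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],  # country 2 exports nothing
]                  # product 3 (last column) has no exporter

F, P = inhomogeneous_map(M, [delta] * 3, [delta] * 4)
print("fitness of the non-exporting country:", F[2])
print("quality of the product nobody exports:", 1.0 / P[3])
```

No normalization step is needed here, and no division by zero can occur, because every fitness and every P value is bounded from below by the positive intrinsic term.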

Dependence on the Non-Homogeneous Parameter
We consider $\phi_c = \pi_p = \delta\ \forall c, p$ and address the dependence of the fixed point upon $\delta$. To outline the dependence of $F_c$ and $P_p$ on the parameter $\delta$, we use the relations defined in Equation (4) and introduce the rescaled quantities $\tilde P_p = P_p/\delta$ and $\tilde F_c = F_c\,\delta$. After some trivial algebra we get from Equation (4):

$$\tilde F_c^{(n)} = \delta^2 + \sum_{p} \frac{M_{cp}}{\tilde P_p^{(n-1)}}, \qquad \tilde P_p^{(n)} = 1 + \sum_{c} \frac{M_{cp}}{\tilde F_c^{(n-1)}}, \tag{6}$$

from which we deduce that, as soon as the parameter $\delta^2$ is much smaller than the typical value of the $M_{cp}$ matrix elements, i.e., much smaller than unity, the fixed point in terms of $\tilde F_c$ and $\tilde P_p$ is almost independent of $\delta$ (see Figure 1). It is worth noting that the values of fitness $F_c$ and quality $Q_p = P_p^{-1}$ of the original map defined by Equations (1) and (2) cannot be obtained in this new metric when the parameter $\delta$ tends to zero. In terms of $\tilde F_c$ and $\tilde P_p$, the fitness and quality can be expressed as $F_c = \tilde F_c\,\delta^{-1}$ and $Q_p = \tilde P_p^{-1}\,\delta^{-1}$. Since the new metric provides finite, non-vanishing values of $\tilde F_c$ and $\tilde P_p$, taking the limit $\delta \to 0$ would deliver infinite values of $F_c$ and $Q_p$. We might think that the normalization procedure necessary in the old metric in order to fix the arbitrary constant would get rid of the common factor $\delta^{-1}$ and deliver the same values as the new metric. Unfortunately, this is not the case: the new metric does not rely on a self-consistent normalization procedure, i.e., a projection on the double simplex defined by Equation (2) at each iteration, and therefore the results cannot coincide. Since the quantities $\tilde F_c$ and $\tilde P_p$ are well defined in the limit $\delta \to 0$, we shall focus on them only in the following. We recall that the complexities of products delivered by the original metric are connected to the set of $P_p^{-1}$ and thus to the $\tilde P_p^{-1}$. In particular, the second of Equations (6) can be interpreted at the fixed point as $\tilde P_p = 1 + \tilde Q_p^{-1}$, with the $\tilde Q_p$ expressed as in the second of Equations (1), but with the tilde quantities calculated in the new metric.
Therefore, we shall assign to $\tilde Q_p = (\tilde P_p - 1)^{-1}$ the meaning of complexity of products in our new metric. The differences between the old and new metrics are depicted in Figure 2.
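The algebra behind the rescaling can be spelled out explicitly. The derivation below is a sketch that assumes the fixed-point relations for $\phi_c = \pi_p = \delta$ take the form shown on the left of each line, and then substitutes $F_c = \tilde F_c/\delta$ and $P_p = \delta\,\tilde P_p$.

```latex
\begin{align*}
F_c = \delta + \sum_p \frac{M_{cp}}{P_p}
  &\;\Longrightarrow\;
  \frac{\tilde F_c}{\delta} = \delta + \sum_p \frac{M_{cp}}{\delta\,\tilde P_p}
  \;\Longrightarrow\;
  \tilde F_c = \delta^2 + \sum_p \frac{M_{cp}}{\tilde P_p}, \\
P_p = \delta + \sum_c \frac{M_{cp}}{F_c}
  &\;\Longrightarrow\;
  \delta\,\tilde P_p = \delta + \sum_c \frac{\delta\,M_{cp}}{\tilde F_c}
  \;\Longrightarrow\;
  \tilde P_p = 1 + \sum_c \frac{M_{cp}}{\tilde F_c}.
\end{align*}
```

After the rescaling, $\delta$ survives only through the additive term $\delta^2$ in the equation for the fitness, which is why the rescaled fixed point becomes insensitive to $\delta$ once $\delta^2 \ll 1$.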

Analytic Approximate Solution
In this section we provide an approximate analytic solution that can be used to estimate the values attained by the map of Equation (6) at the fixed point. Despite their symmetric shape, Equations (4) are not symmetric in practice, since in the case of actual countries and products the matrix $M_{cp}$ is rectangular, with the number of its rows $C$ much smaller than the number of its columns $P$. To estimate the effect of this asymmetry, we first consider Equation (4) in a mean field fashion, where each element of $M_{cp}$ is set to the average value $\langle M\rangle = \sum_{c,p} M_{cp}/(CP)$, and write, at the fixed point:

$$\tilde f = \delta^2 + \frac{P\,\langle M\rangle}{\tilde p}, \qquad \tilde p = 1 + \frac{C\,\langle M\rangle}{\tilde f}, \tag{7}$$

with now all $\tilde F_c$ and $\tilde P_p$ set equal to their mean field values $\tilde f$ and $\tilde p$, respectively. By setting $\delta = 0$, we find $\tilde p = 1/(1 - C/P) \approx 1 + C/P$ and $\tilde f = \langle M\rangle\,(P - C)$. Indeed, an approximate expression for the fixed point of Equation (6) in the regime $\delta \ll 1$ and $C \ll P$ can be derived also beyond the mean field approximation. To this end, we set again $\delta = 0$ and consider the corresponding fixed point equations associated with Equation (6), i.e.:

$$\tilde F_c = \sum_{p} \frac{M_{cp}}{\tilde P_p}, \qquad \tilde P_p = 1 + \sum_{c} \frac{M_{cp}}{\tilde F_c}. \tag{8}$$

From the empirical structure of the matrix $M$, we observe that the quantity $D_c = \sum_p M_{cp}$, representing the diversification of country $c$, i.e., the number of different products exported by $c$, is of the order of $P$, at least for the majority of countries (as an average over all the years considered we find that 70% of the countries have $0.1 \le D_c/P \le 1$). Therefore, setting $\tilde P^* = \max_p \tilde P_p$ and $\tilde F_* = \min_c \tilde F_c$, Equation (8) implies:

$$\tilde F_c \ge \frac{D_c}{\tilde P^*}, \qquad \tilde P_p \le 1 + \frac{C}{\tilde F_*}.$$

From the first estimate, $\tilde F_* \ge \mathrm{const}\; P/\tilde P^*$, and therefore, by the second estimate, $\tilde P^* \le 1 + \mathrm{const}\,\frac{C}{P}\,\tilde P^*$. As $\tilde P_p \ge 1$, we conclude that $\tilde P_p = 1 + W_p$ with $W_p$ of the order of magnitude of $C/P$, and, as a consequence, $\tilde F_c$ is of the order of magnitude of $P$.
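Returning to the mean field estimate at the start of this section, its solution can be written out step by step. The derivation below is a sketch assuming that, at $\delta = 0$, the mean field equations couple $\tilde f$ and $\tilde p$ through the average matrix element $\langle M\rangle$ as in the first line.

```latex
\begin{align*}
\tilde f = \frac{P\,\langle M\rangle}{\tilde p}, \qquad
\tilde p = 1 + \frac{C\,\langle M\rangle}{\tilde f}
  &\;\Longrightarrow\;
  \tilde p = 1 + \frac{C}{P}\,\tilde p
  \;\Longrightarrow\;
  \tilde p = \frac{1}{1 - C/P} \approx 1 + \frac{C}{P}, \\
  &\;\Longrightarrow\;
  \tilde f = \frac{P\,\langle M\rangle}{\tilde p} = \langle M\rangle\,(P - C).
\end{align*}
```

The factor $\langle M\rangle$ cancels in $\tilde p$ but not in $\tilde f$, which reduces to $P - C$ for a completely filled matrix ($\langle M\rangle = 1$). In both cases $\tilde p - 1$ is of order $C/P$ and $\tilde f$ is of order $P$, in agreement with the estimates obtained beyond the mean field approximation.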
We next compute explicitly the values of $\tilde F_c$ and $\tilde P_p$ at first order in this approximation. The calculation of the second order terms can be found in Appendix A. By using the first order approximation $(1 + a)^{-1} \approx 1 - a$ twice, from Equation (8) we have:

$$W_p = \sum_{c} \frac{M_{cp}}{\tilde F_c} \approx \sum_{c} \frac{M_{cp}}{D_c - \sum_{p'} M_{cp'} W_{p'}} \approx \sum_{c} \frac{M_{cp}}{D_c} + \sum_{p'} \sum_{c} M_{cp}\, D_c^{-2}\, M_{cp'}\, W_{p'}.$$

Now let $H$ be the square matrix of elements $H_{pp'} = \sum_c M^T_{pc}\, D_c^{-2}\, M_{cp'}$. Letting $D^{-1}$ be the column vector with components $1/D_c$ and $\mathbb{1}$ the identity matrix, the last displayed formula reads:

$$(\mathbb{1} - H)\, W = M^T D^{-1}.$$

We now observe that $H_{pp'} \le \sum_c 1/D_c^2 \le \mathrm{const}\; C/P^2$. Therefore, the matrix $(\mathbb{1} - H)$ is close to the identity (the correction is of order $C/P^2$) and hence invertible (with the inverse also close to the identity). In this approximation, $W = M^T D^{-1}$, so that the rescaled (reciprocals of the) qualities of products are given by:

$$\tilde P_p \approx 1 + \sum_{c} \frac{M_{cp}}{D_c}. \tag{9}$$

In the same approximation, we obtain the rescaled fitnesses $\tilde F_c$:

$$\tilde F_c \approx D_c - \sum_{p} M_{cp} W_p = D_c - \sum_{c'} \frac{K_{cc'}}{D_{c'}}, \tag{10}$$

having introduced the co-production matrix $K = M M^T$ with elements $K_{cc'} = \sum_p M_{cp} M_{c'p}$, representing the number of products exported by both countries $c$ and $c'$. It is interesting to note how, up to the first order approximation, the values of the fitness of countries depend on the co-production matrix and on the diversification only. The goodness of the approximations above can be appreciated in Figure 3, which shows that the relative difference between the numerical values at the fixed point and the approximate solution of Equation (10) is below 0.5% for more than 85% of the countries.
It is worth noting that in a recent work the Economic Complexity Index (ECI) defined in Ref. [2] has been connected to the spectral properties of a weighted similarity matrix $\tilde M$ resembling our co-production matrix $K$ [13]. This similarity is only apparent, since in ECI the matrix $\tilde M$ is defined as:

$$\tilde M_{cc'} = \sum_{p} \frac{M_{cp}\, M_{c'p}}{D_c\, U_p}, \qquad U_p = \sum_{c} M_{cp},$$

i.e., it contains a further weighting term (the ubiquity $U_p$) in the sum defining it. Besides, the two metrics of ECI and Fitness-Complexity differ very much from each other: ECI relies on a linear homogeneous map, while Fitness-Complexity relies on a non-linear and, in this work, also non-homogeneous map.
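The quality of the first order approximation can be checked numerically on a small example. The sketch below assumes the rescaled map at $\delta = 0$ (fitness as the sum of $1/\tilde P_p$ over exported products, $\tilde P_p$ as one plus the sum of $1/\tilde F_c$ over exporters) and evaluates the approximation as $W = M^T D^{-1}$, $\tilde P \approx 1 + W$, $\tilde F \approx D - M W$. The product $M W$ includes the term $c' = c$ of the co-production sum, which contributes exactly one. The toy matrix and all function names are hypothetical.

```python
def fixed_point(M, n_iter=200):
    """Iterate the rescaled map with delta = 0, starting from all ones."""
    n_c, n_p = len(M), len(M[0])
    F, P = [1.0] * n_c, [1.0] * n_p
    for _ in range(n_iter):
        F_new = [sum(M[c][p] / P[p] for p in range(n_p)) for c in range(n_c)]
        P_new = [1.0 + sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        F, P = F_new, P_new
    return F, P


def first_order(M):
    """First order estimate: W = M^T D^-1, P ~ 1 + W, F ~ D - M W."""
    n_c, n_p = len(M), len(M[0])
    D = [sum(row) for row in M]  # diversification of each country
    W = [sum(M[c][p] / D[c] for c in range(n_c)) for p in range(n_p)]
    P_approx = [1.0 + w for w in W]
    F_approx = [D[c] - sum(M[c][p] * W[p] for p in range(n_p)) for c in range(n_c)]
    return F_approx, P_approx


# Hypothetical dense toy matrix with C = 4 countries and P = 40 products.
M = [[1 if p % (c + 2) != 0 else 0 for p in range(1, 41)] for c in range(4)]

F_num, P_num = fixed_point(M)
F_app, P_app = first_order(M)
rel_err = [abs(a - b) / b for a, b in zip(F_app, F_num)]
print("relative errors on the fitness:", rel_err)
```

The toy matrix is deliberately dense with few countries and many products, which is the regime where the expansion in $C/P$ is expected to work; for sparse rows the iteration at $\delta = 0$ can drive a fitness towards zero and the sketch should not be used as is.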

Country Inefficiency and Net-Efficiency
From Equation (10) we deduce that the leading part of the fitness $\tilde F_c$ is given by the diversification $D_c$. The diversification of a country is indeed an important quantity, and its calculation does not require any complicated algorithm. On the other hand, what the proposed non-linear map does is to quantify how successfully a country differentiates its products, which indirectly offers an estimate of the capabilities of a nation. In fact, a country exporting mainly raw materials would be less efficient than a country exporting high-technology goods when they have the same diversification. For this reason, we introduce the new quantity $I_c = D_c - \tilde F_c$, the inefficiency of country $c$: the smaller the value of $I_c$, the more efficient the diversification the country chooses. From the approximate solution displayed in Equation (10), we get that $I_c \approx \sum_{c'} K_{cc'}/D_{c'}$, so that the inefficiency of a country is a weighted sum of its co-production matrix elements. The dependence of the country inefficiency on the diversification is displayed in Figure 4, while a visual representation of it is displayed in Figure 5.

Figure 5. Large ovals represent three countries, while small circles represent products. In this simple example, the inefficiency $I_1$ of country 1 is $I_1 = K_{12}/D_2 + K_{13}/D_3$. From the figure we get $K_{12} = 2$ and $K_{13} = 4$, i.e., the number of products exported by both countries (the cardinality of the intersection sets), and the diversifications $D_1 = 17$, $D_2 = 5$, $D_3 = 20$. Thus, $I_1 = 2/5 + 4/20 = 0.6$ and the approximate fitness is $\tilde F_1 \approx 16.4$.

It is interesting to notice how a clear power-law dependence exists between the inefficiency and the diversification of a country.
The structure of the $M$ matrix is such that countries with high diversification also export low quality goods on average. Therefore, a large diversification statistically corresponds to a large inefficiency, though the power law we find is not trivial and depends on the structure of $M$. A similar power-law behaviour is found between the fitness calculated with the traditional metric and the diversification, but with a different exponent (from the left panel of Figure 2 we deduce that there is a power-law relation between the fitnesses calculated with the original metric and with this new metric, with an exponent around 1.53; since the fitness $\tilde F_c$ calculated with the new metric goes as $D_c$ at first order, the old fitnesses go as $D_c^{1.53}$). In order to better appreciate the production strategies of countries, we subtracted the common power-law trend of the dependence of the inefficiency on the diversification for each year, changed its sign and plotted the result in the right panel of Figure 6, which thus shows the time evolution of a quantity that we call country net-efficiency $N_c$ (net as opposed to gross) over the years 1995-2014. It is interesting to note how countries behave differently over the time span considered. Some countries display a decreasing net-efficiency, others an increasing or a constant one. What many of these curves have in common is a decline setting in around the year 2000, more pronounced in the case of more developed countries, which lie at high net-efficiency in the graph.
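The worked example of Figure 5 can be reproduced with sets of products. In the sketch below the specific product labels are hypothetical and chosen only so that the set sizes and intersections match the numbers quoted for the figure; following that example, the sum runs over the other countries only.

```python
# Hypothetical product labels; only the set sizes and overlaps matter.
exports = {
    1: set(range(0, 17)),                  # D_1 = 17
    2: {0, 1, 17, 18, 19},                 # D_2 = 5, shares 2 products with country 1
    3: {2, 3, 4, 5} | set(range(20, 36)),  # D_3 = 20, shares 4 products with country 1
}

# Diversification: number of products exported by each country.
D = {c: len(products) for c, products in exports.items()}


def coproduction(a, b):
    """Number of products exported by both countries a and b."""
    return len(exports[a] & exports[b])


def inefficiency(c):
    """Sum of K_cc' / D_c' over the other countries c'."""
    return sum(coproduction(c, other) / D[other] for other in exports if other != c)


I_1 = inefficiency(1)
F_1 = D[1] - I_1  # first order estimate of the rescaled fitness
print(f"I_1 = {I_1:.2f}, F_1 = {F_1:.2f}")
```

The two shared products with the poorly diversified country 2 weigh twice as much as the four products shared with the well diversified country 3, which is the mechanism by which low-diversification competitors penalize a product.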

Local Convergence
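From the simulations it is clear that the fixed point obtained by iterating Equation (4) is locally stable. We can also prove it by resorting to the Jacobian of the transformation, in the case of countries and products. First we recall that the sums over the indexes $c$ and $p$ in Equation (4) run from 1 to $C$ and $P$, respectively, with usually $C \ll P$. In the case of countries and products $C/P \approx 10^{-1}$. We also fix $\phi_c = \pi_p = \delta \ll 1$, so that the fitnesses and the (reciprocals of the) qualities at the fixed point are approximately given by $F_c = \tilde F_c/\delta$ and $P_p = \delta\,\tilde P_p$, with $\tilde F_c$ and $\tilde P_p$ the components of the vectors $\tilde F$ and $\tilde P$ given in Equations (10) and (9), respectively.
Next, we calculate the Jacobian of the transformation at the fixed point, which can be simply expressed as the block anti-diagonal matrix:

$$J = \begin{pmatrix} 0 & -M\,\mathsf{P}^{-2} \\ -M^T\,\mathsf{F}^{-2} & 0 \end{pmatrix},$$

having introduced the diagonal matrices $\mathsf{F} = \mathrm{diag}(F_1, F_2, \ldots, F_C)$ and $\mathsf{P} = \mathrm{diag}(P_1, P_2, \ldots, P_P)$, respectively. We claim that the spectral radius $\rho(J)$ of the square matrix $J$ is strictly smaller than one. Denoting by $\sigma(J)$ the spectrum of $J$, this means that $\rho(J) := \max\{|\lambda| : \lambda \in \sigma(J)\} < 1$. From this it follows [14] that the fixed point is asymptotically stable and the convergence exponentially fast. To prove the claim we consider the square of the Jacobian, which can be written as a block diagonal matrix, and note that the traces of the two matrices on the diagonal are the same, as follows by applying a cyclic permutation. Noticing that $F_c P_p = \tilde F_c \tilde P_p$ and using the approximate solutions in Equations (9) and (10), we find with simple algebra that, to leading order:

$$\mathrm{Tr}\left(M\,\mathsf{P}^{-2} M^T \mathsf{F}^{-2}\right) = \mathrm{Tr}\left(M^T \mathsf{F}^{-2} M\,\mathsf{P}^{-2}\right) = \sum_{c,p} \frac{M_{cp}}{(\tilde F_c \tilde P_p)^2} \approx \sum_{c} \frac{1}{D_c}, \tag{13}$$

which is of the order of $C/P$. Moreover, we can write the two non-trivial matrices composing $J^2$ as:

$$M\,\mathsf{P}^{-2} M^T \mathsf{F}^{-2} = \mathsf{F}\,A A^T\,\mathsf{F}^{-1}$$

and:

$$M^T \mathsf{F}^{-2} M\,\mathsf{P}^{-2} = \mathsf{P}\,A^T A\,\mathsf{P}^{-1},$$

with $A = \mathsf{F}^{-1} M\,\mathsf{P}^{-1}$. The matrices $A A^T$ and $A^T A$ are symmetric and positive-semidefinite, so that their eigenvalues are real and non-negative, and the matrices $\mathsf{F} A A^T \mathsf{F}^{-1}$ and $\mathsf{P} A^T A\,\mathsf{P}^{-1}$ have the same eigenvalues as $A A^T$ and $A^T A$, respectively. Therefore, the eigenvalues of $J^2$ are real and non-negative and we can write, according to Equation (13):

$$\sum_{i} \lambda_i^2 = \mathrm{Tr}\left(J^2\right) = 2 \sum_{c,p} \frac{M_{cp}}{(\tilde F_c \tilde P_p)^2} < 1,$$

with $\lambda_i$ the eigenvalues of $J$. Finally, since all the $\lambda_i^2$ are non-negative, from the preceding equation we have $\max_i \lambda_i^2 \le \sum_i \lambda_i^2 < 1$, so that at the fixed point $\rho(J) < 1$.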
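The spectral radius can also be estimated numerically. The sketch below is an illustration under stated assumptions rather than the procedure used in the paper: it computes the fixed point of the rescaled map at $\delta = 0$, builds the country block of the squared Jacobian with entries $\sum_p M_{cp} M_{c'p}/(\tilde P_p \tilde F_{c'})^2$ (using $F_c P_p = \tilde F_c \tilde P_p$), and applies power iteration, which is a technique swapped in here for simplicity. The function name and toy matrices are hypothetical.

```python
def spectral_radius(M, n_iter=300):
    """Estimate rho(J) at the fixed point of the rescaled map with delta = 0."""
    n_c, n_p = len(M), len(M[0])
    F, P = [1.0] * n_c, [1.0] * n_p
    for _ in range(n_iter):  # reach the fixed point first
        F_new = [sum(M[c][p] / P[p] for p in range(n_p)) for c in range(n_c)]
        P_new = [1.0 + sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        F, P = F_new, P_new
    # Country block of J^2 (C x C, non-negative entries).
    B = [[sum(M[a][p] * M[b][p] / (P[p] * F[b]) ** 2 for p in range(n_p))
          for b in range(n_c)] for a in range(n_c)]
    # Power iteration: the largest eigenvalue of B equals rho(J)^2.
    v = [1.0] * n_c
    lam = 0.0
    for _ in range(n_iter):
        w = [sum(B[a][b] * v[b] for b in range(n_c)) for a in range(n_c)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam ** 0.5


# Hypothetical toy matrices with C = 3 countries and P = 12 products.
full = [[1] * 12 for _ in range(3)]
nested = [[1] * 12, [1] * 8 + [0] * 4, [1] * 5 + [0] * 7]

rho_full = spectral_radius(full)
rho_nested = spectral_radius(nested)
print("rho(J), full matrix:", rho_full)
print("rho(J), nested matrix:", rho_nested)
```

For the completely filled matrix the mean field solution gives the largest eigenvalue of the block as $C/P$, so that $\rho(J) = \sqrt{C/P}$, which the sketch reproduces. Power iteration as written assumes that every pair of countries shares at least one product, so that the block has strictly positive entries.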

Robustness to Noise
Fitness and complexity (quality) values depend on the structure of the matrix $M_{cp}$. Noise can affect its elements by flipping their value. Thus, we test the robustness of the new metric to noise as described in [15]. The idea is to introduce random noise by flipping each single bit of the matrix with probability $\eta$, which is then the parameter tuning the noise level. The ranking of country fitnesses in the presence of noise, $R^{\eta}_c$, is then compared with the ranking obtained without noise, $R^{0}_c$. The Spearman correlation $\rho_s$ between these two sets is evaluated and shown in Figure 7 as a function of $\eta$ for both the original and the new metrics. The new metric is as stable against random noise as the original one, with an unavoidable transition around $\eta \approx 0.5$, where the noise is strong enough to alter significantly the structure of the matrix $M_{cp}$.
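The noise test can be sketched as follows. This is a simplified illustration, not the exact protocol of [15]: it assumes the rescaled map with a small illustrative $\delta$, a hypothetical nested toy matrix, a fixed random seed, and a plain Spearman formula that breaks ties arbitrarily.

```python
import random


def fitness(M, delta=1e-3, n_iter=200):
    """Rescaled fitness from the inhomogeneous map (delta is an illustrative value)."""
    n_c, n_p = len(M), len(M[0])
    F, P = [1.0] * n_c, [1.0] * n_p
    for _ in range(n_iter):
        F_new = [delta ** 2 + sum(M[c][p] / P[p] for p in range(n_p)) for c in range(n_c)]
        P_new = [1.0 + sum(M[c][p] / F[c] for c in range(n_c)) for p in range(n_p)]
        F, P = F_new, P_new
    return F


def flip_noise(M, eta, rng):
    """Flip every bit of M independently with probability eta."""
    return [[1 - x if rng.random() < eta else x for x in row] for row in M]


def ranks(values):
    """Rank positions (0 = smallest); ties are broken by index."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, i in enumerate(order):
        r[i] = position
    return r


def spearman(a, b):
    """Spearman correlation from rank differences (no tie correction)."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical nested toy matrix: 6 countries, 30 products.
M = [[1 if p < 30 - 4 * c else 0 for p in range(30)] for c in range(6)]
rng = random.Random(0)
F_clean = fitness(M)
rho_s = {eta: spearman(F_clean, fitness(flip_noise(M, eta, rng)))
         for eta in (0.0, 0.1, 0.3, 0.5)}
print(rho_s)
```

A positive $\delta$ keeps every fitness strictly positive even when the noise removes all the exports of a country, so the iteration never divides by zero. In practice one would average the correlation over many noise realizations for each $\eta$.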

Discussion
The proposed new inhomogeneous non-linear metric to estimate economic fitness and complexity, defined in Equations (4) and (6), carries many advantages with respect to the original one. The fitnesses and complexities resulting from the two approaches are not identical, but they are highly correlated with each other, as witnessed by the plots in Figure 2. This high correlation between the two metrics ensures that the results of all the studies carried out with the original metric so far can be obtained by applying this new metric as well.
Besides the stability of the metric and its robustness, one more advantage is that the fitness is well defined also for those countries that have low export volumes and that in the original metric had their fitness tending to zero. For those countries it is now possible to undertake a comparative study based on hypothetical investments (changing the elements of the $M$ matrix), so as to make predictions on their economic impact.
By first symmetrising the original equations, by adding an inhomogeneous parameter and by rescaling the quantities, one obtains Equation (6), where the parameter can be safely set to zero. This ensures that this new metric is parameter free as the original one. As a pleasant side effect, the fixed point of the map can be well approximated analytically, with an error with respect to the iterative fixed point of less than 3% (see Figure 3). The result is represented by Equations (9) and (10) at the first order (Equations (A2) and (A3) at the second order), which allow for a simple intuitive explanation of the complexity of products and fitness of countries.
Let us discuss Equation (10) first. The result suggests that the fitness of a country is trivially related, at first order, to its diversification: the more products a country exports, the larger its fitness, i.e., the more developed its capabilities. This simple explicit dependence of the fitness on the diversification is also an advantage with respect to the original metric, where the dependence was not explicit. The second term of Equation (10), which we call inefficiency, is also very interesting. If a country is the only one to export a given product, the contribution of this product to its fitness is a full one or, in other words, its contribution to the inefficiency is zero. This situation mimics a condition of monopoly on that product, and it is logical that the exporting country has the full benefit of it. When a product is exported by multiple nations, it is critical to assess whether those countries export few or many other products (see Figure 5). If a product is exported by a country $c'$ with low diversification (low capabilities), then that product is not supposed to be of high complexity. The result is that the ratio $K_{cc'}/D_{c'}$ can be close to one ($c = 1$, $c' = 2$ in the figure) and the inefficiency associated with the common products is high, resulting in a small contribution to the fitness of $c$. The inefficiency can be interpreted in terms of the bipartite network of countries and products: $K_{cc'}$ counts the number of links that connect countries $c$ and $c'$ to the same products, while the diversification $D_{c'}$ is the node degree of country $c'$. In other words, for a country $c$ the inefficiency counts the links to the products it has in common with all other countries and weights them according to the degree of those countries. To our knowledge, this kind of measure has never been considered in complex networks so far.
Since, statistically, countries with a high diversification also export many less complex products, the inefficiency is an increasing function of the diversification (Figure 4, main graph). If we subtract the general trend, which stems from the structure of the matrix $M_{cp}$, we can appreciate the net effect of selecting the goods to export. We call this new de-trended quantity net-efficiency. In this way we remove the negative effect of less valuable products and highlight the contribution of more sophisticated goods. In the inset of Figure 4 we show the net-efficiency as a function of diversification and highlight the three nations (Japan, Korea, and Switzerland) that stand out among the others. The time evolution of this new quantity is shown in the right panel of Figure 6. We can combine the time evolution of both fitness and net-efficiency for a given country to determine to what degree they are correlated. Figure 8 shows the two quantities for selected countries. It is clear that these two quantities are not related to each other and carry complementary information. In fact, the fitness is mainly connected to the product diversification of a country (Equation (10)), while the net-efficiency is connected to how complex the exported products are. In the figure we see two extreme cases represented by Switzerland (CHE) and South Korea (KOR), whose lines are practically orthogonal. While Switzerland has decreased the number of different exported goods over the years but has continued to export complex products, South Korea has kept the number of exported products almost constant but has increased their complexity. The opposite situation to that of South Korea is found for Germany (DEU), where the complexity of the exported goods has decreased in time. Interestingly, China (CHN) has systematically increased both the number of exported goods and their complexity, which we interpret as a symptom of a solid, expanding economy.
The complexity of products is estimated by Equation (9) as the reciprocal of the second term of the sum. Since the diversification $D_c$ of a country is a direct measure of its capabilities, we expect to find a simple relation between it and the complexities of products $\tilde Q_p$. Indeed, if we indicate with $c_i$ those countries exporting the product $p$, for which obviously we have $M_{c_i p} = 1$, and with $m = \sum_c M_{cp}$ their number, we can write:

$$\tilde Q_p = \left(\tilde P_p - 1\right)^{-1} \approx \left[\sum_{c} \frac{M_{cp}}{D_c}\right]^{-1} = \left[\frac{1}{D_{c_1}} + \frac{1}{D_{c_2}} + \cdots + \frac{1}{D_{c_m}}\right]^{-1},$$

which corroborates the main idea that the complexity of a product is driven by the countries with low diversification (capabilities) that export it. As a curiosity, we observe that the complexity of a product can be regarded as the equivalent resistance of a set of resistors in parallel, each with resistance $D_c$. In this picture, a high $D_c$ represents an effective resistance to the creation of a product and its export, so that if a country with a low diversification exports it, the effort (resistance) of producing that product is also low.
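The parallel-resistor picture is easy to check numerically. The sketch below assumes the first order estimate described above, i.e., the complexity of a product as the reciprocal of the sum of the inverse diversifications of its exporters; the function name and the diversification values are hypothetical.

```python
def complexity_first_order(exporter_diversifications):
    """Equivalent 'parallel resistance' of the exporters' diversifications."""
    return 1.0 / sum(1.0 / d for d in exporter_diversifications)


# A product exported by two highly diversified countries ...
q_high = complexity_first_order([400, 400])
# ... loses most of its complexity once a poorly diversified country exports it too.
q_low = complexity_first_order([400, 400, 10])
print(q_high, q_low)
```

As for resistors in parallel, the result can never exceed the smallest diversification among the exporters, which is why a single low-diversification exporter dominates the estimate.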

Construction of the M Matrix
We exploit the UN-COMTRADE data set [11], where re-export and re-import fluxes are explicitly declared, allowing us to exclude them from the analysis. As reported by UNSTAT in Ref. [16], 81.8% of the whole data set (96.8% in the case of developed countries) does not account for goods in transit. Moreover, commodities that do not cross borders are not included in the data.
Given the export volume $s_{cp}$ of a country $c$ in a product $p$, one can evaluate the Revealed Comparative Advantage (RCA) indicator [17], defined as the ratio:

$$\mathrm{RCA}_{cp} = \frac{s_{cp} \Big/ \sum_{c'} s_{c'p}}{\sum_{p'} s_{cp'} \Big/ \sum_{c'p'} s_{c'p'}}. \tag{17}$$

In this way one can filter out size effects. As described in the Supplementary Information of [8], from the time series of the RCA we can evaluate the productive competitiveness of each country in each product by assigning to it a productivity state from 1 to 4. State 1 means that the country does not produce (or is very uncompetitive in producing) a product, while state 4 means that it is one of the main producers in the world. We can then project these states onto the binarized matrix $M_{cp}$ by simply setting its elements to unity whenever a state larger than 2 is encountered, and to zero otherwise.
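The RCA indicator itself is straightforward to compute from a table of export volumes. The sketch below assumes the usual definition, i.e., the share of a country in the world exports of a product divided by the share of that country in total world exports; the function name and the two-by-two table are hypothetical, and the assignment of productivity states from the RCA time series described in [8] is not reproduced here.

```python
def rca(s):
    """Revealed Comparative Advantage for a table s[c][p] of export volumes."""
    n_c, n_p = len(s), len(s[0])
    country_total = [sum(row) for row in s]
    product_total = [sum(s[c][p] for c in range(n_c)) for p in range(n_p)]
    world_total = sum(country_total)
    return [
        [(s[c][p] / product_total[p]) / (country_total[c] / world_total)
         for p in range(n_p)]
        for c in range(n_c)
    ]


# Hypothetical export volumes: two countries, two products.
volumes = [
    [90.0, 10.0],
    [10.0, 90.0],
]
table = rca(volumes)
print(table)
```

Values above one indicate that a country exports a product more than its overall share of world trade would suggest. The sketch assumes that every country and every product has a non-zero total export volume.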