Green Technology Fitness

The present study provides an analysis of empirical regularities in the development of green technology. We use patent data to examine inventions that can be traced to the environment-related catalogue (ENV-Tech) covering technologies in environmental management, water-related adaptation and climate change mitigation. Furthermore, we employ the Economic Fitness-Complexity (EFC) approach to assess their development and geographical distribution across countries between 1970 and 2010. This allows us to identify three typologies of countries: leaders, laggards and catching-up countries. While, as expected, there is a direct relationship between GDP per capita and invention capacity, we also document the remarkable growth of East Asian countries that started from the periphery and rapidly established themselves as key actors. This geographical pattern coincides with higher integration across domains so that, while the relative development of individual areas may have peaked, there is now demand for greater interoperability across green technologies.


Introduction
There is broad consensus among academics and policy makers that accelerating the development of new low-carbon technologies and promoting their global application are crucial steps, albeit not the only ones, towards containing and preventing greenhouse gas (GHG) emissions. To be sure, climate change is a global phenomenon with marked local manifestations, which implies that geographical areas differ significantly both in their exposure and in their ability to respond effectively to climate events. Indeed, the striking paradox is that while environmentally friendly technologies emerge primarily in industrialised countries, the urgency to mitigate GHG emissions is stronger in emerging economies. Last but not least, besides the traditional negative externalities due to non-appropriability and non-exclusivity of knowledge, green technologies also engender positive externalities in the form of improvements to the quality of the environment.
These features highlight the importance of institutional conditions for promoting or thwarting sustainable economic growth. Governance mechanisms that are crucial to create incentives for efficient use of natural resources and for environmental conservation, while minimizing the prospect of market failures, are spatially bound [1]. Spatial features are also relevant because the generation and diffusion of knowledge stem from the recombination of ideas [2,3] among agents that have limited access to information, as well as imperfect capacity to absorb, process, and respond to it [4]. Because information exchange entails costs that increase with the diversity of the attendant knowledge base, higher coherence between activities is expected to facilitate the likelihood of innovation [5][6][7].
The key point is that economic development builds on existing local capabilities to generate distinctive technological and industrial profiles [8,9]. A major driver of the distinctiveness of these trajectories is indeed the composition of knowledge, that is, the number of underlying inputs and the interdependence between them [10][11][12]. The greater and more diverse the spectrum of know-how, the more complex the domains to which this knowledge is applied, be they products [13,14], industries [15] or technologies [16]. Empirical evidence provides clear indications about these patterns. First, there are significant differences in the complexity of knowledge produced across geographical locations. Second, only a few areas exhibit proficiency in complex activities, and this usually correlates with their long-run economic development. However, by virtue of path-dependence, while investing in complex technologies is beneficial in principle, many areas simply lack the necessary competences and, most fundamentally, their underlying conditions prevent them from creating a new path of development. As a consequence, and third, these features are dynamically self-reinforcing.
In this paper, we employ analytic techniques developed within the Economic Fitness-Complexity (EFC) approach to economic prediction [17] in order to assess the development and geographical distribution of green technologies between 1970 and 2010. EFC is a data-driven methodology that originally targeted the relation between the composition of the export baskets of countries and their potential to become more developed economies. The idea behind this methodology is that for a country to become competitive in the production of a given good, it must first acquire the necessary skills. However, the process leading to the acquisition of new capabilities is by its very nature cumulative and highly path-dependent, which is consistent with the fundamental intuition that complex products requiring advanced skills will be exported mostly, if not only, by high fitness countries that will also be competitive in the production and trade of less complex goods. Capabilities are generally not observable, and can be conceived as a latent intermediate layer between countries and products in an ideal tri-partite network. Some recent successful applications of EFC [18,19] have aimed to extract information about the effects of accumulated capabilities by studying the bipartite network of countries and exported goods. These studies have shown that the EFC algorithm has considerable predictive power of the future development of countries, as measured by their future per capita GDP. Among its outputs, the algorithm features a ranking of country fitness values that proxy how advanced the set of capabilities of each country is, and a ranking of product complexity values that proxies how advanced are the capabilities required to produce each product. The satisfying performance of the method on empirical data has also led to the development of a diversified array of methods and indicators that rest on the same premises. 
One such derived measure, which is called sector fitness, is a straightforward modification of the method proposed by [17]. This narrows down the analysis to a set of similarly classified products and generates a snapshot of the strength of each country in a specific sector of activity. Notice that the EFC method exhibits substantial versatility. For instance, it has been applied successfully to study labour sectors instead of exported products [20]; another recent application of directly related techniques has been employed to analyse the capability spillovers between the patenting activity, the scientific production and the export profiles of countries [21].
For this study, we use patent applications as a proxy of capability. The main source is the European Patent Office (EPO) Worldwide Patent Statistical Database (PATSTAT) containing patent applications that can be traced to the environment-related technologies catalogue (ENV-TECH) developed by the Organisation for Economic Co-operation and Development (OECD) [22] and organised in macro-domains such as environmental management, water-related adaptation, and climate change mitigation. The transliteration of the EFC approach to this hitherto unexplored empirical context rests on the idea that the criteria for assigning patent applications to specific domains (i.e., technological classes) are identifying characteristics of the expertise that is necessary for successful invention. In particular, the co-occurrence of technological classes in a country allows us to identify the extent to which inventions and the attending capabilities are common across countries. Accordingly, a country that has a diversified portfolio of technologies spanning from the most to the least complex ones will have higher fitness while, in turn, complex technologies appear almost exclusively in the portfolio of high-fitness countries. As a consequence, more specialised (or less diversified) countries operate almost exclusively in less complex sectors. In other words, the portfolio of activities of low-fitness countries is (almost) nested in that of higher-fitness countries.
Bearing in mind the benefits and the shortcomings of using patent data for the study of technology development (see e.g., [23][24][25]), the juxtaposition of the above database and methodology yields proxies of environment-related inventive activities that allow cross-country and cross-technology comparisons. In particular, the set of indicators proposed here informs a ranking of countries' propensity to create new green technology as well as of the development of these technologies. While we remain agnostic about the pathways through which countries develop and apply capabilities to environmental issues, we provide insights into the extent to which each country contributes to the global network of technological capabilities, as well as into the extent to which the technologies grow and develop as a result of distributed inventive efforts. Furthermore, we expect that a thorough mapping of who is inventing and in what can enrich the current debate on leaders and laggards in the transition to sustainable societies. A detailed analysis of the contextual institutional processes that shape the accumulation of innovative competences within countries (such as research and development, labour markets, etc.) and of how this affects differential performance between countries is beyond the scope of the current study, and is left for future research.

Data
The main data source is the PATSTAT database [26] of patent applications. In particular, we exploit patent classification codes to identify inventions in the domain of environment-friendly technologies within the classification ENV-TECH elaborated by the OECD [22], which groups International Patent Classification (IPC) and Cooperative Patent Classification (CPC) codes into 94 green technologies. The IPC and CPC are two widespread technology classification systems employed by patent offices to classify the patent documents based on the technological areas in which they claim to be novel. Both systems exhibit a hierarchical structure that describes the technical content of the patents in progressively finer detail at lower levels of aggregation [27].
We also exploit information in PATSTAT about patent families, i.e., collections of patents that can be linked to one or more common 'ancestor' patent documents. These collections typically contain documents relating to the multiple applications involved in protecting the same inventions in multiple countries, and are our unit of analysis [28]. We identify 1,179,657 patent families (or 2,690,606 patent applications) to which at least one ENV-TECH classification code is assigned. The resulting data set includes patent families, filed between 1970 and 2010, concerning a large share of green technologies in the following fields (and the associated 1-digit ENV-TECH code):
• climate-change mitigation technologies (CCMTs) related to energy production (4)
• capture and storage of greenhouse gases (5)
• CCMTs related to transportation (6)
• CCMTs related to buildings (7)
• CCMTs related to waste-water and waste management (8)
• CCMTs in the production of goods (9)
To measure national knowledge bases, we assign patent applications to countries using the inventor's address information in PATSTAT. This procedure yields a weighted matrix W(y) in which each element $W_{c,t}(y)$ represents the fractional count of inventions attributed to country c and technology t in year y (see Figure 1 for a detailed example). Such a value can be considered a proxy for the degree of involvement in green technology t of inventors residing in c (see Appendix C for a more detailed description of the procedure and the data sources).
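The fractional counting described above can be illustrated with a minimal Python sketch. It is not the procedure of Appendix C: the record layout (`year`, `countries`, `techs`) is hypothetical, and the equal split of each family's unit weight across inventors and technology codes is an assumption made for illustration.

```python
from collections import defaultdict
from itertools import product


def fractional_counts(families):
    """Return a dict mapping (country, tech, year) to a fractional count.

    Each family is a dict with hypothetical keys:
      'year'      -> filing year
      'countries' -> inventor countries, one entry per inventor
      'techs'     -> distinct technology codes assigned to the family
    Assumption: every family weighs one unit, split equally across
    all (inventor, technology) pairs.
    """
    W = defaultdict(float)
    for fam in families:
        countries, techs = fam["countries"], fam["techs"]
        if not countries or not techs:
            continue  # nothing to attribute
        share = 1.0 / (len(countries) * len(techs))
        for country, tech in product(countries, techs):
            W[(country, tech, fam["year"])] += share
    return dict(W)


# Two made-up families, for illustration only.
families = [
    {"year": 2005, "countries": ["DEU", "DEU", "FRA"], "techs": ["4_1", "6_1"]},
    {"year": 2005, "countries": ["JPN"], "techs": ["4_1"]},
]
W = fractional_counts(families)
```

By construction the entries of `W` sum to the number of families, so no invention is counted twice when it has several inventors or several technology codes.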

A Fitness Approach to Green Technology
As mentioned in Section 1, we focus on the green sector-fitness of countries that host inventors of green technologies and on the complexity of the green technology classes included in the inventions. Recall that the peculiarity of sector-fitness is that, to compute it, we do not extract information from the whole technology spectrum (all possible IPC and/or CPC classes) but, rather, restrict ourselves to a subset of classes that identify the relevant area for the study of a particular sector of activity, in our case green technologies. Furthermore, recall that this approach has already been employed successfully in the study of country exports to break down the fitness profile into individual industries. Applying sector-fitness to technologies does, however, imply some risks. The main issue is that the interpretation of sector fitness might not be as straightforward for technologies as it is for industries. In fact, defining an industrial sector from an aggregation of products implies grouping together objects that are classified unequivocally and generally assigned to only one sector. The same cannot be said for technologies, since multiple technological fields, namely the objects that we use to define the technological equivalent of a sector, usually contribute to the same patent, and these fields tend to be quite distant within the classification tree. For this reason, studying green technology classes in isolation neglects a wealth of non-green classes that are nonetheless part of green inventions. Bearing in mind these caveats, we expect that the selection of the data involved in applying the sector fitness approach to green technologies still yields reasonable results. The interested reader is referred to Appendix B for a more detailed discussion.
Computations rely on the EFC algorithm, whose inputs are binary matrices of countries (rows) and classes (columns). The underlying assumption is that each patent family weighs one unit, which is shared between (country, class) pairs. Since patent applications can be unambiguously attributed to their filing year, it is natural to build a series of yearly weighted matrices W(y), where each matrix element $W_{c,t}(y)$ is the sum of the shares of applications filed in year y that can be traced back to country c and green-technology class t. The EFC algorithm requires a binary matrix as input; thus, for each year y, we binarize W(y) based on Revealed Comparative Advantage (RCA) [29,30] and obtain M(y) such that:

$$M_{c,t}(y) = \begin{cases} 1 & \text{if } \mathrm{RCA}_{c,t}(y) \geq 1 \\ 0 & \text{otherwise} \end{cases}, \qquad \mathrm{RCA}_{c,t}(y) = \frac{W_{c,t}(y) \big/ \sum_{t'} W_{c,t'}(y)}{\sum_{c'} W_{c',t}(y) \big/ \sum_{c',t'} W_{c',t'}(y)} \tag{1}$$

The binary matrices are then fed to the EFC algorithm to yield non-negative scores and rankings for fitness as well as complexity. In formulae:

$$\tilde{F}_c^{(n)} = \sum_t M_{c,t}\, Q_t^{(n-1)}, \qquad \tilde{Q}_t^{(n)} = \frac{1}{\sum_c M_{c,t}\, \dfrac{1}{F_c^{(n-1)}}}, \qquad F_c^{(n)} = \frac{\tilde{F}_c^{(n)}}{\langle \tilde{F}_c^{(n)} \rangle_c}, \qquad Q_t^{(n)} = \frac{\tilde{Q}_t^{(n)}}{\langle \tilde{Q}_t^{(n)} \rangle_t}$$

with initial condition:

$$F_c^{(0)} = Q_t^{(0)} = 1 \quad \forall\, c, t$$

The fitness of a country is thus defined as the sum of the complexities of its technologies, rescaled at each iteration by the cross-country mean. The definition of the complexity of a technology, instead, involves a non-linear equation that attributes lower complexity to the technologies patented by low-fitness countries. It should be noted that, depending on the structure of M(y), the scores of the lower-ranked entities can converge to zero [31]. The rankings nevertheless remain consistent, which is why we focus on country and technology rankings rather than on the scores themselves.
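The binarisation and the iterative map described in this paragraph can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes an RCA threshold of 1 and the formulation commonly used in the EFC literature (intermediate fitness as the sum of complexities, intermediate complexity as the inverse of the summed inverse fitnesses, both rescaled by their mean at every step, starting from all ones). The function names and the fixed number of iterations are hypothetical choices.

```python
import numpy as np


def rca_binarize(W, threshold=1.0):
    """Binarize a country-by-technology matrix: 1 where RCA >= threshold."""
    W = np.asarray(W, dtype=float)
    row = W.sum(axis=1, keepdims=True)   # country totals
    col = W.sum(axis=0, keepdims=True)   # technology totals
    expected = row @ col / W.sum()       # counts under pure proportionality
    with np.errstate(divide="ignore", invalid="ignore"):
        rca = np.where(expected > 0, W / expected, 0.0)
    return (rca >= threshold).astype(int)


def fitness_complexity(M, n_iter=100):
    """Iterate the fitness-complexity map on a binary matrix M.

    Assumes every row and every column of M contains at least one 1.
    """
    M = np.asarray(M, dtype=float)
    if (M.sum(axis=1) == 0).any() or (M.sum(axis=0) == 0).any():
        raise ValueError("drop empty rows and columns before iterating")
    F = np.ones(M.shape[0])
    Q = np.ones(M.shape[1])
    for _ in range(n_iter):
        F_tilde = M @ Q                  # sum of complexities per country
        with np.errstate(divide="ignore"):
            # Sum 1/F only over the countries active in each technology.
            inv_F = np.where(M > 0, 1.0 / F[:, None], 0.0)
            Q_tilde = 1.0 / inv_F.sum(axis=0)
        F = F_tilde / F_tilde.mean()     # rescale by the mean
        Q = Q_tilde / Q_tilde.mean()
    return F, Q


# Toy, perfectly nested matrix: country 0 is the most diversified.
M_toy = np.array([[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]])
F, Q = fitness_complexity(M_toy)
country_ranking = np.argsort(-F)   # most to least fit
tech_ranking = np.argsort(-Q)      # most to least complex
```

In the toy matrix the technology patented only by the most diversified country comes out as the most complex, while the technology that every country patents comes out as the least complex, which is the nested pattern discussed in the Introduction.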
It is worth mentioning that patenting intensity (and coverage) in several countries has grown sharply in the past decades. Moreover, the filing of new patent applications in specific technological areas is relatively intermittent, meaning that for a given pair (c, t), the corresponding cell in matrix M(y) is often different from that in M(y + 1). This is more apparent if technological codes are disaggregated, and can induce some noise. A possible solution is to trade detail for inter-temporal stability by aggregating ENV-TECH technology classes from 3 to 2 digits [32]. A further complementary approach entails averaging over multiple yearly snapshots of W(y) before binarizing to obtain M(y, δ). For our analysis, we choose δ = 10 and divide the data into four non-overlapping windows (1971-1980, 1981-1990, 1991-2000, and 2001-2010), each labeled using the latest included year (e.g., 2010 stands for the period 2001-2010). Unless otherwise stated, 2-digit technology classes are employed throughout. The interested reader is referred to Appendix D for a more detailed account of the trade-off implied by inter-temporal and technological aggregation of the data as well as the trade-off implied by the choice of the extremes of the time interval included in the analysis.
Figure 2 shows the green fitness rankings of all countries across all four time windows. The higher the ranking, the more complex the country's portfolio of green technologies and, thus, the more advanced its invention competences. We provide a synthetic sketch of how countries' innovation capacity evolves over time, using colour coding to distinguish three groups depending on the initial ranking: leaders (black), followers (purple), and laggards (orange). To begin with, most of the countries that were leaders in 1980 are still at the top of the ranking in 2010. Even so, we observe some heterogeneity in their long-term paths.
A first group of global leaders such as the United States (USA), France (FRA) and Germany (DEU) (in black in Figure 2) maintained a steady high ranking throughout the period, while others, e.g., Japan (JPN), Sweden (SWE) and India (IND), remained mostly in the upper echelons but declined slightly and were caught up in the ranking by some follower and laggard countries. Among these, it is worth mentioning some fast-growing countries, listed by increase in the green fitness ranking, such as Malaysia (MYS), South Korea (KOR), China (CHN), Slovakia (SVK), Portugal (PRT) and Saudi Arabia (SAU). These all started from mid to bottom positions in 1980 and, after an impressive and steady acceleration, have reached the top part of the ranking. Notice that over time the geographical distribution of inventive activity spreads out, primarily towards Asia, while the presence of Latin American and African countries is only marginal. As regards Europe, the distinction between leaders and followers resonates with the differences between countries in the core and those in the periphery. Notice also that laggard countries exhibit similar stability to leaders, meaning that countries starting in such groups in the 1980 time window tend, with some notable exceptions, to remain in the same group throughout (see Table A2 of Appendix A).
Figure 3 lists the Green Complexity of 2-digit environmental technologies in our database and the associated ranking. Again, the idea is that a higher complexity ranking indicates that a technology entails a more advanced array of capabilities. Compared to countries, green technologies exhibit more fluidity, at least in the bottom half of the list, as about half of them have at least one appearance in the top 10 (conversely, only 4 have been in the top 5). Looking more in detail, three groups of green technologies emerge.
The first cluster comprises technologies that consistently rank highest (black in Figure 3), namely 'Nuclear Energy' (4_4), 'Environmental Monitoring' (1_5), 'Enabling Technologies for GHG Emissions Mitigation' (8_3) and 'Enabling Technologies in Transport' (6_5); in fact, each of them has topped the list at some point in the period under analysis. In the second cluster (purple in Figure 3) are technologies that, while being consistently high ranking, have at least once slipped out of the top 10. Among these we observe a variety of patterns: some stable technologies, such as 'Capture or Disposal of Greenhouse Gases other than CO2' (5_2); some oscillating technologies, such as 'Technologies for Efficient Electrical Power Generation, Transmission or Distribution' (4_5), 'CO2 Capture or Storage' (5_1) or 'Air Transport' (6_3); steady growers, i.e., 'Road Transport' (6_1), 'Rail Transport' (6_2) and 'Enabling Technologies' (4_6); and steady decliners, like 'Technologies Relating to Chemical Industry' (9_2) and 'Climate Change Mitigation Technologies for Sector-Wide Applications' (9_7). The third cluster (orange in Figure 3) contains technologies that have only been in the top 10 once, e.g., 'Water Pollution Abatement' (1_2) and 'Renewable Energy Generation' (4_1) (see Table A1 of Appendix A). Again, a closer look indicates heterogeneity of patterns over time: the most notable are the ascent of 'Road Transport' (6_1) and 'Technologies in the Production Process for Final Industrial or Consumer Products' (9_6), in contrast with the decline of 'Soil Remediation' (1_4) and 'Architectural or Constructional Elements Improving the Thermal Performance of Buildings' (7_3). An interesting indication is that Mitigation technologies rank in general higher than Adaptation technologies.
Another notable feature is that almost all Enabling Environmental Technologies-that is, horizontal technologies with potential applicability in a variety of fields-feature high in the ranking, thus reaffirming the complex nature of the underlying capabilities that are needed for their design and creation.
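The window aggregation used throughout this section, in which yearly weighted matrices are averaged over δ years before binarizing, can be sketched as follows. The function name and the dictionary-of-arrays layout are hypothetical, and averaging only over the years actually present in the data is an assumption; the binarisation step would then be applied to the averaged matrix.

```python
import numpy as np


def window_average(W_by_year, last_year, delta=10):
    """Average yearly matrices over the delta years ending in last_year.

    W_by_year maps a year to a country-by-technology array; all arrays
    are assumed to share the same shape and row/column ordering.
    """
    years = range(last_year - delta + 1, last_year + 1)
    stack = [W_by_year[y] for y in years if y in W_by_year]
    if not stack:
        raise ValueError("no data in the requested window")
    return np.mean(stack, axis=0)


# Made-up yearly matrices whose entries equal (year - 1970).
W_by_year = {y: np.full((2, 3), float(y - 1970)) for y in range(1971, 1981)}
W_1980 = window_average(W_by_year, last_year=1980)  # window 1971-1980
```

Labelling each window by its last year, as in the text, means that `W_1980` summarises the decade 1971-1980.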

The Most Complex Green Technologies and the Main Innovators
Let us now juxtapose the information gathered so far and look into combined country-green technology patterns. In Figure 4 we plot the green fitness based on the country-green technology matrices M(y, 10) against per capita GDP(y), for y ∈ [1980,2010]. By pooling all countries and years in our database, we estimate the expected value of green fitness through a non-parametric Nadaraya-Watson estimation with a Gaussian kernel [33]. The corresponding 95% confidence interval is computed by bootstrap resampling. Figure 4 provides a generalization of what has emerged so far, namely that there is a positive relationship between average GDP per capita and our measure of green fitness. We opt for GDP as a proxy of living standards in a country for two reasons. The first is that GDP is a gold standard which helps us ground our exploratory study on green innovation within the existing literature, primarily prior studies that use the EFC approach on trade. For all its known limitations, GDP remains the most widely used measure. The second reason is that when we considered the Human Development Index (HDI) [34] as an alternative (and more comprehensive) measure, we found a strong correlation with GDP and, thus, that findings were substantially unaltered. In turn, the triangular shape of the country-technology matrix of Figure 5 indicates that countries with higher levels of GDP per capita possess, as several scholars advocate, more developed capabilities that allow them to be major producers of more complex green technologies. By the same token, inventive efforts in poorer countries are limited to less complex technologies as a reflection of overall lower capabilities. These two snapshots confirm that the distribution of inventive capacity in green technology is broadly in line with prior literature [17,20,35].
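For readers unfamiliar with the estimator behind Figure 4, the following sketch shows a Nadaraya-Watson regression with a Gaussian kernel and a bootstrap band. It is an illustration under stated assumptions rather than the procedure actually used: the bandwidth, the percentile form of the bootstrap interval, the number of resamples and the synthetic data are all hypothetical choices.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Kernel-weighted local average of y_train at the points x_eval."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    diff = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff**2)          # Gaussian kernel
    return (weights @ y_train) / weights.sum(axis=1)


def bootstrap_band(x, y, x_eval, bandwidth, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap band: resample (x, y) pairs with replacement."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    curves = np.empty((n_boot, len(x_eval)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))
        curves[b] = nadaraya_watson(x[idx], y[idx], x_eval, bandwidth)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(curves, [alpha, 1.0 - alpha], axis=0)
    return lower, upper


# Synthetic data standing in for pooled (log GDP per capita, fitness) pairs.
rng = np.random.default_rng(42)
log_gdp = rng.uniform(6.0, 11.0, size=200)
fitness = 0.5 * log_gdp + rng.normal(0.0, 0.5, size=200)
grid = np.linspace(6.5, 10.5, 9)
estimate = nadaraya_watson(log_gdp, fitness, grid, bandwidth=0.5)
lower, upper = bootstrap_band(log_gdp, fitness, grid, bandwidth=0.5, n_boot=200)
```

The estimate at each grid point is a weighted mean of the observed fitness values, with weights that decay with distance in GDP; the band widens wherever few observations fall near the evaluation point.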
Looking more in detail, Table 1 shows the ten most complex green technologies over the entire time period of the analysis (1971-2010) and, for each one, lists the top five inventor countries, the share in total world green innovation and the corresponding RCA index. A few features emerge from this table. First, eight out of ten of the most complex technologies are for Climate Change Mitigation, the only two exceptions being GHG Capture and Storage, and Environmental Management. Second, in the upper part of the list are three types of enabling technologies, which indicates that the most advanced inventive efforts are currently devoted to perfecting existing technologies for wide, cross-sectoral purposes. Third, and related to the previous point, the list provides a balanced mix between mature technologies (e.g., enabling or nuclear energy) and very experimental ones (e.g., carbon capture, superconducting elements for efficient energy distribution). Fourth, the table also portrays a balanced picture, as the key environmental priorities encompass areas like transport, waste, industrial production, energy and buildings. Fifth, as anticipated earlier, the leading producers are all high-income countries. Another notable feature is the recurrence of Asian catching-up countries in various domains. South Korea ranks high in all but two (i.e., environmental management and rail transport) as a reflection of the environmental challenges due to a wide industrial mix (e.g., [36]). China excels in waste management, rail transport, industrial production and energy, a profile that resonates with the heterogeneity of emission sources due to remarkable regional and sectoral differences [37]. Conversely, Taiwan only appears in waste management, plausibly as a result of targeted policy efforts (e.g., [38]). Table 2 reports the same information as Table 1, but for the lower-complexity technologies.
Unsurprisingly, also in this case the top 5 innovators per technology are high-fitness countries which, consistently with the triangular structure of Figure 5, have the necessary capabilities to excel across the spectrum, while low-fitness countries perform relatively well only in mundane technologies.
Figure 6 focuses on a sample of top countries as of 2010 and shows that there is heterogeneity in the composition of the portfolios of such top innovators. Each panel contains the shares of patenting in all green technologies (ordered by increasing green complexity from left to right) in the first and final decade. For instance, Japan is relatively focused on the most complex technologies. This contrasts with the country profiles of, say, the US or France, which instead have a more balanced portfolio of green innovation across the complexity spectrum. The above is informative of the differential contribution of countries to the advancement of the green technology frontier. Moreover, this broad and long-term view allows us to distinguish countries that have been leaders since the beginning of the period, such as Japan, the US, France and Germany, from latecomers like China and South Korea, which indeed only started to patent in the 1990s. The distribution of the patenting shares for each country-decade panel reveals the direction of inventive efforts. For instance, in the last decade Japan (Panel A of Figure 6) stands out as rather proactive in technologies with high and low complexity, rather than those in the middle. By contrast, the relative contribution of the US (Panel B of Figure 6) has decreased after the 1990s, due to the entry of other actors, as highlighted earlier with regard to Figure 1. In relative terms, and compared to Japan, the distribution of US shares in green technology is higher in technologies with middle levels of complexity.
The relative shares of Germany and France (Panels C and D of Figure 6) are somewhat constant over time and spread evenly across the whole technological spectrum. Interestingly, newcomers like China and South Korea (Panels E and F of Figure 6) join the global path of green technology innovation with contributions to both less and more complex technologies.

How Does Green Innovation Capacity Vary with Income and Trade?
Consistent with the argument that the accumulation of competences is a vehicle for fostering growth [39], Figure 4 in the previous subsection hints at a strong positive correlation between green innovation and per capita income. At the same time, Figure 2 highlights a divide between mid-ranking countries, whereby some manage to climb up the green technology complexity ladder (e.g., China and South Korea) while others do not (e.g., Argentina, Bulgaria). The structural characteristics of a country undoubtedly play a fundamental role in unleashing its innovation potential, and in this part of the paper we investigate some of these characteristics and the extent of their influence. Given the exploratory nature of our analysis, in Figure 7 we focus on GDP per capita (as a proxy of standards of living and economic growth potential in each country) and export fitness (as a proxy of the trade performance of each country) [40]. We propose a graphical analysis based on a colour map which portrays the relation between GDP per capita and export fitness on the x-y axes, and the entire range of green technological fitness for all the countries in our database on the z-axis, represented with colour variation. In this case, as for Figure 4, green fitness is computed for each year as a moving average over a window of δ = 10 years. The colour map is obtained through a 3-dimensional Nadaraya-Watson non-parametric estimation [33] fed with a pooling of all countries in our database over the period 1980-2010. Figure A4 in Appendix E provides information about the green fitness estimation error; for ease of comparability with Figure 7, the iso-levels of green fitness are superimposed on the plot. Notice that the confidence level of the Nadaraya-Watson estimation is heterogeneously distributed in the export fitness-GDP per capita plane. The areas of Figure 7 with higher intensity are those of greater interest.
The purple-coloured portion at the bottom left-hand of the graph indicates that, as expected, countries with low GDP and low export fitness exhibit the lowest green technology fitness; also expected is the growth of green fitness as one moves towards the top right corner of the plot.
Another interesting portion of this diagram is on the right-hand side, where intermediate levels of (log) GDP per capita (between 8 and 9.5) and very high export fitness correspond to very high levels of green fitness. This indicates that a highly diversified portfolio of trade matters for unleashing innovation capacity among both high- and mid-level income countries. Put otherwise, a country's level of wealth is not a barrier to developing advanced competences for environmental innovation insofar as it engages in the trade of more complex products. The diagonal movement of colour is in agreement with the EFC narrative according to which countries with higher export fitness than per capita GDP show a level of complexity that has not yet translated into higher income, but indicates higher development and growth potential [35]. This finding resonates with the descriptive analysis of the rankings in the previous subsections, where the performance of emerging countries in green innovation has been commented on. It also resonates with the recombinant nature of the technology at hand, and the fact that green patents exhibit more diversity of technical components and of know-how relative to non-green ones [41]. Openness to trade and strategic specialization in key components for green technologies are thus likely to enable middle-income countries to accelerate in the pursuit of environmental innovation. This is especially true if we consider the high levels of fitness of enabling technologies that bring together different pools of know-how into coherent solutions for wide applicability.
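The colour map discussed in this section rests on a kernel regression with two regressors (GDP per capita and export fitness) and one response (green fitness). A minimal sketch of such an estimator is given below; the product Gaussian kernel with one bandwidth per axis, the bandwidth values and the synthetic data are assumptions made for illustration and do not reproduce Figure 7.

```python
import numpy as np


def nadaraya_watson_2d(X, z, grid, bandwidths):
    """Kernel-weighted local average of z over two regressors.

    X          : (n, 2) array of observations (two regressors)
    z          : (n,) array of responses
    grid       : (m, 2) array of evaluation points
    bandwidths : one bandwidth per regressor (product Gaussian kernel)
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    grid = np.asarray(grid, dtype=float)
    h = np.asarray(bandwidths, dtype=float)
    diff = (grid[:, None, :] - X[None, :, :]) / h      # shape (m, n, 2)
    weights = np.exp(-0.5 * (diff**2).sum(axis=2))     # shape (m, n)
    return (weights @ z) / weights.sum(axis=1)


# Synthetic stand-ins for (log GDP per capita, log export fitness) pairs.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(6, 11, 300), rng.uniform(-3, 3, 300)])
z = 0.3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.2, 300)

# Evaluate on a regular grid and reshape into a surface for plotting.
gx, gy = np.meshgrid(np.linspace(7, 10, 4), np.linspace(-2, 2, 5))
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = nadaraya_watson_2d(X, z, grid, bandwidths=(0.5, 0.5)).reshape(gx.shape)
```

Regions of the plane with few nearby observations receive small total kernel weight, which is one way to see why the reliability of the estimate is unevenly distributed across the export fitness-GDP per capita plane.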

Conclusions
This paper uses an Economic Fitness-Complexity approach to analyse green innovation trends across countries and technological fields over a forty-year period. The main questions we have addressed are: Which countries innovate the most? What are the most complex green technologies? What is the relationship between economic development and specialisation in environmental technologies?
We make three major contributions to the literature. First, we provide an overview of spatial and temporal characteristics of green innovation by exploiting the geo-localisation of patent data. Second, we move beyond aggregate trends and delve into the relative performance of each country in relation to the complexity of the technology. This allows us to identify three typologies of countries: leaders, followers, and laggards. As expected, there is a direct relationship between GDP per capita and innovation capacity. That said, we also observe the growing relevance of countries that started from behind but managed to become prominent actors. Most of these are based in East Asia. Third, we complement previous studies on green technology with a deeper understanding of how innovation capacity is distributed across areas of specialisation. The fitness ranking approach reveals that, after a period of deeper specialisation within diverse domains, innovation in green technology has become more horizontal, with bigger efforts being observed in cross-domain, or enabling, technologies. This trend seems to indicate that while the relative stage of development of individual areas (such as renewable energy generation or waste management) may have peaked in terms of technology life cycle, there is now demand for greater interoperability across green technologies, e.g., the integration of Information and Communication Technologies for monitoring energy distribution. A combination of more general characteristics of economic performance, such as greater cost efficiency and openness to trade, may entail that opportunities exist for countries that have remained at the margin of the geo-politics of climate change adaptation and mitigation. We hope that the empirical findings of our exploratory study will encourage further analysis of the untapped development potential of environmental sustainability, especially for fast-growing countries at the periphery.

Acknowledgments:
The authors acknowledge the comments of three anonymous referees, which have contributed substantially to improving the quality of the present manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix B. Column Selection and Technological Sector Fitness
In general, the fitness and complexity rankings produced by the EFC algorithm are not invariant to the addition (or subtraction) of rows or columns in the binary matrix. Figure A1 depicts a toy example illustrating this point. The right part of panel (a) depicts a 3-by-3 matrix M with rows and columns sorted by fitness and complexity, respectively. In particular, row $c_1$ has higher fitness than $c_3$, which in turn has higher fitness than $c_2$; the columns, instead, can be ordered by decreasing complexity as $t_1$, $t_2$, $t_3$. Panel (b) represents the same matrix with one additional column, $\tau_1$. Ordering the rows and columns of the new matrix with the EFC algorithm changes the rankings of both rows and columns. In particular, $c_2$ is now fitter than $c_3$, and $t_2$ is more complex than $t_1$.
This example indicates that applying sector fitness to technologies might yield biased results. However, two remarks mitigate this concern. First, while the figure shows that it is in general possible to alter the ranking by simply adding a column, we had to choose a rather extreme case to make the point. In fact, column $\tau_1$ is built in such a way as to bring the compositions of the most and least fit rows of panel (a) very close together. This is quite unrealistic, because it would be much harder to achieve if the matrix were substantially larger (as are the empirical matrices of the analysis). Second, adding $\tau_1$ is akin to adding the information that a country that was thought to patent only in a very ubiquitous agricultural field, and nothing else, also patents in groundbreaking medical technologies. This looks intuitively implausible and contrasts with the evidence shown, e.g., in Figure 5.
Finally, it is worth noting that, although applying sector fitness to green technologies cuts some potentially relevant columns from M, these omitted fields are not exclusively linked to green technologies; otherwise they would certainly be included in the classification. Hence, even if we were to include them, we would only account for them partially. Since we know that they can potentially influence the fitness and complexity rankings, adding them to the analysis would not make us any more confident that the results obtained with the EFC algorithm are free from bias. For this reason, we believe that omitting the columns that are not exclusively (or very predominantly) linked to green technologies should not be a major shortcoming.

Figure A1. Panel (a) depicts a binary country-technology matrix M consisting of three countries ($c_1$, $c_2$, and $c_3$) and three technologies ($t_1$, $t_2$, and $t_3$); ordering the rows by fitness and the columns by complexity (right), we see that $t_1$ is more complex than $t_2$ and $c_3$ has higher fitness than $c_2$. Panel (b) depicts the same matrix M of panel (a), to which an additional column $\tau_1$ has been added; ordering the rows by fitness and the columns by complexity (right), we see that now $t_2$ is more complex than $t_1$ and $c_2$ has higher fitness than $c_3$. The figure shows that, in general, the addition (or subtraction) of columns to M(y) can potentially affect fitness and complexity rankings.
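The sensitivity check discussed in this appendix can be sketched numerically. The snippet below is a minimal illustration, not the code used in the analysis. It assumes the commonly used form of the EFC iteration (fitness as the sum of the complexities of a row's technologies, complexity as the inverse of the sum of inverse fitnesses, both normalised by their mean at each step). The 3-by-3 matrix and the extra column are hypothetical and are not those of Figure A1; the function name `efc` and the number of iterations are likewise assumptions.

```python
import numpy as np

def efc(M, n_iter=50):
    """Run a fixed number of EFC iterations on a binary matrix M.

    Assumes every row and every column of M has at least one non-zero entry.
    Returns (fitness of rows, complexity of columns).
    """
    M = np.asarray(M, dtype=float)
    F = np.ones(M.shape[0])  # row (country) fitness
    Q = np.ones(M.shape[1])  # column (technology) complexity
    for _ in range(n_iter):
        F_new = M @ Q                        # sum of complexities per row
        Q_new = 1.0 / (M.T @ (1.0 / F))      # penalise columns held by unfit rows
        F = F_new / F_new.mean()             # normalise by the mean
        Q = Q_new / Q_new.mean()
    return F, Q

# Hypothetical nested (triangular) matrix: row 0 is the most diversified.
M = np.array([[1, 1, 1],
              [0, 1, 1],
              [0, 0, 1]])

# Hypothetical extra column, held only by the least diversified row.
extra = np.array([0, 0, 1])
M_ext = np.column_stack([M, extra])

for label, mat in (("original", M), ("extended", M_ext)):
    F, Q = efc(mat)
    print(label,
          "| rows by decreasing fitness:", np.argsort(-F),
          "| columns by decreasing complexity:", np.argsort(-Q))
```

Comparing the two printed orderings shows whether the added column alters the rankings for this particular choice; whether it does depends on the column added, which is the point made in the text.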

Appendix C. Measurement of National Knowledge Bases
The geographical coordinates of each inventor's address were obtained through the GeoNames database [42], which contains worldwide geographical information on, among others, administrative borders and postal codes. We first try to detect a postal code in the address string and use GeoNames postal code information to assign geographical coordinates to it. For the remaining addresses, we try to identify the name of a city and use GeoNames city information to obtain the coordinates. Finally, for the addresses not geolocalised in the first two steps, we send the address to the Google Maps API, which returns geographical coordinates. However, although the EPO improves and updates the PATSTAT database every year, an important share of inventors' addresses is still missing. To deal with this issue, we exploit the work by the Institut Francilien Recherche Innovation Société (IFRIS), which fills missing addresses using other patent databases, i.e., REGPAT and National Patent Databases [43]. Although PATSTAT is meant to assign an unambiguous identifier to each applicant or inventor, the same person may be assigned multiple IDs. In these cases, information on the applicant/inventor address may be fully provided by some IDs and missing in others. Hence, to reduce the number of applicants/inventors with a missing address, we focus on patent families and analyse multiple IDs in order to detect a non-missing address assigned to the same person. To do so, for each missing address we calculate the Levenshtein distance between the inventor name and each of the other names with a complete address. When the distance is lower than three, we use the address of the matched inventor to complete the missing one [44]. In so doing, we obtain 799,011 (67.7%) green patent families with at least one inventor geo-localised, distributed among 141 countries.
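The name-matching step described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the procedure actually run on PATSTAT: the function names, the lower-casing of names before comparison, and the sample names and addresses are all hypothetical. Only the rule itself (Levenshtein distance lower than three) comes from the text.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fill_missing_address(name, known, threshold=3):
    """Return the address of the closest known name whose distance is below threshold.

    `known` is a list of (name, address) pairs from the same patent family.
    Returns None when no name is close enough.
    """
    best_address, best_distance = None, threshold
    for other_name, address in known:
        # Lower-casing is an assumption made here to ignore capitalisation.
        d = levenshtein(name.lower(), other_name.lower())
        if d < best_distance:
            best_address, best_distance = address, d
    return best_address

# Hypothetical records within one patent family.
known = [("Rossi, Maria", "Via Roma 1, Pisa"),
         ("Bianchi, Luca", "Via Po 5, Torino")]
print(fill_missing_address("Rossi, Mario", known))
print(fill_missing_address("Verdi, Anna", known))
```

Restricting the candidate list to the same patent family, as the text does, keeps the number of comparisons small and limits false matches between unrelated inventors with similar names.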
Patent families without a geo-localised inventor (either because inventor information is missing or because the address is not found) have been dropped from the dataset. Variations of the geo-localisation rate across ENV-TECH families and patent offices are very small: the standard deviation is 0.089 in the first case and 0.118 in the second, when considering the top 10 patent offices (which account for 93% of all the patent families). We therefore conclude that the bias introduced by dropping non-geo-localised inventors is negligible.

Figure A2 suggests that the length of the time window we choose to study, and its extremes, can have a significant effect on the composition of each snapshot of data. For instance, the number of active [45] technology classes (Figure A2a) and active countries (Figure A2b) has grown considerably over time, which implies that a clear trade-off exists between the length of the time series and the size of the intersection of the data available in each year. This is also due to the fact that the number of applications containing green technologies remained quite small until the 1980s (Figure A2a, inset). The trade-off is also quite sharp if we move the right extreme of the time interval too far forward, since there is a sharp drop in the number of filed patent applications, and hence in the density of the data matrices, after 2012. This is compatible with the presence of a constant backlog of applications that have been filed but not yet examined and, for this reason, not yet added to the databases. The delay between filing and inclusion into PATSTAT varies depending on the patent office that received the application. For example, the backlog at the United States Patent and Trademark Office (USPTO) is estimated at around 40 months on average; this implies that the data for 2012 contained in the 2016a edition of the database is most likely still incomplete.
Figure A2 shows that by averaging over a longer time window (δ) and counting the number of non-zero rows in the average weighted matrix, the upper bound on the number of potentially active countries in matrix $M_\delta(y)$, defined by applying Equation (1) to Equation (4), increases substantially, especially since the mid-1970s. A further advantage of averaging over a reasonably long time window is that it helps remove small "holes" in the presence of some technology classes, especially at the 2-digit level, as shown in the right panel of Figure A3, which counts the frequency of each 2-digit class within every $M_1(y)$ over the period 1940-2015. The left panel, which reports the yearly count of 3-digit classes, clearly shows that, at this level of disaggregation, some classes are too recent and too inconsistently represented in the data to be included in an inter-temporal comparison.
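The effect of the window length on the number of potentially active countries can be sketched as follows. This is a minimal illustration under stated assumptions: the yearly weighted matrices are stored in a hypothetical array `W` of shape (years, countries, technologies) with non-negative entries, the window is a trailing window of `delta` years, and the function name `active_rows` is invented for the example.

```python
import numpy as np

def active_rows(W, delta):
    """Count non-zero rows of the delta-year average matrix for each window.

    W: array of shape (years, countries, technologies), non-negative weights.
    Returns one count per window of `delta` consecutive years.
    """
    W = np.asarray(W, dtype=float)
    counts = []
    for end in range(delta, W.shape[0] + 1):
        window_mean = W[end - delta:end].mean(axis=0)    # average over the window
        is_active = window_mean.sum(axis=1) > 0          # row has any non-zero entry
        counts.append(int(is_active.sum()))
    return counts

# Hypothetical data: 3 years, 2 countries, 2 technology classes.
W = np.zeros((3, 2, 2))
W[0, 0, 0] = 1.0   # country 0 patents only in the first year
W[2, 1, 1] = 2.0   # country 1 patents only in the last year

for delta in (1, 2, 3):
    print(delta, active_rows(W, delta))
```

A country that patents only sporadically drops out of single-year matrices but remains visible once several years are averaged, which is why the count of non-zero rows can only stay equal or grow as the window widens.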

Appendix E. The Relationship between Export Fitness, GDP Per Capita and Green Fitness: Estimation Error
To validate the three-dimensional analysis of the relationship between export fitness, GDP per capita and green fitness represented as a colour map in Figure 7, we compute the standard error (SE) of the Nadaraya-Watson estimate of green fitness. As can be observed in Figure A4, where the portions of the plot with an SE of about 0.4% or more are in black and those with an SE of about 0.2% or less are in white, the standard error is very heterogeneous. In fact, since GDP per capita and export fitness are positively correlated [35,46], most of the points used in our estimation lie on the diagonal of the x-y plane. Only a few countries show different trends. For instance, China has higher export fitness than per capita GDP and lies in the bottom-left corner of the graph; by contrast, most of the oil exporters have higher income than export fitness and are thus placed in the opposite corner. The diagonal shades of grey and white are consistent with our narrative. The green technological competitiveness of countries, proxied by green fitness, is determined by the interplay of export fitness and GDP per capita, and export fitness can compensate for low levels of income per capita.

Figure A4. (1) In grey scale, the estimation error of the green fitness ranking obtained with the Nadaraya-Watson kernel method. White indicates a standard error of ∼0.2% or less, and black a standard error of ∼0.4% or more. (2) The iso-lines of the green fitness ranking levels (lowest in deep purple, highest in clear yellow). The plot is obtained by pooling all countries in our database over the time interval 1980-2010. The different shades of black and white confirm our findings: export fitness and GDP per capita are complementary in determining the green technological capabilities of countries.
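For readers unfamiliar with the method, the following sketch shows one way to compute a Nadaraya-Watson estimate and an associated standard error at a single point of the plane. It is an illustration under stated assumptions and not necessarily the procedure behind Figure A4: the Gaussian product kernel, the bandwidths `hx` and `hy`, and the plug-in formula for the standard error (which ignores smoothing bias) are all choices made here for the example, and the function name is hypothetical.

```python
import numpy as np

def nadaraya_watson(x, y, z, x0, y0, hx, hy):
    """Kernel-weighted mean of z at (x0, y0) with a plug-in standard error.

    x, y: coordinates of the observations (e.g. export fitness, GDP per capita).
    z:    observed outcome (e.g. green fitness ranking).
    hx, hy: kernel bandwidths along each axis (assumed, not from the source).
    """
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    # Gaussian product kernel weights centred on the evaluation point.
    w = np.exp(-0.5 * (((x - x0) / hx) ** 2 + ((y - y0) / hy) ** 2))
    w_sum = w.sum()
    estimate = (w * z).sum() / w_sum
    # Plug-in standard error of a weighted mean, using local residuals.
    se = np.sqrt((w ** 2 * (z - estimate) ** 2).sum()) / w_sum
    return estimate, se

# Hypothetical observations placed symmetrically around the origin.
est, se = nadaraya_watson(x=[-1.0, 1.0], y=[0.0, 0.0], z=[0.0, 2.0],
                          x0=0.0, y0=0.0, hx=1.0, hy=1.0)
print(est, se)
```

Evaluating the function on a grid of points yields both the estimated surface and a map of its standard error. Regions with few nearby observations receive small total weight and therefore larger standard errors, which is consistent with the off-diagonal areas appearing darker in the figure.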