Classification of Sustainable Activities: EU Taxonomy and Scientific Literature

Abstract: In March 2020, the European Commission published the EU Taxonomy, a classification system of economic activities that can be considered environmentally sustainable. Motivated by this policy initiative, we propose a bibliometric analysis, based on the Web of Science database for the period January 1990–March 2020, of the extant scientific production related to the EU Taxonomy environmental objectives and macro-sectors. We find that a considerable number of scientific works (161,595 publications) have investigated Taxonomy-related areas, showing that the EU Taxonomy defined a working method, based on the cooperation among regulators, academics, and industry, that represents a valuable example of evidence-based policy making. Furthermore, topic modelling analysis shows that the extracted papers focused on improvements in production processes, innovation, and environmental performance. Thus, exploiting the time and geographic patterns of the scientific publications, we perform a multivariate analysis to investigate their relationship with subsequent levels of pollution. Our evidence shows that, for the past, a higher level of EU Taxonomy-related publications is associated with a lower level of CO2 emissions, supporting the view that scientific production has a societal impact in terms of environmental sustainability. Accordingly, now that EU Taxonomy-related topics have been incorporated into policy measures, further positive environmental effects are expected from here on out.
[...] of pollution. Our results are robust to controlling for different levels of government expenses in education. We also control for natural differences across countries, which may lead to different carbon emissions, with variables for oil rent, energy use, and energy production from renewables (models (4) to (6)). The inclusion of these variables does not affect our main findings. Finally, our results are also confirmed when considering a longer time span to compute Taxonomy, i.e., a cumulative sum over four or five years.


Introduction
The Paris Agreement, signed in December 2015 by 189 Nations and entered into force in November 2016, represents a turning point in recognizing global warming and the need to steer the economy towards low-carbon activities. Specifically, its main aim is "holding the increase in the global average temperature to well below 2 °C above pre-industrial levels and pursuing efforts to limit the temperature increase to 1.5 °C above pre-industrial levels, recognizing that this would significantly reduce the risks and impacts of climate change". With regard to the general goal of the Paris Agreement, a dominant issue is the channeling of financial flows towards a green, low-carbon, and climate-resilient economy. The reasonable assumption is that finance should follow an allocative efficiency principle able to quantify to what extent a company is environmentally sustainable. Accordingly, several institutional initiatives pursued worldwide have begun to cope with this problem and its related issues, such as the classification of a business as 'green' and its certification, the public disclosure of this quality, the metrics to measure the sustainable attributes of economic activities, and so on.
In Europe, a vivid debate on global warming, with a focus on the link between economic activities and climate change, generated expectations for decisive actions and policy interventions to address these concerns. Thus, environmental issues have been placed at the top of the European [...]
Sustainability 2020, 12, 6460
[...] represents a comprehensive map of the research papers that have been written on the environmental objectives specified by sectoral definitions identified in the EU Taxonomy.
As a first straightforward result, we found a significant body of pre-existing scientific production in line with the environmental objectives by economic sectors identified by the European Commission. In fact, among the 432 queries, only 27.5% (119 extractions) returned a null response. Conversely, 72.5% of the queries combining the EU environmental goals with specific economic sectors returned scientific literature, which this paper describes by development over time and by geographical area.
At this point, since the data obtained from the WoS core collection were retrieved using keywords, we performed a further analysis to exploit, in addition to the quantity, the content of the research papers on EU Taxonomy-related areas, by making use of Latent Dirichlet Allocation. The topic model analysis highlights the prevalence of papers related to business innovation, with an emphasis on process improvements, which are ultimately expected to improve the environment. These findings are in line with a recent stream of literature which suggests that low-carbon technology innovations, when combined with traditional policy tools such as carbon trading and carbon taxes, might be largely effective in shaping firms' environmental sustainability performance [14][15][16][17][18]. Thus, supported by our evidence and the related literature, we then exploited the time series and geographical information of our collected dataset to develop and estimate an econometric setting that investigates the relationship between the EU Taxonomy-related scientific production (which is here considered a driver of innovation towards sustainability) and pollution levels. This analysis is also motivated and reinforced by the expected societal impact of environmental studies suggested by a flourishing business literature (see, among others, [19][20][21][22][23][24][25]).
To approximate pollution levels, we considered carbon emissions, recognized as the main source of global warming [26] and largely used in the literature (see, e.g., [27,28]). From our estimations we found a negative relationship between the EU Taxonomy-related literature and pollution levels, suggesting that a higher production of scientific literature on topics related to the EU Taxonomy promotes lower levels of pollution over time. Importantly, this result also holds true when considering the scientific production for each environmental objective and macro-sector of the Taxonomy. These findings are robust to controlling for international events related to pollution issues and for potential confounding factors, including economic development and generic scientific production, as well as to the inclusion of country and year fixed effects. This implies that the identified relationship is not simply driven by countries with a higher investment in research and innovation, or by other domestic factors that affect scientific research, but that it depends on the specific research outcomes in areas and on topics related to the environment.
Although it is still premature to quantify the impact that the EU Taxonomy will have on companies and investors, it is reasonable to expect deep changes in economic activities and financial flows. The empirical results of this paper support a possibly relevant role for the EU Taxonomy: the negative association between Taxonomy-related scientific production and pollution levels that emerged from our empirical analyses leads us to argue that, in the future, as the Taxonomy is implemented in business and finance, emissions could be further reduced.
This paper makes an important contribution to several strands of research. First, to the best of our knowledge, it represents the first attempt to use the EU Taxonomy in the context of scientific literature, providing an example of its usability and some empirical evidence about the diffusion of taxonomy concepts in the scientific domain. Second, it expands the empirical findings on the "societal impact" of scientific research. Finally, our findings are relevant for policy makers and regulators in other parts of the world, as well. As the EU Taxonomy could be effective for tackling global warming, at the international level, other non-EU countries may decide to adhere to it, or to adopt similar classification systems, to identify sustainable economic activities.
The remainder of the paper is structured as follows. Section 2 offers the background of the paper, with a mention of the institutional setting that leads to the development of the EU Taxonomy. Section 3 presents the methodology adopted to construct the dataset of scientific papers related to the EU Taxonomy and the analysis conducted with topic modelling. Section 4 reports the main empirical results, and Section 5 concludes with a discussion of the main findings.

Institutional Background of the EU Taxonomy
In July 2018, the European Commission set up a Technical Expert Group on sustainable finance (TEG) including a heterogeneous group of participants from academia, business, and the finance sector, on the one hand, and members and observers from EU and international public bodies, on the other hand. In this context, one of the four mandates of the TEG was specifically to develop a unified, clear, and detailed EU classification system for sustainable economic activities. Given the complexity of the task, the TEG activities were extended until 2020 and, after the first release in June 2019, the final version of the Taxonomy Technical Report was released in March 2020 [29,30] with the publication of the so-called EU Taxonomy. The six environmental objectives pursued in the EU Taxonomy are: (1) Climate change mitigation, (2) climate change adaptation, (3) sustainable use and protection of water and marine resources, (4) transition to a circular economy, (5) pollution prevention and control, and (6) protection and restoration of biodiversity and ecosystems. Taking these environmental objectives as the rational target, the TEG applies the EU Taxonomy as the framework that guarantees a unified classification system for determining when economic activities can be considered environmentally sustainable. Based on the NACE (French: Nomenclature statistique des Activités économiques dans la Communauté Européenne) classification, the official industry classification used in the European Union, the TEG selected a list of macro-economic sectors considered relevant in terms of GHG emissions in the EU (roughly 93.5% of Europe's emissions), and covering a significant proportion of GDP and total employment at the EU 28 level. For each macro-sector, the TEG identified a list of eligible activities and, then, determined the "detailed technical screening criteria" necessary to validate whether economic activities meet the relevant substantial contribution to the environmental objectives. 
Note that eligibility under the EU Taxonomy is assessed on an activity basis rather than by entity (i.e., company). Nevertheless, a key part of the EU Taxonomy assessment includes defining what part of a corporate's performance can be assessed as sustainable. To this end, determining to what degree a company can be considered environmentally sustainable depends on the individual contribution of each eligible economic activity to the company performance, in terms of turnover, or revenues when appropriate, but also in terms of capital or operational expenditure [30].
As a general rule guiding the performance criteria, an economic activity qualifies as environmentally sustainable when (i) it contributes substantially to at least one of the six environmental objectives; (ii) it follows the principle of "Do No Significant Harm" to any other environmental objective; and (iii) it complies with minimal social safeguards. Indeed, when an economic activity meets the EU Taxonomy performance thresholds, it is then certified as "EU Taxonomy-aligned". In practice, a company that is fully or partially involved in eligible economic activities (e.g., reforestation, electricity generation, public transport) and meets the EU Taxonomy technical screening criteria is allowed to classify a corresponding percentage of its performance (turnover, capital expenditure, or operational expenditure) as environmentally sustainable. Consequently, a financial intermediary investing in that company may compute its proportional share of Taxonomy-aligned investment. In this way, the intermediary (e.g., investment and mutual funds, private and occupational pensions) will disclose the nature and the extent to which its financial products are EU Taxonomy-aligned. This represents a powerful edge for investment firms, as companies will have to disclose their EU Taxonomy-aligned turnover and capital expenditure in their annual reports, as required in the Non-Financial Reporting Directive. Otherwise, the financial intermediary will have to declare that a product does not align.
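The alignment logic described above can be illustrated with a minimal sketch. It assumes that the three conditions are available as simple yes/no flags per activity and that alignment is measured on turnover; the class, function names, and figures are hypothetical and serve only to show the arithmetic, not the official assessment procedure.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    turnover: float                 # turnover generated by the activity
    substantial_contribution: bool  # condition (i)
    no_significant_harm: bool       # condition (ii)
    minimum_safeguards: bool        # condition (iii)


def is_aligned(a: Activity) -> bool:
    # An activity is aligned only if all three conditions hold.
    return (a.substantial_contribution
            and a.no_significant_harm
            and a.minimum_safeguards)


def aligned_share(activities: list[Activity]) -> float:
    # Share of a company's turnover coming from aligned activities.
    total = sum(a.turnover for a in activities)
    if total == 0:
        return 0.0
    return sum(a.turnover for a in activities if is_aligned(a)) / total


def portfolio_aligned_share(holdings: list[tuple[float, float]]) -> float:
    # holdings: (portfolio weight, company aligned share) pairs.
    return sum(weight * share for weight, share in holdings)


# Hypothetical company: 60 of 100 units of turnover are aligned.
company = [
    Activity("electricity generation", 60.0, True, True, True),
    Activity("other activity", 40.0, False, True, True),
]
company_share = aligned_share(company)

# Hypothetical fund: half invested in this company, half in a non-aligned one.
fund_share = portfolio_aligned_share([(0.5, company_share), (0.5, 0.0)])
print(company_share, fund_share)
```

The same computation could be repeated with capital or operational expenditure in place of turnover.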
The EU Taxonomy will be converted into a Regulation by the end of 2020. The criteria will be laid down in delegated acts, following recommendations from key stakeholders (i.e., expert groups, with the TEG at first followed by the 'Platform on sustainable finance', and public consultation) and Member States. Consequently, market actors will have to start complying with the Taxonomy Regulation requirements as of December 2021.

Materials and Methods
Our main data source was Web of Science (WoS), a Clarivate Analytics product, used to identify publications related to the EU Taxonomy. We created a list of 432 combined keywords obtained as a text union between words representing the six environmental objectives and terms describing each economic activity included in the EU Taxonomy. As anticipated, the EU Taxonomy adopts NACE Revision 2 for industrial classification. The selected NACE Macro-Sectors are: Agriculture, Forestry and Fishing; Mining and Quarrying; Manufacturing; Electricity, Gas, Steam, and Air Conditioning Supply; Water Supply and Waste Management; Construction; Information and Communication. Within the NACE Macro-Sectors, 72 economic activities are identified as eligible (see Table A1 in Appendix A). Keywords describing each activity and the environmental objectives (keywords come from the TEG report [29]) were inserted in order to run extraction queries from the WoS database's search engine. Then, we downloaded the full record for each query. The data included various types of information about research contributions, such as title, abstract, keywords, publication type, publisher name, publication date, authors' names and their affiliations, and other details.
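The construction of the 432 combined keywords (6 objectives times 72 activities) can be sketched as a Cartesian product. This is only an illustration: the activity labels below are placeholders for the entries of Table A1, the objective strings are shortened labels, and the `AND` query format is an assumption rather than the exact syntax submitted to the WoS search engine.

```python
from itertools import product

# Shortened labels for the six environmental objectives.
OBJECTIVES = [
    "climate change mitigation",
    "climate change adaptation",
    "water and marine resources",
    "circular economy",
    "pollution prevention and control",
    "biodiversity and ecosystems",
]

# Placeholder labels: the 72 eligible activities are listed in Table A1.
ACTIVITIES = [f"activity_{i:02d}" for i in range(1, 73)]


def build_queries(objectives, activities):
    # One combined keyword per (objective, activity) pair.
    # The quoted "AND" format is a hypothetical query syntax.
    return [f'"{obj}" AND "{act}"' for obj, act in product(objectives, activities)]


queries = build_queries(OBJECTIVES, ACTIVITIES)
print(len(queries))  # 6 * 72 = 432
```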
Following this procedure, we constructed a unique dataset of metadata about the EU Taxonomy in the scientific literature. Our choice to rely on the WoS core collection database is consistent with previous literature, which supports using it as a tool in bibliometric studies on climate-related research [31]. In addition, it is commonly suggested that science-technology interactions be measured by (but obviously, not limited to) citations of documents in the WoS. For example, it has been shown [32] that 50 to 55% of non-patent references in patent documents are journal references covered by the WoS. Moreover, it is considered one of the most reliable data sources for some of the research areas related to the Taxonomy, such as Natural Sciences and Engineering [33,34], and it also contains the most relevant and influential journals in its record, so it has been widely used to study scientific production and citations [35,36]. WoS does over-represent English-language journals (in our sample, roughly 97% of the papers are in English), but this bias is common to other databases, e.g., Scopus.

EU Taxonomy in the Scientific Literature: Descriptive Results
The first step of the analysis presented in this paper involved an investigation of the state of the art of the previous scientific literature, filtered by the environmental objectives and economic activities proposed in the EU Taxonomy framework. The analysis of scientific production on environmental issues is not new, with contributions addressing different topics and using different approaches. Some of these works look at concepts related to various environmental aspects [35] or sustainability topics, such as the green, circular, and bio economy [37][38][39]. Other papers explore scientific productivity by analyzing growth over time and cross-country patterns [40,41]. The present paper adds to this literature by providing a new perspective, where the existing scientific production on environmental issues is selected using the objectives and criteria proposed by the TEG in the EU Taxonomy.
Following the methodology described in Section 3, we identify 161,595 publications related to the EU Taxonomy for the period January 1990-March 2020. This evidence provides a response to our first research question, because scientific papers about the environmental objectives and the macro-sectors of the EU Taxonomy are a non-negligible part of the overall WoS publications. Among the 432 queries, only 27.5% (119 extractions) yielded a null response. Specifically, the empty extractions mainly concern the goals of Climate change adaptation (30 cases), Circular economy (29 cases), Climate change mitigation (21 cases), and Biodiversity/Ecosystems (19 cases); 14 unproductive extractions refer to the environmental goal of Pollution and only 6 are related to Water and Marine resources.
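As a quick consistency check, the breakdown of null extractions by objective can be tallied against the reported totals. The counts are those stated above; the variable names are ours.

```python
# Null (empty) extractions by environmental objective, as reported in the text.
null_by_objective = {
    "Climate change adaptation": 30,
    "Circular economy": 29,
    "Climate change mitigation": 21,
    "Biodiversity/Ecosystems": 19,
    "Pollution": 14,
    "Water and Marine resources": 6,
}

total_queries = 432
null_total = sum(null_by_objective.values())   # 119
productive_total = total_queries - null_total  # 313

null_share = 100 * null_total / total_queries
productive_share = 100 * productive_total / total_queries
print(f"{null_total} null ({null_share:.1f}%), "
      f"{productive_total} productive ({productive_share:.1f}%)")
```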
Turning to the cases of productive extractions, Table 1, below, provides a detailed overview of the distribution of the scientific production, divided by environmental objectives and NACE Macro-Sectors. As far as the objectives are concerned, Water and Marine resources is the most populated topic in the sampled literature, followed by Biodiversity/Ecosystem and Pollution, suggesting that these three are the most well-established research fields. All together they account for 94% of our sample. For the other objectives we find a more limited number of publications. Climate change adaptation and Climate change mitigation, together, account for fewer than 10,000 papers, while for Circular economy we find fewer than 1000 publications. Based on the NACE Macro-Sector, our evidence indicates that the majority of the papers refer to Water, Electricity, and Agriculture (roughly 89% of the sample), followed by Manufacturing and Transportation (8%). Papers dealing with Buildings and ICT are quite scarce. Figure 1 depicts the number of publications related to the EU Taxonomy, by environmental objectives, over time. Water and Marine resources have always accounted for the largest share of publications, with a marked increase since 2015. Climate change mitigation and Climate change adaptation are amply represented in the sample (more than 100 papers each), but only starting from 2010; nevertheless, by the end of 2019 that number had increased exponentially. This might reflect the growing efforts that regulators and policymakers put into developing climate change mitigation and adaptation strategies, which, in turn, may have influenced the academic community. Papers about Circular economy were mainly published in the last three years, possibly showing that, although it is not new as a research field, the interest that its applications have raised in the scientific community is more recent.
This evidence is in line with [37], who, conducting a bibliometric review of the concepts of Circular Economy, Green Economy, and Bioeconomy, suggest that these different concepts are currently being mainstreamed as development and sustainability avenues. Our data also cover the first quarter of 2020 (omitted in Figure 1 for representation purposes), which essentially confirms the described trends for the most recent years. Figure 2 illustrates that all macro-sectors showed a growth trend over time, with a substantial increase in the last five years, as a result of the research community's increased attention towards climate change topics, including in more applied fields. Compared to 2014, the number of publications about Water, Electricity, and Agriculture had almost doubled by the end of 2019, while for Manufacturing and Transportation it had tripled.
In Figure 3, we show the geographical distribution of scientific papers by the nationality of authors' university affiliation; the darker green color indicates a higher number of papers published. Notably, EU Taxonomy-related publications are widespread at the global level, including developing countries. As expected, the United States, China, and Great Britain show a larger number of publications, but EU Member States are well represented, as well. In line with the findings of [37], we found notable sources of literature for sustainability issues in Australia, Brazil, Canada, Russia, and South Africa.

EU Taxonomy in the Scientific Literature: Topic Modelling Analysis
Since the abovementioned data obtained from the WoS core collection were extracted using keywords, we performed a further analysis to provide a thematic landscape of research papers in the EU Taxonomy-related areas, by making use of the latent Dirichlet allocation (LDA). The LDA is an unsupervised Bayesian machine-learning approach developed by [42] that identifies latent topics in a corpus of documents. The basic assumption of LDA is that each document exhibits a mixture of latent topics and, using the probability of words co-occurring within documents, it allows the identification of sets of topics. In recent years, the LDA model has become the most widely used approach for topic modelling. Following [37], we applied the LDA to the abstracts of scientific literature related to the Taxonomy for each corpus (i.e., six bodies of literature). The publications are aggregated by environmental objective and, as a result, we obtained a list of salient key-terms associated with the overall corpus as well as a maximum of four topics, with a list of associated topic-specific key-terms.
For a selection of three research-relevant objectives (namely, 'Climate change mitigation', 'Climate change adaptation', and 'Circular Economy'), we programmed the model to highlight four topics and identify the key-terms associated with the topics. The model also calculates the distance between topics on a two-dimensional plane. This scatterplot, based on multi-dimensional scaling (MDS), enables the visualization of semantic relationships between topics, such that topics (circle areas) that appear distant in the graph are the ones that are semantically distant. As a matter of fact, like factor analysis, MDS allows the level of semantic similarity or distance between topics to be visualized [37]. Figures 4-6 visualize the application of the LDA used to generate topics from the analyzed abstracts in order to explore hidden semantic structures. In the left panel of Figures 4-6, the scatterplot shows the topics as circles. The right panel depicts a horizontal bar chart whose bars represent the most salient key-terms, useful for interpreting the selected topic on the left (highlighted in red). A pair of overlaid bars represents both the overall term frequency (in blue) and the topic-specific frequency of the term (in red). We encode each topic's overall prevalence using the areas of the circles, and we sort the topics in decreasing order of prevalence. In Figures 4 and 5, we provide a snapshot of the results from the LDA analysis for the objectives 'Climate change mitigation' and 'Climate change adaptation', which were the object of greater emphasis by the TEG. Topics that emerge most frequently in the analyzed literature are related to environmental performances and climate-related issues (see, e.g., Topic 1 in Figure 4). This evidence suggests that, although the sample has been constructed using just keywords, the extracted scientific literature deals with issues that are fully compliant with the aims of the EU Taxonomy.
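To make the mechanics of LDA concrete, the following is a minimal sketch in pure Python. It uses collapsed Gibbs sampling, which is chosen here for brevity and is not necessarily the inference scheme behind the results reported in this paper; the toy documents, the hyperparameters `alpha` and `beta`, and all function names are illustrative assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_words = len(vocab)
    index = {w: i for i, w in enumerate(vocab)}

    doc_topic = [[0] * n_topics for _ in docs]             # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word v in topic k
    topic_total = [0] * n_topics                           # total tokens in topic k
    assign = []                                            # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], assign[d][i]
                # Remove the token from the counts, then resample its topic.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: topic-word (phi) and document-topic (theta).
    phi = [[(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
            for v in range(n_words)] for k in range(n_topics)]
    theta = [[(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
              for k in range(n_topics)] for d, doc in enumerate(docs)]
    return vocab, phi, theta


def top_terms(vocab, phi, k, n=3):
    # The n most probable words of topic k.
    order = sorted(range(len(vocab)), key=lambda v: phi[k][v], reverse=True)
    return [vocab[v] for v in order[:n]]


# Toy "abstracts", already tokenized (hypothetical data).
docs = [
    ["emission", "carbon", "climate", "carbon"],
    ["climate", "emission", "policy", "carbon"],
    ["waste", "recycling", "material", "waste"],
    ["recycling", "material", "resource", "waste"],
]
vocab, phi, theta = lda_gibbs(docs, n_topics=2)
for k in range(2):
    print(k, top_terms(vocab, phi, k))
```

Each row of `theta` is the topic mixture of one document, and each row of `phi` is the word distribution of one topic; the salient key-terms of a topic are its highest-probability words.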

In addition, another topic that emerges frequently in the corpora analyzed is related to production processes and their optimization. Salient words such as, among others, plant, processes, efficiency, material, methods, and technology can be associated with the research area of production efficiency (see Topic 1 in Figure 5). Similar considerations can be made for the literature on Circular Economy, which revolves around resource efficiency, recycling and, more generally, increasing resource productivity (see Topic 4 in Figure 6).
[Note to Figures 4-6: PC1 and PC2 refer to the two Principal Components used to visualize the distance between topics on a two-dimensional plane. The right panel shows the most salient key-terms found in the abstracts. Some words are truncated because this is how the LDA algorithm works. The salient key-terms referring to Topic 1 are highlighted as a red circle.]
In line with [37], we find that salient key-terms relating to Circular economy are associated with environmental sustainability such as waste, energy, and recycling. The analysis of the co-occurrence of words shows that the words 'agriculture' and 'water' recur in every examined corpus.
Up to this point, the topic model analysis highlights the prevalence of papers related to business innovation, with an emphasis on process improvements, which are ultimately expected to improve the environment. The rise of renewable energy, as well as the reduction and measurement of pollution levels, together with an increasing green awareness, have configured economically, renewably, and socially sustainable business approaches harnessed by technologies [43]. Such evidence is in line with the literature claiming innovation to be an important tool for achieving sustainability [44][45][46][47][48][49][50]. Some studies posit a significant and positive relationship between innovation and sustainability, supporting the argument that companies need to innovate, develop new products, redesign existing products, and invest in research and development for construction, sustainability, and innovation. In addition, pressure from interested parties, broadly understood, motivates companies to undertake an innovation process that leads towards sustainability [51][52][53]. To the extent that business sustainability is understood as an effective contribution to sustainable development, innovations will have greater impacts on the development of more sustainable production cycles, products, and services, and even on new business management models [54,55].

Regression Analysis
Supported by the abovementioned findings and by the most recent literature (which suggests that low-carbon technology innovations, especially when complemented by traditional policy tools such as carbon trading and carbon taxes, might be effective in shaping firms' environmental sustainability performance [14][15][16][17][18]), we exploit the time series and geographical information of our collected dataset to develop and estimate an econometric setting that investigates the relationship between the Taxonomy-related scientific production, as a condition for innovation towards sustainability, and the sustainability goal related to the reduction of CO2 emissions. Such research is also grounded in the literature dealing with the societal impact of scientific research [56], given that scientists are increasingly under pressure to communicate their research better, to interact with the public, and to prove the "societal" impact of their work [57]. In particular, the latter is a hot topic in the environmental arena, where the benefits coming from research are those that benefit the natural capital of a nation by reducing waste and pollution and by increasing nature preserves or biodiversity [56]. In this sense, we add to the previous literature in that we investigate: To what extent does the scientific production in EU Taxonomy-related areas have a societal impact in terms of pollution reduction? To address this research question, we hypothesize a relationship between the knowledge accumulated on a global scale thanks to scientific research, and disseminated through WoS publications, and subsequent pollution levels. Here, as is widely done in the literature (see, e.g., [27,28]), pollution is measured by CO2 emissions.
Thus, we estimate a panel regression specified as follows. The dependent variable is the total per capita emissions of carbon dioxide in country c at time t; this proxy of pollution allows a cross-country comparison of pollution levels without reflecting a mere size effect. Taxonomy, our main variable of interest, is defined as the cumulative sum over the previous three years (from t-1 to t-3) of the number of scientific papers related to EU Taxonomy issues, per capita, published by authors in country c. Such a lag identification [58] is used here to alleviate threats to causal identification and to take into account that scientific achievements need time to be applied and implemented into production technologies. Following the previous literature (see, e.g., [27,28]), we control for a number of time-varying country-specific variables that may explain the different levels of emissions [59,60]. We include the real per capita GDP and the squared per capita GDP. We also define the dummy variable recession, which is equal to one in years of negative GDP growth, and population, which is the country population in millions. These independent variables are lagged one year. By doing so, we also contribute to the literature investigating the relation between economic growth and pollution [27,28]. In fact, in the early stages of economic growth, an increase in per capita income is associated with higher levels of pollution, but beyond a certain level of income per capita, the environmental impact indicators improve. This relationship has come to be known as the environmental Kuznets curve; however, contrasting results are quite frequent in empirical papers because the signs of the coefficients depend on the specific set of countries under consideration [59][60][61][62]. While the relationship between pollution and economic outcome is a topic of great interest, it is beyond the scope of the present paper.
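The specification described in this paragraph can be sketched as below. The notation is an assumption reconstructed from the variable definitions in the text: the coefficient symbols, the fixed-effect symbols \alpha_c and \delta_t, and the name SciProd (for the scientific productivity control) are introduced here for illustration and are not taken from the original equation.

```latex
\begin{align*}
\mathrm{CO2}_{c,t} ={}& \beta\,\mathrm{Taxonomy}_{c,t}
  + \theta\,\mathrm{SciProd}_{c,t}
  + \gamma_1\,\mathrm{GDP}_{c,t-1}
  + \gamma_2\,\mathrm{GDP}^2_{c,t-1} \\
 &+ \gamma_3\,\mathrm{Recession}_{c,t-1}
  + \gamma_4\,\mathrm{Population}_{c,t-1}
  + \alpha_c + \delta_t + \varepsilon_{c,t},
\\[4pt]
\mathrm{Taxonomy}_{c,t} ={}& \sum_{k=1}^{3} \mathrm{papers}_{c,t-k}
  \quad \text{(Taxonomy-related papers per capita)}.
\end{align*}
% Under this quadratic form, an inverted-U (Kuznets-type) pattern
% corresponds to \gamma_1 > 0 and \gamma_2 < 0, with emissions peaking at
\begin{equation*}
\mathrm{GDP}^{*} = -\frac{\gamma_1}{2\gamma_2}.
\end{equation*}
```

Here \alpha_c and \delta_t stand for the country and time fixed effects, and a negative \beta would correspond to the association between Taxonomy-related research and lower subsequent emissions that the paper investigates.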
Additional factors have contributed to explaining carbon emissions patterns, such as regulation, international trade, liberalization, and market mechanisms (see [63,64] for a review of the literature). Rather, we have focused on the relationship between CO2 emissions and research, specifically research related to the environmental objectives and macro-sectors identified in the EU Taxonomy. We also exploited additional information about the scientific works related to the EU Taxonomy, namely the year of publication and the country of origin of the authors. Importantly, we controlled for different levels of scientific research with the logarithm of the number of scientific publications computed, consistently with Taxonomy, as the cumulative sum over the previous three years. In addition, we saturated the empirical specification with country and time fixed effects. This controls for the possibility that our results were driven by omitted country-specific variables and eliminates time-varying factors that are common to all countries. In the regressions, standard errors are clustered at the country level. All variables are winsorized at the 1st and 99th percentiles. As documented in the previous section, EU Taxonomy papers are authored by scientists from all over the world. Thus, to exploit the cross-country dimension as much as possible, CO2 emissions data were downloaded from the International Energy Agency (IEA). All additional control variables were retrieved from the World Bank. The final dataset was assembled at the country-year level. In Appendix A, we report detailed definitions of all variables (Table A2), as well as the summary statistics (Table A3). Because data on carbon emissions and country controls are not available for each country-year, the number of observations is reduced to, at most, 2006, and the regression analysis covers the period 2002-2017. Figure 7 shows the average trend over time of CO2 emissions per capita.
We observe a quite substantial increase in the years 2002-2006; afterwards, the average pattern displays a small but constant decrease. The aggregate trend nonetheless hides significant heterogeneity among countries, which we exploit in our econometric approach.
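The construction of Taxonomy as a lagged three-year cumulative sum, and the winsorization of variables, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the dictionary-based panel layout, and the percentile interpolation rule (Python's `statistics.quantiles` with the inclusive method) are choices made here for illustration only.

```python
import statistics
from collections import defaultdict


def lagged_cumulative_sum(papers_per_capita, window=3):
    """Return {(country, year): sum of the values for years t-1 ... t-window}.

    `papers_per_capita` maps (country, year) to a per capita paper count
    (a hypothetical input layout). A value is produced only when all
    `window` previous years are observed, so gaps in the panel yield no
    entry rather than a silently smaller sum.
    """
    by_country = defaultdict(dict)
    for (country, year), value in papers_per_capita.items():
        by_country[country][year] = value

    result = {}
    for country, series in by_country.items():
        first_year = min(series) + window   # earliest t with a full set of lags
        last_year = max(series) + 1         # latest t whose lag t-1 is observed
        for year in range(first_year, last_year + 1):
            lags = [series.get(year - k) for k in range(1, window + 1)]
            if all(v is not None for v in lags):
                result[(country, year)] = sum(lags)
    return result


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range.

    Percentiles use linear interpolation between order statistics; other
    conventions are equally defensible and give slightly different bounds.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Example with made-up numbers (not data from the paper).
example = {("A", 2000): 1.0, ("A", 2001): 2.0, ("A", 2002): 3.0}
taxonomy = lagged_cumulative_sum(example)   # value for ("A", 2003) only
```

Changing `window` to 4 or 5 gives the longer cumulative sums that the paper uses as a robustness check.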

Findings of Model Estimations
The results shown in Table 2, model (1), respond to our second research question regarding the effect of EU Taxonomy-related scientific production on CO2 emissions: a higher level of research in the EU Taxonomy areas is associated with lower per capita carbon emissions. Moreover, a one standard deviation increase in Taxonomy is associated with a decrease in carbon emissions per capita of 0.79, suggesting that the effect is slightly less than proportional.
Looking at the control variables, we highlight the non-linear relationship with the logarithm of GDP and the negative and statistically significant relationship between recession and CO2 emissions, which suggests the inherent difficulty of pairing economic growth with low levels of pollution [61,64,65]. The coefficient of Scientific productivity is positive but not statistically significant. In models (2) to (7), we consider the variable Taxonomy computed separately for each of the six environmental objectives. We find that the scientific papers related to the EU Taxonomy objectives are all negatively associated with pollution levels. The point estimates are significant at the 1-percent level in all models, except for Taxonomy-Circular economy, which is significant at the 5-percent level. Taxonomy-Water and Marine resources is the objective with the highest economic impact among the six, followed by Pollution and Biodiversity/Ecosystem.

Notes to Table 2: In model (1), Taxonomy is defined as the sum of total scientific papers related to taxonomy per capita from t-1 to t-3. In models (2) to (7), Taxonomy-'objective' is defined as the sum of total scientific papers related to that specific taxonomy objective per capita from t-1 to t-3. All regressions include country and time fixed effects. Robust standard errors clustered at country level are included in parentheses, where ***, ** and * indicate significance at the 1, 5 and 10 percent statistical level, respectively.
Next, we exploited the sectoral dimension of the EU Taxonomy, summing up the numbers of scientific papers for each activity. The results, presented in Table 3, show that scientific research on all the macro-sectors listed in the EU Taxonomy is negatively associated with levels of pollution and, with the exception of Taxonomy-Manufacturing, all coefficients are highly statistically significant. Electricity, followed by Agriculture and Water, is the EU Taxonomy macro-sector with the largest environmental impact. The results of Tables 2 and 3 suggest a good level of science-technology interaction in the area of environmental issues; this trend is evident throughout the EU Taxonomy-related scientific literature, regarding both general environmental objectives and more applied fields.
In Table 4, we report additional results. In models (1) and (2), we run separate regressions for the samples of OECD and non-OECD countries. Notably, the effect of Taxonomy is more pronounced for non-OECD countries. This implies that less economically developed countries benefit more, in terms of the reduction of CO2 emissions, from scientific research oriented to environmental aspects. For OECD countries, the coefficient of Taxonomy is negative but not statistically significant. This does not mean that scientific research devoted to addressing climate-related issues is not valuable in developed countries. Rather, it suggests that for those countries with production systems based on advanced technologies, the marginal benefit in terms of lower carbon emissions is low and difficult to capture with an aggregation at country level. Given that the EU Taxonomy was developed by the European Commission, in Table 4, models (3) and (4), we investigated whether European countries have gained a greater advantage than non-EU countries from EU Taxonomy-related scientific research. Consistent with the results shown in models (1) and (2) of Table 4, we found that this is not the case. In other words, the marginal benefit in terms of lower carbon emissions is more easily detected in non-EU countries, where environmental issues are more recent and still part of a mindset change.
Furthermore, since scientific research in the WoS is clearly biased towards the English language, one possible concern is that our results were driven only by those countries in which English, or a language close to English, is the main language. To address this point, we split the sample based on the English proficiency of each country. Notably, models (5) and (6) in Table 4 show that our main results are confirmed for both subsamples, suggesting that language skills are not an issue in the environmental science-technology effect. As an additional investigative step that captures the quality of EU Taxonomy-related research, we looked at the number of citations as a main determinant of CO2 emissions, leaving the set of control variables unchanged. Model (7) in Table 4 shows that the coefficient on the total number of citations is negative and statistically significant, while the coefficient on the standard deviation is positive. This suggests that the quality of research related to taxonomy matters, but just a few often-cited publications are not enough to have an impact on reducing carbon emissions.

Notes to Table 3: The table reports the results of the panel model of taxonomy-related scientific production on CO2 emissions per capita. In models (1) to (7), Taxonomy-'sector' is defined as the sum of total scientific papers related to that specific taxonomy sector per capita from t-1 to t-3. All regressions include country and time fixed effects. Robust standard errors clustered at country level are included in parentheses, where ***, **, and * indicate significance at the 1, 5, and 10 percent statistical level, respectively.

Notes to Table 4: Models (1) and (2) report the results of the baseline model for OECD and non-OECD countries. Models (3) and (4) report the results of the baseline model for EU and non-EU countries. Models (5) and (6) report the results of the baseline model for countries with a high level of English proficiency (above 3) and a low level of English proficiency (equal to or below 3). Taxonomy citation is defined as the sum of total citations of scientific papers related to taxonomy per capita from t-1 to t-3. Taxonomy citation (mean sd) is defined as the mean of the standard deviation of total citations of scientific papers related to taxonomy per capita from t-1 to t-3. All regressions include country and time fixed effects. Robust standard errors clustered at country level are included in parentheses, where ***, ** and * indicate significance at the 1, 5 and 10 percent statistical level, respectively.
Our findings hold under a series of robustness checks. Firstly, our analysis covers a time span in which significant international events related to pollution issues happened, including the Kyoto Protocol, which marked a turning point. In Table A4 in Appendix A, we therefore split our sample into three sub-periods to test whether the relationship between taxonomy-related scientific production and CO2 emissions has changed. By setting binding emission reduction targets, the Kyoto Protocol not only contributed to the reduction of CO2 emissions in some countries, but also influenced the scientific research related to environmental topics and stimulated patenting activities [66]. Adopted on 11 December 1997, the Kyoto Protocol underwent a complex ratification process and entered into force on 16 February 2005; it was then further modified with the Doha Amendment on 8 December 2012. We find that the Taxonomy coefficients in models (1) and (2) are very similar, while the coefficient is reduced to 0.13 in the period 2013-2017 (model (3)). Importantly, the parameters of interest are statistically significant in all three sub-periods. Thus, despite a lower importance in the years 2013-2017, the negative impact of Taxonomy-related research on pollution is confirmed.
Secondly, in Table A5 in Appendix A, we include additional control variables that approximate different levels of investment in scientific research across countries (models (1) to (3)). Specifically, we consider R&D expenditures as a share of GDP; the related coefficient suggests that a high level of R&D is associated with higher levels of pollution. Our results are also robust to controlling for different levels of government expenditure on education. We further control for natural differences across countries that may lead to different carbon emissions, using variables for oil rent, energy use, and energy production from renewables (models (4) to (6)). The inclusion of these variables does not affect our main findings. Finally, our results are also confirmed when considering a longer time span to compute Taxonomy, i.e., a cumulative sum over four or five years.
While the documented effect of the EU Taxonomy scientific literature on pollution cannot be interpreted as causal, since carbon intensity and environmental research may be affected by a common factor, our findings suggest a notable interrelation between scientific production and the improvement of environmental sustainability. The link between research and innovation is well-known [1][2][3], and, indeed, the scientific literature in EU Taxonomy-related areas evidently fosters innovation towards sustainability, allowing for a benefit to the natural capital of countries by reducing CO2 emissions and other forms of pollution [56]. The use of EU Taxonomy-filtered scientific production also reflects the evidence-based policy-making approach adopted by the European Commission, thanks in no small part to the role played by the TEG, which was instrumental in narrowing the gap in the science-policy interaction. Last but not least, the marginal benefit in terms of lower carbon emissions is more easily detected in non-EU countries. This suggests that, for years, scientific production in the EU context has had a role in shaping environmental policies and has therefore contributed to the mindset change on environmental issues worldwide.

Conclusions
Drawing inspiration from the European Commission's recent publication of the EU Taxonomy and addressing the issue of evidence-based policy making, the main aim of this paper is to investigate the link between the policy guidelines included in the 2020 EU Taxonomy and the 30 years' worth of scientific production in EU Taxonomy-related areas. Moreover, supported by the long-standing relationship between research and innovations introduced by firms, the present paper explores the societal impact produced by the EU Taxonomy-related scientific literature, in terms of its relationship with pollution, here proxied by CO2 emissions.
The key findings we document are: (i) The EU Taxonomy takes advantage of the extensive scientific literature of the last 30 years, in terms of environmental objectives and economic activities; (ii) the scientific production related to the EU Taxonomy is negatively associated with the subsequent level of carbon emissions, particularly for developing and emerging countries.
The association of the EU Taxonomy-related scientific production with a reduction of pollution levels, all other drivers remaining unchanged, leads us to argue that innovations that come from research do have an impact on the environment. Even more importantly, EU Taxonomy-related topics have now been incorporated into policy measures (the EU Taxonomy itself and the other regulations that are going to refer to the Taxonomy), and this reinforces expectations of positive environmental effects from here on out.
In a broader perspective, our findings show that supporting environment-related research can contribute to a reduction of carbon emissions. Indirectly, this suggests that supporting research and development, innovation, and applied knowledge with specific environmental objectives may contribute to the pursuit of sustainable growth strategies. Given that the normative process for the implementation of the EU Taxonomy remains ongoing, a further and future mandate emerges for the scientific research community, that is, to document how market operators will use this classification system in order to aim for the ultimate success of the Taxonomy initiative and, indeed, the achievement of its long-term environmental goals.
We are aware that this analysis of the EU Taxonomy in the context of the scientific literature represents only the initial step toward a more comprehensive assessment of this classification system. Further research is needed in various areas and should be oriented toward verifying the efficacy of EU Taxonomy enforcement in achieving its main goal, i.e., the channeling of financial flows towards sustainable economic activities; this research should include analyzing companies' strategic choices as well as the disclosure system in the financial marketplace.
In fact, on the side of large companies in need of financing, enforcement of the Taxonomy could not only influence firms' strategic decisions, but could also be a powerful incentive to invest in activities aligned with environmentally sustainable growth, in an expected pattern of business model innovations to fill existing technological gaps. It is worth pointing out that the EU Taxonomy offers detailed criteria for determining, in an unambiguous way, whether or not an economic activity is environmentally sustainable. Therefore, further research should be targeted at measuring the reactions of companies in terms of both reshaping their business models and strengthening their disclosure of environmentally sustainable activities. Subsequent research could also entail analyzing the effect of Taxonomy enforcement on reducing misclassifications or manipulations of information, hidden behind the well-known phenomenon of greenwashing.
On the side of financial market participants offering financial products, Taxonomy alignment appears to constitute a crucial step toward reducing transaction costs and information asymmetries. Once the relevant market actors start to comply with the regulations, future research can empirically test these points and fully evaluate the effectiveness of this classification system. In fact, existing studies, and mainly actors in the financial marketplace, rely on environmental, social, and corporate governance (ESG) indicators, focusing on the environmental factor in particular (see, e.g., [67]). A growing demand for ESG themes has led to a proliferation in rating offerings; however, their reliability and comparability are questionable, as agencies adopt different methodologies [68]. While our paper applies the EU Taxonomy to the scientific literature, we believe that a similar procedure, using text analysis, could be applied to companies' balance sheets and reports in the future. This qualitative data could complement the quantitative data and be used to measure the percentage of Taxonomy-aligned turnover (or capital expenditure) disclosed by the company. The outcome could help in better identifying Taxonomy-aligned companies, offering a more objective and comparable source of information to build ratings or scores, thus enriching the scientific debate in economics and finance.
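As a rough illustration of what such a text-analysis step might look like, the sketch below counts mentions of a few Taxonomy activities in a passage of company reporting. Everything in it is an assumption made for illustration: the keyword lists are a hand-picked subset of activity names from Table A1 (not the search strings used in this paper), the grouping keys simply reuse the macro-sector codes of that table, and the function name is hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lists, loosely based on activity names in Table A1.
# They are illustrative only and far from a complete Taxonomy vocabulary.
ACTIVITY_KEYWORDS = {
    "macro-sector 1": ["afforestation", "reforestation", "forest management"],
    "macro-sector 3": ["solar pv", "wind power", "hydropower"],
    "macro-sector 6": ["wastewater treatment", "composting", "anaerobic digestion"],
}


def count_activity_mentions(text, activity_keywords=ACTIVITY_KEYWORDS):
    """Count case-insensitive whole-phrase keyword hits per macro-sector."""
    # Lower-case and collapse whitespace so line breaks do not split phrases.
    normalized = re.sub(r"\s+", " ", text.lower())
    counts = Counter()
    for sector, keywords in activity_keywords.items():
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            counts[sector] += len(re.findall(pattern, normalized))
    return counts


# Made-up sentence standing in for a passage of a company report.
report = ("The group invested in wind power and Solar PV plants, "
          "expanded wastewater treatment, and funded reforestation.")
mentions = count_activity_mentions(report)
```

Keyword counts of this kind only flag candidate activities; they say nothing about whether an activity meets the Taxonomy's technical screening criteria, which would still need to be checked against disclosed figures.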
Nevertheless, some limitations of this paper must be recognized. Firstly, it relies only on the literature obtained from the WoS core collection, which might not cover all the available literature in the scientific domain. In addition, our results could be biased by the fact that we apply a classification system, identified by European regulators mainly for European companies, to a worldwide scenario. Other geographical areas or countries may have different environmental objectives and priorities that may nonetheless lead to environmentally sustainable growth. For those countries, our sample may underestimate the impact of scientific productivity. It is true that the EU Taxonomy is expected to exert a significant influence on both reporting standards and green criteria at the international level, and not exclusively in Europe. However, whether or not this will actually happen is far from obvious. Secondly, a revision of the technical criteria of the EU Taxonomy, and of the transition activities, is foreseen at least every three years. Thus, the framework proposed in this analysis could be refitted according to revisions made over time, which also represents an opportunity for future research.
In any case, regardless of the specific classification to be followed, the case of the EU Taxonomy has defined the working method and provided a valuable example of evidence-based policy making, based on the commitment and coordinated efforts of companies, financial market operators, research, and political communities, which are essential for environmentally sustainable growth.

Appendix A

Table A1. List of 72 economic activities by NACE Macro-Sectors.

Economic Activities (macro-sector code in parentheses; one table row per line)

Afforestation (1); Production of Electricity from Solar PV (3); Storage of Electricity (5); Water collection, treatment and supply (6)
Rehabilitation, Reforestation (1); Production of Electricity from Concentrated Solar Power (3); Storage of Thermal Energy (5); Centralized wastewater treatment (6)
Reforestation (1); Production of Electricity from Wind Power (3); Storage of Hydrogen (5); Anaerobic Digestion of Sewage sludge (6)
Existing forest management (1); Production of Electricity from Ocean Energy (3); Manufacture of Biogas or Biofuels (5); Separate collection and transport of non-hazardous waste in source segregated fractions (6)
Conservation forest (1); Production of Electricity from Hydropower (3); Retrofit of Gas Transmission and Distribution Networks (5); Anaerobic digestion of bio-waste (6)
Growing of perennial crops (1); Production of Electricity from Geothermal (3); District Heating/Cooling Distribution (5); Composting of bio-waste (6)
Growing of non-perennial crops (1); Production of Electricity from Gas (3); Installation and operation of Electric Heat Pumps (5); Material recovery from non-hazardous waste (6)
Livestock production (1)

Manufacture of other inorganic basic chemicals: Manufacture of disodium carbonate (2); Freight transport services by road (4); Production of Heat/cool from Bioenergy (Biomass, Biogas, Biofuels) (5); Individual renovation measures, installation of renewables on-site and professional, scientific and technical activities (7)
Manufacture of other inorganic basic chemicals: Manufacture of chlorine (2); Interurban scheduled road transport (4); Production of Heat/cool using Waste Heat (5); Acquisition and ownership of buildings (7)
Manufacture of other organic basic chemicals (2); Inland passenger water transport (4); Data processing, hosting and related activities (8)
Manufacture of fertilizers and nitrogen compounds (2); Inland freight water transport (4); Data-driven climate change monitoring solutions (8)
Manufacture of plastics in primary form (2); Infrastructure for low carbon transport (water transport) (4)