Evaluating the Causal Relations between the Kaya Identity Index and ODIAC-Based Fossil Fuel CO2 Flux

The Kaya identity is a powerful index displaying the influence of individual carbon dioxide (CO2) sources on CO2 emissions. The sources are disaggregated into representative factors such as population, gross domestic product (GDP) per capita, energy intensity of the GDP, and carbon footprint of energy. However, the Kaya identity has limitations as it is merely an accounting equation and does not allow for an examination of the hidden causalities among the factors. Analyzing the causal relationships between the individual Kaya identity factors and their respective subcomponents is necessary to identify the real and relevant drivers of CO2 emissions. In this study we evaluated these causal relationships by conducting a parallel multiple mediation analysis, using the fossil fuel CO2 flux based on the Open-Source Data Inventory of Anthropogenic CO2 emissions (ODIAC). We found that the indirect effects from the decomposed variables on the CO2 flux are significant. However, the Kaya identity factors show neither strong nor even significant mediating effects. This demonstrates that the influence individual Kaya identity factors have on CO2 directly emitted to the atmosphere is not primarily due to changes in their input factors, namely the decomposed variables.


Introduction
Reducing carbon dioxide (CO2) emissions is currently one of the international community's main goals. However, despite the clarity of this objective, determining which measures have the strongest effects on CO2 emissions is difficult. To design appropriate strategies, policy makers need detailed insight into causality as well as the effects of individual measures. The Kaya identity is a tool to analyze the drivers of CO2 emissions by providing a conceptual framework to characterize those driving forces [1]. In this context, CO2 emissions are disaggregated into five factors: CO2 divided by fossil fuel-based energy consumption (EC), EC divided by total energy consumption (TEC), TEC divided by the gross domestic product (GDP), GDP divided by population (P), and P itself [2]. Although simple, the Kaya identity has a powerful ability to track progress in implementing CO2 emission reduction efforts, because many countries already express their climate policies in terms of Kaya components [3][4][5]. Under the Paris Agreement, treaty parties set nationally determined contribution (NDC) plans based on the Kaya identity with respect to various CO2 emission scenarios. In doing so, the parties to the United Nations Framework Convention on Climate Change (UNFCCC) break the Kaya identity factors down into different sectors to design detailed NDC goals. In this context, policy makers focus on reducing the driving forces behind Kaya identity factors by managing the intensity of decomposed variables (e.g., TEC and GDP). A common climate change policy is to promote electricity generation based on solar and wind power. To supersede and reduce fossil fuel electricity generation, solar and wind power generation have been growing at about 37% (solar) and 23.4% (wind) per year on average [6].
However, a notable drawback of the Kaya identity is that it is an accounting equation and the driving forces it addresses are not independent. The fixed links of the Kaya identity do not represent causality and ignore non-monotonic or non-proportional effects between individual factors [7]. For instance, scenario builders often assume that high economic growth rates result in high capital turnover, which encourages the development of more advanced and more efficient technologies, in turn leading to lower energy intensities for the economy [2]. Another recent example of dependencies between Kaya identity factors and CO2 emissions is the decrease in CO2 emissions induced by the COVID-19 crisis. The COVID-19 pandemic has considerably transformed energy demand patterns around the world by, for example, closure of international borders, underutilization of labor and capital, increased international trade costs, decreased travel services, and a redirection of demand away from activities [8]. As a result, global GDP is predicted to decrease by 6% for single-hit scenarios and by about 7.6% in case of a double-hit scenario, which became reality in the second half of 2020 [9]. In concert with a severe economic recession, by early April 2020 daily global CO2 emissions declined by 17% relative to 2019 [10].
Most previous studies have focused on the crucial drivers contributing to inventory CO2 emissions by disaggregating the Kaya identity factors using either the Stochastic Impacts by Regression on Population, Affluence, and Technology (STIRPAT) model or the Logarithmic Mean Divisia Index (LMDI) method [11,12]. LMDI, which applies the Divisia index and logarithmic mean weighted values to decompose CO2 emissions, quantifies the relative influence of various factors on the changes of an aggregate CO2 emissions indicator. The method implies that changes in the relative size of factors can change the factors' relative influences on the CO2 emissions [13,14]. The STIRPAT model, developed as a stochastic version of the impact by population, affluence, and technological development (IPAT) approach, allows testing of the hypothesis that impacts have a weaker relationship with affluence or economic development over time. The STIRPAT model allows the application of sociological theory regarding anthropogenic drivers of environmental impacts, and individual driving forces can be quantified by their respective coefficients [15,16]. Both LMDI and STIRPAT focus on the individual driving forces of the Kaya identity factors, which allows examination of the direct connection between each Kaya identity factor and CO2 emissions. However, possible interactive effects between the individual Kaya identity factors (and/or their decomposed variables) are neither analyzed nor quantified. Causality issues are also ignored. Han et al. (2018) considered this aspect by conducting a Granger causality test. They intended to explore the directional causality between CO2 emissions, economic growth, urbanization, and material stocks [17]. Duro and Padilla (2006) proposed applying the Theil index to decompose international inequalities in per capita CO2 emissions into Kaya identity factors and two interaction terms. 
They found that the international inequality in per capita CO2 emissions can be attributed to inequality in per capita income levels [18]. More recently, Hwang et al. (2020) [19] looked more closely at the mutual interdependencies between the Kaya identity factors as well as their decomposed variables and the ODIAC-based fossil fuel CO2 flux, where ODIAC stands for Open-Source Data Inventory of Anthropogenic CO2 emissions. Their focus was on EU countries, and they identified different spatial patterns in the computed correlation values. The authors also found significant multicollinearity among the decomposed variables, which is not reflected in the Kaya identity itself. Their focus was on the potentials and constraints of the decomposed variables in explaining the variations in the intensity of fossil fuel CO2 flux. However, so far, it has not been deeply explored whether there are any a priori causal relations and indirect effects among the driving forces of Kaya identity factors themselves, or between Kaya identity driving forces such as their subcomponents and actual CO2 emissions. Therefore, in this paper, we analyze these interactive effects and causalities in depth using a concept called mediation analysis. Mediation analysis is applied to identify the underlying relations between a dependent variable and an explanatory variable via a third variable, which is called a mediator. This is done by designing and calibrating multivariate linear models in which the mediator variable is inserted between the original independent variables and the dependent variable. It is a fairly simple setup, but it allows understanding of the nature of dependence between dependent variables and explanatory variables by quantifying indirect effects among the explanatory variables and direct effects between individual independent and dependent variables. Hence, it is apt for our purpose: we want to find tangible evidence for interdependencies among the Kaya identity factors. 
Can we quantify the causal relations among the Kaya identity factors and their decomposed variables? Can we also quantify the causal effects of the mentioned factors on actual CO2 emissions? The answers to those questions can help us to monitor the effects of mitigation policy efforts and set effective targets for decreasing CO2 emissions. Our results show, similarly to previous studies [19], multicollinearity among the decomposed variables of the Kaya identity factors. In addition, using mediation analysis, we were able to identify both direct and indirect effects (including causalities) among those variables and from those variables on actual CO2 emissions, which is a novelty compared to existing studies. We found that the net (causal) effects among the decomposed variables alleviate the direct effects of their corresponding composite Kaya identity factors on actual CO2 emissions. This makes it harder to identify the true driving forces among the Kaya identity factors on actual CO2 emissions. In practical terms: common policy measures like subsidizing electric vehicles aim to influence the decomposed variables of the Kaya identity. However, according to our results, this does not necessarily imply a reduction of CO2 emissions. The Kaya identity factors are only weak and sometimes ambivalent mediators, which implies the need for measures that directly reduce CO2 emissions.
The paper is structured as follows: Section 2 explains the composition of our data set, and Section 3 introduces the methodology used in this paper, with a special focus on mediation analysis. The results of our research are presented in Section 4 and discussed in more detail in Section 5, where we also comment on possible policy implications. Section 6 summarizes and concludes the paper.

Materials
There is a wide range of economic and environmental data available; in Section 2.1 we motivate why we focus on a subset of European countries. Section 2.2 contains a detailed explanation of our set of historic CO2 observations, while Section 2.3 defines the Kaya identity in detail, gives an overview of all data sources, and provides some fundamental statistics for all data sets.

Countries Included in the Study
European countries in the Annex 1 group have accurate and developed statistical technologies for recording both EC and CO2 emissions [20,21]. Europe is the second smallest continent in the world after Australia, consisting of 44 highly diverse and densely populated countries. These countries differ with respect to EC, industry, and many other factors [22], making Europe ideal for exploring the causal relations between fossil fuel CO2 flux and Kaya identity factors under different conditions. In addition, Europe is taking a leading role in implementing active climate change policy, as all countries jointly decided to reduce CO2 emissions by 40% by 2030 and by as much as 55% by 2050 (relative to 1990 levels). The latter goal was set in the context of the European Green Deal [23], which, among other things, involves the transition of the EU's economies towards using clean, affordable, and secure energy. To have a legal measure for enforcing this goal, i.e., climate neutrality by 2050, the European Commission designed the European Climate Law [23,24]. Thus, Europe shows more substantial changes with respect to the Kaya identity and CO2 emissions than other continents. To process the fossil fuel CO2 flux, accurate preliminary CO2 emissions data are required because CO2 emissions are recorded and fixed by in situ surveys. The uncertainty report for Annex 1 countries notes that the uncertainty of CO2 emissions associated with fossil fuel combustion is less than 6% on average [20]. European countries are therefore a reliable study area for exploring the causality between decomposed variables, Kaya identity factors, and fossil fuel CO2 flux. For this purpose, we selected 30 Annex 1 countries in Europe, excluding countries that are either too small or too far away from continental Europe. We excluded those countries because it is hard to establish a precise country-specific flux database, as the satellite-based data are recorded on a 1° × 1° grid.

ODIAC-Based CO2 Emission Data
Previous literature used inventory net CO2 emissions on a national scale [25][26][27]. Net inventory CO2 emissions fluctuate depending on several variables, such as the collection and reporting system used for the country's energy statistics, data definitions and data processing methods, level of detail, and specific local conditions. In addition, the accuracy, transparency, and uncertainty of inventory CO2 emission data vary between countries due to differences in statistical proficiency and in the level of development of national statistics [28][29][30][31][32]. These fluctuations and uncertainties preclude the interpretation of interactive effects and causality between decomposed variables and Kaya identity factors against real CO2 emissions. To explore the true causality and existing interactive effects between the decomposed variables and Kaya identity factors, as well as their effects on actual CO2 emissions, we need reliable CO2 emissions data measured using standardized tools and methods on a regional scale with high spatial resolution. For this purpose, we chose ODIAC data. ODIAC was developed by Oda and Maksyutov in the context of the Greenhouse Gases Observing Satellite (GOSAT) project at the National Institute for Environmental Studies (NIES) in Japan [33]. The purpose of ODIAC is to provide precise prior fossil fuel CO2 emissions data for global and regional CO2 inversions using the column-averaged CO2 (XCO2) data collected by GOSAT. ODIAC has been widely utilized for the inversion of the official GOSAT Level 4 CO2 flux data [34][35][36]. ODIAC data are global, high-resolution monthly emission data, which are the result of a spatial disaggregation of total national emission estimates taken from the Carbon Monitoring for Action (CARMA) data set. CARMA includes emission levels of all types of power plants (fossil fuel, nuclear, hydro, and other renewable energy plants) for over 50,000 locations [37]. 
ODIAC data at country scale are built to match data from the Carbon Dioxide Information Analysis Center (CDIAC) with a 1 × 1 km spatial resolution [36,38,39,40]. The ODIAC-based fossil fuel CO2 flux has detailed information on regional CO2 sources in terms of distribution patterns and actual fossil fuel combustion with a high spatial resolution, including the international bunker, which is not presented in the CDIAC data [36]. Thus, we can measure the CO2 amounts directly emitted to the atmosphere precisely in those areas where CO2 sources are located [39,41,42]. Therefore, the ODIAC-based fossil fuel CO2 flux can facilitate the exploration of realistic causality and interactions among Kaya identity subcomponents against CO2 directly emitted to the atmosphere on a regional scale. In this study, we used the ODIAC-based fossil fuel CO2 flux from GOSAT L4A global CO2 flux V02.06 provided by NIES [43].

Kaya Identity-The Concept
The IPAT model (I = P × A × T) explains the level of environmental impact (I) from anthropogenic activities with population size (P), affluence (A), i.e., the level of income, and technological development (T). The Kaya identity is based on the IPAT model, but it is more specific regarding the total level of CO2 emissions. The Kaya identity expresses the CO2 emissions as the product of four factors: (1) population (P), (2) GDP per capita (GDP/P), (3) energy intensity per unit of GDP (total energy consumption/GDP), and (4) carbon intensity (CO2 emissions/total energy consumption) [44] (Figure 1). Recently, many studies extended the carbon intensity in the traditional Kaya identity by decomposing it into CO2 emission intensity related to fossil fuel consumption and the share of fossil fuel in the total energy consumption [11,25,45]. In its extended form, the Kaya identity splits fossil fuel CO2 emissions into five different factors: P, G = GDP/P, E = TEC/GDP, M = EC/TEC, and I = CO2 emissions/EC. The factors G, E, and M quantify per capita income, energy intensity, and the share of fossil fuel consumption relative to TEC, respectively. The factor I describes the fossil carbon intensity of energy [46]. The corresponding formula reads as follows:

$$\mathrm{CO_2} = P \times \frac{\mathrm{GDP}}{P} \times \frac{\mathrm{TEC}}{\mathrm{GDP}} \times \frac{\mathrm{EC}}{\mathrm{TEC}} \times \frac{\mathrm{CO_2}}{\mathrm{EC}} = P \times G \times E \times M \times I \qquad (1)$$

In order to assess the influences of anthropogenic drivers on fossil fuel CO2 flux presented in the Kaya identity model, we use a modified form of the Kaya identity with respect to the fossil fuel CO2 flux [47]. The modified Kaya identity splits up the fossil fuel CO2 flux multiplicatively instead of the CO2 emissions. In a second step we also modify the right side of Equation (1). As suggested by Le Quéré et al. (2018) [48], we recalculate the fossil fuel CO2 flux in CO2 equivalents (F) to serve as an estimate of CO2 emissions instead of inventory CO2 emissions [47], and we replace I by the F factor, F/EC, i.e., the fossil carbon intensity of energy:

$$F = P \times \frac{\mathrm{GDP}}{P} \times \frac{\mathrm{TEC}}{\mathrm{GDP}} \times \frac{\mathrm{EC}}{\mathrm{TEC}} \times \frac{F}{\mathrm{EC}} \qquad (2)$$

Hence, in Equation (2), F appears on both sides of the equation.
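As a minimal numerical sketch of the flux-based Kaya identity, the following Python snippet computes the five factors for one hypothetical country-year and checks that their product returns the flux. All figures are invented for illustration and are not taken from the data sets of this study; the function name `kaya_factors` and the variable names are assumptions made for this example.

```python
# Hypothetical figures for one country-year (illustrative only, not study data).
population = 80.0e6   # persons
gdp = 3.5e12          # currency units
tec = 220_000.0       # total energy consumption, ktoe
ec = 150_000.0        # fossil fuel energy consumption, ktoe
flux = 750.0e6        # fossil fuel CO2 flux F, in CO2 equivalents


def kaya_factors(population, gdp, tec, ec, flux):
    """Return the five factors of the extended, flux-based Kaya identity."""
    return {
        "P": population,
        "G": gdp / population,   # GDP per capita
        "E": tec / gdp,          # energy intensity of GDP
        "M": ec / tec,           # fossil share of total energy consumption
        "F_factor": flux / ec,   # fossil carbon intensity of energy
    }


factors = kaya_factors(population, gdp, tec, ec, flux)

# Multiply the five factors; all intermediate terms cancel, leaving the flux.
product = 1.0
for value in factors.values():
    product *= value

print(factors)
print(product, flux)
```

Because every intermediate quantity cancels, the product reproduces the flux for any positive inputs. This is precisely why the identity is an accounting equation and carries no causal information by itself.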

Kaya Identity-The Data Sets
The International Energy Agency (IEA) collects EC and production data globally [50,51] and categorizes energy statistics into 13 energy types. In the IEA Energy Statistics [50,51], observations per energy type are converted into energy units (ktoe). Here, we applied the total final consumption sector from the IEA energy balance from 2010 to 2017 to calculate TEC and EC, which we used to compute the factors E (TEC/GDP), M (EC/TEC), and the F factor (F/EC). To obtain EC, we extracted the fossil fuel consumption (coal, crude oil, oil products, natural gas) from the IEA energy balance. We used 2010-2017 GDP and population data from the World Bank database in order to derive the factors G (GDP/Population), E (TEC/GDP), and P (population size). For an overview of variable definitions and the sources of our data sets, see Appendix A.
To compare the factors' variation, we calculated the coefficient of variation (CV), which is defined as a factor's standard deviation divided by its mean. Hence, CV indicates the size of variation relative to a factor's average level. As shown in Table 1, most factors have large CV values. M, which is the share of fossil fuel consumption in total energy consumption, shows the lowest CV (approximately 18%), meaning that over the years and over all countries, the ratio of fossil fuel consumption is relatively constant compared to other factors. Apart from that, high CV values are not surprising as we used data from 30 heterogeneous countries.
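The CV computation can be sketched in a few lines of Python. The values below are hypothetical and serve only to show how a bounded share such as M tends to have a much smaller CV than a level variable such as population. Whether the sample or the population standard deviation was used in Table 1 is not stated, so the sample version is an assumption here.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample standard deviation assumed)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical observations (not study data).
m_share = [0.62, 0.70, 0.55, 0.81, 0.66]         # fossil share of energy use
population_mio = [5.5, 83.0, 10.7, 0.6, 38.0]    # population in millions

cv_m = coefficient_of_variation(m_share)
cv_population = coefficient_of_variation(population_mio)
print(f"CV of M: {cv_m:.2%}, CV of population: {cv_population:.2%}")
```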

Methodology
Our analysis is based on the calibration results of a multiple linear regression model combined with a mediation analysis. Multiple linear regression is widely known and often used; hence we keep Section 3.1, which concerns the regression models, short and focus on mediation analysis in Section 3.2.

Multiple Regression Models
To examine the mutual interdependencies between the Kaya identity factors and F , we established a regression model that uses Kaya identity factors as independent variables and F as the dependent variable. Equations (3) and (4) display the corresponding multiple regression models calibrated using the ordinary least squares method.
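In Equations (3) and (4), the coefficients α_i, β_j ∈ ℝ, i, j = 1, …, 5, and the error terms of both models are Gaussian distributed with mean values of zero and standard deviations σ > 0. Equation (3) is the regression model based on the Kaya identity factors, and Equation (4) is the one based on their decomposed variables. The observations entering Equations (3) and (4) are annual net amounts and observations of F are annual mean values.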
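A generic ordinary least squares calibration of this kind can be sketched as follows. The data are synthetic, the five regressor columns merely stand in for the explanatory variables, and the coefficient values are hypothetical; the snippet illustrates the estimation technique only and does not reproduce the exact specification or results of Equations (3) and (4).

```python
import numpy as np


def fit_ols(regressors, response):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(response)), regressors])
    coefficients, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coefficients


rng = np.random.default_rng(0)
n_obs = 240                                   # same size as the study's sample
regressors = rng.normal(size=(n_obs, 5))      # synthetic stand-ins for five factors
true_coef = np.array([0.5, 1.0, -0.3, 0.8, 0.2, 1.5])   # hypothetical values

# Simulated response: linear signal plus Gaussian noise with zero mean.
flux = true_coef[0] + regressors @ true_coef[1:] + rng.normal(scale=0.1, size=n_obs)

estimates = fit_ols(regressors, flux)
print(np.round(estimates, 3))
```

With real data, standardizing the regressors beforehand makes the estimated coefficients comparable across variables measured in different units.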

Mediation Analysis-Introduction and Research Design
Mediation analysis is a statistical method to analyze how the causal antecedent X transmits its effect to the consequent variable Y [52]. A mediation model intends to identify and explain the causality between an independent variable and a dependent variable by inserting a third variable known as a mediator variable [53]. In lieu of a direct causal relation between the independent and dependent variables, a mediation model suggests that the independent variable affects the mediator variable, which again influences the dependent variable. Thus, the mediator variable illuminates the level of correlation (or interdependence in general) between independent and dependent variables [54]. Mediation analysis helps with finding a valid interpretation of the relationship between independent and dependent variables when those variables seem to have an ambiguous connection.
A parallel multiple mediator model with k mediators has k + 1 consequent variables (one for each of the k mediators M_i, i = 1, …, k, and one for the outcome variable Y). This requires the calibration of k + 1 equations to cover all possible effects of X on Y [55], and the formulas for these calibrations are expressed in Equations (5) and (6):

$$M_i = \mathrm{intercept}_{M_i} + a_i X + \epsilon_{M_i}, \quad i = 1, \dots, k \qquad (5)$$

$$Y = \mathrm{intercept}_{Y} + c' X + \sum_{i=1}^{k} b_i M_i + \epsilon_{Y} \qquad (6)$$

In these parallel multiple mediator equations, a_i measures the effect of X on M_i; b_i estimates the effect of M_i on Y, controlling for X and the other k − 1 mediator variables. The factor c′ measures the effect of X on Y, holding all k mediator variables constant [55]. Both intercept_{M_i} and intercept_Y are constant factors. If M_i is an effective mediator, it is substantially correlated with X due to the path from X to M_i (path a). If X explains most of the variation of M_i, there would be no unique variation in M_i to explain Y. According to the literature, the sample size required for testing coefficients b and c′ depends on r, the correlation coefficient between the causal variable and the mediator. If M_i is a successful mediator (i.e., path or coefficient a is large), in order to properly examine coefficients b and c′, we need a comparably larger sample size in order to achieve a test power equivalent to the case of M_i being a weak mediator [56].
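The estimation of a parallel multiple mediator model can be sketched with k + 1 least squares fits. The example below uses synthetic data with two mediators; the path values (0.8, −0.5, 0.3, 0.6, 0.4) and the function names are hypothetical. For linear models fitted by ordinary least squares, the total effect of X on Y equals the direct effect plus the sum of the specific indirect effects a_i·b_i, which the final line checks.

```python
import numpy as np


def ols(regressors, response):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(response)), regressors])
    coefficients, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coefficients


def parallel_mediation(x, mediators, y):
    """Estimate the paths of a parallel multiple mediator model.

    x: shape (n,), mediators: shape (n, k), y: shape (n,).
    """
    k = mediators.shape[1]
    a = np.array([ols(x, mediators[:, i])[1] for i in range(k)])  # X -> M_i
    outcome = ols(np.column_stack([x, mediators]), y)             # Y ~ X + all M_i
    direct, b = outcome[1], outcome[2:]                           # c' and b_i
    total = ols(x, y)[1]                                          # c from Y ~ X
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Synthetic data with two mediators (hypothetical path values).
rng = np.random.default_rng(1)
n = 240
x = rng.normal(size=n)
m1 = 0.8 * x + rng.normal(scale=0.5, size=n)
m2 = -0.5 * x + rng.normal(scale=0.5, size=n)
y = 0.3 * x + 0.6 * m1 + 0.4 * m2 + rng.normal(scale=0.5, size=n)

result = parallel_mediation(x, np.column_stack([m1, m2]), y)
print(result)
print(np.isclose(result["total"], result["direct"] + result["indirect"].sum()))
```

Note that mediators with opposite signs of a_i·b_i partially cancel in the total indirect effect, so a small total does not rule out sizeable specific indirect effects.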
In this study, we applied the parallel multiple mediator model to explore the causality between each of the Kaya identity factors as well as their respective decomposed variables and F. Each of the total effects from antecedent X (Kaya identity factors and decomposed variables) on consequent Y (F) could be partitioned into direct and indirect effects through at least one mediator. A proposed mediator (i.e., a Kaya identity factor or a decomposed variable) can be associated with an outcome either because it gives rise to the outcome or because it correlates with another variable that causally affects the outcome. Hence, including multiple mediators between independent and dependent variables allows detection of the underlying causality. To identify indirect effects among the Kaya identity factors and their decomposed variables, we used all of them as mediators and performed multiple analyses simultaneously. However, in order to do that we had to overcome one problem: our sample size is comparatively small as we are handling annual data. Preacher and Hayes (2004) proposed a solution by applying the widely used bootstrapping method [57]. Bootstrapping is a method especially developed for small sample sizes that has the advantage of being non-parametric; therefore, we do not need to impose any distributional assumption on the original data. By resampling the existing set of observations this method allows us to increase the sample size, which, in turn, increases the chance of finding statistically significant mediation effects. For example, if we obtain a non-zero point estimate, we can only confirm its statistical significance if zero is not contained in the corresponding confidence interval, the size of which directly depends on sample size because larger samples have smaller confidence intervals. This study has a relatively small number of observations for each variable (240). Hence, by applying the bootstrapping method we increased the chances of finding significant mediation effects. Here we applied the Preacher and Hayes bootstrapping method with 5000 bootstrap samples to explore possible mediation effects.
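A bootstrap confidence interval for an indirect effect can be sketched as follows for the single-mediator case. The data are synthetic and the path values are hypothetical. A plain percentile interval is used as a simplifying assumption, since the interval type used in the analysis is not specified here; the function names are likewise illustrative.

```python
import numpy as np


def indirect_effect(x, m, y):
    """a*b for one mediator: a from M ~ X, b from Y ~ X + M."""
    ones = np.ones(len(x))
    a = np.linalg.lstsq(np.column_stack([ones, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0][2]
    return a * b


def bootstrap_ci(x, m, y, n_boot=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        draws[i] = indirect_effect(x[idx], m[idx], y[idx])
    tail = 100 * (1 - level) / 2
    lower, upper = np.percentile(draws, [tail, 100 - tail])
    return lower, upper


# Synthetic data with a genuine indirect path (hypothetical values).
rng = np.random.default_rng(42)
n = 240
x = rng.normal(size=n)
m = 0.7 * x + rng.normal(scale=0.5, size=n)
y = 0.2 * x + 0.5 * m + rng.normal(scale=0.5, size=n)

lower, upper = bootstrap_ci(x, m, y)
significant = not (lower <= 0.0 <= upper)   # significant if zero lies outside
print(lower, upper, significant)
```

Resampling whole rows keeps the X, M, and Y values of each observation together, which is what preserves their dependence structure in every bootstrap sample.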

Two Multiple Linear Regression Models for the Fossil Fuel CO2 Flux-Based CO2 Emissions
We calibrated the models from Equations (3) and (4) and performed a mediation analysis (see Section 3.2) to identify causalities between the Kaya identity factors and their decomposed variables.
As Table 2 shows, the multivariate linear regression models based on Equations (3) and (4)  Hence, their influence on F cannot be proven using the model of Equation (3). Another remarkable result is that Decomposition 2 shows significant multicollinearity whereas Decomposition 1 does not. Similar results were demonstrated by Hwang et al. (2020) [19], which, contrary to this analysis, only focused on the superficial relations between the decomposed variables and the ODIAC-based fossil fuel CO2 flux with an additional focus on the country-specific situation.
Here we look closer into the causal relations between the individual factors by using mediation analysis (see Section 4.2). Note that the factor Population, which is identical to the Kaya identity factor P, is not statistically significant (with a p-value of 0.22) whereas P is (p-value of 0). Hence, all relevant information about population size must already be included in the other factors. Thus, we also tested various multiple regression models with different combinations of decomposed variables from Decomposition 1. The results of this test are shown in Table 3. We saw that Population is statistically significant when it is used as an exogenous variable in combination with GDP, TEC, or EC. However, if two of these variables are included in the regression, the p-value of Population increases considerably (up to 0.34). Regarding multicollinearity between GDP, TEC, and EC, the variance inflation factor (VIF) values vary with the model setup. If at least two of GDP, TEC, or EC are used together in the regression, we saw a considerable increase in VIF values (up to 26.23). These VIF values are remarkably high when TEC and EC are used conjointly in the regression model. This seems reasonable when considering the fairly high correlation value between both variables (see Table 4). To look more deeply into the interdependencies among Decompositions 1 and 2, we applied mediation analysis as described in Section 3.2, and used F as the endogenous variable in all setups. There is seemingly no direct relationship between Population and F; however, there is an indirect one that is mediated by some decomposed variables (Figure 2). The strong mediation effect causes Population to be statistically insignificant in the regression on F. In contrast, Population has a comparatively weaker mediation effect on GDP, TEC, and EC, ranging from −10.19% to −9.99%. 
However, GDP, TEC, and EC have strong mediating effects on each other, with values ranging from −42.09% to 79.69% on F (see Table 5). This means that GDP, TEC, and EC are highly dependent on each other, which supports the results of the above correlation analysis (see Table 4). When GDP, TEC, and EC are applied as mediators to each other, they mediate the other decomposed variables by explaining most of the variation caused by the mediated variables. The high indirect effects of GDP, TEC, and EC on each other thus indicate high multicollinearity among these variables. In this regard, we also see that the lack of statistical significance of E, which is calculated by dividing TEC by GDP, is due to the opposite mediating effects between TEC and GDP. A strong correlation between X (TEC, GDP) and the mediator (E factor) can inflate the standard error of Path b (E factor→F). If Path b is not significant because of multicollinearity, the indirect effect necessary to create mediation would probably be insignificant [58]. GDP has a significant negative mediating effect on TEC, at a level of −42.09%. This means that GDP depresses the direct effect of TEC on F at a level of −42.09%. In contrast, TEC has a positive mediating effect on GDP, encouraging a direct effect of GDP on F at a level of 78.81%. Hence, TEC and GDP mutually compensate their influences on F, which flattens the influence E has on F and causes its lack of significance.
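The VIF diagnostic referred to above can be sketched with the standard definition VIF_j = 1 / (1 − R_j²), where R_j² results from regressing regressor j on all other regressors. The data below are synthetic: two columns are constructed to be highly correlated, as a stand-in for a pair such as TEC and EC, while the third is independent. Column names and noise levels are assumptions for illustration only.

```python
import numpy as np


def variance_inflation_factors(data):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    n, p = data.shape
    vifs = np.empty(p)
    for j in range(p):
        target = data[:, j]
        others = np.column_stack([np.ones(n), np.delete(data, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Synthetic regressors (hypothetical): ec is built to track tec closely.
rng = np.random.default_rng(7)
n = 240
population = rng.normal(size=n)
tec = rng.normal(size=n)
ec = tec + rng.normal(scale=0.3, size=n)

vifs = variance_inflation_factors(np.column_stack([population, tec, ec]))
print(dict(zip(["population", "tec", "ec"], np.round(vifs, 2))))
```

In this setup the two correlated columns receive large VIF values while the independent column stays close to 1, mirroring the pattern in which VIF values rise only when strongly correlated variables enter the same regression.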

Indirect Effects between the Decomposed Variables and Kaya Identity Factors on ODIAC-Based Fossil Fuel CO2 Flux
In our mediation analysis we observed that Kaya identity factors have weakly negative mediating effects on the causality between the decomposed variables of Kaya identity factors and F. Indirect effects of Kaya identity factors range from −5.86% to −2.12%, except for GDP versus Kaya identity factors, which is 31.69 (Table 6). Even though the mediator variables, which are the Kaya identity factors in this subsection, do not change the direction of the relationships between the decomposed variables and F, they slightly decrease the strength of the existing relationships (Figure 3). In particular, the high (total) indirect effect of GDP results from the high indirect effects of P. That is because countries with a comparably higher GDP, such as France, Germany, Italy, Spain, and the United Kingdom, are among the most densely populated countries in this study. In addition, the Kaya identity factors and their respective decomposed variables are statistically insignificant in terms of mediating effects. Table 6 displays statistically insignificant indirect effects between the Kaya identity factors and their decomposed variables in three scenarios: (1) Population→G factor→F, (2) GDP→G factor→F, and (3) EC→M factor→F. Neither the G factor nor the M factor plays a statistically significant mediating role for its decomposed variables, namely Population, GDP, and EC. This can be shown by considering the confidence interval for the case where G operates as a mediator for Population, which ranges from a lower level of −0.21 to an upper level of 0.05. The confidence intervals for the scenarios where G operates as a mediator for GDP and where the M factor operates as a mediator for EC are [−0.02, 0.34] and [−0.10, 0.2], respectively. In all three cases, zero is included in the confidence interval. Hence the effects are not statistically significant. In other words, the variation of some Kaya identity factors (G and M factor) cannot be explained by their decomposed variables.
Given our dataset, there is a fair chance that we observe significant multicollinearity in the regression of the decomposed variables on F, as indicated by the VIF values in Table 2. Mediation analysis reflects these interdependencies, as net effects among the decomposed variables become quantitatively visible and measurable. As displayed in Table 5 and Figure 2, we see indirect effects between −42.09% and +79.96%. These are significant effects, which, besides multicollinearity, are underestimated in the Kaya identity model, where a mediation analysis shows weaker indirect effects between −5.26% and 38.29% (see Table 6 and Figure 3). Hence, changes of the decomposed variables have a considerably smaller effect on the CO2 emissions because of weak mediating effects of the Kaya identity factors and strong mediating effects among the decomposed variables. Quantifying such effects is a benefit of mediation analysis and contributes to the existing literature. Previous studies have already attempted to account for multicollinearity. Authors like Purcel (2020) [59], Georgiev and Mihaylov (2015) [60], and Choi et al. (2010) [61] mainly focused on empirically explaining interdependencies using the environmental Kuznets curve (EKC) [62]. Multicollinearity is sometimes also accounted for by decomposing the independent variables and excluding selected ones from the regression [63]. However, despite being reasonable, these decomposition approaches face the danger of an omitted variable bias. Besides, if you first decompose and then eliminate input factors, the whole theoretical foundation of the regression model itself might be altered [64]. Consider, for example, the Kaya identity or the IPAT modification. These are basically composite indices that have been in use for the past 15-20 years, and many studies and political decisions have been based on findings derived from this model. 
If you decompose the influence factors and eliminate those causing multicollinearity, you have to reconsider all theories based on the Kaya identity model.
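To make the VIF diagnostic mentioned above concrete, the following sketch computes the variance inflation factor for the special case of two predictors, where VIF = 1 / (1 − r²) and r is their Pearson correlation. The function names and the two short series are hypothetical and purely illustrative; they are not values from Table 2. A VIF well above roughly 5 to 10 is a common rule of thumb for problematic multicollinearity, although thresholds vary between authors.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, z):
    """Pearson correlation coefficient between two equally long series."""
    mx, mz = fmean(x), fmean(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / sqrt(sxx * szz)


def vif_two_predictors(x, z):
    """VIF of either predictor in a regression with exactly two predictors.

    Undefined (division by zero) when the predictors are perfectly collinear.
    """
    r = pearson_r(x, z)
    return 1.0 / (1.0 - r * r)


# Hypothetical, strongly co-moving series (e.g. two decomposed variables).
series_a = [1.0, 2.0, 3.0, 4.0, 5.0]
series_b = [1.1, 1.9, 3.2, 3.9, 5.1]
print("VIF (co-moving series):", vif_two_predictors(series_a, series_b))

# Hypothetical uncorrelated series: VIF equals 1, i.e. no inflation.
series_c = [1.0, 2.0, 3.0, 4.0]
series_d = [1.0, -1.0, -1.0, 1.0]
print("VIF (uncorrelated series):", vif_two_predictors(series_c, series_d))
```

With more than two predictors, each VIF is obtained from the R² of regressing one predictor on all the others, which requires a full multiple regression rather than a single correlation.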
Alternatively, some case studies examine the correlation, i.e., interdependencies between CO2 emission and its contributors with composite index factors that are the result of combining two highly correlated variables. In other words, the composite index is used as a measure to eliminate multicollinearity. Tavakoli [4], for example, conducted a multiple regression on the Kaya identity factors from the top ten CO2 emitting countries. That study postulates that population, energy intensity, and GDP per capita are major influential factors. However, if, as in our data set, the measured correlation is significantly large but not perfect (i.e., +/−1), a model based on composite factors might over- or underestimate the net effects of changes in the original independent variables [65]. Even if this is not the case, a composite model provides no precise insight into these net effects, whereas our mediation analysis does.
To sum up our results: because it is important for policy makers, there is substantial literature exploring the driving forces of the decomposed variables of the Kaya identity. As explained, current approaches have limitations, as problems like omitted variable bias or over-/underestimation of a parameter's influence might occur. In contrast, using mediation analysis in this study, we are able to evaluate the interactive net effects among the Kaya identity factors and their decomposed variables on direct CO2 emissions to the atmosphere. This is especially relevant as we analyze recent emissions from 30 European countries, which, being classified as post-industrial economies, have entered a new stage in the EKC [66]. These insights can contribute to decisions made by EU policymakers, who currently struggle to find appropriate measures to mitigate CO2 emissions. Knowledge about the direct and indirect effects of the decomposed factors of the Kaya identity provides valuable information for this purpose.

Discussion
In Section 5.1, we discuss implications of the multicollinearity and the mediation effects found in Section 4 in more detail. Besides, in Section 5.2, we briefly reflect on the benefits of the Kaya identity concept in the light of both our findings and of the criticism found in contemporary literature. This is important, as we can only draw accurate conclusions from our findings if we are also aware of the model's shortcomings.

Consequences of Multicollinearity
As noted in Section 4.1, the decomposed variables of the Kaya identity factors show strong multicollinearity. That means changes in the levels of the decomposed variables significantly account for variation in other decomposed variables. Hence, we concluded that the decomposed variables are controlled by other decomposed variables that function as mediating factors in our study. In contrast, the variance of the Kaya identity factors cannot be substantially explained by their decomposed variables, as the factors are independent of each other. The Kaya identity is by definition a multiplicative identity, and each of its factors is computed by dividing one decomposed variable by another. Each factor is thus a ratio that expresses the proportional intensity of the anthropogenic CO2 emitting activities. The calculation process eliminates the multicollinearity of the respective decomposed variables. The results of the mediation model in Table 6 support the Kaya identity factors' independence.
Multicollinearity of the decomposed variables means that if two decomposed variables of a specific factor are changed with similar growth rates, the driving force of the factor on CO2 emissions will be stable or change only slightly. As TEC, EC, and GDP remain stable or decline, it stands to reason that CO2 emissions are likely to remain stable or decline as well. To examine this slightly different scenario with a focus on the dynamics, i.e., the relationship between changes in Kaya identity factors and changes in their decomposed variables, we fitted another linear model. This is based on the proportional growth rate $r(X)$ of a quantity $X$, which is defined as $r(X) = X^{-1}\,\mathrm{d}X/\mathrm{d}t$ (%/year) [47]. We fitted the model to all combinations of growth rates of both decomposed variables and Kaya identity factors, and found that, except for GDP, all models have low R² values, ranging between 0.02 and 0.54 (Figure 4). The R² for GDP is comparatively high, especially considering the regression of GDP's growth rate on the growth rates of both E and G, with values of 0.84 and 0.99, respectively (Figure 4). This means that, except for GDP, Population, and G, there is only a fairly weak connection between the decomposed variables' proportional growth rates and the corresponding Kaya identity factors' growth rate. As mentioned, we measured multicollinearity among the decomposed variables as well as strong mediating effects among them. Hence, the decomposed variables have a comparatively weak influence on the Kaya identity factors. In addition, unexpected events such as economic crises, warmer winters, or the COVID-19 pandemic affect the individual decomposed variables simultaneously. For example, net EC and fossil fuel combustion are the major drivers of and targets for mitigating CO2 emissions. Therefore, reducing the decomposed variables does not always yield the desired effect, as the respective resulting Kaya identity factors may not have a large enough negative effect on CO2 emissions.
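The argument that similar growth rates of two decomposed variables leave their ratio nearly unchanged can be checked numerically. The sketch below assumes the common definition of the proportional growth rate, (1/X) dX/dt, and approximates its mean over a period by the log difference of the first and last annual values divided by the number of years. Under that assumption, the growth rate of a ratio A/B equals r(A) − r(B) exactly. All series and function names are hypothetical and not taken from the study's data.

```python
from math import log


def growth_rate(series):
    """Mean proportional growth rate in %/year from annual values.

    Uses the log difference between the last and first value, which is the
    time average of (1/X) dX/dt over the period.
    """
    years = len(series) - 1
    return 100.0 * log(series[-1] / series[0]) / years


def ratio(numerator, denominator):
    """Element-wise ratio, e.g. a Kaya-type factor built from two variables."""
    return [a / b for a, b in zip(numerator, denominator)]


# Hypothetical series over eight consecutive years.
var_a = [100.0 * 1.02 ** t for t in range(8)]  # grows about 2 %/year
var_b = [50.0 * 1.02 ** t for t in range(8)]   # grows about 2 %/year
var_c = [80.0 * 0.99 ** t for t in range(8)]   # declines about 1 %/year

# Same growth in numerator and denominator: the factor barely moves.
print("r(A/B) =", growth_rate(ratio(var_a, var_b)))

# Different growth rates: the factor's rate is the difference of the two.
print("r(C/B) =", growth_rate(ratio(var_c, var_b)))
print("r(C) - r(B) =", growth_rate(var_c) - growth_rate(var_b))
```

In this sketch a factor changes only to the extent that its two decomposed variables grow at different rates, which is the mechanism described in the paragraph above.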

Kaya Identity in the Context of Contemporary Literature
Fischer-Kowalski and Amann (2001) [67] stated that, in the context of the Kaya identity, population (P) and technology (E, M, F) seem to dominate GDP per capita or affluence (G) in terms of environmental impact (CO2 emissions). Looking closer, affluence is related to the development and the internal structure of the economic sector, where (besides mobility) the global supply chains of companies are one of the major drivers of CO2 emissions. Only considering the factor G in the Kaya identity ignores the complex interdependencies among and between the country-specific economic sectors. Ahi [70], for example, addressed this problem by analyzing and proposing green supply chain management. Karl and Ranné (1997) and Sadik-Zada and Gatto (2020) stated that conjoint efforts towards a decarbonization-friendly transition of the energy mix, tertiarization, general trade openness, and green supply chains would significantly contribute to mitigating CO2 emissions [24][71][72][73].
Our analysis cannot cover these international developments, as it is based on the Kaya identity. However, by considering the decomposed factors and the interdependencies among them, we provide a basis for further research concerning intercountry effects. Peters et al. (2017) discussed the recent slowdown of CO2 emissions with Kaya-derived indicators by using an interconnected and nested structure composed of different forms of Kaya identity components [49]. They observed that economic factors and energy efficiency have contributed more to decreased CO2 emissions than the adoption of wind and solar power. They showed that most of the indicators are currently consistent with emission scenarios from the Paris Agreement goal of keeping the temperature increase below 2 °C. Nonetheless, the literature shows that the lack of carbon capture and storage (CCS) technologies, slow improvements in energy efficiency, or missing transformations of energy structures would threaten the 2030 goals and net-zero emissions targets of the Paris Agreement. These issues are also reflected in this study. Changes in the decomposed variables cannot directly and effectively affect the reduction of fossil fuel CO2 flux because of multicollinearity among them. To increase practical CO2-mitigating effects from key Kaya identity factors, CO2 emissions must be reduced at the same time. In this regard, additional policy efforts are required, e.g., towards increasing energy efficiency, reducing energy consumption itself, and/or implementing new technologies like CCS. Independent of politics, industry has to contribute as well, e.g., by designing and implementing sustainable strategies (see, e.g., Scarpato et al. (2020) [74]) or, as mentioned above, by focusing on a greener, i.e., less carbon-intensive, supply chain.

Conclusions
In this study, we evaluated the causal relationship between the Kaya identity factors and their decomposed variables to explore the real interactive influences on F, which is the CO2 emitted from in situ fossil fuel CO2 sources to the atmosphere. We also included indirect effects among the decomposed variables. The analysis was done by evaluating a linear model where the decomposed variables of the Kaya identity serve as explanatory variables and F as the dependent variable. An analogous model was set up for the Kaya identity factors as well. In addition, to look deeper into the direct and indirect dependencies between the individual variables, we conducted an extensive mediation analysis.
Results (Section 4) show that these indirect effects are especially significant; however, there are no strong or significant mediating effects from the Kaya identity factors on their decomposed variables. With respect to the proportional growth rate, changes in the decomposed variables do not lead to large or diverse changes in the growth rates of the Kaya identity factors. Because of the strong indirect effects among the decomposed variables on F, policy makers must consider the causalities between the decomposed variables and Kaya identity factors to set realistic carbon mitigation targets. The quantitative net effects (direct effects and indirect effects) among the Kaya identity factors and their decomposed variables indicate that there are hidden and intricate interdependencies.
For example, weak indirect effects between the Kaya identity factors themselves suggest that changes in the decomposed variables disproportionately affect the Kaya identity factors and hence F. Our analysis using proportional growth rates confirms these findings: except for G, GDP, and population, there are only weak interdependencies between the growth rates of the Kaya identity factors and their corresponding decomposed variables. Furthermore, as discussed in Section 5.2, the Kaya identity model is a national model; the international flow of goods and energy, for example, is not considered.
To sum up our findings: technically speaking, most policy measures target one or more decomposed variables of the Kaya identity [3][4][5], if possible while keeping negative effects on the economy at a minimum. Policymakers thereby especially focus on E (CO2 emissions/EC) and M (EC/TEC). The factor P cannot be controlled that easily, and both G and I are closely related to economic growth, i.e., should not be influenced negatively. Popular examples are subsidies for electric vehicles and renewable energies like solar power. However, given the weak mediating effects of the Kaya identity factors, which are, according to the Kaya model equation, eventually responsible for reducing CO2 emissions, this is not enough. Even if policy measures to mitigate CO2 emissions successfully influence either E or M or both, this does not necessarily imply a reduction of emissions on a national level. The results of our study can be used to account for such unforeseen effects and adjust policy strategies adequately. Besides this, our results show the need for a global sustainable strategy to reduce CO2 emissions directly and not only by influencing one of the factors discussed above. However, some promising new technologies like CCS are controversial, as their long-term impact on the environment is still unclear. Many measures that reduce emissions are also quite costly. This shows the need for more research about the cross-border effects of CO2 emission reduction measures, e.g., along the global supply chain, to decide which of them is most effective. Note also, when interpreting this study, that the dataset is short, consisting of observations from eight consecutive years. Additionally, only European countries were included in this study. Hence, more research involving a longer set of observations (20 years or more if possible) and data from other continents is required and would produce more information for designing appropriate policy measures.
Funding: This article is co-funded by the Baden-Württemberg Ministry of Science, Research and Culture. data, energy consumption, population, Gross Domestic Product (GDP) and National Inventory Report data.

Conflicts of Interest: The authors declare no conflicts of interest.