Integrated Multidimensional Sustainability Assessment of Energy System Transformation Pathways

Abstract: Sustainable development embraces a broad spectrum of social, economic and ecological aspects. Thus, a sustainable transformation process of energy systems is inevitably multidimensional and needs to go beyond climate impact and cost considerations. An approach for an integrated and interdisciplinary sustainability assessment of energy system transformation pathways is presented here. It first integrates energy system modeling with a multidimensional impact assessment that focuses on life cycle-based environmental and macroeconomic impacts. Then, stakeholders' preferences with respect to defined sustainability indicators are elicited, which are finally integrated into a comparative scenario evaluation through a multi-criteria decision analysis (MCDA), all in one consistent assessment framework. As an illustrative example, this holistic approach is applied to the sustainability assessment of ten different transformation strategies for Germany. Applying multi-criteria decision analysis reveals that both ambitious (80%) and highly ambitious (95%) carbon reduction scenarios can achieve top sustainability ranks, depending on the underlying energy transformation pathways and respective scores in other sustainability dimensions. Furthermore, this research highlights an increasingly dominant contribution of energy systems' upstream chains to total environmental impacts, reveals rather small differences in macroeconomic effects between different scenarios and identifies the transition among societal segments and climate impact minimization as the most important stakeholder preferences.


Background
From the Paris Agreement and the related Nationally Determined Contributions (NDCs), to the European Green Deal [1], national energy and climate plans [2], climate [...]
Life cycle assessment (LCA) is increasingly combined with energy system models (see [15] for more references). This approach allows one to estimate the life cycle-based environmental impacts of the energy transition or to take ecological factors into account when optimizing the expansion of the energy system [35]. Challenges of such integration approaches are discussed in [36].
On the other hand, the assessment of economic impacts of transformation strategies requires an integrated macroeconomic model with all relevant linkages between the energy system and the economy. Transitional changes in the energy system lead to new constellations of prices, as well as structures of demand and supply. Relationships between the decisions of companies and consumers, on the one hand, and changes in the energy sector, on the other, should be represented in an input-output context. Furthermore, it is important to consider that there is a path dependency in the economy which can be represented particularly well in an empirically based, dynamic model with high disaggregation [37]. For an overview of approaches and challenges for macro-economic models that have a special focus on energy economy, see, e.g., [38,39].
Another approach to the multidimensional development and assessment of transformation strategies for the energy system is the use of Integrated Assessment Models (IAMs). They aim at integrating the main features of the energy system, economy, climate and environment in one modeling framework (see, e.g., [40–44]). They play a central role in the development and evaluation of technologically and economically feasible strategies to meet climate goals, such as those of the Paris Agreement [45].
When aiming at a broad assessment framework, many aspects (i.e., ecological, economic, and social) come together that need to be systematically evaluated. This is where multi-criteria decision analysis (MCDA) methods come into play. MCDA methods have already been applied in the context of energy system assessment [46,47]. According to [48], the most popular techniques for multi-criteria decision analysis for sustainable energy planning are AHP, PROMETHEE, and ELECTRE. Other methods such as MAUT, weighted sum methods, weighted product methods and TOPSIS were mentioned as appropriate methods in [49]. Multi-criteria decision analysis has already been applied in order to develop and evaluate energy scenarios as, e.g., in [50].

Research Questions Addressed in This Paper
In this paper, we present a new generic modeling and assessment approach for energy scenarios, in which technical-structural development paths of the energy system, as well as their economic and ecologic consequences, can be analyzed and evaluated. The approach combines energy system models, life cycle assessment databases and macroeconomic models. To deal with the complexity of the various sustainability dimensions, we apply MCDA methods using stakeholder preferences derived from a discrete choice experiment. Such an approach provides a powerful toolbox for an integrated assessment of the various dimensions of sustainability for existing and future energy scenarios, and it might help to develop and test this assessment approach on a variety of existing energy scenario studies in Germany. On the one hand, our research questions focus on the variety of these energy transformation pathways:
• How can strategies to transform an energy system towards low CO2 emissions, as formulated in a variety of different studies, be compared in terms of their environmental, economic and social impacts? How can these strategies be harmonized to allow for a fair comparison?
On the other hand, our research addresses the challenge of integrating various sustainability dimensions, setting the scene for a more genuine sustainability assessment:
• How do variations in technical solutions in these different CO2 reduction strategies influence environmental and (macro-)economic impacts?
• How can stakeholder preferences be integrated into MCDA methods? What influence do stakeholder preferences have on the ranking of scenarios? How can robust results be achieved?
The paper here focuses on the presentation and discussion of the methodology of an integrated scenario assessment. Results of the approach are presented primarily to illustrate the outcomes of the methodology developed here. The method is applied as an exemplary case for the sustainability assessment of 10 different energy system transformation strategies for Germany. Figure 1 illustrates the main workflow of the approach presented here, as well as the concepts of "multi-level scenario evaluation" and "multidimensional impact assessment". The first step is the selection of scenarios, which describe socio-economic boundary conditions and the main technical strategies for a defossilization of the different sectors of the German energy system (see Section 2.2).

Overview of the Assessment Workflow
Step 2 is the selection of sustainability indicators to be used in this study (Section 2.3). In the next step, step 3, the principal technical defossilization strategies identified in the different studies are used as input for detailed energy system modeling. Other boundary conditions (demand, prices, and others) have been harmonized between all re-modelled scenarios (Section 2.4). The output of the energy system models (ESMs) serves as the basis for the estimation of life cycle-based environmental impacts (step 4, Section 2.5), macro-economic effects (step 5, Section 2.6) and an assessment of system diversity (Section 2.7). The preferences of stakeholders with respect to the chosen indicators are determined in step 6 in focus groups, and with the help of a conjoint analysis and discrete choice experiments (Section 2.8). Economic and ecologic impacts, on the one hand, and stakeholder preferences derived from a discrete choice experiment (DCE), on the other hand, serve as input for a multi-criteria decision analysis in step 7 (Section 2.9). The "core" of this approach is thus a three-fold integration effort: First, the multi-level scenario evaluation, which comprises the "classical" techno-economic perspective based on energy system modeling, is performed. This is followed by an extensive multidimensional impact assessment and a subsequent comparative evaluation of different transformation strategies with an MCDA. The multi-level scenario evaluation thus integrates different assessment levels into one consistent approach.
Second, the multidimensional impact assessment of the different transformation strategies considers both the ecologic and (socio-)economic dimensions of sustainability via the model-based assessment of ecologic and (socio-)economic impacts of the transformation pathways. However, it also integrates other aspects of the social dimension via stakeholder-based preference estimates. It thus integrates all "pillars" of sustainability.
Third, the different transformation strategies are integrated into a harmonized modeling approach, which enables a systematic comparison of the impacts of those strategies. The harmonized modeling approach, in turn, comprises harmonized boundary conditions for the scenarios (e.g., useful energy demand) and data harmonization (e.g., technology costs and efficiencies) between the different models used.
A further asset of this approach is the fact that a wide range of different transformation strategies have been included in the analysis. Of course, not all technically feasible strategies could be explored within this analysis. However, the results of this study provide a good insight into the range of impacts (and the corresponding benefits and challenges) that can be expected from energy system transformation in Germany.

Selection of Scenarios as "Inspiration"
The selection of technical transformation strategies for the German energy system for an in-depth analysis of ecologic and economic impacts followed a two-stage selection process: in the first step, an extensive review of scenario studies of Germany was carried out, and 40+ scenarios were selected which fulfilled the following criteria (see [51]):

• Coverage of the entire energy system (power, heat, transport);
• Reduction of energy-related direct CO2 emissions of at least 80% by 2050;
• Sufficiently detailed documentation of scenario results;
• Study commissioned by a relevant stakeholder and/or study carried out by an established research institution;
• Study not older than 2012.
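The first-stage screening criteria listed above can be expressed as a simple filter. The following is a minimal sketch only: the `ScenarioStudy` record, its field names and the three example entries are hypothetical and are not taken from the reviewed studies.

```python
from dataclasses import dataclass

# Sectors a scenario must cover to count as "entire energy system".
REQUIRED_SECTORS = frozenset({"power", "heat", "transport"})


@dataclass
class ScenarioStudy:
    """Hypothetical record describing one reviewed scenario."""
    name: str
    sectors: frozenset            # sectors covered by the scenario
    co2_reduction_2050: float     # reduction of direct CO2 emissions by 2050 (fraction)
    well_documented: bool         # sufficiently detailed documentation of results
    credible_source: bool         # relevant commissioner or established institution
    publication_year: int


def passes_screening(study: ScenarioStudy) -> bool:
    """Apply the five first-stage criteria to one scenario."""
    return (
        REQUIRED_SECTORS <= study.sectors
        and study.co2_reduction_2050 >= 0.80
        and study.well_documented
        and study.credible_source
        and study.publication_year >= 2012
    )


# Illustrative candidates (invented values).
candidates = [
    ScenarioStudy("A", frozenset({"power", "heat", "transport"}), 0.85, True, True, 2015),
    ScenarioStudy("B", frozenset({"power"}), 0.95, True, True, 2017),
    ScenarioStudy("C", frozenset({"power", "heat", "transport"}), 0.95, True, True, 2010),
]
selected = [s.name for s in candidates if passes_screening(s)]
```

In this illustration only scenario "A" passes: "B" does not cover all sectors and "C" is older than 2012.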
In a second step, out of these 40+ scenarios, five scenarios each with emission reductions of approximately 80% and 95%, respectively, were selected for a detailed analysis within this study. This second selection was based on the criterion that the scenarios should represent the broadest possible spectrum of transformation strategies for the energy system fulfilling a target of 80% and 95% emission reduction, respectively. Table 1 summarizes the selected scenarios.
Table 1. Overview of the selected transformation strategies. The reduction of direct CO2 emissions in scenarios I-V is between 80% and 90%, and more than 90% in scenarios VI-X.

Selection of Indicators
The selection of sustainability indicators followed a three-step process: First, an extensive literature review (see Supplementary Material for details) resulted in a comprehensive list of indicators that have been discussed in the national and international sustainability debate in the past, especially with regard to energy technologies and systems. We then selected indicators according to the following criteria:

• The indicator is relevant to the current sustainability discussion;
• The indicator value depends significantly on the technology mix in the future energy system;
• The future development of the indicator can be estimated satisfactorily with available models and methods;
• The full set of indicators addresses (if possible) the ecologic, economic and social dimensions of sustainability and, additionally, considers system-related aspects;
• The indicators describe causally independent impact mechanisms;
• The preferred direction of the indicators' development is clear.
The full set of the 23 selected indicators is documented in Figure 2 below. It comprises 16 environmental indicators (in the sub-categories climate change, human health, ecosystem quality and resources) that have been compiled for the European environmental footprint version 2 [61]. However, the indicator for mineral and metal resources has been updated according to [62]. In addition, six socio-economic indicators and one socio-technical "systemic" indicator were selected (see Supplementary Material for details).
Figure 2. Overview of the full set of indicators, as well as the sub-sets used in the discrete choice experiments (DCE) and the multi-criteria decision analysis (MCDA). "AGG" indicates that the DCE uses an aggregated indicator for "human health" and "resources". "CUM" indicates that cumulated values (2021-2050) are used in the MCDA. References: ILCD 2.0.2018 [61], van Oers [62], Stirling Index [63].
Although this full set of indicators is used for the comprehensive impact assessment of transformation strategies for the energy system, it is too comprehensive to be practical for discussions with stakeholders. Therefore, a sub-set of indicators was selected for the discrete choice experiment which additionally met the following criteria:

• The number of indicators is manageable for discussions with non-experts;
• The indicator relevance is also understandable for non-experts;
• The indicators are relevant for the citizens' daily lives;
• The sub-set of indicators still addresses ecologic, economic, technical and social dimensions.
The sub-set of indicators used for the discrete choice experiment is depicted in Figure 2. It comprises the indicator "climate change", the socio-economic indicators "system cost" (as a proxy for consumer prices) and "people in employment", as well as the socio-technical indicator "resilience/security of supply". Finally, the indicators "human health" and "resources" are used in the DCE. Note that more detailed information on human health and resource issues is available from the impact assessment (see column "Indicator" for both sub-categories). However, for practical reasons, only preferences for the aggregated indicators are determined in the DCE.
In contrast to the DCE sub-set, the MCDA sub-set uses six differentiated indicators within the sub-category "human health" and two differentiated indicators within the subcategory "resources", as provided under the environmental footprint life cycle impact assessment method 2.0. Disaggregation of the aggregated preferences for "human health" and "resources" from the DCE was performed using the weighting set provided under this scheme [64]. The detailed procedure is laid out in Section 2.9.
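One way to picture the disaggregation step is a proportional split of an aggregated preference weight across its sub-indicators. The sketch below assumes such a proportional split; the actual procedure is the one described in Section 2.9, and the numerical weights used here are placeholders rather than the values of the weighting set in [64].

```python
def disaggregate_weight(aggregated_weight, sub_indicator_weights):
    """Split one aggregated preference weight across sub-indicators.

    ASSUMPTION: the split is proportional to the relative weights of the
    sub-indicators within an external weighting set.
    """
    total = sum(sub_indicator_weights.values())
    if total <= 0:
        raise ValueError("sub-indicator weights must sum to a positive value")
    return {
        name: aggregated_weight * weight / total
        for name, weight in sub_indicator_weights.items()
    }


# Placeholder numbers: a DCE weight of 0.2 for "resources", split across
# two differentiated resource indicators with relative weights 3 : 1.
resource_weights = disaggregate_weight(
    0.2, {"resources_fossil": 3.0, "resources_minerals_metals": 1.0}
)
```

By construction, the disaggregated weights sum to the aggregated weight, so the overall balance between sub-categories from the DCE is preserved.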
There are two other minor differences between the two subsets: within the subcategory "employment", the DCE subset uses number of employees as an indicator, whereas the MCDA subset uses unemployment rate. Finally, DCE uses the security of supply (see Section 2.7) as an indicator, whereas the MCDA uses the diversity of the energy system as one aspect of its resilience (and, thus, of its security of supply).
Note that, for most indicators, low values are desirable. The only exceptions are the indicators "gross domestic product (GDP)", "people in employment" and "resilience/security of supply".

Harmonised Techno-Economic Re-Modeling of the Selected Transformation Strategies
The scenarios of the selected original studies (Table 1) made different assumptions regarding the future development of socio-economic drivers (such as population and GDP), the demand for energy services, as well as energy and technology prices, full load hours, etc. Since the approach presented here focuses on ecologic and macro-economic impacts of different supply side strategies to reduce greenhouse gas emissions, demand for energy services was harmonized between the different transformation strategies. This was necessary because different demand estimates result in different magnitudes of environmental impacts, even if supply side technologies are identical. The same is true for assumptions on technology, energy carrier, and CO2 certificate prices, which were harmonized in order to avoid spurious macro-economic impacts that result only from different price assumptions. Finally, the original scenarios have been calculated using different models with different definitions of sectors, different technology and energy carrier portfolios, etc.
The selected original studies were translated into a new set of scenarios, which were all modeled in a harmonized manner using a unified set of boundary conditions. The principal technical supply side strategies for the decarbonization of the energy system, however, were taken from the original studies, i.e., the market shares of different technologies and/or fuels in each sector.
The scenarios were developed with the scenario generator MESAP/PlaNet [65,66], which was itself coupled with the power market simulation model flexABLE [67] (see Supplementary Material for details on the models).
The core of MESAP/PlaNet is an accounting framework. This means that the main model parameters that determine the development of the energy system are set by the modeler. MESAP/PlaNet, however, assures a consistent calculation of energy flows from primary energy to useful energy services. In contrast to energy system optimization models, MESAP/PlaNet is thus very flexible in developing consistent scenarios for fundamentally different (technical) transformation strategies.
MESAP/PlaNet calculates annual energy demand and supply balances for the end-use sectors residential, industry, services and commerce, and transport (further differentiated by application type, different end-use technologies and energy carriers). In a similar manner, the conversion sector is considered, i.e., generation of power, district heat, synthetic and biogenic fuels and gases, etc. MESAP further estimates the required capacities for power, heat, gas and fuel generation, as well as annual gross new installations (i.e., including replacement of old plants) based on external assumptions of full load hours of these technologies. Finally, MESAP calculates economic quantities, such as annual investment in energy technologies and the corresponding annuities, annual fixed and variable operation and maintenance costs (fuels and CO2), levelized costs of electricity and the total system costs. These quantities partly serve as input for the economic impact assessment (see Section 2.6).
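Two of the accounting relations mentioned above can be illustrated in a few lines. This is a sketch under stated assumptions, not MESAP/PlaNet code: the function names are hypothetical, and the annuity uses the textbook capital recovery factor, which the source does not confirm as the model's exact formulation.

```python
def required_capacity_kw(annual_generation_kwh, full_load_hours):
    """Capacity needed to deliver an annual output at given full load hours."""
    if full_load_hours <= 0:
        raise ValueError("full load hours must be positive")
    return annual_generation_kwh / full_load_hours


def annuity(investment, interest_rate, lifetime_years):
    """Constant annual payment for an investment (capital recovery factor).

    ASSUMPTION: standard annuity formula; interest rate and lifetime are
    external inputs chosen by the modeler.
    """
    if lifetime_years <= 0:
        raise ValueError("lifetime must be positive")
    if interest_rate == 0:
        return investment / lifetime_years
    growth = (1 + interest_rate) ** lifetime_years
    return investment * interest_rate * growth / (growth - 1)


# Illustrative values: 2 GWh per year at 2000 full load hours needs 1000 kW.
capacity = required_capacity_kw(2_000_000.0, 2000.0)
yearly_cost = annuity(investment=1000.0, interest_rate=0.05, lifetime_years=20)
```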
The model flexABLE is an agent-based electricity market simulation. The model follows a bottom-up approach and includes the main types of generation assets, such as conventional power plants, variable renewable generators and storage units. These assets, represented as agents, can participate in both an energy exchange market and a control reserve market. Agents formulate their bids based on several techno-economic parameters separately for each market at each time step. The model allows a simulation of the dispatch plans of single units and market outcomes. The simulation runs confirm the technical feasibility of the scenarios, as it is ensured that demand and supply for electricity are always matched. In some scenarios, this required setting or adapting installed generation and storage capacities in order to map the energy balances assumed in MESAP/PlaNet.
MESAP/PlaNet was run from 2000 to 2050, with annual time steps. In the model, the years 2000-2017 were calibrated with available statistical data for Germany. The years 2018-2020 were identical in all scenarios.
On the other hand, flexABLE calculated the hourly dispatch of power plants, electrolyzers and battery storage on the basis of the energy balances from MESAP. In turn, flexABLE results (full load hours, storage demand) for the year 2050 were fed back into MESAP in order to obtain consistent energy balances and supply side infrastructure requirements (including power storage) for the entire energy system (see Figure 3).
• Technology efficiencies, investment costs and fixed operation and maintenance costs from the DLR techno-economic data base.
For each of the ten re-modelled energy transition concepts, the market shares of different technologies in the different sectors were kept as far as possible from the original studies. This means, for example, that while the development of the transport service by individual passenger cars is identical in all re-modelled scenarios, the technology mix for passenger cars is taken from the original studies. Similarly, the useful energy demand for space heating in the residential sector was taken from [52], but the share of different technologies/energy carriers which provide space heat in this sector was from the original studies. Other quantities, such as the gross power demand, in contrast, were calculated individually for each scenario from the assumptions of the demand for energy services (identical in all scenarios) and from the (scenario-specific) technology mixes in each sector. The technology shares in power generation were again taken from the original study. In this way, different supply side strategies could be made comparable. This was a basic prerequisite for a systematic, non-biased comparison of the economic and ecologic impacts of those strategies.
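The combination of harmonized demand with scenario-specific technology shares can be sketched as follows. The function name, the technology labels and all numbers are hypothetical; the sketch only illustrates that identical useful energy demand leads to different energy carrier demand once the technology mix differs.

```python
def final_energy_by_technology(useful_demand, shares, efficiencies):
    """Convert harmonized useful energy demand into final energy demand.

    useful_demand : useful energy demand (identical in all scenarios)
    shares        : scenario-specific market share per technology (sums to 1)
    efficiencies  : useful energy delivered per unit of final energy
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("technology shares must sum to 1")
    return {
        tech: useful_demand * share / efficiencies[tech]
        for tech, share in shares.items()
    }


# Illustrative values: 100 TWh of useful space heat in both scenarios.
efficiencies = {"heat_pump": 4.0, "gas_boiler": 0.8}
scenario_x = final_energy_by_technology(
    100.0, {"heat_pump": 0.5, "gas_boiler": 0.5}, efficiencies
)
scenario_y = final_energy_by_technology(
    100.0, {"heat_pump": 1.0, "gas_boiler": 0.0}, efficiencies
)
```

In this illustration the electricity demand for heat pumps doubles from scenario X to scenario Y, although the demand for the energy service is the same in both.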

Assessment of Life Cycle-Based Ecological Impacts
The life cycle-based assessment of ecological impacts was performed using the Framework for the Assessment of Environmental Impacts of Transformation Scenarios (FRITS) [15]. FRITS is a Python-based tool that couples results from energy system models (ESM) with a life cycle inventory (LCI) database. It calculates a broad set of life cycle-based environmental impacts for entire transformation scenarios, while widely assuring consistency between ESM and LCI assumptions.
The main ESM outputs (on an annual basis) required by FRITS were (see Figure 3):
• Gross new installed capacities (i.e., including replacement after end of life) of all technologies in the end-use sectors (including transport) and conversion sectors;
• Heat, power, fuel, and gas generation of each ESM technology;
• Transport services for each transport technology;
• Technical characteristics (such as efficiency, heat-to-power ratio for CHP, coefficients of performance, etc.) of each ESM technology;
• Blending quota of biogenic or synthetic fuels and gases in total fuel and gas supply, respectively.
The core of FRITS is a life cycle inventory data base focusing on energy and transport technology. The FRITS data base is based on the ecoinvent 3.3 data base [68], but includes various additional LCI data sets for energy and transport technologies from other data bases, e.g., from BioEnergeDat [69], UVEK [70], and SYSEET [71], from project partners and from the literature. An illustration of the workflow of the FRITS framework can be found in the Supplementary Material.
The original LCI data sets in the data base are subject to several manipulations in FRITS. These manipulations were necessary in order to tailor the LCI data to the model as accurately as possible: First, the LCI data sets were separated into the operation phase and construction phase. This allowed a more correct temporal allocation of impacts from those life cycle phases on the scenario level, but also an (implicit) harmonization of both technical lifetime and full load hours of all technologies in the ESM and LCI databases. Second, some scenario technologies require an energy carrier as input which is generated by another technology considered by the ESM. This is the case, e.g., for electric heat pumps, hydrogen fuel cells or gas burners using synthetic gas. In this case, the respective energy carrier input had to be deleted from the LCI data sets for the operation phase of those technologies in order to avoid double counting of impacts. Third, the power mix in upstream processes in the ecoinvent data base roughly represented the power mix today. In order to take into account the effect of (globally) changing power mixes in the future, the power mix in the upstream processes in the LCI database could be adjusted to regionally differentiated scenarios with global coverage. In this study, power mix scenarios from [72] (2 °C scenario) were used.
After the manipulations described here, a Life Cycle Impact Assessment (LCIA) was performed. As a result, for each of the technologies represented in the data base, specific life cycle impacts for the construction phase (impacts per kW) and operation phase (impacts per kWh) were calculated (see Section 2.3 for a list of impact categories considered here). The original LCI data make assumptions about, e.g., the efficiency of the technology considered, which might differ from the assumptions made in the ESM. In a last harmonization step, efficiencies in the LCI data were adjusted towards the ESM estimates by assuming a linear dependence of specific impacts on efficiency. As a result, efficiency-adjusted life cycle-based specific environmental impacts for the operation phase were obtained. Specific impacts of the construction of energy technologies were reported per kW installed capacity, and those of the operation phase per kWh output. For vehicles, construction impacts were reported per car (passenger cars only) and operation impacts per passenger kilometer (pkm, for passenger transport) or per ton kilometer (tkm, for freight transport).
In the last step, the ESM output was multiplied with the following specific impacts: gross new capacities with specific impacts from the construction phase, and energy carrier output (or transport services) from the ESM with specific impacts from the operation phase.
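The final multiplication step can be sketched as below. This is an illustration of the bookkeeping described in the text, not FRITS code: the function name, the dictionary layout and the numbers are assumptions made for the example, and only a single impact category is shown.

```python
def scenario_impacts(new_capacity_kw, output_kwh,
                     construction_impact_per_kw, operation_impact_per_kwh):
    """Combine ESM output with specific life cycle impacts for one year.

    Gross new capacities are multiplied with construction-phase impacts,
    energy carrier output with operation-phase impacts.
    """
    impacts = {}
    for tech in new_capacity_kw:
        construction = new_capacity_kw[tech] * construction_impact_per_kw[tech]
        operation = output_kwh[tech] * operation_impact_per_kwh[tech]
        impacts[tech] = {
            "construction": construction,
            "operation": operation,
            "total": construction + operation,
        }
    return impacts


# Illustrative values for one impact category (e.g., kg CO2-eq).
result = scenario_impacts(
    new_capacity_kw={"wind": 100.0, "gas_cc": 0.0},
    output_kwh={"wind": 4000.0, "gas_cc": 2000.0},
    construction_impact_per_kw={"wind": 500.0, "gas_cc": 300.0},
    operation_impact_per_kwh={"wind": 0.0, "gas_cc": 0.25},
)
scenario_total = sum(entry["total"] for entry in result.values())
```

Keeping construction and operation contributions separate per technology is what allows the analysis on scenario, sector and technology level.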
Life cycle-based environmental impacts of the scenarios could thus be analyzed on a scenario level, on a sector level and on a technology level, and individually for the impacts due to construction and operation of all technologies. Furthermore, impacts from the conversion sectors (e.g., power or H2 generation) could be allocated to the end-use sectors according to the demand for those energy carriers for end-use applications.

Methodological Approach for the Assessment of Macro-Economic Impacts
Evaluating economic aspects is an important part of the sustainability assessment and comparison of the scenarios. Each scenario setting results in changes in German economic indicators, such as GDP, employment, value added or output, through a variety of channels. The energy mix in the respective scenarios leads to a certain price level, which has economy-wide impacts on the foreign trade balance and on investment needs along a path over time. Collecting all effects requires a consistent framework, typically provided by an economic simulation model. Here, the macroeconometric modeling framework PANTA RHEI was used, which has been applied in energy policy assessments [37,73,74], especially in analyses of the German energy transition [52,75].
PANTA RHEI includes the System of National Accounts and Balances (SNAB) at its core and time series-based estimated behavioral equations. An overview can be found in the Supplementary Material. PANTA RHEI is consistent and fully interdependent (i.e., the mutual impact of model variables is considered simultaneously), as well as open for bottom-up information and data. In the case of this study, details of the energy mix, investment pathways and estimates on certain cost parameters, such as costs for operation and maintenance of energy systems and levelized costs of electricity (LCOE), were provided by the MESAP model (see Figure 3 and Section 2.4). An interface was defined to match data output from MESAP and data requirements from PANTA RHEI. With the help of this interface, MESAP scenarios could automatically be linked with PANTA RHEI. The macroeconomic scenario assessment could thus easily be adjusted or updated (Figure 3).
Both models are based on common framework data, which are assumed to be the same for all scenarios examined. Therefore, data on, e.g., population development, number of households, as well as prices for energy imports and CO2 emissions were taken from [52]. The macroeconomic analysis was carried out both at a national level and at a federal state level. The temporal resolution of the input data from MESAP was five years, from 2020 onwards. These values were interpolated linearly to fit the annual data framework in PANTA RHEI.
In PANTA RHEI, data for the energy system followed the structure of the energy balance, with 30 energy sources, 20 final energy sectors and 11 conversion sectors. Furthermore, new technologies (hydrogen, methane, PtX technologies) were part of the future energy system in some of the scenarios, which neither official statistics nor the model have covered so far. Hence, the energy balance and the related model variables were supplemented by these new technologies. The proposal for the extension is documented in [76].
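The linear interpolation from five-year to annual values can be sketched as follows. The function name and the example series are hypothetical; the actual MESAP to PANTA RHEI interface is not described in detail here.

```python
def interpolate_annual(values_by_year):
    """Linearly interpolate values given at support years to annual values."""
    years = sorted(values_by_year)
    annual = {}
    for start, end in zip(years, years[1:]):
        v_start, v_end = values_by_year[start], values_by_year[end]
        for year in range(start, end):
            fraction = (year - start) / (end - start)
            annual[year] = v_start + (v_end - v_start) * fraction
    # The last support year is not covered by the loop above.
    annual[years[-1]] = values_by_year[years[-1]]
    return annual


# Illustrative five-year input (e.g., annual investment in arbitrary units).
five_year_values = {2020: 10.0, 2025: 20.0, 2030: 15.0}
annual_values = interpolate_annual(five_year_values)
```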
PANTA RHEI was used to assess macroeconomic effects for ten scenarios (see Section 2.2). Economic impacts were assessed by comparing simulation runs obtained under different energy system scenarios, with respect to several economic quantities. By keeping everything else constant (ceteris paribus assumption), the results, which are described in Section 3.2, were obtained. Every scenario had its own pathway of investment, system costs and energy imports. These inputs were implemented in the interdependent macroeconomic framework in PANTA RHEI and results were calculated annually.

Power System Diversity as a Systemic Indicator
A criterion is required for the project's MCDA model that considers the resilience, robustness and security of supply of the various energy scenarios in 2050. As there is no obvious measure of security for long-term scenarios, a substitute measure must be found which can stand in for security in all of its aspects. Several studies include increasing diversity as a primary means of increasing energy security [77–81]. In fact, diversity is often used as a singular substitute indicator of the long-term security of an energy mix [82]. Diversity is quantified in this study with the Stirling Diversity Index (SDI), as it is robust to potentially subjective definitions and demarcations [63]. It is defined as follows:
$$\mathrm{SDI} = \sum_{i,j\,(i \neq j)} d_{ij}\, p_i\, p_j$$
where $i$ and $j$ are different power generation technologies, $p_i$ is the percentage of installed power capacity delivered by technology $i$ and $d_{ij}$ is the disparity between technologies $i$ and $j$. The disparity $d_{ij}$ is calculated following [83] from $N$ disparity criteria, where $w_n$ is the weight assigned to criterion $n$ and $c_{ni}$ is the value that energy source $i$ has for criterion $n$.
Here, 13 criteria were selected to calculate the disparity, including average plant lifetime, average plant capacity, proportion of fuel imported, nations providing imported fuel, correlation of average daily generation with average demand, self-correlation of the generation data, ramping ability, minimum operable load, market concentration of suppliers and critical raw materials. These aspects cover technology characteristics that are not considered in the larger MCDA, so there is no redundancy of information. A detailed description of the criteria selection and the results of the SDI can be found in the Supplementary Material.
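A diversity index of this kind can be computed as sketched below. The sketch assumes the common Stirling form, i.e., a sum of disparity times the product of shares over all pairs of distinct technologies, and it assumes a weighted Euclidean distance between criteria values as the disparity measure; the formula actually used follows [83] and may differ. Technology names, criteria values and weights are invented for illustration.

```python
from math import sqrt


def disparity(criteria_i, criteria_j, weights):
    """Disparity between two technologies.

    ASSUMPTION: weighted Euclidean distance over the (normalized)
    criteria values; this is a stand-in for the formula in [83].
    """
    return sqrt(sum(
        w * (a - b) ** 2 for w, a, b in zip(weights, criteria_i, criteria_j)
    ))


def stirling_index(shares, criteria, weights):
    """Sum of d_ij * p_i * p_j over all ordered pairs with i != j."""
    total = 0.0
    for i in shares:
        for j in shares:
            if i != j:
                d_ij = disparity(criteria[i], criteria[j], weights)
                total += d_ij * shares[i] * shares[j]
    return total


# Two illustrative technologies described by two normalized criteria.
criteria = {"wind": (0.0, 0.0), "gas": (1.0, 0.0)}
weights = (1.0, 1.0)
balanced = stirling_index({"wind": 0.5, "gas": 0.5}, criteria, weights)
concentrated = stirling_index({"wind": 0.9, "gas": 0.1}, criteria, weights)
```

With these numbers the balanced mix scores higher than the concentrated one, which reflects the intended reading of the index: a system that is spread across dissimilar technologies is considered more diverse.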

Determination of Stakeholder Preferences: Discrete Choice Experiments and Focus Groups
In order to gain insights into citizens' preferences regarding future energy systems, a mixed method approach was applied. Qualitative focus groups and a quantitative discrete choice experiment were conducted. While the quantitative analysis grants compatibility and serves as input to the MCDA, the qualitative data enhance the insight in the reasoning of citizens where necessary. The former focused on three core concepts of social sustainability: • Quality of life-as a totalizing variable to include the degree of fulfillment of multiple diverging lifestyles and their goals found among different societal groups; • Justice of distribution-as a variable to address the distribution of cost and benefits connected to any scenario to be debated within society; • Justice between generations-as a variable to give special attention since the worldwide implementation of the "Fridays for Future" movement.
To gather data on these aspects, in-depth focus groups with six groups (63 people in total) were conducted in two German cities (Stuttgart and Osnabrück). Here, the participants were not only asked to assess multiple energy technologies according to their impact on the three social aspects of sustainability, but also to explain their evaluations and define the three concepts in their own words. This puts emphasis on the strength of qualitative social science: the exploration of complex concepts and their underlying dimensions of meaning for different individuals. To foster diverse perspectives, the groups varied in age (young adults, general public and seniors) and were selected in equal gender proportions. The discussions were recorded and later transcribed and analyzed with MAXQDA software by three different social scientists to ensure inter-coder reliability.
Quantitative data was gathered via a discrete choice experiment [84], where respondents were asked to choose between eight different energy scenarios. Discrete choice experiments have a long tradition in marketing research, where they are used to discover the importance of different traits of a product for consumers' buying decisions [85,86]. They are seen as a more valid measurement than directly asking for the importance of different characteristics for one's decision [87], and have been applied in energy related research to analyze, e.g., investments in energy technologies [88], nuclear waste storage [89] or wind power developers' perspectives on the effect of support policies [90].
The scenarios in the decision set of this discrete choice experiment were chosen to represent a variety of energy futures with reduced CO2 emissions, originating from different technologies. The scenarios therefore differed with regard to system costs, employment effects, security of supply, health effects, climate effects, land use and resource depletion. To facilitate the decision between scenarios, they were characterized by their relative effects on these variables compared to the other scenarios in the set (see Table 2). For example, the value of −7% for "employment" in Table 2 meant that the scenario in question performed 7% worse than the arithmetic mean of all scenario values. To make the scenarios' performance easier to process, these values were color-coded: values more sustainable than those of the alternative scenario in green, less sustainable values in red and equally sustainable values in yellow. In total, 1488 pairwise comparisons were made by 130 interviewees in Stuttgart and Osnabrück. The interviews were conducted online and in workshops in spring 2019 using Qualtrics software.
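The relative values shown to respondents can be reproduced with a few lines of Python. The sketch below expresses each scenario's indicator value as a percentage deviation from the arithmetic mean of all scenarios in the set; the scenario names and employment figures are hypothetical and were chosen only so that one scenario reproduces the −7% example mentioned above.

```python
def relative_to_mean(values):
    """Return each scenario's deviation from the arithmetic mean, in percent."""
    mean = sum(values.values()) / len(values)
    return {name: 100.0 * (v - mean) / mean for name, v in values.items()}


# Hypothetical employment figures (arbitrary units) for three scenarios.
employment = {"A": 93.0, "B": 100.0, "C": 107.0}

deviations = relative_to_mean(employment)
for name, dev in deviations.items():
    print(f"Scenario {name}: {dev:+.1f}%")
```

Whether a positive deviation counts as "more sustainable" depends on the indicator: for employment, higher values are better, whereas for system costs or land use the opposite holds.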
To account for unobserved heterogeneity, a mixed multinomial logit model was applied [91] when analyzing the data. The utility $U_{njt}$ that individual $n$ derives from alternative $j$ on choice occasion $t$ is given by:

$$U_{njt} = \beta_n' x_{njt} + \varepsilon_{njt}$$

"where $\beta_n$ is a vector of individual-specific coefficients, $x_{njt}$ is a vector of observed attributes relating to individual $n$ and alternative $j$ on choice occasion $t$, and $\varepsilon_{njt}$ is a random term that is assumed to be an independently and identically distributed extreme value" [92].
To enable direct comparison of the β-coefficients, the indicators have been normalized. The weights passed on to the MCDA equate to the mean of the estimated unconditional β-coefficients corresponding to the normalized indicators $x_{njt}$.
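To make the role of the β-coefficients more tangible, the following Python sketch computes logit choice probabilities for one respondent, conditional on that respondent's coefficient vector. With an extreme-value distributed error term, the probability of choosing an alternative is proportional to the exponential of its deterministic utility. In a mixed logit model, these conditional probabilities are additionally averaged over the distribution of the coefficients, which is not shown here. The coefficient and attribute values are hypothetical.

```python
from math import exp


def utility(beta, x):
    """Deterministic part of the utility: the inner product of beta and x."""
    return sum(b * v for b, v in zip(beta, x))


def choice_probabilities(beta, alternatives):
    """Logit probabilities of each alternative, conditional on beta."""
    utils = [utility(beta, x) for x in alternatives]
    shift = max(utils)  # subtracting the maximum avoids overflow in exp()
    exp_utils = [exp(u - shift) for u in utils]
    total = sum(exp_utils)
    return [e / total for e in exp_utils]


# Hypothetical coefficients for two normalized indicators, oriented so that
# the first raises utility and the second lowers it.
beta = [1.0, -0.5]
scenario_a = [0.8, 0.2]
scenario_b = [0.4, 0.6]

probs = choice_probabilities(beta, [scenario_a, scenario_b])
print(f"P(A) = {probs[0]:.3f}, P(B) = {probs[1]:.3f}")
```

Because the indicators are normalized before estimation, a coefficient with a larger magnitude shifts these probabilities more strongly, which is the property that allows the mean coefficients to serve as a basis for the MCDA weights.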

Method of Multi-Criteria Decision Analysis Integrating Impact Assessment Results and Stakeholder Preferences
At the conclusion of the multi-level scenario evaluation, an attempt is made to rank the scenarios based on the previous impact assessment, along with stakeholder preferences. For this purpose, a multi-criteria decision analysis was applied. MCDA describes approaches to assess several alternatives with respect to a set of criteria. It takes into account conflicting targets by first evaluating the performance of the alternatives for each individual criterion and then aggregating these results to derive conclusions [93].
Reasons for using MCDA in energy planning include the ability to consider the interests of multiple actors, the combination of objectivity and subjectivity, and user friendliness, which altogether improves the understanding of the assessed alternatives [94].
In this paper, the weighted sum method (WSM) or simple additive weighting (SAW) was used for the assessment (also see figure on WSM workflow in the Supplementary Material). Several studies with a focus on a life cycle based sustainability assessment of energy systems and technologies use weighted sum methods for MCDA [103][104][105]. This method is widely used because of its simplicity. Other MCDA methods such as TOPSIS or PROMETHEE are also applicable for the analysis to rank the scenarios.
In WSM, the first step is the normalization of the indicator scores. Here, it was performed by min-max normalization. The score $A_i$ for alternative $i$ was calculated by multiplying the normalized alternative score for each criterion ($a_{ij}$) with the criteria weight ($w_j$). Subsequently, the weighted scores were summed over all $n$ criteria, i.e., $A_i = \sum_{j=1}^{n} w_j a_{ij}$. The alternative with the highest total score was the best alternative.
The alternative scores $a_{ij}$ result from the impact assessment (results for scenario $i$ and indicator $j$). The basis for the weights were the β-coefficients resulting from the discrete choice experiment (DCE, see Section 2.8). In order to apply the WSM, results from the discrete choice experiment and the impact assessment had to be harmonized, since the discrete choice experiment used aggregated indicators (see Figure 2), while the impact assessment provided single indicators: the impact assessment delivered six differentiated indicators within the sub-category "human health" (ozone layer depletion, carcinogenic effects, non-carcinogenic effects, respiratory effects, ionizing radiation, photochemical ozone creation) and two differentiated indicators within the sub-category "resources" (minerals-metals and fossils). Disaggregation of the β-coefficients was performed using the weighting set provided under this scheme [64]. To determine the final weighting factors, according to the approach taken by [64], the β-coefficients are multiplied with a robustness factor to take into account that the methods used to survey (environmental) impacts vary in validity. The robustness of the indicators "unemployment rate", "resilience" and "system costs" was assessed by expert judgement within the project team.
For environmental indicators and system costs, cumulative values from 2021-2050 were applied, since most of the environmental burdens and costs do not occur in the target year, but over the whole transformation period. Socio-economic and socio-technical indicators were taken from 2050, as their state in the target year is decisive for the assessment.
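The weighted sum procedure described in this section can be sketched in Python as follows. The snippet applies min-max normalization per indicator and then sums the weighted normalized scores per scenario. The indicator names, raw values and weights are hypothetical. As an assumption, impact-type indicators (where low values are preferred) are inverted during normalization so that a higher normalized score is always better, and an indicator with identical values in all scenarios is assigned a score of zero.

```python
def min_max(values, lower_is_better=True):
    """Scale values to [0, 1] so that 1 is always the most preferable value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # indicator does not discriminate
    if lower_is_better:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_sum(scores, weights, lower_is_better):
    """Return the total score A_i = sum_j w_j * a_ij for each alternative i."""
    n_alternatives = len(next(iter(scores.values())))
    totals = [0.0] * n_alternatives
    for criterion, raw in scores.items():
        normalized = min_max(raw, lower_is_better[criterion])
        for i, a in enumerate(normalized):
            totals[i] += weights[criterion] * a
    return totals


# Hypothetical raw indicator results for three scenarios.
scores = {
    "climate": [100.0, 80.0, 60.0],
    "costs": [50.0, 55.0, 70.0],
    "diversity": [0.3, 0.5, 0.4],
}
weights = {"climate": 0.5, "costs": 0.3, "diversity": 0.2}
lower_is_better = {"climate": True, "costs": True, "diversity": False}

totals = weighted_sum(scores, weights, lower_is_better)
best = max(range(len(totals)), key=totals.__getitem__)
print(totals, "best scenario index:", best)
```

Note that min-max normalization maps the best and worst scenario of each indicator to 1 and 0 regardless of how close their raw values are, so small differences between scenarios can have a large influence on the total score.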

Results
As mentioned above in Section 2, the impact assessment of the transformation scenarios can be performed on two levels: On the first analysis level, the scenario performance with respect to individual impacts is determined. This allows an analysis of strengths and weaknesses of individual scenarios and a comparison of the performance of different scenarios with respect to individual impacts. This level of analysis is illustrated by way of examples in Sections 3.1 and 3.2 for ecological and socio-economic impacts. A detailed analysis, however, is beyond the scope of this paper. Additional information, e.g., on results for the systemic indicator "diversity" and the total system costs can be found in the Supplementary Material.
On the second analysis level, a multi-criteria decision analysis using citizens' preferences allows us to rank the various scenarios in terms of their sustainability performance, as shown in Sections 3.3 and 3.4. Here, as well, only exemplary results are shown.

Ecologic Impacts
Figure 4 compares the cumulated life cycle-based greenhouse gas (GHG) emissions between 2021 and 2050 in all ten scenarios. The direct GHG emissions, i.e., the emissions from the foreground technologies explicitly represented by the energy system model, are shown in red. It can be seen that the cumulated direct emissions in scenarios I-V are higher than those in scenarios VI-X. This result was expected, as scenarios I-V achieve a reduction of direct CO2 emissions of around 82%, on average, by 2050 (compared with 1990). In contrast, the more ambitious scenarios VI-X achieve a reduction of (on average) around 95%, and CO2 emissions represent by far the largest share of all direct GHG emissions. However, if indirect impacts are also taken into account, the picture changes (indirect impacts stem from upstream processes not explicitly considered in the model, such as the construction of plants or the cultivation of energy crops for biofuel generation): due to relatively high "background" impacts, the cumulative life cycle-based greenhouse gas emissions of some "very ambitious" scenarios (Scen IX and X) even surpass those of scenarios describing pathways to a lower reduction in direct CO2 emissions. This result implies that a life cycle perspective is necessary when aiming at a GHG emission reduction. Note that scenarios I-V aim at a reduction in direct CO2 emissions by ca. 82% by 2050 relative to 1990, and scenarios VI-X by ca. 95%.
As Figure 5 illustrates, the transformation of the energy system towards climate friendliness may have a number of co-benefits, as well as adverse side effects, in the ecologic domain. The figure shows the relative change of each ecological indicator between today's energy system (2020) and 2050 in all scenarios. For some indicators, the environmental impact decreases in this period in all scenarios. This is the case for the indicators "climate change", "freshwater eutrophication", "ionizing radiation" and "fossil (and uranium) resources". This reduction is closely related to the shutdown of nuclear and fossil fuel fired power plants in Germany in all scenarios. However, other indicators (such as land use and the demand for minerals and metals) significantly increase in the same period in (almost) all scenarios. Higher land use is mainly caused by the cultivation of energy crops and by the higher specific land use (per kWh output) of renewable power generation (in particular, wind and PV) compared to conventional power plants. Renewable power generation is also more material intensive than conventional generation per kWh of output, thus leading to high values for the minerals and metals impact indicator. This is also true for "new" technologies in the transformation of the transport sector (e.g., platinum in fuel cells for fuel cell electric vehicles). For the remaining impact categories, impacts increase or decrease between 2020 and 2050, depending on the technology mix in the specific scenario. It can therefore be stated that climate-friendly transformation strategies are not automatically beneficial in all other environmental impact categories, and that they have different strengths and weaknesses. The results for the ecological indicators demonstrate the need for deeper analyses to derive an overall conclusion as to how advantageous the different transformation strategies are.
Moreover, the shown ambiguity means that an MCDA would even be useful for ecological indicators alone, but even more so when all indicators are taken into account.

Macro-Economic Effects
From an economic perspective, several indicators are of interest. The most commonly reported indicator is, despite all its flaws, the Gross Domestic Product (GDP), i.e., the summed value of all marketable products and services in the economy. Annual changes in GDP are the typical growth rates associated with economic well-being. Forward-looking economic activities are reflected by total investment (in construction and in equipment). Employment secures household incomes and social security systems, in turn generating demand and added value. To accentuate the social importance of employment, an indicator for the unemployment rate is added. It relates the number of unemployed persons in a respective year to the number of persons in the labor force in the base year (2017). Figure 6 provides an overview of the simulation results for the years 2030 to 2050. When comparing these scenarios, the first thing that stands out is that the economic quantities are very close to each other. All scenarios represent an energy system transformation, with investment in new technologies and an energy mix that is fundamentally different from the past one. More ambitious scenarios (bottom five rows) predominantly exhibit, on average, larger positive effects on GDP and employment than the scenarios aiming at 80% GHG reduction compared to 1990 (top five rows). Investment is higher in these cases, leading to positive economic effects. Scenarios with high investment in technologies produced and operated in Germany tend to exhibit higher employment. The development of GDP also depends on the evolution of price levels, operation and maintenance costs, as well as imports along the pathway. The comparably low employment in the ambitious Scen X, for instance, is the result of an investment level comparable to that of an 80% scenario and high energy imports (as in other ambitious scenarios).
The unemployment rate varies little between the 80% scenarios, while, under the ambitious scenarios, there is a 0.5 percentage point difference between the highest (Scen X) and lowest (Scen IX) values. To better understand economic effects, an analysis of differences between scenarios is typically applied. In our analysis, we did not explicitly specify a counterfactual scenario (i.e., an alternative development in which no energy transition path is followed), as is done, for instance, in [52] or [75]. If the results had been compared to a counterfactual scenario with no energy transition at all, all economic effects could be expected to be larger. However, it is reasonable to presume that the energy transition has been decided, and the sustainability analysis only aims at comparing the different pathways to attain the targets. Therefore, we have selected one of the 80% scenarios (Scenario I) as the reference and compare the simulation results from the other nine scenarios against it. Relative differences in percent illustrate the size of the impact of a scenario, with all other factors held equal (ceteris paribus assumption). From relative differences, one observes whether the impact on a certain economic sector or quantity is large or small, compared to what it is without the respective scenario assumptions. Figure 7 shows the relative differences in employment for the year 2050 and seven sectors (groups of ISIC Rev. 4 sections). The scenarios have mixed effects on employment in the respective sectors. Construction shows the largest positive effect in five scenarios and very small negative effects in three of the remaining four scenarios. Scen X shows less employment than the reference scenario, because it mostly relies on hydrogen imports, with low investment in domestic infrastructure or technology.
Investment activity leads to higher demand and to higher employment in construction, business services, industry and public and private services (especially in the long term). Note that the ceteris paribus assumption also contains the general structural change toward services in all scenarios. Overall employment will decrease in all scenarios until 2050, due to demographic change. The labor force will decrease accordingly.
For the impact assessment, a comparison of regional results completes the picture. An assessment of effects for the 16 federal states shows that there are three ambitious scenarios among the five scenarios with the lowest regional inequality. However, the regional distribution of additional investment has at least as strong an influence as the technology mix in the transformation. Although regional differences between the scenarios are small, the results can help to support the acceptance of the energy transition if used in political communication.
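The comparison against a reference scenario used in this section amounts to a simple relative difference. A minimal Python sketch, with hypothetical scenario names and employment figures, could look as follows.

```python
def relative_difference_pct(results, reference):
    """Percentage difference of each scenario relative to the reference scenario."""
    ref = results[reference]
    return {
        scenario: 100.0 * (value - ref) / ref
        for scenario, value in results.items()
        if scenario != reference
    }


# Hypothetical employment in one sector in 2050 (million persons).
employment_2050 = {"Scen I": 40.0, "Scen II": 40.4, "Scen X": 39.8}

differences = relative_difference_pct(employment_2050, reference="Scen I")
for scenario, diff in differences.items():
    print(f"{scenario}: {diff:+.2f}% vs. Scen I")
```

Since the reference is itself an energy transition scenario, such differences only describe how the pathways deviate from one another, not the effect of the energy transition as a whole.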

Stakeholders' Preferences: Results from the Discrete Choice Experiment and the Focus Groups
The data gathered by the discrete choice experiment (see Section 2.6) have been analyzed using a mixed logit model [91]. Results (Figure 8) show that the overall model is highly significant, and that seven indicators made a significant contribution to the perceived utility of a scenario (implying a p-value < 0.05; p-values for single coefficients are reported in column P > |z|, p-values for the overall models are reported in row P > χ²). Since the indicators had been normalized before the analysis, the coefficients (β) indicate the relative importance within each group. However, coefficients cannot be compared directly between different groups; only the relation between different coefficients within a group is suited for group comparisons. The upper part of the table displays the mean coefficients; the lower part reveals whether the distribution of coefficients departs significantly from a normal distribution, suggesting that important predictors influencing the effect of these variables have been omitted by the model (all variables for which this is not the case, i.e., costs, security and employment, have been modelled with fixed effects only, and thus do not show up in the lower part of the table). For the whole sample, the model shows a McFadden R² of 0.52, meaning that roughly half of the variance in scenario choice can be explained by the model. The β-coefficients for all participants ("mean" values in the first row in Figure 8) are the stakeholders' preferences used as a basis for weights in the MCDA analysis in Section 3.4.
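For readers unfamiliar with the McFadden R², the following Python sketch shows how this goodness-of-fit measure is defined: one minus the ratio of the log-likelihood of the fitted model to that of a null model. As an assumption, the null model here assigns equal probability to both scenarios in each of the 1488 pairwise comparisons; the study may have used a different baseline. The log-likelihood of the fitted model is a hypothetical value chosen only to illustrate the calculation.

```python
from math import log


def null_log_likelihood(n_choices, n_alternatives):
    """Log-likelihood when every alternative is equally likely in every choice."""
    return n_choices * log(1.0 / n_alternatives)


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - LL(model) / LL(null)."""
    return 1.0 - ll_model / ll_null


ll_null = null_log_likelihood(n_choices=1488, n_alternatives=2)
ll_model = 0.48 * ll_null  # hypothetical fitted log-likelihood

r2 = mcfadden_r2(ll_model, ll_null)
print(f"LL(null) = {ll_null:.1f}, McFadden R2 = {r2:.2f}")
```

The measure is based on likelihoods rather than on explained variance in the strict sense of a linear regression, so values are not directly comparable to an ordinary R².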
Overall, climate effects had by far the biggest impact on the preference for a scenario, followed by health effects, resource depletion and security of supply. The qualitative analysis confirms that climate change is, in every age group, a recurring, non-controversial narrative which serves as a motivational background for changes in the energy system. There were significant differences among the respondents in the evaluation of the importance of climate effects, health effects, land use and resource depletion, which do not follow a normal distribution, indicating that there are important unobserved factors influencing the evaluation of these variables. Various knowledge indices about renewable energies had been queried in the DCE but showed no significant effect on the evaluation of the indicators.
In comparison to the overall sample, system costs and employment effects did not show significant effects on the choice of preferred scenarios among students. Qualitative analysis of the arguments presented by the students shows a multitude of different attitudes towards costs: some emphasize them because they lack sufficient financial stability, others counter that health, climate and human wellbeing should never be compromised by money, and still others do not want to rely on the financial numbers presented (see quotation at the end of this section). Security of supply and health effects were more important for scenario choice among students, in relation to climate effects, than in other groups; land use and health effects seem less important for students.
Unsurprisingly, results for job holders resemble, more or less, those of the overall sample, since this group makes up the biggest part of the sample. However, two divergences attract attention: the effect of system costs on scenario choice is not significant for this group, and resource depletion is valued as more important than in other groups. Retired persons seem to attribute far more importance to health effects, security of supply and system costs than other groups, though their arguments presented within the focus groups do not mirror the topic of cost sensitivity beyond the average level among all groups.
The results of the qualitative analysis of the focus groups mirror the quantitative results and enhance them with deeper insights into the participants' reasoning regarding the assessment of scenarios and energy technologies.
We observed that the financial burdens and the employment impacts of the energy transition are of more importance to younger people, as their socio-economic resources are limited (see quotations A1, A2, A3 in the Supplementary Material). The retired people we asked do not fear financial insecurity for themselves to the same degree as young people. They debate more about how frugality and renouncement add to a sustainable lifestyle. They are also concerned about how to translate and compare the needs and definitions of quality of life among people to find fair solutions and arrangements (see quotations A4 and A5 in the Supplementary Material).
Results of the DC-experiment were integrated into an MCDA of the 10 re-modelled scenarios (see 2.2 for scenario selection, 3.5 for MCDA results). Additionally, these results were supposed to identify preferred points on a Pareto front optimizing for climate effects vs. system costs (see [35]). However, regarding the indicators captured by the DC-experiment and other social indicators (unemployment, GDP, regional disparity), the results of different scenarios on the Pareto front show little variance; the biggest differences between these scenarios are in terms of land use (5.7%), system costs (2.6%) and climate effects (2.2%), and other indicators differ by less than 1%. Land use and climate effects have shown a much bigger influence on scenario preference in the DC-experiment than system costs, and an increase in system costs of 2.6% corresponds to a reduction in land use of 5.7%, as well as a reduction in adverse climate effects of 2.2%. Thus, it seems reasonable that citizens would prefer the scenario with the higher system costs, as it performs better on the other aspects. The qualitative analysis mirrors this, though some doubts about the reliability of the numbers presented were also voiced (see quotations A6 and A7 in the Supplementary Material).

Integration of Impact Assessment and DCE into MCDA
In the multi-criteria analysis, the results of the discrete choice experiment were used to form the weights in the weighted sum method that was applied to the results of the impact assessment. The final weighting factors are shown in Table 3. For a detailed derivation of the final weighting factors, we refer to the Supplementary Material. The results from the weighted sum method show that the first rank is taken by scenario VIII. Recall that, in contrast to impacts, where low values are preferred, high scores here refer to more preferable scenarios. Scenarios VII and VI, which have similar weighted sum scores, rank second and third, respectively. Their score distance to rank one is rather high. Scenario X ranks last. Figure 9 shows all ranks and WSM score results broken down by indicator shares. Climate change dominates the weighted sum, and the overall ranking correlates with the ranking for climate change as a single (life cycle) indicator. System costs, unemployment rate and diversity also have relatively high shares, but they are not decisive because the sum of these three weighted indicator scores is very similar across most scenarios. Since the min-max normalization applied here emphasizes small differences in indicator scores, the range of resulting weighted sums is rather wide, between 0.1 and 0.8. Based on these results, it can already be stated that scenarios aiming at an ambitious reduction in CO2 emissions (approximately 95% compared to 1990) do not necessarily perform better (or worse) in terms of an integrated sustainability assessment than scenarios that only achieve a reduction in emissions in the order of 80%. The results also show that the more ambitious scenarios score very differently compared to one another, ranking both best and worst in the scenario comparison. In contrast, the range of weighted sums is narrower for the 80% CO2 emission reduction scenarios, which rank between fourth and eighth of all ten scenarios.
However, to achieve a robust statement of this type, methodological sensitivity tests are needed, which are the subject of separate work.

Discussion
As this paper focuses on presenting the methodology of a multidimensional impact assessment and multi-level evaluation (see Figure 1), the discussion here will be mainly limited to methodological aspects.
The selection of the indicators for an integrated sustainability assessment approach requires early and intensive coordination between all parties involved. The goal is to define a common set of indicators whose future development can be estimated by (model) instruments, but which is also usable and understandable for stakeholder engagement. Here, a conflict has to be resolved: for a comprehensive analysis of the possible impacts of a system, as many relevant indicators as possible should be considered; for MCDA, however, the number of indicators should not become too large.
Although it is now clearly established that the transformation of the energy system is a socio-technical process, it is still a research task to identify and operationalize relevant social indicators beyond economic and health aspects that can reasonably be assessed in a prospective manner (i.e., time horizon from 2050 and beyond). Regarding systemic indicators, a first step was made here to use the Stirling Index for describing the diversity of the power supply as an aspect of its potential resilience. However, a more elaborated approach to assessing the resilience of energy systems requires further methodological developments and a significantly expanded set of models (for example, power grid simulation models). This was beyond the scope of this study.
Standards for reporting scenario data would help to better compare scenarios from different sources. The same applies to harmonized boundary conditions and assumptions in the studies (scenario frameworks). Since standardized and harmonized scenario data from different sources are generally not available, a harmonized re-modeling of transformation strategies is necessary if one wants to avoid biases due to different boundary conditions. Focusing on sustainability aspects of different supply side strategies, including transportation (as in this study), ignores that the demand side is a key lever to significantly reducing energy-related greenhouse gas emissions. Eco-sufficiency strategies, and the resulting reduction of energy-related services and consumption, or the modal shift in transportation, could have been addressed within the set of models and competences available here. However, the complexity and diversity of technologies on the consumption side (building insulation, electrical appliances, industrial processes, etc.), as well as the availability of LCI and cost data, are major barriers. Therefore, as a first step, it seemed reasonable to focus on an impact assessment for different technical strategies on the supply side, knowing that this is only one side of the coin.
The focus on Germany makes it difficult to appropriately re-model the scenarios, since it is to be expected that, with higher renewable energy shares, Europe-wide electricity exchange will also play an increasingly important role in the German energy system. Thus, it is advisable to extend the geographical horizon of such an analysis to at least the European level. However, with possibly increasing relevance of PtX imports from outside Europe, even a European level might be too narrow.
An integrated, harmonized approach for a multidimensional impact assessment of scenarios required the development of a harmonized framework comprising energy system models, macroeconomic models and models assessing (life cycle-based) environmental impacts. The effort to couple the necessary models and harmonize the data is not low, but manageable in the context of larger projects.
Within the approach presented here, impacts of transformation strategies for the entire energy system are assessed. A systematic comparison of different transformation strategies for individual sectors would be interesting, but could not be implemented within the scope of the work due to the great variety of possible scenario variants and limited available resources.
Coupling energy system models with LCI data makes it possible to determine life cycle-based environmental impacts of entire transformation strategies. However, there are a number of challenges to this, such as the availability, quality and representativeness of prospective LCI data for both the foreground technologies and the background database, the avoidance of double counting of impacts between the foreground and background systems, and uncertainties in impact assessment methods. Prospective adjustments of technologies in the background LCI database are a major methodological challenge, the solution to which could also benefit from increased cooperation between institutions and a move to open-source solutions. It would be helpful if a shared prospective LCI database for energy and transportation technologies could be developed and maintained within the life cycle assessment community or by LCI database providers. There is a need for harmonization and standardization of prospective LCI data in the energy sector.
Coupling energy system models with macroeconomic models makes it possible to determine macroeconomic consequences of different transformation pathways. However, a number of issues complicate this analysis. The long time horizon of the analysis and the fundamental change of the analyzed energy system require numerous exogenous assumptions as boundary conditions for the macroeconomic model. The type of macroeconomic model used here is fundamentally based on historical contexts and behaviors. As a consequence, many further exogenous adjustments to macroeconomic interactions are usually necessary to adequately and comprehensively describe possible future developments.
Determining uncertainties for all selected indicators was not possible within the scope of this project. Nevertheless, it would make sense to also include the uncertainty of the impact assessment of indicators in an MCDA for transformation strategies in the future in order to achieve more robust results.
The new approach of applying discrete choice experiments to energy scenarios is promising, but it comes with challenges. The simultaneous complexity and high abstraction level of the future represented in the scenarios make it difficult to find an appropriate set of indicators and a condensed representation of the future that is suitable for a discrete choice experiment.
Indicator weights are crucial for MCDA results. This highlights the importance of providing a solid basis for determining the weights, which should ideally involve a variety of stakeholders. On the other hand, the large number of criteria and their complexity require an extended understanding, which cannot be assumed by every stakeholder.
The integrated multidimensional sustainability assessments presented in this paper highlight the importance of assessing energy transformation pathways in a holistic manner. They go beyond analyses focused exclusively on climate change and can be utilized by researchers and practitioners to support and develop future energy policies and strategies that contribute to global sustainable development. Further results of this multidimensional assessment will be presented in separate publications, which also will focus on the implications of the findings.

Conclusions and Outlook
In this paper, an approach for an integrated, multidimensional sustainability assessment of energy transformation scenarios was presented. The approach combines a detailed model-based assessment of (life cycle) environmental impacts and macroeconomic consequences of transformation strategies, with stakeholder-based indicator preferences and a multi-criteria impact assessment.
The strengths of the integrated approach for a sustainability assessment of energy scenarios presented here are manifold. First of all, the harmonized re-modeling of the scenarios and the extensive coupling of energy system, macroeconomic and LCA models allow a consistent and unbiased comparison of the ecological and macroeconomic impacts of different (supply side) strategies for the transformation of the German energy system.
The multidimensional impact assessment, i.e., the explicit modeling of economic and ecological effects, including an extensive assessment of stakeholders' preferences, enables the identification of conflicting goals and co-benefits between sustainability dimensions. The extensive consideration of ecological and macroeconomic aspects in particular suggests that many (if not most) relevant aspects of a sustainable transformation (at least on a national level) can be addressed by this approach. The detailed environmental impact assessment allows an analysis of the underlying cause-effect relationships between scenario features (transformation strategies) and related impacts, identifying possible future hot spot sectors and technologies.
On the other hand, the MCDA significantly reduces the complexity of the multidimensional impact assessment by aggregating and reducing the diversity of environmental and economic analysis to a reasonable and more understandable level. It thus allows the drawing of clear conclusions from a very complex analysis.
However, as discussed extensively above, a number of methodological challenges remain that have to be addressed in future research. These include, among others, a comprehensive and compelling (yet pragmatic) indicator selection (considering both the capabilities of the models and the requirements for stakeholder engagement and MCDA), methodological challenges arising from prospective LCA, LCI data availability and quality, common system boundaries for all models, the integration of demand side strategies in the assessment and the estimation of uncertainties of the analyzed impacts. Furthermore, it would be very helpful to better understand which scenario features are responsible for a good or bad performance in the MCDA. This would allow insights from previous MCDAs to be used for an iterative improvement of transformation strategies in a follow-up scenario development exercise. Along with approaches such as multi-objective optimization, this could eventually lead to the development of transformation strategies for energy systems that are more sustainable (and, as a consequence, hopefully more acceptable) in a broad sense of the term.

Conflicts of Interest:
The authors declare no conflict of interest.