Assessing the Performance of Sustainable Development Goals of EU Countries: Hard and Soft Data Integration

Abstract: The European Union (EU) energy policy for sustainable development has been the topic of continuous debate, research, and analysis, which frequently focused on objectives and the evaluation of quantitative and qualitative performance. Different approaches can be used for the assessment of sustainable development goals. The authors of the article conducted a literature review of relevant research papers dated 2016–2020. The most common are quantitative methods based on hard data. Some qualitative studies based on soft data are also available but rare. This article proposes hybrid Rough Set Data Envelopment Analysis (DEA) and Rough Set Network DEA models that integrate both approaches. The models also allow the inclusion of uncertainty in the underlying data. The article uses hard data of the International Energy Agency (IEA) and the results of the EU survey regarding the influence of the socio-economic environment on CO2 emissions in EU countries. The authors demonstrate that multifaceted and objective assessment is possible by merging concepts from the set theory and operational research.


Introduction
Energy is the main driver for economic growth; however, it is also the leading cause of CO2 emissions [1]. A shift away from high-carbon energy sources resulted from depleting fossil fuels and the growing awareness of the negative impact made by economic development on the environment. There is a global consensus that energy needs to become eco-friendly [2]. Based on the United Nations Agenda 2030 [3], sustainable development is impossible without "access to affordable, reliable, sustainable, and modern energy for all".
Many economies have set ambitious goals for their energy sectors, emphasizing safety, sustainability, environmental friendliness, resource-efficiency, and low-carbon dependency [4]. The European Union (EU) is especially active in introducing a relevant regulatory framework and policies [5,6]. Policy instruments for smarter and more sustainable electricity in the EU include a wide range of directives, strategies, and roadmaps with targets for a low-carbon economy, energy efficiency, and renewable resources [4]. At the micro-level, European enterprises are also encouraged to innovate responsibly [7,8] and to assess newly developed technologies [9], considering their intended and unintended environmental and social impact.
Debates, research efforts, and analyses have been focusing on ways to measure and monitor the progress toward sustainability. Indicators commonly used for those purposes show the indivisibility of the relationship between energy and development. Instead of absolute values, the impact made by economies on the natural environment is measured by the share of renewable energy in the total final energy consumption, energy intensity per unit of GDP [3], and CO2 per unit of value-added [10]. However, other issues should also be considered. For example, energy sustainability is impossible without sufficient investments in energy efficiency, expressed as a percentage of GDP or direct investments in infrastructure and technology [3]. It is also important to compare the progress of countries considering many aspects of sustainable energy and development as well as find ways to aggregate the final assessment [11][12][13].
The review of the literature on the performance of sustainable development goals revealed many available approaches. The methods used range from a comparison of simple indicators, via multivariate statistical analysis, to an adaptation of multicriteria decision-making methods. In particular, the latter (the adaptation of multicriteria decision-making methods) is a rapidly growing research area [14]. However, most efforts are limited to the analysis of hard quantitative data that describe the progress of a country, leaving out the context-dependency defined by the soft qualitative data.
Papers that use qualitative data either present it as an evaluation of policies and initiatives transformed into simple dummy variables [15,16], or consider it indirectly as various ranking scores [11]. In some instances, the data is presented as expert opinions [17] or importance weights determined by experts [18]. Qualitative data often contains more information. However, it is also ambiguous.
A dismissal or oversimplification of qualitative information means the rejection of a vast amount of knowledge. The success of a policy to reduce CO2 emissions depends not only on economic prosperity measured by GDP per capita but also on priorities, set directly and indirectly by the public. The significance of both factors is evident in the midst of unprecedented challenges faced by societies, such as the expected recession and global changes resulting from the COVID-19 pandemic.
The originality of this article lies in the integration of soft and hard data on the socio-economic environment. The proposal is to use different types of data, i.e., social authorization, the readiness for change, the standard of living, and efforts made toward the decoupling of economic growth and carbon emissions. The article proposes the use of hybrid models, i.e., Rough Set Data Envelopment Analysis (DEA) and Rough Set Network DEA, to deal with uncertainty in the underlying data used for assessing the performance of sustainable development goals in EU countries.
The article has the following structure. First, it provides a solid review of papers issued in 2016-2020 on monitoring and assessing efforts of EU countries toward energy sustainability. In this part, it also indicates methods and goals used as the background for a proposed approach. Next, Rough Set theory, DEA, and network DEA models are used to deal with uncertain data in the case of sustainable development assessment. Then, a case study is discussed, and results of the analysis are presented. The article finishes with conclusions.

Background Literature
Many research projects originated from the importance of sustainable development and free public access to EU data aggregated at a national level. Table 1 presents the results of the literature review conducted in the Scopus database. The analysis targeted relevant articles that focused on the EU and were published in 2016-2020. The summary below indicates the methods and aims of each study.

[32]
Method: Multidimensional scaling data reduction method and CA
Data: energy indicators, e.g., energy imports, energy use, energy production, capacity, the share of RES, GHG
Aim: To measure the differences between the countries in the Eastern European region in terms of RE and economic development

Lindberg and Markard (2019) [6]
Method: Transition pathway (semi-coherent pattern of major changes) analysis
Data: list of key EU electricity policies and their key industry actors
Aim: To assess the EU electricity policy mix supporting different transition pathways

Lyeonov, Pimonenko, Bilan, Štreimikienė, and Mentel (2019) [33]
Method: Modified OLS
Data: GDP per capita and GHG emissions, RE consumption, green investment
Aim: To analyze the linkages between GDP per capita, GHG and RE in the total final energy consumption and green investments in the EU

Malinauskaite, Jouhara, Ahmad, Milani, Montorsi, and Venturelli (2019) [5]
Method: Descriptive statistics analysis
Data: energy consumption trends, sources and sectors, energy savings
Aim: To review EU strategies and policies on energy efficiency; to present national case studies for Italy and the UK

Mikalauskiene, Štreimikis, Mikalauskas, Stankūnienė, and Dapkus (2019) [34]
Method: Descriptive statistics analysis
Data: GHG emissions and removals by sector, a set of indicators for the assessment of energy intensity, the structure of consumption, dependency, shares of RES in sectors
Aim: To assess GHG emission trends and climate change mitigation policies in the fuel combustion sector of Lithuania and Bulgaria

Neofytou, Karakosta, and Gómez (2019) [18]
Method: Promethee II
Data: 12 indicators from 4 dimensions: environmental impacts, e.g., GHG reduction, energy saving; social impact, e.g., employment; economic impacts, e.g., GDP; energy systems impacts, e.g., import, intensity
Aim: To assess alternative climate and energy policy scenarios and their socioeconomic, environmental, and energy impacts

Pach-Gurgul and Ulbrych (2019) [35]
Method: Hellwig's multidimensional comparative analysis
Data: energy consumption, the share of RE in energy consumption
Aim: To empirically verify progress made implementing the provisions of the EU Energy Package by the V4 countries

Siksnelyte and Zavadskas (2019) [4]
Method: MCDM, TOPSIS
Data: indicators for monitoring the progress (electricity interconnection, market concentration, electricity prices, retail electricity markets, share of RES in final electricity consumption), indicators for the assessment of sustainability: economic (e.g., prices), environmental (e.g., share of RES, distribution losses), security (import)
Aim: To monitor the progress of the electricity sector toward EU objectives; assess the sector sustainability

Siksnelyte, Zavadskas, Bausys, and Streimikiene (2019) [36]
Method: MCDM, MULTIMOORA (optimization based on the Ratio Analysis technique)
Data: indicators for monitoring the progress of energy import dependency and energy security (e.g., import, supplier concentration), indicators for monitoring the progress of decarbonization (e.g., energy consumption, GHG emission), national energy targets and their implementation, a set of EISD indicators for the comparative assessment of sustainability: social (e.g., affordability of electricity), economic (e.g., energy use and productivity), and environmental (e.g., GHG emissions)
Aim: To present the EU energy policy context; to analyze trends in energy development in eight Baltic Sea Region countries

Arbolino, Boffardi, Simone, and Ioppolo (2020) [13]
Method: Efficiency index based on normalization, weighting, and aggregation and PCA
Data: indicators from dimensions: sectoral trends (e.g., GDP, energy intensity per capita, RE production per capita, energy consumption), interaction with the environment (e.g., CO2 emissions), economic and policy aspects (e.g., Tax, R&D Expenditure)
Aim: To propose an approach for achieving increased energy efficiency; to present the test on a sample of 20 Italian provinces

Based on the review, the interest in the problem of objective assessment is high and constant. Most papers focused on the results of multidirectional efforts toward sustainable development and the reduction of its negative impact on the environment. Some papers investigated the speed of changes taking place in this area, e.g., the transition to renewable energy sources.
The regression models can be divided into simple equations [28,33,38] and extended models with multiple variables [15,27]. Some authors created rankings based on several indicators [12] or on sets of indicators that described different dimensions [4,11,13,36,37]. Policy and strategy studies were mostly a yes/no analysis of documents [16] or of dependencies between them [6]. The portfolio of methods is extensive and still open for suggestions.
The main advantage of the DEA method is a comparative analysis of multiple variables. The method was applied empirically in many areas [39]. DEA-based assessment of sustainable development goals allows for the relativization of performance based on the underlying relationship between the weighted sum of results and the weighted sum of costs [40]. The issue of weighting is solved objectively by linear programming. Table 2 presents examples of DEA applications for the assessment of sustainability in EU countries. In this case, the inputs and outputs are given as a reference to previous studies and practice. DEA-based approaches to the assessment of sustainability in different countries were used for various purposes. In some instances, the focus was placed on efficiency while transforming hard data on labor, capital, and energy into GDP. Other papers considered the associated production of pollutants (mainly greenhouse gas emissions) [45,48,49] and renewables [45][46][47]. Changing trends of RE amounts [46,48,49] were another popular area of assessment.
The cited papers assumed that the underlying data were solid and reliable. Most of them analyzed absolute energy sustainability indicators from statistical databases. The few cases that used qualitative data treated it in one of two ways: it was either converted into zero-one numerical variables [15,16] or used directly as expert opinions [17,18]. Converting qualitative data into quantitative data without considering its subjectivity and the resulting uncertainty is associated with a loss of information, shallow conclusions, and misinterpretation.
There are many approaches to the incorporation of uncertainty into DEA, e.g., the chance-constrained DEA model [74]. In the context of sustainability assessment, uncertainty in data is mainly modeled using concepts derived from Fuzzy Sets [75,76]. For example, DEA and fuzzy best and worst methods were combined and used to prioritize renewable energy sources [77]. Fuzzy numbers were used to describe the ambiguity of a qualitative index used to assess an RE technical plan [78]. This article proposes a hybrid model that integrates the Rough Set theory and DEA to model the vagueness of data. Although Rough Sets differ in some of their original assumptions, some authors treat them as an approach derived from Fuzzy Sets [79,80]. The merging of concepts from the set theory and operational research addresses the need for multifaceted and objective assessment, as it proved to be effective in the case of technology prioritization [81]. Fuzzy sets in a non-empty space are based on the membership function, and its use raises the basic problem of choosing the method of its construction. The Rough Set theory does not require special assumptions about the data or their probability distribution, is characterized by mathematical simplicity, and has found many applications. However, the combination of Rough Sets with the DEA method is still rarely used.

Methods
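Data Envelopment Analysis (DEA) is a linear programming technique for evaluating the relative efficiency of Decision-Making Units (DMUs). The method was initiated by Charnes, Cooper, and Rhodes (1978) [53] based on a paper by Farrell (1957) [82] and his concept of the best practice frontier. The DEA method uses the idea of technical efficiency. It evaluates the performance of each unit by relating its outputs to its inputs and comparing this relationship with its value in the other DMUs covered by the study (Figure 1). For the evaluated DMU $o$ among $n$ DMUs, with inputs $x_{ij}$, $i = 1, \dots, m$, and outputs $y_{rj}$, $r = 1, \dots, s$, $j = 1, \dots, n$, the output-oriented BCC model is:

$$
\begin{aligned}
\max\ & \theta \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{io}, \quad i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge \theta\, y_{ro}, \quad r = 1, \dots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
\tag{1}
$$

The score $1/\theta$, specified in Equation (1), ranges from 0 for the worst performing units to 100% for the best ones. The symbol $\lambda_j$ represents the weight of DMU $j$.

In the case of sustainability assessment and comparisons, data ($x_{ij}$, $y_{rj}$) are often presented and interpreted in the form of ratios rather than in absolute numbers (e.g., GDP per capita, or CO2 per GDP). On the other hand, the subjects of decision are the numerators ($x_{ij}^{N}$, $y_{rj}^{N}$) and denominators ($x_{ij}^{D}$, $y_{rj}^{D}$) of these fractions ($x_{ij}^{N}/x_{ij}^{D}$, $y_{rj}^{N}/y_{rj}^{D}$). The basic standard CCR model (Charnes, Cooper, and Rhodes [20] DEA model) is technically incorrect when data are used in the form of ratios [83][84][85][86]. Assuming $R^{I} \subseteq I$ and $R^{O} \subseteq O$ represent, respectively, ratio inputs and ratio outputs, $I \setminus R^{I}$ and $O \setminus R^{O}$ represent absolute (non-relative) inputs and outputs and, furthermore, $x_{ij} = x_{ij}^{N}/x_{ij}^{D}$ for each $i \in R^{I}$ and $y_{rj} = y_{rj}^{N}/y_{rj}^{D}$ for each $r \in R^{O}$, the BCC model, Equation (1), can be deployed [83] or the nonlinear solution [84,85], Equation (2):

$$
\begin{aligned}
\max\ & \theta \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} \le x_{io}, \quad i \in I \setminus R^{I}, \\
& \frac{\sum_{j=1}^{n} \lambda_j x_{ij}^{N}}{\sum_{j=1}^{n} \lambda_j x_{ij}^{D}} \le x_{io}, \quad i \in R^{I}, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge \theta\, y_{ro}, \quad r \in O \setminus R^{O}, \\
& \frac{\sum_{j=1}^{n} \lambda_j y_{rj}^{N}}{\sum_{j=1}^{n} \lambda_j y_{rj}^{D}} \ge \theta\, y_{ro}, \quad r \in R^{O}, \\
& \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
\tag{2}
$$

Symbols are used as in Equation (1).

This paper proposes to integrate DEA and Rough Set methods to deal with data uncertainty [87]. The Rough Set theory, introduced by Pawlak (1982) [88], is a mathematical approach to vagueness and uncertainty. Uncertainty modelling with the concept of rough variables of inputs and outputs [89], $x_{ij} = ([x_{ij}^{a}, x_{ij}^{b}], [x_{ij}^{c}, x_{ij}^{d}])$ and $y_{rj} = ([y_{rj}^{a}, y_{rj}^{b}], [y_{rj}^{c}, y_{rj}^{d}])$, where $x_{ij}^{c} \le x_{ij}^{a} \le x_{ij}^{b} \le x_{ij}^{d}$, $y_{rj}^{c} \le y_{rj}^{a} \le y_{rj}^{b} \le y_{rj}^{d}$, $i = 1, \dots, m$, $j = 1, \dots, n$, and $r = 1, \dots, s$, is presented in Figure 2.

The ranges $[x_{ij}^{a}, x_{ij}^{b}]$ and $[y_{rj}^{a}, y_{rj}^{b}]$ represent lower approximations, and $[x_{ij}^{c}, x_{ij}^{d}]$ and $[y_{rj}^{c}, y_{rj}^{d}]$ upper approximations (boundaries), of the unknown real values of input $x_{ij}$ and output $y_{rj}$, respectively. Using the concept of trust in rough variables and assuming the level $\alpha$, such that $0.5 \le \alpha \le 1$, $\alpha$-optimistic and $\alpha$-pessimistic values can be calculated for each rough variable and each DMU: $x_{ij}^{\sup(\alpha)}$, $x_{ij}^{\inf(\alpha)}$, $y_{rj}^{\sup(\alpha)}$, $y_{rj}^{\inf(\alpha)}$ [89]. Written for a generic rough variable $\xi = ([a, b], [c, d])$ that stands for any input $x_{ij}$ or output $y_{rj}$, the formulas of Equations (3)-(6) take the form:

$$
\xi^{\sup(\alpha)} =
\begin{cases}
(1 - 2\alpha)\, d + 2\alpha\, c, & \text{if } \alpha \le \dfrac{d - b}{2(d - c)}, \\[2ex]
2(1 - \alpha)\, d + (2\alpha - 1)\, c, & \text{if } \alpha \ge \dfrac{2d - a - c}{2(d - c)}, \\[2ex]
\dfrac{d(b - a) + b(d - c) - 2\alpha (b - a)(d - c)}{(b - a) + (d - c)}, & \text{otherwise,}
\end{cases}
$$

$$
\xi^{\inf(\alpha)} =
\begin{cases}
(1 - 2\alpha)\, c + 2\alpha\, d, & \text{if } \alpha \le \dfrac{a - c}{2(d - c)}, \\[2ex]
2(1 - \alpha)\, c + (2\alpha - 1)\, d, & \text{if } \alpha \ge \dfrac{b + d - 2c}{2(d - c)}, \\[2ex]
\dfrac{c(b - a) + a(d - c) + 2\alpha (b - a)(d - c)}{(b - a) + (d - c)}, & \text{otherwise.}
\end{cases}
$$

The Rough Set DEA model, Equations (7) and (8), results in a range of efficiency indicators, bounded from below and from above, for the assumed level of $\alpha$ [81,89].

Achievement of sustainability goals is the effect of many consecutive stages (subprocesses). Network DEA models are multistage. They allow examining the efficiency of DMUs that have internal network structures and provide measures for the components that make up the DMUs [90]. Equation (9) describes a general two-stage process, where the first stage uses inputs $x_{1j}, \dots, x_{ij}, \dots, x_{mj}$, $i \in \{1, \dots, m\}$, to produce outputs $z_{1j}, \dots, z_{dj}, \dots, z_{Dj}$, $d \in \{1, \dots, D\}$, and these are then used as inputs of the second stage to produce outputs $y_{1j}, \dots, y_{rj}, \dots, y_{sj}$, $r \in \{1, \dots, s\}$. In Equation (9), the symbol ~ marks unknown decision variables, and $w_1$ and $w_2$ are weights that are specified by users and reflect the user preference [91].

In the case of a rough variable from a single range $[a, b]$, where $a \le b$, i.e., $\xi = ([a, b], [a, b])$, the formulas of Equations (3)-(6) reduce to:

$$
\xi^{\sup(\alpha)} = \alpha\, a + (1 - \alpha)\, b, \qquad \xi^{\inf(\alpha)} = (1 - \alpha)\, a + \alpha\, b.
$$

For $\alpha = 0.5$, the interval $[\xi^{\sup(\alpha)}, \xi^{\inf(\alpha)}]$ reduces to the mean, $[(a + b)/2, (a + b)/2]$, and for $\alpha = 1.0$ to the original range, $[a, b]$.

The presented models and their combinations can be used to assess the level of development of EU countries from the perspective of sustainability. The main advantages of Rough DEA include the simple interpretation of both the level of trust (the higher the level, the wider the efficiency range) and the relations between units: to obtain the maximum score, the best characteristics of the evaluated DMU are compared with the worst characteristics of the others, and vice versa when forming the lower bounds.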
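The verbal description of the Rough DEA bounds and of the two-stage network model can be written out as a sketch. This is a hedged reconstruction from the description in this section, not the authors' exact Equations (7)-(9). It assumes the output-oriented BCC form, writes $\underline{v}$ and $\overline{v}$ for the smaller and the larger of the $\alpha$-optimistic and $\alpha$-pessimistic values of a rough variable $v$, and introduces $\theta_o^{best}$, $\theta_o^{worst}$, $\theta_1$, $\theta_2$, and $\mu_j$ as illustrative symbols. The orientation of each stage in the network model is also an assumption.

```latex
% Upper efficiency bound 1/\theta_o^{best}: the evaluated DMU o takes its most
% favourable values, all other DMUs their least favourable ones.
\[
\begin{aligned}
\theta_o^{best} = \max\ & \theta \\
\text{s.t.}\ & \sum_{j \ne o} \lambda_j \overline{x}_{ij} + \lambda_o \underline{x}_{io} \le \underline{x}_{io}, \quad i = 1, \dots, m, \\
& \sum_{j \ne o} \lambda_j \underline{y}_{rj} + \lambda_o \overline{y}_{ro} \ge \theta\, \overline{y}_{ro}, \quad r = 1, \dots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0 .
\end{aligned}
\]

% Lower efficiency bound 1/\theta_o^{worst}: the roles are swapped.
\[
\begin{aligned}
\theta_o^{worst} = \max\ & \theta \\
\text{s.t.}\ & \sum_{j \ne o} \lambda_j \underline{x}_{ij} + \lambda_o \overline{x}_{io} \le \overline{x}_{io}, \quad i = 1, \dots, m, \\
& \sum_{j \ne o} \lambda_j \overline{y}_{rj} + \lambda_o \underline{y}_{ro} \ge \theta\, \underline{y}_{ro}, \quad r = 1, \dots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0 .
\end{aligned}
\]

% One common two-stage form with user weights w_1, w_2 and intermediate
% measures \tilde{z}_{do} treated as decision variables.
\[
\begin{aligned}
\min\ & w_1 \theta_1 - w_2 \theta_2 \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta_1 x_{io}, \quad
               \sum_{j=1}^{n} \lambda_j z_{dj} \ge \tilde{z}_{do}, \quad
               \sum_{j=1}^{n} \lambda_j = 1, \\
& \sum_{j=1}^{n} \mu_j z_{dj} \le \tilde{z}_{do}, \quad
  \sum_{j=1}^{n} \mu_j y_{rj} \ge \theta_2 y_{ro}, \quad
  \sum_{j=1}^{n} \mu_j = 1, \\
& \lambda_j, \mu_j \ge 0, \quad \theta_1 \le 1, \quad \theta_2 \ge 1 .
\end{aligned}
\]
```

Under these assumptions, the efficiency range of DMU $o$ is $[1/\theta_o^{worst}, 1/\theta_o^{best}]$. Because the best-case problem relaxes every constraint of the worst-case one, $\theta_o^{best} \le \theta_o^{worst}$ and the range is well ordered.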

Discussion of Data and Models
In the literature, the conventional data for productivity analysis by DEA are labor and capital as inputs and economic growth measured by GDP as an output. In the case of sustainable development assessment, the volume of energy is considered the main input of GDP. It is used in the form of, e.g., the share of RE, resource depletion, and GHG emissions (Table 2).
Data on sustainable development are processed by many institutions, e.g., the World Bank, the European Environment Agency (EEA), and Eurostat. Differences in collection and aggregation methodologies often imply a certain inconsistency in the published data. In this study, hard data: GDP (in billion US dollars using 2010 exchange rates), population (in millions), and CO2 emissions (in millions of tons of CO2) in 2017 were taken from IEA (2019), CO2 Emissions from Fuel Combustion [72].
The soft data is based on the survey Europeans' Attitudes on Energy Policy [73], which was conducted in 28 EU member states. The survey revealed a high consensus and positive attitude toward the current energy policy. For example, about 60% of respondents totally agreed, and 30% tended to agree with statements "it should be the EU's responsibility to encourage more investment in renewable energy" and "it should be the EU's responsibility to encourage more investment in energy research and innovation". The percentages of agreements were correlated between countries. The Pearson's correlation coefficient for the statements was 0.9. The most frequent answer regarding the priorities for the next five years was investments in clean energy technology and their development. Respondents had less enthusiasm about the reduction of the impact made by energy on climate change and the reduction of the overall energy consumption in the EU.
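The reported coefficient of 0.9 is Pearson's r computed across countries for the shares of agreement with the two statements. A minimal sketch of that calculation follows. The helper name `pearson` and the two lists of agreement shares are hypothetical and serve only as an illustration. They are not survey data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical shares (%) of "totally agree" answers in five countries.
agree_renewables = [55.0, 62.0, 48.0, 70.0, 66.0]
agree_research = [53.0, 60.0, 50.0, 72.0, 63.0]

r = pearson(agree_renewables, agree_research)
print(f"Pearson r = {r:.2f}")
```

A value close to 1 means that countries with a high agreement on one statement also tend to show a high agreement on the other, which is why a single statement can represent the attitude in the models.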
The agreement to the statement that "it should be the EU's responsibility to encourage more investment in renewable energy" was chosen to reflect the social authorization and readiness for change. The following relationship was assumed: the socio-economic environment (the social agreement to investments and robust data on a country's standard of living measured by GDP per capita) influences CO2 emissions. A simple DEA model was used for total agreements only (Figure 3a), and Rough DEA for total agreements and tendencies to agree, thus creating a rough variable (Figure 3b). The network Rough DEA model was also used to assess efficiencies of investments in low-carbon development. The authors analyzed the transformation of agreement with investments and the standard of living into the value of investments in energy (the first stage) and the value of investments in low-carbon development (the second stage) (Figure 3c).
Referring to the equations of the models presented in the Methods section, in the DEA and Rough DEA models the input $x_1$ was taken as the attitude to policy, the input $x_2$ as GDP per capita, and the output $y_1$ as GDP per CO2 emission. In the network Rough DEA model, $x_1$ represented the attitude to policy, $x_2$ GDP per capita, $z_1$ the total public energy research, development and deployment (RD&D) budget, and $y_1$ GDP per CO2 emission. Equal weights $w_1$ and $w_2$ were assumed, and the models were solved for α = 0.6 and α = 0.8.
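The effect of the two trust levels on the rough attitude variable can be illustrated with a short sketch. It assumes the simplest case described in the Methods section, a rough variable given by a single range, for which α = 0.5 yields the mean and α = 1.0 the original range. The function name `alpha_values` is hypothetical, and the range of 60% to 90% only echoes the approximate EU-level shares quoted above (about 60% total agreement, plus about 30% tending to agree). It is not country data.

```python
def alpha_values(low, high, alpha):
    """Alpha-optimistic and alpha-pessimistic values of a rough variable
    whose lower and upper approximations both equal the range [low, high].

    Assumes 0.5 <= alpha <= 1, as in the text.
    """
    if not 0.5 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0.5, 1]")
    if low > high:
        raise ValueError("low must not exceed high")
    optimistic = alpha * low + (1.0 - alpha) * high   # smaller end of the interval
    pessimistic = (1.0 - alpha) * low + alpha * high  # larger end of the interval
    return optimistic, pessimistic


# Illustrative range: total agreement only vs. total agreement plus tendency to agree.
for level in (0.5, 0.6, 0.8, 1.0):
    lo, hi = alpha_values(60.0, 90.0, level)
    print(f"alpha = {level}: [{lo:.1f}, {hi:.1f}]")
```

For this range, α = 0.6 gives an interval of about 72 to 78 and α = 0.8 an interval of about 66 to 84, so the higher trust level produces the wider interval and, in turn, a wider range of efficiency scores.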
The data on the total public budget (in millions US dollars) allocated for research, development and deployment (RD&D) of energy technologies in 2017 was taken from IEA [10]. Not all data was available. For this reason, network Rough DEA analysis was performed on a smaller data set. Due to the limitation of the DEA model assuming a positive relationship between inputs and outputs, the degree of low-carbon economies was measured by the inverse of the classical CO2 value per GDP. The summaries of data covered by the analysis are presented in Tables 3-5.

Apart from an obvious linear relationship between CO2, population and GDP, a small but statistically significant relationship was found between GDP per CO2 and the total agreement to encourage more investment in energy research and innovation. This relationship justifies the consideration of the opinion in the evaluation of a low-carbon economy. Also, there is a noticeable correlation between GDP per capita and RD&D per capita, which indicates that higher spending in wealthier states could achieve lower emissions per GDP.
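The two ratio indicators used in the models follow directly from the three hard-data series. The sketch below shows the arithmetic, including the inversion of CO2 per GDP into GDP per CO2. The function name `derived_indicators` and the two example countries are hypothetical, and the numbers are not IEA data.

```python
def derived_indicators(gdp_bn_usd, population_mln, co2_mln_t):
    """Return (GDP per capita, GDP per CO2).

    Units follow the text: GDP in billion USD, population in millions,
    CO2 in million tonnes. Both results are therefore in thousand USD,
    per person and per tonne of CO2, respectively.
    """
    if min(gdp_bn_usd, population_mln, co2_mln_t) <= 0:
        raise ValueError("all inputs must be positive")
    gdp_per_capita = gdp_bn_usd / population_mln
    # Inverse of the classical CO2-per-GDP intensity: a larger value
    # means a lower-carbon economy, as DEA expects for an output.
    gdp_per_co2 = gdp_bn_usd / co2_mln_t
    return gdp_per_capita, gdp_per_co2


# Hypothetical countries: (GDP, population, CO2). Illustrative numbers only.
hypothetical = {"A": (500.0, 10.0, 40.0), "B": (300.0, 20.0, 60.0)}
results = {name: derived_indicators(*values) for name, values in hypothetical.items()}
for name, (per_capita, per_co2) in results.items():
    print(name, per_capita, per_co2)
```

Using GDP per CO2 rather than CO2 per GDP keeps the direction "more is better" for the output, so a country that emits less per unit of GDP is rewarded with a higher value.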
The charts in Figures 4 and 5 visualize the relationships between the considered variables of EU states. In both figures, SE positively stands out: high social awareness and an above-average standard of living are reflected in a low-carbon economy. However, LU demonstrates that GDP per capita does not always determine the level of GDP per CO2. FR shows that a high GDP per CO2 can be reached with a moderate social agreement. MT demonstrates that the same can be done with a below-average GDP per capita. ES, CY, and SI are examples indicating that eco-awareness can precede the results of actions aimed at low CO2 emissions. At the other end are the countries of Central and Eastern Europe (PL, RO, CZ, SK) with a low agreement to the need for investment, low GDP per capita, and high current CO2 emissions per GDP. Figure 5 depicts the general relationship, i.e., the higher the GDP per capita, the higher the RD&D expenditure. However, it also indicates that the current RD&D budget does not always directly relate to achievements (as in the case of FI).

Data Analysis
The results of the applied DEA models are presented in Tables 6 and 7 and Figures 6 and 7. BCC-O corresponds to model (1), Ratio DEA to model (2), Rough BCC DEA to models (7,8), and Network Rough BCC DEA to model (9) with the rough concept (7,8). DEA model scores based only on the hard data (GDP per capita and GDP per CO2), without a qualitative variable, are also presented (Ratio DEA only hard data). Including additional variables in DEA increases the rating or leaves it unchanged, so the assessment with both quantitative and qualitative variables is less strict. It is worth noting that the assessment based only on quantitative data is close to the pessimistic estimation from the Rough BCC DEA model integrating soft and hard data for α = 0.8.
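How a BCC-O score of the kind produced by model (1) arises can be illustrated on a deliberately simplified case. The sketch below assumes a single input and a single output, where the variable-returns-to-scale frontier can be found by checking single units and pairs of units, so no linear programming solver is needed. The models in this study use more variables and require a solver. The function name `bcc_output_score` and the four units are hypothetical.

```python
def bcc_output_score(dmus, o):
    """Output-oriented BCC efficiency (1/theta) of unit `o`.

    `dmus` is a list of (input, output) pairs with positive values.
    With one input and one output, the best attainable output at the
    input level of unit `o` is reached either by a single unit that
    uses no more input, or on the segment joining two units whose
    inputs straddle that level.
    """
    x_o, y_o = dmus[o]
    if x_o <= 0 or y_o <= 0:
        raise ValueError("inputs and outputs must be positive")
    best = y_o
    # Single units that need no more input than the evaluated unit.
    for x_j, y_j in dmus:
        if x_j <= x_o:
            best = max(best, y_j)
    # Convex combinations of two units, evaluated exactly at input x_o.
    for x_j, y_j in dmus:
        for x_k, y_k in dmus:
            if x_j < x_o < x_k:
                share = (x_o - x_j) / (x_k - x_j)
                best = max(best, y_j + share * (y_k - y_j))
    return y_o / best


# Hypothetical units (input, output). A, B and C form the frontier, D lies below it.
units = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0), (3.0, 2.0)]
scores = [bcc_output_score(units, index) for index in range(len(units))]
print(scores)
```

Units on the frontier obtain a score of 1 (100%), while the fourth unit is compared with a point between the second and the third unit and obtains a score below 1.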
Differences between nonlinear and traditional BCC approaches are insubstantial. The highest scores by models (1) and (2) ... LV, with an average agreement but low GDP per capita.
The incorporation of inconsistency in opinions broadens the range of scores. The countries that can reach 100% efficiency include not only those that achieved a result of nearly 100% by models (1) or (2), i.e., (i) IE or DK, but also those with average results, namely, (ii) BE and FI, or even low results, i.e., (iii) CZ or PL. It is significant that in the case of groups (i) and (ii), the dispersion between optimistic and pessimistic results is large, around 50%. The two-stage network evaluation shows a relatively high efficiency of the first stage, i.e., the transformation of GDP per capita and public awareness into the RD&D budget. Results of the second stage indicate the need for much improvement in expenditure efficiency. However, these conclusions should be drawn with caution, as financial expenses often have an effect only in the long term.
The provided example of assessing the performance of sustainable development of EU countries indicates that, depending on the assumed interpretation of qualitative data, varying scores can sometimes be obtained. Four groups of EU countries can be distinguished based on the Rough BCC DEA model. The final assessment may consist of an average score with a range of possible results. The width of the assessment range represents its sensitivity to changes in data interpretation.
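The suggested form of the final assessment, an average score accompanied by its range, amounts to simple arithmetic on the two bounds returned by the rough model. A minimal sketch follows. The function name `summarize_range` and the two example units are hypothetical and do not reproduce results from the tables.

```python
def summarize_range(lower, upper):
    """Average score and width of an efficiency range [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {"average": (lower + upper) / 2.0, "width": upper - lower}


# Hypothetical efficiency ranges (pessimistic, optimistic) for two units.
hypothetical_ranges = {"X": (0.50, 1.00), "Y": (0.80, 0.90)}
summary = {name: summarize_range(*bounds) for name, bounds in hypothetical_ranges.items()}
for name, values in summary.items():
    print(name, round(values["average"], 2), round(values["width"], 2))
```

In this illustration, unit X can reach full efficiency under the optimistic reading, but its wide range signals a score that is sensitive to how the qualitative data are interpreted, whereas unit Y has a lower ceiling and a more stable assessment.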

Discussion
The contribution of the article is two-fold. First, it presented a new hybrid model based on a combination of Rough Sets and DEA. The model is intended for the integration of hard and soft data in the object ranking task. It enables the inclusion of uncertainty in the underlying data. Second, the article demonstrated the use of the Rough DEA model by assessing EU countries in terms of their progress toward sustainable development objectives. The model allows the joint assessment of physical and socio-economic data and permits a more multifaceted and objective evaluation. The paper is also significant because both quantitative and qualitative data were used to appraise the performance of countries in the field of sustainable development.
Sustainable development goals address many global challenges, including those related to poverty and social inequalities [3], exploitation of natural resources, growing global population, and energy needs [92], and the aging society in the EU [93]. The article refers to the monitoring of clean energy goals considering investments in technology development and modernization at a given level of economic growth and social support.
The article is particularly relevant under the present circumstances. The period chosen for the analysis precedes the anticipated global economic crisis, which will hit many of the countries that were most affected by COVID-19. In their efforts to reduce CO2 emissions, government-level decision-makers will have to focus more on the economic opportunities of individual countries and on public opinion.
The assessment of the EU's progress toward sustainable development goals was based on hard and soft data integration. It can be treated as a preliminary stage of quantitative considerations regarding the need to increase ecological awareness and to shape the sense of responsibility and the readiness to bear the costs of eco-development. The approach used broadens the perspective and provides more reliable sustainability rankings. The presented study did not fully explore the broad research topics of the subject. In future research, it is worth designing a dedicated survey to measure the degree of social readiness to incur the expenses of transformation into a low-carbon economy. However, the most important extension of the presented models is the inclusion of a time delay between expenditure, social readiness, and quantitative indicators of sustainability. This task should comprise an adequate aggregation of data according to the schedule of research and investment processes.