Assessing the Efficiency of Sustainable Cities Using an Empirical Approach

Abstract: Sustainability is a multidisciplinary discipline posing a difficult problem as a result of its integrated assessment. From a broad perspective, it considers the impact of human activities (using different resources) and natural conditions on local environments. Urban development has been identified as one of the most important reasons for environmental and social degradation. To address the complexity of sustainability and its impact, policymakers need to be equipped with the right toolkit to foresee the integrated effect of projects and plans on urban sustainability more effectively in their policy design. In this paper, we propose a tool to assess the sustainable performance of urban areas through a common framework of indicators which provides an integrated measurement based on the relative efficiency of key input variables on desirable and undesirable outputs. Using Data Envelopment Analysis (DEA), we propose a procedure for determining the relative efficiency of relevant urban areas, proposing this method as a candidate for integrated sustainability measurement. The selection of variables is based on dimensions which can be addressed from a political perspective for achieving more desirable outputs, or reducing the undesirable ones, controlling for key resources as much as possible. Our analysis takes a comprehensive scope including an environmental and socioeconomic perspective. This will be useful to identify weaknesses and strengths to improve the integrated performance of cities. Our array of indicators, based on standardized key performance indicators (KPIs) will enable policymakers to gather an insightful impact of their proposals in urban sustainability carrying out a global sustainability impact assessment through DEA. The main goal is to gather the urban experience of transforming cities into smarter cities and putting technological progress at the service of their societies.


Introduction
The expansion of urban environments is linked to global challenges of sustainability, particularly in regions where the process of urbanization is still unfolding, or the urban metabolism is undergoing a thorough regeneration [1]. In urbanized regions such as Europe, where more than 70% of people are urban dwellers, sustainability is one of the most important challenges, especially concerning the use of energy, economic performance, de-carbonization of infrastructure, wastewater management, and other ecosystems from cities and urban communities [2]. The consumption of these resources can play a crucial role in the development of the UN sustainability goals [3].
[...] dioxide, and wastewater that are associated with desirable production and whose reductions are made possible by effective operational management [13]. However, there are other applications that do not consider undesirable outputs [14].
Used carefully, DEA can potentially facilitate analysis of the main policy issues and improve business strategies to enhance the sustainability of cities. Yang et al. [15] evaluate regional environmental efficiency in China over 10 years based on the super-efficiency DEA model to observe regional disparities. Such super-efficiency models are also useful to assess benchmark performances. Recently, Zhao et al. [16] linked the socio-economic and environmental perspectives in the evaluation of cities through a parallel system of two linked subsystems, in order to understand the operational process of the sustainable development system.
We study how efficiently cities use their inputs to produce desirable outputs. Our main purpose is to evaluate whether cities are using their available inputs efficiently from an environmental perspective. Then, we can deduce some sociological consequences without claiming causality. Amongst all the alternatives available, we opt for the Slack-Based Inefficiency (SBI) model because it relies on slacks. This enables us to determine the percentages by which cities should reduce their inputs in order to reach total efficiency. Efficiency is obtained by subtracting the SBI score from 1, so the best performers (in the benchmark) have an efficiency score equal to 100%. Decisions on what is efficient or not depend on the outputs a city expects to achieve, using the inputs available in its territory, to address local goals. As Gottdiener and Hutchinson [17] conclude, the human ecosystem framework may look like a shopping list of system components. However, it is crucial to realize that the most significant feature of the framework is the fact that it points out the interactions among specific natural, social, and cultural components of the metropolis. This recognition prompts ecologists to be concerned with how people use and behave in the metropolitan ecosystem in a spatially explicit way.
Therefore, further to political decisions on efficiency, we emphasize that the human ecosystem is the result of a complex interaction in which social issues such as poverty, inequality, environmental justice, and public participation in decision-making and space production (in sum, equity) must be taken into account. Ahern [18] addresses the dynamic interactions between nature and society, how social change influences the environment and how environmental change shapes society.
We control for only three inputs to simplify the analysis: population, water, and energy consumption. Potentially, this could help policymakers to tackle social problems and increase awareness among the inhabitants of the metropolitan ecosystem [19]. Population is important for several reasons. Firstly, as mentioned earlier, the growing urban population puts pressure on land and services. Secondly, climate risks and hazards are unevenly distributed and socially differentiated, especially in cities with diverse populations that differ in language, cultural background, age, sex, etc. [20]. Climate change injustice happens along ethnic, gender, class, and racial lines [21,22]. Thirdly, people must participate actively in reducing the impact of the ecological crisis in cities. They are fundamental stakeholders when facing the risks of natural hazards in cities, and special attention should be given to vulnerable populations. Furthermore, it is important that people are properly sensitized, informed, and warned about risks and hazards. Finally, inequality leads to greater environmental degradation, and a more equitable distribution of power and resources would result in improved environmental quality [23,24].
The paper is organized as follows. In Section 1, we present the main concepts and objectives of this work, which are framed within the 2030 UN goals. In Section 2, we explain our model and carry out a statistical analysis of the data. Section 3 describes the main results obtained from the analysis. In Section 4, we discuss the results in light of the existing literature. Finally, we set out our main conclusions.

Materials and Methods
DEA is a non-parametric technique that evaluates the efficiency of each operational unit in the model, called a Decision Making Unit (DMU), and defines the operational targets or benchmarks of the inefficient ones. The concept of efficiency assesses the production capacity of the DMUs based on their available resources. The observed data define the Production Possibility Set (PPS), known as the technology, under different assumptions. There are three types of technology: Free Disposal Hull (FDH) assumes free disposability of inputs and outputs; Variable Returns to Scale (VRS) additionally assumes convexity, so that convex combinations of the observed DMUs are feasible; and Constant Returns to Scale (CRS) extends the VRS technology by assuming that any observed operational unit can be scaled [25]. The efficient frontier (EF) is then formed by the subset of DMUs that perform best in the PPS. This subset is non-dominated: no other unit produces more outputs with a smaller amount of inputs. Each inefficient DMU is projected onto the EF, thereby defining its benchmark. In each dimension, the distance of an inefficient DMU to the efficient frontier is called a slack (s). DMUs on the EF are efficient and, therefore, their slacks are zero.
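The dominance idea behind the efficient frontier can be illustrated with a minimal Python sketch under free disposability only (the FDH technology). The function names `dominates` and `fdh_efficient` and the toy data are hypothetical and not taken from the paper; the sketch only flags units that no other observed unit dominates, and it does not compute slacks or convex combinations.

```python
def dominates(a, b):
    """True if unit a uses no more of every input and produces no less of
    every output than unit b, with at least one strict improvement."""
    a_in, a_out = a
    b_in, b_out = b
    no_worse = (all(x <= y for x, y in zip(a_in, b_in))
                and all(x >= y for x, y in zip(a_out, b_out)))
    strictly_better = (any(x < y for x, y in zip(a_in, b_in))
                       or any(x > y for x, y in zip(a_out, b_out)))
    return no_worse and strictly_better


def fdh_efficient(units):
    """Return the names of units not dominated by any other observed unit."""
    return [name for name, unit in units.items()
            if not any(dominates(other_unit, unit)
                       for other, other_unit in units.items() if other != name)]


# Hypothetical DMUs given as (inputs, outputs); values are made up.
toy_units = {
    "A": ((2.0, 3.0), (10.0,)),
    "B": ((3.0, 3.0), (9.0,)),   # uses more input and produces less than A
    "C": ((1.0, 5.0), (8.0,)),   # trades one input against the other
}
efficient = fdh_efficient(toy_units)
print(efficient)
```

In this toy data set, unit B is dominated by unit A, while A and C are not comparable and both remain on the frontier.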
The benchmarks of inefficient DMUs depend on the DEA model specification used, which defines the most efficient operational units as benchmarks. The choice of DEA specification depends on the goal that the decision maker wants to analyze. For example, input-oriented models focus on reducing the amount of inputs, output-oriented models prioritize increasing production, and non-oriented models reduce the inputs while increasing the outputs at the same time. The benchmarks also depend on the metric used (radial, directional distance function, slack-based, etc.). The first class of DEA models are radial models, like CCR [10]. These models project the DMUs onto the EF, measuring the technical efficiency of each unit. However, they overestimate the technical efficiency when nonzero slacks are present. Charnes et al. [26] propose an additive model to curb this overestimation by maximizing the slacks of inputs and outputs at the same time.
In DEA, multiple models can be used to measure the performance of the evaluated units. The additive model proposed by Charnes et al. [26] is able to discriminate between efficient and inefficient DMUs. However, its properties differ from those of the CCR model because the slacks of inputs and outputs that it sums are expressed in different units. This justifies the use of other slack-based models, like the Range Adjusted Model (RAM) developed by Cooper et al. [27], which normalizes the slacks of inputs and outputs, and the Slack-Based Measure (SBM) model developed by Tone [28], which satisfies monotonicity and unit invariance with respect to slacks. Later, Fukuyama and Weber [29] defined the Slack-Based Inefficiency (SBI) model to measure technical inefficiency while considering all slacks in the input and output constraints. The SBI model we use is related to the directional technology distance function [30,31], which seeks a maximum non-radial increment in outputs while reducing inputs for a given directional vector.
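To see why the units matter, the additive model can be written in its usual textbook form. This is a sketch in the notation used later in this paper (m inputs, s outputs, n DMUs); the slack symbols $s_i^-$ and $s_k^+$ are an assumption for illustration rather than the authors' own notation.

```latex
\begin{aligned}
\max_{\lambda,\, s^-,\, s^+} \quad & \sum_{i=1}^{m} s_i^- + \sum_{k=1}^{s} s_k^+ \\
\text{s.t.} \quad & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{i0}, && i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{kj} - s_k^+ = y_{k0}, && k = 1, \dots, s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& \lambda_j,\ s_i^-,\ s_k^+ \ge 0 .
\end{aligned}
```

The objective adds slacks that are measured in different units (for example m³ and MWh), so its value changes when a variable is rescaled. Normalized measures such as RAM, SBM, and SBI divide each slack by a reference value to remove this dependence.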
Our model uses data from 45 cities and considers three inputs (population, water, and energy) that produce desirable and undesirable outputs. Figure 1 illustrates the variables we incorporate in our model to evaluate the efficiency of the cities. Our inputs are population (number of people living in the city), water consumption (m³), and energy consumption (MWh). The desirable output is gross domestic product (GDP, measured in US dollars), and the undesirable outputs are PM2.5 (average level in µg/m³ experienced by the population), CO₂ (thousands of tons of CO₂ equivalent), and wastewater (%).

We propose using the SBI model to evaluate the efficiency of cities. SBI models are non-oriented. This demands that the normalization of the slacks is performed with the observed values of the evaluated DMUs [32]. Indeed, this is what we analyze from an empirical approach. The non-oriented feature of the SBI model reduces inputs and increases outputs at the same time. As the SBI model assesses inefficiency, efficiency is obtained by subtracting the SBI score from 1. We apply this model to each DMU, thereby maximizing the mean of its normalized slacks (1).
The analysis of efficiency assumes a set of $n$ observed DMUs $\{\mathrm{DMU}_j : j = 1, \dots, n\}$, where each DMU needs $m$ inputs ($x$) to produce $s$ desirable outputs ($y$). However, the production of these desirable outputs creates $w$ undesirable outputs $y^b$, which are linked to the process. These undesirable outputs can be modeled under the assumption of weak disposability, implying that undesirable outputs can be reduced, but only at a cost: the production of desirable outputs must be reduced as well [33]. The left-hand sides of constraints (2)-(4) define the efficient frontier based on the observed DMUs, while the right-hand sides show the slacks of the evaluated DMU $(x_0, y_0, y^b_0)$. The variables $\lambda_j$ and $\mu_j$ represent the weak disposability assumption proposed by Kuosmanen [33], and constraint (5) specifies the technology under Variable Returns to Scale (VRS). Thus, the EF is formed by convex combinations of the observed DMUs. We assume VRS because of the higher discrimination it assumes among DMUs compared with CRS [15]. Not comparing the DMUs with other scaled DMUs therefore offers a more realistic comparison. As a consequence, the EF under VRS technology contains a higher number of DMUs than under CRS technology.
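Based on the description above, the program solved for each evaluated DMU could be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the authors' exact formulation: the slack symbols $s_i^-$, $s_k^+$, $s_r^b$, the index $r$ for undesirable outputs, and the equal weight of one third per group are inferred from the verbal definition of the SBI as the average of the mean normalized slacks.

```latex
\begin{aligned}
\max \quad & \mathrm{SBI} = \frac{1}{3}\left(
  \frac{1}{m}\sum_{i=1}^{m}\frac{s_i^-}{x_{i0}}
  + \frac{1}{s}\sum_{k=1}^{s}\frac{s_k^+}{y_{k0}}
  + \frac{1}{w}\sum_{r=1}^{w}\frac{s_r^b}{y^b_{r0}}\right) && && (1)\\
\text{s.t.} \quad
& \sum_{j=1}^{n} (\lambda_j + \mu_j)\, x_{ij} = x_{i0} - s_i^-, && i = 1,\dots,m && (2)\\
& \sum_{j=1}^{n} \lambda_j\, y_{kj} = y_{k0} + s_k^+, && k = 1,\dots,s && (3)\\
& \sum_{j=1}^{n} \lambda_j\, y^b_{rj} = y^b_{r0} - s_r^b, && r = 1,\dots,w && (4)\\
& \sum_{j=1}^{n} (\lambda_j + \mu_j) = 1 && && (5)\\
& \lambda_j,\ \mu_j,\ s_i^-,\ s_k^+,\ s_r^b \ge 0 && && (6)
\end{aligned}
```

In this sketch the weights $\lambda_j$ alone build the output side, while $\lambda_j + \mu_j$ builds the input side, which is how weak disposability under VRS is usually expressed following Kuosmanen [33].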
$$\sum_{j=1}^{n} y_{kj}\,\lambda_j = y_{k0} + s_k^{+}, \qquad k = 1, \dots, s$$
In this model, all efficient DMUs have a zero SBI value and therefore lie on the EF. As emphasized earlier, the SBI model measures the inefficiency of DMUs. This inefficiency is defined as the average of the mean normalized slacks of the DMU, grouped by inputs, undesirable outputs, and desirable outputs. Therefore, the efficiency of a DMU is given by the parameter θ, which is calculated as θ = 1 − SBI.
Further insight can be gained from Equations (7)-(9), which measure the normalized slacks of inputs, desirable outputs, and undesirable outputs, respectively.
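As an illustration of how the group-wise normalized slacks combine into the efficiency score, the following Python sketch computes SBI and θ for a single DMU once its slacks are known. The function names, the equal weighting of the three groups, and all numbers are assumptions made for the example; the slacks themselves would come from solving the optimization model, which is not done here.

```python
def normalized_slacks(slacks, observed):
    """Mean of slack / observed value over one group of variables."""
    if len(slacks) != len(observed):
        raise ValueError("slacks and observed values must have the same length")
    return sum(s / v for s, v in zip(slacks, observed)) / len(slacks)


def sbi_efficiency(x0, sx, y0, sy, yb0, syb):
    """Return (SBI, theta) from observed values and slacks of one DMU."""
    sbi_x = normalized_slacks(sx, x0)      # inputs
    sbi_y = normalized_slacks(sy, y0)      # desirable outputs
    sbi_yb = normalized_slacks(syb, yb0)   # undesirable outputs
    sbi = (sbi_x + sbi_y + sbi_yb) / 3.0   # assumed equal weight per group
    return sbi, 1.0 - sbi


# Hypothetical DMU: three inputs, one desirable and three undesirable outputs.
sbi, theta = sbi_efficiency(
    x0=[100.0, 50.0, 200.0], sx=[10.0, 25.0, 20.0],
    y0=[400.0], sy=[0.0],
    yb0=[20.0, 10.0, 80.0], syb=[5.0, 4.0, 16.0],
)
print(round(sbi, 4), round(theta, 4))
```

A DMU whose slacks are all zero obtains SBI = 0 and θ = 1, which corresponds to a unit on the efficient frontier.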

Descriptive Analysis
In our analysis, we identify similar cities around the world with similar sizes. Indeed, DEA and the SBI model provide a neutral background to measure efficiency as the common goal and offer an excellent tool for a comprehensive evaluation of sustainability regardless of the different realities, climate, societies, and interactions amongst cities. This is our main hypothesis.
We selected data from the OECD data repositories with additions from the World Council on City Data. The sample includes 45 cities, mostly from Europe but also from the US, Chile, and Japan. For each city, we gathered information on population, real GDP, air pollution measured in PM2.5, CO₂ footprint, and energy and water consumption. Table 1 summarizes the main descriptive statistics from the data source. In Table 1, Manchester has the largest number of inhabitants and Trondheim has the lowest. Lyon consumes the highest volume of fresh water and Belfast consumes the lowest. Porto consumes the highest total energy in 2018 and Cartagena consumes the lowest. Portland is the richest city in terms of real GDP and Cartagena is the poorest. Cracow is the most polluted city (in PM2.5) in the sample and Portland is the least. San Antonio is the highest CO₂ emitter and Debrecen is the city with the lowest level of CO₂ emissions. Finally, Concepcion processes the highest level of urban wastewater and Cartagena processes the lowest. We measure which dimensions each inefficient city should improve to reach the efficient frontier based on available data.

Regression Results
Table 2 details our estimates from the SBI model. It reports the normalized slacks of each city for inputs (SBI X), desirable outputs (SBI Y), and undesirable outputs (SBI YB); the overall efficiency indicator (θ) is shown in the last column. Almost half of the cities are efficient (20 of 45). The remaining 25 cities are inefficient, and only six of them (Hiroshima, Antwerp, The Hague, Nice, Lille, and Bordeaux) have benchmarks that produce more GDP than their current level. This means that the remaining inefficient cities could improve their performance by better managing their resources and reducing their undesirable outputs. Moreover, 56% of the inefficient DMUs have a higher normalized slack for the undesirable outputs than for the inputs.
For the inefficient units, we computed the slacks to identify the inputs each inefficient city should change to manage its available resources more efficiently (Table 3). Overall, inefficient cities have room for improvement in water consumption, since the mean and the median of this slack are 54.92% and 55.64%, respectively. For population and energy, the medians are 8.96% and 11.16% and the means are 11.76% and 16.56%, respectively. Vancouver, Hanover, Linz, The Hague, Toulouse, Gothenburg, Tallinn, Utrecht, Antwerp, Rotterdam, Helsinki, Tampa-Pinellas, and Pittsburgh are the only inefficient cities that can reduce their energy consumption after comparing their performance with their benchmarks. The mean of their energy consumption slacks is 0.17. In this subset, only six cities (Gothenburg, Hanover, Helsinki, Vancouver, Toulouse, and Tampa-Pinellas) have as a benchmark a city with a lower population (Table 4). These six cities are the only ones that have slacks for all three inputs.
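Reading a normalized slack as a reduction target is simple arithmetic, sketched below in Python. The city names, consumption figures, and slack values are made up for the illustration and are not taken from Tables 2-4; `input_target` is a hypothetical helper.

```python
from statistics import mean, median


def input_target(observed, normalized_slack):
    """Benchmark level of an input after removing its normalized slack."""
    if not 0.0 <= normalized_slack <= 1.0:
        raise ValueError("normalized slack must lie between 0 and 1")
    return observed * (1.0 - normalized_slack)


# Hypothetical water-consumption slacks, as fractions of observed use.
water_slacks = {"City A": 0.40, "City B": 0.55, "City C": 0.70}
# Hypothetical observed water consumption (arbitrary units).
observed_water = {"City A": 120.0, "City B": 80.0, "City C": 200.0}

targets = {city: input_target(observed_water[city], slack)
           for city, slack in water_slacks.items()}
summary = (mean(water_slacks.values()), median(water_slacks.values()))
print(targets, summary)
```

A slack of 0.40 therefore means that the benchmark uses 40% less of that input than the evaluated city, and the mean and median across cities summarize the overall room for improvement.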
Regarding the undesirable outputs, wastewater is the variable with the lowest slacks (0.26 as the median and 0.28 as the mean), while CO₂ is the variable with the highest slacks of all the undesirable outputs (0.42 as the median and 0.43 as the mean). DEA makes it possible to observe the benchmark of any inefficient unit and the observed efficient cities that define it. There are 10 cities (Bilbao, Bologna, Cracow, Florence, Glasgow, Lyon, Manchester, Trondheim, Turin, and Turku) that are efficient but do not act as a benchmark for any inefficient unit, since they are outliers. In contrast, Aarhus, Belfast, Cartagena, Copenhagen, Cork, Debrecen, Marbella, Portland, Porto, and Zurich are peers, that is, efficient cities that define the benchmarks for the inefficient units. Copenhagen, Debrecen, and Zurich are the efficient cities most often used to define the targets in the database.
Table 5 relates the influence of each efficient city (benchmark) over the inefficient cities. Inefficient cities are shown in rows, and the efficient cities (benchmarks) that act as peers for any of the inefficient cities in this dataset are shown in columns. Table 5 shows the estimates of $\lambda_j$ and $\lambda_j + \mu_j$, which define the benchmarks. The variable $\lambda_j$ identifies the peers in constraints (3)-(4) for the desirable and undesirable outputs, due to the weak disposability assumption, while the sum $\lambda_j + \mu_j$ identifies the peers in constraint (2) for the inputs. The coefficient $\lambda_j$ is zero for all the inefficient units that use Debrecen as a peer, except for Concepcion, which has $\mu_j = 0$. This means that Concepcion has Debrecen as a peer because of the level of desirable and undesirable outputs Debrecen produces for its consumption of inputs, while for the rest of the inefficient cities that have Debrecen as a peer, its expertise as a resource manager is the reference. In contrast, Zurich acts as a peer for most of the inefficient units not only for its resource management, but also for its level of production.
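The way the two sets of weights define a benchmark can be sketched in Python as follows. The peer names, their data, and the weights are hypothetical, and the helper names `combine` and `benchmark` are introduced only for this example; the sketch assumes, as described above, that inputs are combined with $\lambda_j + \mu_j$ and outputs with $\lambda_j$ only.

```python
def combine(peers, weights, key):
    """Weighted sum of one group of variables over all peers."""
    size = len(next(iter(peers.values()))[key])
    return [sum(weights[p] * peers[p][key][i] for p in peers)
            for i in range(size)]


def benchmark(peers, lam, mu):
    """Target of an inefficient unit built from its peers.

    Inputs use lambda_j + mu_j; desirable (y) and undesirable (yb)
    outputs use lambda_j only, reflecting weak disposability."""
    total = sum(lam[p] + mu[p] for p in peers)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("lambda + mu must sum to 1 under VRS")
    input_weights = {p: lam[p] + mu[p] for p in peers}
    return {"x": combine(peers, input_weights, "x"),
            "y": combine(peers, lam, "y"),
            "yb": combine(peers, lam, "yb")}


# Hypothetical peers with two inputs, one desirable and one undesirable output.
peers = {
    "P1": {"x": [100.0, 40.0], "y": [500.0], "yb": [30.0]},
    "P2": {"x": [60.0, 20.0], "y": [300.0], "yb": [10.0]},
}
lam = {"P1": 0.5, "P2": 0.0}   # P2 does not shape the output targets
mu = {"P1": 0.0, "P2": 0.5}    # P2 contributes to the input targets only
target = benchmark(peers, lam, mu)
print(target)
```

Here the hypothetical peer P2 has a zero $\lambda_j$ and a positive $\mu_j$, so it acts as a reference for resource use but not for production levels, which mirrors the role described for Debrecen.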

Discussion
In recent years, DEA has been widely used to assess urban sustainability [34][35][36]. From an empirical approach, the use of DEA in urban contexts assesses the performance of cities across all the potential dimensions related to sustainability. Because of the capabilities of DEA models in evaluating and ranking DMUs, DEA can be used in benchmarking, target setting, measuring returns to scale, measuring congestion, etc. [37]. Although DEA is an excellent tool to guide policymakers in improving social and urban sustainability [13], it has to be used carefully, and researchers must be aware of its limitations and strengths.
From an urban policy perspective that includes the analysis of KPIs, this paper comprehensively addresses one of the most important dilemmas in the assessment of urban sustainability. We rely on the SBI model and assume Variable Returns to Scale (VRS) in the production function (technology). This helps us to evaluate the performance of the cities in a more realistic way, contrary to most of the existing empirical evidence [17]. A similar approach has been applied to the integrated sustainability performance assessment of universities (based on the UI GreenMetric ranking) by Puertas and Marti [38].
The result is that almost half of the cities in our sample are efficient. Half of the efficient cities act as a benchmark for the inefficient units and the other half are outliers. Therefore, inefficient cities do not take any of those outliers as peers (benchmarks). Looking more closely at the information in Table 4, any inefficient city can identify its benchmarks, and the observed efficient cities that define them, and so replicate relevant policies. Overall efficiency is not relevant for decision makers if the city does not act as a benchmark for any inefficient unit (that is, when efficient cities are outliers). On the other hand, Aarhus, Belfast, Cartagena, Copenhagen, Cork, Debrecen, Marbella, Portland, Porto, and Zurich are peers that define the benchmarks for the inefficient cities. Copenhagen, Debrecen, and Zurich are the efficient cities most commonly used to define the targets for their independent benchmarks.
From a practical point of view, any city incorporated into the database can obtain a diagnosis, first of its relative efficiency and then of the specific benchmarks from peers for setting its future (optimal) policy goals. First, a city looks at its relative ranking position in the SBI efficiency model (Table 2). Then, a detailed comparison of its relative slacks and corresponding benchmarks indicates which policies have to be addressed to obtain an optimal result for the given city. In sum, slacks indicate which variables inefficient cities should reduce (in the case of inputs or undesirable outputs) or increase (GDP) in order to improve their performance and become more efficient. Bigger slacks mean that cities should make a bigger effort in that variable.
This paper is not exempt from limitations. First, although the model obtained with DEA and SBI is the best among the possible models given the available data, the limited data mean that the model can still be improved a great deal. We are committed to looking for more data in order to build a better and more complex model which reflects the reality of cities meaningfully. Second, the use of DEA and SBI invariably leads to a specific final model. In our work, the SBI model searches for the maximum distance of the inefficient cities to the EF, thereby reducing their resources and undesirable outputs whilst increasing their desirable outputs. Other models define a fixed direction along which all the DMUs are projected onto the EF, while in our model each DMU follows its own direction. Third, in this paper we have focused on a single period for the assessment of the efficiency of the cities. We consider that a multiple-period analysis could be an interesting area for further research.

Conclusions
This paper evaluates the performance of cities utilizing the SBI model to guide that process. While this model has been tested in multiple applications, we have found none in the context of sustainability, and we are therefore excited to present our results in this forum. Apart from that, the application of this model using both desirable and undesirable outputs following the weak disposability assumption represents an excellent opportunity to give proper feedback to cities. Cities can benefit from this analysis to enhance their performance, even though there are evident limitations due to the DEA methodology, available data, and the fact that the goal proposed in the model affects the search of the benchmarks.
Looking closely at the influence of each efficient city (benchmark) over the inefficient cities (Table 5), specific policies from benchmark cities can be monitored to ascertain their relevance to the measured efficiency of each city. Since the data for all the selected cities were gathered under the same standard (ISO 37120) and the cities have similar population sizes, policies can be followed up to improve decision-making processes.
The effect of specific urban policies can be explored by simulating the future evolution of inputs and outputs on this model, allowing insight into the overall effect of city decisions on the most efficient result for cities' future. Later developments could include exploring simulations on KPI evolution for verifying reasonable performance.