Assessing Impact, Performance and Sustainability Potential of Smart City Projects: Towards a Case Agnostic Evaluation Framework

Abstract: We report on a novel evaluation framework to globally assess the footprint of smart cities and communities (SCC) projects, which can also be extended to smart grid related projects. The uniform smart city evaluation (USE) framework is constructed upon three complementary evaluation axes. The first axis weighs up the success of a SCC project based on performance metrics against pre-defined project-specific target values. The second axis focuses on the project's impact on the sustainability of a city and is benchmarked against national and international key objectives arising from strategic plans. This benchmarking feeds the third axis, which provides a more inclusive evaluation against four pre-defined and widely acclaimed sectors of interest. The steps to be followed for the uniform evaluation of each axis and corresponding index are presented in detail, including the necessary key performance indicator (KPI) normalization, weighting, and aggregation methods. The resulting index scores for each axis (namely the project performance index, sustainability impact index, and sustainability performance index) can be post-processed with adequate data processing and visualization tools to extract important information on the extent to which the range of success of a SCC project contributes to the city sustainability progress. Illustrative examples from an on-going SCC project are provided to highlight the strengths of the approach. The proposed framework can be used to compare multiple projects within a city and sustainability and project performance in different cities, evaluate the interventions chosen per project against city needs, benchmark and design future projects (with, e.g., reverse engineering, projections), as well as evaluate various spatial and temporal scales.


Introduction and Motivation
Smart and sustainable cities have been receiving increased interest from scholars and municipal stakeholders over the past 10 years, in the hopes of contributing to sustainable development goals (SDG) [1] and realising a prosperous climate-neutral future and better quality of life [2]. Nowadays, over 55% of the world's population lives in cities and urban areas [3], while approximately 75% of the world's energy consumption and almost a similar share of global anthropogenic carbon emissions is attributed to cities [4]. The need for combating those major challenges of urbanization and climate change led to the rapid development of various smart city initiatives, such as the C40 Cities [5], Daring Cities [6], Covenant of Mayors [7], Smart Cities Information System (SCIS) [8], 100 Intelligent Cities Challenge [9], Energy Cities [10], etc. Their main focus is on policies and measures that aim to accelerate energy transition and reduce greenhouse gas (GHG), as well as other pollutant emissions at city level [11], while actively engaging citizens in urban regeneration schemes [12]. The design of transformative policies towards recovery from the current COVID-19 pandemic crisis is also of vital importance for urban sustainable development due to changes caused in the working environments and the use of buildings [13]. In the European Union (EU), several smart cities and communities (SCC) projects foster solutions pertaining to energy, mobility, and information and communication technology (ICT) in order to transform cities into smart, green, liveable, and sustainable, as well as exchange know-how in terms of project results, best practices and lessons learnt [14][15][16].
An important component for designing and implementing a smart city concept is the evaluation of the impact of the demonstrated solutions and actions [17,18] against a city's sustainability vision [19,20]. This assessment process needs a common and shareable evaluation framework which can measure the effectiveness of the interventions in relation to smart city development and sustainable performance progress [21,22]. Such an evaluation framework can be used as a decision support tool for policy makers and financiers to evaluate smart city interventions and compare how impactful each intervention is for a city, or to assess what types of interventions create a greater impact in different cities with varying contextual characteristics. A comprehensive and simplified framework can also help engage high-power and highly interested stakeholders, i.e., governance, technology, and service providers, and energy utilities, early on, which is critical for the successful implementation of smart city solutions, while it can also be used for raising citizen awareness of the impact of the proposed interventions. Currently, evaluation frameworks rely on the definition and use of key performance indicators (KPI) [22,23] that quantify project results based on quantitative or qualitative data in order to assess how close cities are to meeting their goals and provide a comprehensive analysis regarding smart and sustainable performance and progress [24]. An evaluation framework is built on individual indicators, which either constitute an indicator set or form a composite index. An indicator set is a group of single non-aggregated indicators relevant to pre-defined performance dimensions, constituting a simple indicator evaluation framework for a project or a city performance. A composite index or system is a single and comparable metric based on normalization, weighting, and aggregation of multiple indicators [25], which provides as a final output a ranking of the overall performance levels of a range of evaluations [26].
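To make the three building blocks of a composite index concrete, the following minimal Python sketch normalizes two indicators, weights them, and aggregates them into a single score. It is only an illustration under stated assumptions: min-max normalization and a weighted arithmetic mean are common choices but are not necessarily the methods adopted later in this paper, and the function names, indicator values, bounds, and weights are all hypothetical.

```python
from typing import Sequence


def min_max_normalize(value: float, worst: float, best: float) -> float:
    """Rescale a raw indicator value to [0, 1], where 1 is the best outcome.

    Passing worst > best handles indicators for which lower raw values are better.
    """
    if best == worst:
        raise ValueError("best and worst bounds must differ")
    score = (value - worst) / (best - worst)
    # Clip so that values outside the chosen bounds do not leave [0, 1].
    return min(1.0, max(0.0, score))


def composite_index(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregate normalized scores with a weighted arithmetic mean."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical indicators given as (raw value, worst bound, best bound).
indicators = [
    (40.0, 0.0, 100.0),  # e.g., share of renewable energy in %, higher is better
    (6.0, 10.0, 2.0),    # e.g., emissions in t CO2 per capita, lower is better
]
weights = [1.0, 1.0]  # equal weighting, chosen only for illustration

scores = [min_max_normalize(v, worst, best) for v, worst, best in indicators]
index = composite_index(scores, weights)
print(scores, round(index, 3))
```

Different normalization bounds or weights would change the resulting score, which is one reason why the same city can rank differently under different indices.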
Typically, sustainability is characterized by complexity and multidimensionality, and is expressed by various domains and typologies [27]. This is also the case for urban sustainability, because cities are complex and multidisciplinary ecosystems [28] that differ in a variety of features such as population, geography, climate and environment, natural resources, capital, culture, infrastructure, etc. In this regard, significant challenges and issues arise when it comes to the strategic planning and assessment of urban sustainability [29]. Some of the most common challenges pertain to discrepancies in the definition, theoretical modeling and scope of urban sustainability, while a variety of urban sustainability concepts have been contrived, namely the sustainable city, the smart city, the eco-city, the low-carbon city, the green city, etc. [30]. Other issues stem from deficiencies in the inclusive aspects, incompatibility of different solution approaches and limited connection between the concepts of sustainable cities and smart cities in the urban model, narrowing the effectiveness of sustainable transformation. We note here as an example the design of weak instead of strong sustainability schemes: a strong sustainability perspective assumes complementarity of human-made and natural capital in urban areas, while weak sustainability assumes that technology can substitute resources and ecology [31]. Moreover, deviations are also observed between the planned project and city goals and the global targets set towards sustainability and smartness. More specifically, smart city projects tend to follow an evaluation approach based on specific project objectives, business models, and expected impacts of the engaged cities, rendering the comparability with other projects or cities difficult or unfeasible.
The effective incorporation of key aspects that consider the urban smart-sustainability fix [32], address, and converge the different conceptual paths of urban sustainability [29] and link properly the concepts of smart city and sustainable city [33,34] in the urban ecosystem, will help projects and consequently cities to achieve multiple objectives. To this end, urban sustainability assessment can be designed following an integrated approach focusing on a solid and unified evaluation process of project-based and city-oriented progress capable of generating clear, sufficient, understandable, and ready-for-benchmarking results and conclusions for all projects and cities.
In this context, this paper builds upon the body of knowledge around the evaluation of sustainability in cities and concentrates on developing and proposing an easy-to-use and versatile evaluation framework for assessing and comparing the success of a project and the extent to which it contributes to the city sustainability progress against EU goals.
We aim to answer five research questions, furthering the state of the art as follows:
1. What are the main frameworks used to evaluate cities in terms of sustainability and smartness, and what are their attributes and shortcomings?
2. How can a SCC project evaluate its performance against pre-defined targets?
3. How can a SCC project holistically evaluate its contribution and impact towards the sustainable goals of a city?
4. How can the results achieved by each smart city be compared with those of other cities on a common basis, while accounting for the specificities of each city?
5. How can one evaluate in a fair and transparent way multiple SCC projects, as well as cities as a whole, promoting comparability and replication potential?
Question 1 is addressed in Section 2, where an extensive literature review and analysis has been performed. The proposed evaluation framework (the uniform smart city evaluation, or USE, framework) tackles questions 2-5 in a cyclic, inclusive manner under a three-axes approach. By adopting the USE framework, the impact, performance, and sustainability potential of smart city projects can be assessed in parallel, while the framework can be used to compare multiple projects within a city, as well as sustainability and project performance in different cities, evaluate the interventions chosen per project against city needs, benchmark and design future projects (with, e.g., reverse engineering, projections), as well as evaluate various spatial and temporal scales.
The remainder of this paper is organized as follows: Section 2 provides an overview of common urban sustainability and smartness indices, their main characteristics and problematic elements. In Section 3, we present our newly developed uniform smart city evaluation (USE) framework composed of three axes or indices, namely: (i) project performance index (PPI), (ii) sustainability impact index (SII), and (iii) sustainability performance index (SPI). The necessary methods and tools for calculating the indices under the USE framework are detailed in Section 4, while the step-by-step calculation procedure per index is presented in Section 5. In Section 6, a set of illustrative examples from an ongoing SCC project is provided, showing how the framework works in terms of weighting, aggregating, and scoring of indicators. We discuss necessary steps and current limitations and shortcomings in Section 7, while concluding remarks and suggestions for future research are presented in the last section of the article (Section 8).

Review of Urban Sustainability and Smartness Evaluation Frameworks and Indices
In order to provide an overview of the main characteristics, inclusive aspects, methodological approaches, and limitations of the frameworks and indices developed for the topic of urban sustainability, an extensive literature review and analysis was carried out based on a series of recent research studies and several indexing reports published in the field. The screening process of both scientific and gray literature was conducted with the aid of several search engines and online databases, e.g., Scopus, Web of Science, ScienceDirect, Google Scholar, etc., with a view to including a wide spectrum of journals, books, and technical reports with high relevance to smart city and urban sustainability assessment. The purpose of this bibliographical search was to identify the most well-known and widely accepted city sustainability and smartness indices, frameworks, and studies from the last decade (after 2012). A generic search was performed by typing keywords such as "urban sustainability" AND "assessment", "smartness" AND "assessment", "urban sustainability" AND "evaluation framework", "urban sustainability" AND "index", "smart city" AND "index", "smart and sustainable cities" AND "ranking" OR "concept", and so forth. Then, the abstracts and titles of the most relevant articles were examined in order to select potential references that align with the scope and the inclusion criteria of this research. The review of gray literature pertinent to the area of urban sustainability and smart city indices was also considered at this stage to extract available information and potential frameworks. The studies from both categories that passed the initial filtering process went through a full read and review, and only scientific or technical material with sufficient methodological content and contribution in the field was included in the final inventory of papers and reports which were analyzed in detail. A similar approach has been followed in Ref. [35], where the main drivers for increased city smartness have been studied.
We note that even though sustainability goes beyond local and urban areas, and several composite indices tackle country or global scale evaluation (e.g., the Climate Change Performance Index 2021 [36], the Energy Trilemma Index 2020 [37], the Environmental Performance Index 2020 (EPI) [38], the Energy Transition Index 2021 (ETI) [39]), we have restricted our study to urban sustainability and smartness at district and city level. Our research indicates that a variety of models and tools have been developed for the evaluation and comparability of smartness and sustainability in urban areas [40,41]. These tools are based on composite indices that assess critical dimensions of sustainability and smartness, but only a few of them combine features and qualities of both aspects. A good example of a composite index, offered also as an interactive tool, that introduces both technology maturity and sustainability aspects in urban development is the networked society city index [42]. Another initiative, on an EU level and under the European Green Capital framework, is the Green City tool [43], which aims to facilitate sustainable urban planning with a main focus on offering best practices and guidance. It provides a simple, straightforward tool, but it is limited to generic qualitative inputs of self-assessment for cities. In general, composite indices provide some key outcomes, such as ranking and benchmarking of cities, facilitating research and analysis in urban design [44], and assisting in sharing knowledge for the development of smart and sustainable cities [45]. However, given that city sustainability entails a multitude of aspects and domains [46], all these evaluation frameworks and indices present methodological gaps and conflicts, as they capitalize on different definitions of urban performance and development [47] while showing imbalance between smartness and sustainability [34].
Table 1 summarizes the most important composite indices associated with sustainability or smartness aspects at urban level, along with a brief presentation of their benefits and shortcomings.
Although there are many similarities among characteristics of evaluation frameworks, rating systems, or composite indices, they differ considerably in conceptualization, focus, and goals, due to the diverse city needs, boundaries and expected outcomes of the smart and sustainable cities under assessment, as well as the perspectives of the relevant stakeholders and experts. A majority of applications, experiments, projects and initiatives use the "triple bottom line (TBL)", which integrates social, economic, and environmental variables, as a guiding principle for evaluating sustainability performance [48]. A good illustration of this is China's urban sustainability indices (USI), the last version of which was launched in 2016 and uses 23 indicators categorized into the three dimensions of TBL to rank one hundred and eighty-five (185) Chinese cities of diverse sizes and development stages, assessing their sustainability performance level between 2006 and 2014 [49]. A primary issue with the China USI indices is that they do not adequately address smartness aspects. Furthermore, some indices represent strong sustainability assessment while others represent weak sustainability assessment. A representative index with strong sustainability criteria is the sustainable development of energy, water, and environment systems index (SDEWES) that assesses the sustainable performance of one-hundred twenty (120) cities across seven (7) dimensions, while also identifying best practices for policy learning and adoption [50]. There are also indices that primarily focus on environmental sustainability, such as the European Green City Index, which is grounded on 30 individual indicators to assess and compare the environmental performance of 30 big European cities from different countries [51], or indices that explore only specific urban aspects, such as urban mobility, air quality, business development, etc. [52], e.g., the index developed by Collins et al. (2019) that builds upon geographic, meteorological, and socio-economic data and K-means clustering to determine which of the 119 U.S. cities included in the analysis are bicycling-friendly cities [53].
Several indices also have drawbacks that lie in the difference and multiplicity of the data sources used for comparing results, owing to lack of data for some indicators or even to inconsistency of the framework approach. In some cases, country-level data are utilized or extrapolation techniques are implemented, while data are also obtained from other indices to calculate a number of their metrics (e.g., References [54,55]). When a city is evaluated by two or more different indices, the results lead to diverging rankings, which indicates a degree of subjectivity. A good illustration of this is the city of London when assessed via the IESE Cities in Motion Index 2020 and the IMD Smart City Index 2020. The city ranks first in the former and fifteenth in the latter, due to the different approaches to the smart and sustainable city concept and its dimensions, as well as the different number of cities and indicators used by the two indices, which makes the comparison of results extremely difficult. In addition, major differences and incoherences are observed among composite indices regarding the normalization, weighting, and aggregation methods used to evaluate performance.
At the project level of evaluation, an important issue is that, usually, the targets set by the local communities and cities vary significantly from key policies and objectives defined in the strategic plans which are in alignment with wider smart city, urban sustainability, and energy transition aspects, e.g., EU key objectives and directives. Each project uses its own assessment methods and capitalizes on sustainability aspects based on its needs. For example, the evaluation frameworks of the EU funded SCC projects +CityxChange [56] and SPARCS [57] differ noticeably in their methodological approaches: +CityxChange builds upon a framework that aims to evaluate the impact of the project interventions at demo-site level by a simplified SCIS-based and project-defined indicator set (SCIS refers to the smart city information system, an SCC knowledge platform incorporating data reporting through pre-defined KPIs), while SPARCS includes, apart from SCIS, a set of KPIs selected from numerous frameworks and aims to use a data normalization methodology in order to provide an objective assessment of the project results. This leads to problematic interpretation and decision-making when assessing project performance levels or comparing project results between cities or projects. In addition, the existing frameworks cannot be used uniformly to evaluate the smart and sustainable features of cities, i.e., the overall city sustainability, since they do not follow a similar monitoring procedure of their expected impacts in terms of sustainability. For instance, the framework proposed by the SHARING CITIES project [58] entails a list of 129 indicators classified in six performance domains, whereas the ATELIER project [59] includes a set of 44 indicators in six performance domains; these domains are not only different but also need to be normalized and aggregated properly to provide information for comparing smartness and sustainability levels of the participating cities.
Table 1. Columns: Index; Description; # of Ind.; Dimensions; Pros; Cons.

Pros: Recognizes that not all cities start from the same development level, nor with same set of advantages; Considers priority areas for cities
Cons: Not evaluating key sustainability aspects, e.g., environmental, climate, and energy performance; Inconsistency of data in some cases

Index: The Global Cities Index (GCI) 2020 [64]
Description: Measures the international standing of 151 cities globally across five dimensions, resulting in city rankings in terms of business opportunities and economic innovation
# of Ind.: 29
Dimensions: Business Activity, Human Capital, Information Exchange, Cultural Experience, Political Engagement
Pros: Outlines new challenges and priorities in the business sector; Provides a snapshot of how COVID-19 has shattered the status-quo; Reflects emerging geographies
Cons: Focuses only on business innovation and economy; Not primarily based on smart city technology and urban sustainability pertaining to environmental issues

Index: Global Power City Index (GPCI) 2020 [65]
Description: Evaluates and ranks major cities of the world (48) according to their comprehensive power level in terms of average well-being and access to urban facilities, in order to attract people, capital, and enterprises
# of Ind.: 70
Dimensions: Economy, Research and Development, Cultural Interaction, Liveability, Environment, Accessibility
Pros: Function-specific ranking; Focuses on the development state of cities including a broad set of factors; Examines changes in working styles and people commuting owing to COVID-19
Cons: Does not address sufficiently smartness aspects; Includes a limited number of indicators for energy performance, governance, etc.; Differs conceptually with the 3-pillar sustainability approach

Index: Urban Development Index (UDI) 2020 [47]
Description: Aims to measure the level of sustainable development in the city of Rio-de-Janeiro via benchmarking with other four cities, based on an equal weighting approach
# of Ind.: 32
Dimensions: Capitalizing on 4 knowledge-based urban development (KBUD) pillars: economy, society, environment, and governance
Pros: Provides a baseline for how a city is positioned in relation to others and determines how to improve its urban performance; Comprehensive structure of 8 target groups
Cons: Does not entail resource efficiency, energy transition and climate change aspects; Renewable energy and air quality indicators not included; Data used from other indices

Index: Mercer's Quality of Living City Ranking (QoL) 2020 [66]
Description: Assesses living conditions for 140 cities against generally accepted standards and gives recommendations to potential employees for assigned destinations
# of Ind.: 39
Dimensions: Political and social, economic, socio-cultural, health, education, public services and transportation, recreation, consumer goods, housing, natural environment
Pros: City-to-city index comparison that quantifies the difference in the quality of living between any two cities; Provision of data on quality of living that help employees sent to work abroad, students, etc.
Cons: Not suitable for evaluating overall sustainable development progress nor smartness; Not including aspects pertaining to resource efficiency, technology, energy, and climate targets

Index: The Global Liveability Index (GLI) 2019 [67]
Description: Assesses which locations among 140 cities worldwide provide the best or the worst living conditions based on 5 evaluation areas
# of Ind.: 30
Dimensions: Stability, Healthcare, Culture and Environment, Education, Infrastructure
Pros: Quantifies the challenges to an individual's lifestyle in any given location, and allows for direct comparison between locations

Description: Benchmarks the performance of 120 cities across energy, water, and environment systems towards promoting policy learning, action, and cooperation and bringing cities closer to sustainable development
# of Ind.: 35
Dimensions: Energy

Description: Presents a total score of smart city performance of 120 eligible cities across the globe, based on the Smart Cities while properly considering also sustainability issues
# of Ind.: 62
Dimensions: Economy, People, Mobility, Living, Governance, Environment
Pros: Entails a multitude of indicators addressing most of the factors of smart and sustainable cities; Universal spatial scope in tandem with regional characteristics
Cons: Lack of a unified approach: uses and combines data from individual indicators and other indices, e.g., Mercer, Innovation Cities Index; Indicator averaging at each level

Index: Sustainable City Index 2.0 (SCI 2.0) 2014 [76]
Description: Evaluates 403 Dutch cities showing at a glance the level of their sustainability
# of Ind.: 24
Dimensions: Economic Well-being, Environmental Well-being and Resources circularity, Human Well-being
Pros: Examines thoroughly the correlation between various indicators; Results can also be aggregated to the provinces' level
Cons: Does not include governance aspects; Several data sources

Index: UN-Habitat's City Prosperity Index (CPI) 2012 [77]
Description: Evaluates the degree of prosperity in the cities of the world based on the concept of the wheel of urban prosperity, a conceptual matrix that symbolises the well-balanced development across five spokes
# of Ind.: 17
Dimensions: Productivity, Infrastructure Development, Quality of Life, Equity and Social Inclusion, Environmental Sustainability, Urban Governance and Legislation
Pros: Aids the design of effective policy interventions; Allows to evaluate, and report on city progress towards the implementation of the SD Agenda 2030; Depicts the strengths or weaknesses of prosperity factors
Cons: Does not address smartness aspects, e.g., smart infrastructure, technology, energy, mobility, etc.

Index: CITYWEB Index 2012 (City-Card) [78]
Description: Evaluates and ranks performance of cities against city concepts. Each city receives a score out of 100 in five city development models and a total grade out of 500
# of Ind.: 21
Dimensions: Global Cities; Nice Cities; Knowledge-intensive networks; Intelligent Cities; Creative Cities
Pros: Highlights specific areas where a city can be improved; Equally useful to individuals, businesses, academics and governments; Includes both quantitative and qualitative factors (well-balanced)
Cons: Weighting issues; Lack of consistency of the data; Lack of data for some categories or cities; Overlap of some indicators

Taking these critical points into consideration, the relationship between a project's success and the impact this success has on the city sustainability progress remains an issue that still requires further research. Cities' sustainable performance evaluation should on the one hand be reliant on the needs, visions, and strategies that address the peculiarities of every city ecosystem, but on the other it should be based on a harmonized, holistic, and unified approach able to benchmark the overall performance in terms of sustainability and smartness for all projects and cities. In the following section (Section 3), we present a newly developed framework (the USE framework) for the evaluation and progress assessment of SCC projects along with their contribution towards a city's sustainability targets.

The Uniform Smart City Evaluation (USE) Framework: A Common Framework for SCC Projects and Cities
In a typical approach, cities construct internal projects (here we restrict ourselves to SCC projects, i.e., projects that in one way or another contribute towards their sustainability) in order to upgrade the functionalities, infrastructure, and services provided to their citizens. These projects can be focused on a particular geographical area, or they can cover extensive city regions and districts. Moreover, such projects can focus on a specific intervention domain, such as energy consumption, grid flexibility, e-mobility, etc. It is thus evident that a simple city sustainability index based on a pre-determined set of indicators is inherently limited: it can track neither the progress and success of each project nor the contribution and impact of each project on the sustainability of each city. In addition, the aggregation of interventions and their impact from a building level to a whole district and eventually to the city level is still ambiguous. For example, how do we choose representative buildings in order to scale up the evaluation of a city? A consensus needs to be reached on these aggregation approaches and on the definition of the spatial scales (we note here that a first attempt has already been made on an EU level, especially towards positive energy districts (PEDs) [79,80], and we adopt these conventions in this work as we mention below).
In this respect, we propose a uniform smart city evaluation (USE) framework under a triple axis approach as illustrated in Figure 1. The key objective of this framework is to provide a method of identifying the extent to which the range of success of a SCC project contributes to the city's sustainability progress. It can be used to evaluate the interventions chosen against real-time city needs, benchmark and design future projects (based on reverse engineering, projections, etc.) and ultimately compare multiple projects within a city, as well as sustainability performance between different cities. The ultimate goal of the proposed framework is to provide global metrics that can be used as a reference among multi- and cross-disciplinary projects, promoting comparability, transparency, and uniformity on various aggregation levels. To reach this goal, a series of policy-like decisions need to be made so that these metrics can be adopted on a wide scale (e.g., under the EU umbrella), and we discuss the necessary steps and possible barriers in Section 7. Most importantly, we promote the notion of common KPI repositories, which still need to be determined, for a holistic evaluation of future SCC projects. The USE framework consists of three evaluation axes that aim to tackle the aforementioned objectives in a holistic and self-contained manner.
The first axis corresponds to the lower level of evaluation: the project performance index (PPI). This index is being fed by the project success indicators (PSI)-KPI-like metrics which are used to assess the successful (or not) implementation of each project's interventions and their impact against pre-defined targets relevant to this specific project. PSIs typically involve only monitored values of performance at a specific temporal scale after the project's start and in the case where the target values are linked to baseline values (providing indications of percentage change for example), their estimation (prior to the project's start) rely on modeled and not actual up-to-date baseline values which should reflect the current status just prior the monitoring phase. The definition of the PSIs is a newly proposed concept [81,82], mostly relevant to EU-funded SCC projects with clear call impact targets. We propose here their adoption by any SCC project and link them with the PPI evaluation. To clarify this index, let us assume a district-wise project which aims to increase the uptake of public EV charging points inside the specific district. The project has already set a target for this uptake to, e.g., three "new public EV charging points installed inside the district by the end of the project". This indicator is project-specific, assessed only on pre-defined spatial and temporal scales and it is thus to be evaluated against this specific pre-defined target value and not against long-term sustainability goals. It is thus a PSI, in the sense that it should be measured and reported, and it provides a straight-forward interpretation of what is done versus what is planned: did the specific project meet its goals in the predefined SC focus, spatial and temporal levels? The aggregation of all PSIs lead to the project performance index. 
In Section 5.1, we provide details on the methodological approach for calculating the PPI with a step-by-step procedure and a complete flowchart.
The second axis corresponds to the middle level of evaluation: the sustainability impact index (SII). Its assessment focuses on the multi-dimensional impact that a particular project has on those sustainability goals of a city that are relevant to the project, and it can be extracted on any spatial and temporal scale of interest. It differs significantly from the PPI in the following aspects:
• The SII is based on all the key performance indicators (KPI) defined in a SCC project, in contrast to the PPI which incorporates only the (limited in number) PSIs. The KPIs are commonly linked to different dimensions (also called domains or categories depending on the project) such as energy, economic, social, ICT, mobility, etc. The SII can thus assess the multi-dimensional performance of the project, providing an inclusive evaluation while also allowing sub-indexing per KPI dimension of the project;
• Each KPI target value is defined based on sustainability goals as proposed by national or international goals and policies or best available practices;
• The KPIs are assessed against actual baseline values (monitored before the project's start) and can, thus, directly showcase the progress of a city compared to the business as usual (BaU) scenario.
To clarify the SII's scope, let us consider again the aforementioned public EV charging points uptake as an example. A KPI relevant to the defined PSI of "new public EV charging points installed inside the district by the end of the project" would be "public EV charging points installed per 1000 capita". Let us restrict ourselves to a particular district A and let us assume that the baseline value of this KPI is 2, i.e., the number of charging points per 1000 capita inside district A before the project. The target value for this KPI can be considered as 5.7 public charging points/1000 capita inside the same district based on EU target values in the Road2Zero scenario [83]. Let us now assume that the project has reached its PSI target of 3 new EV charging points by the end of the project (which corresponds to an increase of the KPI from its baseline value to <3 points/1000 capita if we consider that >3000 people are living within the defined district). It is evident that while the PSI target has been achieved, the project makes only a mild contribution towards the sustainability goal of the district (and, by aggregation, of the city). It can be theorized that in order to achieve its sustainability goal the district needs to implement an additional 3 similar projects. Of course, this example is too simplistic and unidimensional, but it gives a first understanding of the scope of the SII. In Section 5.2, we provide details on the proposed methodology for calculating the SII with a step-by-step procedure and a complete flowchart, where all relevant input and initial conditions will be further clarified (in addition to the complete illustrative example of Section 6).
The third and last axis corresponds to the highest level of evaluation: the sustainable performance index (SPI). It is important to distinguish two use-cases (UC) for the evaluation of SCC projects under the SPI.
• UC1: The first use-case deals with on-going SCC projects that are currently in the development-implementation-evaluation phase and have already identified the necessary KPIs to be monitored. Herein, the SPI aims to provide a cross-dimensional evaluation under four pre-defined overarching sectors. Each sector (see Section 5.3 for more details on the definition of each sector) encompasses the most important KPIs of the project extracted from all the KPI dimensions. In contrast to the SII, which clusters the KPIs under each dimension, this clustering allows for a more holistic but also targeted evaluation of each project's results in the specific sectors of interest, leading to a cross-dimensional evaluation. For example, consider again a SCC project which focuses on e-mobility. It is straight-forward to conclude that this particular project aims to achieve impact in terms of sustainable urban transport (e.g., EV integration and adoption, car sharing schemes), urban infrastructure (e.g., EV chargers, V2G technology), and climate change (e.g., reducing urban transport related emissions), in addition to any socio-economic benefits for the districts or cities of application. The categorization and evaluation of all the project's KPIs under KPI dimensions (as performed by the SII), although extremely important, might not be easily interpreted by citizens and city authorities: KPIs can be quite technical depending on the particular focus of the project (e.g., a KPI on battery degradation rate is an important aspect for any EV charging system), hindering a straight-forward and layman-oriented understanding of the project's impact on the overall sustainability of the area. Moreover, KPI dimensions are not inclusive per se, in the sense that each dimension is well-defined and does not overlap with the rest of the dimensions (and associated KPIs).
Despite the fact that in the end the SII provides an averaged evaluation of all dimensions ("multi-dimensional evaluation"), it is essential to integrate the project's results in a categorization that reflects cross-dimensional aspects. In this respect, the SPI and its clustering into overarching, easily interpreted, cross-dimensional sectors provide the necessary flexibility and inclusivity, reflecting the project's performance versus the city's needs.
• UC2: The second use-case pertains to future SCC projects which have the flexibility to adopt their KPIs at a later stage. The necessity for the SPI under this use-case comes directly from the fact that SCC projects are extremely wide in scope. A small or large-scale city project can focus on energy related matters, such as building renovations, RES penetration, and grid flexibility, while another may touch only on aspects related to mobility and district level storage. The KPIs, along with their dimensions, defined for each of these projects are targeted to their specific interventions, and the evaluation results of each project cannot be fairly compared (comparing the SII of project A to the SII of project B is unfair as their scope is different). In addition, even if the projects' scope is similar, such a comparison lacks a common framework that includes all aspects in which a city needs to progress in order to meet its sustainability goals. These aspects are rarely limited to energy or e-mobility related matters. Leveraging the UN's sustainable development goals, it is easy to conclude that a real smart city is (at least) energy efficient, clean, safe, just, citizen-centric, culturally rich, healthy, and self-sufficient. Therefore, it is essential to provide an evaluation of a specific project against universal, all-inclusive, overarching sectors on a higher hierarchical level than the level of KPI dimensions (which are pre-defined for each project).
These sectors should be widely acclaimed and should ideally include a multitude of common KPIs belonging to multiple dimensions and covering all aspects of a smart city. The SPI then provides an index that can be used to compare reliably projects with different or similar focus under the same umbrella, while each sector is linked to relevant sub-indices for a more targeted assessment. The cross-dimensional nature of the SPI provides increased interest for a city, being able to assess self-consistently its overall performance and initiate targeted projects to progress further on. As noted in Section 2, the definition of a common KPI repository per pre-defined SCC sector is a matter of high-level institutional and international decision-making and thus such definition is outside the scope of this work. Nevertheless, once a consensus is reached on the common KPIs, the implementation of our proposed framework is straight-forward as described in Section 5.3.
In Section 5.3, we provide details on the proposed methodology for calculating the SPI with a step-by-step procedure and a complete flowchart, while also providing a definition of the four overarching sectors. We also distinguish the methodology between the two use-cases as described above.
In summary, the proposed framework builds upon the inherent needs of cities, as well as the particular nature of each SCC project. We need to reiterate here that the concept of sustainability in a smart city context is unfortunately not clear, and multiple definitions render its uniformity ambiguous [84]. In this work, we have adopted UNECE's definition [85], which states: "A smart sustainable city is an innovative city that uses ICTs and other means to improve quality of life, efficiency of urban operation and services, and competitiveness, while ensuring that it meets the needs of present and future generations with respect to economic, social, environmental as well as cultural aspects." This quite broad definition includes all aspects that pertain to a SCC project but also touches on issues relevant to the UN's SDGs, and is thus able to provide a holistic approach that incorporates the necessary multi-dimensionality of smart and sustainable city concepts. In this respect, our framework approaches a strong sustainability perspective, owing to the contribution of multi- and cross-dimensional indicators in the overall assessment. It is thus strongly recommended that this strong sustainability conception [86] be taken into account by the relevant stakeholders when assigning weights to each indicator in the weighting procedures.
The calculation of each of the three indices defined above requires a series of pre-evaluation and evaluation steps that will be detailed in Section 5. These steps make use of several statistical and mathematical tools which are presented in the following section (Section 4).

Unit Normalization
Before moving on to describe typical tools for developing composite indices (e.g., normalization, weighting, and aggregation) and the methods of choice under the USE framework, it is important to ensure that the indicators to be utilized are expressed in units that are meaningful and comparable. This is highly relevant for indicators that are expressed in absolute units (e.g., kWh of energy consumed). To deal with this challenge we introduce the term functional unit (FU), inspired by the ISO 14040 series on life cycle assessment. In our case we define an FU as "a unit that supports fair comparability and benchmarking between two or more systems". All indicators to be included in the proposed indices need to be transformed first into FUs. This can be done by revising absolute values to be expressed, for example, per m², per population, per total energy needs, as a % of increase or decrease, etc., depending on the type and special characteristics of every indicator. This process could be characterized as first-tier unit normalization, since it enables comparisons between different years and between buildings, positive energy blocks (PEB), positive energy districts (PED), and cities of varying sizes (a more detailed definition of PEB and PED evaluation scales is provided in Section 4.2). Even then, SCC projects include several indicators that are expressed in different FUs; thus, a more universal normalization procedure is needed.
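The first-tier unit normalization described above can be illustrated with a minimal sketch. The function names and all numerical values below are hypothetical and serve only to show how absolute values could be turned into functional units; they are not part of the framework itself.

```python
def to_functional_unit(absolute_value: float, reference_quantity: float) -> float:
    """Express an absolute indicator value per unit of a reference quantity
    (e.g., kWh per m2 of floor area, or charging points per 1000 inhabitants)."""
    if reference_quantity <= 0:
        raise ValueError("reference_quantity must be positive")
    return absolute_value / reference_quantity


def percent_change(baseline: float, current: float) -> float:
    """Express a value as a % increase (+) or decrease (-) relative to a baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (current - baseline) / baseline


# Hypothetical annual consumption (kWh) and floor area (m2) of two buildings.
small_building = to_functional_unit(120_000.0, 800.0)    # kWh per m2
large_building = to_functional_unit(450_000.0, 3_600.0)  # kWh per m2

# The larger building consumes more in absolute terms but less per m2.
print(small_building, large_building)
print(percent_change(baseline=200.0, current=150.0))
```

Expressed per m², the two hypothetical buildings become comparable even though their absolute consumption differs by almost a factor of four.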

Value Normalization to a Uniform Scale
A critical question to be answered when developing composite indices is how a uniform evaluation scale can be developed, since most of the adopted indicators are expressed in different units, which prevents data aggregation. Several normalization methods that can resolve this problem are available in the literature, such as min-max, z-score, percentage of annual variations over consecutive years, distance to a reference, and categorical scales [87]. Zhou et al. have analyzed commonly applied normalization methods by variance-based sensitivity analysis, arguing that the distance to a reference method seems to be the optimum choice for sustainability performance evaluations [88]. Building upon this finding, as well as considering that SCC projects most often apply both quantitative and semi-quantitative (e.g., through a Likert scale) indicators, we suggest a hybrid normalization method, integrating the distance to a reference and categorical scale methods. A similar approach has been applied to evaluate the sustainability of industrial facilities [89,90]. By utilizing the distance to a reference, we can compare the value of a given indicator to one or more reference points, while the categorical scale assigns a score to every indicator using a numerical or qualitative scale. In this case we adopt a 5-point (ranging from 1 to 5) semi-qualitative evaluation scale with the following conventions and margins:
3. Achieved: $\geq X_{23}$ and $\leq X_{34}$ (3 points are assigned to the examined indicator);
5. Excelled: $\geq X_{45}$ (5 points are assigned to the examined indicator),
where $X_{N(N+1)}$, $N = 1, \ldots, 4$, is the boundary value for each of the 4 margins embedding neighboring scale points. Figure 2 depicts these points on the uniform scale. The adoption of such a scale provides flexibility and adaptability of the evaluation procedure to each indicator and, consequently, each index. The scaling is performed based on one or more reference points that can serve as the boundary values. The reference point can be a baseline value (e.g., the energy consumption of a building before foreseen interventions) or a threshold value (e.g., something causing irreversibility of the system) [91]. Additionally, reference points can be extracted from best available techniques (BAT), national regulations, commonly accepted standards, or goals/success target values and expert judgements. The selection of a reference point depends on the attributes and aim of the KPI. The reference point can indicate either a positive ($>X_{23}$, $X_{34}$, $X_{45}$) or a negative ($\leq X_{12}$, $<X_{23}$) "performance". For instance, if the examined indicator is "energy savings", a reference point for energy savings on a building level of over 60% (the target for a renovated building to be considered as nZEB [92,93]) could be assigned to $X_{45}$. On the other hand, a reference point of 0% or below (i.e., an increase in consumption) could be assigned to $X_{12}$. In many cases, the reference point could also be applied as the starting point for assigning the rest of the boundary values. This is largely applicable to the case where a target value is known (or set), such as when assessing the PPI through project-specified PSIs (it is evident that these target values are then project specific). In this case, the target value could be placed at the midpoint between the boundary reference points $X_{23}$ and $X_{34}$, and an index score of 3 denotes achieved performance.
Combined reference points could additionally be applied if necessary (one indicating failed and the other excelled performance). In this way, the evaluation scale is built upon a distance to commonly accepted boundaries, thus increasing objectivity of the results. It should be clarified that the distance between the boundary values does not need to be necessarily equal. Several examples are provided in Section 6 and in Appendixes A and B.
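The mapping of an indicator value onto the uniform scale can be sketched as follows. This is a minimal illustration under stated assumptions: scores of 1, 2 and 4 are assigned to the margins at or below $X_{12}$, between $X_{12}$ and $X_{23}$, and between $X_{34}$ and $X_{45}$, respectively. The function name, the `higher_is_better` flag and the intermediate boundary values (30% and 40%) are hypothetical; only the 0% and 60% reference points are taken from the energy savings example in the text.

```python
def score_indicator(value, x12, x23, x34, x45, higher_is_better=True):
    """Map an indicator value to a 1-5 score using four boundary values.

    Boundaries need not be equally spaced. For indicators where lower values
    are better (e.g., emissions), pass the boundaries in descending order and
    set higher_is_better=False; all quantities are then negated internally.
    """
    if not higher_is_better:
        value, x12, x23, x34, x45 = -value, -x12, -x23, -x34, -x45
    if not (x12 < x23 <= x34 < x45):
        raise ValueError("boundaries must be ordered from worst to best")
    if value <= x12:
        return 1
    if value < x23:
        return 2
    if value <= x34:
        return 3  # "achieved": between X23 and X34, inclusive
    if value < x45:
        return 4
    return 5      # "excelled": at or above X45


# Energy savings (%) of a building: X12 = 0 and X45 = 60 follow the text,
# X23 = 30 and X34 = 40 are hypothetical intermediate boundaries.
for savings in (-5, 20, 35, 50, 65):
    print(savings, score_indicator(savings, 0, 30, 40, 60))
```

Because the boundaries are passed explicitly per indicator, unequal distances between them are supported without any change to the scoring logic.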
Significant advantages derive from the adoption of the proposed normalization and evaluation procedure [89]. In many cases, it may be necessary for a city to include qualitative indicators in the analysis (especially in the case of, e.g., social indicators). Common normalization methods, such as z-score and min-max, require an adequate set of data to be efficiently applied. This may deter their application to newly operating city information platforms (CIP) that did not have an organized indicator tracking system until recently. Most methods would give the "best in class" building/PEB/PED/city the highest score. This seems fair at first glance; it does not, however, ensure that the specific building/PEB/PED/city is developed in a sustainable way, but only that it exhibits better performance compared to other relevant initiatives (or the baseline scenarios). With the proposed procedure, the "best in class" building/PEB/PED/city will still receive a better score in comparison with the benchmarked system; however, it will have to try harder to reach the highest score if the pre-determined performance thresholds are not met. This is more in accordance with the notion of sustainable development, according to which fundamental changes may be needed at various levels (institutional, legal, administrative, etc.). We further discuss the choice of normalization procedure in Section 7.

Weighting
In studies examining sustainability indices, the validity of the methods used to assign scores to indicators depends on the weighting methods used [27]. Although equal weighting offers a simple and replicable method, it has been questioned by scholars in terms of the validity and transparency of the indices' results [94,95]. Equal weighting implies an implicit judgement that the weights are equal, without taking into account knowledge of causal relationships within a subset of indicators related to a dimension [96] or the relative importance of each indicator to a specific index or sub-index categorization. Other statistical methods used in sustainability indices are principal component analysis (PCA) and factor analysis (FA); however, the original scope of those methods is to examine relationships and not to weight variables. Therefore, weights determined with these methods may result in important variables being assigned a lower weight due to statistically low correlations with other dimensions, instead of real correlations among assessed indicators [97]. Regression analysis assumes that there is no multi-collinearity (e.g., investment is often positively associated with energy efficiency and CO2 reductions, but all three are independently relevant for measuring sustainability). Unobserved component models [98] have been used in the literature for constructing aggregate governance indicators [99]. This approach facilitates weighting, aggregation, and index construction. However, it assumes that enough data are collected and that indicators are not highly correlated, while it is quite sensitive to outliers of an indicator, leading to a low weighting [96] of this indicator. The analytic hierarchy process (AHP), although used for multiple-criteria decision-making and for weighting indicators, presents two main disadvantages, which are the high number of pairwise comparisons required and the relatively small number of indicators in each dimension [96].
The budget allocation method (BAL) applies weighting on indicators based on expert opinion by distributing "n" points over a number of indicators [87,96]. The main disadvantages of this method are that weights may be based upon current needs at policy level in a specific region, while it is also questioned whether weightings are transferable to other regions with different context conditions [96]. Public opinion polling is a method where stakeholders express their "concern" towards a public agenda and weighting is based mainly on the respondents' concern rather than importance, which also raises questions on transferability to different local conditions. Finally, conjoint analysis (CA) assigns weights to indicators based on individual preferences, ranking a set of alternative scenarios. This method focuses on the preferences of respondents and requires a large sample and a large preference data set [100].
In the proposed framework, the participatory method of budget allocation (BAL) is selected for determining indicators' weights due to its transparency, explicitness and short time of execution. In order to establish a transparent weighting system, the expert pool includes experts from multiple disciplines with a wide spectrum of knowledge, experience and concerns, e.g., experts in energy efficiency, climate change, mobility, and ICT, technology providers, and financiers. Experts should also cover a wide geographical area to ensure that policy initiatives in a specific region do not determine the weights on indicators, thus rendering those weights transferable to other regions. Experts are introduced to the BAL method and allotted "n" points, which they can then distribute over a set of indicators in the different dimensions and, if applicable, at different spatial scales (e.g., building, city scale). Experts are advised to distribute more points to those indicators whose importance should be stressed [101]. In the case of different spatial scales, the BAL method is advised to be implemented for each spatial scale separately, since indicators might have different importance in the different units of analysis, i.e., (a) building level, (b) positive energy block (PEB) level (a PEB is defined as a collection of at least three buildings of different uses, i.e., residential, tertiary, in close proximity, having an average yearly positive energy balance [102]), (c) positive energy district (PED) level [79], and (d) city level. Figure 3 illustrates the BAL method as applied to the SPI for this current framework. The reader is referred to Section 5.3 for more details.
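A minimal sketch of how BAL point allocations could be turned into normalized indicator weights is given below. It assumes that every expert distributes the same budget of points and that the allocations of all experts are averaged with equal say; this averaging rule, the function name, the indicator names and the numbers are illustrative assumptions, not prescriptions of the framework.

```python
def bal_weights(expert_allocations, budget=100.0):
    """Turn per-expert point allocations into weights that sum to 1.

    expert_allocations: list of dicts mapping indicator name -> points,
    one dict per expert; each expert must spend exactly `budget` points.
    Assumption: experts are averaged with equal say.
    """
    if not expert_allocations:
        raise ValueError("at least one expert allocation is required")
    indicators = list(expert_allocations[0])
    totals = {name: 0.0 for name in indicators}
    for allocation in expert_allocations:
        if set(allocation) != set(indicators):
            raise ValueError("all experts must rate the same indicators")
        if any(points < 0 for points in allocation.values()):
            raise ValueError("points must be non-negative")
        if abs(sum(allocation.values()) - budget) > 1e-9:
            raise ValueError("each expert must distribute the full budget")
        for name, points in allocation.items():
            totals[name] += points
    n_experts = len(expert_allocations)
    return {name: totals[name] / (n_experts * budget) for name in indicators}


# Two hypothetical experts distributing 100 points over three indicators.
weights = bal_weights([
    {"res_share": 50, "energy_savings": 30, "co2_reduction": 20},
    {"res_share": 30, "energy_savings": 30, "co2_reduction": 40},
])
print(weights)
```

For different spatial scales, the same function would simply be called once per scale with the allocations collected for that scale.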

Aggregation
Aggregation methods are used for summing-up normalized values of sub-indicators to form sustainability indices, with the weighted arithmetic mean being the most commonly used method [27]. Additive aggregation assumes that there is no synergy or conflict between indicators and thence the contributions of all indicators can be added together to provide an overall value [96]. Weights used in additive aggregation methods mainly imply substitution rates in a compensatory logic, therefore no synergy between sub-indicators should apply when using this method [103].
Geometric aggregation methods are also used for sustainability indices, but less extensively. They use multiplicative instead of additive functions, with the weighted geometric mean being the most popular method used [27]. Geometric mean methods allow for limited compensability since, similarly to additive aggregation methods, geometric aggregation methods are considered preferentially dependent [87]. However, unlike with additive aggregation methods, sensitivity analysis and uncertainty quantification cannot be carried out using the measurement errors of indicators [104].
When compensation between sub-indicators in the construction of sustainability indices is not permitted, i.e., in strong sustainability indices, non-compensatory methods, i.e., conjunctive and disjunctive functions, are used [105]. However, these methods are of limited use for decision makers, since when values of sub-indicators are not extreme, their information is undermined [96]. Multicriteria decision making (MCDM) methods are also used as non-compensatory methods, adopting a decision maker preference approach [106]; however, they have computational limitations when the number of indicators increases [107].
In the proposed framework, the weighted arithmetic mean aggregation method is selected for constructing the three different indices, i.e., (a) the project performance index (PPI); (b) the sustainability impact index (SII); and (c) the sustainable performance index (SPI). The selection is based on the fact that additive aggregation methods allow for a compensatory logic when indicators' scores are low, as well as for sensitivity analysis, since the bound for the sustainability index can be precisely defined if the relative measurement error of a set of indicators is already known. More details per axis index are given in Section 5.
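The difference in compensability between additive and geometric aggregation can be made concrete with a short sketch. The function names and the two example scores are hypothetical; the scores are assumed to lie on the uniform 1 to 5 scale, so that the logarithm used by the geometric mean is always defined.

```python
import math


def _normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    return [w / total for w in weights]


def weighted_arithmetic_mean(scores, weights):
    """Additive (fully compensatory) aggregation."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * x for w, x in zip(_normalize(weights), scores))


def weighted_geometric_mean(scores, weights):
    """Multiplicative aggregation; requires strictly positive scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(x <= 0 for x in scores):
        raise ValueError("scores must be positive")
    return math.exp(sum(w * math.log(x) for w, x in zip(_normalize(weights), scores)))


# One excellent and one failed indicator with equal weights.
scores, weights = [5, 1], [1, 1]
print(weighted_arithmetic_mean(scores, weights))  # high score offsets the low one
print(weighted_geometric_mean(scores, weights))   # low score is penalized more
```

With these inputs the arithmetic mean lands on the middle of the scale, whereas the geometric mean stays noticeably below it, which is the limited compensability discussed above.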

Evaluating SCC Projects' Performance-The Project Performance Index (PPI)
The PPI leverages the project success indicators in order to provide a direct assessment of the project's success against its pre-defined goals. Figure 4 presents the workflow diagram for the calculation of the PPI. We divide the whole procedure into two main categories. The first one is the pre-evaluation (P) process, which includes all the necessary steps (P1-P2) that a project needs to take prior to the evaluation. These include: (a) Step P1: The definition of the PSIs along with their target values (which serve as the midpoint of the boundary reference points $X_{23}$ and $X_{34}$ on the evaluation scale). This step is typically set during the project's design phase; (b) Step P2: The calculation of the PSI values based on the actual project's results. This step is set during the monitoring phase of a project. The second category is the evaluation (E) process, which includes all the necessary steps for the actual evaluation of the project through the PPI (steps E1-E3). These include: (a) Step E1: The construction of appropriate margins, based on reference points, for the uniform evaluation scale per PSI. See Section 4.
To calculate the project performance index (PPI), a non-weighted harmonic mean is used as an averaging measure. The reason the harmonic mean is selected is that the PPI is ultimately a ratio of the actual performance achieved by a project to its targeted performance originally set. The harmonic mean is a measure that is dominated by the minimum of its arguments, offering a correct interpretation of the PPI. For example, where there is a large discrepancy in the scoring of PSIs, the PSIs that are marked as "failed" drag the PPI with a larger weight towards the left side of the uniform scale; this serves the ultimate purpose of a project, which is to achieve all its PSIs. The harmonic mean, H, is given by:
$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
where $x_1, x_2, \ldots, x_n$ are the PSI values and n is the number of PSIs.
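The effect of the harmonic mean on the PPI can be illustrated with a minimal sketch. It assumes that the PSI values entering the mean are the scores already normalized to the uniform 1 to 5 scale; the function name and the example scores are hypothetical.

```python
def project_performance_index(psi_scores):
    """Non-weighted harmonic mean of PSI scores (assumed to lie in 1..5)."""
    if not psi_scores:
        raise ValueError("at least one PSI score is required")
    if any(score <= 0 for score in psi_scores):
        raise ValueError("PSI scores must be strictly positive")
    return len(psi_scores) / sum(1.0 / score for score in psi_scores)


# Two excelled PSIs and one failed PSI (hypothetical project).
scores = [5, 5, 1]
ppi = project_performance_index(scores)
arithmetic = sum(scores) / len(scores)

# The failed PSI pulls the harmonic mean well below the arithmetic mean.
print(round(ppi, 3), round(arithmetic, 3))
```

In this hypothetical case the arithmetic mean would still suggest an "achieved" project, while the harmonic mean places it on the left side of the scale, reflecting that one PSI was missed.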

Evaluating SCC Projects' Sustainability Impact-The Sustainability Impact Index (SII)
The SII leverages the key performance indicators as defined by each SCC project. The defined KPIs, clustered under KPI dimensions, hold different weighting factors to showcase their relative contribution to each KPI dimension. The final evaluation metric (the SII) is derived by aggregating the KPI dimensions' scores. Figure 5 presents the workflow diagram for the calculation of the SII.
We divide again the whole calculation and evaluation procedure into two main categories: the pre-evaluation (P) process which includes all the necessary steps (P1-P4) that a project needs to take prior to the evaluation. These include:
To calculate the sustainability impact index (SII), a weighted arithmetic mean (using the BAL method described in Section 4.3 for calculating weights) is used as an averaging measure. The arithmetic mean in each dimension is calculated using the weighted functional values of the KPIs within the dimension, and the total SII is calculated by the arithmetic mean of all dimensions' values. The SII can be calculated for different spatial scales, i.e., a building, a PEB, a PED, or a city. Additive aggregation functions are used to determine the weighted arithmetic mean at the spatial scale selected. Assuming a multiset of KPI functional values $x_1, x_2, \ldots, x_n$, with corresponding non-negative weights $\omega_1, \omega_2, \ldots, \omega_n$ in a SII dimension, we calculate the weighted arithmetic mean, A, as:
$$A = \sum_{i=1}^{n} \omega_i x_i$$
where $\omega_i$ is the normalized weight of the ith KPI, obeying:
$$\sum_{i=1}^{n} \omega_i = 1$$
and given by:

Evaluating SCCs' All-Inclusive Sustainability Progress-The Sustainability Performance Index (SPI)
As described in Section 3, we had to make a clear distinction between two use-cases for the SCC project evaluation under the SPI. Nevertheless, the first essential step for this evaluation is to define the all-inclusive sectors under which KPIs should be clustered, a step pertaining to both use-cases. In order to define these sectors, we have first performed an extensive literature review on smart and sustainable cities, as well as on other urban sustainability evaluation frameworks, such as the urban sustainability framework (USF) developed by the global platform for sustainable cities [108], which builds upon an integrated city evaluation approach in order to deliver urban sustainability outcomes. The latter includes 4 sectors that rely on the outcomes that cities can achieve by addressing urban sustainability in line with the SDGs, namely: urban economies, natural environment and resources, climate action and resilience, and inclusivity and quality of life. In addition, the framework proposed by the Belt and Road Initiative-Developing Green Economies for cities (BRIDGE) entails a similarly oriented approach of sustainable city indexing based on four key principles promoting inclusive and sustainable urban-industrial development, namely urbanization-industrialization nexus; sustainable economy and social growth; shared prosperity; and resource efficiency and environmental sustainability; three of which coincide while all of them are linked to SDGs [109]. Second, we have based our definition on the nexus between SDG 11 "Sustainable Cities and Communities" and other SDGs, in parallel to global policy processes, commitments, roadmaps, and best practices related to smart city, urban resilience, sustainability, and climate neutrality, such as those reflected by the New Urban Agenda [110], the Coalition for Urban Transitions [111] and the European Green Deal [112].
The SDGs comprise a key instrument for cities to become smarter and more sustainable, while the need also emerges to develop robust assessment frameworks entailing all-inclusive sustainability areas that provide a shared vision of the way cities are evaluated. In particular, regarding EU strategy, the SDGs represent a priority towards smartness and innovation, a low-carbon future, climate resilience, job creation and poverty mitigation. Considering the objectives and outcomes of sustainability policies and frameworks worldwide, it is clear that a city cannot be smart and sustainable without efficient use of resources, progress in technology, climate response, as well as social engagement and promotion of better quality of life for people. It was also observed that most of the assessment frameworks and indices presented in Section 2 include measures and cover impacts related to those areas.
As a result, a first attempt was made to define and use all-inclusive sectors linked with the SDGs relevant to smart cities, which we believe offer a holistic, thematic, and comparable evaluation approach of the smart and sustainable performance of projects. Figure 6 depicts the four sectors along with the SDGs linked to each sector. The sectors adopted are the following:
• Sector 1: Resource Efficiency. Given the continuous growth of the earth's population, an increasing global demand for resources is recorded, which is also expected for the following decades. Resource efficiency is of utmost importance for smart cities striving to identify material, energy and human resources and link them properly in order to reduce environmental, economic and social risks and impacts and provide increased opportunities for sustainable living with greater productivity, lower costs, macroeconomic stability, and feasible consumer choices. The sector of "Resource efficiency" is highly relevant to aspects of natural and energy resources pertaining to the smart city ecosystem and the built environment including, but not limited to, RES pene-
In this context, this particular sector attempts to evaluate smart city projects in terms of public health, well-being, sustainable lifestyle, as well as economic development, i.e., whether the solutions and actions implemented can provide benefits, opportunities, and profits to the citizens. The most indicative aspects that should be covered by this sector are: air quality, reduced waste, water quality, reduced energy poverty, active transport and clean mobility, reduced noise, job creation and business opportunities, innovation uptake and in-city propagation, governance, citizen engagement, health and safety, and education.
As a consequence, this sector has strong links with SDGs #1 No poverty, #3 Good health and well-being, #4 Quality education, #5 Gender equality, #8 Decent work and economic growth, #11 Sustainable cities and communities, and #16 Peace, justice and strong institutions.
• Sector 4: Climate Change Adaptation and Mitigation. The fact that the future transformation of cities into liveable ones relies on an effective decarbonisation strategy at global and regional level sets the area of climate change adaptation and mitigation as a key pillar for sustainable results. The specific category is highly relevant to the successful performance of smart city projects regarding responses to climate targets pertaining to reduced GHG and pollutant emissions in compliance with common standards and strategies, as well as appropriate adaptation measures preventing climate risks, such as floods, etc., and considering local particularities and vulnerabilities. The sector is linked with SDG #13 Climate action, as well as SDGs #3 Good health and well-being, #7 Affordable and clean energy, #11 Sustainable cities and communities, and #12 Responsible consumption and production. The sector may include indicative aspects that focus on air quality, RES penetration, waste management, e-mobility, water and wastewater treatment, circularity and recycling, reduced pollution, land use and urban space, and climate resilience (also including nature-based resilience, green areas, trees, etc.).
The aspects addressed by each sector, as mentioned above, were identified with a view to covering most of the key variables and measurable fields that describe and enable the assessment of technological, social, and environmental systems and their interrelationships in the urban context. These aspects are based on focused topics within smart cities and relevant SCC projects, such as energy efficiency in the built environment, green mobility, low-carbon energy, digital innovation and ICT, energy and transport networks, water and waste, citizen engagement, etc. Most of them are addressed by a multitude of assessment frameworks that evaluate the performance of smart and sustainable cities or SCC projects, e.g., SCIS [8], Citykeys [113], or have been used for measurement purposes by composite city sustainability indices, e.g., the domains defined in the indexing of the European Green Capital Award [60]. In addition, a vast majority of those smart city aspects can be found as key focus areas or targets in global policies set by multi-governmental organizations like the United Nations and the European Commission or in the principal guides of relevant city initiatives, such as C40 Cities, ICLEI, etc., e.g., the guidance of reinventing cities to design a low-carbon, sustainable, and resilient project [114]. Note also that an aspect can belong to multiple sectors, which emphasizes the cross-SDG (and cross-dimensional) nature of the SPI.
Having set the four overarching SCC sectors, we can now proceed to describe the methodology for evaluating the SPI under both use-cases:
UC1: For on-going projects, the project's KPIs along with their clustering into KPI dimensions have already been defined. In this respect, each SCC sector is assessed by leveraging relevant KPIs from multiple KPI dimensions. Assuming a project with KPIs clustered under energy, ICT, economic, mobility, and social dimensions, the resource efficiency sector should include KPIs from the energy dimension (e.g., relevant to RES penetration), as well as KPIs from the ICT dimension (e.g., relevant to ICT measures in PEDs). KPIs belonging to a sector have different weights, emphasizing their relative importance towards the sector's objectives. Moreover, numerous aspects and, consequently, various relevant indicators affect more than one sector, so those indicators can be assigned to and included in all of these sectors. In this case, their weighting contribution also differs between sectors. A good illustration of this is the RES penetration aspect, which is covered by the "Resource efficiency" sector as well as by the "Climate change adaptation and mitigation" sector. As a result, a KPI that covers the aspect of RES penetration and contributes to both sectors, such as the "degree of energetic self-supply by RES", will not have the same weight within each sector. The weights characterize the contributions to the SPI and are defined according to the budget allocation method (BAL; see Section 4.3). We note here that, because the BAL method imposes a maximum of 10-12 indicators per sector to reduce cognitive stress on the experts [87], only the most important KPIs from all available dimensions should be selected and assigned into the 4 SCC sectors (a fact that might be beneficial in a future attempt to pre-define a common and limited set of KPIs per sector, as required in use-case 2).
The sectors are assessed by each sub-set of SPI-weighted KPIs and they can finally be condensed in a single SPI metric via equally aggregating, i.e., averaging their individual scores.
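The UC1 aggregation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the KPI names, scores, and weights below are hypothetical, and the function names `sector_score` and `spi` are introduced here only for the example. Note how the same KPI (`res_self_supply`) enters two sectors with different weights.

```python
from statistics import fmean

def sector_score(scores, weights):
    """Weighted arithmetic mean of the KPI scores assigned to one sector."""
    total_weight = sum(weights.values())
    return sum(w * scores[kpi] for kpi, w in weights.items()) / total_weight

def spi(scores, sector_weights):
    """Equal-weight average of the individual sector scores."""
    return fmean([sector_score(scores, w) for w in sector_weights.values()])

# Hypothetical KPI scores on the 1-5 evaluation scale
kpi_scores = {
    "res_self_supply": 4,
    "energy_savings": 2,
    "ict_measures": 5,
    "co2_reduction": 3,
    "satisfaction": 5,
}

# Hypothetical sector-specific weights (each sector sums to 1)
sector_weights = {
    "resource_efficiency": {"res_self_supply": 0.5, "energy_savings": 0.5},
    "smart_infrastructure": {"ict_measures": 1.0},
    "quality_of_life": {"satisfaction": 0.75, "energy_savings": 0.25},
    "climate": {"res_self_supply": 0.25, "co2_reduction": 0.75},
}

sector_results = {name: sector_score(kpi_scores, w) for name, w in sector_weights.items()}
spi_value = spi(kpi_scores, sector_weights)
print(sector_results, spi_value)
```

Dividing by the sum of the weights keeps the sketch valid even if a set of weights has not been normalized to 1.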
UC2: For future projects, the main objective of the SPI is to illustrate the value and contribution of a project towards the overall sustainability performance status of a city via a metric which evaluates the project against broad and cross-dimensional aspects pertaining to the smart and sustainable urban concept. This type of evaluation, which promotes the cross-sectional integration of smart city focus areas and attributes into all-inclusive SCC sectors, is able to analyze the sustainability progress of SCC projects beyond the KPI dimensions, thus assessing aspects affected by more than one dimension. It also lays the foundation for a fair, reliable, and comparative assessment between projects, unlocking the potential for assessing the sustainability performance of projects with different focus when using a pre-defined common KPI repository per sector. In this respect, each SCC sector should be assessed by leveraging KPIs attributed to each sector from a common-to-all-projects repository. This repository should include the most essential KPIs that cover all SCC aspects per sector. The definition of the common KPIs along with their functional units, target values, evaluation margins and weights per sector is a prerequisite for the SPI evaluation under use-case 2. This essential process is outside the scope of this work; once completed, it reduces the methodological steps required for evaluating the SPI per SCC project. We note here again that, due to the restrictions of the BAL weighting method, the common KPI repository per sector should include up to 10-12 indicators (i.e., a maximum of about 40 KPIs should be included in the global repository for all 4 SCC sectors). Each project should choose from this repository all KPIs relevant to the project pertaining to each sector. Then, the SCC sectors are assessed by each sub-set of pre-weighted KPIs and they can finally be condensed into a single SPI metric via equal aggregation, i.e., by averaging their individual scores.
In case a KPI in a particular sector is not relevant to the project's scope, a zero score is assigned to this KPI, lowering the project's total SPI score, in accordance with the index scope of providing an inclusive evaluation of the project's impact towards the total sustainability of a city. Figure 7 presents the workflow diagram for the calculation of the SPI under both use-cases. The required steps per use-case are also indicated. Once again, we divide the whole calculation and evaluation procedure into two main categories. The first is the pre-evaluation (P) process, which includes all the necessary steps (P1-P3) that a project needs to take prior to the evaluation. Note that P1-P3 have already been performed under the SII pre-evaluation procedure and they are thus redundant in the case of a complete evaluation under all three axes.
(a) Step P1: The definition of the KPIs. This step is typically set during the project's pre-monitoring phase;
(b) Step P2: The definition of baseline and target values per KPI. This step is typically set during the baseline monitoring phase of a project. Note that under UC2, target values should be pre-set for all KPIs in the common-KPI repository and are thus not required from each SCC project;
(c) Step P3: The calculation of KPIs in each aggregation level of interest (e.g., building, PEB, PED, city level).
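The effect of the zero-score rule under UC2 can be illustrated with a short sketch. The repository weights and project scores below are hypothetical, and `sector_score_uc2` is a name introduced here for illustration only; the point is that the weighted mean runs over all repository KPIs of a sector, not only over those the project addresses.

```python
def sector_score_uc2(repository_weights, project_scores):
    """Weighted mean over ALL repository KPIs of a sector.

    KPIs that the project does not address are missing from
    project_scores and therefore contribute a score of 0.
    """
    total_weight = sum(repository_weights.values())
    weighted_sum = sum(
        weight * project_scores.get(kpi, 0.0)
        for kpi, weight in repository_weights.items()
    )
    return weighted_sum / total_weight

# Hypothetical common repository for one sector (pre-set weights)
repository_weights = {"kpi_a": 0.5, "kpi_b": 0.25, "kpi_c": 0.25}

# The project addresses only two of the three repository KPIs
project_scores = {"kpi_a": 4, "kpi_b": 4}

inclusive_score = sector_score_uc2(repository_weights, project_scores)
print(inclusive_score)  # lower than 4 because kpi_c counts as zero
```

In this toy case the sector score drops from 4 (if only the addressed KPIs were averaged) to 3, which reflects the inclusive character of the index.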
The second category is the evaluation (E) process, which includes all the necessary steps for the actual evaluation of the project through the SPI (steps E1-E7). Note that E2, E3, and E4 have already been performed under the SII evaluation procedure (for all KPIs) and they are thus redundant in the case of a complete evaluation under all three axes. Nevertheless, we present them here again for completeness. Additionally, note that under UC2, the assignment of the KPIs into the 4 SCC sectors is automatic, as they should have been extracted from the common-KPI repository. Moreover, functional units, margins, and weights are also pre-set and thus only steps E4, E6, and E7 are relevant, as noted in Figure 7.
The evaluation steps for both use-cases of the SPI are as follows: (a) Step E1: The assignment of KPIs into the 4 SCC sectors. We note here again that up to 10-12 KPIs need to be assigned per sector as a prerequisite for the BAL weighting method. See Section 4.
To calculate the sustainability performance index (SPI), the weighted functional values of the KPIs defined in each sector are used to calculate the weighted arithmetic mean in each sector. The total SPI is calculated as the arithmetic mean of all sector values. The weights in the SPI need to be calculated with the BAL method as explained in Section 4.3 and are not the same as those used for calculating the SII, since each sector in the SPI is a different construct of indicators than the SII dimensions.
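In compact form, the calculation described above can be written as follows. The notation is introduced here for illustration and is not taken from the original text: $K_s$ denotes the set of KPIs assigned to sector $s$, $w_{s,i}$ the BAL weight of KPI $i$ within sector $s$, $f_i$ its functional value (score), and $S_s$ the resulting sector score.

```latex
S_s = \frac{\sum_{i \in K_s} w_{s,i}\, f_i}{\sum_{i \in K_s} w_{s,i}},
\qquad
\mathrm{SPI} = \frac{1}{4} \sum_{s=1}^{4} S_s
```

When the weights of each sector are normalized so that $\sum_{i \in K_s} w_{s,i} = 1$, the denominator drops out and $S_s$ reduces to a plain weighted sum.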

Illustrative Examples and Discussion on Index Scoring
With a view to acquiring an initial insight and to testing and pre-validating the applicability of the proposed evaluation framework, a provisional use-case has been defined building upon the preliminary outcomes of a positive energy city transformation framework (POCITYF), a European Horizon 2020 smart city project approved for funding in 2019. POCITYF identified and will orchestrate the demonstration of several solutions towards energy transition at two lighthouse cities (LH): Evora from Portugal and Alkmaar from the Netherlands; as well as the replication of solutions in six fellow cities (FC): Bari from Italy, Celje from Slovenia, Granada from Spain, Hvidovre from Denmark, Ioannina from Greece, and Ujpest from Hungary. A key characteristic of this project is its special focus on historical cities and buildings, attempting to demonstrate energy-oriented upgrades that are highly compatible with the respective challenges. During the first year of the project's implementation, the POCITYF KPIs were defined through a detailed methodological process which strongly relates to the needs of LH cities and their citizens towards their energy transition [17]. These needs include concerns on: (1) energy, (2) environmental, (3) social, (4) ICT, (5) mobility, (6) economic, (7) governance, and (8) diffusion and propagation dimensions, which also relate to the various stakeholders participating or being interested in POCITYF's interventions. A list of 63 KPIs is included in the final KPI repository of POCITYF, categorized in the eight dimensions, offering a holistic framework to assess four different aggregation levels: (a) building, (b) block, (c) district, and (d) city level. From this list, 37 KPIs (characterized as core KPIs by POCITYF) were utilized to assess the SII and SPI indices. Lastly, a series of PSIs (32 in total) have been identified which provide a global view of the project success and its impact towards green, smart, resilient and autonomous cities. All 32 PSIs were utilized to assess the PPI.

Key Assumptions
POCITYF is currently going through its second year of implementation and, as a result, the proposed framework can only be fully applied in the near future, once the monitoring phase is initiated. Nevertheless, POCITYF offers ready-to-apply lists of KPIs, PSIs, dimensions, and aggregation levels that can serve as an excellent test-bed for extracting some preliminary results. The key assumptions applied to test the evaluation framework are summarized below.
• SII and SPI indices were extracted on a city level only. In total, 31 out of 37 KPIs were applied to assess this level (the remaining KPIs focus on a building, block or district level only). Respective results can also be extracted for these aggregation levels, but for the sake of simplicity and due to space restrictions we opted for a city-level analysis only in this paper.
(3 points) whereas the rest of the evaluation scale has been marginalized based on the worst-in-class (<2) and best-in-class (>8) performing countries on this issue, assigning a score of 1 and 5 points, respectively. Setting solid and well-justified evaluation scales was found to be a very challenging process, but one of high added value. Developments on this subject are still on-going and a more detailed presentation of this issue is foreseen in the future.
• Target values applied in the PPI have been defined by POCITYF during proposal submission and reflect the project's own ambitions and expected impact. Minor modifications may apply in the future.
• The BAL method was deployed for determining weights and evaluating the impact on sustainability goals, i.e., for the SII. The method was also applied for determining weights and evaluating inclusive sectors in SCC projects, i.e., for the SPI. For the latter (SPI), an initial process of choosing the most important cross-dimensional KPIs per sector was performed based on a BAL-like method: KPIs with zero points assigned by all experts were excluded, while the maximum number of KPIs per sector was limited to 12, in order to comply with the BAL method requirements and reduce cognitive stress on the experts. In all of the above, the evaluation was performed by the authors of the paper, serving as a preliminary group of experts. A wide pool of experts from different fields and countries will be developed and applied during the final implementation of the methodology.
• The KPI values (input of functional unit values by the user) were assigned indicatively by the authors, considering a hypothetical performance of the POCITYF project by its end for one of the two LH cities.
Consequently, the results presented in the following sub-sections serve mostly as an illustrative example of the potential applicability of the proposed framework rather than an actual evaluation of the POCITYF project. In that respect, deep analysis and interpretation of the results fall outside the scope of this paper. Note that all data used to create the analytics and graphs in the following subsections are provided in the Annexes (Tables A1-A4). No elaborate scripts have been used for the calculations needed at this stage (calculations have been performed in MS Excel). The authors plan to fully implement the proposed framework, not only for the case of POCITYF but also for other SCC and sustainability-oriented projects, with a view to validating and re-adapting the proposed methodology if needed. To do so, a relevant software tool will be developed that will facilitate estimations and simplify procedures, thus increasing the potential applicability in several use-cases.
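The BAL-like pre-selection and weighting mentioned in the assumptions above could be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: we assume that every expert distributes the same budget of points (100 here) over the candidate KPIs of a sector, and that selection and weighting are done in one pass. The function name `bal_weights`, the KPI names, and the allocations are hypothetical.

```python
def bal_weights(allocations, max_kpis=12):
    """Derive normalized weights from expert budget allocations.

    allocations: list of dicts, one per expert, mapping KPI name -> points.
    KPIs with zero points from all experts are dropped, at most max_kpis
    KPIs are kept, and the remaining totals are normalized to sum to 1.
    """
    kpis = {kpi for expert in allocations for kpi in expert}
    totals = {kpi: sum(expert.get(kpi, 0) for expert in allocations) for kpi in kpis}
    # Keep only KPIs that received points, highest totals first (ties by name)
    ranked = sorted(
        (kpi for kpi in totals if totals[kpi] > 0),
        key=lambda kpi: (-totals[kpi], kpi),
    )[:max_kpis]
    grand_total = sum(totals[kpi] for kpi in ranked)
    return {kpi: totals[kpi] / grand_total for kpi in ranked}

# Two hypothetical experts, each distributing 100 points in one sector
expert_allocations = [
    {"energy_savings": 60, "res_self_supply": 40, "noise_reduction": 0},
    {"energy_savings": 40, "res_self_supply": 60, "noise_reduction": 0},
]

weights = bal_weights(expert_allocations)
print(weights)
```

Because every expert is assumed to spend the same budget, normalizing the summed points is equivalent to averaging the experts' individual weight vectors and renormalizing after the exclusion step.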

PPI Results
PPI results (project level) for the case of POCITYF are summarized in Appendix A, Table A1. The final PPI score was 2.6 (based on the harmonic mean of all PSI scores). This means that, overall, the project can be considered close to successful against the call's expected impacts, having achieved a satisfactory performance (score ≥ 3 points) in most of the defined PSIs. More specifically, 23 out of 32 PSIs met or even surpassed the target value, three of which (V2G storage within PEBs; Total carbon dioxide emission reduction; Number of peer-reviewed publications due to POCITYF activities) gained the maximum score (5 points), exhibiting an excellent performance well above the expected one. On the other hand, two PSIs (Battery storage within PEBs; Number of new and feasible product ideas generated within the project duration) failed (1 point) to reach the target values. This can be attributed both to technical reasons and to unrealistic targets set during the design phase of the project. For instance, the expected number of new product ideas within the project duration was found to be overambitious, since these ideas can actually be generated only after the wider-scale exploitation and market penetration of the solutions. Estimating the PPI at regular intervals (e.g., every 6 months or annually) can help identify problematic areas affecting the project performance and enable timely mitigation actions.
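Since the PPI is based on the harmonic mean of the PSI scores, a few low scores pull the index down more strongly than they would under an arithmetic mean, which is consistent with an overall score below 3 even when most PSIs reach 3 points or more. The following sketch uses the Python standard library and a short list of hypothetical PSI scores (not the values of Table A1) to show the effect.

```python
from statistics import fmean, harmonic_mean

# Hypothetical PSI scores on the 1-5 scale (not the POCITYF data)
psi_scores = [5, 4, 3, 3, 1]

ppi = harmonic_mean(psi_scores)        # penalizes the single failed PSI
arithmetic_reference = fmean(psi_scores)

print(f"harmonic mean (PPI-style): {ppi:.2f}")
print(f"arithmetic mean:           {arithmetic_reference:.2f}")
```

Here the single failed PSI (1 point) lowers the harmonic mean well below the arithmetic mean of the same scores.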

SII Results
SII results on a city level for the case of POCITYF are summarized in Appendix B, Table A2. The final SII score was 3.6. This means that, overall, the project can be considered successful, achieving an above-average performance towards meeting the city's sustainability goals in most of the dimensions examined (Figure 8). The project achieves close to excellent performance in the social dimension (4.2 points), exhibiting the maximum performance (5 points) in the KPI "degree of satisfaction" with the implemented solutions. Most smart city projects, including POCITYF, adopt a citizen-oriented approach, trying to involve and actively engage citizens in the city transformation process. Coupled with extensive dissemination activities, these projects are able to reach a very wide audience, enhancing social performance. The next best performing dimension was ICT (4.1 points), which is to be expected considering that ICT and respective measures are vital towards the "smartification" of cities. On the other hand, the project scored lower (2.8 points) in the economic dimension. This result highlights a key challenge that most city projects need to overcome, namely how to achieve cost-efficiency when applying innovative technologies, which are usually characterized by increased costs in comparison with conventional ones. For this reason, SCC projects usually also provide business models to support future exploitation, but this can only be reflected in the SII score in a future implementation (some years after the project's end). A significant margin for improvement is available if we consider that the project was not able to achieve a high performance in some KPIs that are characterized by increased importance (weight). An indicative example (Figure 9) is the KPI "energy savings", for which a high weight has been assigned. Energy savings of 19% were achieved, which leads to a score of 2 points (the threshold value was 32.5%).
This type of analysis, considering all KPIs, can help projects and decision makers with limited budgets to focus on the issues that will have the highest impact on the city's sustainability score.
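One simple way to operationalize this prioritization is to rank KPIs by the weighted score they could still gain, i.e., weight multiplied by the distance to the maximum score. This ranking rule is an assumption made here for illustration and is not prescribed by the framework; the KPI entries, weights, and the function name `improvement_potential` are likewise hypothetical.

```python
def improvement_potential(kpis, max_score=5):
    """Rank KPIs by the weighted score still available.

    kpis: dict mapping KPI name -> (weight, current score).
    Returns (name, potential gain) pairs, largest gain first.
    """
    gains = {
        name: weight * (max_score - score)
        for name, (weight, score) in kpis.items()
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)

# Hypothetical weights and scores for three KPIs
kpis = {
    "energy_savings": (0.25, 2),           # high weight, low score
    "degree_of_satisfaction": (0.125, 5),  # already at the maximum
    "kpi_b": (0.125, 3),
}

ranking = improvement_potential(kpis)
for name, gain in ranking:
    print(f"{name}: potential gain {gain:.3f}")
```

A highly weighted KPI with a low score, such as the energy savings example discussed above, appears at the top of such a ranking.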

SPI Results
SPI results on a city level for the case of POCITYF are summarized in Appendix C, Tables A3 and A4. The examined city exhibited a balanced performance in all four sectors: resource efficiency (3.7 points); smart and reliable infrastructure (3.7 points); quality of life and prosperity (3.4 points); climate change adaptation and mitigation (3.4 points). The city exhibits a very high performance regarding the utilization of local RES and relevant aspects but puts much less emphasis on increasing the energy efficiency of its building stock, which has been indicated as a key aspect for further improvement. The wide-scale roll-out of EVs is another aspect whose improvement would increase the score in several sectors. The total SPI score was 3.6 points, equal to the SII score. This is an indication that the KPIs selected by the POCITYF project are well-defined and adequately cover the key critical aspects affecting the overall sustainability on a city level. As expected, several KPIs were included in more than one sector, e.g., energy savings, increased system flexibility, degree of energetic self-supply by RES, carbon dioxide emission reductions, etc., but with a different weight. For example, the KPI "energy savings" was found to be the most significant one in the resource efficiency sector (weight 0.209) but also contributes to the quality of life sector with less significance (weight 0.064), since energy savings can lead, among other benefits, to reduced energy costs and make it easier to ensure a comfortable indoor environment. This is in accordance with the SPI goals and objectives, according to which KPIs belonging to different dimensions may affect different all-inclusive sectors.
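The total SPI can be cross-checked from the four sector scores reported above, since it is their plain arithmetic mean. The sketch below uses the rounded sector values from the text; the dictionary keys are shorthand labels introduced here.

```python
from statistics import fmean

# Sector scores as reported in the text (rounded to one decimal)
reported_sector_scores = {
    "resource_efficiency": 3.7,
    "smart_and_reliable_infrastructure": 3.7,
    "quality_of_life_and_prosperity": 3.4,
    "climate_change_adaptation_and_mitigation": 3.4,
}

spi_from_sectors = fmean(list(reported_sector_scores.values()))
print(f"SPI from rounded sector scores: {spi_from_sectors:.2f}")
```

The mean of the rounded sector values is about 3.55, which agrees with the reported total of 3.6 points within the rounding of the inputs.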

Discussion
As elaborated throughout this work, the proposed evaluation framework presents strong benefits compared to existing ones. Nevertheless, we mention and comment below on shortcomings and on the steps that need to be taken before it reaches its full potential:
• The implementation of the current framework is foreseen to occur in the coming years for several EU-funded SCC projects. The illustrative example provided in Section 6 should be considered a fictitious case study due to the randomly assigned KPI values. As such, the authors plan to publish concrete and real data in the future as soon as they are available, illustrating the capabilities of the USE framework in assessing the sustainability performance of SCC projects and elaborating on the results with thorough analysis.
• Concerning the SPI, we have already mentioned that under UC2 (future projects), the correct implementation of this axis evaluation requires a common KPI repository per sector, in order to obtain comparable results and a consistent benchmarking between cities. The process of populating such repositories is not straightforward. It should involve a variety of stakeholders (city authorities, technology providers, research institutes, policy makers, citizens, etc.) with adequate expertise, as well as diversity in each sector, so that all relevant aspects are covered and each sector truly becomes an overarching group which contributes to the total sustainability of a city. Moreover, we are fully aware that setting the SCC sectors and clearly defining their key aspects is a challenging and demanding process on its own. The authors have already started collecting data towards the definition of the common KPIs per sector, while working on a more elaborate justification for the definition of the SCC sectors. This work is outside the scope of the current article, and we plan to refine the preliminary sectors defined herein and their aspects in the near future.
• The normalization procedure described in Section 4 requires finding well-accepted reference points for every indicator. This process is time- and effort-intensive, and a level of subjectivity is still involved. Reference points should be based on commonly accepted data and targets to increase objectivity as far as possible. In order to reduce uncertainty, it is proposed that these targets be re-evaluated and modified regularly. Still, we consider this a step forward in comparison with business-as-usual practice, where indicators are mostly assessed based on the increase or decrease in their value. Additionally, by utilizing a 5-point scale a lot of information is lost, which may lead to an accumulation of scores in the same cluster (e.g., many buildings/PEBs/PEDs/cities with the same score). Although for a single KPI it is very likely that scores will coincide, for a higher number of KPIs (as usually applied by SCC projects) this is unlikely to happen. It is further proposed that the city should still perform the traditional indicator analysis (e.g., examine trends of absolute values over consecutive years) in order to identify more specific internal problems or opportunities for improvement.
• The evaluation on different spatial scales of a city (e.g., building level, PEB level, PED level, etc.) is inherent in the USE framework. Adequate aggregation techniques can be used to move from one level to the next (e.g., summing up the KPI contributions of the buildings that constitute a PEB can provide the required PEB value). Such aggregation might seem oversimplified, and it certainly cannot take into account particular aspects of each lower-level component (e.g., buildings) that might contribute non-uniformly to the upper level of evaluation (e.g., PEB). Moving to even higher spatial aggregation levels, such an approach becomes further complicated since, typically in SCC projects, the union of different PEBs is not equal to a whole PED (and similarly for several PEDs aggregated to a city level). Choosing representative components of each evaluation level is ambiguous, although averaging provides a simple solution. In any case, the clear definition of these spatial scales inside an SCC project is pending and thus we plan to redefine, if necessary, the aggregation techniques inside the USE framework to comply with the SCC standards.
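The two simple aggregation options mentioned above, summing and averaging, can be sketched as follows. The choice between them is treated here as an assumption: summing suits additive quantities (e.g., energy produced), whereas averaging is a simple fallback for scores or ratios. The function name `aggregate_to_upper_level` and the building-level numbers are hypothetical.

```python
def aggregate_to_upper_level(values, method="sum"):
    """Aggregate lower-level KPI values (e.g., buildings) to the next level (e.g., PEB)."""
    if not values:
        raise ValueError("no lower-level values to aggregate")
    if method == "sum":
        return sum(values)
    if method == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unknown aggregation method: {method}")

# Hypothetical building-level data for the buildings forming one PEB
building_res_generation_mwh = [120.0, 80.0, 40.0]  # additive quantity
building_kpi_scores = [4, 3, 5]                    # 1-5 scores

peb_generation = aggregate_to_upper_level(building_res_generation_mwh, "sum")
peb_score = aggregate_to_upper_level(building_kpi_scores, "mean")
print(peb_generation, peb_score)
```

Neither option captures non-uniform contributions of individual buildings, which is exactly the limitation noted in the discussion; a weighted variant would need additional assumptions on what the weights represent.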

Conclusions
The outcomes of this study serve as a major first step towards the deployment of an inclusive and uniform evaluation framework that is able to assess in parallel the impact, performance, and sustainability potential of smart city projects. The proposed USE framework can support the needs of various stakeholders related to the development and implementation of smart city projects and initiatives, such as project managers, technical experts, public authorities, decision makers and urban planners, who wish to apply an all-inclusive evaluation and monitoring procedure. The utilization of widely accepted reference points, upon which evaluation scales are defined, supports strong sustainability assessments, since the distance to a sustainable target is integrated and reflected in the final evaluation. The paper also summarizes several insights on key characteristics and limitations of currently available urban sustainability and smartness evaluation frameworks and indices, as well as recommendations on normalization, weighting, and aggregation procedures. This information can be valuable for those who wish to develop or revise their own index-based evaluation methodology. The preliminary application of the USE framework in an on-going SCC project confirmed its potential applicability for assessing and comparing the success of a project and the extent to which it contributes to the city's sustainability progress against EU goals. The authors plan to implement USE in several case studies in the future to fine-tune the proposed steps and validate its applicability.