A Foundation for Measuring Community Sustainability

In order to understand the impact of individual communities on global sustainability, we need a community sustainability assessment system (CSAS). While many sustainability assessment systems exist, they prove inadequate to the task. This article presents the results of a systematic review of the literature on existing sustainability assessment systems; offers a definition of a sustainable community; provides a multi-scale, systems approach to thinking about community; and makes recommendations from the field of performance measurement for the construction of a CSAS.


Introduction
The measurement of features critical to the functionality and livability of communities is a fundamental question for local governments [1]. External benchmarking is crucial, and traditional areas of public service delivery, such as human resource management or social services provision, have some common benchmarks [2]. However, even in these areas, limited time, a lack of technical capacity, and financial restrictions challenge the ability of municipalities and other subnational units of the government to understand whether policies result in desired outcomes. Additionally, the scholarly reporting of these service provision efforts remains sparse [3].

Why Sustainable Communities?
The United Nations estimates that more than half of the world's population now resides in urban areas and that this percentage will reach 66% by 2050 [10]. If we want to understand global sustainability, we must also be able to measure sustainability at the scale of individual urban areas. One challenge facing the sustainability analysis of urban areas, however, is defining the term. Definitions of urban areas vary, and these differences affect the choice of the phenomena under consideration. For example, the United Nations Population Division estimates the global urban population using the definitions of "urban" adopted by individual countries. In its analysis, four countries classify areas with populations as low as 200 people as "urban". More commonly, countries use population sizes of at least 2000 residents (23 countries) or 5000 residents (21 countries). Other nations, however, use population density to determine what constitutes urban. Urban areas can also be divided into different classes. The United States Census Bureau, for example, distinguishes between urbanized areas (>50,000 residents) and urban clusters (2500-50,000 residents) [11]. Thus, any metric of sustainability in urban areas must take into account that the conditions under which an area is recognized as urban vary greatly by location.
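The effect of threshold-based definitions can be illustrated with a minimal sketch. The function names below are hypothetical, and treating exactly 50,000 residents as an urbanized area is an assumption about the boundary case; the thresholds themselves are the ones quoted above.

```python
def classify_us_census_area(population: int) -> str:
    """Classify an area using the U.S. Census Bureau classes quoted in the text.

    Assumption: a population of exactly 50,000 counts as an urbanized area.
    """
    if population < 0:
        raise ValueError("population must be non-negative")
    if population >= 50_000:
        return "urbanized area"
    if population >= 2_500:
        return "urban cluster"
    return "not urban"


def is_urban(population: int, national_minimum: int) -> bool:
    """Apply a country-specific minimum population size (e.g., 200, 2000, or 5000)."""
    return population >= national_minimum


# The same settlement of 3,000 residents is urban under some national
# definitions and not under others.
for minimum in (200, 2_000, 5_000):
    print(minimum, is_urban(3_000, minimum))
```

The loop shows why cross-national urban statistics are not directly comparable: the classification of a single settlement changes with the threshold applied to it.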
In addition to being subject to arbitrary criteria, a numerically driven definition of "urban" fails to capture the degree to which these areas are environmentally, economically, and socially integrated, that is, the extent to which they function as a community. The term "community" conveys two important characteristics related to the study of sustainability. First, like urban, it refers to a geographic area, specifically one in which people may be connected to each other. Most people live, work, and receive their education and health care within communities. While resources such as water, energy, and other goods that a community requires can be sourced from a distance, they are consumed locally, and much waste is treated or even disposed of locally. Second, unlike urban (and the related term "city"), community implies an amorphous boundary that is defined by its social and functional relationships. For example, the community boundaries for understanding waste may be different from the community boundaries that are used to consider water, education, or health care. In this way, communities may be "local" or "regional". While local and regional are relative terms, we use local to refer to the smallest administrative entity relevant to a system (e.g., municipality or school district) and regional to refer to conglomerations of such entities. That allows us to distinguish, for example, between a local water authority and a regional watershed or between a local hospital and a regional health care system.
The focus on sustainability at the scale of the community avoids problems associated with the notion of "sustainable development". We argue that due to the importance of communities to the economy, the environment, social services, and governance, we need to move from a focus on sustainable development to sustainable communities, a notion grounded in the sustainability of a specific place. Such a shift has already begun. The UN Sustainable Development Goals (UN SDGs) [12], for example, address sustainable cities and communities (UN SDG #11), and sustainability at the local level was the focus of Local Agenda 21, which was conceptualized in chapter 28 of Agenda 21, adopted by 178 governments in 1992 at the Rio Conference on Environment and Development.
Despite the increasing emphasis on sustainable communities, knowledge gaps remain. According to the Advisory Committee for Environmental Research and Education [4], scientists, professionals, and policy makers lack the knowledge to identify the key levers of change in communities and the interactions between systems and scales to inform policy decisions. Consequently, there is a need for the development of a classification that can be used to comparatively assess how "environmental and social heterogeneities across cities influence sustainability outcomes" [4] (p. 20). To meaningfully compare communities for their degree of sustainability, metrics are needed. These metrics provide an empirically sufficient [13] foundation for constructing a classification for sustainable communities. The metrics also enable us to identify the strengths and weaknesses of communities in terms of their sustainability. Thus, we can subsequently use such findings to shape the policy and practice that will best lead to greater degrees of sustainability within the community.
There are myriad sustainability assessment systems (SAS) that measure sustainability. These range from the neighborhood (e.g., Leadership in Energy and Environmental Design-Neighborhood Development (LEED-ND)) to the national (e.g., Living Planet Index) and to the global scale (e.g., United Nations Sustainable Development Goals). While these SAS offer features that contribute toward understanding community sustainability, each has flaws that make it insufficient for evaluating sustainability at the scale of communities in ways that allow us to identify gaps or critical factors that contribute to their success. Thus, this paper is the first step in identifying the flaws in existing sustainability assessment systems for communities and in making suggestions for how they may be overcome.

Methods
In order to identify the literature on sustainability assessment systems, we first conducted Google Scholar searches for the following three terms: "sustainability assessment system", "urban sustainability assessment", and "sustainability reporting tool". We included in our review articles and books that were about assessment systems rather than applications of assessment systems. We excluded working papers and conference presentations. Additionally, because we were ultimately interested in constructing a community sustainability assessment system, we focused on sources about place-based assessment systems (e.g., neighborhoods, urban areas, states, nations, and the globe) rather than industry, buildings, or particular ecosystems. From this initial list of 23 sources, we used a snowball method using citations found within these sources to expand our search. In all, we identified 36 articles and books that we used in the critique and recommendations.
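The snowball step can be sketched as a breadth-first expansion over a citation graph. This is an illustration only, not the procedure actually used: the function name, the `max_rounds` limit, the relevance test, and the toy citation data are all hypothetical.

```python
from collections import deque


def snowball(seeds, cites, is_relevant, max_rounds=2):
    """Expand a seed set of sources by following their citations.

    seeds       -- initial sources found by keyword search
    cites       -- dict mapping a source to the sources it cites
    is_relevant -- predicate encoding the inclusion criteria
    max_rounds  -- how many citation hops to follow (an assumption)
    """
    selected = set(seeds)
    frontier = deque((source, 0) for source in seeds)
    while frontier:
        source, depth = frontier.popleft()
        if depth >= max_rounds:
            continue  # do not follow citations beyond the allowed depth
        for cited in cites.get(source, []):
            if cited not in selected and is_relevant(cited):
                selected.add(cited)
                frontier.append((cited, depth + 1))
    return selected


# Hypothetical citation graph; "C" stands for an excluded source type,
# such as a working paper.
cites = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
found = snowball(["A"], cites, is_relevant=lambda s: s != "C")
print(sorted(found))
```

With two rounds, the sketch reaches sources cited by the seeds and sources cited by those, which mirrors how a seed list can grow into a larger review corpus.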

Categories of Systems
According to Ameen and colleagues [14], the existing sustainability assessment systems (SAS) can be usefully grouped into three sets based on their stated purpose: (1) systems for planning sustainable development or "projects", (2) "indices" for measuring sustainability, and (3) "tools" for promoting sustainability. We have adopted this classification for the purposes of this review. Each of these sets can then be characterized by the geographic scale for which they are applicable: local, national, or global (see Table 1).
The first category of SAS includes what Ameen and colleagues [14] call "projects", although they may be better considered as guidance documents. In this category, Ameen and colleagues include the United Nations Sustainable Development Goals (UN SDGs). The UN SDGs can be seen as the culmination of a process that began with the Brundtland Commission [15], which popularized the term "sustainable development". Since 1987, achieving global sustainable development and its three main components (economic growth, environmental protection, and social equity) has been a focus of the United Nations. In the area of assessment, the Brundtland Commission identified the need "to set benchmarks and to maintain human progress within the guidelines of human needs and natural laws" [15] (p. 27), which resulted in Agenda 21, adopted in 1992, as the first global project assessment system with strategic imperatives and a clear set of objectives. In 2000, this strategic planning approach to sustainable development initiated by the Brundtland Commission was followed by the creation of eight Millennium Development Goals. These goals included measurable targets, such as halving, by 2015, the percentage of people whose income is less than $1.25 a day. In 2015, the United Nations replaced the Millennium Development Goals with the SDGs, a system of 17 goals, with 169 associated targets and 245 indicators. The SDGs are described as a global indicator framework. As such, many of the targets are difficult to implement at the scale of a community. For instance, one of the targets under Goal 2: Zero Hunger is to "maintain the genetic diversity of seeds, cultivated plants and farmed and domesticated animals and their related wild species" [16]. This target is relevant primarily for scales above the community.
The second category of SAS comprises a group of indices that were developed by the United Nations as well as by academics, think tanks, and other entities. Unlike the projects in the first category, these indices are designed to measure sustainability or aspects of sustainability. They do not, however, include the strategic planning or action elements that are associated with the projects. For example, the Human Development Index has an indicator for life expectancy in a nation but no subordinate measures of policy adoption, such as the percentage of the population with potable water, that would lead to longer lives. In addition, the indices are not restricted to global- or national-level sustainability, and some can be used to examine communities at the regional or local levels.
The third category of SAS is composed of tools for creating sustainable communities. Usually, these tools are not scaled-down versions of the global or national indicators but rather scaled-up applications of green building indicators. These SAS are typically applied at a sub-municipal level and are, therefore, often called Neighborhood Sustainability Assessments (NSA). The three most cited are LEED-ND, BREEAM-C, and CASBEE-UD. Although more than 20 of these NSAs have been developed, we focus on these three as they are the most often used and studied [17]. A fourth tool in this category is the STAR Community Index. Although the STAR Community Index provides a measure of sustainability based on a 5-star system, the primary purpose of STAR is to evaluate, improve, and certify sustainability at the municipal level rather than the neighborhood level. (As of December 2018, STAR has merged with the US Green Building Council and become LEED for Cities and Communities, which is under development and piloting as of this writing.)

Flaws in Existing SAS
Accompanying the proliferation of SAS is a growing literature that criticizes them, and no commonly accepted SAS for communities has emerged [18]. Critiques of the metrics tend to be narrowly focused, with authors evaluating only specific subsets of SAS. For instance, there is a substantial literature concerning neighborhood sustainability assessments (NSAs), particularly LEED-ND, BREEAM-C, and CASBEE-UD [19-26]. The STAR Community Index falls into the same category of SAS (i.e., tools), though it has been evaluated separately [27,28]. Other authors focus on the indices [18,29], and some focus on the United Nations SDGs [30-32]. Still others take a more general approach and critique not only these named SAS but also studies of sustainability by various authors or SASs made for specific communities [33-35]. Although these systems have been critiqued separately, they share many of the same problems. The flaws of the existing systems can be broadly grouped into three categories: (1) unstated or flawed definition of sustainability; (2) issues of scale; and (3) problems associated with implementing the SASs themselves. Below, we look at each of these flaws in turn (see Table 2 for a summary).
First, it is difficult, if not impossible, to properly assess whether a community is sustainable when the term "sustainable" is not defined in the context of the system by which a community is being assessed. An important criticism of the NSA tools, for example, is that they do not explicitly state their definition of sustainability [19,24]. The lack of such a definition creates problems of comparability and leaves unclear exactly what is being sustained.
Second, critics of various SASs point out that many do not address the three pillars of sustainability: economy, environment, and social equity [19,21,23,29,36]. Berardi [19], for example, notes that the major NSAs are dominated by approaches that focus on environmental sustainability but fail to assess social equity and economic sustainability. In a review of 69 sustainability assessments, Cohen [33] found only 22 that applied a three-pillar approach and only 26 that were organized around environmental and social sustainability. Of the indices, Mori and Christodoulou [18]  however, did not. Additionally, several authors argue that a fourth pillar, governance, should be added to the three already mentioned [26,29,37,38].
Third, some criticisms of SASs, particularly the NSA tools, have focused on the lack of recognition of the interactions that exist between the factors that are being measured to assess sustainability [21,26,28,33]. Cohen [33] attributes the absence of interaction factors to the lack of a theoretical framework for community as opposed to national or global sustainability systems. The lack of recognition of the complex interrelationships between sustainability factors can result in reduced sustainability in some contexts due to the trade-offs that exist between indicators [28,39]. At the same time, a covariance between indicators may lead to the double-counting and skewing of sustainability assessments [28]. LeBlanc [31] (pp. 176-177) argues that the "lack of integration across sectors in terms of strategies, policies, and implementation has long been perceived as one of the main pitfalls of previous approaches to sustainable development. Insufficient understanding of and accounting for trade-offs and synergies across sectors have resulted in incoherent policies, adverse impacts of development policies focused on specific sectors and not other sectors, and ultimately in diverging outcomes and trends across broad objectives for sustainable development." Allen and colleagues [30] (p. 979) argue that the UN SDGs, in contrast, take a "nested" approach "which places the economy within society, and society in turn within the Earth's life-support systems, also emphasizing linkages to crosscutting issues such as governance and means of implementation. [UN SDGs] also emphasize key sustainability science concepts such as limits and thresholds, integration, systems thinking, decoupling, and resilience".
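The double-counting concern can be made concrete with a small screening sketch that flags pairs of indicators whose values move together across communities. The indicator names, the data, and the 0.9 cutoff are hypothetical assumptions chosen for illustration; none of the reviewed SAS is known to use this procedure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(indicators, threshold=0.9):
    """Return indicator pairs whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(indicators.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical indicator values for five communities.
indicators = {
    "transit_ridership": [10, 20, 30, 40, 50],
    "vehicle_miles": [50, 41, 29, 20, 10],
    "tree_canopy": [30, 10, 40, 20, 35],
}
flagged = redundant_pairs(indicators)
print(flagged)
```

A flagged pair does not by itself show that one indicator should be dropped; it only marks where a summed score may count the same underlying condition twice and where the relationship between indicators deserves explicit treatment.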
Fourth, the trade-offs that are inherent in the interrelations between the three pillars of sustainability raise the question of whether it is acceptable to substitute natural capital for other forms of capital. Daly [40] differentiates between strong sustainability, which seeks to keep natural capital intact, and weak sustainability, which is based on the premise that manufactured capital can be substituted for natural capital. Berardi [19] (p. 1588) claims that the NSAs "promote a weak sustainability and accept that economic development can reduce natural capital. This choice reduces their capability to measure sustainability in the long term". The City Development Index (CDI), the only index that focuses on the community level, also takes a weak sustainability perspective [18]. Several evaluators [18,41] have argued that SAS should take a strong sustainability approach, while others [29] have argued that the creators of SAS should at least be explicit about the approach that they are taking.
Fifth, the SAS tools focus on prescriptive actions rather than measurements of outcomes. Prescriptive measures can actually undermine the goals of sustainability when there is no clear evidence that a given prescription leads to a sustainable outcome [19,25,42]. For instance, Sharifi and Murayama [25] claim that all of the NSAs encourage an increased population density (i.e., the goal of minimizing the footprint of populations) without clear evidence that increasing density always increases sustainability.
Sixth, the method by which the various indicators in NSA tools are weighted to assess the overall sustainability has been criticized [19,23,28]. Berardi [19] (p. 1582) notes that the different weights given by urban neighborhood sustainability assessment systems are "probably influenced by the different scopes and applications of the systems." For instance, LEED-ND was developed for new communities, while BREEAM-C was focused on the rehabilitation of existing communities. The weights for such systems are not justified by the organizations that created them [28] and, without a stated definition of sustainability, it is difficult to determine what they are trying to maximize or represent.
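Why unjustified weights matter can be shown with a minimal sketch of a weighted composite score. The communities, pillar scores, and weight sets below are hypothetical and are not taken from LEED-ND, BREEAM-C, or any other system discussed here.

```python
def composite_score(scores, weights):
    """Weighted average of pillar scores; weights are normalized by their sum."""
    total_weight = sum(weights.values())
    return sum(scores[pillar] * w for pillar, w in weights.items()) / total_weight


def rank(communities, weights):
    """Order community names from highest to lowest composite score."""
    return sorted(
        communities,
        key=lambda name: composite_score(communities[name], weights),
        reverse=True,
    )


# Hypothetical pillar scores on a 0-1 scale.
communities = {
    "A": {"environment": 0.9, "economy": 0.4, "equity": 0.5},
    "B": {"environment": 0.5, "economy": 0.8, "equity": 0.7},
}
environment_heavy = {"environment": 0.6, "economy": 0.2, "equity": 0.2}
equal = {"environment": 1.0, "economy": 1.0, "equity": 1.0}

print(rank(communities, environment_heavy))
print(rank(communities, equal))
```

The two weight sets reverse the ranking of the same two communities, which is why a weighting scheme that is not tied to a stated definition of sustainability makes the resulting score hard to interpret.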

Issues Related to Scale
There are two issues associated with scale. First, with the exception of the CDI, the SAS in our first two categories, projects and indices, are aimed at measuring sustainability at the global or national levels rather than at the local community level. This mismatch between the levels of analysis for which the SAS were created and the need for analyses at the local scale poses an issue. Global or national indicators are not necessarily useful for studying local or regional communities. For instance, the Environmental Vulnerability Index (EVI) and the Living Planet Index (LPI) are meaningless in the context of cities [18]. On the other hand, tools for creating sustainable communities have also been criticized for being too localized and for favoring the parochial interests of the nation/region that created them [23]. Not all criteria for global and national projects and indices are relevant in all contexts [25]. Finally, the Environmental Sustainability Index (ESI) tends to favor developed nations because it emphasizes capacity [18].
A second issue is that SASs focused at the subnational scale often face the problem of using unstated or flawed definitions for community. Reviewing the NSA tools, Siew [28] (p. 47) noted that "a large majority of these mainstream tools do not clearly define the dimension or size of development that can be assessed." While STAR Communities is clearly meant for cities, neither it nor the NSA tools take into account the problem of "leakage" [20,29,36,43]. These leakage effects stem from the fact that few communities and, certainly, no modern urban areas are truly self-sufficient. Cities and urban regions rely on the global hinterland for resources [44]. Although the municipality is the lowest level at which sustainability problems can be addressed via policy [19], no municipality can achieve sustainability on its own. As a result, indices and tools that focus on the municipal level do not adequately measure the biophysical sustainability of the area [45]. Many of the sustainability indices, even those focused at a national level, are criticized for not taking this leakage into account. Mori and Christodoulou [18] found that none of the SAS in this category both took into account the triple pillars of sustainability (environment, economy, and equity) and addressed the issue of leakage. The Ecological Footprint and the Water Footprint SASs, for example, cover leakage but are focused only on the environmental aspect of sustainability.

Problems Implementing SAS
This category of flaws has been directed primarily but not exclusively at the SAS tools, which have been widely studied. For these neighborhood and municipal certification systems to be useful to communities, they must offer certification relatively soon after the initiation of the planning and development processes. This leads to the first criticism of these tools: sustainability is assessed only at a particular point in time and is rarely revisited [19]. The problem with a static measure of sustainability is that "sustainability is a dynamic process rather than a fixed state" [29] (p. 1176). Since NSAs are used during the development process, they do not necessarily reflect sustainability as it evolves after a community has been inhabited. Instead, NSAs tend to reflect a snapshot of a developer's ambition for a community and not its sustainability over time.
Secondly, these certification systems lack clear criteria explaining the reasons that certain indicators have been selected [35]. As noted by Allen and colleagues, "the key challenge is delivering simple (but not simplistic) messages that are based on evidence and easily understood by the target audience" [30] (p. 976). In a review of 17 studies that applied sustainable development indicators to communities, Tanguay and colleagues [35] found that "more than half of the studies use fewer than three criteria, whereas one study identifies up to 14 selection criteria for indicators. Of a total of 68 criteria noted, only six are frequently used. The criteria for measurement systems are found under the following headings: 'credible', 'universality', 'data requirements and availability', 'comprehensible', 'links with management', and 'spatial and temporal scale of applicability'." [35] (pp. 413-414). The authors proposed the following strategy for selecting sustainable development indicators: "(1) choose the most cited indicators; (2) cover the components of sustainable development and the pertinent predetermined categories; and (3) choose the simplest SDI to facilitate data collection, understanding, and dissemination." Huang and colleagues [29] echoed the sentiment of emphasizing the most widely used indicators and of clearly stating the reasons for why they are included in the SAS.
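The three-part selection strategy quoted above can be read as a simple procedure: within each component of sustainable development, prefer the most cited indicator and break ties in favor of the simplest one. The sketch below is one possible interpretation, not the method of Tanguay and colleagues; the class, field names, complexity scale, and candidate indicators are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    name: str
    pillar: str       # component of sustainable development it covers
    citations: int    # number of reviewed studies that use it
    complexity: int   # 1 = simple to collect and explain; higher = harder


def select_indicators(candidates, pillars, per_pillar=1):
    """Pick the most cited indicators per pillar, preferring simpler ones on ties."""
    chosen = []
    for pillar in pillars:
        pool = [c for c in candidates if c.pillar == pillar]
        # Most cited first, then simplest, then by name for a stable order.
        pool.sort(key=lambda c: (-c.citations, c.complexity, c.name))
        chosen.extend(pool[:per_pillar])
    return chosen


# Hypothetical candidate indicators.
candidates = [
    Indicator("air quality index", "environment", 12, 2),
    Indicator("ecological footprint", "environment", 12, 5),
    Indicator("unemployment rate", "economy", 15, 1),
    Indicator("median income", "economy", 9, 1),
    Indicator("voter turnout", "equity", 4, 1),
]
selected = select_indicators(candidates, ["environment", "economy", "equity"])
print([indicator.name for indicator in selected])
```

Iterating over the pillars guarantees coverage of every component, so a heavily studied pillar cannot crowd out a less studied one when the indicator list is kept short.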
Thirdly, in an effort to be comprehensive, most SAS include too many indicators. Certainly, there are many from which SAS developers can choose: Xing and colleagues [46], for example, identified over 600 relevant indicators of urban sustainability. The UN SDGs included 17 goals, 169 targets, and 245 indicators. At an extreme end of the indicator number spectrum, STAR Communities contains over 600 separate measures. Even at the national level, where there are resources available for collecting the required measurements, Allen and colleagues [30] argue that it is not advisable or even possible for countries to monitor all 230 indicators proposed by the UN SDGs. Elgert [27] found that many cities engage with STAR Communities only superficially because the large number of indicators makes participation labor- and resource-intensive.
Fourth, and finally, once the data are collected and reported, the results of the SAS must be presented. Sharifi and Murayama [23] criticize the common NSA tools for their lack of a complete and detailed presentation of results. By simplifying the presentation to a "label", they argue that tools "cannot be regarded as a transparent representation of sustainability" [23] (p. 82).
As noted in Table 1, all of the SAS have flaws that limit their usefulness in terms of assessing the degree of sustainability of communities. The NSAs, designed for a "community" level analysis, have been heavily criticized for (1) not taking into account the three pillars of sustainability, (2) taking a weak sustainability approach, (3) not accounting for "leakage", (4) having a weak link between prescriptive actions and the sustainability outcomes and unsubstantiated weighting systems, (5) measuring "sustainability" at only one point in time, and (6) lacking transparency in the presentation of results. STAR Communities, which has not been studied as extensively, does not have as many documented criticisms in the literature. When criticisms are made, however, they tend to focus on the number of indicators, which limits its usefulness for researchers who are interested in comparing many communities. The City Development Index, the only index (as opposed to tool) created for use at the community level, does not take a three-pillar approach to sustainability, does not account for leakage, and takes a weak sustainability approach. The other indices and projects were not designed for use at the community level and, therefore, have indicators that are not applicable at that scale.

Building a Better Assessment System
Addressing sustainability at the scale of communities requires using a theoretical framework to identify the necessary and sufficient measures that capture critical aspects of sustainability and their interrelations. In this way, a community sustainability assessment system (CSAS) can take into account the issues presented above.

Defining a Sustainable Community
As a foundational point, we argue that the focus of sustainability must be connected to communities. By narrowing sustainability to the community, we are better able to isolate the phenomenon to be sustained. While creators and evaluators of SAS seem to be reluctant to make a declarative statement about sustainability as it relates to communities, such a step is essential in that the measures will reflect the phenomenon of interest. One defense used by authors [19,47] is to assert that no single definition of sustainability exists. Such a stipulation is unsatisfactory. While sustainability is a general property, the lack of an empirical unit for which it is evaluated makes the identification of "sustainable" problematic: what is to be sustained and at what level? Similarly, a definition of sustainability by itself is insufficient because it is not necessarily tied to any specific set of phenomena. This may explain why the Brundtland Commission definition of sustainable development has gained general acceptance while a definition of sustainability alone has not. Similarly, we choose to restrict our efforts to cases where sustainability will impact communities.
First, the definition of sustainable communities must address the three traditional pillars: economic vitality, environmental quality, and social equity. We argue that a fourth, governance, is needed. While some authors embed indicators of governance in the "equity" pillar [35], others separate them into a fourth pillar [26,29,37,38]. Scholars of the commons, for example, have increasingly argued for the importance of governance generally [48] and of democratic norms particularly [49] for the successful long-run management of shared resource systems. Second, in order for a CSAS to be effective and useful, the definition of sustainability within it must address the issue of scale (i.e., the spatial magnitude of the community) and leakage (i.e., the extent to which actions within the community impact areas outside of the community). It is axiomatic that communities are interconnected with others through exchange and social interaction. In addition, communities rely on natural or other forms of nonlocal capital. These interactions (i.e., exchange, leakage, or input-output) between the community and the outside world must be included in sustainability metrics to be dynamically sufficient [13]. Finally, the definition of sustainable communities must recognize the long-term, dynamic nature of sustainability: it cannot consist of a single snapshot in time but must rather reflect an ongoing, long-term summary of the functional and social relations of a community as well as the resources required to support that community.
We propose a definition of sustainable communities that builds on the Brundtland Commission definition. It reads as follows: A sustainable community is the aggregate of functionally and socially connected individuals and organizations that share collective resources in such a way that engages members in self-determination governance processes resulting in the equitable provisioning of the health, educational, and material well-being among its residents while not negatively affecting future generations or other communities' uses of these resources.
While sustainability often centers on issues of development, we avoid this notion since the concept of sustainable development tends to be skewed toward applications in low-income countries. In addition, we also avoid terms that imply growth or increased use of resources, even at a sustainable rate. As noted by Meadows and colleagues [50], linking sustainability to development tends to highlight affluence and overconsumption, features that impact the environmental aspect of sustainability in much of the Western world. Our definition also makes the community itself the active subject in the definition, recognizing its agency for creating the state of sustainability.
Of course, our definition also includes the definition of the concept of community. In our definition, the community is the aggregate of functionally and socially connected individuals and organizations that share collective resources. Thus, a community consists of individuals and the institutions created by individuals that operate collectively (e.g., governments, NGOs, businesses) with shared interests in their relations and use of resources. We assume a multilevel view of community. This multilevel view includes functional and spatial dimensions. Spatially, we recognize that communities comprise neighborhoods or even multiple municipalities. Functionally, the multilevel view assumes that communities sit within a complex web of actors that introduce capacity, regulation, coordination, and other factors to a particular community [39]. In this way, our approach to community draws from the ACERE Report's definition of urban systems as "geographical areas with a high concentration of human activity and interaction embedded within multi-scale interdependent social, engineered, and natural systems that impact human and planetary well-being across spatial (local to global) and temporal scales" [4] (p. 5) and encompasses both local and regional scales. Our derivation generalizes this definition by acknowledging that communities vary in activity and interaction density, and it adds that the success of members in terms of sustainability relies on their ability to make use of resources held in common.

Scales of Sustainability
We address the issues of scale and leakage in the study of communities by emphasizing that a sustainable community is achieved by the way it uses "resources", not simply by the presence of "its resources." Additionally, we state that a community cannot be considered sustainable if, in striving for sustainability, its actions directly or indirectly impede the ability of other communities to also become sustainable. Finally, we recognize the long-term, dynamic nature of sustainability by stating that the provision of human welfare is "ongoing" and that it must not negatively impact future generations.
In order to identify the appropriate scale for a sustainability component (e.g., water or the economy), the component's entire system should be considered. Several authors argue for a particular scale for measuring sustainability, from the neighborhood [32] to the region [51] or the Earth as a whole [52]. Huang and colleagues [29] argue that community sustainability must be assessed at multiple levels, up to and including the Earth. The use of such a multilevel approach that begins with local communities and extends to increasing levels (depending on the scale of the resources used in the community) allows us to understand what is under a community's control and what a community uses that may require consideration at greater scales.
For example, in examining the sustainability of Topeka, Kansas, various community scales can be considered. The Topeka Metropolitan Statistical Area is more likely to capture the relevant economic community than the city alone. A sustainability assessment of the city's drinking water (from the Kansas River) requires an examination of the entire Kansas River watershed, including Topeka's impact on downstream economic communities and the impact of upstream economic communities on Topeka. As the case of the water system shows, communities' actions are not independent of the world around them. This multi-scale, systems approach is consistent with a multilevel governance framework and recognizes that the sustainability of some systems may only be attained through collaborative governance at the regional level [39]. The economic communities of the Kansas River watershed may need to consider the alternate economic uses of the water and agree on the best, sustainable use of this scarce resource.
A systems approach to constructing a CSAS has several other advantages. First, unlike thematic approaches, the systems view captures leakage. Second, such an approach is less likely to be biased towards specific policies and, thus, to ignore critical elements outside the current policy realm [30]. Third, systems-based framing is not predicated on data availability. Volumes of data may be neither necessary nor sufficient and may result in an indicator selection that could make certain communities appear more sustainable than they actually are [33]. A systems view examines sustainability by looking at the output of interrelated parts and not necessarily the sum of individual indicators.

Recommendations for Implementation
While a CSAS can be used to compare communities, we recognize that sustainability measures are often difficult to use since sustainability and the practices involved are grounded in local resource limitations [18,50]. Because of this reality, a CSAS should be constructed with a choice of criteria that is as fixed as possible and as broadly applicable as possible. The STAR Communities tool, for example, allows a community to achieve its highest rating by accumulating points via alternate categories. This means that the same level of sustainability might be met by a myriad of different activities, limiting the useful comparisons that can be made between communities. For this reason, we support a strong sustainability approach to a CSAS [22].
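The contrast between accumulating points across alternate categories and a strong sustainability approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring method of STAR Communities or of any existing tool: the category names, point values, and minimums are hypothetical, and strong sustainability is represented here simply as a rule that no category may fall below its own minimum.

```python
from typing import Dict


def weak_score(points: Dict[str, float]) -> float:
    """Compensatory total: a surplus in one category can offset a deficit in another."""
    return sum(points.values())


def meets_strong(points: Dict[str, float], minimums: Dict[str, float]) -> bool:
    """Non-compensatory check: every category must reach its own minimum."""
    return all(points[name] >= minimums[name] for name in minimums)


# Hypothetical communities and point values, invented for illustration only.
community_a = {"economy": 90, "environment": 20, "equity": 70}
community_b = {"economy": 60, "environment": 60, "equity": 60}
minimums = {"economy": 50, "environment": 50, "equity": 50}

# Both communities earn the same total under the compensatory rule...
print(weak_score(community_a), weak_score(community_b))
# ...but only one of them clears every category minimum.
print(meets_strong(community_a, minimums), meets_strong(community_b, minimums))
```

Under the summed score the two hypothetical communities are indistinguishable, while the per-category check separates them, which is the comparability problem the paragraph above describes.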
As described previously, a major concern for usability is the number of indicators included in the CSAS. The use of large numbers of indicators not only makes assembling the needed information difficult but also exacerbates the problem of unavailable data and makes trade-offs between various indicators more likely (weak sustainability). Although many authors comment on the limited availability of data, Karlsson and colleagues argue, "finding the best indicators of sustainability entails breaking away from data availability constraints and determining the appropriate phenomena to monitor and indicators needed. ( . . . ) Data gaps can initially be filled with pilot collections and sampling, the use of remote sensing or the use of proxies." [53] (p. 36). With hundreds of indicators of sustainability already identified [46], determining the final set of necessary and sufficient conditions for community sustainability is a challenge. We recommend taking a logic model approach to identify the most appropriate indicators. Sustainability, by definition, is a long-term concept. In a logic model, one identifies the inputs, activities, outputs, and outcomes that proceed logically from one to the other to reach long-term impacts. There are numerous combinations of inputs, activities, outputs, and outcomes that can lead to the same long-term impacts. These combinations may vary by community (i.e., what works in one community may not work in another). Figure 1 illustrates one possible logic model for a community water system. In this hypothetical community, public funding (input) allows for the construction of water treatment facilities (activity) that produce treated water (output) that is of acceptable drinking water quality (outcome).
The community also passes conservation zoning (input) that allows the community to protect the watershed (activity). If a sufficient number of acres are protected (output), the community can ensure that there will be sufficient water supplies (outcome). Together, these two outcomes produce the long-term impact of clean, renewable, and accessible drinking water. Another community, however, may stress water conservation rather than conservation zoning as a means to increase the number of people who can be served by the same water supply. A dispersed, rural community may instead focus on subsidies for individual wells and water testing as opposed to centralized water treatment. By focusing on the outcomes and long-term impacts, a CSAS allows each community to take its most appropriate path to sustainability using the same basic indicator: drinking water quality and quantity.
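The water-system logic model described above can also be written down as a small data structure. The following Python sketch is one possible encoding, not a prescribed format: the class and function names (`Pathway`, `LogicModel`, `shares_impact`) are hypothetical, and the activity and output entries for the rural community are assumptions added only to complete the chain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pathway:
    """One input -> activity -> output -> outcome chain in a logic model."""
    input_resource: str
    activity: str
    output: str
    outcome: str


@dataclass
class LogicModel:
    """A set of pathways that together aim at one long-term impact."""
    long_term_impact: str
    pathways: List[Pathway]

    def outcomes(self) -> List[str]:
        """Return the outcome at the end of each pathway, in order."""
        return [p.outcome for p in self.pathways]


IMPACT = "clean, renewable, and accessible drinking water"

# The hypothetical community described in the text.
water_model = LogicModel(
    long_term_impact=IMPACT,
    pathways=[
        Pathway("public funding", "construct water treatment facilities",
                "treated water", "acceptable drinking water quality"),
        Pathway("conservation zoning", "protect the watershed",
                "sufficient acres protected", "sufficient water supplies"),
    ],
)

# A dispersed, rural community. The activity and output strings are
# illustrative assumptions; the text names only the subsidies and testing.
rural_model = LogicModel(
    long_term_impact=IMPACT,
    pathways=[
        Pathway("subsidies for individual wells and water testing",
                "households test their well water (assumed)",
                "tested well water (assumed)",
                "acceptable drinking water quality"),
    ],
)


def shares_impact(a: LogicModel, b: LogicModel) -> bool:
    """Two models are comparable here if they target the same long-term impact."""
    return a.long_term_impact == b.long_term_impact


print(water_model.outcomes())
print(shares_impact(water_model, rural_model))
```

Keeping the long-term impact separate from the pathways mirrors the argument in the text: the pathways differ by community, while the impact against which each community is assessed stays the same.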
Many of the indicators used by STAR Communities and the NSAs focus on the inputs or activities as opposed to the outcomes or long-term impacts, as the former are easier to count. Building a logic model helps sort the various indicators into inputs, activities, outputs, outcomes, and long-term impacts and exposes the inherent but sometimes unexpressed logic behind why particular resources are needed to conduct activities that produce an output that ultimately leads to an outcome.
The CSAS will focus on long-term impacts and how policy outcomes shape those impacts. A focus on the long-term impacts also recognizes that there may be multiple pathways to sustainability and that what works in one community may not work in another. Community stakeholders are crucial to localizing the path to sustainability, and this long-term approach provides an opportunity for residents to coproduce policy with experts [21,23,54]. It also limits the extent to which existing tools can be used for "greenwashing," that is, making a community appear to be more sustainable than it actually is. Focusing on the long-term impacts allows communities to determine, for themselves, what actions must be taken in order for sustainability to be reached while maintaining comparability between communities [27].
A final challenge in creating a CSAS is the determination of the sustainability thresholds. Thresholds or criteria are difficult to identify as they are inherently intertwined with subjective views of sustainability as well as with uncertainties regarding the objective reality of outcomes. Residents, local officials, and experts craft these together based on specific conditions within a community. For instance, in a hypothetical community, 10% may represent the target level for the poverty rate in the short term (say, by the year 2025). The long-term goal, however, states that all residents will eventually have incomes above the poverty line. How, then, does one evaluate the actual poverty rate for a community when assessing sustainability: against the short-term target level or against the ideal goal of 0%? We argue that these types of questions have no correct answer and, therefore, must be left up to communities themselves to decide within a broader multilevel framework that provides some overarching guidance so that communities do not become self-interested actors, which research shows negatively impacts environmental quality, social equity, and economic development [55,56].
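The choice of benchmark changes the assessment result, which a short calculation makes concrete. The Python sketch below is illustrative only: the 10% target and 0% goal come from the example above, while the observed rate of 14% and the function name `gap_to_benchmark` are assumptions introduced for the illustration.

```python
def gap_to_benchmark(observed_rate: float, benchmark_rate: float) -> float:
    """Percentage points by which the observed rate exceeds the benchmark.

    Returns 0.0 when the benchmark has been met, since a lower
    poverty rate is the preferred direction.
    """
    return max(0.0, observed_rate - benchmark_rate)


observed = 14.0           # hypothetical observed poverty rate, in percent
short_term_target = 10.0  # short-term target from the example in the text
ideal_goal = 0.0          # long-term goal: no residents below the poverty line

print(gap_to_benchmark(observed, short_term_target))  # 4.0
print(gap_to_benchmark(observed, ideal_goal))         # 14.0
```

The same community is 4 percentage points from its short-term target but 14 from the ideal goal, so any comparison between communities needs to state which benchmark was applied.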
Once the possible range of indicators is narrowed to only sustainable outcomes, choices must be made to create the most parsimonious list of outcomes that makes the CSAS applicable to the widest number of communities. Again, the performance measurement/management literature can be helpful when sorting through the possibilities. Based on Poister and colleagues [57], we propose these guiding principles as critical to the development of performance measures for programs, organizations, and agencies. These principles serve as a basis for the creation of a CSAS. Indicators in a CSAS should:

1. Be based on sound theory and evidence and reflect the long-term outcomes of critical sustainability systems.
2. Be comprehensive and balanced, reflecting the four pillars of community sustainability while avoiding redundancy and tangentially related measures.
3. Be meaningful to city officials, policymakers, academics, community members, and other relevant stakeholders.
4. Have a high degree of face validity to city officials, policymakers, academics, community members, and other relevant stakeholders.
5. Be valid and reliable measures of community sustainability, and everything else being equal, be the least problematic given their intended use.
6. Be direct and not proximate measures unless necessary.
7. Be resistant to the problem of goal displacement, or be balanced by the inclusion of indicators that will counteract efforts to game the system.
8. Take into account the trade-offs between the quality of the indicator as a measure of community sustainability versus the cost of collecting the data.
9. Clearly state the process by which the indicator is calculated.
10. Provide clear definitions of the data sources and data collection procedures to facilitate uniform reporting from decentralized sites.
11. Have a clear direction of preferred movement.
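Principles 9 through 11 lend themselves to a simple record kept alongside each indicator. The following Python sketch shows one way such a record might look; the class names, field names, and the example values (including the data source and collection procedure) are hypothetical and are not drawn from Poister and colleagues or from any existing assessment system.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    """Preferred direction of movement for an indicator (principle 11)."""
    HIGHER_IS_BETTER = 1
    LOWER_IS_BETTER = -1


@dataclass(frozen=True)
class IndicatorSpec:
    name: str
    calculation: str           # principle 9: how the value is computed
    data_source: str           # principle 10: where the data come from
    collection_procedure: str  # principle 10: how the data are gathered
    preferred_direction: Direction

    def improved(self, earlier: float, later: float) -> bool:
        """True if the change from earlier to later moves in the preferred direction."""
        return (later - earlier) * self.preferred_direction.value > 0


# Hypothetical example values, for illustration only.
poverty_rate = IndicatorSpec(
    name="poverty rate",
    calculation="residents below the poverty line / total residents * 100",
    data_source="annual household income survey (hypothetical)",
    collection_procedure="same survey instrument at every reporting site (hypothetical)",
    preferred_direction=Direction.LOWER_IS_BETTER,
)

print(poverty_rate.improved(earlier=12.0, later=10.0))  # True
```

Recording the direction of preferred movement explicitly means a falling poverty rate and a rising measure of, say, protected acreage can both be read as improvement without ambiguity.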

Conclusions
This article makes several contributions to the literature on sustainability assessment systems. First, most of the critiques of such systems focus on one level of analysis (e.g., neighborhood) or type of assessment system (e.g., index). This review considers the critiques of global, national, subnational, municipal, and neighborhood sustainability assessment systems that can be classified as either projects, indices, or tools. Second, it offers a definition of sustainability that responds to criticisms in the literature and provides an anchor for the design of assessment systems. Third, it provides a multi-scale, systems approach to community sustainability and its measurement. Fourth, drawing on the performance measurement literature, it makes recommendations for creating a parsimonious and generalizable community sustainability assessment system (CSAS).
Too often, sustainability has been viewed as a technical exercise that can be assessed using a one-size-fits-all approach for measuring progress at the community level. Our analysis indicates that a community sustainability assessment system should be a parsimonious set of indicators that measure outcomes using a process that suits the particular circumstance of a locale. At the same time, it must address the four pillars of sustainability (economics, environment, equity, and governance), their interactions, the non-substitutability of natural capital, and the multilevel nature of the system in which the community sits. This approach not only allows researchers to better understand community sustainability but also reveals to policymakers how community sustainability will require multilevel governance approaches.
Given the challenges inherent in defining, implementing, and measuring sustainability, building sustainable communities may represent the ultimate "wicked" problem. The various and interacting factors that add to or detract from a community's sustainability represent a complex system. Observing these factors in a coherent and useful way-for policymakers and researchers alike-can be daunting.
The quest for sustainable community metrics needs, over the next few years, to settle on a system that focuses on outcomes and can be applied in a variety of contexts so that comparisons and true measures of sustainability can be taken.