Sustainable Assessment Tools for Higher Education Institutions: Guidelines for Developing a Tool for China

Abstract: Higher education institutions (HEIs) in both early and mature stages of sustainable development (SD) have been moving toward sustainability. Methods for assessing SD have been developed from global and regional contexts to support sustainability efforts. The purpose of this paper is to formulate guidelines as input to develop a sustainable assessment tool (SAT) for China based on the current SD stage of Chinese HEIs. Through desk research, SATs were selected and analyzed. Fifteen SATs comprising more than 1000 indicators were included in the analysis, from which the components for developing SATs were identified. These components were then selected and discussed in an online workshop engaging a 34-person Chinese research team, in order to formulate the guidelines for Chinese HEIs. The findings reveal that the emphasis of SATs mainly results from their contexts, purposes and stages, and backgrounds or focus. Chinese HEIs are in the early SD stage, and the multiple purposes and components of SATs are identified to support local sustainability efforts. Having a clear understanding of the current SD stages of SATs and selecting the components accordingly would enable them to fully reach their potential in practice, especially in the case of early SD HEIs.


Introduction
Sustainable development (SD) has become a central issue in higher education [1,2]. Higher Education Institutions (HEIs) are playing an increasingly important role in advancing sustainability [3]. HEIs are regarded as large communities and their campuses as mini cities [4,5] when proposing and testing sustainability solutions [6]. SD is implemented in various aspects, such as governance, operations, education, research, and engagement, which contribute to a sustainable campus model for living and working [7,8].
To better guide this SD process, a variety of sustainability assessment tools (SATs) have been developed in either a global or regional context; they offer efficient approaches to SD measurement and bring about organizational advancements toward sustainability [9]. SATs are reviewed regularly, and their strengths and weaknesses are discussed in order to make them more adaptable and effective in practice. There is a call for a global tool that allows cross-institutional benchmarking against the same standard [9][10][11]. The framework of global SATs has been discussed to cover, in addition to environmental topics, education and research [12], outreach activities [13,14], and economic and social topics [15,16]. Although the holistic framework of global SATs was identified [17][18][19], in practice, most SATs are applied in the countries or continents where they were developed [20].
SATs to develop the new Chinese SAT, ranging from early to mature stage of SD, next to the Assessment Standard for Green Campus (ASGC). Following the criteria in Table 2, 14 SATs were identified (Table A2 in Appendix A). In total, 15 SATs were selected.
Table 2. Screening criteria of SATs for HEIs.

| Criteria | Description | Results (of 73) |
|---|---|---|
| Accessibility | A1-Main context available in published work or online | 55 |
| | A2-Available in English | 47 |
| State of use | U1-The SAT is still in use | 33 |
| | U2-User feedback or case study is available | 28 |
| Content | P1-Developed for HEIs | 23 |
| | P2-Holistic framework for assessing SD, including at least environment, management, and education aspects | 16 |
| Representativeness | For SATs developed from similar background or using the same data source, the less used one is excluded. | 14 |

A brief description of each of these 15 SATs is given.
(1) Assessment Instrument for Sustainability in Higher Education (AISHE) [44] was published by the Dutch Foundation for Sustainable Higher Education. AISHE was developed as "a strategic tool for the development of an Education for Sustainable Development (ESD) policy". Mainly used in Europe, AISHE has been applied in about 30 countries.
(2) Adaptable Model for Assessing Sustainability in Higher Education (AMAS) [8] (p. 475) focuses on assessing HEIs' sustainability "within different implementation stages and data availability scenarios" according to the Chilean context. The tool was fully applied to five HEIs in Chile [45].
(3) Assessment System for Sustainable Campus (ASSC) [46] was developed by the Sustainable Campus Management Office of Hokkaido University and is run by the Campus Sustainability Network in Japan (CAS-Net JAPAN). ASSC resulted from joint research based on the existing SATs of STARS, Value Metrics and Policies for a Sustainable University Campus (UNI metrics), Alternative University Appraisal (AUA), and GM. ASSC is a benchmarking tool and offers an online assessment system that "enables universities to discover criteria for its administrative policies". ASSC has been applied to universities in Japan and abroad.
(4) Campus Sustainability Assessment Framework Core (CSAF Core) was published by the Sierra Youth Coalition (SYC). It is a simplified version of the CSAF [34] that focuses on assessing sustainability performance in Canadian universities. CSAF Core is not run by any institution and has been applied freely by HEIs.
(5) Graphical Assessment of Sustainability in University (GASU) [12] is a benchmarking tool that resulted from a modification of the Global Reporting Initiative (GRI) Sustainability Guidelines. The tool was updated in 2011 to align it with the GRI G3. GASU has been applied to 12 universities [47].
(6) GreenMetric World University Rankings (GM) [48] was initiated by the University of Indonesia. This online ranking tool aims to bring "university leaders in their efforts to policies and manage behavioral change". A total of 779 HEIs from 83 countries participated in 2019.
(7) People & Planet Green League (P&P) [49] is a university ranking that is published annually by the UK's largest student campaigning network, People & Planet. Focused on "meeting student calls for climate action", it ranks every UK university that receives public authority funding on its environmental and ethical performance. In 2019, 154 universities were ranked.
(9) Sustainability Assessment Questionnaire (SAQ) [51] was published by University Leaders for a Sustainable Future (ULSF). SAQ is a qualitative survey of sustainability that aims to "raise consciousness and encourage debate" and "gives a snapshot of the state of sustainability". SAQ is published online for HEIs to apply.
(10) Sustainability Tracking, Assessment and Rating System for Colleges and Universities (STARS) [52] was developed by the Association for the Advancement of Sustainability in Higher Education (AASHE). STARS is a benchmarking tool offering a voluntary, self-reporting framework and online reporting tool to measure sustainability. It originated in North America and is applied to Canadian, Mexican, European, and Asian HEIs as well. By 2020, more than 1000 institutions had registered to use the tool.
(11) Sustainable University Model (SUM) [53] was created with empirical data from about 80 HEIs around the world. SUM comprises four phases, following the Deming Cycle: vision, mission, university-wide sustainability committee, and strategies for fostering sustainability, which emphasize the continuous improvement of sustainability initiatives.
(12) Sustainability in Higher Education Institutions (SusHEI) [54] was developed in Portugal. SusHEI offers a framework considering education and research impacts on economic, environmental, and social levels and the community. The indicator selection is made according to the features and purpose of a specific HEI. The tool is illustrated with the Faculty of Engineering of the University of Porto (FEUP) as a case study.
(13) Greening Universities Toolkit (Toolkit) [55] is a United Nations Environment Programme focusing on "transforming universities into green and sustainable campuses". Researchers from universities in Africa, Asia-Pacific, Europe, Latin America, and North America contributed to the program. Toolkit offers the Deming cycle strategies for implementation. It can also be used as an assessment tool and was applied to the IPB Dramaga Campus in Indonesia [56].
(14) Unit-based Sustainability Assessment Tool (USAT) [21] (p. 7) was supported by the Swedish/Africa International Training Programme (ITP). USAT was developed based on SAQ, AISHE, and GASU. Flexibly used at a partial or institutional level, USAT aims to "identify potential change projects/areas for future development and growth". The tool was applied to about 18 universities in African countries [57].
(15) Assessment Standard for Green Campus (ASGC) [58] was developed by the Chinese Society for Urban Studies (CSUS) and published as a national assessment standard by the MOHURD. ASGC is a benchmarking tool that aims to advocate the concept of sustainability and promote SD. It includes 75 indicators from four areas: planning and ecology, energy and resources, environment and health, education and spread.

Research Design
This research aims at learning from existing SATs and formulating guidelines of practical importance to develop the new Chinese SAT. First, an analysis was made to identify the characteristics and emphasis of the selected 15 SATs. Based on the analysis, the guidelines for the Chinese SAT were formulated in an online workshop.

Comparison of the Sustainable Assessment Tools
The basic characteristics of the SATs were analyzed, including context, purpose and stage, type of indicators, assessment and data validation, and result publication, to draw a general picture of how sustainability is measured among HEIs at both early and mature SD stages.
Then, the emphasis of the SATs was analyzed using the structure displayed in Figure 1. Six levels were studied, from dimension to aspect, topic, and issue, and finally indicators, to identify the common and unique topics in the SATs when assessing sustainability. This analysis of emphasis was conducted through the following steps: Based on the method of Yarime and Tanaka [13] and Findler et al. [19], a total of 1051 indicators extracted from the 15 SATs were recategorized into dimensions and aspects, and then into topics and issues.
Inspired by Cronemberger de Araújo Góes and Magrini [20], combined with the findings of Alghamdi et al. [18], the key dimensions of HEI sustainability were slightly shifted to address engagement and were identified as Governance, Operations, Education, Research, and Engagement.
(1) Governance-Vision and commitment, university-scale policy and strategy, management structure and staff;
(2) Operations-Consists of three aspects: environmental (environmental management, activities, and practices); social (health, safety, and quality of working and living); and financial (related to financial issues, including investments and budget, environmental issues, social issues, education, and research);
(3) Education-Curriculum, teaching, and training for students and staff;
(4) Research-Encouragement, support, and output of research;
(5) Engagement-Consists of two aspects: "campus engagement (students with sustainability learning experiences outside the formal curriculum); Public Engagement (sustainable communities through public engagement, community partnerships, and service)" [59] (p. 73).
To ensure reliability, the assignment of each indicator to a dimension, aspect, topic, and issue was carried out in two independent processes.
Based on the analysis of the 15 SATs, important components for developing the Chinese SAT were identified.

Workshop
Next, an online workshop aimed at formulating the guidelines for developing the Chinese SAT was organized. The guidelines were based on the important components identified from the comparison of the 15 SATs. In a two-round workshop, these components were selected and their applicability for Chinese HEIs was discussed.
Sustainability 2020, 12, 6501

A 34-person Chinese research team was called upon to formulate the guidelines (Table 3). The team aimed to include experts working in related fields of campus sustainability from HEIs in both relatively early and mature SD stages, research and design institutes, and planning bureaus in the Beijing-Tianjin-Hebei region. Therefore, invitations were sent to targeted experts in our network and to experts who had published campus-sustainability-related papers in the last 3 years (2018-2020). A first invitation received 20 positive responses from our network (response rate: 80%). A second invitation was sent to our extended network and to experts identified from the published papers. It received 14 positive responses (response rate: 35%). As a result, 34 experts were selected, ranging from researchers, designers, engineers, senior managers, and faculty leaders to government officers from 14 institutes (8 HEIs, 4 research and design institutes, and 2 planning bureaus). The research team was randomly and equally divided into two groups. During each round of the workshop, shared online documents (Excel documents uploaded to the website platform https://docs.qq.com/desktop/) were used to collect and exchange comments anonymously and iteratively. Within each group, every expert was assigned a sheet on which to score, comment, and share responses.
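The recruitment figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that each response rate is the number of positive responses divided by the number of invitations sent; the invitation counts it derives are not stated in the text, and the function name `invitations_sent` is hypothetical.

```python
def invitations_sent(positive_responses: int, response_rate: float) -> int:
    """Estimate invitations sent, assuming rate = positive responses / invitations."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return round(positive_responses / response_rate)


# Figures reported in the text: 20 responses at 80%, 14 responses at 35%.
first_invited = invitations_sent(20, 0.80)
second_invited = invitations_sent(14, 0.35)
team_size = 20 + 14  # experts who responded positively in the two rounds

print(first_invited, second_invited, team_size)
```

Under that assumption, the two invitations would have gone to about 25 and 40 experts, and the two sets of positive responses add up to the 34-member team.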
The data collection was structured as follows: In the first round, opportunities and challenges of current SD of HEIs in the Beijing-Tianjin-Hebei region were discussed, questions on the components (purpose, type of indicators, assessment and data validation, result publication, emphasis) of the guidelines were proposed, and Likert scales (1 for "strongly disagree", 5 for "strongly agree") were used to collect responses.
From March 15 to May 30, 31 out of 34 experts described the opportunities and challenges of current SD in HEIs and scored and commented on the components of the guidelines. They agreed with the descriptions of purpose and emphasis (scores of 3-5, averaging 4.1-4.7), but agreed less on the type of indicators (scores of 2-5, averaging 3.4). The comments were then collected to supplement the guidelines for the next round.
In the second round, questions on more detailed guidelines were proposed, including purpose and stage, scoring method of indicators, emphasis of dimensions and aspects, and topics for the Chinese SAT. The Likert scales were used to collect responses, and the topic selection was made according to the importance of the current SD in Chinese HEIs (1 for "not important", 5 for "very important").
From June 5 to June 20, 29 out of 34 experts reached agreement on the guidelines. They scored and gave comments to identify the emphasis and topics for the new SAT.
After two rounds of the interactive process, the guidelines were formulated.
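The score summaries reported for each round (a range and an average per component) can be reproduced with a short aggregation. This is a minimal sketch, not the authors' procedure: the function name `summarize_likert` and the example scores are hypothetical, and only the 1-5 scale comes from the text.

```python
from statistics import mean


def summarize_likert(scores):
    """Return (lowest, highest, average) for a list of 1-5 Likert scores."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s not in (1, 2, 3, 4, 5) for s in scores):
        raise ValueError("Likert scores must be integers from 1 to 5")
    return min(scores), max(scores), round(mean(scores), 1)


# Hypothetical responses from five experts for two components (not study data).
example_scores = {
    "purpose": [5, 4, 5, 3, 4],
    "type of indicators": [2, 4, 3, 5, 3],
}

for component, scores in example_scores.items():
    low, high, average = summarize_likert(scores)
    print(f"{component}: scored {low}-{high}, average {average}")
```

Reporting the range next to the average, as the text does, shows whether a moderate mean hides disagreement among the experts.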

Results
This section is organized in two parts. Section 3.1 presents the results of the comparison of the SATs, from the basic characteristics (context, purpose and stage, type of indicators, assessment and data validation, result publication) and emphasis (dimensions, aspects, topics, issues). This part identifies the important components for formulating the guidelines for the Chinese SAT. Section 3.2 describes the current SD of Chinese HEIs and the guidelines for the Chinese SAT determined through the online workshop.

Comparison of the Sustainable Assessment Tools
The basic characteristics and emphasis of SATs that contribute to positioning them in the SD stages are compared in this section.

Basic Characteristics of Sustainable Assessment Tools
The basic characteristics of the SATs are shown in Table 4.

Context
Global and regional SATs are identified through their aims, backgrounds, and the countries they have been applied to. There is no absolute boundary between global and regional SATs; they can share information and benefit from each other. Global SATs could be redeveloped or modified to adapt to regional HEIs. Regional SATs could also apply to HEIs worldwide by adding global experience. This classification is used to better describe the characteristics of SATs.
Global SATs contribute to leading the world HEIs toward sustainability. SAQ, GM, and STARS were developed for world universities and have been applied to a number of countries. GASU, SUM, and Toolkit were developed based on the global context and are identified as global tools. AISHE is also a global tool. It is originally Dutch but was updated in AISHE 2.0, adding international experience, and applied to about 30 countries. Some SATs were developed specifically for supporting regional SD. AMAS, P&P, PSI, SusHEI, USAT, and ASGC are based on regional contexts and mainly applied to the countries they were developed in. ASSC and CSAF Core were developed based on regional context and applied to some HEIs abroad, but they are essentially regional tools, based on their backgrounds.
Compared to mature SD HEIs, early stage HEIs are faced with more challenges and are more in need of SATs to support their specific situation in SD. The recently developed regional SATs for early SD stages (AMAS, SusHEI, USAT, ASGC) are of practical importance in guiding local SD practice.

Purpose and Stage
SATs have been developed for various purposes in early and/or mature SD stages. Based on the initial goal of assessing SD, SATs offer references and solutions to lead universities toward increased sustainability (Figure 2). In total, six different purposes have been identified in the SATs:
(1) Ranking tools: For HEIs in both early and mature SD stages; ranking encourages HEIs to enroll and take responsibility to react to their rankings. GM is an entry-level tool for world universities, and P&P is for UK universities.
(2) Raising consciousness: For HEIs in early SD stage; the SAT brings the debate and consideration for SD. SAQ offers a snapshot of the state and calls for action.
(3) Identifying the overall sustainability picture: For HEIs in early SD stage, these SATs characterize, compare, and establish the SD performance of the individual HEI (AMAS, SusHEI) or of the whole region (USAT).
(4) Strategic tools: Developed for HEIs in both early and mature SD stages, strategic tools contribute to guiding the policy-making or strategic managing process to activate and achieve HEIs' sustainable development goals (SDGs). SUM, AISHE, and Toolkit can be applied to early SD HEIs, while ASSC is for more mature stage HEIs.
(5) Benchmarking tools: Developed more for HEIs in a mature SD stage, benchmarking builds up the baselines and allows for cross-institutional comparison. USAT and ASGC are early stage benchmarking tools, while GASU, STARS, CSAF Core, PSI, and ASSC are more mature stage benchmarking tools.
(6) Transmission tools: For HEIs in a mature SD stage; the SAT serves as a platform in which HEIs could share their SD experience. ASSC acts as a platform for experience exchange in the campus and the community.

Type of Indicators
Many of the selected SATs include both qualitative and quantitative indicators, except for some SATs (AISHE, CSAF Core, GASU, SUM, SusHEI, and USAT) that only adopt one of them. The number of indicators in the SATs can be divided into three levels: few (16-30 indicators), medium (39-83 indicators), and large (134-174 indicators).
Qualitative indicators-SATs (AISHE, part of GM, ASSC, and ASGC) adopt qualitative indicators in their assessment, using Guttman or Likert scales. The Guttman scale measures the stage of SD implementation by describing the extent or depth of the measures, which also provides guidance toward sustainability. The Likert scale is widely used in qualitative assessment; the responses developed by Lozano [12] are applied to the whole system of GASU, AMAS, Toolkit, and USAT, which assess the general status of the issues through information coverage and performance.
Quantitative indicators-SATs (CSAF Core, part of AMAS, GM, P&P, PSI, STARS, and ASGC) include quantitative indicators, for they are a very empirical way of measurement when used properly [20]. Compared to others, STARS follows a stricter scoring method for some indicators by measuring both the status and percentage of the assessed issues. In addition, some SATs offer alternatives when data are lacking: P&P awards part of the total credits when information coverage is lacking for some indicators. ASSC adopts some indicators from STARS and offers bonus credits for providing detailed data.

Assessment and Data Validation
Almost all the selected SATs can be used as self-assessment tools. The clearly expressed methodology and transparent scoring method enable potential users to participate in self-assessment.
To encourage participation, some SATs (STARS and ASSC) provide online reporting tools, allowing for direct and convenient self-assessment.
There are passive assessment tools, like P&P and PSI, that rank or benchmark the HEIs according to information from their official websites and authoritative databases. Passive assessment is applied to HEIs in a mature SD stage with accessible data. These tools allow for comparison on a large scale or of all HEIs in a certain country, but are limited to the available databases and may face challenges when adding issues from outside them.
SATs use various methods to ensure data accuracy, such as sign-off by a high-ranking executive, analysts' or experts' reviews, third-party validation, and onsite surveys.

Result Publication
The publication of results also contributes to validating the data, as well as sharing achievements and experiences. GM, P&P, ASSC, and STARS publish some or all of the assessment results on their official websites, which raises national awareness and encourages HEIs to enroll.

Emphasis of SATs on Dimensions and Aspects
To analyze the emphasis of the SATs, the percentage attributed to each dimension and aspect is calculated in one of two ways: (1) where an SAT assigns credits or weights, the emphasis is calculated from the sum of credits or percentage given to each of the five dimensions; (2) some SATs, such as AISHE, CSAF Core, Toolkit, SAQ, SUM, and SusHEI, do not have a quantitative calculation, so their emphasis is calculated as the number of indicators in a dimension divided by the total number of indicators. Each indicator is linked to a dimension and aspect; some indicators belong to two dimensions or aspects, and the emphasis is scaled to 100%.
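The two calculation routes described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: an indicator linked to two dimensions or aspects is assumed to count once in each, after which the shares are rescaled to sum to 100%. The function names and the toy data are hypothetical.

```python
from collections import Counter


def emphasis_by_credits(credits):
    """Share of total credits per dimension, in percent (route 1)."""
    total = sum(credits.values())
    if total <= 0:
        raise ValueError("total credits must be positive")
    return {dimension: 100 * value / total for dimension, value in credits.items()}


def emphasis_by_count(indicator_links):
    """Share of indicator assignments per dimension, in percent (route 2).

    indicator_links holds one tuple of dimension labels per indicator.
    Assumption: an indicator linked to two dimensions counts once in each.
    """
    counts = Counter(label for links in indicator_links for label in links)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one indicator link is required")
    return {dimension: 100 * n / total for dimension, n in counts.items()}


# Toy SAT with four indicators, one of which belongs to two dimensions.
toy_links = [
    ("Governance",),
    ("Operations-Environmental",),
    ("Operations-Environmental", "Education"),
    ("Research",),
]
print(emphasis_by_count(toy_links))

# Toy SAT that assigns credits directly to dimensions.
print(emphasis_by_credits({"Operations-Environmental": 60, "Education": 25, "Governance": 15}))
```

Counting a shared indicator in both categories and then normalizing keeps every SAT's shares on the same 100% scale, which is what makes the emphasis comparable across tools with very different indicator counts.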
The result shows that the emphasis in dimensions varies greatly in SATs (Figure 3). The Operations dimension plays the most important role, and the three aspects of Operations together contribute 56%, on average. More than half of the SATs show a strong emphasis on the Operations-Environmental aspect, with 36%, on average, and a range from 11% (SusHEI) to 73% (ASGC). The emphasis on Operations-Social is 12%, on average, and ranges greatly from 0% (SAQ, USAT) to 36% (PSI, CSAF Core). The emphasis on Operations-Financial is largely ignored by SATs, with only 7%, on average, and a range between 0% (SAQ, ASGC) and 21% (GASU).
Of the five dimensions, the emphasis on Engagement of Campus and Public ranks second, at 14%, on average. The emphasis on Engagement-Public is a little higher than that on Campus, at 8%, on average, and ranges from 0% (PSI, GASU) to 20% (AISHE). The Engagement-Campus is at 6%, on average, and varies greatly from 0% (P&P) to 24% (USAT).
The emphasis on the Governance dimension ranks third, at 13%, on average, and ranges from 2% (ASGC) to 34% (AMAS). More than half of the SATs contribute between 10% and 20% to this dimension.
The emphasis on the Education dimension is 10%, on average, and varies from 1% (PSI) to 23% (SAQ). More than half the SATs have less than 10% in this dimension.
The least emphasized is the Research dimension, at 7%, on average, ranging from 0% (PSI) to 20% (AISHE). More than half the SATs have less than 5% in this dimension.
In conclusion, SATs generally show great emphasis on one dimension and largely ignore the others. Only a third of the SATs cover all five dimensions, and some SATs (ASSC, SUM) show a more balanced emphasis. The Operations-Environmental dimension is greatly emphasized by most of the SATs, and the Social and Financial Operations are less covered, while the Engagement and Governance dimensions are part of some SATs, and less emphasis is given to Education and Research-especially the Research dimension.

Emphasis on Topics and Issues per Dimension
A deeper analysis of emphasis was made by grouping indicators to issues and then summarizing issues to topics. The analysis of the indicators has been done by studying the descriptions, questions, examples, rationale, and sub-criteria (if provided). The total indicators were grouped to 148 issues belonging to 44 topics (Tables 5-12).
The topics included in the SATs are identified as follows: The topic is implied in the SAT. The topic is included and has at least two issues.
In the Governance dimension, 138 indicators are regrouped into 22 issues belonging to 10 topics. PSI and GASU cover almost all the topics. The most addressed topics are Strategic plan (13/15), Staff/expertise (10/15), and Management structure (9/15), while Transparency (3/15) is the least addressed.
In the Operations-Environmental aspect, 418 indicators are regrouped into 54 issues belonging to 14 topics. Toolkit, ASSC, and ASGC cover a large number of topics. Around two thirds of the SATs show a common emphasis on environmental topics. In the Operations-Social aspect, 167 indicators are regrouped into 20 issues belonging to 3 topics. PSI and GASU cover all the topics and offer more topics. In the Operations-Financial aspect, 82 indicators are regrouped into 12 issues belonging to 5 topics. GASU offers more topics and issues compared to others.
In the Education dimension, 84 indicators are regrouped into 9 issues belonging to 2 topics. STARS, USAT, and ASSC include slightly more issues than the others. The topic Students sustainability education (15/15) is addressed to some degree by all the SATs, while Staff sustainability training (9/15) is included less often.
In the Research dimension, 57 indicators are regrouped into 10 issues belonging to 3 topics. GASU and ASSC offer slightly more topics and issues than the others. The most addressed topic is Support for sustainable research (11/15), followed by Sustainable research (8/15) and Outputs and Implementation (7/15). SATs show an uneven emphasis and mostly include limited issues on the Research dimension.
In the Engagement-Campus aspect, 67 indicators are regrouped into 12 issues belonging to 4 topics. USAT, ASSC, and STARS cover almost all the topics and issues. The most addressed topic is Activities (13/15), while topics such as Organizations (5/15), Orientation (5/15), and Recruiting talent (2/15) are less addressed. In the Engagement-Public aspect, 82 indicators are regrouped into 9 issues belonging to 3 topics. ASSC covers almost all the topics and issues. The most addressed topic is Local and community service (14/15), while Public Participation (7/15) and Outreach programs (4/15) are less addressed.
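The per-dimension counts reported above can be checked against the stated totals of 1051 indicators, 148 issues, and 44 topics. The sketch below only re-adds the figures given in the text; reading the surplus in the indicator sum as assignments of indicators that belong to two dimensions or aspects is an interpretation, not something the text states, and the variable names are hypothetical.

```python
# (indicators, issues, topics) per dimension or aspect, as reported in the text.
REPORTED = {
    "Governance": (138, 22, 10),
    "Operations-Environmental": (418, 54, 14),
    "Operations-Social": (167, 20, 3),
    "Operations-Financial": (82, 12, 5),
    "Education": (84, 9, 2),
    "Research": (57, 10, 3),
    "Engagement-Campus": (67, 12, 4),
    "Engagement-Public": (82, 9, 3),
}
UNIQUE_INDICATORS = 1051  # indicators extracted from the 15 SATs


def column_totals(table):
    """Sum indicators, issues, and topics over all dimensions and aspects."""
    indicators = sum(row[0] for row in table.values())
    issues = sum(row[1] for row in table.values())
    topics = sum(row[2] for row in table.values())
    return indicators, issues, topics


totals = column_totals(REPORTED)
# Assumed interpretation: extra assignments come from indicators counted twice.
surplus_assignments = totals[0] - UNIQUE_INDICATORS

print(totals, surplus_assignments)
```

The issue and topic sums match the reported 148 and 44. The indicator assignments add up to 1095, which is 44 more than the 1051 extracted indicators and is consistent with some indicators being linked to two dimensions or aspects.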

| Dimension | Aspect | Issue of topic-SAT |
|---|---|---|
| Governance | | Coherence of Communication-GASU (7); Process and procedures of Transparency-PSI (9) |
| Operations | Environmental | Asset and facility of Environmental management, Circulation design of Transportation-ASSC (3); Site safety, Outdoor environment of Site-ASGC (15); Green office, lab, and IT of Buildings-Toolkit (13); Products and services of Purchasing and service-GASU (7) |
| Operations | Social | Guideline for earthquake of Working and living circumstances, Disaster prevention/support for local community of Social and environmental responsibility-ASSC (3) |
| Operations | Financial | Health and safety fines of Fines-PSI (9) |
| Education | | - |
| Research | | Graduate students of Outputs and Implementation-GASU (7) |
| Engagement | Campus | - |
| Engagement | Public | Disaster prevention/after strike education, Shared university assets of Local and community service-ASSC (3) |

Similarities and Differences in the SATs
The topics and issues analysis gives a clear understanding of the content of sustainable assessment in SATs. On the one hand, much common emphasis is identified in Operations-Environmental. On the other hand, SATs address various emphases according to their characteristics. This analysis identifies the common and unique emphasis of SATs (Table 14). The emphasis mainly results from the following characteristics of the SATs: (1) The global and local contexts: the context contributes to identifying the purposes and stages of SATs and offers unique issues according to the global trend in SD or local SDGs. (2) The purposes and stages: this characteristic recognizes the main function of the SATs according to their current SD stages and challenges in practice. Most early SD stage SATs tend to put much emphasis on a single dimension as the main driver for fostering SD (e.g., AMAS: Governance; SusHEI: Research; ASGC and GM: Operations-Environmental). More mature SD SATs tend to put a more balanced emphasis by offering topics and issues on related dimensions (e.g., AISHE, PSI, CSAF Core, GASU, P&P) or by showing a balanced emphasis on the five dimensions (e.g., ASSC, SUM). (3) The background or focus: this is more related to the SATs' own orientation in assessing SD (e.g., GASU: modification of GRI; PSI: focusing on environmental and social topics). The local and global contexts, purposes and stages, and backgrounds or focus of the SATs contribute to characterizing their emphasis.
Note: the dimensions or aspects can be emphasized by the SAT (in topics and issues), much emphasized by the SAT (in dimensions and aspects), or strongly emphasized by the SAT (in both dimensions and aspects, and topics and issues).

Guidelines for Developing a Sustainability Assessment Tool for China
The comparative analysis in Section 3.1 provides important components, drawn from the characteristics and emphasis of SATs, to include in the guidelines for the new Chinese SAT. The components were selected from the existing SATs and discussed and evaluated from the perspective of Chinese HEIs. Through the online workshop, the current SD of Chinese HEIs was discussed, and then the guidelines were formulated based on the components.

Current Sustainable Development of Chinese HEIs
To begin with, the current SD of Chinese HEIs was discussed, taking the HEIs in the Beijing-Tianjin-Hebei region as an example. Opportunities and challenges were identified in the online workshop, and three major requirements for the SAT were proposed.
Proposing more balanced campus SDGs. National policies on SD campuses have been released to support HEIs in moving toward sustainability (Figure 4). A shift from environmentally sustainable campuses to more comprehensive green campuses can be identified, which calls for more balanced SDGs in sustainability assessment.
Considering the uneven SD of HEIs and supporting campuses at all stages of SD. Even with the support of national policy and funds, the implementation of SD practice in HEIs is not balanced. Only 100 HEIs (33%) had successfully constructed CEMs by the end of 2017. According to Alexio's definition of the stages of SD in HEIs, HEIs that can immediately adopt most SD practices (e.g., Tsinghua University) are "Innovators", while HEIs that are the last to adopt are "Laggards" [60]. It is important to consider both the "Innovators" and the "Laggards" and to support their SD through assessment.
Bridging the gap between national policy, implementation, and assessment. As has been concluded, a SAT that aligns its criteria for assessment with the procedure for implementation would bring practical benefits to HEIs [36]. Therefore, the SAT would be a tool for both assessment and implementation toward current SDGs.

Guidelines for Developing the Chinese SAT
The guidelines were discussed and revised in a 2-round online workshop. The experts reached an understanding, and the following guidelines were suggested for the development of a new Chinese SAT.

Purpose and Stage
Chinese HEIs are still in an early stage of SD, as was identified by almost all the experts (97%). Therefore, the SAT for China needs not only to act as a self-assessment tool that identifies the current status of sustainability but also to play a positive role in guiding further implementation for SD. The main purposes of the SAT are recognized as (1) Identifying the overall sustainability picture (90%), (2) Acting as a benchmarking tool to build up the baseline for comparison (83%), and (3) Acting as a strategic tool for guiding and managing implementation (72%). These purposes are linked to the early stages, but it is remarkable that the goals "ranking tool" and "raise consciousness" were not selected, even though they belong to the first stages of SD (see Figure 2).
Identifying the overall sustainability picture is one of the first steps in sustainability assessment. The Chinese SAT is expected to assess the SD status through its application, as in AMAS, SusHEI, and USAT.
The purpose of benchmarking is also highly valued, as in the current Chinese tool ASGC. Benchmarking tools build up baselines for comparison, which are regarded as basic goals for SD. Reasonable baselines for comparison are of critical importance: they need to be both leading and achievable for HEIs. More empirical case studies of HEIs would contribute to setting reasonable baselines for benchmarking tools.
It is recommended that the Chinese SAT work as a strategic tool to guide the SD implementation of HEIs in different SD stages, as well as to bridge the gap between national policy, implementation, and assessment. For this purpose, Toolkit, SUM, AISHE (early and mature), and ASSC (mature) could serve as references. The Deming cycle of "plan-do-check-act" is applied to SATs at the framework level (AISHE, SUM) or the issue level (ASSC, Toolkit), which offers a closed-loop implementation process. Based on the Deming cycle, it would be beneficial for the Chinese SAT to introduce implementation strategies, track continuous changes, and encourage HEIs to propose new solutions in SD.

Type of Indicators
From the discussion, quantitative indicators are of practical importance in assessment, especially for measuring environmental issues. However, for HEIs that have not applied CEMs, it is still challenging to offer environmental operation data.
After two rounds of discussion, the experts agreed upon the following guidelines for scoring quantitative indicators: when quantitative data are lacking for assessment, especially for HEIs that have not applied CEMs, it is acceptable to offer alternatives, and detailed documents and calculation processes can be requested. The alternatives are (1) lowering the requirements for the quantitative data and offering part of the total credits (as in P&P); (2) encouraging the provision of more accurate and systematic data and awarding extra credits (as in ASSC).
These guidelines encourage HEIs to enroll in assessment without depending only on data from CEMs. More importantly, they prompt HEIs to improve their data collection and management capabilities to enhance the coverage and accuracy of the data.
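The two scoring alternatives can be illustrated with a small sketch. The evidence levels, the 50% partial-credit fraction, the one-credit bonus, and the function name are all hypothetical and were not specified by the workshop, P&P, or ASSC; the sketch also assumes that the HEI otherwise meets the indicator's requirement.

```python
from enum import Enum


class Evidence(Enum):
    """Hypothetical evidence levels an HEI might submit for one indicator."""
    NONE = 0        # nothing submitted
    DOCUMENTED = 1  # estimate backed by documents and a calculation process
    SYSTEMATIC = 2  # accurate, systematic data (e.g., from CEMs)


def indicator_credits(max_credits, evidence, partial_fraction=0.5, bonus_credits=1.0):
    """Credits for one quantitative indicator under the two alternatives.

    partial_fraction and bonus_credits are illustrative values only.
    """
    if evidence is Evidence.NONE:
        return 0.0
    if evidence is Evidence.DOCUMENTED:
        # Alternative (1): lowered data requirement, part of the total credits.
        return max_credits * partial_fraction
    # Alternative (2): more accurate and systematic data earns extra credits.
    return max_credits + bonus_credits


for level in Evidence:
    print(level.name, indicator_credits(4.0, level))
```

Keeping the partial-credit fraction and the bonus as parameters reflects that a tool developer would need to calibrate them, for instance through the empirical case studies mentioned for setting baselines.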

Assessment and Data Validation
Self-assessment is a popular type of assessment. To support self-assessment, online reporting tools are recommended. They have mainly been developed for mature SD stage SATs, like STARS and ASSC, but they are also a good option for early stage SATs as a direct and convenient means of self-assessment.

Result Publication
Result publication on a website is recommended for the Chinese SAT. Even though it is used mostly for ranking tools (GM, P&P) and mature stages of benchmarking (STARS, PSI), it is an effective way to raise national awareness and encourage the exchange of experiences.

Emphasis on Sustainable Dimensions and Topics
It is necessary for the Chinese SAT to build up a more balanced emphasis on the dimensions and aspects. The current ASGC overemphasizes the Operations-Environmental dimension (73%) and shows little emphasis on Governance (2%) and Research (2%) and no emphasis on Operations-Financial (0%).
Based on the agreement that a more balanced emphasis is favorable, detailed questions were put to the experts. First, the experts expressed the ideal emphasis when considering Operations-Environmental alone; most responses indicated an emphasis of 50-60% or 60-70%.
Secondly, when considering the emphasis of all five dimensions, one third of the respondents lowered the percentage of Operations-Environmental. The ideal emphasis of dimensions and aspects was then proposed (Table 15). A decrease in the emphasis of Operations-Environmental compared to ASGC (from 73% to 53%) was identified. As a result, Operations-Financial (+7%), Governance (+6%), and Research (+4%) increased. The new emphasis is more balanced compared to ASGC, but a gap remains with regard to other SATs. In general, the SATs in the early SD stage show a more balanced emphasis.
Thirdly, taking the next 5-year SD plan of an HEI as an example, the experts were asked to rank the priority of investment in the five dimensions according to their importance (No. 1 important, No. 2 important, No. 3 important . . . ). The responses show that Operations-Environmental is still of primary importance (70%), followed by Operations-Financial (40%) and Governance (38%). As a result, the Operations-Environmental dimension remains the primary and greatest emphasis for the new Chinese SAT.
Next, topics were selected according to their importance for the current SD of Chinese HEIs. The result also points to the primary emphasis on Operations-Environmental. In general, no topics were excluded on the basis of the average score. All the topics belonging to Operations-Environmental were scored highly (over 4), but some relatively unimportant topics (average score between 3.4 and 3.9) were identified in Governance (Commitments, Network), Operations-Social (Human rights of student and staff, Social and environmental responsibility), Operations-Financial (Fines, Fees and wages, Ethically and local development), Engagement-Campus (Organizations, Recruiting talent), and Engagement-Public (Programs).
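The shift from the ASGC emphasis to the proposed emphasis can be checked with simple arithmetic. The sketch below uses only the percentages reported in the text; it assumes that the reported increases are percentage points and that each set of emphases sums to 100%, and the variable names are illustrative.

```python
# Emphasis of ASGC (% of the tool) as reported in the text; dimensions whose
# values are not reported in this section are deliberately left out.
asgc_emphasis = {
    "Operations-Environmental": 73,
    "Operations-Financial": 0,
    "Governance": 2,
    "Research": 2,
}

# Reported shifts, in percentage points, from ASGC to the proposed emphasis.
reported_shift = {
    "Operations-Environmental": -20,  # from 73% to 53%
    "Operations-Financial": 7,
    "Governance": 6,
    "Research": 4,
}

proposed_emphasis = {
    dimension: asgc_emphasis[dimension] + reported_shift[dimension]
    for dimension in asgc_emphasis
}

# Points released by Operations-Environmental that the three reported
# increases do not account for; under the sum-to-100% assumption they go
# to the dimensions not listed here.
unallocated = -sum(reported_shift.values())

for dimension, share in proposed_emphasis.items():
    print(f"{dimension}: {asgc_emphasis[dimension]}% -> {share}%")
print(f"Net shift to dimensions not listed here: {unallocated} points")
```

Under these assumptions, the three reported increases absorb 17 of the 20 points released by Operations-Environmental, which leaves about 3 points for the remaining dimensions.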

Discussion
The literature review provides a list of SATs reviewed in previous studies as the basis for analysis. SATs that offer a holistic framework for assessing SD were selected for comparative analysis. As has been discussed (e.g., [8,18]), it is challenging for HEIs in early SD stages to enroll in SATs designed for HEIs in mature SD stages or in SATs that do not fully address their regional issues. Therefore, it is of practical importance to (re)develop or adapt SATs to support regional SD.
This research identifies the positive roles of SATs in both early and mature SD stages to support global and regional SD. The characteristics of SATs have been discussed (e.g., [18,20]). Based on that, a further analysis was made to map the roles of SATs by context, purpose, and stage. There is no absolute boundary between global and regional or early and mature stage SATs.
The classification of the selected SATs contributes to a clear understanding of the characteristics and their impact on emphasis. Early stage SATs have been developed for multiple purposes to support SD in their current situation, ranging from ranking, raising consciousness, identifying the overall sustainability picture, and strategic concerns to benchmarking. For mature stage SATs, however, the main purpose is benchmarking. In general, early SD stage HEIs need multiple-function SATs to support raising awareness, understanding, and the management of SD in practice. For this, a toolkit consisting of SATs for the multiple needs of HEIs is recommended for future study (as stated in [60]).
The analysis of indicators by dimension, aspect, topic, and issue frames an overall picture of the common and unique emphasis of SATs. As shown previously, the Operations-Environmental dimension is greatly emphasized by most SATs [19], and much common emphasis is identified at the topic and issue level. This is related to the common understanding of the environmental sustainability of HEIs, while imbalanced emphasis was shown in other dimensions (Operations-Social, Operations-Financial, Governance, Education, Engagement), and especially Research (rather less emphasis) [20]. Although these dimensions have been underlined and regarded as important elements of HEI sustainability, fewer common topics and issues were found in them. It would be beneficial to analyze the main emphasis and its impact on SD practice in order to update SATs and determine the next steps in SD.
The analysis explains how the similar and different emphases of SATs result from their characteristics. The global and regional contexts, purposes and stages, and backgrounds or focus of SATs contribute to characterizing their emphasis. These characteristics respond to the current SD of HEIs, the challenges and solutions of SD practice, and the SATs' own orientation. Early SD stage SATs tend to put much emphasis on a single dimension as the main driver for SD, while more mature SD SATs tend to show a more balanced emphasis. With the progress of SD, this emphasis will continue to change to reflect current SDGs. It would be beneficial to create a framework for the comparative analysis of existing SATs that considers their characteristics and maps their positions and contributions in the global process of SD, as a reference and database for SATs.
Taking the early SD stage Chinese HEIs as an example, this research identifies the multiple purposes and important components of the SATs. The trend of quantitative indicators can be identified in SATs [19], which is also favored in the Chinese SAT, especially for measuring Operations-Environmental topics.
However, the answer format and scoring method of quantitative indicators should match the availability of data. For HEIs at an early SD stage, it is necessary to offer alternatives to encourage participation in assessment and improvement in data collection mechanisms.
This analysis also provides components for developing or modifying SATs, which could be applied to early SD stages in other contexts. Based on the overall picture of purposes and stages, the position of a SAT can be clearly identified according to its current SD. The components of a SAT could be selected from this analysis and used as input to develop new SATs. It is recommended to learn from the components by looking at SATs of similar stages or contexts, as well as at SATs in a more mature stage. Then, the components can be identified according to the local context, purposes, and focus. It is important to improve the SAT continuously so that it adapts to the current SD situation and supports the SDGs.
The analysis has some limitations that could be explored in future research. First, it takes the HEIs in the Beijing-Tianjin-Hebei region as an example, which covers only part of the regional SD of Chinese HEIs. Moreover, although targeted experts were included from our network and published papers, experts from HEIs that are not fully aware of SD or have not made their knowledge public might not have been included. Second, the comparative analysis of SATs was mainly approached from the relationship between the SD stages and characteristics. The characteristics and their impacts on SD practice were less explored. Third, the proposed guidelines might be limited to the components of the selected SATs, without a broader perspective.
Future research should further explore the Chinese HEIs and include experts from a wider range of HEIs to gain a more complete picture of SD. In addition, it would be practical to conduct a deeper analysis of SATs, considering the SD stages, characteristics, and their effects in practice to provide references for HEIs. Moreover, the study can be extended through empirical analysis to test the guidelines and propose components from the Chinese context.

Conclusions
This research aims to identify the important characteristics for developing SATs for China. To accomplish this goal, a comparative analysis of 15 SATs was made. This analysis resulted in components for developing the new Chinese SAT. These components were selected and discussed in an online workshop with a 34-person Chinese research team to formulate guidelines as input to develop a SAT.
Some important basic characteristics for developing SATs were identified, ranging from context to purpose and stage, type of indicators, assessment and data validation, result publication, and emphasis. The analysis mapped the positions of SATs regarding purpose and stage and identified the main characteristics and their impact on emphasis. In this way, the important components were identified for developing and updating SATs.
For the current SD stage in China, three main purposes of the SAT are recognized: (1) Identifying the overall sustainability picture, (2) Benchmarking, and (3) Strategic management. Quantitative indicators are highly valued in the Chinese SAT, and it is necessary to offer alternatives when quantitative data are lacking, especially for HEIs that have not applied CEMs. In addition, to support participation and information exchange, an online reporting tool and website publication are recommended.
Based on the analysis and discussion in the workshop, a more balanced emphasis including the five key dimensions is proposed for the Chinese SAT. A decrease in the emphasis on Operations-Environmental was identified, which led to an increase in emphasis on Operations-Financial, Governance, and Research. Even though Operations-Environmental is still of the greatest importance in the current SD assessment, the more balanced emphasis highlights the importance of combining these dimensions.
From the comparison of 15 SATs and the discussion in the workshop, the recommendations for developing the SAT for HEIs in China are proposed, which also shed light on developing SATs in an early SD stage. With a clearer understanding of the characteristics and emphasis of the SATs, HEIs in both early and mature SD stages will be better equipped to support and lead regional and global sustainability.

Figure A1. The screening process of articles. Stages shown: title and abstract (relevant topical areas: SD of HEIs; irrelevant topical areas: such as SD of schools, institutions, and systems outside HEIs; relevant topical areas: SD of HEIs as a whole system; parts of the topical areas: such as SD of HEIs buildings, transportation, curriculums) and full text (comparative analysis of SATs, at least 3 SATs).
Note: Y for yes, the SAT is included. N for no, the SAT is excluded.
Three SATs were excluded. Cool Schools (No. 17 CS) was excluded, for it was a "snapshot" of data that institutions submitted via STARS. Refined Campus Sustainability Assessment Framework (No. 19 CASF) was excluded, for it is a modification of CSAF and STARS for Malaysian HEIs. The Uncertainty-based DPSEEA-Sustainability index model (No. 71 uD-SiM) was excluded, for it is a decision-making tool that does not assess overall campus sustainability.