Designing a Sustainability Assessment Framework for Selecting Sustainable Wastewater Treatment Technologies in Corporate Asset Decisions

Abstract: There is a growing demand for an integrated assessment to identify and select asset management options based on sustainability in the wastewater industry. However, water companies are often not equipped with a rigorous methodology and sufficient resources to perform sustainability assessments. Although many frameworks and tools for sustainability assessment have been developed in academia, practical challenges such as feasibility and usability remain when implementing sustainability assessment methods to support corporate decision-making. This study developed a Multi-Criteria Analysis-based framework to evaluate wastewater treatment processes from a sustainability perspective. This study first explored the decision and organizational context of a water company with preliminary interviews and then applied the Analytical Hierarchy Process (AHP) with composite scores to evaluate wastewater technologies at a sewage treatment works. The preliminary interviews with stakeholders highlighted that the existing investment decisions were primarily driven by financial cost and compliance, whilst stakeholders called for a wider consideration of other criteria. A selection of assessment criteria and indicators was then proposed to compare seven treatment technologies at a sewage treatment works. The results of composite scores indicated that the baseline activated sludge process (ASP) was the best option for this study. Experience from the development process highlighted that usability, stakeholder engagement and the organizational context should all be considered as part of the design and implementation of the sustainability assessment. The insights from this study provide a valuable practical foundation for applying a multi-criteria approach to perform sustainability assessments and inform asset management decisions in the water company.


Introduction
Water sanitation is a vital service for individual and societal wellbeing. However, approximately 4.2 billion people globally do not have access to safely managed sanitation services [1]. The United Nations has included sustainable management of clean water and sanitation as one of the Sustainable Development Goals (SDGs). In the UK, approximately 16.6 billion litres of water are delivered to 63.9 million customers every day and the wastewater is collected through almost 400,000 kilometres of sewers and treated at around 7000 sewage treatment works [2]. Given that the majority of the water supply in the UK is abstracted from surface water sources such as rivers and reservoirs, wastewater treatment is a crucial process to restore wastewater to a quality that is safe to discharge to the local environment. However, the UK wastewater sector is currently facing increasing challenges from regulation changes, population growth and climate change. Besides the existing treatment and discharge standards established in the EU Urban Waste Water Treatment Directive [3], there is increasing scrutiny on the environmental and ecological quality of catchments in the UK. Water and sewerage companies are required to comply with more stringent effluent standards and to provide more resilient operation and services. Besides increasing regulatory pressure, there is also increasing service demand as the urban population continues to rise. For example, the population of the Greater London Area is expected to increase from 8.75 million to 15 million by 2100 [4]. The shortfall to meet future demand is further exacerbated by the impact of climate change. A 35% increase in rainfall in winter is expected by 2070 under the worst-case climate change projection [5]. For combined sewage network systems, excessive runoff and stormflow may compromise treatment capacity and quality, posing a greater risk of pollution incidents and sewer flooding during peak demand [6].
The combination of challenges raises concern over the sustainability of water and wastewater services for future generations [7]. Sufficient analysis is therefore needed to evaluate different wastewater treatment processes or technologies before an asset decision is made.
Sustainability assessments are increasingly popular tools used in management practices and decision-making processes [8,9]. Sustainability assessment includes a broad range of assessment tools and is fundamentally associated with the practice of impact assessment. Typically, sustainability assessments aim to evaluate the future consequences of current or proposed options and inform decision-making [10,11]. Hugé et al. (2011) discussed three major elements of using sustainability assessment to assist the decision-making process, namely "interpretation, information-structuring and influence" [12]. Specifically, sustainability should be interpreted and tailored to a particular social and organizational context. Information-structuring refers to developing an understanding of the complexity of sustainability in that context. Lastly, sustainability assessment should exert a strong influence on decision-making and implementation of sustainability [9]. Hugé et al. (2011) and Sala et al. (2015) provided a comprehensive list of assessment principles based on the current state of the art in sustainability science [12,13]. Based on this foundation, we provide an adjusted list of key criteria to be considered in sustainability assessment, including:

1. Comprehensiveness: sustainability assessment should cover a holistic scope integrating the environmental, social and economic dimensions of sustainability.
2. Stakeholder engagement: continuous engagement and communication with stakeholders is highly recommended to understand and make trade-offs effectively and openly [8,14].
3. Pluralism: compared to the prescriptive process of other impact assessments, each sustainability assessment process should be designed and tailored to the specific context [15].
4. Transparency: the assessment process should be transparent in terms of the data source, methodological design and its justification so that it allows criticism and improvement [13,16].
5. Intergenerational equity: wider and long-term impacts should also be assessed to ensure the decision demonstrates corporate social responsibility and value for future generations [17].

There has been much development of tools and techniques to perform sustainability assessments. Gasparatos and Scolobig (2012) provided an overview of three families of sustainability assessment tools based on the underlying valuation perspective [18]. Biophysical tools, such as life cycle assessment (LCA) and ecological accounting, evaluate the flow of natural resources and environmental impacts. Monetary tools, such as cost-benefit analysis, aim to provide valuations based on the subjective value preference of individuals. The third family is indicator-based tools, which include the selection of indicators, weightings, scoring and aggregation. Although assessment tools focusing on a single valuation perspective are well developed, there is a growing demand for integrated assessment approaches incorporating a holistic sustainability perspective of the 'Three Bottom Lines', which include Environmental, Social and Economic sustainability [7,19,20]. From a managerial point of view, water companies are required to understand the complexity of sustainability in their relevant context and evaluate trade-offs between different sustainability criteria [21].

Multi-Criteria Decision Analysis (MCDA) is a collective term for methods that deal with multiple and often conflicting criteria and identify the most preferred option based on the preference systems of decision makers [19,22]. MCDA guides a logical and coherent decision-making process by applying a standardised method. There is a variety of methods in MCDA such as the Analytical Hierarchy Process (AHP), Multi-Attribute Value Theory (MAVT) and outranking methods [22]. MCDA has been widely applied in the areas of environmental science and management, and comprehensive reviews were provided by Kiker et al. (2005) [23] and Huang et al. (2011) [24].
Sustainability criteria and indicators are commonly used in MCDA to provide a measurement system. Sustainability criteria can be generally defined as the requirements or standards to achieve sustainable services and products in a specific context. Indicators are the specific measurements or assignments of value that reflect the fulfilment of assessment criteria and the overall sustainability [25]. Balkema et al. (2002) provided a comprehensive list of indicators from previous studies to compare wastewater treatment systems based on the environmental, technical, social-cultural and economic dimensions [26]. Common environmental indicators include energy use and pollutant removal potentials. The amount of energy used to operate the wastewater system has a direct impact on the cost of operations as well as the carbon footprint. Compliance with the local discharge standard is a paramount objective of sewage treatment works and therefore the pollutant removal potentials of the treatment process are key investment and performance criteria. In terms of economic indicators, both capital expenditure (Capex) and operational expenditure (Opex) are most commonly used, though the scope of costings varies across different studies. Although environmental and economic indicators are well developed, social indicators are often overlooked due to difficulties in measurement and quantification [26-28]. However, social indicators can be converted into a quantitative format using a point-based scale [29,30]. Previous studies have also made attempts to integrate all sustainability dimensions when assessing wastewater treatment systems [26,27,29]. As a complex multi-criteria problem, trade-offs between different sustainability dimensions are often necessary to reach a final selection among options [31]. As a result, weightings have been developed to aggregate indicators into a composite index for each option. Gherghel et al. (2020) used a Weighted Sum Model to aggregate the performance of six criteria into a 'preference index' to compare different wastewater treatment schemes [32]. Molinos-Senante et al. (2014) developed a 'Global Sustainability Indicator' to evaluate seven wastewater treatment technologies based on a set of indicators in environmental, social and economic dimensions [29]. Similarly, Plakas et al. (2016) used a Weighted Sum Model to aggregate the performance of a range of indicators based on the three pillars of sustainability to assess four types of tertiary treatment technologies [33]. These studies demonstrated that a composite indicator can be a viable approach to integrate multiple sustainability criteria.
Although progress has been made in developing sustainability assessments and indicators for the water industry, so far no study has explicitly discussed the practical challenges of developing and implementing a sustainability assessment in a corporate environment (i.e., water companies). Although a plethora of sustainability assessment methods has emerged from the academic literature, water companies are often not equipped with a clear methodology and guidance on how to adopt a sustainability assessment approach and make it usable and feasible. Additionally, in comparison to the abundance of assessment tools on water quality supply management [34,35], the implementation of similar tools specifically for wastewater asset management is still limited. In light of such a demand, this study aims to develop and propose a sustainability assessment framework based on an MCDA approach to assess and compare the sustainability of different wastewater treatment options in a water company based in the United Kingdom. In comparison to previous MCDA studies, the assessment framework developed in this study needs to be compatible with the corporate setting and the preferences of stakeholders. Thus, the development process also explores the decision-making context in the water company.

Materials and Methods
The development process adopted a mixed-methods approach based on a pragmatism research paradigm. The development of the assessment framework was divided into five stages (Figure 1), namely: Understanding the decision context; Development of the criteria hierarchy; Development of weightings; Score aggregation and options ranking; and Sensitivity analysis. The development process was applied to compare different wastewater treatment processes at a sewage treatment works owned by the water company. The results are presented following the same structure and order.

Understand the Decision Context
Semi-structured interviews were conducted with internal stakeholders as a preliminary study to establish an understanding of the current drivers and challenges of making wastewater asset decisions in the water company. They also served as an opportunity to engage with key stakeholders during the development process of the assessment framework. A stakeholder analysis was conducted to identify the key project stakeholders and departments for the recruitment of participants. Each interview consisted of three key questions: (1) What is your role and daily activities in the company? (2) What are the current decision-making practices and drivers for wastewater asset investment? (3) What are the biggest challenges in making that decision?

An Ethics Review was conducted and approved by the Research Ethics Committee of the University of Surrey (Ref: UEC 2018 081 FEPS). Upon receiving consent from interviewees, interviews were recorded and transcribed for coding and thematic analysis using NVivo 12®. Repeating words or phrases related to decision-making and sustainability were coded and then collated into different groups of themes. A thematic network was established as a decision map to visualise the nodes and themes that emerged from the interviews. The preliminary insights from this part of the study were then used to tailor the selection of sustainability criteria and indicators.

Development of the Criteria Hierarchy
The development of a hierarchy with relevant assessment criteria and indicators is a critical step to guide multi-criteria measurements towards the overarching decision objective. For this study, a suite of relevant sustainability criteria and indicators was proposed based on information from the literature review and the results from the thematic analysis of interviews. First, a literature review was carried out to collate indicators that have been widely used to evaluate wastewater treatment systems in previous studies (Table 1). Then, this list was further reviewed and refined to propose a set of relevant assessment indicators based on the key decision priorities mentioned in the preliminary interviews.

AHP was selected as the methodological foundation for developing the weights. Developed by Thomas Saaty [40], it is the most widely applied MCDA approach by number of applications [41,42]. For this study, AHP was preferred due to its simplicity in practice and its developed theoretical foundation [43]. The operation of AHP is based on three components: anatomy of the problem as a hierarchical structure, pairwise comparisons and calculation of criteria priorities (i.e., weights) [44]. Pairwise comparison is the primary task of AHP. The fundamental question to be asked is 'how important is criterion a compared to criterion b?' Each comparison determines the direction and degree of importance between two criteria or indicators using a semantic scale (Table 2). For example, a scale number of 3 means that criterion a is moderately more important than criterion b, whereas 1/3 indicates the reversed preference direction (criterion b is moderately more important than criterion a).

Table 2. The semantic scale for pairwise comparisons in AHP [40].

Scale          Definition
1              Equally important
3              Moderately more important
5              Strongly more important
7              Very strongly more important
9              Extremely more important
2, 4, 6, 8     Intermediate values between two scale points
Reciprocals    The preference order is inversed

In terms of the collection of preference judgements, stakeholders were invited via an online questionnaire to perform pairwise comparisons between criteria and indicators. Microsoft Forms™ was used as the platform for conducting the questionnaire due to its user-friendly features as well as its compatibility with the company data policy. Each participant provided their preference judgements through a series of pairwise comparison questions. First, the comparisons were made at the parent level (i.e., criteria) of the hierarchy and then at the corresponding lower level (i.e., indicators) of the hierarchy. A reciprocal matrix of m × m is constructed based on the m criteria (or indicators) to be compared (Equation (1)):

$$A = \begin{pmatrix} 1 & a_{1,2} & \cdots & a_{1,m} \\ 1/a_{1,2} & 1 & \cdots & a_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1,m} & 1/a_{2,m} & \cdots & 1 \end{pmatrix} \quad (1)$$

where $a_{1,m}$ indicates the judgement made between the first criterion and the m-th criterion, etc. In total, m(m − 1)/2 comparisons are required per matrix given the property of reciprocity in AHP. m + 1 matrices are required to calculate weights for each stakeholder (1 for comparisons between all criteria at the top level and m for comparisons between indicators with respect to each criterion).
Once the judgements of pairwise comparisons were collected, the weights can be acquired by either calculating the eigenvector of the matrix or the geometric mean of each row, which provide similar results [45]. The geometric mean was used for this study due to its simplicity and compatibility with Microsoft Excel. First, the geometric mean of the elements in the r-th row of an m × m matrix was calculated as

$$A_r = \left( \prod_{j=1}^{m} a_{r,j} \right)^{1/m} \quad (2)$$

and then $A_r$ was normalized by the sum of the geometric means of all rows to compute the weight of the r-th criterion,

$$w_r = \frac{A_r}{\sum_{i=1}^{m} A_i} \quad (3)$$

so that $\sum_{i=1}^{m} w_i = 1$.
A consistency ratio check was also conducted to determine the consistency levels of judgements from participants [40]. Responses with consistency ratios greater than 0.1 were considered unreliable and were excluded from further aggregation of the group weightings.
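The weighting and consistency steps described above can be illustrated with a minimal Python sketch. It is not the Excel workbook used in the study. The comparison matrix is hypothetical, the random index values are the commonly tabulated Saaty values, and the principal eigenvalue is estimated from the ratios (Aw)_i / w_i, which is an assumption because the text does not state how the consistency ratio was computed.

```python
import math

# Saaty's random consistency index (RI) by matrix size (commonly tabulated values).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Row geometric means of a reciprocal matrix, normalised to sum to 1."""
    m = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / m) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, with lambda_max estimated as the mean of (A w)_i / w_i.

    The eigenvalue estimate is an assumed, commonly used approximation.
    """
    m = len(matrix)
    if m < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / m
    ci = (lambda_max - m) / (m - 1)
    return ci / RANDOM_INDEX[m]


# Hypothetical judgements for three criteria (not data from the study).
example = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
w = ahp_weights(example)
cr = consistency_ratio(example, w)
print([round(x, 3) for x in w], round(cr, 3))
```

A response would be kept for group aggregation only if `cr` does not exceed the 0.1 threshold stated above.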

Group Weightings
Each questionnaire generated one set of responses from a stakeholder and thus one weighting profile. A group weighting is required to represent the collective judgements on the importance of assessment criteria and indicators. According to Belton and Pictet [46], there are three ways of developing the group weighting, namely 'sharing, comparing or aggregating'. 'Sharing' refers to the exchange of opinions and preferences among all stakeholders to reach a consensus that serves as a single input of preference judgements for the AHP calculation. 'Comparing' refers to the comparison of weightings developed from individual preference judgements and deciding which set of weights is the most representative. 'Aggregating' means the mathematical aggregation of individual weightings. Mathematical aggregation was considered more suitable for the setting of the online questionnaire (especially due to the prevailing COVID-19 social distancing conditions when this research was conducted). For this study, the geometric mean was used to aggregate individual weightings into a group weighting as shown:

$$\omega^{g}_{k} = \left( \prod_{s=1}^{q} \omega_{k,s} \right)^{1/q} \quad (4)$$

where $\omega^{g}_{k}$ is the aggregated group weight of the k-th criterion (or indicator) based on q stakeholders and $\omega_{k,s}$ is the weight given to it by the s-th stakeholder. Similarly to the normalization of weights from the reciprocal matrix (Equation (3)), the aggregated group weights were normalized so that the weights of all criteria sum to 1.
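As an illustration of the 'aggregating' route, the sketch below takes the geometric mean of each criterion's weight across stakeholders and then re-normalises. The function name and the three weighting profiles are hypothetical and are not taken from the study's responses.

```python
import math


def group_weights(individual_weights):
    """Geometric mean across stakeholders per criterion, normalised to sum to 1.

    individual_weights: one list of weights per stakeholder, all the same length.
    """
    q = len(individual_weights)          # number of stakeholders
    n = len(individual_weights[0])       # number of criteria (or indicators)
    raw = [
        math.prod(profile[k] for profile in individual_weights) ** (1.0 / q)
        for k in range(n)
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical weighting profiles from three stakeholders over four criteria.
profiles = [
    [0.40, 0.30, 0.20, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.50, 0.20, 0.20, 0.10],
]
group = group_weights(profiles)
print([round(g, 3) for g in group])
```

The re-normalisation matters because the geometric means of several normalised profiles do not generally sum to 1 on their own.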

Score Aggregation and Options Ranking
A sewage treatment works (denoted as STW A) was selected for the application of the criteria hierarchy and group weightings developed in the previous stages. The name and location of the sewage treatment works have been anonymised due to business confidentiality. STW A has a designed capacity of approximately 100,000 population equivalent and seven treatment technologies were considered for the implementation of the treatment process. To make those options comparable, each technology needs to be included in a separate process design. Figure 2 shows the basic process configurations of the seven options to be compared, which were pre-defined by asset planners in a previous study. An optioneering study on this STW was previously conducted by asset planners in the company, in which the same options were compared without using the sustainability assessment. The report of that study is not publicly available as it is for internal use only. Detailed process designs are not included as the technical aspects of each wastewater treatment process are not discussed in this paper.
The criteria hierarchy and weightings developed in this study were then applied to assess the overall scoring of each option. Given that empirical data for some treatment technologies were not available, the assessment of each indicator (except Opex and Capex) was provided in the format of performance ratings using a 5-point scale based on expert judgements. The Opex and Capex of each option were estimated quantitatively by stakeholders using a business costing estimation model and were converted into the same performance ratings using a linear transformation. Then, a Weighted Sum Model was used to synthesise the performance ratings of indicators ($v_1, v_2, \ldots, v_n$) and their corresponding weights ($w_1, w_2, \ldots, w_n$) into a composite score $S_i$ for the i-th option, denoted as

$$S_i = \sum_{k=1}^{n} w_k v_{ik} \quad (5)$$

The assessment scope for this study includes pre-treatment, secondary treatment and sludge treatment and disposal of each option. The fulfilment of sustainability criteria and indicators was assessed based on this scope.

Visualisations of the composite scores and the performances in the individual sustainability dimensions were provided to assist the comparisons between options. Additionally, the ranking based on composite scores was compared to the previous ranking agreed between stakeholders as a reference. Spearman's rank coefficient test was undertaken to examine the similarity of decision results based on rankings.
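The scoring and ranking comparison can be sketched as follows. All numbers are hypothetical. The direction and end points of the linear cost transformation (cheapest option mapped to 5, most expensive to 1) are assumptions, since the text does not specify them, and only the Spearman coefficient for untied ranks is computed, not a significance test.

```python
def cost_to_rating(costs, low=1.0, high=5.0):
    """Linearly map costs onto the rating scale; the cheapest option gets `high`.

    The direction and end points of the mapping are assumptions for illustration.
    """
    c_min, c_max = min(costs), max(costs)
    if c_max == c_min:
        return [high for _ in costs]
    return [high - (c - c_min) * (high - low) / (c_max - c_min) for c in costs]


def composite_scores(ratings, weights):
    """Weighted Sum Model: S_i = sum over k of w_k * v_ik for each option i."""
    return [sum(w * v for w, v in zip(weights, row)) for row in ratings]


def ranks(scores):
    """Rank 1 is the highest score; assumes there are no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def spearman_rho(rank_a, rank_b):
    """Spearman's coefficient for untied ranks: 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


# Hypothetical data: three options scored on Opex plus two expert-rated indicators.
opex = [120.0, 100.0, 160.0]        # illustrative costs, arbitrary units
opex_rating = cost_to_rating(opex)
expert = [[4, 3], [3, 5], [5, 2]]   # illustrative 5-point expert ratings
ratings = [[o] + e for o, e in zip(opex_rating, expert)]
weights = [0.5, 0.3, 0.2]           # illustrative indicator weights summing to 1
scores = composite_scores(ratings, weights)
new_rank = ranks(scores)
previous_rank = [2, 1, 3]           # illustrative earlier ranking
rho = spearman_rho(new_rank, previous_rank)
print(scores, new_rank, rho)
```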

Sensitivity Analysis
It is a common practice in AHP to analyze the sensitivity of the composite score and the ranking of options to potential changes in criteria weights. This study included two elements of sensitivity analysis. First, the ranking of options was compared between using the aggregated group weightings and individual weightings to examine the consistency of rankings as a result of different weighting profiles.
The second element was to identify the most critical indicator by calculating the minimum change required in weights to cause a rank reversal. As the aim of this case study is to identify the best option, only the rank reversal between the top two options was considered. The steps for identifying the most critical indicator were based on the theorems developed by Triantaphyllou and Sánchez (1997) [47]. If the i-th option is the best and the j-th option is the second best by their composite scores ($S_i > S_j$), then the minimum change $\delta_{k,i,j}$ in the weight of indicator $C_k$ to cause rank reversal between i and j can be calculated. If the performance of the j-th option is better than the i-th option with respect to the k-th indicator ($v_{jk} > v_{ik}$), then

$$\delta_{k,i,j} < \frac{S_j - S_i}{v_{jk} - v_{ik}} \quad (6)$$

If the performance of the i-th option is better than the j-th option with respect to the k-th indicator ($v_{jk} < v_{ik}$), then

$$\delta_{k,i,j} > \frac{S_j - S_i}{v_{jk} - v_{ik}} \quad (7)$$

Additionally, the minimum change $\delta_{k,i,j}$ can also be expressed in relative terms as:

$$\delta'_{k,i,j} = \delta_{k,i,j} \times \frac{100}{w_k} \quad (8)$$

The new weight of the indicator $w^{*}_{k}$ (before normalization) can be expressed as:

$$w^{*}_{k} = w_k - \delta_{k,i,j} \quad (9)$$

Preliminary Interviews and Thematic Analysis
Fourteen stakeholders in the departments of asset planning, strategy, delivery and operation participated in the interview process. A thematic network was developed to highlight the key themes and codes developed from the interview transcriptions (Figure 3). There were 37 nodes (sub-themes) in total, divided into 6 major themes, namely 'Assets', 'Finance', 'Social', 'Risk', 'Resource efficiency' and 'Compliance'. Although this was an interpretation based on fourteen interviews, the results identified some key decision drivers and challenges in the existing asset decision-making in the water company. In short, the key highlights of the results are:

• The decision-making process in wastewater asset planning is complex and faces multiple challenges. Making the right balance between different decision-making criteria is difficult in practice.
• The current investment decisions are primarily driven by the whole life cost and compliance risk whilst there is an increasing demand for an integrated system that incorporates wider dimensions of sustainability. This is similar to the results of decision mapping conducted by Ashley et al. (2008) which suggested that costs and risks are the main drivers of asset decisions in the UK water industry [21].
• Decision support tools should be understandable and communicable at a managerial level.
• They also need to be flexible to accommodate different STW programmes and adapt to new business needs and priorities.
Figure 3. Thematic network derived from the preliminary interviews. There are 6 global themes and 37 nodes. Each node represents a specific element in the decision-making process, such as a decision driver or challenge.

The findings indicate that assessment tools that explicitly deal with complexity and the integration of multiple criteria would be useful. This supports the choice of MCDA as the methodological foundation of the proposed assessment framework. Additionally, the tool should be simple to communicate and engage with stakeholders.

Criteria Hierarchy
Based on the list of indicators summarised from the literature (Table 1) and the results of the thematic analysis, the criteria hierarchy was proposed for the sustainability assessment of different wastewater treatment processes (Figure 4). The detailed definitions of the selected indicators are summarised in Appendix A (Table A1). The hierarchy consists of three levels: the overarching decision objective, the second level of criteria based on the 'three pillars' of sustainability and the lower level of assessment indicators to measure towards those criteria. It is worth noting that a fourth pillar, 'Resilience', was also added to the criteria structure to reflect the significance of long-term operational resilience and compliance. The significance of resilience was emphasised in the preliminary interviews with stakeholders as well as in the current planning priorities of the UK water industry. The latest guideline in the methodology paper published by the water industry regulator Ofwat explicitly called for 'Resilience-in-the-round' as one of the key themes for future planning [48]. Therefore, it was deemed necessary to include 'Resilience' in the criteria hierarchy to assess different wastewater treatment options.

Environmental impact refers to the operational impacts of the wastewater treatment process. Energy consumption was commonly used in previous studies [27,30,33]. However, the stakeholder interviews suggested that energy recovery from sludge digestion should also be considered. Energy neutrality is thus more suitable to indicate the net energy use from a process life cycle perspective. Total emission covers both direct and indirect greenhouse gas emissions from different stages of the wastewater treatment process. Direct emissions mainly include gases such as methane and nitrous oxide released from the treatment process, whereas indirect emissions refer to those from the power purchased from outside the company to run the operation [49]. Although chemical consumption is not a common indicator in previous assessments of wastewater systems, the preliminary interviews with company stakeholders highlighted that it has a direct impact on meeting site consents (e.g., iron and aluminium) and on operational costs (i.e., procurement, transport and disposal of an extra amount of solid).
Under the resilience category, Compliance indicates the overall performance and confidence of the wastewater treatment system in maintaining site consents and effluent standards. This can be technically translated into the pollutant removal potential and the reliability of the wastewater treatment technologies. Flexibility was also included to assess the readiness of technologies to adapt to future changes such as the removal and upgrade of an asset unit in the system.
Future proofing is particularly important for asset decisions in the water industry due to climate change, population growth and an evolving regulatory climate.
Table 3 shows the weights of indicators based on the results of AHP from nine participants and the aggregated group weights. Based on the group weights, Compliance had the highest weight (0.266), reflecting the paramount operational objective and responsibility of the water company. As mentioned in the preliminary interviews, meeting the local site consent is always regarded as one of the top priorities in the operation of wastewater treatment processes. Opex was ranked second with a weight of 0.126. This was closely followed by Total emission (0.119) and Capex (0.111). Although some participants indicated that they perceived Opex and Capex as equally important, the aggregated group weighting showed that Opex was slightly more important than Capex, which reflected the current strategy to reduce operational greenhouse gas emissions and improve investment efficiency in the company. At the higher level of the hierarchy, the relative weights for the criteria were Resilience (0.354), Environmental impact (0.282), Economic viability (0.237) and Social impact (0.127). As in previous studies, criteria that affect treatment compliance were given the highest weights and social criteria the lowest. Karimi et al. (2011) applied a fuzzy AHP to select wastewater treatment technologies, and technical criteria (i.e., pollutant removal performance) were given the highest weight, followed by economic criteria and then environmental criteria [50]. Moreover, weights derived from the study by Molinos-Senante et al. (2014) suggested that environmental compliance was the most important criterion of sustainability, followed by the economic and then the social dimensions [28].
However, comparing weighting profiles across different studies provides limited insight because decision priorities and contexts vary among studies.
Table 3. Individual weightings of 9 stakeholders and aggregated group weightings of the assessment indicators. The ranks of the group weights are also shown.
The weights in this study were also presented to the stakeholders who participated in the AHP process for feedback. The majority of participants indicated that they were satisfied with their own set of weights as well as the group weights. Many found the online questionnaire with pairwise comparisons 'easy to use' and 'straightforward'. Some participants also commented that the questioning style was 'interesting' and 'thought-provoking'. This suggests that using online questionnaires is a viable and efficient way to elicit preference judgements for AHP. The drawback of combining AHP with an online questionnaire was that participants were only provided with textual instructions and explanations, which are less effective than a verbal format. Additionally, the use of an online questionnaire and mathematical aggregation of weights means that the interaction between different stakeholders may not be captured in the weight aggregation process. However, an empirical study found that weightings developed by mathematical aggregation (i.e., geometric mean) and weightings reached by consensus in a face-to-face focus group led to very similar results [51].
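The mathematical aggregation mentioned above can be illustrated with a short sketch. It assumes that individual priority vectors are combined by a geometric mean and then renormalised to sum to one; whether the study aggregated priorities or the underlying pairwise judgements is not stated in this section. The function name `aggregate_weights` and the three stakeholder profiles are hypothetical and are not taken from Table 3.

```python
import math

def aggregate_weights(profiles):
    """Combine individual weight profiles with a normalised geometric mean."""
    indicators = list(profiles[0])
    n = len(profiles)
    # Geometric mean of each indicator's weight across stakeholders.
    geo = {k: math.prod(p[k] for p in profiles) ** (1.0 / n) for k in indicators}
    # Renormalise so that the group weights sum to 1.
    total = sum(geo.values())
    return {k: v / total for k, v in geo.items()}

# Hypothetical local weights from three stakeholders for the two
# Economic viability indicators (illustrative values only).
profiles = [
    {"Opex": 0.5, "Capex": 0.5},
    {"Opex": 0.8, "Capex": 0.2},
    {"Opex": 0.6, "Capex": 0.4},
]
group = aggregate_weights(profiles)
```

With these illustrative inputs the group profile places more weight on Opex than on Capex, mirroring the direction (though not the values) of the aggregated result reported in the text.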

Score Aggregation and Options Ranking
The group weights were then applied to the performance ratings of the indicators for score aggregation. The average performance ratings of each indicator for each option were provided by a group of stakeholders (Table A2). First, the indicator scores were aggregated into individual criteria based on the criteria hierarchy proposed in Figure 4 and then further aggregated into a composite score for each option (Figure 5). This enables decision makers to rank options based on their overall scores and also to identify the options with the best performance in each sustainability criterion. ASP was scored as the best option based on its composite score (3.48), followed by G-ASP (3.25) and DAF (3.07). Specifically, ASP had the highest scores for the criteria of Social impact (0.49), Economic viability (0.96) and Resilience (1.28). This reflected the high performance ratings given to ASP on the indicators of Operability and Compliance. ASP was considered the least risky option given the vast experience the water company had of designing and operating this type of process. It appears that these factors were the substantial drivers of the greater desirability of ASP compared to other options. In comparison, the Granular Activated Sludge Process (G-ASP) was the second best option. G-ASP scored the best on the Environmental impact criterion because it is designed to treat wastewater with a very low operational footprint. Although it scored very high on Energy neutrality and Opex, the overall score of G-ASP was compromised by a low score on Capex and a mediocre score on Compliance (Table A2).
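The two-step aggregation described above amounts to a weighted sum of ratings within each criterion, followed by a sum across criteria. The sketch below illustrates this for two of the four criteria only. The weights for Opex, Capex and Compliance are the group weights quoted in the text; the Flexibility weight is inferred as the Resilience weight minus the Compliance weight (0.354 - 0.266) and is an assumption. The options 'Option X' and 'Option Y' and their ratings are hypothetical and are not the Table A2 values.

```python
# Global indicator weights grouped by criterion. Flexibility (0.088) is an
# inferred value (Resilience 0.354 minus Compliance 0.266), not a quoted one.
WEIGHTS = {
    "Economic viability": {"Opex": 0.126, "Capex": 0.111},
    "Resilience": {"Compliance": 0.266, "Flexibility": 0.088},
}

# Hypothetical average ratings on the 1-5 scale (NOT the Table A2 values).
RATINGS = {
    "Option X": {"Opex": 4.0, "Capex": 2.0, "Compliance": 3.0, "Flexibility": 4.0},
    "Option Y": {"Opex": 3.0, "Capex": 4.0, "Compliance": 4.0, "Flexibility": 3.0},
}

def criterion_scores(ratings, weights=WEIGHTS):
    """Weighted sum of indicator ratings within each criterion."""
    return {
        criterion: sum(w * ratings[ind] for ind, w in indicators.items())
        for criterion, indicators in weights.items()
    }

def composite_score(ratings, weights=WEIGHTS):
    """Sum of the criterion scores (partial here: two criteria only)."""
    return sum(criterion_scores(ratings, weights).values())

# Rank options from the highest to the lowest composite score.
ranking = sorted(RATINGS, key=lambda o: composite_score(RATINGS[o]), reverse=True)
```

Keeping the criterion-level scores separate from the composite score preserves the per-criterion detail that the framework uses to expose trade-offs between options.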
Figure 5. Scores of seven treatment options for each individual assessment criterion and the aggregated composite scores on the right of the bars. A higher score means a greater desirability.
The ranking of the composite scores was compared to the previous decision made by stakeholders without using the sustainability assessment proposed by this study. Spearman's rank correlation showed a strong positive correlation (ρ = 0.75, p = 0.052) between the two sets of rankings, although the p-value is marginally above the conventional 0.05 threshold (Table 4). In addition, the rankings of the top three options are identical. The largest discrepancy is the rank of De-ammo. In the previous study, De-ammo was perceived as a competitive option with great energy saving potential and a low footprint. However, the option was given a much lower rank in the previous decision due to concerns over land availability and supply chain risk.
Overall, the comparison of the two rankings provided evidence that this assessment method has the potential to deliver similar decision results using the proposed criteria hierarchy and the multi-criteria assessment method. Furthermore, this assessment method not only allows stakeholders to quickly identify the best option by ranking composite scores but also provides a greater level of detail on the performance of each sustainability criterion across different options. This gives them the option to explicitly consider the trade-offs between different indicators and criteria before making the decision.
Table 4. Rankings of seven wastewater treatment options based on the composite scores of the sustainability assessment compared to the previous decision made by stakeholders without using this assessment framework.
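For readers who wish to reproduce this kind of rank comparison, the following minimal sketch computes Spearman's ρ for two complete rankings without ties. Because the individual ranks behind Table 4 are not reproduced in this section, the example borrows two rankings reported in Table 5 (the group weighting and Stakeholder 7) purely to illustrate the calculation; it does not recompute the ρ = 0.75 quoted above. The function name `spearman_rho` is hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two complete rankings of the same items, no ties."""
    n = len(rank_a)
    # Sum of squared rank differences over all options.
    d2 = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Rankings taken from Table 5 (1 = best option).
group = {"ASP": 1, "DAF": 3, "CAPS": 7, "B-ASP": 6, "SBR": 5, "G-ASP": 2, "De-Ammo": 4}
stakeholder_7 = {"ASP": 1, "DAF": 2, "CAPS": 7, "B-ASP": 4, "SBR": 5, "G-ASP": 6, "De-Ammo": 3}

rho = spearman_rho(group, stakeholder_7)  # approximately 0.61
```

The closed-form expression used here is only valid when neither ranking contains ties, which holds for the rankings in Table 5.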

Sensitivity Analysis
The composite scores were recalculated using the individual stakeholders' weighting profiles developed from the online questionnaires in Section 3.3, and the resulting option rankings are shown in Table 5. Three out of nine weighting profiles led to a rank reversal between the best option (ASP) and the second best option (G-ASP). The results suggested that the rankings of options were generally consistent between the aggregated group weighting and the individual weighting profiles. The other part of the sensitivity analysis was to calculate the minimum change in the group weight of each indicator required to cause a rank reversal between the top two options. Table 6 shows the minimum weight change δ in both absolute and relative terms. In absolute terms, Capex was the most critical indicator, with the smallest δ value of 0.066. By the definition of Equations (7) and (9), if the weight of Capex (0.111) is decreased by any value larger than 0.066, a rank reversal between the best option and the second best option occurs. In relative terms, Compliance was the most sensitive indicator, as a 53% change in its original weight would cause the rank reversal. However, given that Compliance has the largest original weight (0.266) of all indicators, a 53% change in its value is not so sensitive in absolute terms. This suggests that Capex is still the most critical indicator in terms of changes in absolute value and that its weight allocation should be revisited in future studies. Overall, most indicators can withstand a change in their weights without causing a rank reversal. However, sensitivity checks should be repeated whenever changes are made to the weights of criteria and indicators.
Table 5. The comparison of option rankings between different weighting profiles using the composite scores.
| Weighting profile | ASP | DAF | CAPS | B-ASP | SBR | G-ASP | De-Ammo |
|---|---|---|---|---|---|---|---|
| Group weightings | 1 | 3 | 7 | 6 | 5 | 2 | 4 |
| Stakeholder 1 | 1 | 3 | 6 | 5 | 7 | 2 | 4 |
| Stakeholder 2 | 2 | 3 | 7 | 6 | 5 | 1 | 4 |
| Stakeholder 3 | 1 | 3 | 7 | 6 | 5 | 2 | 4 |
| Stakeholder 4 | 1 | 4 | 6 | 5 | 7 | 2 | 3 |
| Stakeholder 5 | 2 | 4 | 6 | 7 | 5 | 1 | 3 |
| Stakeholder 6 | 2 | 3 | 7 | 6 | 5 | 1 | 4 |
| Stakeholder 7 | 1 | 2 | 7 | 4 | 5 | 6 | 3 |
| Stakeholder 8 | 1 | 3 | 6 | 5 | 7 | 2 | 4 |
| Stakeholder 9 | 1 | 2 | 7 | 6 | 5 | 3 | 4 |

Table 6. The minimum changes required in indicator weights to cause a rank shift between the best option (ASP) and the second best option (G-ASP).
In this study, the combination of methods and the selection of criteria were carefully chosen to meet the purpose and requirements of the wastewater asset decision in the organizational context. The resources required to perform a sustainability assessment are important criteria that determine its feasibility and usability [7,52]. It was found that the amount of resources required, such as expertise and time, can influence the willingness of stakeholders to participate in the process. Based on our experience, AHP is simple to set up and pairwise comparisons can be easily made through online questionnaires. Compared to other MCDA methods such as Multi-Attribute Value Theory and outranking methods, AHP can be easily performed in the absence of specialised software. This is a practical benefit in the corporate environment because the provision of additional software implies greater cost and time required for training. In terms of performing pairwise comparisons, the online questionnaire format was found to be an easy and efficient way to collect responses because it enabled quick distribution among the targeted decision makers and gave them the flexibility to complete it in their own time.
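The two sensitivity checks reported in the Sensitivity Analysis section can be sketched in a few lines without specialised software. The first part counts the weighting profiles in Table 5 under which G-ASP outranks ASP. The second part assumes that the minimum weight change follows the usual weighted-sum form, δ = (P_best - P_second) / (a_best - a_second), where a decrease in an indicator's weight by more than δ reverses the top two options; Equations (7) and (9) are not reproduced in this section and may differ in detail. The composite scores 3.48 and 3.25 are taken from the text, whereas the ratings 4.0 and 2.0 and the weight 0.2 are hypothetical.

```python
# (ASP rank, G-ASP rank) for each individual weighting profile in Table 5.
TABLE_5 = {
    "Stakeholder 1": (1, 2), "Stakeholder 2": (2, 1), "Stakeholder 3": (1, 2),
    "Stakeholder 4": (1, 2), "Stakeholder 5": (2, 1), "Stakeholder 6": (2, 1),
    "Stakeholder 7": (1, 6), "Stakeholder 8": (1, 2), "Stakeholder 9": (1, 3),
}

# Profiles in which G-ASP is ranked above ASP (a rank reversal).
reversals = [name for name, (asp, gasp) in TABLE_5.items() if gasp < asp]

def min_weight_change(p_best, p_second, a_best, a_second):
    """Threshold delta for one indicator under a weighted-sum model.

    A positive value means the weight must be DECREASED by more than delta
    to reverse the two options; a negative value means it must be increased.
    Returns None when both options have the same rating on the indicator,
    because changing that weight cannot alter their order.
    """
    if a_best == a_second:
        return None
    return (p_best - p_second) / (a_best - a_second)

# Composite scores from the text; ratings and weight are hypothetical.
delta = min_weight_change(3.48, 3.25, a_best=4.0, a_second=2.0)
hypothetical_weight = 0.2
relative_change = delta / hypothetical_weight  # share of the original weight
feasible = abs(delta) <= hypothetical_weight   # a weight cannot fall below 0
```

Renormalising the weights after the change rescales every composite score by the same factor, so it does not affect the threshold at which the ranks swap.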
However, simplicity and usability compromise the comprehensiveness of the assessment criteria and indicators. For example, some stakeholders mentioned that they would expect more criteria and indicators to be included in the criteria hierarchy. Currently, only 9 assessment indicators are included in the hierarchy, but the list could potentially be expanded for a wider representation of sustainability criteria. In practice, however, a greater number of criteria does not necessarily improve the assessment result and may overwhelm the decision makers' capacity to process information. Muga and Mihelcic (2008) suggested that the indicators should be easy to handle and limited in number [27]. Saaty and Ozdemir (2003) also pointed out that as the number of criteria increases, so does the likelihood of logical error when performing judgements [53]. For example, to complete a 9 × 9 AHP matrix, 36 unique comparisons are required from each stakeholder, and this inevitably increases the likelihood of making inconsistent judgements. To reduce the number of pairwise comparisons, this study applied a top hierarchical level of criteria so that each indicator at the lower level was nested within one criterion (Figure 4). As such, stakeholders were only required to compare indicators within the individual criteria of Environmental impact (3 pairwise comparisons needed), Social impact (1 comparison needed), Economic viability (1 comparison needed) and Resilience (1 comparison needed). This greatly reduced the amount of time needed to complete the pairwise comparisons. Therefore, it is important to achieve a good balance between usability and the completeness of the assessment criteria and indicators. Although the purpose of using an MCDA method is to simplify a complex problem into an operational and manageable structure, the selection of criteria and indicators should be sufficient and relevant to the key decision priorities in the water company. In light of this, the number of criteria and indicators proposed in this study will be continuously reviewed and optimised.
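The reduction in effort can be checked with the n(n - 1)/2 rule for the number of unique pairwise comparisons among n items. The sketch below reproduces the counts quoted above. The indicator label 'Odour' is an assumed short name for the odour indicator, and the six comparisons at the criteria level assume that the four criteria were themselves compared pairwise, which is not stated explicitly in this section.

```python
def n_comparisons(n_items):
    """Number of unique pairwise comparisons among n_items."""
    return n_items * (n_items - 1) // 2

# Criteria hierarchy: each indicator is nested within one criterion.
HIERARCHY = {
    "Environmental impact": ["Energy neutrality", "Total emission", "Chemical consumption"],
    "Social impact": ["Odour", "Operability"],  # 'Odour' is an assumed label
    "Economic viability": ["Opex", "Capex"],
    "Resilience": ["Flexibility", "Compliance"],
}

# Flat design: all nine indicators compared in a single 9 x 9 matrix.
flat = n_comparisons(sum(len(inds) for inds in HIERARCHY.values()))

# Nested design: comparisons within each criterion, plus (by assumption)
# pairwise comparisons among the four criteria themselves.
within = {c: n_comparisons(len(inds)) for c, inds in HIERARCHY.items()}
between = n_comparisons(len(HIERARCHY))
nested_total = sum(within.values()) + between
```

Under these assumptions the nested design requires 12 judgements per stakeholder instead of 36.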
We observed some other practical limitations when using AHP in the water company. First, pairwise comparisons are sensitive to the number of items to be compared as well as to changes in weights, which can lead to rank reversal [54]. This implies that a new AHP exercise is required to develop a new set of weights whenever the priorities of the case study shift. For example, stakeholders may disagree with the weightings developed in this study if they are applied to another sewage treatment works with different operational or financial priorities. Therefore, AHP might not be the most flexible weighting method when there are hundreds of sewage treatment works managed by the water company. Secondly, the judgement scale in AHP (1 to 9 points) can be difficult to use for those who are unfamiliar with the method. The linguistic translation of each scale point can be fuzzy, and a greater number of scale points may lead to greater inconsistency in judgements. During the trials of the online questionnaires, it was observed that the extreme values of the AHP judgement scale (e.g., 8 and 9) were rarely selected. Additionally, when the full scale was used, the consistency ratios of participants' judgements were often much larger than 0.1 (i.e., the threshold value of acceptable consistency). Pauer et al. (2016) have also suggested that using a reduced scale may lead to more consistent AHP answers [51]. This may indicate that when decision makers are given a finer scale on which to make a judgement, their responses become less consistent. A potential remedy is to apply a reduced scale such as a 5-point or 7-point scale.
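The consistency check referred to above can be sketched as follows. The example approximates the priority weights with the row geometric mean method rather than the principal eigenvector, which is a simplification introduced here and not necessarily the calculation used in the study. The random index values are those commonly attributed to Saaty, and both judgement matrices are hypothetical.

```python
import math

# Random consistency index (RI) by matrix size, as commonly tabulated.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(matrix):
    """Approximate priority weights using the row geometric mean method."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

def consistency_ratio(matrix):
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1 x 1 and 2 x 2 reciprocal matrices are always consistent
    w = ahp_weights(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]

# Hypothetical judgements for three indicators within one criterion.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
inconsistent = [[1, 9, 1 / 9], [1 / 9, 1, 9], [9, 1 / 9, 1]]

cr_ok = consistency_ratio(consistent)     # close to 0, below the 0.1 threshold
cr_bad = consistency_ratio(inconsistent)  # far above the 0.1 threshold
```

The second matrix uses the extreme scale point 9 in a circular pattern (A over B, B over C, C over A), which is the kind of judgement set that a 0.1 threshold is intended to flag.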

Stakeholder Engagement
The development process of this framework included stakeholder participation, which has been increasingly recognised as a critical element of a robust sustainability assessment and MCDA [55]. The development process incorporated a variety of engagement methods such as qualitative interviews and online questionnaires. For this study, it was assumed that each stakeholder has an individual system of decision priorities and an individual interpretation of how decisions are currently made. Although all stakeholders work in the same organization, their interpretation and understanding can be influenced by their roles in the organization as well as by personal opinions. This necessitates a round of interviews with different stakeholders in the preliminary stage of the sustainability assessment to elicit their interpretations of the decision context. Although qualitative interviews and thematic analysis have not been commonly included in applications of sustainability assessment, they are useful for providing a comprehensive account of how the decisions were made and what the drivers and challenges were in that process [56]. This information can be used as a reference to tailor the selection of the assessment method and criteria, which is a key part of implementing sustainability assessment in the decision system of the company.
While including stakeholder engagement is crucial, the experience from our study highlighted that introducing a new assessment approach to inform decisions is inherently challenging. First, the time availability of the stakeholders was a major practical factor to be considered when developing the assessment framework. Ideally, engaging with as many stakeholders as possible would be useful for developing representative and generalizable results (e.g., for interviews and online questionnaires). However, stakeholders were often occupied with tasks in the business and opportunities for engagement were not always available. To facilitate the engagement, the value of developing and using a sustainability assessment tool should be communicated to stakeholders in business terms. For example, the benefits of the proposed assessment framework were explained and justified to stakeholders as 'it can improve our analytical capability and inform wastewater asset investment decisions'. The second challenge was that introducing a new assessment method in the organization causes unfamiliarity. For example, a small number of stakeholders mentioned that, on their first impression of using AHP, the practice of pairwise comparison seemed confusing and random. Additionally, whilst the use of online questionnaires to collect AHP judgements was generally well received, with several participants complimenting its flexibility and simplicity, some found that it lacked sufficient explanation in this format. This highlights a potential knowledge gap when translating a well-developed MCDA method from academia into an understandable format for stakeholders to operate in the water company. Although the theory and mathematics behind AHP have been rigorously developed, it is not always easy to make an unfamiliar process accessible and practically attractive. One way to improve usability is to embed all mathematical operations behind the user interface as a 'black-box'.
For example, the weighting calculations and score aggregation in this study were done in Microsoft Excel. The equations were all coded into cell functions that were hidden from the user interface. All the data and equations can still be retrieved for revisions and audits. Regardless, sufficient instructions and explanations should always be provided to guide stakeholders in using the tool and to explain the basic mathematical rationale. Beyond the first impressions of using AHP reported above, some stakeholders expressed a strong interest in AHP and the composite score approach because it was a novel way for them to make comparisons and it initiated deeper discussion beyond the numbers themselves. Overall, introducing a new assessment tool to stakeholders may create unfamiliarity in the company, so the process should be supported by clear communication and careful calibration of the methodological design. Additionally, stakeholder engagement should be an iterative process in which feedback is regularly collected to optimize the assessment framework and aid its implementation.

Conclusions
Given that there is an increasing demand for integrated sustainability assessment tools in the water industry, this study developed a multi-criteria assessment framework for stakeholders in a water company to compare and select wastewater treatment options. The framework provides a user-friendly and simple approach for stakeholders to rank options by composite scores aggregated from multiple sustainability criteria and indicators. It was demonstrated that AHP combined with online questionnaires can be a viable approach to develop weights. The results of composite scores can be easily visualised and used to select the best alternative. The case study showed that ASP was identified as the best option for STW-A and the results were generally consistent in the sensitivity analysis. Our experience suggested that while introducing well-established assessment methods from academia seems attractive, usability should also be incorporated into the development and implementation of the assessment methods. Specifically, user-friendliness and time were important factors. This necessitates a careful selection and adjustment of methods to achieve a balanced design that addresses those practical challenges. Another highlight is that stakeholder engagement, enabled by qualitative methods, should be included in the development stages of the methodology. The use of interviews and thematic analysis can develop a basic understanding of how decisions are currently made and what the decision priorities are. This preliminary information can then be used to guide the selection of assessment methods and criteria so that they are compatible with the stakeholder preferences and the decision context in the organization.
The development and optimisation of the sustainability assessment framework presented here is an explorative and iterative process, and the framework will be reviewed and updated progressively. Further implementations of the assessment framework will provide cumulative underpinning knowledge and validation to deliver ongoing sustainability decision support in water companies. Although the criteria hierarchy and weights presented in this research were developed for a specific water company, the methodology and insights can be extrapolated to perform sustainability assessments in other water companies and corporate settings.
The author also would like to thank the cohorts and team of the Doctorate programme as well as the colleagues at the RD&I department in Thames Water Utilities for their support and insightful comments throughout the project.

Conflicts of Interest:
The authors declare no conflict of interest.
Research Ethical Statement: All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Research Integrity and Governance Office at University of Surrey (Ref: UEC 2018 081 FEPS).
Appendix A
Table A1. The basic definitions of all indicators used for the assessment and the type of value indication. A positive indication means that preference or desirability increases with the value of that criterion, whereas a negative indication means the opposite direction of preference.

| Indicators | Criteria | Definitions | Type of Value Indication |
|---|---|---|---|
| Energy neutrality | Environmental impact | Net carbon consumption of the wastewater treatment process (consumption minus recovery from sludge) | Negative |
| Total emission | Environmental impact | Total of direct and indirect carbon emission associated with the wastewater treatment process | Negative |
| Chemical consumption | Environmental impact | The total amount of chemical use in the operation of wastewater treatment (e.g., chemical dosing and polymer) | Negative |
| Opex | Economic viability | Cost related to materials (consumables), staff cost (operators), power consumption, hired and contracted services (e.g., transport; service contract for specific treatment process) | |
| Capex | Economic viability | Capital cost related to the construction and commissioning of the treatment process or technology | Negative |
| Flexibility | Resilience | The ability of the technology/process to adjust or upgrade to adapt to climate change, population growth and regulatory changes | |
| Compliance | Resilience | The ability and the overall confidence of the technology/process to meet the site compliance such as flow and quality consents and risks to failure | |
| Odour | Social impact | The odour impact of the treatment process and sludge storage on the community | Negative |
| Operability | Social impact | The ease of operating the process, which is associated with the manpower resource as well as the level of skills and training required for operators | Positive |

Table A2. The average performance ratings of sub-criteria among six stakeholders. The lowest rating '1' refers to the poorest performance of that indicator whereas '5' refers to the best. It was assumed that the rating scale was an interval scale. The ratings were provided from a previous internal stakeholder meeting.