Current State and Future Direction for Building Resilient Water Resources and Infrastructure Systems

Planning and developing resilient socio-technical and natural systems to cope with and respond to unprecedented changes has been one of the top goals of government bodies, researchers, and practitioners worldwide. This study reviews how resilience is defined and evaluated in water resources and infrastructure systems (hereafter water systems) and proposes a framework to analyze and incorporate resilience in such systems. Two questions guide the review: How is resilience defined in water systems compared to other disciplines? What are commonly used resilience measures and methods applicable to water systems? Based on the review, a resilience analysis framework is proposed. The framework uses a system of systems approach and applies hierarchical holographic modeling to address the complexity of interdependent systems. The resilience of the systems is analyzed using three questions: resilience of what, resilience to what, and resilience for whom. The two resilience measures selected for the analysis are robustness and rapidity. The framework also includes methods for uncertainty analysis, options for resilience strategies, and multi-criteria decision analysis methods to select optimal resilience options. The review is not exhaustive because the topic is broad, but it aims to present the background information necessary to support the proposed framework.


Introduction
Planning and developing resilient socio-technical and natural systems to cope with and respond to unprecedented changes has been one of the top goals of government bodies, researchers, and practitioners worldwide. In the United States, resilience initiatives for defining and protecting the nation's critical infrastructure systems started after the September 11 attacks in 2001 [1]. The need to incorporate resilience principles to respond to natural disasters and build communities' resilience was realized after Hurricane Katrina hit Louisiana in 2005 [2][3][4]. The UK government states that "resilience has long been an integral part of the UK's approach to national security and crisis management. We have well-tested risk assessment, risk management, and response and recovery measures in place to cover a wide range of scenarios" [5]. The European Union (EU) plans to strengthen its resilience further, to become more prepared for future shocks, and to emerge stronger by intensifying the transitions [6]. The policy statement of the Asian Development Bank (ADB), one of Asia's largest development funding agencies, states that "ADB will take a holistic approach to enhance adaptation and resilience. ADB will invest in more projects with climate adaptation as their primary purpose while promoting strong integration of the ecological, social, institutional, and financial aspects of resilience across its operations" [7]. The World Bank Group, another development giant, also emphasizes that building climate resilience should be front and center of the development agenda [8]. Moreover, resilience is set to remain a core part of the global agenda and policy recommendations of the United Nations [9,10].

Defining Resilience of Water Systems
Resilience definitions have evolved since Holling first explained the concept in the 1970s [15]. Resilience is considered the capacity of a system to absorb disturbance and re-organize while undergoing change so as to still retain essentially the same function, structure, identity, and feedbacks [16]. This definition, similar to the one put forward by Holling [15], captures three characteristics of resilience: (1) the amount of change a system can undergo, or the amount of stress it can sustain, and still retain the same controls on functions and structure; (2) the degree to which the system is capable of self-organization; and (3) the degree to which the system expresses capacity for learning and adaptation.
Holling proposed two types of resilience: engineering and ecological resilience. Engineering resilience is based on the understanding of resilience in materials science. It describes the ability of a system, close to a stable point, to return quickly to this stable point after a shock [17]. The main focus of engineering resilience is on the state of balance to which it will return after having recovered from a shock. Engineering resilience is often interpreted as the system's robustness or resistance [16]. Ecological resilience describes the resilience of complex adaptive systems. Complex adaptive systems are formed with a large number of components or agents which can learn or adapt [18].
The resilience of a social or ecological system is the ability to absorb disturbances while retaining the same basic structure and ways of functioning, the capacity for self-organization, and the capacity to adapt to stress and change [19]. Ecological resilience accepts the unpredictability of systems, and the system returns to one of multiple possible equilibrium states [20]. Ecological resilience also assumes that the system is dynamic and able to reorganize through unstable domains to a new equilibrium state [21]. Resilience could be viewed as the intrinsic capacity of a system, community, or society susceptible to shock or stress to adapt and survive by changing its non-essential attributes and rebuilding itself [22]. Table 1 summarizes resilience definitions from an engineering, social, ecological, economic, and disaster standpoint.

Discipline: Engineering system (see engineering resilience [23]; engineering systems [24]; critical infrastructure [3,25])
Definition: Ability to anticipate, absorb, adapt to, and/or rapidly recover from a potentially disruptive event. Engineering resilience describes the ability of a system to reduce the magnitude and/or duration of disruptive events.
Key attributes: Ability to anticipate, ability to absorb, ability to adapt, and ability to recover.

Discipline: Social system (see social resilience [26,27]; social and ecological resilience [21])
Definition: Ability of groups or communities to tolerate, absorb, cope with, and adjust to external stresses and disturbances as a result of social, political, and environmental change.
Key attributes: Ability to cope with stress/disturbances and ability to absorb change and retain relationships between people or state variables.

Discipline: Ecological system (see ecological resilience [15,20,21,28,29])
Definition: Ecological resilience describes the resilience of complex adaptive systems with a large number of components or agents which are able to learn or adapt. In the ecological resilience approach, the system returns to one of the multiple possible equilibrium states.
Key attributes: Ability to absorb disturbance; re-organize while undergoing change; adapt; and retain the same functions, structure, identity, and feedbacks.

Discipline: Economic system (see economic resilience [30,31])
Definition: Ability of the systems to withstand either market or environmental shocks without losing the capacity to allocate resources efficiently.
Key attributes: Capacity to survive, ability to recover, and ability to adapt.

Discipline: Disaster (flood and earthquake related) (see seismic resilience [32,33]; climate risk [34]; flood resilience [6])
Definition: Ability of social units to mitigate hazards, contain the effect of disasters when they occur, and carry out recovery activities in ways that minimize social disruption and mitigate the effects of future disasters.
Key attributes: Ability to reduce chance of failure, ability to absorb shocks, and ability to recover and retain structure and functions.

As summarized in Table 1, resilience has been defined slightly differently in each discipline. However, the key attributes of resilience included in most disciplines are similar. They include the ability to anticipate, absorb, and reorganize while changing; adaptive capacity; and the ability to retain the same functions, structure, identity, and feedback. More discussion on resilience concepts can be found elsewhere [21,26,35,36,37].
Water systems are complex. They encompass infrastructure and social, ecological, and economic dimensions. A complex system is characterized as non-deterministic, dynamic, and as having functions that cannot be precisely localized; furthermore, the system comprises emergent properties which are not directly accessible through an understanding of its components [38,39]. The complexity in the system usually encompasses heterogeneous sub-systems or autonomous entities, which vary across space, time, and organizational units [11]. They also exhibit nonlinear dynamics with thresholds, reciprocal feedback loops, time lags, resilience, heterogeneity, and surprises [11,13].
There is no standard, widely accepted definition of water system resilience in the published literature. Water systems consist of social, natural, and engineered systems. The resilience attributes of this sector are closely represented by the resilience of infrastructure and ecological systems. Thus, we adopt the resilience definition proposed by NIAC [3]: the "ability to anticipate, absorb, adapt to, and/or rapidly recover from a potentially disruptive event." All of these resilience features are essential for water systems to adapt to and recover from any shock or stress. This definition emphasizes reducing the likelihood of failure and the need to recover from unexpected disturbances in the operating environment.

Measures to Quantify Resilience in Water Systems
The resilience concept was first introduced in the water sector by Hashimoto et al. [40] to evaluate the performance of water resources systems. The three criteria used for the performance evaluation were reliability, resiliency, and vulnerability. The resiliency criterion describes how quickly a system is likely to recover, or bounce back, to a satisfactory state once a failure has occurred. If failures are prolonged events and system recovery is slow, this may have serious implications for system design. The resilience measure was quantified statistically and defined as the probability of recovery (R) to the satisfactory state (S) at time step t + 1 once a failure (F) has occurred at time step t (Equation (1)):

$$R = P\left(X_{t+1} \in S \mid X_t \in F\right) \tag{1}$$

where resilience, $R$, is equivalent to the average probability ($P$) of a recovery from the failure; $X_t$ denotes a system's output state or status at time $t$; $S$ is the set of all satisfactory outputs; and $F$ is the set of all unsatisfactory (failure) outputs. Similar to the work of Hashimoto et al. [40], other studies also considered resilience as one of the performance measures. Moy et al. [41] evaluated the operational measures of reliability, vulnerability, and resilience in water supply reservoir performance. They developed a multi-objective mixed-integer linear program to optimize these measures and examine the trade-offs among them. Fowler et al. [42] applied the three measures to assess the impacts of climate change on the water resources system in the UK. Kjeldsen and Rosbjerg [43] reanalyzed the same measures to select the best combination for assessing the sustainability of water systems. Jain and Bhunya [44] applied reliability, resiliency, and vulnerability to compare reservoir design and operation alternatives with Monte Carlo simulations. Similar measures were also considered in other water-related studies [45][46][47][48][49]. A review of quantitative resilience measures for water infrastructure systems is available in Shin et al. [50].
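The recovery-probability measure described above can be estimated from a simulated or observed time series by counting how often a failure step is followed by a satisfactory step. The following is a minimal sketch, not the original authors' implementation; the function name `hashimoto_resilience`, the threshold rule for a satisfactory state, and the supply series are all hypothetical choices made for illustration.

```python
from typing import Sequence


def hashimoto_resilience(outputs: Sequence[float], threshold: float) -> float:
    """Estimate P(X_{t+1} in S | X_t in F) from a time series.

    Assumption: a time step is satisfactory (in S) when the output is
    greater than or equal to `threshold`; otherwise it is a failure (in F).
    """
    failures = 0    # failure steps that have a following step
    recoveries = 0  # failure steps followed by a satisfactory step
    for current, following in zip(outputs, outputs[1:]):
        if current < threshold:
            failures += 1
            if following >= threshold:
                recoveries += 1
    if failures == 0:
        raise ValueError("no failure observed; the measure is undefined")
    return recoveries / failures


# Hypothetical monthly supply as a fraction of demand (1.0 = demand met).
supply = [1.0, 0.8, 0.7, 1.0, 1.0, 0.9, 1.0, 0.6, 0.5, 0.4, 1.0]
resilience = hashimoto_resilience(supply, threshold=1.0)
print(f"Estimated resilience: {resilience:.2f}")
```

In this made-up series, six time steps are failures and three of them are followed by a satisfactory step, so the estimate is 0.5. Long failure runs lower the estimate, which matches the concern about prolonged failures and slow recovery.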
A detailed analytical framework of resilience-focused measures and methods was developed in the seismic and disaster disciplines. Bruneau et al. [32] suggested robustness, redundancy, resourcefulness, and rapidity as the primary measures of seismic resilience in the technical, social, economic, and organizational dimensions of a community. Robustness measures the strength or the ability of systems and other units of analysis to withstand a given level of stress or demand without suffering degradation or loss of function. Redundancy measures the extent to which elements, systems, or other units of analysis exist that are substitutable (i.e., capable of satisfying functional requirements in the event of a disruption, degradation, or loss of functionality). Resourcefulness measures the capacity to identify problems, establish priorities, and mobilize resources when conditions exist that threaten to disrupt some element, system, or other unit of analysis. The objectives of enhancing resilience were to reduce failure probabilities; reduce consequences from failures in terms of lives lost, damage, and adverse economic and social impacts; and reduce recovery time. Bruneau et al. [32] proposed a conceptual view of system resilience that assesses the evolution of the system's performance over time, Q(t), in the aftermath of a perturbation. They defined the loss of resilience as

$$\int_{t_0}^{t_1} \left[100 - Q(t)\right] dt$$

where $Q(t)$ represents the system's performance after the perturbation, which varies between the perturbation beginning time, $t_0$, and the end of the recovery time, $t_1$. The performance can range from 0% to 100% (or 0 to 1 for probability analysis), with 100% meaning no degradation in service and 0% meaning no service available (Figure 1).
Several authors used Bruneau's measure to evaluate resilience, with or without modification, according to their research purposes. Kahan et al. [2] proposed a comprehensive framework to protect critical infrastructure incorporating several dimensions of community and the systems. Cimellaro et al. [51] further elaborated Bruneau's framework to define disaster resilience and applied it to a hospital network. Cimellaro et al. [52] defined and applied a method that combines loss estimation and recovery models to evaluate the resilience of critical facilities. Bonstrom and Corotis [53] further extended the robustness, resourcefulness, and recovery measures and developed a quantitative probabilistic framework for measuring seismic resilience for a building portfolio. They quantified resilience based on the robustness and rapidity of a portfolio system, which are related to hazard-induced losses and building recovery.
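As an illustration of the loss-of-resilience measure attributed to Bruneau et al. [32], the integral of the performance deficit, 100 − Q(t), between the start of the perturbation and the end of recovery can be approximated numerically from sampled performance values. This is a sketch under stated assumptions: the trapezoidal rule, the function name `resilience_loss`, and the linear recovery record are illustrative choices that do not come from the cited work.

```python
from typing import Sequence


def resilience_loss(times: Sequence[float], quality: Sequence[float]) -> float:
    """Trapezoidal estimate of the integral of [100 - Q(t)] dt.

    `times` runs from the start of the perturbation to the end of recovery;
    `quality` holds Q(t) in percent (0 to 100) at those times.
    """
    if len(times) != len(quality) or len(times) < 2:
        raise ValueError("need at least two matching time/quality samples")
    loss = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        deficit_prev = 100.0 - quality[i - 1]
        deficit_now = 100.0 - quality[i]
        loss += 0.5 * (deficit_prev + deficit_now) * dt
    return loss


# Hypothetical record: service drops to 40% at day 0 and recovers
# linearly to 100% by day 10.
times = [0, 10]
quality = [40, 100]
loss = resilience_loss(times, quality)
print(f"Loss of resilience: {loss:.0f} percent-days")
```

For this triangular deficit the result is 0.5 × 60 × 10 = 300 percent-days. A smaller initial drop (more robustness) or a shorter recovery (more rapidity) both shrink this area.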
Other studies developed indicator-based methods of resilience analysis to account for and address the multi-dimensional issues of the community and interconnected infrastructure systems. Cutter et al. [54] proposed a set of indicators, selected in the ecological, social, economic, institutional, infrastructure, and community competence dimensions, to assess disaster resilience at the local or community level. The resilience measures considered were preparedness, vulnerability, absorptive capacity, and adaptive capacity. Francis and Bekera [37] proposed a resilience framework with five components: system identification, vulnerability analysis, resilience objective setting, stakeholder engagement, and assessing resilience capacity. Table 2 summarizes commonly used resilience measures in water and other sectors. Further discussion on various resilience measures and analysis frameworks developed for critical infrastructure, transportation, power, railway, and disaster systems is available in other studies [55][56][57][58][59].
Table 2. Commonly used resilience measures in water and other disciplines.

Application Area and Reference: Resilience Measures
Water resources [40]: Resilience as a system's recovery rate
Seismic resilience of a community and infrastructure systems [32,33,51]: Robustness, redundancy, resourcefulness, and rapidity
Disaster resilience [60]: Robustness and rapidity
Ecological resilience [16]: Latitude, resistance, precariousness, and panarchy
Resilience of power and water system [61]: Robustness and rapidity
Economic resilience to disaster [62]: Inherent ability and adaptive equilibrium
Built-in system [63]: Diversity, efficiency, adaptability, and cohesion
Water resources systems [47]: Resilience against regime change, resilience for response/recovery, and resilience for adaptive capacity/management
Supply chain resilience [64]: Resistance and recovery
Disaster resilience [65]: Preparedness, vulnerability, absorptive capacity, and adaptive capacity
Critical infrastructure system [66]: Absorptive capacity, adaptive capacity, and restorative capacity
Urban climate resilience [34]: Flexibility and diversity, redundancy and modularity, safe failure, responsiveness, resourcefulness, and capacity to learn
Resilience in energy sector [67]: Plan and prepare for, absorb, recover, and adapt
General framework applied for resilience assessment of electric power network [37]: Adaptive capacity, absorptive capacity, and recoverability
Resilience of railway system [57]: Absorption, adaptation, and recovery

The three approaches commonly applied to quantify selected resilience measures are qualitative, quantitative, and combined. Qualitative assessment uses conceptual frameworks and indicators, whereas quantitative assessment applies system modeling, system dynamic modeling, probabilistic analysis, empirical analysis, and machine learning algorithms. A combined approach utilizes both qualitative and quantitative methods. More discussion on methods of resilience quantification adopted in critical infrastructure systems and applicable to water systems is available elsewhere [68][69][70].
The resilience measures applied to quantify and incorporate disaster resilience in critical infrastructure systems seem more promising than those available in the water sector. This study adopts robustness and rapidity as measures of resilience. The motivation for choosing two measures is to assess the system's pre- and post-disaster performance separately. Robustness represents the strength or the ability of systems to withstand a given level of stress or demand without suffering degradation or loss of function. Rapidity measures the system's capacity to meet priorities and achieve goals on time to contain losses, recover functionality, and avoid future disruption. The selected measures could be quantified by applying one of the three approaches to resilience quantification (qualitative, quantitative, or combined). Examples include indicator-based approaches, physically based system models (i.e., hydrologic and hydraulic performance, including floods, droughts, and wildfire), empirical approaches, agent-based methods, system dynamics-based approaches, theory-based economic approaches (e.g., input-output-based methods), network-based approaches, and others (e.g., Bayesian networks and hierarchical holographic modeling). In several cases, the analysis couples multiple models.

Proposed Framework
A framework to analyze and incorporate resilience in water systems is shown in Figure 2. Robustness and rapidity are the two resilience measures proposed to evaluate and incorporate resilience in water systems. The framework has seven major steps, and each step is described in the following sections.

Establish Purpose and Scope of the Analysis
The first step of the resilience framework is to establish the purpose and scope of the study by answering three questions: resilience of what, resilience to what, and resilience for whom. The overall purpose and scope of the analysis are defined in consultation with experts and stakeholders using the DPSIR (Driver-Pressure-State-Impact-Response) framework.

Stakeholder and Expert Involvement
Stakeholders are individuals or groups that have a stake, or interest, in a particular issue, either because they can affect a decision or policy or because they will be affected by it [71]. Stakeholders are preferably local people who understand the consequences of failures for business, the economy, society, institutions, and the environment. By participating, they provide specific knowledge and pragmatic guidance to identify critical systems, the main performance criteria to be evaluated, and the minimum service levels required, both when the scope is defined and when standards, guidance, and plans for risk reduction are implemented. Stakeholder participation can help to explore cultural and local institutional contexts and determine the kinds of resilience strategies people utilize [72]. Consultation with experts and stakeholders is expected in all steps of the resilience analysis framework (see Figure 3).
The involvement of experts from academia, scientific and research entities, and networks will be instrumental for scenario building; understanding emerging stressors; and increasing research with regional, national, and local application. These collaborative efforts support action by local communities and authorities and support the interface between policy and science for decision-making. The Delphi approach and other expert elicitation tools, which engage experts in an iterative process of problem definition and analysis, have proven helpful in eliciting opinions on complex climate issues, such as possible future changes in temperature and precipitation [73].

Application of DPSIR Framework
The DPSIR (Driver-Pressure-State-Impact-Response) framework allows for analyzing cause and effect relationships between interacting components of complex social, economic, and environmental systems. The European Environment Agency developed the framework to be used as a unifying platform for environmental data collection, categorization, and dissemination [74,75]. The framework has been widely used for system and model conceptualization and interdisciplinary indicator development in complex systems. The DPSIR framework lists five parameters: driving forces, pressures, state, impacts, and responses, as shown in Figure 3. The driving forces (or drivers) refer to fundamental social processes and underlying human activities that lead to environmental change. The pressures are the specific human activities that result from driving forces and that impact the environment. The state is the condition or quality of the environment and trends in that condition brought about by humans or other pressures. The impacts are how changes in state influence human well-being, the economy, equity, and quality of life. Responses generally refer to institutional efforts to address state changes, as prioritized by impacts. Each of the parameters and their relationships is illustrated in Figure 3.
The DPSIR framework is used to identify answers to the three key questions, "resilience of what", "resilience to what", and "resilience for whom", in interaction with stakeholders and experts. Table 3 presents application examples for water systems. As shown, the driver and pressure parameters help in understanding "resilience to what"; the state and impact parameters help set the thresholds of the system's performance and explore "resilience for whom". Similarly, the response parameters help discuss the "resilience of what" part of the analysis. Note that "resilience to what" defines the disturbances to be included in the analysis, "resilience of what" sets the boundaries of the system of interest, and "resilience for whom" defines the system to be preserved or changed to meet the overall resilience in the system.
2. The resilience of water availability: capacity to maintain the minimum water quality standards in a given period and total time required to restore to its expected normal quality in the future.
Gradual type forces:
• Similar sources as listed for point 1.
• Quality degradation of water bodies, including rivers, reservoirs, lakes, and wetlands.
• Increased saltwater intrusion in groundwater aquifers due to sea level rise.
Rapid type forces:
• Similar sources as listed for point 1.
• Water quality degradation of water bodies, including rivers, reservoirs, lakes, and wetlands.
• Physical (sediment load), chemical (increasing nutrient concentrations), and biological load to the water supply and wastewater treatment systems.
• Cross-contamination due to structural failures.
• Cross-contamination from damage to water distribution infrastructure such as dams, canals, pipes, and valves due to floods.

Analyze Performances
In this step, analysis approaches, methods, and computer models are selected to analyze the water system's performance. Decisions about the components and sub-systems of the water systems to be analyzed and the minimum level of performances or benchmarks to be met from them are also made.

Hierarchical Holographic Modelling of Systems
Two approaches are proposed to deconstruct interdependent systems. Firstly, large water systems are decomposed into sub-systems based on a system of systems approach [76][77][78][79][80][81]. The system of systems approach allows us to understand the entire system and its dependencies, and thus to propagate the risks of the sub-systems to the overall system. Secondly, the sub-systems are further decomposed into different dimensions, such as social, technical, and economic dimensions, following the principle of hierarchical holographic modelling [82,83]. Together, the system of systems and hierarchical holographic modelling methods account for all the essential elements of the resilience assessment process and highlight the vulnerabilities in a system in a way that is tractable and representative. Figure 4 depicts a water supply system that has been deconstructed into separate sub-systems: water sources, water treatment systems, and water distribution systems. Each element in the hierarchy is referred to as a "node". The topmost node is the "root node" and represents the highest level in a system-of-systems hierarchy. Any node stemming from the root node is a "parent node". Parent nodes are further deconstructed into "child nodes". Any nodes that terminate the hierarchy, and thus do not have any child nodes, are known as "leaf nodes". The framework is operationalized by evaluating performance in each of the components. For example, water supply failure risk at the supply point is calculated as the combined risk of the three sub-systems (i.e., water sources, water treatment, and water distribution systems) (Figure 4). Water supply failure in the system could occur if a failure occurs in one or more of the three main sub-systems. It is also possible that, if a failure occurs in one sub-system, another may compensate and prevent supply failure.
The hierarchical framework allows for tracking vulnerable components of a system with different sources of the hazards and the process of propagating the risk to the system performance [78]. Thus, the system approach allows us to identify a source of failure in each component and analyze if a system is robust and the recovery rate is acceptable.
The failure analysis is done either by system modelling or by network analysis (e.g., fault tree and event tree methods) or by using the performance indicators and assigning the weights to each branch. The same analysis approach is applicable for analyzing flood risks due to failure of the hydraulic structures and drainage systems and water quality risks due to failure of the treatment systems in a system of systems.
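The node hierarchy and the roll-up of failure risk from leaf nodes to the root node can be sketched as a small tree structure. This is an illustrative sketch rather than the method used in the cited studies. It assumes statistically independent failures, a series rule (a parent fails if any child fails) as the default, and a redundant rule (a parent fails only if all children fail) for sub-systems whose components can compensate for one another. The class name `Node`, its fields, and all probabilities are hypothetical.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass
class Node:
    """A node in a system-of-systems hierarchy."""
    name: str
    failure_probability: float = 0.0  # used only by leaf nodes
    redundant: bool = False           # True: children can back each other up
    children: List["Node"] = field(default_factory=list)

    def risk(self) -> float:
        """Failure probability of this node, assuming independent failures."""
        if not self.children:  # leaf node
            return self.failure_probability
        child_risks = [child.risk() for child in self.children]
        if self.redundant:
            # Fails only if every child fails.
            return prod(child_risks)
        # Series rule: fails if at least one child fails.
        return 1.0 - prod(1.0 - p for p in child_risks)


# Hypothetical water supply system with three sub-systems.
sources = Node("water sources", redundant=True, children=[
    Node("river intake", failure_probability=0.10),
    Node("groundwater wells", failure_probability=0.20),
])
treatment = Node("water treatment", failure_probability=0.05)
distribution = Node("water distribution", failure_probability=0.10)
root = Node("water supply system", children=[sources, treatment, distribution])

print(f"Supply failure risk: {root.risk():.4f}")
```

With these made-up numbers the two redundant sources fail together with probability 0.02, and the root risk is 1 − 0.98 × 0.95 × 0.90 = 0.1621. Correlated hazards, such as a flood that damages several sub-systems at once, would violate the independence assumption and call for fault tree or event tree analysis with common-cause terms.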

Selection of Methods and Approach for the System Analysis
Three approaches commonly applied for system performance analysis are qualitative, quantitative, or combined. Qualitative assessment uses conceptual frameworks and indicators [65,84], whereas the quantitative assessment applies system modelling [85][86][87], agent-based modelling [88,89], system dynamic modelling [90,91], probabilistic analysis [37,92], empirical analysis [93], and machine learning algorithms [94,95]. The combined approach utilizes the data and information of both qualitative and quantitative methods. Selection of any analysis methods will depend on the objectives of the analysis, data availability, and technical expertise.

Identify and Analyze Uncertainty
Multiple sources of uncertainty exist in water systems, including (i) imperfect knowledge of the physical process, (ii) nonlinear systems and nonlinear interactions between different sub-systems, (iii) poor knowledge of system models (e.g., model parameter uncertainty), (iv) ignorance (e.g., future policy, technology, radiative forcing), and (v) natural and spatial variability. The framework suggests uncertainty analysis in every step of the analysis, starting from the scope analysis (step 1) and thoroughly during the robustness and recovery assessment of any components, processes, and functions of the water system (steps 4 and 5).
Multiple methods for uncertainty analysis are applicable to water systems. Examples include probabilistic methods [96,97], scenario analysis [98,99], expert elicitation [100,101], sensitivity analysis [102], and extended peer review by stakeholders [103]. The uncertainty can be described in the form of a distribution function, random set, fuzzy membership function, or scenarios. Uncertainty can be propagated by an analytical method or by random simulation such as bootstrapping, Monte Carlo simulation, Latin Hypercube sampling, or the fuzzy alpha-cut technique [76].
Water resources and urban water systems are complex and may require a more integrated uncertainty analysis approach to describe and propagate uncertainties. Scenario analysis and robustness analysis [104] are commonly adopted methods for complex systems. An analyst can select a method considering the source and type of uncertainty. The selection of an approach, alone or in combination, should be guided by the sub-system, data availability, modelling techniques, and the confidence required from the analysis.
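To make the propagation step concrete, the sketch below draws a Latin Hypercube sample of two uncertain inputs and passes it through a deliberately simple supply model to estimate a failure probability. Everything specific here is an assumption for illustration: the uniform input ranges, the function names `latin_hypercube` and `supply_fails`, and the rule that supply fails when inflow is below demand.

```python
import random
from typing import List, Tuple


def latin_hypercube(n: int, bounds: List[Tuple[float, float]],
                    rng: random.Random) -> List[List[float]]:
    """Draw n samples with one value per equal-probability stratum and variable.

    Assumption: each variable is uniformly distributed between its bounds.
    """
    columns = []
    for low, high in bounds:
        # One random point inside each of the n strata of [0, 1).
        unit = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(unit)  # randomize pairing across variables
        columns.append([low + u * (high - low) for u in unit])
    return [list(row) for row in zip(*columns)]


def supply_fails(inflow: float, demand: float) -> bool:
    """Hypothetical performance model: failure when inflow is below demand."""
    return inflow < demand


rng = random.Random(42)  # fixed seed so the sketch is repeatable
N_SAMPLES = 1000
BOUNDS = [(50.0, 150.0),  # inflow range (hypothetical units)
          (80.0, 120.0)]  # demand range (hypothetical units)

samples = latin_hypercube(N_SAMPLES, BOUNDS, rng)
failure_probability = sum(
    supply_fails(inflow, demand) for inflow, demand in samples
) / N_SAMPLES
print(f"Estimated failure probability: {failure_probability:.3f}")
```

For these uniform ranges the analytical failure probability is 0.5, so the estimate should land close to that value. Compared with plain Monte Carlo sampling, the stratification guarantees that every part of each input range is represented, which usually reduces the number of model runs needed.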

Analyze Resilience
The resilience of a system is evaluated in step 5 and compared with the resilience thresholds in step 6. The robustness and rapidity are analyzed using methods and models selected in step 2 and addressing uncertainty in step 3.

Quantification of Resilience
The method to quantify resilience used in this study is adopted from methods applied in critical infrastructure systems resilience, community disaster resilience, and seismic risk resilience [32,51,53,55,61]. Figure 5 shows a schematic representation of resilience in water resources and infrastructure systems. The vertical axis is the system's performance, $Q$. For example, it may be the minimum water supply pressure in the water distribution system or the minimum environmental flow rate in the stream. The performance is usually normalized to a scale of 0 to 100%, with $Q_d$ representing the nominal pre-disturbance performance level of a system. The performance of the system is reduced to a level of $Q_a$ at the time a stressor of magnitude $F$ acts on the system. The rapidity of recovery, $r$, is the time between the disturbance event, $t_0$, and the time at which the pre-disturbance level of performance is recovered, $t_r$.
Figure 5. Schematic to measure resilience of water systems (modified from Cimellaro et al. [52]).
The robustness measures the strength or ability of a system and process to withstand given stress or demand without suffering degradation or loss of functions. It represents the residual functionality right after any perturbation event. The robustness also means the system's resistance against failure and indicates the residual performance of the system after any hazardous event.
The robustness ($\rho = Q_d - d$) is defined in terms of the ratio, $d$, of the total performance loss resulting from a stressor to the potential total loss associated with a 100% reduction in system performance. On the normalized performance scale it is given by

$$\rho = Q_d - d, \qquad d = \frac{Q_d - Q_a}{Q_d}$$

The rapidity measure, or recovery rate, is the rate or speed at which a system can recover to an acceptable level of functionality after the event's occurrence. It represents the slope of the functionality curve during the recovery time, and it can be expressed as follows:

$$\text{Rapidity} = \frac{dQ(t)}{dt}, \qquad t_0 \le t \le t_r$$

Resilience ($R$) is defined as the normalized shaded area under the dimensionless performance function. This is mathematically defined by the following equation:

$$R = \frac{1}{t_r - t_0} \int_{t_0}^{t_r} Q(t)\, dt$$

The loss of resilience ($R_l$) is the difference between a normalized resilience of 100% and the reduced resilience given an earthquake event:

$$R_l = 1 - R$$

As defined by Chang and Shinozuka [60], a probabilistic measure of engineering resilience is the probability of satisfactory performance, $Q$, and recovery, $r$, given the severity (magnitude) of an event, $F$:

$$P\left(d < d_m \ \text{and}\ r < r_m \mid F\right)$$

where $d_m$ and $r_m$ represent the limits on performance loss and recovery time that define the minimum acceptable performance at the time of perturbation. The performance threshold for each component and system is identified in steps 1, 2, and 3. The robustness and rapidity measures are evaluated by identifying and addressing multiple sources of uncertainty (step 3).
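The measures described above can be computed from a sampled performance curve. The sketch below is one possible reading of those definitions, not a definitive implementation. It assumes that performance is normalized so that the pre-disturbance level is 1.0, that the lowest sampled value is the post-disturbance level, that the first and last samples mark the disturbance time and the recovery time, and that the area is approximated with the trapezoidal rule. All function names, curves, scenarios, and thresholds are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Q_NOMINAL = 1.0  # assumed normalized pre-disturbance performance


def resilience_measures(times: Sequence[float],
                        performance: Sequence[float]) -> Dict[str, float]:
    """Robustness, recovery time, mean recovery rate, resilience, and loss."""
    if len(times) != len(performance) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    q_after = min(performance)                       # post-disturbance level
    loss_ratio = (Q_NOMINAL - q_after) / Q_NOMINAL   # performance loss ratio
    recovery_time = times[-1] - times[0]             # time to recover
    area = sum(
        0.5 * (performance[i - 1] + performance[i]) * (times[i] - times[i - 1])
        for i in range(1, len(times))
    )
    resilience = area / recovery_time                # normalized area
    return {
        "robustness": Q_NOMINAL - loss_ratio,
        "recovery_time": recovery_time,
        "mean_recovery_rate": (Q_NOMINAL - q_after) / recovery_time,
        "resilience": resilience,
        "resilience_loss": 1.0 - resilience,
    }


def probability_acceptable(scenarios: List[Tuple[float, float]],
                           max_loss: float, max_recovery: float) -> float:
    """Share of scenarios with loss ratio and recovery time below the limits."""
    ok = sum(1 for d, r in scenarios if d < max_loss and r < max_recovery)
    return ok / len(scenarios)


# Hypothetical recovery curve: days since disturbance, normalized performance.
measures = resilience_measures([0, 2, 6, 10], [0.4, 0.5, 0.9, 1.0])
for name, value in measures.items():
    print(f"{name}: {value:.3f}")

# Hypothetical (loss ratio, recovery time) pairs simulated for one event size.
scenarios = [(0.2, 5.0), (0.6, 10.0), (0.3, 20.0), (0.1, 3.0)]
p_ok = probability_acceptable(scenarios, max_loss=0.5, max_recovery=12.0)
print(f"probability of acceptable performance: {p_ok:.2f}")
```

For the example curve, robustness is 0.4, the recovery time is 10 days, resilience is 0.75, and the loss of resilience is 0.25. Two of the four scenarios meet both limits, giving a probability of 0.5. In practice the scenario pairs would come from the uncertainty analysis in step 3 rather than from a hand-written list.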

Incorporating Resilience
The resilience of the system is analyzed in step 5. If the quantified resilience measures are within an acceptable range, the analysis proceeds to step 7. If they are not, the resilience of the system has to be improved by making the necessary changes to the components or processes of the system (step 6). For this, resilience features are added after careful evaluation of all components, processes, functions, and overall system performance measures (step 4) and comparison with the threshold values obtained in step 5. This phase requires a systematic, disciplined approach to assessing and improving the resilience of the system.
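The acceptance check described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function names, the sampled performance series, and the threshold values are hypothetical, and `r_m` is assumed to be a maximum acceptable recovery time. Robustness is taken as the residual performance ratio and resilience as the normalized area under the performance curve between the disturbance and recovery times.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ResilienceMeasures:
    robustness: float     # residual performance Q_a / Q_d after the event
    recovery_time: float  # r = t_r - t_0, in the time units of the series
    resilience: float     # normalized area under Q(t)/Q_d over [t_0, t_r]


def evaluate_measures(t: Sequence[float], q: Sequence[float],
                      q_d: float = 100.0) -> ResilienceMeasures:
    """Compute the measures from a sampled performance curve.

    Assumes t is increasing, the disturbance time t_0 is the first sample
    with q < q_d, and t_r is the first later sample with q >= q_d.
    """
    i0: Optional[int] = next((i for i, v in enumerate(q) if v < q_d), None)
    if i0 is None:
        # Performance never drops: fully robust, nothing to recover.
        return ResilienceMeasures(1.0, 0.0, 1.0)
    ir = next((i for i in range(i0 + 1, len(q)) if q[i] >= q_d), None)
    if ir is None:
        raise ValueError("performance does not recover within the series")
    q_a = min(q[i0:ir])  # lowest performance during the disturbed period
    t0, tr = t[i0], t[ir]
    # Trapezoidal area under the normalized curve between t_0 and t_r.
    area = sum(0.5 * (q[k] + q[k + 1]) / q_d * (t[k + 1] - t[k])
               for k in range(i0, ir))
    return ResilienceMeasures(q_a / q_d, tr - t0, area / (tr - t0))


def next_step(m: ResilienceMeasures, d_m: float, r_m: float) -> int:
    """Return 7 if both thresholds are met, otherwise 6 (improve the system)."""
    acceptable = m.robustness >= d_m and m.recovery_time <= r_m
    return 7 if acceptable else 6


# Hypothetical series: performance drops to 40% at t = 1 and recovers by t = 4.
measures = evaluate_measures([0, 1, 2, 3, 4, 5], [100, 40, 60, 80, 100, 100])
print(measures, next_step(measures, d_m=0.5, r_m=4.0))
```

In this hypothetical case the residual performance (40% of nominal) falls below the assumed threshold of 50%, so the check routes the analysis to step 6 even though the recovery time is acceptable.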
The overall goal of resilience-building is to ensure acceptable robustness and rapidity in the system. Resilience in water systems is achieved by adding and improving key resilience attributes such as flexibility, diversity, redundancy, modularity, and adaptability [24,34,105–107]. A flexible system could meet service needs under a wide range of future conditions. Adding modularity with a spatially distributed layout will link the service areas and reduce the affected areas in case of failures. Similarly, redundancy, multi-functionality, and adaptability features will improve the resilience of a system by absorbing potential stressors and ensuring service delivery. Moreover, decision-makers could adopt several strategies at the project planning phase. Examples include no-regret strategies, soft strategies, adaptability and multi-functionality strategies, safety margin strategies, and safe failure strategies. A brief description of each of the resilience strategies proposed for water systems is presented in Table 4.
Table 4. Strategies for incorporating resilience in water resources and infrastructure systems.

Resilience Strategy | Strategy Examples Applicable for the Water and Infrastructure Systems
No-regret strategies: A no-regret strategy yields benefits even if a system and its components do not experience the expected stressors. This type of strategy addresses current development priorities and keeps open or maximizes options for future drivers of change.
• Long-term strategic planning of water supply and demand, water quality, and ecosystem services on a basin scale.
• Establishment of early warning systems for weather and natural hazards, including extreme rain and snowfall, heatwaves, cyclones, landslides, and wildfires.
• Integrate water and land use planning at the beginning of the urban development planning process and evaluate options for water conservation and for water infrastructure development and operations.
• Adopt a nexus approach (water-food-energy-climate) during future water demand assessment.
• Update standards and guidelines towards water-saving approaches and practice innovative technologies.
• Research water-saving technologies, and utilize green infrastructure and drought-resistant plants and crops.
• Practice risk analysis and risk management while planning and developing interdependent infrastructure systems.
Soft strategies: Soft strategies apply institutional or financial tools to build resilience against stressors. The advantage of "soft" options is that they involve much less irreversibility than structural or hard intervention measures.
• Strategic planning of water supply and demand, and of water security.
• Apply nexus (water-food-energy-climate) and integrated approaches during institutional, financial, and economic policy formulation.
• Price adjustments for water use, reuse, and recycling.
• Stakeholder engagement in planning and development.
Adaptability and multifunctionality strategies: Adaptive management is an iterative, feedback- and learning-based strategy for coping with risk in decision-making under uncertainty. The multifunctionality of a system supports response diversity in the processes and functions provided, which expedites recovery. The adaptation pathway is shaped by evolving scientific evidence and societal attitudes to stressors. The main emphasis is on the process and on continuous trial and error: a small-step, evaluate, and adjust strategy.
• Build a prototype of the innovative technology, approach, or method before applying it in local conditions.
• Test natural and technological solutions in a small area to evaluate their performance in protecting water quality and combating flood impacts.
• Monitor and assess how direct and cascading failures of other interdependent infrastructure systems affect the water system.
• Plan for alternative flood defence options for sea-level rise, tidal surge, fluvial flooding, and urban flash flooding.
• Test appropriate technologies and practices to improve productivity and water efficiency in agricultural areas.
• Adopt green infrastructure systems and update their performance on water conservation, water quality, and flood accommodation.
Safety margin strategies: This strategy aims to modify a system structure in the design phase to make implementation easier and less expensive. It helps to improve infrastructure resilience by accommodating expected or unexpected future stressors. Modifying a system structure after it has been built is often difficult and expensive.
• Increase the storage capacity of newly built reservoirs.
Safe failure strategies: This strategy aims to build a system so that failure in one part of the system will not lead to cascading failures of other elements or related systems; if a system fails, the recovery will be rapid, and the risks of failure will be minimal.
It is noted that water systems are both natural and engineered. The options for improving resilience will vary according to the system and local conditions. Any addition or modification made to the system would incur additional costs, trade-offs, or externalities. Therefore, feasible options are selected after multi-criteria decision analysis (MCDA).
Multi-Criteria Decision Analysis (MCDA) is the process of reaching a decision through the consideration of available alternatives, guided by various measures, rules, and standards called criteria. The criteria to be used will be decided after consultation with stakeholders and experts. Examples include cost-benefit ratio, social acceptability, life-cycle cost, and performance levels. The MCDA process involves defining objectives, arranging them into criteria, identifying possible alternatives, and then measuring the consequences. More detailed discussions on selecting an MCDA method and procedure are available elsewhere [78,108–111].
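The MCDA steps above can be illustrated with a weighted-sum model, one of the simplest MCDA methods. This is a minimal sketch and not the method selected in this study; the cited references discuss method selection. The criteria follow the examples in the text, with the cost-benefit criterion assumed to be expressed as benefits divided by costs so that higher is better. The alternatives, scores, and weights are hypothetical and would in practice come from stakeholder and expert consultation.

```python
from typing import Dict, List, Tuple

# Criteria from the text; True means higher is better, False lower is better.
CRITERIA: Dict[str, bool] = {
    "benefit_cost_ratio": True,   # assumed orientation: benefits / costs
    "social_acceptability": True,
    "life_cycle_cost": False,
    "performance_level": True,
}

# Hypothetical resilience options with illustrative scores.
SCORES: Dict[str, Dict[str, float]] = {
    "add_storage": {"benefit_cost_ratio": 1.2, "social_acceptability": 0.6,
                    "life_cycle_cost": 80.0, "performance_level": 0.9},
    "green_infrastructure": {"benefit_cost_ratio": 1.8, "social_acceptability": 0.9,
                             "life_cycle_cost": 50.0, "performance_level": 0.7},
    "early_warning": {"benefit_cost_ratio": 2.4, "social_acceptability": 0.7,
                      "life_cycle_cost": 20.0, "performance_level": 0.5},
}

# Hypothetical equal weights; they are normalized inside the function.
WEIGHTS: Dict[str, float] = {name: 0.25 for name in CRITERIA}


def weighted_sum_ranking(scores: Dict[str, Dict[str, float]],
                         weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Rank alternatives by a weighted sum of min-max normalized scores."""
    total_weight = sum(weights.values())
    totals = {name: 0.0 for name in scores}
    for crit, higher_is_better in CRITERIA.items():
        values = [s[crit] for s in scores.values()]
        lo, hi = min(values), max(values)
        span = hi - lo
        for name, s in scores.items():
            if span == 0:
                u = 1.0  # all alternatives tie on this criterion
            else:
                u = (s[crit] - lo) / span  # rescale to [0, 1]
                if not higher_is_better:
                    u = 1.0 - u            # invert cost-type criteria
            totals[name] += weights[crit] / total_weight * u
    # Highest aggregate score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for option, score in weighted_sum_ranking(SCORES, WEIGHTS):
    print(f"{option}: {score:.3f}")
```

A weighted sum lets a strong score on one criterion offset a weak score on another, which may not suit criteria that have hard minimum requirements. Re-running the ranking under different weight sets is a simple way to see how sensitive the preferred option is to stakeholder priorities.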

Decision Making and Documentation
The system must be continuously monitored to ensure the resilience of the existing system, identify new sources of stressors, and evaluate the system's responses. New system stresses, if perceived, are incorporated into the current system analysis. This step recognizes that resilience analysis is a dynamic process that requires regular updating of the sources of hazard, the vulnerabilities, and the system performance.
This step will also define the roles and responsibilities of agencies and decision-makers in implementing the resilience measures identified from the analysis, as water systems are managed and operated by multiple private and public agencies. The decision-making and documentation phase will also evaluate available resources, capabilities, implementation plans, policy imperatives, legal issues, and potential impacts on stakeholders. The role of stakeholders and decision-makers is crucial in this step.

Conclusions and Recommendations
This study was guided by three key questions: (1) How is resilience defined in the water sector compared to other disciplines? (2) What are commonly used measures of resilience that can be applied to the water sector? (3) How can we develop a holistic modeling framework that allows us to analyze and incorporate resilience in water systems? The key findings for each of the three questions are summarized below.

Defining Resilience in Water Systems
There is neither a universal definition of resilience nor a widely accepted water sector-specific definition of resilience. Resilience holds positive connotations, and the definition varies from sector to sector, system to system, type of disruptive event, and analysis objective (see Table 1). Water systems comprise both natural and engineered systems; therefore, we adopted for water systems the resilience definition proposed by NIAC [3]: "ability to anticipate, absorb, adapt to, and/or rapidly recover from a potentially disruptive event." This definition emphasizes reducing the likelihood of failure and the need for fast recovery from unexpected disturbances in the operating environment.

Resilience Measures to Assess and Incorporate Resilience in the Water Systems
There are no well-accepted resilience measures (see Table 2). Earlier resilience studies in water systems considered resilience as one of the performance measures of risk or sustainability [40,43–45]. Most resilience studies in water systems ignore the multifaceted interactions between human, natural, and built systems.
We considered robustness and recovery rate as two measures of resilience. Similar measures are suggested in other studies [55,60,61]. When the system is loaded by an external stressor, it will attempt to withstand the change. If the system is robust enough, it will bounce back to its normal state when the stress is released. The bouncing back, or recovery, will occur either as per the design expectation of an engineered system or through the process of adaptation in a natural system. The robustness, in this case, measures the intrinsic nature of the resilience before any perturbation takes place in a system. If the system is loaded further, it could still bounce back or adapt, but only up to a certain limit. The rate of recovery depends on the system's robustness and indirectly indicates the redundancy and resourcefulness measures considered in the disaster resilience community. Therefore, the recovery rate measures the post-disaster response of the component or system under consideration.

A Hierarchical System-Based Resilience Framework
We developed a resilience analysis framework to evaluate and incorporate resilience in water resources and infrastructure systems. The framework includes seven steps (Figure 2). The evaluation process starts by setting the purpose and scope of the study by analyzing the three main questions, "resilience of what", "resilience to what", and "resilience for whom", to formulate the objective and scope of the resilience analysis.
The main attributes of the framework are acknowledgment of the complexity, uncertainty, and multi-sector involvement in water systems. The framework seeks consultation with experts and stakeholders at every step of the resilience analysis to address multidisciplinary issues. The system-of-systems approach enables analysis of the performances of sub-systems and assessment of their contributions to the overall performance. Several resilience strategies in water systems are recommended in the framework, such as no regrets strategies, soft strategies, adaptability and multifunctionality strategies, safety margin strategies, and safe failure strategies (Table 4). Uncertainty analysis is required in every step of resilience analysis.
Resilient water systems can be achieved by adding flexibility, diversity, redundancy, modularity, and adaptability. Any intervention in the system would incur additional costs, trade-offs, or externalities. Therefore, feasible options are selected after multi-criteria decision analysis. The framework is applicable at any scale, from a small watershed to a river basin to a region. The temporal and spatial scales are selected at the outset, when the purpose and scope of the analysis are defined.

Recommendations
This study proposed a framework to analyze the resilience of water systems considering their complexity, uncertainty, and multi-dimensionality. The framework can be applied to analyze and incorporate resilience by utility operators, municipalities, and water resource planning agencies responsible for planning, managing, and building water and infrastructure systems. Operationalizing the framework using real case studies is beyond the scope of this paper; our ongoing work will demonstrate a few applications. We recommend that interested parties test and further improve the framework in different case studies. In addition to this specific recommendation, we outline the following challenges in resilience analysis and recommendations for future study.

Defining Clear Objectives of a Resilient Water System
Any disastrous event will damage infrastructure systems and have a cascading impact on public life and property. Analysis approaches vary with the boundary and scope of the analysis. For example, a water resilience analysis for meeting water demand and ensuring water resource security will be different from an analysis of the resilience of communities to water-induced disasters. In this context, it is difficult to define the "resilience of what?" and the "resilience for whom?" There is no clear boundary at which to stop the resilience analysis. Future studies should define realistic objectives and boundaries for water systems resilience.

Defining the Measures and Metrics of Resilience
Maintaining both the functions and structures of water systems is essential to achieving resilience. Water systems are complex, dynamic, and constantly evolving due to human interventions and climate change. Natural and human systems can adapt to dynamic changes to a certain extent. In dynamic systems, both the stressors and the resistance are changing. Therefore, selecting thresholds or benchmarks for each component of a system or sub-system to evaluate the resilience goals, and deciding on the measures and metrics of resilience, are challenging undertakings. In addition, it is not easy to define and select a standard measure that is applicable across different environmental and economic settings. We recommend that future studies evaluate threshold values and metrics for resilience analysis.

Developing Methods Dealing with Complexity and Uncertainty
Water resources and infrastructure systems have multifaceted interactions between human, natural, and built systems. The systems are managed by multiple operators, regulators, and users. Natural and built sub-systems, such as dams, reservoirs, surface water, and groundwater, interact with the hydrological cycle. Climate change can shift and alter future hydrologic events and water demand. There are challenges in measuring the scale of the system solely from the examination of its parts.
Similarly, multiple sources of uncertainties in water systems exist due to imperfect knowledge of physical processes, non-linear systems, non-linear interactions between different sub-systems, and poor knowledge of system models. Feedbacks and nonlinearities of any complex system are generally perceived as the source of surprises. Surprises that have low frequency and high consequences are the source of unexpected functionality that unlocks a shift into a new regime. There are ongoing studies to resolve these challenges; however, the quantitative evaluation of resilience may not be possible unless we have well-established methods to analyze complexity and uncertainty.
Funding: This research received no external funding.