The Resilience of Critical Infrastructure Systems: A Systematic Literature Review

Abstract: Risk management is a fundamental approach to improving critical infrastructure systems’ safety against disruptive events. This approach focuses on designing robust critical infrastructure systems (CISs) that can resist disruptive events by minimizing the probability and consequences of possible events through preventive and protective programs. However, recent disasters like COVID-19 have shown that most CISs cannot stand against all potential disruptions. Recently, there has been a transition from robust design to resilient design of CISs, increasing the focus on preparedness, response, and recovery. Resilient CISs withstand most internal and external shocks, and if they fail, they can bounce back to the operational phase as soon as possible using minimum resources. Moreover, in resilient CISs, early warning enables managers to get timely information about the proximity and development of disruptions. Effective resilience management requires an understanding of the concept of resilience, its influential factors, and the available evaluation and analysis tools. Moreover, it is important to highlight the current gaps. Technological resilience is a new concept associated with some ambiguity around its definition, its terms, and its applications. Hence, using the concept of resilience without understanding these variations may lead to ineffective pre- and post-disruption planning. A well-established systematic literature review can provide a deep understanding of the concept of resilience, its limitations, and its applications. The aim of this paper is to conduct a systematic literature review to study the current research around technological CISs’ resilience. In the review, 192 primary studies published between 2003 and 2020 are reviewed. Based on the results, the concept of resilience has gradually found its place among researchers since 2003, and the number of related studies has grown significantly.
It emerges from the review that a CIS can be considered as resilient if it has (i) the ability to imagine what to expect, (ii) the ability to protect and resist a disruption, (iii) the ability to absorb the adverse effects of disruption, (iv) the ability to adapt to new conditions and changes caused by disruption, and (v) the ability to recover the CIS’s normal performance level after a disruption. It was shown that robustness is the most frequent resilience contributing factor among the reviewed primary studies. Resilience analysis approaches can be classified into four main groups: empirical, simulation, index-based, and qualitative approaches. Simulation approaches, as dominant models, mostly study real case studies, while empirical methods, specifically those that are deterministic, are built based on many assumptions that are difficult to justify in many cases.


Introduction
Modern societies depend heavily on highly interconnected technological systems such as water supply, electric power supply, telecommunication, and transportation networks that provide vital services. Such a system is known as a critical infrastructure system. [...] Hence, using the concept of resilience without understanding these variations may lead to ineffective pre- and post-disruption planning. A well-established systematic literature review can provide a deep understanding of the concept of resilience, its limitations, and its applications. Unlike a traditional literature review, a systematic literature review follows an explicit search protocol. Study selection is therefore not left to chance, which increases the accuracy and credibility of the obtained results [26]. Therefore, this paper implements a systematic literature review to explore how resilience is defined in the discipline of technological CISs, for what types of CISs the concept of resilience has been used, what types of concepts affect CISs' resilience, and which approaches have been used for resilience analysis of CISs.
The rest of the paper is organized as follows: In Section 2, the systematic literature review technique and its phases are described. Section 3 shows the application of the systematic review to CISs' resilience. Section 4 discusses the results of the systematic review. Finally, Section 5 presents our conclusions.

Systematic Literature Review
Typically, researchers turn to a literature review of a particular subject to answer the questions that arise in their minds. In such cases, the researcher searches for relevant articles and studies guided only by a prior idea of the subject. The search continues until the desired publications are found and the appropriate ones are selected, and it ends with a summary of the obtained results. This type of review is called a nonsystematic (narrative/traditional) literature review. Such a review is frequently limited to literature that is either already known to the researcher or found through a quick search [26]. This approach is not transparent about the methods used for finding and selecting studies [27].
Moreover, it is not repeatable, which makes it challenging for other researchers to follow. In contrast, the systematic literature review (SLR) approach was developed to perform all of these steps according to a series of exact, pre-arranged stages. SLR was first proposed by Kitchenham [28]. It is a means of identifying, assessing, and interpreting all accessible literature related to a specific research question. This technique requires considerably more effort. However, due to the well-defined methodology, the likelihood of biased results is lower than with the traditional method.
The application of SLR has spread from medicine and health care into other fields. Recently, its application in some engineering fields, such as software engineering, transportation systems, and manufacturing, has increased [29][30][31][32][33][34]. Later on, Kitchenham and Charters [35] modified this technique and provided a guideline to facilitate its application. Based on the new procedure, SLR can be summarized in three phases: (i) planning, (ii) conducting, and (iii) reporting the review. Figure 1 shows the SLR phases. As this figure illustrates, the first step is to identify the need for an SLR. For example, if someone needs to identify the knowledge gaps in a specific field, then SLR is a justified approach.
Then, based on the study aim, the research questions must be defined clearly and properly. The whole review methodology is then carried out to answer these questions. To have an effective SLR, the related search terms (keywords) should be specified based on the defined research questions. These keywords are then used to find scientific publications in the defined databases (search sources). Digital databases such as ScienceDirect and Springer-Link are examples of such search sources [35,36]. To facilitate the review process, screening criteria need to be defined as inclusion and exclusion criteria. For example, restricting the year of publication or excluding book chapters from the collected studies can serve as such criteria. Finally, a data extraction strategy must be established for data collection [35,36]. The data extraction strategy determines how the relevant data items needed from each study will be collected. To collect the needed data items, a data extraction form should be designed. This form is used to collect the data (e.g., full reference, research topic, definitions, applied methods, and publication data) that are necessary to answer the defined research questions. If the data need manipulation, or if assumptions and inferences have to be made, the procedure should specify a suitable validation process [35]. In phase 2, the review is conducted. In this phase, candidate documents and studies are identified by applying the defined keywords to the defined scientific databases. After that, the relevant documents are selected using the inclusion and exclusion criteria. The remaining studies are called primary studies. Next, the required data are extracted from the primary studies using the data extraction strategy, and these data are synthesized to answer the research questions [35,36]. Finally, in phase 3, the results obtained in the previous phases are documented.
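The screening and data extraction steps described above can be illustrated with a minimal sketch. It assumes a simple in-memory representation of candidate studies; the class name `ExtractionRecord`, its field names, and the helper functions are hypothetical and are not taken from the guidelines in [35,36]. The criteria shown (a publication-year window and an excluded publication type) follow the examples given in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExtractionRecord:
    """One row of a data extraction form (field names are illustrative)."""
    full_reference: str
    year: int
    publication_type: str  # e.g. "journal", "conference", "book chapter"
    research_topic: str = ""
    definitions: List[str] = field(default_factory=list)
    applied_methods: List[str] = field(default_factory=list)


def passes_screening(record, min_year, max_year, excluded_types):
    """Apply hypothetical inclusion/exclusion criteria to one candidate study."""
    in_window = min_year <= record.year <= max_year
    allowed_type = record.publication_type not in excluded_types
    return in_window and allowed_type


def select_primary_studies(candidates, min_year, max_year, excluded_types):
    """Keep only the candidates that satisfy every screening criterion."""
    return [c for c in candidates
            if passes_screening(c, min_year, max_year, excluded_types)]
```

In such a sketch, the records returned by `select_primary_studies` would play the role of the primary studies, and the remaining fields of each record would be filled in during data extraction.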

SLR on Technological CISs' Resilience
In this study, the approach shown in Figure 1 was used for the SLR on CISs' resilience. As explained above, the need for an SLR should be justified. Since our communities rely heavily on interdependent CISs such as electric power supply systems, natural gas processing plants, communication networks, transportation networks, and water distribution networks, it is crucial to assess the resilience of such CISs against various disruptive events. However, for efficient application of the concept of resilience, this concept should first be understood properly. Without this awareness, managers would not be able to minimize the total service loss (e.g., energy supply cut) and enable expeditious recovery of the CISs. Thus, it is necessary to systematically review the available resilience-related research studies that can increase our awareness and knowledge about the concept of resilience, its assessment approaches, and the factors that influence CISs' resilience. Moreover, as mentioned in the introduction, the main goal of planners and managers for years was to design a CIS that can resist various disruptions. However, based on the abovementioned experiences, this goal is still unattainable, and attention has recently shifted to the concept of resilience. To shift effectively to resilient CISs, our awareness and knowledge of the concept and of current practices and research should be increased. However, it should be emphasized that knowledge and practice of resilience should be in line with the risk management process.
After this step, the research questions (RQs) need to be developed. This stage is the most significant part of the review process. Based on this paper's purposes, the first author developed the research questions and then discussed them with all authors for confirmation. The defined research questions are as follows:
• RQ 1: Is resilience an emerging topic in the discipline of technological CISs?
To answer RQ 1, it is useful to examine the research output per year. The number of resilience-related studies published per year shows how the importance of resilience is changing over time. A research area may sometimes lose its appeal over time or offer no further scope for research.
• RQ 2: How is the term "resilience" defined in the discipline of technological CISs?
Without a comprehensive definition of the concept of resilience, its application will be challenging. Therefore, RQ 2 is defined to understand the current definition of the concept of resilience in technological CISs' discipline using the literature review.
• RQ 3: For what type of technological CISs has the concept of resilience been used?
This question identifies the most dominant CISs in which the concept of resilience has been applied. The identified studies should be classified based on their type of CIS for answering this question.
• RQ 4: What types of terms define/determine the resilience of technological CISs?
Resilience is a macro-concept; thus, to use it correctly, it is essential to identify and understand all terms (e.g., reliability and redundancy) that determine it. Without this understanding, it will be complicated to apply the concept of resilience in the CISs. Moreover, exchanging the knowledge and best practices between different operators and users will be almost impossible. Therefore, by reviewing the definitions and approaches for interpreting and analyzing technological CISs' resilience, the associated terms will be identified.

• RQ 5: What types of approaches have been used for technological CISs' resilience analysis?
Many approaches can be used for CISs' resilience analysis. Hence, a literature review should be conducted to identify these methods and classify them according to their nature.

Developing a Review Protocol
In this stage, the search terms, search resources, and primary studies selection criteria have been identified as follows:

Identify the Search Terms
Based on the addressed research questions, "system resilience", "resilience analysis", "resilience metric", "resilience assessment", "resilience definition", and "resilience concept" are considered as search terms. Figure 2 shows the final search string that was applied in this SLR. This string combines the search terms using the Boolean operators AND and OR.
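As an illustration of how such a Boolean string can be assembled, the following minimal sketch joins the six search terms listed above with OR. The exact string used in this SLR is the one given in Figure 2 and is not reproduced here; the function `build_search_string` and its optional `context_terms` argument (a second group combined with AND) are assumptions made for illustration only.

```python
def build_search_string(phrases, context_terms=()):
    """Join phrases with OR and, if given, AND the result with a context group."""
    def group(terms):
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

    query = group(phrases)
    if context_terms:
        # Hypothetical second group, e.g. terms that restrict the domain.
        query = f"{query} AND {group(context_terms)}"
    return query


# Search terms as listed in the text.
SEARCH_TERMS = [
    "system resilience",
    "resilience analysis",
    "resilience metric",
    "resilience assessment",
    "resilience definition",
    "resilience concept",
]

print(build_search_string(SEARCH_TERMS))
```

Note that individual databases differ in their query syntax, so a string built this way may need adapting for each search resource.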

Identify the Search Resources
Here, well-known digital resources were considered as search resources. They include ScienceDirect, Springer-Link, IEEE Xplore Digital Library, SAGE Journals, MDPI, and Wiley Online Library. Moreover, other databases such as Web of Science and Google Scholar can be used for the search process.

Identify the Selection Criteria
A large volume of studies can be collected using the search string and search resources. The defined inclusion and exclusion criteria determine which studies are processed further. The inclusion and exclusion criteria considered are listed as follows:

Results and Discussion
The search was conducted on 2 October 2020 using the search sources and the search string. Accordingly, 1454 resilience-related studies were found. Figure 3 shows the search process, from which 182 primary studies remained for final review. Furthermore, based on the references in these included studies and on complementary relevant studies known to the authors, an additional 10 published studies were added. In total, 192 primary studies were hence included in the review process. As Figure 4 illustrates, within the bounds of the inclusion and exclusion criteria, the selected primary studies made up 13% of all resilience-related studies. In the search process, around 51% of the studies were irrelevant research studies that were not linked to CISs. Furthermore, 9% of the studies were related to nontechnological domains such as social and ecological domains. In such domains, researchers occasionally use the term "system" for social or ecological entities, which caused these studies to appear in the initial search results.
The identified primary studies were then examined to determine which search resource each belongs to. As Figure 5 shows, most of the primary studies were published in ScienceDirect. The selected primary studies are presented in Appendix A.
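The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (1454 studies found, 182 remaining after screening, 10 added afterwards); the variable names are illustrative.

```python
found = 1454            # studies returned by the search (from the text)
after_screening = 182   # primary studies left after screening (from the text)
added_manually = 10     # studies added from references and author knowledge

total_primary = after_screening + added_manually
share = total_primary / found

print(total_primary)      # 192
print(f"{share:.1%}")     # 13.2%
```

The resulting share of about 13% agrees with the proportion of selected primary studies reported for Figure 4.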

Is Resilience an Emerging Topic in the Discipline of Technological CISs?
The number of published papers and their trends can reveal how the importance of the concept of resilience varies with time in the discipline of technological CISs. Accordingly, all of the 158 primary studies were classified based on their publication date. The results are illustrated in Figure 6. According to this figure, the concept of resilience was first presented to the technological CISs' discipline in 2003 by Bruneau et al. [12]. As can be seen, after 2003, this concept has gradually found its place among the technological CISs' managers and researchers. Since the concept of resilience has a high impact on technological CISs' safety and productivity, research is expected to continue to grow in future years.
As shown, since 2011, the volume of publications in this area has increased significantly. We believe the 2011 Tohoku earthquake and tsunami in Japan could be one of the specific drivers of this rapid growth in resilience-related research on technological CISs. Furthermore, the increase in terrorist attacks and cyber-attacks (e.g., the Ukraine power grid cyber-attack that affected 225,000 people in 2015) led to recognition of the need to address the growing threat that these attacks pose to our societies' CISs, especially those that supply vital needs like energy. This has led to the launch of large-scale research projects (e.g., the European IMPROVER project, started in 2015) and the establishment of many CIS resilience research centers worldwide (e.g., the US Critical Infrastructure Resilience Institute (CIRI) in 2016). Such activities have motivated researchers and led to the growth in the volume of publications about CISs' resilience since 2015.
The application of the concept of resilience is new in the discipline of technological CISs. Moreover, the complexity of these CISs and of disruptive events is increasing day by day. Hence, many research gaps need to be addressed to make the best use of the concept of resilience. This indicates that the concept of resilience is an emerging and important topic.
Energy systems are designed to supply the energy required by the public and by industry. They can be divided into three main categories: electric power supply systems (electric grids), gas distribution systems, and heating grids. As shown in Figure 8, the electric grid is the dominant area in energy systems' resilience-related studies. The electric grid can be categorized into electric power generation, transmission, and distribution systems. Figure 9 shows the share of these categories in the reviewed studies. As can be seen, 45% of the primary studies in the field of the electric grid cover all the components of this grid. Moreover, electric power transmission systems have received more attention than the other components of the electric grid.
Similarly to the electric grid classification, the frequency of research conducted on the various transportation modes (i.e., airway, railway, roadway, and waterway transportation networks) is categorized in Figure 10. As shown, roadway transportation networks (e.g., traffic networks, road networks, highways, and bridges), waterway transportation networks (e.g., ports and inland waterways), airway transportation networks (e.g., airports and aviation navigation systems), and railway transportation networks (e.g., subways and railway stations) have received the most attention, in that order. Moreover, 23% of the primary studies have addressed transportation networks' resilience in general, without focusing on a specific transportation mode.
Among the studies conducted in the water supply network field, six have addressed the resilience of water distribution systems, whereas the rest have focused on the entire water supply network (i.e., purification, storage, pumping, and water distribution systems).
As shown in Figure 7, the resilience of energy systems has attracted enormous attention because these systems supply the vital needs of today's societies and their proper functioning is essential in many respects, such as national security, health, and social welfare. Most energy systems are network-oriented, and their disruption can lead to cascading failures in other systems. For example, disruption of an electric power supply system can lead to the loss of not just electricity but also water and communication in affected areas [2]. As a real example, in 2011, the Tohoku earthquake and tsunami in Japan damaged large parts of the energy systems, leaving 4.4 million households without electricity and 1.5 million consumers without access to safe drinking water. Moreover, about 10 days after the event, there were still 2.43 million households without electricity and 1.04 million consumers without water [91]. In addition, the reduction of CO2 emissions and sustainable development have recently increased the importance of the transition of energy systems from traditional systems (e.g., fossil-fuel-based systems) to systems based on renewable energies (e.g., wind turbines) [37,92,93,94]. Therefore, researchers have paid more attention to energy systems' resilience, especially in countries prone to natural disasters (e.g., hurricanes, ice storms, and earthquakes) or sabotage. However, as shown, some technological CISs have not received much attention. For instance, the health care infrastructures of many countries were seriously affected by COVID-19 in 2020. The pressures of this pandemic have exceeded the capacity of health care in many countries. To manage such pressure, attempts need to be made to address the resilience of our communities' health care infrastructures in various respects, such as medical devices, organization, and human resources.
Failure of health care CISs in the early stage of any disruptive event can trigger panic in the community and lead to functionality loss in other CISs in the community.
Furthermore, telecommunication systems, supply chains, and industrial processes are also fields whose resilience needs further study. The proper functioning of these CISs is essential in many respects, such as national security, health, and social welfare. Therefore, the severe consequences of the disruption of such CISs should lead researchers to pay more attention to their resilience, especially in countries prone to natural disasters (e.g., hurricanes, ice storms, and earthquakes) or terrorist attacks.

How Is the Term "Resilience" Defined in the Discipline of Technological CISs?
Based on the collected literature, there is a wide range of resilience definitions. In this section, the presented definitions of resilience in the discipline of technological CISs are described.
Bruneau et al. [12] asserted that a resilient system "reduced failure probabilities, reduced consequences from failures, in terms of lives lost, damage, and negative economic and social consequences, and reduced time to recovery". Youn et al. [95] proposed a definition for the engineering system's resilience: "The conceptual definition of engineering resilience is the degree of a passive survival rate plus a proactive survival rate". Liu et al. [96] described resilience as the "system ability to prepare and respond quickly and reduce its losses". Barker et al. [97], Shafieezadeh and Ivey Burden [98], and Kong and Simonovic [99] defined resilience as the "proportion of affected performance of the system after disruption to the normal performance of the system without disruption". Teodorescu [100] defined resilience as the "ability of a system to recover its normal performance level". He believed resilience is a function of the recovery process duration and the system performance level after the recovery process. According to Nan et al. [101], resilience is the "system's ability to withstand a change or a disruptive event by reducing the initial negative impacts, by adapting itself to them and by recovering from them". Jin and Gu [102] defined resilience as the "system's ability to resist potentially high impact disruption events, and it is characterized by the ability of the system to mitigate or absorb the impact of disruption and quick recover to normal conditions". Hosseini and Barker [17] described resilience as the "ability of a system to adjust its functionality in the presence of disturbances, external threats, and unpredicted changes, and to withstand internal and external disruptive events without letting the system become discontinuous by performing system functionalities". Nan and Sansavini [103] defined resilience as the "system's ability to resist the effects of a disruptive force and to reduce performance deviations".
According to Zhao et al. [104], resilience is the "system's ability to recover to the desired level from disruption event within an acceptable time, which is measured by the total satisfaction level under the interactions of resilience capacities, dispatch strategies, and disruption events". Yoon et al. [8] defined resilience as the "ability of a system to maintain its required functionality by resisting and recovering from disruptive events". Lu [105] defined resilience as the "system's ability to recover rapidly from the worst performance under disruption to its original state". According to Karangelos et al. [46], resilience of a power system can be defined as the "system's ability to withstand a much wider range of atypical, yet not entirely unlikely, operating conditions characterized by the simultaneous failure of several system components, or at least to deteriorate moderately and promptly recover". Furthermore, in Zhang et al. [54], a resilient power system is defined as a "system that could anticipate possible disruptions, adopt effective actions to reduce the losses of system components and load before and during disruption, and recover power supply quickly". Faber et al. [106] defined resilience as "an aggregate characterization of systems encompassing their ability to maintain their main modes and levels of services, and on their own to develop and mobilize resources to adapt to and sustain disturbances over time". Wang et al. [39] defined resilience as the "ability to resist all kinds of hazards, withstand the consequences of initial incident, and quickly restore back to normal operation". Rehak et al. [107] defined CIS's resilience as a "quality that reduces vulnerability, minimizes the consequences of threats, accelerates response and recovery, and facilitates adaptation to a disruptive event".
Cutini and Pezzica [108] defined urban networks' resilience as the "capacity of an urban grid to maintain the operation of urban functional assets by redistributing movement after a physical perturbation". Mostafavi [59] defined CIS's resilience as the "ability of an infrastructure asset to maintain its performance to serve the required functions before, during, and after the occurrence of a natural hazard". Some studies described resilience as the "system's joint ability to resist (prevent and withstand) any possible hazards, absorb the initial damage, and recover to normal operation" [109][110][111][112]. The US National Academies of Science (NAS) defined system's resilience as the "system's ability to prepare and plan for, absorb, recover from, and successfully adapt to disruptive events" [77,113,114]. According to the US Presidential Policy Directive (PPD), resilience can be defined as the "system's ability to prepare for and adapt to changing conditions and withstand and recover rapidly from disruptive events" [88,115]. The American Society of Mechanical Engineers (ASME) defined resilience as the "system's ability to sustain external and internal disruptions without discontinuity of performing the system function or, if the function is disconnected, to fully recover the functions rapidly" [75,116]. The US National Infrastructure Advisory Council (NIAC) described resilience as the "system's ability to anticipate, absorb, adapt to, and rapidly recover from a potentially disruptive event" [15,117,118]. The United Nations Office for Disaster Risk Reduction (UNISDR) defined resilience as the "ability of the system, community or society exposed to hazards to resist, absorb, accommodate, adapt to, transform and recover from the effects of a hazard in a timely and efficient manner, including through the preservation and restoration of its essential basic structures and functions" [60,117,119,120].
As mentioned, the European project IMPROVER also proposed a definition of resilience. Based on this definition, resilience is "the ability of a CIS exposed to hazards to resist, absorb, accommodate to and recover from the effects of a hazard in a timely and efficient manner, for the preservation and restoration of essential services" [121]. The CIGRE C4.47 Power System Resilience Working Group defines power system resilience as the "ability to limit the extent, severity, and duration of system degradation following an extreme event" [2]. The UK Energy Research Center defined the resilience of an energy system as the "capacity of an energy system to tolerate disturbance and to continue to deliver affordable energy services to consumers. A resilient energy system can speedily recover from shocks and can provide alternative means of satisfying energy service needs in the event of changed external circumstances" [122].
The variation of definitions presented in the discipline of technological CISs' resilience suggests that there is still no agreement between researchers about the definition of resilience. Here are some highlights regarding the available definitions of resilience:

• The majority of the definitions emphasize the CIS's ability to recover to the normal operation state after a disruptive event. Therefore, recoverability can be considered a critical part of the technological CISs' resilience.
• According to the definitions of Mostafavi [59], Faber et al. [106], UNISDR, and IMPROVER, a resilient CIS preserves its needed functionality after a disruptive event. This means the CIS's performance level should not be less than a certain value after the disruption.
• The definitions of Hosseini and Barker [17] and ASME show that a resilient CIS should not let its functionality become discontinuous.
• The CIS's ability to anticipate the disruptive event is only considered explicitly in the definitions of Bruneau et al. [12], Zhang et al. [54], and NIAC.
• None of the definitions considers the positive influence of the CIS's learning capacity on its resilience. Learning from previous experiences would be helpful in upcoming disruptive events.
Based on the reviewed definitions of resilience, a resilient technological CIS possesses the following five abilities:

• The ability to imagine what to expect;
• The ability to protect and resist a disruptive event;
• The ability to absorb the adverse effects of a disruptive event;
• Adaptability to the new conditions and changes of disruptive events;
• The ability to recover the CIS's normal performance level after a disruptive event.
Some pre-disruption (preparedness) and post-disruption (recovery) activities need to be established to achieve these abilities. Pre-disruption activities target the CIS's ability to protect, resist, absorb, adapt, and recover, and post-disruption activities target the CIS's ability to adapt and recover [7,117,120,123,124,125,126]. According to Faturechi et al. [123], investment costs on the CIS's preparedness are mandatory, whether a disruption occurs or not.

What Types of Terms Define/Determine the Resilience of Technological CISs?
According to the reviewed studies, 20 terms were identified as contributing factors to CISs' resilience. The types and percentages of application of the identified terms are shown in Figure 11. As can be seen, robustness is the most frequent term among the reviewed primary studies. The definitions of the identified terms are described in the following.
Robustness: Robustness is the ability of a CIS to resist a certain disruption level and absorb its primary effects without significantly reducing performance. A CIS with high robustness maintains its central function in a disruptive event [3,12,80,123,127,128]. Robustness is measured by the CIS's amount of residual performance after a disruption [75,115]. Furthermore, survivability, resistant ability, and stability have a similar definition to robustness [75,126,129,130].
Recoverability: Recoverability is defined as the CIS's ability to recover its capacity and performance promptly under certain conditions and using available resources [117,129,131]. In some studies, such as Barker et al. [97], recoverability is defined as the CIS's ability to recover its performance quickly. In this definition, the importance of the resources required to perform the recovery actions is not considered.
Rapidity: The rate at which a CIS can recover a satisfactory performance level is called rapidity [3,115,132,133]. Rapidity refers to the slope of the CIS's performance curve during the recovery process and is often known as the recovery rate [80,134,135].
Absorptive Capacity: The absorptive capacity is the degree to which the CIS can automatically absorb the negative impact of a disruptive event. This capacity is considered an intrinsic characteristic of the CIS that minimizes the adverse effects of the disruptive event. For example, for CISs (e.g., electric power supply systems) located in areas with a harsh environment, it is very hard to repair disrupted components during severe storms. Therefore, they should be capable of absorbing various component failures [46]. Absorptive capacity includes a set of proactive actions (e.g., safer design) that should be considered in the CIS's preparedness phase [136][137][138][139]. Robustness is commonly used to quantify the absorptive capacity of the CIS [101].
Adaptive Capacity: The degree to which the CIS can arrange itself and use temporary and often non-standard actions to prevent CIS downtime during and after the disruption is called adaptive capacity. This capacity can prevent sudden collapses in the CIS's performance level [17,128,136,137,140]. However, as mentioned, these actions have a temporary nature, and for the CIS's performance recovery, permanent actions should be taken as soon as possible.
Restorative Capacity: This capacity is the degree to which the CIS can effectively restore its damaged performance and is typically affected by the available budget and resources (e.g., materials, human resources, etc.) [17,139,141]. Therefore, this capacity is affected by the CIS's supportability. Restorative capacity provides permanent solutions to damages caused by the disruption. Therefore, the cost of restorative capacity is much higher than that of adaptive capacity [136]. Rapidity is commonly used to quantify the CIS's restorative capacity [142].
Vulnerability: This denotes the degree of the CIS's sensitivity to disruption [140]. Vulnerability is an inherent feature of the CIS, even before any disruptive event. Paying attention to the CIS's vulnerability helps identify the weak points of the CIS, i.e., the points that could cause the most damage during a disruption [97,129,136,140,143,144]. During disruptions, more vulnerable CISs will experience more severe failures [144]. Although there is no consensus about the relationship between vulnerability and resilience, it seems that higher vulnerability of the CIS leads to lower resilience and vice versa [9,107].
Early warning and predictability: This refers to the early detection of disruptions in the CIS and directly influences the CISs' recovery [81,117]. In some reviewed studies, the Prognosis and Health Management (PHM) system is used as a useful tool for prediction and then early warning. The PHM system assesses the CIS's current state by monitoring facilities, anticipates potential defects by analyzing the monitoring data, and assists in the proper management of technological CISs throughout their life cycle [8,95,117].
Early warning and predictability provide timely information to implement efficient response measures against disruptive events. Therefore, they can positively affect the cost and time dedicated to the CISs' recovery process. It should be noted that the positive outcome of early warning depends on adequate and appropriate decision-making by the CIS's management; inadequate or improper decisions may make the situation even worse. However, based on the reviewed studies, predictability and early warning, as well as the role of management decision-making, have received less attention in the available literature. Since, in resilient CISs, early warning enables managers to get timely information about the proximity and development of disruption, more research effort is needed in this area.
Flexibility: The CIS's ability to react to disruption and adjust its internal mechanism with the help of adaptive capacity, without the consideration of any prior responses, is called flexibility [83,123,145,146].
Redundancy: This refers to the degree to which components, CISs, and other units exist that are interchangeable and can meet functional needs in the presence of disruption [3,12,57,108,140,147,148]. According to Kong and Simonovic [99], redundancy creates alternative functions for the CIS's operation, and its goal is to achieve a robust CIS. Thus, it increases the absorptive capacity of the CIS [80,103,149].
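The effect of redundancy on absorptive capacity can be illustrated with a minimal sketch. It assumes, purely for illustration, that interchangeable units fail independently and that the function is maintained as long as at least one unit survives; the function name and the survival probabilities are hypothetical and do not come from the reviewed studies.

```python
def parallel_survival_probability(unit_survival_probs):
    """P(at least one unit survives), assuming independent unit failures."""
    p_all_fail = 1.0
    for p in unit_survival_probs:
        # Probability that every unit considered so far has failed.
        p_all_fail *= (1.0 - p)
    return 1.0 - p_all_fail


# Hypothetical survival probability of 0.9 per unit during a disruption.
single = parallel_survival_probability([0.9])
duplicated = parallel_survival_probability([0.9, 0.9])
print(f"one unit: {single:.2f}, two redundant units: {duplicated:.2f}")
```

Under these assumptions, adding a second interchangeable unit raises the chance that the function survives the disruption, which is one simple way to read the claim that redundancy supports robustness.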
Reliability: The CIS's ability to deliver the required performance under certain conditions and over a given period of time without loss of performance is called reliability [8,75,131,133,143]. When the CIS is in a normal state (before a disruptive event), reliability provides its essential function [143]. The aim of absorptive, adaptive, and restorative capacities is to limit the degradation of the CIS's reliability due to disruptive events [8,17,95,136,137]. Mathias et al. [150] noted that reliability focuses on avoiding disruptions, while resilience also accounts for the CIS's recovery. Therefore, reliability and recoverability are complementary and closely related to the CISs' resilience [131].
Resourcefulness: Resourcefulness is the capacity of using and mobilizing the required resources (e.g., money, information, technology, laborers, spare parts, etc.) to identify and solve problems in a prioritized manner [3,12,80,115,140,149]. The goal of resourcefulness is to enhance the CIS's rapidity. Hence, it also increases the restorative CIS capacity [149,151].
Maintainability: Maintainability is a measure of how easily the CIS can be restored to a specified condition [117,136]. Most studies have used recovery speed or recovery time to quantify CIS maintainability [45,76,78,85,100,102,152–154]. Therefore, a short recovery time indicates good CIS maintainability. The aim of absorptive, adaptive, and restorative capacities is to increase the ease of CIS recovery by reducing the damage to the CIS caused by disruption [8,17,95,136,137].
Supportability: This ability refers to the intrinsic features of the CIS that facilitate efficient and effective support of the CIS throughout its life cycle [17,136]. By its definition, resourcefulness also affects the CIS's supportability.
Availability: Generally, the availability of a CIS is defined as the ratio of the CIS's uptime to the sum of its uptime and downtime. Thus, the CIS's availability refers to the proportion of time during which the CIS can be used [97].
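As a minimal illustration of this ratio, the following sketch computes availability from logged uptime and downtime durations. The function name and the sample values are hypothetical and are not taken from the reviewed studies.

```python
def availability(uptimes, downtimes):
    """Return uptime / (uptime + downtime) for lists of durations in hours."""
    total_up = sum(uptimes)
    total_down = sum(downtimes)
    total = total_up + total_down
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return total_up / total


# Hypothetical log: three operating periods and two outages (hours).
example = availability([700.0, 650.0, 630.0], [12.0, 8.0])
print(f"availability = {example:.4f}")
```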
Learning Capacity: This capacity is the degree to which the CIS can learn from the occurred disruption to prevent similar future events [136,150,155]. The obtained knowledge from past events can be incorporated for future iterations [150]. Few researchers have used this term. Therefore, this issue requires further research to clarify the effect of learning capacity on CIS.
Organizational Resilience: The resilience of the owner organization plays a significant role in CISs' resilience. Incorporating resilience aids CISs in coping with crises and recovering from disruptive events, which is impossible without an organizational foundation that can cope with and respond to disruption [107,156]. Organizational resilience includes all actors involved in CIS management, such as managers, personnel, and operators. The final purpose of organizational resilience is to enhance organizational performance in the face of irregular conditions and to create a problem-solving mentality at the organizational level [157]. Rehak et al. [107] tried to determine organizational resilience using the internal processes of an organization, including risk management, innovation processes, and education/development processes. According to Rehak et al. [107], the levels of these internal processes, which provide the proper conditions for the CIS to adapt to disruptions, can determine organizational resilience. However, based on the results, resilience analysts have paid less attention to this critical term because there is no easy way to analyze and assess organizational resilience. Thus, the main issue that should be addressed is to present a comprehensive explanation of organizational resilience. This will increase our understanding of this term and encourage other researchers to conduct research in this area.
According to the reviewed studies, society's role in the CISs' resilience is not well studied. As Qin and Faber [55] noted, "The major benefit associated with the concept of resilience as compared to traditional probabilistic reliability and risk modeling of systems is that resilience modeling accounts for the interrelations between the performances of the natural systems and the social systems and their internal organization". Therefore, the overall resilience of any CIS will be affected by the resilience of the three mentioned actors, i.e., engineered system (technological resilience), owner organization (organizational resilience), and society (social resilience).
The role of the reviewed terms in a resilient technological CIS is summarized in Figure 12. Normal CIS operation starts at t0. During the pre-disruption period (t1 − t0), the performance level of the CIS is controlled by its reliability, and potential defects are anticipated by analyzing the data from monitoring facilities. When the disruption begins at t1, the CIS applies its absorptive capacity (e.g., using redundant systems) and resists the disruption's adverse impacts. During damage propagation (t2 − t1), the CIS's adaptive capacity complements its absorptive capacity. Thus, the CIS shows flexibility as well as robustness. As shown, the level of lost performance reflects the CIS's vulnerability, and the remaining level of performance reflects its robustness.
After the disruption ends at t2, recovery actions should be taken. However, sometimes, due to delayed decisions, lack of readiness of the infrastructure owner organization, unavailability of required resources, etc., the CIS stays in the disrupted state for a while, i.e., (t3 − t2). A proper restorative capacity and an adequate level of organizational resilience can reduce this period. Finally, at t4, recovery actions are completed, and a new cycle of the CIS's function starts. The CIS's performance level in the recovered state may differ from its original performance level: it can be equal to, lower than, or even higher than the original level. The recoverability of the CIS depends on its supportability and maintainability. It should be noted that supporting the CIS and learning from experience should be active throughout the CIS life cycle.
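The timeline described above can be sketched in code. The example below is only an illustration under simplifying assumptions: performance is normalized, the curve is piecewise linear, and the class and attribute names are hypothetical rather than taken from Figure 12 or from any reviewed study.

```python
from dataclasses import dataclass


@dataclass
class ResilienceCurve:
    """Piecewise-linear performance curve; all names here are illustrative."""
    t1: float  # disruption begins
    t2: float  # disruption ends (performance at its lowest)
    t3: float  # recovery actions start
    t4: float  # recovery completed
    p_normal: float     # performance level before t1
    p_disrupted: float  # residual performance between t2 and t3
    p_recovered: float  # performance level reached at t4

    def robustness(self) -> float:
        # Remaining performance level after the disruption.
        return self.p_disrupted

    def vulnerability(self) -> float:
        # Performance lost during damage propagation (t2 - t1).
        return self.p_normal - self.p_disrupted

    def response_delay(self) -> float:
        # Time spent in the disrupted state before recovery starts (t3 - t2).
        return self.t3 - self.t2

    def rapidity(self) -> float:
        # Slope of the performance curve during recovery (t4 - t3).
        return (self.p_recovered - self.p_disrupted) / (self.t4 - self.t3)


# Hypothetical values chosen only to exercise the definitions.
curve = ResilienceCurve(t1=10, t2=14, t3=20, t4=40,
                        p_normal=1.0, p_disrupted=0.4, p_recovered=1.0)
print(curve.robustness(), curve.vulnerability(),
      curve.response_delay(), curve.rapidity())
```

In this reading, a shorter response delay corresponds to better organizational readiness, and a steeper recovery slope corresponds to higher rapidity.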

What Types of Approaches Have Been Used for Technological CISs' Resilience Analysis?
Despite years of studies on technological CISs' resilience, researchers in this discipline still do not agree on a specific resilience analysis approach. To date, various approaches have been proposed for resilience analysis, each with its own interpretation of CISs' resilience. Some consider either preparedness or recovery activities, and some consider both. Based on the reviewed studies, technological CISs' resilience analysis approaches can be classified into four main groups: empirical, qualitative, index-based, and simulation approaches. Figure 13 illustrates the distribution of primary studies in these groups. As can be seen, simulation approaches were predominantly applied among the reviewed primary studies.
Simulation Approaches: These approaches examine how the CIS structure affects its resilience. In these approaches, the behavior of the CIS should be observed, and its properties must be modeled or simulated [6,157]. Simulation approaches are mostly used in CISs such as energy systems (e.g., electric grids), transportation networks, and water supply networks [17,75,131,137,158,159]. These approaches commonly apply Bayesian networks (BNs) [17,116,135–137,160], Monte Carlo simulation [161,162], fuzzy models [72,163], and optimization algorithms [65,164]. BNs are mostly used to deal with uncertainties and partial data and information [135,137]. However, standard BNs are limited to systems of a static nature. To overcome this limitation, Kammouh et al. [135], for instance, used a dynamic BN that adds the dimension of time to capture the dynamic nature of a transportation system. In simulation approaches, optimization algorithms are usually used to optimize the resilience of engineering systems.
Simulation approaches are complex, costly, and time-consuming, and they need a high level of expertise as well as extensive technical, operational, and organizational data and information. Hence, their application is limited to organizations with these capabilities.
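To make the Monte Carlo idea concrete, the following minimal sketch estimates the mean residual performance of a CIS after a disruption. The model is a deliberately simple assumption and not a method from the reviewed studies: components fail independently with the same probability, and performance is taken as the fraction of surviving components. All names and parameter values are hypothetical.

```python
import random


def simulate_residual_performance(n_components, p_fail, n_runs, seed=0):
    """Estimate mean residual performance after a disruption by Monte Carlo.

    Assumption (not from the reviewed studies): performance is the fraction
    of components that survive, and components fail independently.
    """
    rng = random.Random(seed)  # seeded generator for repeatable runs
    total = 0.0
    for _ in range(n_runs):
        # Count components whose random draw falls outside the failure range.
        survivors = sum(1 for _ in range(n_components)
                        if rng.random() >= p_fail)
        total += survivors / n_components
    return total / n_runs


estimate = simulate_residual_performance(n_components=20, p_fail=0.3,
                                         n_runs=5000)
print(f"estimated mean residual performance: {estimate:.3f}")
```

Even this toy model hints at the cost noted above: realistic studies replace the independence assumption with a network model and calibrated failure data, which is where most of the required expertise and data go.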
Empirical Approaches: In empirical approaches, the CIS's resilience is analyzed using the CIS's failure and recovery curves, which are usually determined by analyzing historical data of the CIS (e.g., time to failure and time to repair data) [157]. In these approaches, regardless of the CIS's structure, the CIS's performance levels before and after the disruption are compared with each other. Empirical approaches are either deterministic [12,15,85,132,165–168] or probabilistic [89,95,100,115,117,126]. Deterministic approaches do not use any probabilistic parameters in the resilience analysis process and cannot model the uncertainties that exist in the inputs of the metric [12,85,165–167]. In probabilistic approaches, the CIS's recovery and failure distribution functions, etc., can be obtained using the available data and statistical methods [89,100,115,126]. However, empirical approaches, specifically those that are deterministic, include many assumptions that limit their application. For example, some assume the CIS's performance is at the ideal level (i.e., 100%) and that recovery actions start right after the disruption [12,85,166]. Although both deterministic and probabilistic metrics use some terms to model a system's behavior (e.g., redundancy, robustness, etc.), they are limited in incorporating all necessary terms. Hence, they have limitations in modeling the system's behavior. Furthermore, empirical approaches rely heavily on the available historical data, and in the absence of the required data, their application is challenging for resilience analysts.
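A probabilistic empirical analysis of this kind can be sketched as follows. The example assumes, for illustration only, that times to failure and times to repair follow exponential distributions, whose maximum-likelihood rate is the reciprocal of the sample mean; the reviewed studies may use other distributions. The records and function names are hypothetical.

```python
import math
from statistics import mean


def fit_exponential_rate(durations):
    """Maximum-likelihood rate of an exponential distribution: 1 / mean."""
    return 1.0 / mean(durations)


def prob_recovered_within(t, repair_rate):
    """P(time to repair <= t) under the assumed exponential repair model."""
    return 1.0 - math.exp(-repair_rate * t)


# Hypothetical historical records, in hours.
times_to_failure = [1200.0, 950.0, 1430.0, 1100.0]
times_to_repair = [6.0, 10.0, 4.0, 12.0]

failure_rate = fit_exponential_rate(times_to_failure)
repair_rate = fit_exponential_rate(times_to_repair)
print(f"failure rate: {failure_rate:.6f} per hour")
print(f"P(recovered within 8 h): {prob_recovered_within(8.0, repair_rate):.3f}")
```

The sketch also shows the dependence on historical data: with only a handful of records, the fitted rates are highly uncertain.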
Index-based Approaches: In these approaches, the raw data needed for the CIS's resilience analysis are usually collected by measuring indicators that express the CIS's resilience concept. The resilience index, which is also the CIS's resilience score, can be determined by aggregating the information obtained from the indicators (usually using the weighted average method). Index-based approaches are often implemented semi-quantitatively [79,157,169,170]. This approach makes it possible to use various influencing factors and indicators in the resilience analysis process. Furthermore, it helps recognize the weak and strong points of CIS resilience [171].
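The weighted average aggregation mentioned above can be sketched as follows. The indicator names, scores, and weights are hypothetical and chosen only for illustration; in practice they would come from expert judgment or measured indicators.

```python
def resilience_index(scores, weights):
    """Weighted average of indicator scores (each assumed to lie in [0, 1])."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indicators")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[k] * weights[k] for k in scores) / total_weight


# Hypothetical indicator scores and weights for illustration only.
scores = {"robustness": 0.8, "redundancy": 0.6, "resourcefulness": 0.5}
weights = {"robustness": 0.5, "redundancy": 0.3, "resourcefulness": 0.2}
index = resilience_index(scores, weights)
weakest = min(scores, key=scores.get)  # indicator with the lowest score
print(f"resilience index = {index:.2f}, weakest indicator = {weakest}")
```

Reporting the lowest-scoring indicator alongside the index reflects the point that such approaches help locate weak and strong points of CIS resilience.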
Qualitative Approaches: These approaches try to evaluate the CIS's resilience without any numerical descriptions and present a conceptual insight [172]. This conceptual insight of resilience is usually established using a graph. This kind of graph represents the relationship between the CIS's resilience and resilience contributing factors. In the qualitative approaches, some steps can be designed to evaluate the CIS's resilience qualitatively [6]. Since engineers prefer to use numeric tools, this approach has been rarely used in the field of technological CIS.

Conclusions
Today's society has become complex. Worldwide, the majority of people have access to several services such as electricity, heat, health care, clean water, transportation, and communication. These vital services are provided by technological CISs. Disruption of CISs may affect the entire community. For example, an electric power outage may cause water and communication network failures, restrict the use of health care services, and disrupt heating and transportation systems. Therefore, our welfare highly depends on the normal and continuous performance of such CISs. The current practice for CIS management is based on risk management. This approach has its limitations when it comes to rare and extreme events as well as unforeseen hazards. Moreover, it is not a very effective approach for managing pre- and post-disruption activities. Although there are attempts to use the concept of resilience to improve the performance of critical infrastructures, there is still a long way to go. Specifically, we should determine how the available practices can benefit from resilience assessment and how they can provide information for the resilience management process. Therefore, in the present paper, to learn more about the concept, areas of application, and analysis approaches of technological CISs' resilience, the available resilience-related literature was reviewed using the SLR technique. Based on the results, 1454 potential resilience-related studies were identified using search terms and resources. Using the inclusion and exclusion criteria, 192 primary studies were selected to conduct the review process. The results of the systematic review are summarized as follows:

• The concept of resilience has gradually found a place among researchers in the discipline of technological CISs since 2003. Over the last decades, the number of resilience-related publications has grown significantly. Based on the results, the concept of resilience is an up-to-date, attractive, and important research area in the discipline of technological CISs.

• Among the reviewed primary studies, energy systems, transportation networks, and water supply networks are the most frequently used case studies.

• A resilient technological CIS should have the ability to anticipate what to expect, the ability to protect against and resist a disruptive event, the ability to absorb the adverse effects of a disruption, the ability to adapt to new conditions and changes caused by disruption, and the ability to recover the CIS's normal performance level after a disruption. Building these abilities in a CIS requires two types of activities: preparedness and recovery activities.

• In this paper, 20 terms were identified as contributing factors to CISs' resilience, which depends on preparedness and recovery activities. As shown, robustness is the most frequent term among the reviewed primary studies. However, none of the evaluated terms alone can guarantee the resilience of CISs. Therefore, the management of technological CISs must develop a combination of these factors to build a resilient CIS.

• Few researchers have tried to use the organizational resilience concept in technological CIS resilience analysis. There are still some gaps in our comprehension of this concept, and there is no consensus on how to assess it. Therefore, this issue is an interesting research topic for researchers working on technological CIS resilience.

• Resilience analysis approaches of technological CISs can be classified into four main groups: empirical, simulation, index-based, and qualitative approaches. Simulation approaches were predominantly applied to resilience analysis and are case-study-based. Empirical approaches, specifically those that are deterministic, include many assumptions that limit their use. Index-based approaches are often implemented semi-quantitatively. Qualitative approaches propose a qualitative representation of the resilience of the CIS.

• To apply the concept of resilience more effectively, the main effort should be directed at understanding and establishing a concept of resilience appropriate to the discipline of technological CISs. There is still some vagueness about its definition and analysis approaches. Therefore, research should first be performed in these areas.

• In the reviewed studies, we noticed that the role of society in CISs' resilience was not considered. CISs consist of engineered systems and their owner organizations. Moreover, societies are the customers of these CISs' services. Therefore, the overall resilience of any CIS will be affected by the resilience of these three actors, and the role of social and organizational resilience in overall CIS resilience needs more research.

• In modern communities, critical infrastructures are highly interconnected, and an electric power outage may cause water and communication network failures, restrict the use of health care services, and disrupt heating and transportation systems. Therefore, the interdependency of critical infrastructures needs more attention in future research.

• Since black swan events have an extremely low occurrence probability as well as severe consequences, the black swan modeling of critical infrastructures needs more research.

• Continuity management is an important subject that should be developed for critical infrastructures. It is more than a response to disruption: it commences with policies and procedures that are developed and tested, and then adopted when a disruption happens.

• According to the reviewed papers, predictability and early warning, as well as the role of management decision-making, have received less attention in the available literature. Since early warning in resilient CISs enables managers to get timely information about the proximity and development of a disruption, more effort is needed in this area.

Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
[195] Hsieh and Feng The resilience of transportation systems [196] Baroud and Barker Resilience of networks [133] Moslehi and Reddy Resilience assessment [197] Ramirez-Marquez et al. Resilience of networks [198] Cai et al. The resilience of infrastructure systems [99] Maheshwari and Ramakumar The resilience of energy systems [41] Hosseini and Ivanov The resilience of supply networks [209] Gautam et al.