Measuring Resilience Engineering: An Integrative Review and Framework for Bench-Marking Organisational Safety

Abstract: Interest in resilience engineering for improving organisational safety continues to grow among safety scholars and practitioners, but little attention has focused on a unifying definition, characteristics, and instruments for quantitative measurements. This is a significant gap which can impede efforts at benchmarking and evaluating resilience engineering for organisational safety. This integrative review was undertaken to address this research-practice gap in order to inform a theoretical framework. A five-step integrative literature review process was used to retrieve and critically evaluate peer-reviewed quantitative research articles published or in press from 2003 to November 2019. From the 3884 studies identified, screened, and selected, 17 met the final inclusion criteria. In total, 15 specific instruments were identified, but only four were grounded on a theoretical framework or model; the most common instruments used were structured surveys. A minimum of three and a maximum of 13 characteristics were measured; however, it is not clear what type of variables they represented. The six most common characteristics included top management commitment, just and flexibility. An integrative model of how these can inform a Resilience Climate Questionnaire (RCQ) survey is presented.


Introduction
The effective management of organisational safety continues to attract the attention of academics, managers, and policy makers throughout the world. Consistent with contemporary safety practices, most efforts in this regard have focused on unwanted outcomes, injuries and losses arising from adverse events. These are in tandem with the commonly understood view of safety as the absence of unwanted outcomes, freedom from unacceptable harm, or the property of a system that seeks to ensure that harmful events are as low as possible [1]. While there are several strategies for achieving these (such as legislation, behavioural measures, ergonomics, risk management and safety management systems), most of these are based on the assumption that safety can be achieved by people performing work through prescribed norms, following procedures and rules, and reducing human error. This represents a Safety-I philosophy [2,3], with safety defined as the absence of negative outcomes and operations deemed to be safe when the number of events that could go wrong was maintained at acceptably low levels. Hence, these strategies focused on identifying and managing deviations from prescribed work. They were based on the assumptions that organisational systems were well designed and correctly organisational safety. This is the first comprehensive review on the topic, and advances previous work on RE indicators for safety management [12]. In doing so it seeks to address the previously identified need for a coherent integrative framework for RE [13].
The article is organized as follows. Section 2 presents the research paradigm and theoretical framework, which informed the specific method used for this review. This is one area of departure that this review takes in comparison with most published reviews. The specific methods used are discussed in Section 3. Section 4 includes the results of this review, while Section 5 discusses these findings and proposes an integrated framework for informing an instrument for measuring and benchmarking RE for organisational safety. Section 6 presents the strengths, limitations and implications of this review, followed by conclusions in Section 7.

Research Paradigm and Theoretical Perspective
Prior to selecting a research methodology, it is important for researchers to elucidate the philosophical and theoretical positions upon which their contributions are based. This will assist in explaining why some methods are appropriate for conducting certain types of research [32], and also connect the basic assumptions inherent in the paradigms and theoretical positioning used [33]. In this regard, the research paradigm and theoretical perspectives play an important role in organisational safety research. Burrell and Morgan [34] proposed that research aimed at investigating organisations can be situated along four paradigms: interpretivism, functionalism, radical structuralism and radical humanism, although the latter two are yet to be fully embraced by safety researchers. Interpretivism is closely associated with qualitative research and most common in RE [10,35]. This paradigm suggests that RE does not have a concrete existence but is something that is constructed by human actors, so cannot be investigated using the methods commonly used in natural and biological sciences [32]. Functionalism, which is closely associated with positivism, is gaining attention in RE studies [10,35]. This paradigm suggests that social phenomena have an objective existence so can be scientifically investigated using methods of natural and biological sciences [32]. This research is aimed at identifying and developing an integrative theoretical framework and informing an objective instrument for investigating RE for organisational safety, so a positivist paradigm was deemed most appropriate.
Apart from research paradigms, researchers also need to embed their inquiries in an appropriate theoretical perspective, which guides data collection and analysis [32]. In this instance, pragmatism was applied, largely because of its recognition as a philosophy of common sense, greater flexibility in terms of methods, and a focus on practical outcomes [36].

Method
This research utilized an integrative review, a method commonly used for evaluating strengths of evidence, identifying gaps in research, connecting related areas of published research, generating research question(s), identifying theoretical or conceptual frameworks, and exploring research methods [37]. Integrative reviews also enable researchers to infer generalizations about substantive issues from a set of studies directly bearing on those issues [38], draw together research published from different methodologies [39], and allow for wider perspectives and depth of evidence, including non-experimental research and theoretical literature [40]. Key authorities, such as Whittemore [37] and Soares, Hoga, Peduzzi, Sangaleti, Yonekura and Silva [39], have suggested that a five-stage approach be utilised, comprising problem identification, literature search, data evaluation, data analysis, and presentation. The first stage is concerned with identifying the variables of interest and the sampling frame to provide a focus and set the boundaries; the second involves identifying the maximum number of eligible articles; the third involves extracting data and evaluating the quality of the articles; the fourth involves ordering, coding, categorizing, and summarizing the findings; the final stage involves reporting the findings. The specific methods used for this review, adapted from [37], comprised five stages, which are discussed below.

Framing the Research Questions
A set of three interrelated research questions were formulated for this review:
i. How has RE been conceptualized and defined in quantitative studies?
ii. Which RE characteristics have been measured in these studies?
iii. What psychometric properties were measured in these studies?

Searching and Selecting the Relevant Literature
Six electronic databases (CINAHL, Google Scholar, PsycINFO, PubMed, Scopus, and Social Science Journals) were searched on 12 November 2019, using "resilience engineering" as the keyword, i.e., TITLE-ABS-KEY ("resilience engineering"), in line with a recent systematic review [9] and meta-analysis [11], to identify articles published or in press from January 2003 to November 2019. The search was limited to full-text articles and conference proceedings published in English. "Grey literature" was also searched by reviewing reference lists to identify any articles that may have been missed. The titles and abstracts were screened by two independent research assistants (RAs), and articles were included if they: (i) used groups, teams or organisations as a unit of analysis; (ii) were journal articles or conference proceedings; (iii) focused on safety. In making this decision it was observed that previous authors, such as Furniss, Back, Blandford, Hildebrandt and Broberg [21], have suggested that resilience is related to individuals, while others, including Woods [23] and Pillay [10], posited that the collective roles of groups and teams were important. Articles which did not meet the above criteria were excluded. Full-text copies of the remaining articles were retrieved and closely read by two RAs to identify studies eligible for review, including those that (i) described quantitative methods, factors and/or measures, and (ii) referred to and/or discussed instruments used. Disagreements were resolved through discussion with the two authors until consensus was achieved.

Data Extraction
The key information required for analysis and synthesis was exported into a data extraction sheet. Information regarding research aims, theory/model, research design (methods, participants), factors and variables measured, instruments used and approaches to statistical analysis was extracted. There are no specific published criteria for evaluating psychometric properties in RE studies; however, reliability and validity are two of the most commonly used criteria for evaluating the rigour of quantitative studies [41].
Reliability is most often assessed through Cronbach's alpha coefficient, with a minimum value of 0.70 suggested to be generally acceptable for new measures, constructs and/or scales [42]. Validity is the degree to which the research phenomenon being investigated is accurately measured, and is generally assessed for content, construct and criterion [42]. According to the same author, content validity is generally determined by the subjective opinions of experts, construct validity through exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), and criterion validity by considering convergence, divergence, and predictions.
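As an illustration of the reliability criterion described above, the following is a minimal sketch of how Cronbach's alpha can be computed for one scale and compared with the 0.70 threshold. It assumes the standard formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), with sample variances. The function name `cronbach_alpha` and the response data are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for a list of respondent rows.

    Each row holds one respondent's scores on the k items of a scale.
    Sample variances (n - 1 denominator) are used throughout.
    """
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents (columns of the matrix).
    item_variance_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point Likert responses: 5 respondents, 3 items.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, meets 0.70 threshold: {alpha >= 0.70}")
```

In practice a statistics package would normally be used; the sketch only makes the calculation behind the threshold explicit.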

Critical Appraisal
The selected articles were critically appraised using an eight-item questionnaire adapted from the Critical Appraisal Skills Programme (CASP) [43] and the quality assessment tool of diverse study designs [44]. The specific questions focused on:
i. Aim(s): Is the aim(s)/purpose of research clearly stated?
ii. Theory/model: Is an explicit theoretical framework/model used or discussed?
iii. Research design: Is an overall research design mentioned or discussed?
iv. Data collection: Does the data collection include procedures, settings and/or sampling?
v. Data analysis: Is the data analysis sufficiently rigorous and does it include quality aspects?
vi. Bias: Were any biases or limitations of study considered and/or reported?
vii. Results: Are the results clearly stated and discussed?
viii. Value: Can this research be used to advance knowledge and/or practice?
Each question was assessed using a YES, NO, LIMITED response by the first researcher and cross-checked by the second. While this process was used for appraising the quality of published works selected for review, no articles were filtered out at this stage.

Data Analysis and Synthesis
The key information and findings for each study were gathered and summarized using a narrative approach. Figure 1 presents the findings of our search and selection strategy. The search of six databases and grey literature generated 3884 articles, from which 3731 duplicates were removed, leaving 153 articles for title and abstract screening. A further 78 were screened out at this stage, resulting in 75 articles for full-text review. An additional 58 were deemed not eligible at this stage, resulting in 17 studies for the final review and synthesis, which are summarized in Table 1.
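The screening arithmetic reported above can be checked with a short sketch. The counts are those stated in the text; the function name `screening_flow` and the stage labels are illustrative only.

```python
def screening_flow(identified, exclusions):
    """Apply successive exclusions and return (stage, removed, remaining) rows."""
    remaining = identified
    flow = []
    for stage, removed in exclusions:
        if removed > remaining:
            raise ValueError(f"cannot remove {removed} records at stage '{stage}'")
        remaining -= removed
        flow.append((stage, removed, remaining))
    return flow


# Counts as reported in the text for the search and selection strategy.
flow = screening_flow(
    3884,
    [
        ("duplicates removed", 3731),
        ("excluded at title and abstract screening", 78),
        ("excluded at full-text review", 58),
    ],
)

for stage, removed, remaining in flow:
    print(f"{stage}: -{removed}, {remaining} remaining")
```

The remaining counts after each stage are 153, 75 and 17, matching the figures given for title and abstract screening, full-text review and final synthesis.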

General Characteristics of Reviewed Studies
The results suggested that propositions for studies in RE did not receive much attention until 2010, two years after the need for the quantification of RE was first pointed out, with the first two studies published in 2013. Since then, there has been at least one study every year, except for 2015. Eleven (64.7%) articles originated from Iran, while the others were from Australia (5.9%), Austria (5.9%), Canada (5.9%), Italy (5.9%), Kuwait (5.9%) and Poland (5.9%). All studies reported the development of instruments for assessing RE. Organisations in which these were proposed, developed and/or investigated mostly included petrochemical plants (N = 5, 29.4%), process industries (N = 4, 23.5%) and public hospitals (N = 2, 11.8%), with those remaining from aviation (N = 1, 5.9%), aluminium manufacturing (N = 1, 5.9%), construction (N = 1, 5.9%), gold mining (N = 1, 5.9%), solid waste management (N = 1, 5.9%) and steel manufacturing (N = 1, 5.9%).
The results of the critical appraisal of each article are presented in Table 2. All seventeen articles included details of data collection methods, while sixteen (94.1%) covered methods for data analysis and results in adequate detail. Ten (58.8%) included an explanation or mentioned an overall research design, while nine (52.9%) included an aim or purpose of the study and/or article. Only four (23.5%) used or suggested a theoretical framework or model informing the research design or instrument, while only one (5.9%) discussed any biases and/or limitations.
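The percentages in this section are shares of the 17 included studies, rounded to one decimal place. The sketch below reproduces them from the appraisal counts stated above; the dictionary keys are shorthand labels introduced here for illustration, not the wording of Table 2.

```python
TOTAL_STUDIES = 17  # number of studies in the final synthesis


def share_of_studies(count, total=TOTAL_STUDIES):
    """Return count as a percentage of total, rounded to one decimal place."""
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return round(100 * count / total, 1)


# Number of articles meeting each appraisal item, as reported in the text.
appraisal_counts = {
    "data collection described": 17,
    "data analysis and results adequate": 16,
    "research design mentioned": 10,
    "aim or purpose stated": 9,
    "theoretical framework or model used": 4,
    "biases or limitations discussed": 1,
}

appraisal_shares = {item: share_of_studies(n) for item, n in appraisal_counts.items()}

for item, pct in appraisal_shares.items():
    print(f"{item}: {appraisal_counts[item]}/{TOTAL_STUDIES} ({pct}%)")
```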
The diversity in the conceptualization of RE also means that there continues to be no unified definition of RE, consistent with previous conclusions [10,14]. Ten inter-related definitions identified in this review are summarized in Table 3. The most common of these suggests it is a paradigm for safety management, consistent with the seminal definition. Seven of the articles suggested it is an ability and/or capability, while one suggested it is a process. Four suggested it was a reactive ability, while four saw it as being proactive, reflecting the two faces of safety advocated by seminal authors of RE, such as Hollnagel, Wears and Braithwaite [2] and Hollnagel [18]. Authors such as Boring [15,16] have previously argued that the loose definition of RE is intentional, to avoid constraining its emergence and adoption. However, the absence of a unified definition of this emerging notion also means it can continue to be a source of confusion, especially outside the circle of researchers and practitioners who promote it, as it is not easy to have a clear sense of what it really denotes [26]. A unified meaning is necessary for setting the boundary of, and providing a focus for, RE [10]. In Section 5.4, one is provided, which the authors find useful in both regards, without making it superior to any of the others that have been previously proposed.
Table 3. Conceptualization of RE in quantitative studies.

Authors | Definition
[30] | "developing an organisation's behavioural and cognitive capability such that it can effectively adjust and continue performing optimally near an its safe operating envelop in the presence of everyday threats and environmental stressors at all levels of the organisation" (p. 134)
[29] | "ability of a system or an organisation to react and recover from disturbances at an early stage with minimal effect on dynamic stability" (p. 2)
[45] | "ability of a system to adapt its functioning before and during disturbances, so that it can continue operations after a major mishap or in the presence of continuous stresses" (p. 89)
[46] | "a paradigm for safety management that concentrates on how to help people create foresight, anticipate the different forms of risk in order to cope with complexities under pressure and move towards success" (p. 100)
[47] | "a paradigm for safety management that stresses how people, systems and organizations learn, adapt, and create safety in an environment with hazards, tradeoffs, and multiple goals" (pp. 231-232)
[48] | "intrinsic ability of a system to adjust its operation before or following changes and disturbances, so it can maintain operations after an accident" (p. 247)
[50] | "inherent ability of a system to adapt its function before and during the situations where normal functioning is disrupted, so that it can continue operations after a major disaster or in the presence of constant stresses" (p. 56)
[53] | "intrinsic ability of a system to adapt its function before, during, or after major mishaps or changes, so that it can continue the operations required under both expected and unexpected conditions" (p. 20)
[54] | "a paradigm for safety management that concentrates on how to help people deal with complexity under stress to achieve success" (p. 336)
[56] | "an organizational culture that fosters safe practices for improved safety in an ultra-safe organization striving for cost-effective safety management" (p. 297)
[58] | "capability to sustain or rapidly return to a steady state that allows the organization to continue operation during or after a major event or in the presence of continuous stresses" (p. 191)
[59] | "ability of an organization to regulate its function before, during, and after perturbations and fluctuations, so that it can continue the operations required under both predicted and unpredicted situations" (p. 142)

The measurement of RE (which includes all the attributes in a set of objective measures, or an aggregated index) largely followed a dimensional approach through several variables and/or factors, although the nomenclature was inconsistently applied. For this review these have been simplified into RE characteristics, which include the main attributes that were measured. Table 4 summarizes the key RE characteristics measured, the instruments used, the theoretical frameworks or models used to develop them, and the associated statistical measures and results. According to Table 4, a minimum of three and a maximum of thirteen characteristics were used.
[Table 4: key RE characteristics measured, instruments used, theoretical frameworks or models, and statistical measures and results. Surviving entries: "Questionnaire developed for study (two parts; one deals with resilience)"; "Not indicated"; "1. Learning (L)/n = 11; 2. Flexibility (F)/n = 6"; "Analytic network process (ANP) used to quantify and determine the priorities of RE dimensions"; Zarrin and Azadeh [59], 71 employees, "Questionnaire developed for study, including RE (7)".]

The instruments varied in scope, with one measuring as few as three characteristics, while others covered up to thirteen. The 11 studies published from Iran, the country with the largest number of articles, did not utilize the same instruments. Three studies from this country [47,49,53] measured ten characteristics through thirty questions, while two [56,58] measured six characteristics through sixty-one questions. Four instruments [30,47,54,57] were associated with a theoretical framework or model, but one of these [30] has not been empirically tested or validated. One instrument [29] was limited to measuring at the behavioural level, while instruments 4 [46], 5 [47], 8 [50], 12 [54] and 14 [56] were based on the seven key themes identified by Wreathall [27], but also included an additional four characteristics: Fault-Tolerance, Self-organization, Teamwork, and Redundancy.
Instruments 2 [29] and 15 [57] measured characteristics that were different to those suggested by Wreathall in [27] or [19], while instrument 9 [51] was the only one to explicitly measure the four capabilities of a resilient organisation [19]. The instruments developed by teams led by Shirali and Azadeh appear to be central, as illustrated in studies 6, 7, 10, 11, 14 and 16 [48,49,52,53,56,59]. Two instruments measured Resilience Safety Culture (RSC) [56,57], while one measured Resilience Safety Climate [55]. This is consistent with a previous article [30], which suggested that both safety culture and RE could be examined through safety climate surveys. The sample sizes informing the surveys varied widely, with a minimum of 30 and a maximum of 564, and targeted either single levels, such as workers and/or operators, or a cross section across all levels.
The most common characteristics measured included six of seven themes suggested by Wreathall [27]. Specifically:

• Learning/learning culture was included in fifteen studies [30,45-

Behaviours [20], together with buffering, flexibility, margins and tolerance [23], were included in two instruments, while cognition [21,22], or the gap between work as imagined and work as performed [27,28], were mentioned in one study but did not feature in any instruments or empirical studies.
From a quantitative measurement point of view, however, there is no clear agreement on which of these act as independent variables, dependent variables (outcomes) or mediating variables. For example, Azadeh, Salehi, Ashjari and Saberi [47] suggested that all ten characteristics they investigated were outcomes. Pęciłło [51], on the other hand, used all the characteristics as independent variables. The model generated by Chen, McCabe and Hyatt [55] is complex but did include unsafe events as a dependent variable, measured through physical injuries and job stress, which represent one aspect of RE outcomes, i.e., failures. Shirali, Shekari and Angali [56] simply referred to the characteristics they measured as variables. In this regard, the characteristics measured may best be summarized as key dimensions that represent the construct of RE.
With respect to validity assessments, the Content Validity Ratio (CVR) and Content Validity Index (CVI) for instruments 6, 8 and 10 were greater than 0.70 [48,50,52]. Three instruments (4, 11 and 16) utilized a relative weighting of characteristics. The RSC assessed in instrument 14 [56] had one characteristic out of six with an α less than 0.70, similar to instrument 15 [57], which had one out of thirteen. For the latter instrument, the values obtained for the overall α, the CVR and the CVI were all greater than 0.70.
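The text does not state which formulas the reviewed studies used for the CVR and CVI. A common choice, assumed here, is Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item "essential" and N is the panel size, together with an item-level CVI defined as the proportion of experts rating an item 3 or 4 on a 4-point relevance scale. The sketch below illustrates both under those assumptions; the function names and expert ratings are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR for one item: ranges from -1 to 1."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(ratings, relevant_from=3):
    """Item-level CVI: share of experts rating the item as relevant.

    Assumes a 4-point relevance scale where 3 and 4 count as relevant.
    """
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(r >= relevant_from for r in ratings) / len(ratings)


# Hypothetical panel of 10 experts, 9 of whom rate an item "essential".
cvr = content_validity_ratio(n_essential=9, n_experts=10)

# Hypothetical relevance ratings from 5 experts for the same item.
cvi = item_cvi([4, 3, 4, 2, 4])

print(f"CVR = {cvr:.2f}, I-CVI = {cvi:.2f}")
```

Acceptable CVR cut-offs depend on panel size in Lawshe's approach, so a single 0.70 threshold should be read with the panel size in mind.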

Discussion
This research was aimed at informing a theoretical framework for measuring and benchmarking RE for organisational safety by reviewing how it has been conceptualized and defined in quantitative studies, the instruments used, and the psychometric properties of the studies. The development of a coherent framework for measuring RE is an important first step for conducting benchmarking and evaluation studies within and across industries.
The results suggest there is a wide diversity in the way RE has been conceptualized, so there is no unified understanding of what exactly it is or is not. This resonates with previous findings [9-12], so this is likely to remain a contested area for research and practice. Only a limited number of studies proposed or used a theory or model, and only a few were developed from a collective body of knowledge. Moreover, while additional characteristics have been identified, these add to the ambiguity and confusion about what RE is or is not, so debates around the best ways of measuring RE, or more specifically, resilience potential [35], will continue. In this regard, quantitative research in RE remains theoretically fragmented. However, what is clear is that it is an organisational construct, which is multidimensional [30,47,50], multi-level [29,30] and multi-factorial [29,30,53,57], so it can be evaluated and investigated at any or all of these. Moreover, it is associated with both human and organizational performance [30,47,54] and culture [30,53,57], and is concerned with boundaries of operations [30,46], adaptation [30,46], continuity of operations [29,51] and preparedness [48,52,55]. Collectively, these make it a relatively complex construct because researchers need to focus not only on a wide range of dimensions, levels, and factors, but also on the interactions within and between these [8,61].
The results also showed that structured surveys were the most common instrument used. Methodologically, such surveys can capture responses from a much wider range of informants and across multiple levels [62,63]. The dimensional approach used in the surveys offers the advantage of focusing on the specific characteristics of interest. However, the survey instruments varied widely in terms of organisational settings and contexts, range of characteristics, and number of questions for measuring each characteristic. In this regard, the development of measurement instruments has been limited to adding new characteristics to a previously used instrument and offering new and presumably refined indicators and rankings. However, only four studies [30,47,54,57] were supported with a theory or model. In the absence of any associated theory, the inclusion of these additional characteristics can be a cause of confusion among those seeking to build on that work. Reliability and validity were the two most common statistical attributes measured.

A Predominant Dimensional Approach
This review highlights very clearly the predominance of the characteristics of resilient organisations proposed by Wreathall and Merritt [27,64]. Early studies suggested that these were akin to leading indicators of organisational health in High Reliability Organisations (HROs) [65]. While there are some common terms and ideas between these, there are also a few philosophical differences. For example, response, learning, flexibility, and culture are common to both, and resilience is one of the core principles associated with HROs [65]. However, the HRO view treats resilience as something a system has, so it is reactive, while in RE it is a proactive process, something that a system does [18]. In addition, HRO theory associates resilience with containment, not anticipation, while RE is associated with both anticipation and response as systems develop mechanisms to create foresight to recognize and defend against paths to failure [5]. Furthermore, HROs focus on the cognitive capabilities of collective mindfulness of organisational behaviour, learning from failures in order to anticipate, contain and recover from events and mishaps and manage risks through standard routines and protocols [65]. In contrast, RE focuses more on proactive processes and concentrates more on successful outcomes and safety managed through flexibility [5,23,26].
Paradoxically, the key attributes proposed by Hollnagel [17,19] were mobilized in only one instrument, despite the fact that this author is one of the key seminal authors and gurus of RE. This paradox, however, is only apparent. Indeed, the four capabilities described by Hollnagel [61] through his Resilience Analysis Grid (anticipating, monitoring, responding and learning) can be mapped onto some of the key themes proposed by Wreathall [27] and widely represented within the instruments analysed. Thus, "flexibility" refers to the ability to "respond," "Learning Culture" to the ability to "learn," "situational awareness" to the ability to "perceive/monitor" and "preparedness" to the ability to "anticipate." Finally, the coupling between the elements of the safety culture model proposed by Reason [66,67] and the four capabilities of Hollnagel [17,19] interacts with six of the seven key themes proposed by Wreathall [27] and widely mobilized within the instruments. The psychometric results of these instruments suggest they are robust and will allow for the quantitative assessment of resilience potential in organisations. The six most common characteristics can be used to inform an integrative theoretical framework and the main dimensions for a survey instrument that can be used across the broader industry.

Going Beyond the Predominant Dimensions
The instruments developed and used since 2013 clearly suggest that, while the key themes suggested by Wreathall [27] are important, a comprehensive measurement of RE requires going beyond those characteristics. Based on the findings of this review, important characteristics that need to be considered in operationalizing RE instruments include:

• The available margins of manoeuvre: Resilience, according to Hollnagel [19], assists organisations to adjust their functions before, during, or after any changes or disruptions in order to continue normal operations under both expected and unexpected conditions, so it is very closely linked with adaptation. Indeed, resilience involves a return to its adaptive capacity in the face of new forms of variation and challenges [23]. For this reason, the available margins of manoeuvre are necessary for adaptation. Drawing on French-language ergonomics [68-71], the notion of margins of manoeuvre was introduced in RE in relation to resources that are set aside but can be used to cope with unexpected demands and perturbations, so that the system continues functioning instead of shutting down or reducing operations [72]. A lack or low level of margins of manoeuvre reduces the adaptive capacity of organisations, making them vulnerable to variations induced by the occurrence of perturbations. The available margins of manoeuvre at the sharp-end level have been suggested to depend on two main factors within the organisation: (i) the degree of prescription/regulation; (ii) the degree of organisational control [73]. One of the instruments reviewed [53] investigated margins of manoeuvre by evaluating the closeness of its operating system in relation to its boundary of safe performance, and tolerance by gauging how well it worked at the borders when subjected to increased pressures and adaptive capacity. Four other instruments also measured tolerance [45,46,49,58]. Collectively, these capture the key facets that can be used to measure the available margins of manoeuvre.
• Strengthen dimensions in relation to trade-offs between production and safety: The social creation of safety in RE involves the effective management of trade-offs between production and safety (P/S) [24,26,74,75], which involve sacrificial decisions being made at all levels of the hierarchy and affect the overall safety level of the organisation. The ability of an organisation to perform under conditions of high pressure through effective management of production/safety trade-offs is fundamental in RE. In the instruments analysed in this review, this is taken into account, but mainly at the level of the "management commitment" dimension, i.e., at the blunt-end level. However, the number of specific questions aimed at evaluating this remained modest. It would therefore be appropriate to add more questions for evaluating trade-offs between P/S, considering a breakdown of these issues across several dimensions, and questioning the sharp-end level. It might also be relevant to add a specific dimension.

• Introduce a dimension dealing with managed safety vs. prescribed safety: At an operational level (i.e., sharp-end level), the degree of prescription within an organisation will more or less constrain front-line operators [76]. One of the indicators of RE (or lack of it) has been suggested to be the gap between work as prescribed and work as done [27,28], so monitoring and managing such gaps are important in driving safety achievements. According to Morel, Amalberti and Chauvin [26] and Dekker [28], the greater the degree of prescription within an organisation, the less autonomy and flexibility front-line operators will have to adapt and deal with the occurrence of perturbations. Conversely, the lower the level of prescription, the greater the autonomy and margins of manoeuvre of front-line operators. From an RE perspective, this highlights two distinct forms of safety: one that is managed by the actors, and one that is prescribed by the organisation [26,75,77,78]. Managed safety, which involves the adaptation of prescriptions to suit the context of local work situations, depends not only on the autonomy of the actors but also on their level of competence and expertise [26]. For this reason, any new instruments would need to incorporate questions that interrogate not only the level of prescription within organisations but also the autonomy of the actors, as well as their levels of competence and expertise. It should be noted that these three points are intrinsically linked to the issues raised above, which concern the trade-offs between performance and safety, on the one hand, and the margins of manoeuvre on the other.

Culture of Resilience vs. Climate of Resilience
Early conceptions of RE suggested it was linked to an organisation's culture [24,25]. This has resulted in new instruments for measuring RE, mostly through safety culture [30,56,57], although one attempt has also been made with safety climate [55]. This is consistent with a recent integrative review, which found that safety culture was referenced more often than safety climate in organizational system models of safety, such as RE [79]. However, there have been some criticisms of associating RE with safety culture, with suggestions that many high-risk and complex organisations were already well advanced in their safety culture, and specifically in the management of anticipated and unanticipated events, so there was no need for RE [80,81]. Similar assertions can also be associated with authors such as Reason [66,67], who argued that resilience, anticipation, senior management commitment, monitoring, feedback and flexibility in dealing with ill-defined hazardous conditions were all important in achieving safety culture.
Beyond this, however, the problem of choosing whether to name instruments resilience safety culture or resilience safety climate is reminiscent of the continued conceptual confusion surrounding these notions [82–84]. Safety culture has been suggested to be embedded in the deeper core layer of an organisation's central assumptions about safety, but expressed in the middle layer (through beliefs, attitudes, motives, norms and espoused values); it is intangible and has a distal influence on an organisation's performance [79,83]. Safety climate, on the other hand, includes the perceptions of the middle and outer layers, is more dynamic, and can be more readily applied to system-based safety models [79,84,85]. In addition, safety climate is a more proximal indicator of safety [79].
Choosing between safety culture and safety climate also guides the type of instrument, as the two do not necessarily rely on the same methods [82]. Most evaluations of safety climate have involved quantitative methods, such as structured questionnaires. The assessment of safety culture, on the other hand, relies more on qualitative methods (e.g., observations and interviews). This is partly because safety climate is visible through attitudes and practices, while safety culture is reflected in the values and beliefs that underlie those attitudes [83].
According to Flin, Mearns, O'Connor and Bryden [84], safety climate, associated with the perceptions of safety for a specific location at a given time, was relatively unstable and subject to change, depending on the characteristics of the work environment.
Some authors have argued that, due to the elusive nature of safety culture, it is questionable whether it can be measured through scientifically sound methods [86], with some arguing that safety climate reflects safety culture [84,87]. What is actually being measured, according to Guldenmund [86], is safety climate, which represents the "measurable aspect of safety culture" (p. 29, [86]). This is also alluded to by Pillay, Borys, Else and Tuck [30], who suggested that RE could be measured through safety climate surveys. In addition, a comprehensive evaluation of safety climate research by Zohar [85] concluded that three key aspects which resonate with RE needed to be investigated. These were the relative priorities of competing demands (ways in which safety was prioritized in comparison to other goals, such as productivity or efficiency), gaps between words and deeds (management statements regarding the prioritization of safety, which were compromised under operational demands) and the local adaptation of policies and procedures. The first of these involves trade-offs between safety and efficiency, the second is consistent with the gap between work as imagined and work as done, and the third with managed versus prescribed safety. These provide the basic foundations for measuring RE. In other words, the measurable facet of RE is resilience climate.
In terms of informing an instrument for measuring and benchmarking RE across the general industry, this review has identified that structured questionnaire surveys are the most common approach, which represents a way forward. However, given the discussions above, the terminology "Resilience Climate Questionnaire" better captures the key characteristics to be measured. Moreover, the choice not to add the word "Safety" to the title is justified because the concepts of Resilience and Safety are not synonymous and may even be contradictory [75]. Avoiding any contradiction in the title of the instrument itself will make it clear that the aim is not to measure either safety climate or safety culture but to focus on resilience.

A Unified Definition and Integrative Framework
Based on this review, an integrated framework for benchmarking and measuring RE potential across the general industry can be proposed. However, as there is no clear definition of RE, it is important that this be clarified first, in order to set boundaries and provide a focus. The authors propose the following unified definition: "RE is a perspective for organisational safety management which enables organisational members to actively anticipate, respond, monitor and learn; by adapting to operate at the boundary of safe operations by narrowing the gap between work as imagined and work as performed; and manifested in an organisation's culture, cognition and behaviours". Framing RE in this way highlights several key points. Firstly, RE is a perspective, so it represents a philosophical shift. In effect, this shift is proactive and addresses the need for organisations to adapt to changes and threats prior to, during and after disruptions, or in the course of normal work operations, consistent with Safety II [3,18]. Secondly, although an individual can have all the attributes of resilience, it is only when they are inherent across the organisation that these play a role in RE [10]. Thirdly, the collective capabilities enable the organisation to anticipate, respond, monitor and learn, as suggested by Hollnagel [17,19]. Fourthly, it is about narrowing the gap between work as imagined and work as performed [27,28], which leads to adaptive performance at the boundary of safe operations. Finally, it is manifested in the organisation's culture, cognition and behaviours.
RE can also be operationalized and measured quantitatively as resilience climate. In summarizing the key characteristics discussed previously, it would therefore be relevant to consider the development of a new survey instrument, in the form of a Resilience Climate Questionnaire. This should include, as a minimum, six of the seven most common themes identified by Wreathall [27]: top management commitment, just culture, learning culture, reporting culture, flexibility and awareness [30,45–50,52,54,55,58,59]. Other characteristics can also be incorporated, provided their inclusion is supported with adequate theory. Figure 2 presents an integrative framework that captures the collective themes informing a comprehensive Resilience Climate Questionnaire, and acts as a new model for advancing research and practice in RE for organisational safety.
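As an illustration only, the sketch below shows how responses to such a questionnaire might be aggregated into one score per dimension. The item identifiers (q1 to q12), the two-items-per-dimension mapping, the 1 to 5 Likert scale and the function names are all hypothetical assumptions made for this example; none of them comes from the instruments reviewed here.

```python
from statistics import mean

# Hypothetical mapping of questionnaire items to the six dimensions.
# Item IDs and the number of items per dimension are illustrative assumptions.
DIMENSIONS = {
    "top_management_commitment": ["q1", "q2"],
    "just_culture": ["q3", "q4"],
    "learning_culture": ["q5", "q6"],
    "reporting_culture": ["q7", "q8"],
    "flexibility": ["q9", "q10"],
    "awareness": ["q11", "q12"],
}


def dimension_scores(response, dimensions=DIMENSIONS):
    """Mean Likert score per dimension for one respondent.

    Unanswered items (missing or None) are skipped; a dimension with
    no answered items is scored as None.
    """
    scores = {}
    for name, items in dimensions.items():
        answered = [response[i] for i in items if response.get(i) is not None]
        scores[name] = mean(answered) if answered else None
    return scores


def climate_profile(responses, dimensions=DIMENSIONS):
    """Average each dimension score across respondents (group-level profile)."""
    per_person = [dimension_scores(r, dimensions) for r in responses]
    profile = {}
    for name in dimensions:
        values = [p[name] for p in per_person if p[name] is not None]
        profile[name] = mean(values) if values else None
    return profile


# Made-up responses from two respondents, for demonstration only.
example = [
    {f"q{i}": 4 for i in range(1, 13)},
    {f"q{i}": 2 for i in range(1, 13)},
]
print(climate_profile(example))
```

Averaging items within a dimension before averaging across respondents keeps each respondent's weight equal even when some items are left blank. Other aggregation rules are equally defensible and would need to be justified by the theory underpinning the instrument.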

Strengths, Limitations and Implications
There are several strengths of this review. It utilized a positivist paradigm to investigate objective instruments and measures [10,35] and applied a pragmatic theoretical perspective to focus on practical outcomes [36]. The researchers used a structured approach for searching and selecting articles. The title, abstract and full-text article searches and selection were carried out by two independent RAs, and the final set of articles was also subjected to a process of critical appraisal. Collectively, these demonstrate a high degree of rigour in the review process, and support confidence in the results and discussions.
There are also some limitations of this review. The exclusion of other bibliographic databases (such as Ei Compendex or Inspec) could have led to overlooking articles published, for example, from a systems engineering perspective. Only one keyword was used in the search and selection criteria, so articles that did not use this term could have been missed. The search was also limited to English-language articles; non-English-language papers were not searched, and this may also have caused us to miss a few articles.
Despite these limitations, this review is one of the first to focus specifically on the identification of measurements and instruments in RE, so it addresses a significant research-practice gap in the field. It advances previous work on the identification of RE indicators for safety management [12]. As RE is multilevel, multi-dimensional and multi-factorial, the integrated model illustrated in Figure 2 can be used to inform a practical survey instrument for measuring, benchmarking, and improving RE as an organisational safety strategy. Findings from this review suggest that six key dimensions provide a good starting point for measuring RE through resilience climate, which is the measurable facet of RE. Consistent with safety climate studies, these six dimensions can be used as independent variables.
This review also identified that a few studies used additional characteristics to measure RE, which have been summarized in Figure 2. Although these capture the complexity of the RE construct, they can also be a source of confusion. Their inclusion needs to be supported with appropriate theory that identifies the type of variable (independent, mediating or dependent) each is expected to represent.
This article also advocates the use of questionnaires as an instrument to assess resilience climate in organisations. Such an instrument is simply a device for collecting information that provides insights into the specific research questions being investigated. For this reason, it is important to understand the advantages and limitations of its usage.
Some of the advantages of questionnaires as used in general research include:
i. A well-designed one can be used to collect huge quantities of data;
ii. Respondents can be sourced from a wide range of contexts and levels of the organisation;
iii. They are relatively inexpensive to administer;
iv. They can be administered in different ways;
v. Very little training is required to develop them;
vi. Response rates for some types of instruments, such as group-administered questionnaires, can be higher;
vii. They can be easily and quickly analysed [62,63,88].
The papers informing this review did not clearly discuss the advantages of using questionnaires. However, previous research on measurements of similar organisational constructs, such as safety culture, has suggested that the ease and speed of implementation, reproducibility, and the ability to offer comparisons between organisations and groups make them an attractive alternative [89–92].
Questionnaires also have several limitations, most of which are due to the poor design of instruments. For example:
i. It becomes difficult to avoid leading questions;
ii. Questions can contain multiple sets of ideas so can become complicated;
iii. Some questions, such as those on age, gender or the respondent's specific role, can be irritating if no context is provided for their response;
iv. They may be ambiguously worded [63,88];
v. The subjective use of scales can induce variations in perceptions among respondents [91];
vi. Prior commitment from management and supervisors is required to enable operators to complete these during their work hours [93];
vii. The results generated from questionnaires also provide a superficial description of the organisation [92,94].
In order to address some of these limitations, more than one method can be used to provide greater insights into the key characteristics being measured. Examples include supplementing questionnaire surveys with qualitative methods, such as audits of workplace practice [30], experts' views [46] or semi-structured interviews [53].
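Alongside such triangulation, limitations related to item wording and scale use are commonly examined with an internal-consistency statistic. The sketch below computes Cronbach's alpha for the items of a single dimension, as one possible check. It is a minimal illustration that assumes complete responses on a numeric scale; the function name and the data are made up for the example and do not reproduce any analysis from the studies reviewed.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one dimension.

    item_scores: list of respondents, each a list of k numeric item scores
    (complete cases only). Uses sample variances throughout.
    """
    n = len(item_scores)
    k = len(item_scores[0]) if n else 0
    if n < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up scores for three items from four respondents, for demonstration only.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
]
print(round(cronbach_alpha(scores), 3))
```

Alpha describes only how consistently the items of a dimension vary together. It does not show that the items measure the intended construct, which is why qualitative methods remain a useful complement.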

Conclusions
This review examined the conceptualization, definition and measurement of RE, identifying the instruments used and the psychometric measures tested, in order to inform a theoretical framework and measurement instrument that can be used to advance research and practice in the field. A positivist paradigm and the theoretical framework of pragmatism, together with a structured integrative review method, were used to enhance the rigour of the research process. The search across six comprehensive databases generated over 3900 articles, of which 17 were included in the final synthesis. In total, 15 survey instruments were identified, but only four of these were supported with theory. In total, 11 of these were from Iran. A minimum of three and a maximum of 13 dimensions were used in the surveys; however, there was a wide diversity in the number of questions asked for each dimension. There was also wide diversity in the conceptualization and definition of RE, which suggests that it is a multidimensional, multifactorial, multi-level construct which exists across an organization, and manifests through culture, cognition and behaviours. A unified definition and an integrative model for informing a Resilience Climate Questionnaire, for measuring RE potential, are proposed and are currently being tested at the University of Newcastle, Australia. While questionnaires do offer several advantages, they also have limitations, so supplementing them with qualitative methods will help alleviate some of these.