A Mixed Rough Sets/Fuzzy Logic Approach for Modelling Systemic Performance Variability with FRAM

Abstract: Understanding systemic functioning and predicting the behavior of today's sociotechnical systems is a major challenge facing researchers due to the nonlinearity, dynamicity, and uncertainty of such systems. Many variables can only be evaluated in qualitative terms due to their vague and uncertain nature. In the first stage of our project, we proposed the application of the Functional Resonance Analysis Method (FRAM), a recently emerging technique, to evaluate aircraft deicing operations from a systemic perspective. In the second stage, we proposed the integration of fuzzy logic into FRAM to construct a predictive assessment model capable of providing quantified outcomes and thus more intersubjective and comprehensible results. The integration of fuzzy logic was thorough and required significant effort due to the high number of input variables and the consequently large number of rules. In this paper, we aim to further improve the prototype proposed in the second stage by integrating rough sets as a data-mining tool to generate and reduce the size of the rule base and to classify outcomes. Rough sets provide a mathematical framework suitable for deriving rules and decisions from uncertain and incomplete data. The mixed rough sets/fuzzy logic model was applied again to the context of aircraft deicing operations, keeping the same settings as in the second stage to allow a better comparison of the results. The obtained results were identical to those of the second stage despite the significant reduction in the size of the rule base. However, the model presented here is a simulated one, constructed with ideal datasets accounting for all possible combinations of input variables, which resulted in maximum accuracy. The model should be further optimized and examined using real-world data to validate the results.


Introduction
Resilience Engineering is a discipline concerned with designing and constructing resilient systems, i.e., systems with the ability to cope with complexity and adapt to unforeseen changes and performance variability [1]. Classical approaches are no longer sufficient in the age of complexity to provide a complete and comprehensive picture, and a trend shift has occurred in recent years, leading to the introduction of innovative methods that present a systemic perspective in systems analysis. One of the main methods in Resilience Engineering is the Functional Resonance Analysis Method (FRAM) [2,3]. The principles on which FRAM relies (Figure 1) allow complex systemic behavior to be understood as a result of performance variability and its combinations. The idea in systemic approaches is that undesired outcomes are not simply and entirely explainable in terms of singular component failures, errors, or sequential events. The natural deviation of performance from prescribed procedures and routines is an inherent characteristic of any system and is sometimes even necessary to cope with unexpected changes in performance conditions [1]. This is what Hollnagel defines as the difference between Work-As-Imagined (WAI) and Work-As-Done (WAD) [4]. This new understanding of how undesired outcomes develop led to a redefinition of what constitutes safety. Instead of simply looking at "what goes wrong", one could look at "what goes right" as well to better manage performance variability and provide resilient systems [4]. The main advantage is that what goes right occurs far more often than what goes wrong, the latter being especially rare in high-reliability systems such as aviation. This new approach to safety is known as SAFETY-II, in contrast to the classical approach known as SAFETY-I [4]. In the study of complex systems, assessing instances of interest in quantitative terms can become difficult and sometimes even impossible.
One might be forced to rely in many cases on qualitative measures to assess the magnitude of such instances, which might not be adequate or precise to a sufficient degree. Especially in the case of imperfect and vague data, estimating and determining the values of such variables become a difficult objective to achieve. Generally, people find it more difficult in such cases to express their evaluation in terms of numbers and prefer to use natural language [5]. Qualitative tools such as FRAM can prove helpful in such cases by maintaining a systemic perspective and enabling the analyst to evaluate such complex instances using words. The rising popularity of FRAM in recent years has translated into a significant body of research addressing several forms of applications across different fields and disciplines such as aviation [6][7][8], construction [9], healthcare [10][11][12], education [13], etc. Despite the many advantages offered by FRAM, there still exists room for improvement and development to apply FRAM in a more standardized and reliable manner. One issue with qualitative scales generally can be attributed to the ambiguity and vagueness of the used scales, which might result in different perceptions of the associated magnitudes [4]. People can perceive the magnitude and meaning of the same words and terminology differently. Additionally, it is not always easy to determine the outcome of an evaluation when using qualitative terms [14]. One must apply natural language to describe the system in question, relying on experience (historical data) and the expertise of field experts, which might not always be accessible for the analyst. Since its introduction, significant research projects have been initiated to explore the possible benefits of FRAM and introduce innovative models building on its principles. 
One of the earliest studies in that direction was Macchi (2009), which aimed at introducing an aggregated representation of performance variability by using ordinal numerical scales [15]. Another contribution to the body of research on FRAM was provided by Rosa et al. (2015), who proposed a FRAM framework using the Analytic Hierarchy Process (AHP) to provide a numerical ranking based on the comparison of pairs of criteria [9]. The work of Patriarca et al. provided several significant contributions in that regard [16][17][18][19]. The study proposing a semi-quantitative FRAM model relying on Monte Carlo simulation and using discrete probability distributions to better represent functional variability was especially interesting for our project [19]. Another study applying a probabilistic approach was conducted by Slater (2017), who constructed a FRAM model as a Bayesian network of functions [20]. Lee and Chung (2018) proposed a methodology for accident analysis based on FRAM to quantify the effect of performance variability in human-system interactions (HSI) [21].
Research efforts to address issues related to uncertain information go back to the 20th century and led to the introduction of new solutions for dealing with imperfect knowledge such as Fuzzy Set Theory [22] and Rough Set Theory (RST) [23]. The use of fuzzy logic has been especially successful in that respect since its introduction in 1965, and several frameworks were consequently developed and proposed to push the research further and provide better and more reliable results. In the previous stage of our project [24], we addressed the lack of quantification and aimed at presenting a possible approach to overcome this limitation by integrating fuzzy logic into the framework of FRAM. The link between fuzzy logic and FRAM was also identified by Hirose and Sawaragi (2020) [25], who proposed an extended FRAM model based on the concept of cellular automata, applying a fuzzified CREAM (Cognitive Reliability and Error Analysis Method) [26] to connect the functions and visualize their dynamics [25]. The fuzzy logic-based approach enabled us to compute with natural language and provide quantifiers for the quality of output through the construction of fuzzy inference systems.
However, despite the many advantages offered by fuzzy logic in providing quantitative outcomes for vague concepts and natural terminology, several limitations were nonetheless faced in our prototype. Firstly, the analyst is forced to develop a quantitative range of values and partition the universe of discourse to determine membership values, which might be relatively subjective. If the number of input variables is high, the issue of "rule explosion" can render the model infeasible and difficult to realize. To avoid the rule explosion problem and the construction of a resource-demanding model, the number of variables and associated classes was limited. Additionally, one might not always be able to determine the outcome or the class of the decision by relying solely on qualitative scales written in natural language. The outcome might itself be variable and vague in nature due to the incompleteness of the needed information or the vague nature of the input variables and the outcome itself. In this stage of our project, we aim at further improving our hybrid model by proposing the application of Rough Set Theory (RST) to handle the input data, generate a more efficient rule base, and classify the outcome. RST can be helpful in that case by filtering the available information and identifying patterns to determine the outcome based on data tables (instances, cases, statistics, etc.) [27]. RST is a data-mining tool capable of deriving decisions from a minimal set of rules generated for input data in the presence of uncertain and vague information. The applications of RST have been especially beneficial in the fields of Artificial Intelligence, Knowledge Engineering, Decision Making, etc. RST employs the notion of approximation space, which states that each object in the universe of discourse can be associated with at least some information [28].
Objects possess certain shared attributes, which form information that allows them to be compared and discerned with respect to the outcome. Objects characterized by the same information are indiscernible, i.e., they cannot be distinguished in relation to the available information about them. The indiscernibility relation forms the mathematical foundation of rough set theory [28]. The generated rule base would then be applied in the Fuzzy Inference System (FIS) to generate quantified outcomes for the functions. Finally, the results obtained by this hybrid model are compared with the results of the prototyping model for validation and drawing conclusions. In the next section, a brief overview of the features of RST is provided, and the proposed approach is presented afterwards.

An Overview of Rough Set Theory (RST)
Rough set theory is a mathematical framework proposed by Pawlak in 1982 for handling datasets and computing approximate decisions given a defined set of attributes [23]. Datasets in RST are organized in two-dimensional matrices (tabular form), called information systems [28]. The columns in an information system represent the condition attributes, and the rows represent the objects in the observed universe of discourse [28]. For each object, a value can be assigned with respect to the observed attribute. The decision attribute, which is the conclusion drawn from the values assigned to the condition attributes, is provided in the final column of the information system (Table 1). The information system is then called a decision system. Mathematically, it can be represented as IS = (U, A, V, D), where IS is the decision system; U is a finite set of objects; A is a finite set of attributes; V is the set of assigned values; and D is the decision class [28]. In RST, knowledge depends on the ability to classify objects, which represent real or abstract things, states, concepts, instances, or processes [29]. RST analyzes data tables in terms of equivalence classes by identifying patterns and shared attributes to discern the objects. Equivalence classes are subsets of the original set and represent indiscernible objects, which cannot be distinguished from each other by examining their attributes. The notion of a rough set defines a set of objects that cannot be described exactly by these equivalence classes, since it overlaps with at least one of them [30]. A rough set, in contrast to a crisp set, has a boundary region, in which objects cannot be classified with certainty as either members or non-members of the set [27]. In other words, the available information on the objects in question is not sufficient to classify them as definite elements of the set or not.
Therefore, a rough set is defined in terms of a pair of sets: the lower and the upper approximations. The lower approximation is the set of all objects that certainly belong to the original set, and the upper approximation is the set of all objects that possibly belong to the original set.
Following our definitions above, let U be a finite set of objects and A a finite set of attributes; then, for every a ∈ A, there exists a set of values V_a such that a function a: U → V_a can be determined. Let B be any subset of A; then, a binary relation I(B), called an indiscernibility or equivalence relation, can be defined as follows [27]:

I(B) = {(x, y) ∈ U × U : a(x) = a(y) for every a ∈ B}

The equivalence class containing an element x can then be denoted B(x), and any x and y with (x, y) ∈ I(B) are called B-indiscernible [27]. The approximations of a set X ⊆ U can be defined as follows [27]:

B_*(X) = {x ∈ U : B(x) ⊆ X} (lower approximation)

B^*(X) = {x ∈ U : B(x) ∩ X ≠ ∅} (upper approximation)

The boundary region BN_B(X) is defined as the difference between the upper approximation B^*(X) and the lower approximation B_*(X). The set X is called rough with respect to B if the boundary region is not empty. A rough decision class is one that cannot be uniquely represented by the input data for the respective attributes. The usefulness of this method lies in the ability to approximate X using only the information provided by B through the upper and lower approximations of X. This concept of approximations allows for computing reducts in data tables, which is explained in the following section.
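As a minimal illustration of these definitions, the following Python sketch computes equivalence classes, the lower and upper approximations, and the boundary region for a toy decision table; the attribute names and values are hypothetical and chosen only for demonstration:

```python
# Toy decision table: objects (rows) with condition attributes and a
# decision class. Attribute names and values are purely illustrative.
table = {
    "o1": {"weather": "adequate",   "crew": "adequate",   "decision": "non-variable"},
    "o2": {"weather": "adequate",   "crew": "inadequate", "decision": "variable"},
    "o3": {"weather": "adequate",   "crew": "inadequate", "decision": "non-variable"},
    "o4": {"weather": "inadequate", "crew": "inadequate", "decision": "variable"},
}

def equivalence_classes(table, attributes):
    """Group B-indiscernible objects, i.e. objects sharing the same
    value on every attribute in B."""
    classes = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes.setdefault(key, set()).add(obj)
    return list(classes.values())

def approximations(table, attributes, target):
    """Lower and upper approximations of the target set of objects."""
    lower, upper = set(), set()
    for eq in equivalence_classes(table, attributes):
        if eq <= target:   # class entirely inside X: certain members
            lower |= eq
        if eq & target:    # class overlapping X: possible members
            upper |= eq
    return lower, upper

X = {o for o, r in table.items() if r["decision"] == "variable"}
B = ["weather", "crew"]
lower, upper = approximations(table, B, X)
boundary = upper - lower   # {"o2", "o3"}: X is rough with respect to B
```

Here o2 and o3 share all attribute values but differ in decision, so both fall into the boundary region and X is rough with respect to B.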

Reducts
Reducts are reduced subsets of the original attribute sets, which preserve the same accuracy and essential information as the complete dataset [31]. Only the necessary attributes that provide the information needed to define and discern the decision class are contained in the reduct. Computing reducts is an essential concept in RST and serves the purpose of minimizing the rule base to provide more efficient computation and a simplified representation of large datasets. The decision table is examined to identify multiple reducts using a discernibility function (true or false) for each object, which is true for all attribute combinations that discern the object from the other objects in the dataset with different decisions [30]. Through the use of efficient algorithms for evaluating the significance of data and detecting hidden patterns, the RST approach allows determining sufficient minimal datasets (data reduction) and generating a set of decision rules [30]. This approach is easy to understand and offers a straightforward interpretation of the obtained results.
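Under the simplifying assumption of a consistent decision table, reducts can be found by exhaustively checking which minimal attribute subsets preserve the classification. The sketch below, with hypothetical attributes, illustrates the idea; dedicated tools such as Rosetta use far more efficient algorithms for large tables:

```python
from itertools import combinations

# Hypothetical decision table with three condition attributes.
table = {
    "o1": {"visibility": "good", "training": "adequate",   "fatigue": "low",  "decision": "non-variable"},
    "o2": {"visibility": "good", "training": "inadequate", "fatigue": "high", "decision": "variable"},
    "o3": {"visibility": "poor", "training": "adequate",   "fatigue": "high", "decision": "variable"},
    "o4": {"visibility": "poor", "training": "inadequate", "fatigue": "high", "decision": "variable"},
}

def preserves_classification(table, attrs):
    """True if no two objects with different decisions share the same
    values on all attributes in attrs."""
    seen = {}
    for row in table.values():
        key = tuple(row[a] for a in attrs)
        if key in seen and seen[key] != row["decision"]:
            return False
        seen.setdefault(key, row["decision"])
    return True

def reducts(table, attributes):
    """All minimal attribute subsets classifying as well as the full set."""
    found = []
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            if (preserves_classification(table, subset)
                    and not any(set(f) <= set(subset) for f in found)):
                found.append(subset)
    return found

reds = reducts(table, ["visibility", "training", "fatigue"])
# Two reducts exist: "fatigue" alone, or "visibility" with "training".
```

The example shows that a table can admit several reducts, each sufficient on its own to discern the decision classes.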

Decision Rules
The reducts are then used to generate the decision rules by constructing conditional IF-THEN statements [31]. The antecedent part (the IF part) is the conditional part derived from the values assigned to the condition attributes, while the consequent part (the THEN part) is the conclusion or decision class resulting from those values. The generated rules can be evaluated in terms of support, coverage, and accuracy [30]. The support of a rule represents the number of objects in the decision table that match the rule [30]. The coverage of a rule represents its generality, i.e., the number of objects with the same decision class in the decision table matching the IF-part of the rule [30]. The accuracy, on the other hand, represents the number of objects in the coverage group also providing the same decision, i.e., matching the THEN-part as well [30]. The objective is always to obtain higher accuracy and coverage to provide more validity for the generated rules. The same procedure is repeated for all equivalence classes to construct the rule base. The rules can then be used in a descriptive manner to identify patterns and better understand dependencies and relationships in the dataset. Historical data used to construct the rule base can then be used to predict possible outcomes as well.
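The three measures can be sketched as follows, using one common counting convention (support as the number of objects matching both rule parts, accuracy and coverage as the corresponding ratios); the decision table and the rule are hypothetical:

```python
# Hypothetical decision table and rule for the deicing context.
table = {
    "o1": {"fluid_quality": "adequate",   "time_pressure": "low",  "decision": "non-variable"},
    "o2": {"fluid_quality": "inadequate", "time_pressure": "high", "decision": "variable"},
    "o3": {"fluid_quality": "inadequate", "time_pressure": "high", "decision": "variable"},
    "o4": {"fluid_quality": "inadequate", "time_pressure": "low",  "decision": "non-variable"},
}

def rule_metrics(table, conditions, decision):
    """Support, coverage and accuracy of an IF-THEN rule.
    support:  objects matching both the IF- and the THEN-part;
    coverage: support relative to all objects with that decision;
    accuracy: support relative to all objects matching the IF-part."""
    if_part = [r for r in table.values()
               if all(r[a] == v for a, v in conditions.items())]
    both = [r for r in if_part if r["decision"] == decision]
    same_dec = [r for r in table.values() if r["decision"] == decision]
    support = len(both)
    coverage = support / len(same_dec) if same_dec else 0.0
    accuracy = support / len(if_part) if if_part else 0.0
    return support, coverage, accuracy

# IF fluid_quality is inadequate AND time_pressure is high THEN variable
s, c, a = rule_metrics(table,
                       {"fluid_quality": "inadequate", "time_pressure": "high"},
                       "variable")
# s == 2, c == 1.0, a == 1.0: the rule covers every "variable" object
# and misclassifies none of the objects matching its IF-part.
```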
Since its introduction, there have been many applications of rough sets, extending across numerous domains and fields, from Artificial Intelligence (A.I.) [32] and Pattern Recognition [33] to Risk Management [34], Data and Knowledge Engineering [35], Hospitality Management [36], Health Care [31,37], Marketing [38], Human Resource Management [39], etc. One especially interesting application of rough sets is their deployment as an assistive tool in decision-making, where decision rules are derived from historical information [40]. These decision rules can then be deployed to classify newly acquired information [40]. Based on a set of performance conditions, i.e., a set of values assigned to the attributes of the information system, a decision or quality class can be determined. The process of rule generation relies on the notions of indiscernibility and equivalence relations introduced by rough set theory. For this purpose, several algorithms can be used to identify the equivalence classes and the reducts within the provided information system and extract a minimal but sufficient rule base for classifying input data. Additionally, the relationship between rough sets and fuzzy logic is of special interest in this context. The concepts of rough sets and fuzzy sets differ in so far as they refer to different aspects of vagueness and uncertainty [41]. While fuzzy set theory defines vagueness in terms of a fuzzy membership function, rough sets use the concept of a boundary region to characterize imprecision in the available information [28]. However, different does not mean incompatible; rather, the two concepts are closely related and complement each other [42]. The application of rough sets in conjunction with fuzzy sets is not novel, since the two methods are related, and there exists in the literature a significant number of studies that combined the two approaches to utilize the advantages of both [43][44].
The definition of rough sets as fuzzy sets in fuzzy rough sets, or the opposite in rough fuzzy sets, are two examples of how close the two methods are [42]. In this paper, we propose a modified FRAM framework combining the RST approach, to analyze data and generate rules, with the FIS, to apply the generated rule bases and quantify the output's quality. It is important to note first that the objective of this study is not primarily to contribute to the improvement of either fuzzy logic or rough sets. Rather, this study utilizes well-established frameworks such as the Mamdani-Assilian inference system [45] and the RST model proposed by Aleksander Øhrn and the Rosetta development team [31]. The main contribution of this study is the proposal of a model combining the two tools within the framework of FRAM. The two methods present advantages in the evaluation of qualitative values expressed in natural language, which can be difficult to achieve with classical statistical methods [46][47]. The qualities of these methods are closely related to the concept of FRAM and can help to further improve FRAM and present a different approach. The application of rough sets for evaluating complex sociotechnical systems such as aviation generally, and deicing operations specifically, is therefore interesting and could provide insights and a new direction to be addressed by further research efforts in the future. The proposed approach is explained in the following section.

Proposed Approach
In this section, we present our proposal to modify the framework of FRAM to integrate rough sets and fuzzy logic. It is also important here to note that the objective is not to decompose the system into its components to linearize and simplify the relationships in question. Rather, the goal is to maintain a systemic perspective and propose an additional tool that can complement classical tools. Our objective is to combine the above-mentioned methods, which stem from different fields, to benefit from the advantages that each method presents and consequently overcome some of the limitations faced in our prototyping model. In the prototyping model, we proposed a framework to integrate fuzzy logic into FRAM [24]. Therefore, the basics of fuzzy logic and the details on how to construct a fuzzy inference system will not be discussed here in depth again; rather, only the aspects concerning the addition of rough sets and the combination of the two approaches will be explained. For further details on the design and construction of the fuzzy FRAM model, the reader is advised to consult our previous paper [24]. The addition of rough sets into the prototyping model is explained in the following five steps.

Step Zero
The FRAM framework in its basic form consists of five steps [3]. The first step is concerned with formulating the purpose of the analysis, i.e., the main function, process, or system to be evaluated [3]. The objective also defines whether the analysis is concerned with past events, to draw conclusions and learn lessons for the future, or is of a predictive nature, building on historical data to identify possibilities for success and failure.

Step One
The set of functions that constitute the analysis model are defined in an approach similar to task analysis. Functions represent a specific objective or task within the selected context of analysis [3]. After formulating the main function and objective in the first step, a list of subprocesses or functions is selected, which constitute the system at hand and taken together form all the steps needed to achieve the main function of the system. The number of functions defines the context and the boundaries of the model, where the background functions serve as the model boundaries and the foreground functions form the focus of the analysis [3]. For each function, there are six aspects: input, preconditions, time, control, resources, and output [3]. The first five aspects represent five types of incoming instances linked to the function from upstream functions. The output of the function represents its outcome, which is linked to downstream functions as one of the earlier-mentioned five aspects. The couplings among the functions define how the functions are linked together within the system. Additionally, for each defined function, a list of relevant Common Performance Conditions (CPC) [15] is selected, depending on the type and nature of the function itself, to account for the influence of the context on the performance of the function. The characterization of the functions is important for deciding what kind of data will be needed for the analysis. For a predictive assessment, data providing indications of the quality of performance conditions are needed. The data entered into the information system (decision table), where the rows represent the objects and the columns the attributes, will be utilized to form a discernibility matrix. Each cell in the information system contains a value assigned to the object in the same row with respect to the attribute in the same column.
The n×n discernibility matrix is then completed to identify, for each pair of objects with different decision classes, the condition attributes on which they differ; an empty entry reveals indiscernible objects that share the same attribute values despite having different decision classes. In other words, for each row and column in the discernibility matrix, the condition attributes that discern objects with different outcomes or decision classes are identified to determine a minimal but sufficient set of performance conditions.
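A sketch of how such a matrix can be derived from a decision table is given below; the attributes are hypothetical, and each entry lists the attributes that discern a pair of objects with different decisions:

```python
# Hypothetical decision table with two condition attributes.
table = {
    "o1": {"weather": "adequate",   "equipment": "adequate",   "decision": "non-variable"},
    "o2": {"weather": "inadequate", "equipment": "adequate",   "decision": "variable"},
    "o3": {"weather": "inadequate", "equipment": "inadequate", "decision": "variable"},
}

def discernibility_matrix(table, attributes):
    """For each pair of objects with different decisions, record the set
    of condition attributes on which the two objects differ. An empty
    set would flag indiscernible objects with conflicting decisions."""
    objects = list(table)
    matrix = {}
    for i, x in enumerate(objects):
        for y in objects[i + 1:]:
            if table[x]["decision"] != table[y]["decision"]:
                matrix[(x, y)] = {a for a in attributes
                                  if table[x][a] != table[y][a]}
    return matrix

matrix = discernibility_matrix(table, ["weather", "equipment"])
# ("o1", "o2") -> {"weather"}; ("o1", "o3") -> {"weather", "equipment"};
# the pair ("o2", "o3") is omitted since both share the same decision.
```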

Step Two
The characterization of performance variability for each function in our model takes place over two main steps: the characterization of the internal variability and then the characterization of the external variability. The functions in our previous model were defined in terms of two Fuzzy Inference Systems (FIS). The first FIS was concerned with the Internal Variability Factor (IVF) examining the CPC list to estimate possibilities for variability from within the function. The second FIS was of higher order and was concerned with characterizing the External Variability Factor (EVF) imposed on the function through the couplings with upstream functions.
The starting point here for inducing variability into the system is from within each function, using the IVF, which can be considered as a seventh aspect for the function. The background functions form the boundary of the analysis context and provide therefore an invariable outcome. The downstream functions, if only the functional couplings were to be considered, would have no way to produce variable outputs, since all incoming aspects would be invariable. Therefore, to provide the means for the analyst to predict or anticipate variable output, the IVF can be utilized. The IVF can be defined as the impact of the working environment and the present performance conditions at the time of execution of the function in question. For our model, we selected the CPC list, which was originally used as a part of the CREAM method [26].
The CPC list is not supposed to represent complex relationships among functions; it merely serves the purpose of representing the impact of the working environment on performance and helps the analyst anticipate possible variability in the outcome. The selection of a list of performance conditions or performance-shaping factors can be conducted in practice depending on the context, the nature of the system of interest, and the functions themselves, and is by no means limited to the CPC list. FRAM functions can be classified as one of three types in accordance with the MTO classification concept: huMan, Technological, and Organizational functions [15]. Depending on the MTO type, a list of relevant CPCs can then be selected. Each CPC will be evaluated as either "adequate" or "inadequate" to start with; the partitions can of course be extended to a three- or five-point scale; however, this would translate into a large number of rules and might create a demanding inference system later.
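As an illustration of this encoding step, the sketch below maps hypothetical CPC lists per MTO type onto the two-point scale; the CPC names and the cut-off on the ten-point score are assumptions for demonstration only, not values prescribed by the method:

```python
# Hypothetical CPC lists per MTO function type; the actual selection
# depends on the analysed system and is not prescribed by the method.
CPC_BY_MTO = {
    "human": ["adequacy_of_training", "working_conditions", "time_pressure"],
    "technological": ["equipment_condition", "working_conditions"],
    "organizational": ["quality_of_procedures", "communication"],
}

def rate_cpcs(mto_type, scores):
    """Map each relevant CPC score (0-10 scale) onto the two-point
    qualitative scale; the threshold of 5 is an assumed cut-off."""
    return {cpc: ("adequate" if scores.get(cpc, 0) >= 5 else "inadequate")
            for cpc in CPC_BY_MTO[mto_type]}

ratings = rate_cpcs("human", {"adequacy_of_training": 8,
                              "working_conditions": 3,
                              "time_pressure": 6})
# training and time pressure rate "adequate"; working conditions "inadequate"
```

The resulting discrete ratings are exactly the attribute values that feed the RST decision table described next.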
Whether the CPC list is needed entirely or only partially can be determined using the principle of indiscernibility in RST to identify the set of attributes necessary to preserve the same classification information as the original set (reducts). The two classes will serve firstly as the attribute values in the RST method and secondly as the membership functions, or the two partitions of the universe of discourse, in the FIS. Therefore, the qualitative scale with the two values will be used to construct the following: firstly, the data table to feed into the RST method; then, the reduced rule base generated by the RST will be used in the FIS. The dataset can be constructed from instantiations of the function in question by recording historical data for the same operation over time. For example, for the function "Deicing", the performance conditions present during the deicing of aircraft 1 are recorded, then aircraft 2, and so on until aircraft n. The more data entry points accumulated, the higher the accuracy and validity of the generated rules. The functions in the RST framework represent the objects, while the performance conditions are the condition attributes (Table 2). The decision attribute for the IVF can then be "non-variable", "variable", or "highly variable" (Table 2). The dataset is usually split randomly into two sets: one training set to identify the reducts and generate the rules, and one testing set to check the validity of the rules. The split factor of the dataset is determined first. Then, a suitable algorithm to explore the training set and identify the reducts is selected. The reducts are then determined, and the respective rules are generated. For each row in the discernibility matrix, a rule is formulated using logical "OR" and "AND" operators to form the antecedent part (the IF-part).
The outcome of the antecedent is determined in the consequent (the THEN-part), where logical "OR" and "AND" operators can be used if multiple decisions are possible. The generated rules can then be used to classify the testing set in order to validate them. The requirements for accepting the rules are determined by selecting the acceptable levels of coverage, support, and accuracy. The rules can then be examined by experts to verify the validity and meaningfulness of the final set of rules. Afterwards, the rule base can be migrated into the FIS of the function in question and applied as the rule base for classifying and quantifying the outcome. In the FIS, the analyst defines which metrics or assessment scales can be used to evaluate each CPC and assigns a value on a scale between zero and ten to each CPC. Figure 2 depicts the triangular membership functions for the input values for each CPC, whose universe of discourse is partitioned into the two earlier-mentioned classes: "inadequate" and "adequate". For the instantiation of the analysis scenario, a score on a ten-point scale is selected for each CPC to run the simulation. The numerical outcome of this first-order FIS is then the IVF, which can be calculated using the following formulae:

μ_i(v) = MIN(μ_1,i, μ_2,i, …, μ_N,i)

A(v) = MAX_i μ_i(v)

IVF = ∫ v · A(v) dv / ∫ A(v) dv

where μ_n,i represents the membership value of the n-th CPC for the i-th rule, MIN denotes the fuzzy minimum function, μ_i(v) is the implication value for each rule, A(v) is the aggregated value of all μ_i(v), and IVF is the final numerical score calculated by applying the centroid method. The second step in Step Two is to characterize the External Variability Factor (EVF). The same process can be repeated for the higher-order FIS for the external variability factor to reduce the rule base and allow for more efficient processing. The variability of the function's output is normally characterized in terms of timing and precision using a three-point qualitative scale.
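The inference scheme above (MIN implication, MAX aggregation, centroid defuzzification) can be sketched in Python as follows; the membership function shapes, the two-CPC rule base, and the discretization are illustrative assumptions, not the calibrated model:

```python
# Discretized universe of discourse [0, 10] for the IVF output.
STEP = 0.01
V = [i * STEP for i in range(1001)]

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Illustrative two-partition input and three-partition output sets.
IN_MF = {"inadequate": lambda x: tri(x, -10.0, 0.0, 10.0),
         "adequate":   lambda x: tri(x, 0.0, 10.0, 20.0)}
OUT_MF = {"non-variable":    lambda x: tri(x, -5.0, 0.0, 5.0),
          "variable":        lambda x: tri(x, 0.0, 5.0, 10.0),
          "highly variable": lambda x: tri(x, 5.0, 10.0, 15.0)}

# Hypothetical rule base over two CPCs.
RULES = [(("adequate", "adequate"), "non-variable"),
         (("adequate", "inadequate"), "variable"),
         (("inadequate", "inadequate"), "highly variable")]

def ivf(scores):
    """MIN implication, MAX aggregation, centroid defuzzification."""
    # Firing strength of each rule: fuzzy MIN over its antecedent.
    strengths = [min(IN_MF[label](s) for label, s in zip(antecedent, scores))
                 for antecedent, _ in RULES]
    num = den = 0.0
    for x in V:
        # Aggregate the clipped consequent sets with fuzzy MAX.
        agg = max(min(w, OUT_MF[cons](x))
                  for w, (_, cons) in zip(strengths, RULES))
        num += x * agg
        den += agg
    return num / den if den else 0.0

# Better performance conditions yield a lower (less variable) IVF score.
```

The same machinery extends to the higher-order EVF inference by replacing the input partitions with the three-point scale and, where appropriate, the MAX aggregation with SUM.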
The EVF is defined in our model as the external variability conveyed to the function through the couplings with upstream functions. We merged the impact of the timing and precision aspects to provide a simplified variability representation of the output (Figure 3). In the model proposed here, this characterization was further simplified and an aggregated representation is proposed for two reasons. First, in a predictive or proactive assessment, it is not always straightforward or obvious what the outcome on a qualitative scale should be. There exists a level of vagueness due to the different perceptions of the meaning of words between people, the vague nature of the input variables themselves, or the lack of quantitative or standardized protocols to determine the result. For some instances, it is difficult or hardly possible to predict (in a predictive assessment) how the variability will manifest itself. One might be able to anticipate that, in the case of variable resources or preconditions, the output would be variable as well. However, it could be difficult to determine whether the variability would manifest in terms of timing or precision. It is easier to state that an output would be variable to a certain degree, whether negatively or positively, than to identify how the variability would translate in practice, for example, as a delay, imprecision, earliness, or timeliness. Secondly, the more classes we assign to the input variables, the more rules we obtain and the more demanding the construction process of the model becomes. Therefore, to design an efficient model, it is advisable to keep the number of classes low at this stage, without, however, trivializing and reducing the meaningfulness of the obtained results.
The impact of the two phenotypes is combined to produce a three-point scale: "non-variable" accounts for positive or neutral impact; "variable" represents outputs with low and medium variability; and "highly variable" represents critically and negatively variable outputs (Figure 3). The EVF is determined in the same manner as the IVF; however, the functional aspects in addition to the IVF are used as attributes in the RST table (Table 3), this time with three possible input values or classes, as described earlier: "non-variable", "variable", and "highly variable" (Figure 4). For determining the implication of the fuzzy rules, the "MIN" method was used, while for the aggregation, both the "MAX" and the "SUM" methods were applied depending on the nature of the output. The numerical quality of the output is obtained by defuzzifying the final aggregated fuzzy area and calculating its center of gravity.
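As a rough illustration of combining the two phenotypes into the three-point scale, the sketch below maps qualitative timing and precision ratings to the more severe of the two. The rating labels and the "worst-of-the-two" merging rule are assumptions made for this sketch; the paper itself merges the phenotypes through the fuzzy inference machinery rather than a fixed lookup.

```python
# Hypothetical qualitative ratings for the timing and precision phenotypes,
# ordered by severity (0 = positive/neutral impact, 2 = critical).
SEVERITY = {"positive": 0, "on time": 0, "precise": 0, "neutral": 0,
            "low": 1, "medium": 1,
            "high": 2, "critical": 2}

def merge_phenotypes(timing, precision):
    """Merge the timing and precision impacts into the three-point output
    scale by taking the more severe of the two ratings (assumed rule)."""
    worst = max(SEVERITY[timing], SEVERITY[precision])
    return ("non-variable", "variable", "highly variable")[worst]

print(merge_phenotypes("on time", "precise"))   # non-variable
print(merge_phenotypes("low", "neutral"))       # variable
print(merge_phenotypes("critical", "medium"))   # highly variable
```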

Step Three
A specific analysis scenario is constructed to apply the developed model. The analysis scenario differs from the FRAM model insofar as it presents specific operational conditions and a specific case for evaluation. The FRAM model, on the other hand, consists of functions that compose the general context we would like to study, without specifying performance conditions or the occurrence of events. Depending on the present conditions, the internal variability of each function can be determined and, thereafter, the output's variability and its resonance and impact on other functions.

Step Four
The final step would be to evaluate the generated results and examine what countermeasures are necessary to avoid failures and ensure resilience. The numerical outcomes provide, in that case, an assistive indicator that could point to possible sources of variability (negative or positive) and how high this possibility could be. Figure 5 presents an overview of the modified mixed rough sets/fuzzy logic FRAM framework.

Step Zero: Context and Objective
The objective of the analysis is still to provide a predictive assessment yielding performance indicators that can help identify possible sources of performance variability. The selected context serves to demonstrate an application scenario and relies to a certain degree on educated assumptions to perform the simulation.
For the application of the developed model, the same application scenario applied in the previous stage [24] is used again here to maintain the same settings and compare the results of the two models: the fuzzy logic-based FRAM model and the mixed fuzzy logic/rough sets-based FRAM model. Airport operations are gaining in complexity year after year with continuous technological advancements and increasing traffic volume [48]. The selected analysis context depicts a hypothetical scenario inspired by two accidents related to aircraft deicing operations, namely the Scandinavian Airlines Flight 751 crash in 1991 [49] and the Air Maroc accident in Mirabel in 1995 [50], to evaluate aircraft deicing operations. Several events and influential factors that played a significant role in the development of these two accidents were adopted to construct the analysis context for this simulation. The designed scenario presents a setup in which an international flight is scheduled for a transatlantic departure from a North American airport. The pilots lacked experience with deicing procedures, and the instructions and guidelines provided by their airliner did not clearly specify deicing-related communication protocols with the Air Traffic Control (ATC) and the deicing tower. Detailed and adequate instructions concerning the required inspection procedures prior to and following deicing operations were also insufficient and underspecified. The aircraft was taxied from the gate to the deicing pad to be deiced by two deicing trucks. The temperature at the time of the operation was around 0 °C, and snow showers were present. The flight was delayed as a result of high traffic volume and due to the weather conditions and the state of the runways, which affected the movement of the aircraft on the airport grounds.
As a result of the described scenario, the following performance conditions are not optimal: the provision of adequate training and competence, airliner procedures and instructions, the availability of resources, and time pressure. The above-described scenario will be used next in Step One to identify the functions that constitute the system in place and that, taken together, characterize the whole process of deicing the aircraft before takeoff. The impaired performance conditions will afterwards be used in Step Two to assign scores to the selected CPC lists for each function.

Step One: Functions' Definition and Characterization
The definition and characterization of the functions is a decisive step in determining what kind of data is needed. Each function draws a relationship between the output that we wish to evaluate and the different aspects of the function that affect its quality. The functions were selected based on knowledge gained through an extensive literature review of deicing reports and the research activities of our team. The scope of the analysis is limited mainly to the deicing activities performed at the deicing pad. The functions are characterized in a way that allows for a wide systemic perspective. In total, there are 4 background functions and 13 foreground functions. The characterization process of functions to define the analysis context is by no means rigid, and functions can be modified or updated should additional insights or information be presented later to achieve better results. Table 4 presents a list of the functions and their characteristics. Table 4. A list of the FRAM functions in the deicing model [24].

[Table 4 columns: No., Function Name, Type, Description.]

Step Two: Variability Characterization
The starting point for characterizing variability is the internal variability of each function, using the selected CPC lists. The scales to evaluate the CPCs and the form of the membership functions in the FIS can be defined as needed, depending on the context of analysis and the nature of the observed instances (numerical or linguistic). In our case, each CPC can be assigned a numerical value on a scale between zero and ten, which is partitioned in the FIS into two classes: "adequate" or "inadequate". The chosen form of membership functions for all CPCs is triangular. The training data for the fuzzy inference system was initially generated with an automatic generator to account for every possible combination of values for the CPCs. In the case of human functions, we have eight CPCs per function with two classes each, which translates into a data table with 256 rows or objects (possible combinations of values). In the case of an organizational function, we have three CPCs per function with two classes each, which translates into a decision table with eight rows. We apply the RST suite using a genetic algorithm to compute the reducts for each function and generate two rule bases: one for the internal variability factor (IVF) and one for the external variability factor (EVF). The decision class (THEN-part) is determined by assigning a numerical score to each class in the antecedent (IF-part).
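To make the reduct idea concrete, the sketch below finds a minimal consistent attribute subset for a toy three-CPC decision table by exhaustive search, then lists the deterministic rules restricted to that subset. The toy table and its decision rule are illustrative assumptions; the actual study uses an RST suite with a genetic algorithm, which scales to larger tables, and applies coverage, support, and accuracy thresholds that are omitted here.

```python
from itertools import combinations, product

def is_consistent(table, attrs):
    """attrs preserves the classification if no two objects agree on all
    of these attributes while disagreeing on the decision."""
    seen = {}
    for row, decision in table:
        key = tuple(row[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def minimal_reduct(table, n_attrs):
    """Smallest consistent attribute subset, by exhaustive search.
    (The published model computes reducts with a genetic algorithm.)"""
    for k in range(1, n_attrs + 1):
        for attrs in combinations(range(n_attrs), k):
            if is_consistent(table, attrs):
                return attrs
    return tuple(range(n_attrs))

def rules_from_reduct(table, attrs):
    """Deterministic IF-THEN rules restricted to the reduct attributes
    (coverage/support/accuracy filtering omitted for brevity)."""
    return {tuple(row[a] for a in attrs): decision for row, decision in table}

# Toy decision table: 3 CPCs, each "adequate"/"inadequate"; the decision is
# "non-variable" only when all CPCs are adequate (8 objects in total).
table = [(row, "non-variable" if all(v == "adequate" for v in row) else "variable")
         for row in product(["adequate", "inadequate"], repeat=3)]
reduct = minimal_reduct(table, 3)
print(reduct)                             # all three CPCs are needed here
print(len(rules_from_reduct(table, reduct)))
```

With a decision that genuinely depends on every CPC, no attribute can be dropped; real recorded data is typically redundant, which is where the reduct shrinks the rule base.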
After generating the rule bases for all functions, they are migrated into the FIS of the functions, and an instantiation of the constructed model is run to test the RST-generated rule base and to generate numerical results. The RST-generated rule base for the CPC list is used first to generate the IVF for each function (Table 5 and Table 6). For running the instantiation of the model, a numerical score on a scale between zero and ten is selected for each CPC with respect to each function. The numerical scores were selected based on the evaluation of the performance conditions as described above in Step Zero. The accuracy of these scores and the development of relevant performance indicators in a real-world study would rely heavily on the expertise of the analyst and the context provided, as explained earlier. The purpose of these scores here is to serve as input values for our model to run the simulation. The weight of all CPCs is the same, and the IVF was considered to have the same weight as the other functional aspects for simplification. This, however, can be changed depending on the selected functions. The IVF is then used to compute the quality score of the function's output, which is a value between 0.25 and 1.25, where 1 is a non-variable output, any value above one represents positive variability, and any value below one represents negative variability.

Step Three: Functional Resonance
The relationships among the functions are defined by linking the outputs of the upstream functions as one of the five incoming aspects of the downstream functions. The output in that case becomes a condition attribute of the downstream function. The IVF and the five incoming aspects are fuzzified to determine the quality of the function's output, and so on. The quality of each output is presented in Table 7. The instantiation of the model can next be illustrated in the graphical representation (Figure 6), which provides a visualized overview of the relationships among the functions. The graphical representation allows for easier evaluation and examination of the model to identify possible overlapping and combinations of variability.
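As a schematic of how variability can propagate through the couplings, the sketch below folds each function's IVF together with the quality of its upstream outputs using an equal-weight mean, matching the equal-weight simplification mentioned in Step Two. In the actual model, each function runs its own FIS on the fuzzified aspects instead of this arithmetic mean, and the three-function graph is a reduced, hypothetical fragment of the deicing model.

```python
def propagate(functions, order):
    """Walk the functions in topological order (upstream before downstream)
    and compute each output quality as the equal-weight mean of the
    function's IVF and its incoming coupling qualities (simplified stand-in
    for the per-function FIS)."""
    quality = {}
    for name in order:
        f = functions[name]
        incoming = [quality[u] for u in f["upstream"]]
        quality[name] = sum([f["ivf"]] + incoming) / (1 + len(incoming))
    return quality

# Hypothetical fragment: a background function feeds "Deicing", whose
# output feeds "Taxi to Runway"; IVF < 1 marks negative internal variability.
model = {
    "Review Meteorological Data": {"ivf": 1.0, "upstream": []},
    "Deicing": {"ivf": 0.8, "upstream": ["Review Meteorological Data"]},
    "Taxi to Runway": {"ivf": 1.0, "upstream": ["Deicing"]},
}
q = propagate(model, ["Review Meteorological Data", "Deicing", "Taxi to Runway"])
print(q)  # Deicing's negative variability is attenuated downstream
```

Even this crude mean shows the resonance idea: the degraded "Deicing" output pulls "Taxi to Runway" below the neutral score of 1, while non-variable upstream outputs dampen the effect.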

Step Four: Variability Management
Finally, the provision of numerical outputs as output-variability indicators points to possible variability in the performance of the defined functions. The combination of variability can point as well to weak and strong spots in the studied system. Consequently, this allows us to introduce precautionary and preventive measures to reinforce the system and make it more resilient by strengthening the weak components and promoting conditions that ensure desired outcomes. The graphical representation of the functions presents a very helpful tool to illustrate the relationships and dependencies among the functions and how performance variability can affect their outputs.
As mentioned earlier, the analysis scenario was kept the same as in the previous stage of implementation. The settings were kept identical to allow a better comparison of the results and to ensure that any differences in results cannot be attributed to factors other than the added RST method. The characterization of the functions and their relationships in the FRAM Model Visualiser (FMV) was kept the same. The scales, the partitioning, the membership functions, and the FIS settings (fuzzification, aggregation, defuzzification, etc.) were kept identical as well. The only element changed was the new rule base generated by the RST method.
Based on the formulated assumptions, the simulation scenario presented a case study in which airliner training and instructions were inadequate. The flight was delayed as a result of extreme weather conditions and a stressed flight schedule at the airport. The performance conditions for the functions Training, Airliner Guidelines and Instructions, Planning, Flight Crew Supervision, Pre-Deicing Inspection, Deicing, Post-Deicing Inspection, Anti-icing, and Taxi to Runway were negatively impacted (negative variability, <1). Accordingly, the values in Table 5 and Table 6 were chosen to simulate the impact of the defined performance conditions on each function and lead to the generation of internal variability. Functions such as "Deicing Tower Control" and "Taxi Aircraft to Deicing Pad", whose CPCs were not affected and maintained a maximum score of 10, produced a maximum output of 1.25, which impacted downstream functions positively and would dampen any negative variability provided through the couplings. Since we kept the weights equal for all CPCs and aspects, the IVFs of the functions "Deicing" and "Post-Deicing Inspection" were also equal after selecting equal input values in Table 6. The four background functions "Review Meteorological Data", "Aircraft Specifications", "Regulations & Supervision", and "ATC Supervision" in Table 7 provided invariable outputs as defined and therefore a neutral score of 1. On the other hand, optimal performance conditions for functions such as "Resources & Equipment", "Taxi Aircraft to Deicing Pad", and "Deicing Tower Control" impacted the outputs of those functions positively (>1). The obtained numerical scores for the outputs were identical to the results of the first model, which represents an ideal outcome.

Discussion
In contrast to simple systems, the evaluation of complex systems is not straightforward and poses many complications for the analyst, especially in systems that rely to a great degree on the qualitative assessment of the variables in question. Studying qualitative contexts requires imagination and relies mainly on human judgement using natural language to classify objects of interest. Experts in practice make informed decisions based on years of experience and in-depth knowledge of the inner workings of the system in question. Whether embodied in written records or human expertise, decision-making relies on experience gained through past events that formed the knowledge held in databases or by human experts. A limitation of such knowledge is that it does not always present results or outcomes in a straightforward manner. It is not always obvious which outcome would be obtained under certain performance conditions. The observed instances might be vague in nature and therefore difficult to quantify or sometimes even to assess using natural language. It would be difficult, for example, to quantify variables such as the comfort of a car or the adequacy of instructions [5]. This could also be due to the design of the scale used or to humans' differing subjective perceptions of qualitative concepts. Incomplete information and difficulty in making a judgement and assigning a specific value to a given variable are therefore common [5]. Designing frameworks to handle data, recognize patterns, and derive conclusions would be very advantageous in helping experts and new decision-makers reach a decision. Mechanisms that facilitate the extraction of knowledge from imprecise, incomplete, and vague data are needed not only to extract knowledge from such data tables but also to help classify problems and outcomes [40].
In the case of reactive or retrospective analyses, the events that transpired are clear and definite and allow in most cases for an exact description of the analysis context [51]. Proactive evaluations start with the design phase of a system, during which appropriate design concepts complying with the specified performance requirements are identified [51]. The quality assessment should examine whether the proposed concepts comply with the required performance and what challenges might arise following real-world implementation. To perform this task, the concepts adopted in retroactive or retrospective analysis methods can be utilized as well to anticipate obstacles and challenges. While this task might be easier with simpler systems, it is mostly not possible during the design phase of complex systems to identify all risks and performance-impairing factors, since not all parameters can be explicitly known in advance. In proactive or predictive analyses, there exists a certain lack of certainty, and results are produced relying mainly on assumptions reached by evaluating historical results. Only once real-world implementation is accomplished and the system has had time to interact with its operational environment can one know the implications of the designed system. It is therefore very important to further advance proactive evaluation methods that adopt a systemic perspective to meet the challenges faced. The concepts of Resilience Engineering can prove helpful in this endeavor to design resilient systems capable of adjusting to and coping with the complexity of the real world.
Relying on traditional statistical methods could be desirable in many cases; however, it is not always possible in uncertain environments. Such methods require large datasets to draw meaningful statistical inferences [52] and can be affected by other factors such as the independence of variables and the normality of the data distribution [39]. Data-mining tools such as RST and fuzzy logic can in such cases be more suitable for evaluating complex nonlinear contexts in which qualitative scales are the only possible measurement method. The application of "IF-THEN" rules written in natural language is more comprehensible for decision-makers and offers more communicative results. These rules can be derived by the RST method without specific limitations or constraints on data distribution or data size [39]. A historical database can be constructed, as explained in the methodology section, by recording events and operations and assigning a score to the relevant attributes. The more data points provided to the RST method, the more accurate and reliable the results will be. There will always exist a certain degree of uncertainty in the provided results, but that is the case with all predictions. The results can be considered indicators that point to possible sources of performance variability within the examined context.
The model here presents an evolution of the previously published model [24], which incorporated fuzzy logic as a quantification tool into FRAM. Fuzzy logic has been applied previously in conjunction with CREAM [53] and FRAM [25] and proved useful in applications with qualitative scales. However, the model presented here employs the CPC list solely to account for the impact of the context and to provide quantifiers for the internal and external variability of the functions. The addition of RST can help with the rule-explosion problem in the case of a high number of variables and associated classes and partitions, which results in a large rule base that can be computationally demanding. Covering all possible combinations of variables and values to provide an output for each specific combination becomes infeasible in the presence of thousands of rules. A method for deriving rules from a limited set of data is therefore desirable to provide a more practical approach for analysts. For the construction of our model, the FRAM functions are characterized as fuzzy inference systems to produce a quantified output. The same functions are characterized at the same time as RST decision tables, in which the CPCs and the couplings are defined as attributes to derive rules and classify the output. This requires knowledge and expertise in both fuzzy logic and rough sets, which can be somewhat demanding. Additionally, the amount of required data is high, which entails extensive filtration and treatment during the selection process and considerable effort in the characterization (attribute selection, partitioning, membership functions, etc.) and in the rule evaluation and validation processes. The proposed model here does not necessarily aim at making a significant contribution to fuzzy logic, rough sets, or the combined approach.
The main contribution lies in the use of these tools within the framework of FRAM to provide quantified outputs and more efficient data classification algorithms. It merely presents a first step and a possible approach to use such techniques in combination with FRAM to present more intersubjective and quantified results using natural language for evaluation.
In this paper, the focus was directed mainly at the integration process of rough sets, which should allow for handling a larger number of input variables. However, the settings were kept as defined in the previous stage to provide a better comparison of the results. The numerical results obtained for the outputs in this paper were identical, to the last digit, to those provided by the first model, which presented a combined model of FRAM and fuzzy logic. The findings and derived conclusions of this instantiation align with those of the first one, which reinforces the standing of the results and demonstrates the usefulness of rough sets. The same dataset was used for both models, presenting an ideal scenario that accounts for every possible combination of input values. The additional advantage of using RST lies in being able to produce minimal but efficient rule bases (with the same accuracy, in our case). Additionally, the application of rough sets would help in the treatment of datasets obtained from real-world recordings and archived data, which could be incomplete, inconsistent, or limited in size. The RST method using historical data can furthermore help classify outcomes and reach decisions without continuously relying on experts, other than for the evaluation at the time of observation. This would result in smaller and more efficient rule bases, which can be generated automatically by the RST method. The use of real-world data will further explore the merits of such an approach, and further insights can be provided once real-world data is used. The size and consistency of the provided data are two significant factors to consider in the process.
At this stage of model development, the framework is a simplification of reality, because the characterization of the functions and the data selection process are entirely simulated. As was the case in the previous stage, the purpose is still to demonstrate a possible approach to introduce quantification into FRAM without losing its properties and significant advantages in handling complex contexts. This study explores such possibilities, utilizing rough sets in this case in addition to fuzzy logic, to lay down the theoretical foundation as a first step from a technological-readiness perspective. The definition of the analysis scenario relied on assumptions and simplified reality. The same weights were selected for all attributes and functional aspects. This might be different in a real-world application and can be adjusted as deemed appropriate by the analyst, depending on the significance of each attribute to the execution of the function in question.
The need to adopt a Resilience Engineering perspective in systems analysis and to look at failure and success as complex emergent events has been addressed and advocated in a wide array of studies [4,1]. Such applications are promising and can provide interesting and helpful results that complement established methods to better keep up with technological developments and the continuously increasing complexity of sociotechnical systems [54]. However, as is the case with new and innovative methods, the application of such methods from a technology-readiness point of view is not without challenges and issues to overcome. The lack of sufficient data and precedent cases at the beginning results in the absence of standardized protocols, which forces analyses to rely more on subjective judgement and personal expertise. The rarity of adverse events in high-reliability domains such as aviation makes the production of sufficiently large databases and meaningful statistics a difficult task. The specificity of case studies makes it difficult to generalize findings to other analysis scenarios. The validation process to ensure valid and reliable results requires the generation of large databases and a sufficiently high number of case studies to present reproducible results. It would therefore be more helpful to adopt a SAFETY-II approach and to look at successful outcomes as emergent and complex events, as is the case with adverse outcomes. The meaningfulness of the results produced by this model, or by any other newly proposed method, depends greatly on the meaningfulness of the provided input data. Therefore, the construction of adequate databases and standardized performance indicators is necessary to utilize innovative methods and adopt new perspectives on safety and system performance.
The application of rough sets as a data-mining tool to filter and classify data in addition to fuzzy logic as a tool for quantification and computing with natural language can prove especially helpful in such endeavors.
The proposed model in this study is still in the design phase and presents the first steps in implementing data-mining tools such as rough sets and fuzzy logic into FRAM. To become application-ready and provide more reliable results, further validation and optimization work is still needed. The next step would be to construct a more realistic model using real-world data to examine how the model performs under realistic circumstances. Going forward, the application of the proposed approach to new contexts can additionally help validate the model and provide more insights on how to further improve and modify the proposed framework. Nonetheless, the obtained results are promising and present an interesting starting point to explore further applications and drive research efforts on that front forward.

Conclusions
In this paper, we built on the results and methodological propositions achieved in the second stage of this project. To further improve fuzzy-FRAM and ease the modelling process, RST was proposed as a data-mining tool to facilitate the treatment of input data, generate more efficient rule bases, and derive decisions based on recorded historical data. The proposed model was then applied to a case study examining performance in aircraft deicing operations, maintaining the same settings as in the second stage. The datasets used to generate the reduced rule base were those generated by the fuzzy inference system, which creates an ideal dataset accounting for every possible combination of input values. The produced numerical outcomes were identical to the results obtained with the previous model, which demonstrates the usefulness and accuracy of the RST framework. However, it remains necessary to note at this stage that the presented model is still a prototype and requires further validation and optimization in future research work to provide more representative and reliable results.
Supplementary Materials: The following are available online at www.mdpi.com/xxx/s1.