A Novel Risk Assessment and Analysis Method for Correlation in a Complex System Based on Multi-Dimensional Theory

Abstract: With the rapid development of highly integrated large complex systems, such as aircraft, satellite, and railway systems, the coupling relationships between components within a system have become increasingly complex, so local disturbances or faults may have global effects on the system through fault propagation. This poses new challenges for the safety analysis and risk assessment of complex systems. To objectively analyze and evaluate the inherent risks of complex systems with coupling correlation characteristics, this paper proposes a novel risk assessment and analysis method for correlation in complex systems based on multi-dimensional theory. Firstly, a formal description and coupling degree analysis method for the hierarchical structure of complex systems is established. Moreover, considering three safety risk factors (fault propagation probability, potential severity, and fault propagation time), a multi-dimensional safety risk theory is proposed to evaluate the risk that each element within the system poses to the overall system. Furthermore, critical safety elements are identified based on the Pareto rule, the As Low As Reasonably Practicable (ALARP) principle, and safety risk entropy to support preventive measures. Finally, an application to an avionics system is provided to demonstrate the effectiveness of the proposed method.


Introduction
In recent years, due to the complex correlation of components in complex systems, local faults may have a great effect on the overall system through fault propagation [1][2][3]. Therefore, the safety and risk analysis of such complex systems has attracted more and more attention. Safety analysis and risk assessment aim to eliminate and control hazards through system design and to take preventive measures against accidents that cause personal injury, equipment damage, or task failure during system operation. With the development of science and technology, a series of analysis methods for evaluating system failures and risk events have been developed, especially in high-risk fields such as the aerospace, chemical, and nuclear industries [4,5]. However, these methods still leave a number of safety problems unresolved, namely those caused by the coupling and correlation characteristics of complex systems.
Traditional safety modeling and analysis methods are mainly based on the logical processes of induction and deduction. Starting from the local characteristics of the system or the direct relationships between internal components, these methods are used to find the root cause of safety problems and to carry out safety work such as analysis, verification, and assurance. Typical analysis methods are Fault Tree Analysis (FTA) [6], Event Tree Analysis (ETA) [7,8], and Failure Mode and Effects Analysis (FMEA).
In view of the above considerations, this paper proposes a novel risk assessment and analysis method for correlation in complex systems based on multi-dimensional theory. It aims to analyze and evaluate the risk of a complex system while considering the coupling correlation, so as to identify the critical risk elements. Firstly, a formal description and coupling correlation analysis method for the hierarchical structure of complex systems is proposed, based on the typical task-function-resource model. It provides a formal description of the coupling correlation between components within complex systems and thus the foundation for the analysis and evaluation of risk. Moreover, considering three safety risk factors (propagation probability, potential severity, and propagation time), a multi-dimensional safety risk theory is proposed to evaluate, from multiple perspectives, the risk that each element in the system poses to the overall system. Furthermore, critical safety elements are identified based on the Pareto rule, the ALARP principle, and safety risk entropy to support preventive measures.
The remainder of the paper is organized as follows. Section 2 describes the hierarchical model of complex systems and the coupling correlation between elements in the system. Section 3 proposes the multi-dimensional safety risk theory and assessment. Section 4 presents an application to an avionics system. Section 5 presents the conclusions.

Coupling Correlation of Complex System
In a general sense, the adjective "complex" describes a system or component that by design or function or both is difficult to understand and verify [38]. A complex system is any system featuring a large number of interacting components that is often difficult to understand and hard to solve [39,40]. Compared with simple systems, complex systems are usually characterized by more components and a high degree of coupling [41,42]. In the real world, a large number of systems show the characteristics of complexity, such as ecosystems, social organization systems, complex social technology systems, complex electromechanical systems, and complex equipment systems [43][44][45]. The complex systems considered in this paper are mainly complex engineering technology systems, that is, complex systems with engineering technology characteristics.
Coupling correlations refer to all kinds of association relationships between various elements in the system due to task and function requirements, such as resource reuse, information transfer, data sharing, etc. The strength of the coupling correlation can be quantified by the degree of coupling. For complex systems, the internal coupling correlations are more complicated. The complex coupling correlations increase the risk of fault propagation in the system. The establishment of a system model based on the coupling correlation is the basis for analyzing and quantifying the risk of system for fault propagation.

Hierarchical Model
Generally, a system is built against the background of specific task requirements, that is, the use case scenarios of the system are planned in advance through requirements analysis. These planned use case scenarios can be defined as the task view or task layer of the system. Then, based on the system task planning, functional decomposition is carried out to determine which basic functions need to be established in order to achieve a specific task. This paper defines such decomposed functions as the functional view or function layer of the system. However, tasks and functions both belong to the logic layer of the system design. Their final implementation still needs the support of the physical layer, such as typical computing, storage, and communication resources. In other words, the configuration and mapping relationships from the logic layer to the physical layer need to be clarified and completed. This paper defines these general physical resources as the resource view or resource layer of the system. In summary, from the perspective of hierarchical decomposition, a hierarchical system model based on the task-function-resource layers can be established [46][47][48]. Then, the coupling relationships between the elements within each layer and between the layers can be considered. Based on topology modeling theory, a topology model with elements as nodes and correlation relationships as connections can be formed. Finally, by combining hierarchical decomposition and coupling analysis, a complex system hierarchy model based on the task-function-resource architecture is synthesized, as shown in Figure 1, where different colors and shapes distinguish the elements in different layers. The model assumes that the number of tasks, functions, and resources and the correlation relationships in the system remain constant over time.
Appl. Sci. 2020, 10, 3007

It is assumed that destructive events in the system originate only from faults of resource-layer elements, while function-layer elements act as users and callers of resource-layer elements. Therefore, for the fault propagation problem introduced by the resource layer, the fault of a resource element is the fault trigger point, and the function element provides the propagation medium.

Formal Description of Hierarchical Model
For the task-function-resource hierarchy architecture of the system, the task element set, function element set, and resource element set can be defined separately. The task element $t_i$ is a task unit established by the system requirement analysis. It is supported by a series of basic function elements. The task element set $T$ can be expressed as a set of several task elements: $T = \{t_1, t_2, \ldots, t_k\}$. Similarly, the function element $f_i$ is the basic function unit that supports the task implementation in the system. It is supported by a series of basic resource elements. The function element set $F$ can be expressed as a set of several function elements: $F = \{f_1, f_2, \ldots, f_m\}$. The resource element $r_i$ is a physical or logical unit that supports the realization of a function in the system. The resource element set $R$ can be represented as a set of several resource elements: $R = \{r_1, r_2, \ldots, r_n\}$.
Through the analysis of the system hierarchy architecture, specific relational information can be recorded most directly with adjacency matrices. The mapping correlation matrix between task and function elements can be expressed as shown in Matrix (1):

$$M^{tf} = \left[ M^{tf}_{ij} \right]_{k \times m} \quad (1)$$

where $M^{tf}_{ij} = 1$ means there is a direct correlation between task element $t_i$ and function element $f_j$, and $M^{tf}_{ij} = 0$ means no direct correlation.
Similarly, the function-resource element mapping correlation matrix can be expressed as shown in Matrix (2):

$$M^{fr} = \left[ M^{fr}_{ij} \right]_{m \times n} \quad (2)$$

where $M^{fr}_{ij} = 1$ means there is a direct correlation between function element $f_i$ and resource element $r_j$, and $M^{fr}_{ij} = 0$ means no direct correlation.
If the cross-layer correlation between task and resource elements also needs to be recorded, it can be obtained by the matrix operation shown in Matrix (3):

$$M^{tr} = M^{tf} \times M^{fr} = \left[ M^{tr}_{ij} \right]_{k \times n} \quad (3)$$

where $M^{tr}_{ij} \geq 1$ means that task element $t_i$ is correlated with resource element $r_j$ through at least one function element, and $M^{tr}_{ij} = 0$ means no correlation. However, in general, the practical significance of this cross-layer correlation is not obvious. System design focuses on clarifying the software-hardware configuration mapping relationships from functions to resources. Therefore, this paper focuses on correlations from functions to resources.
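As an illustration of the cross-layer product, the following minimal sketch multiplies two small 0/1 mapping matrices in plain Python. The matrices (2 tasks, 3 functions, 4 resources) and the helper name `mat_mul` are hypothetical and are not taken from the paper.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical task-function mapping (rows: tasks, columns: functions).
M_tf = [[1, 1, 0],
        [0, 1, 1]]

# Hypothetical function-resource mapping (rows: functions, columns: resources).
M_fr = [[1, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 1]]

# Each entry counts the function elements linking a task to a resource.
M_tr = mat_mul(M_tf, M_fr)
print(M_tr)
```

An entry greater than 1 (here, the second task and the third resource) indicates that more than one function element links the task to the resource.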

Analysis of Coupling Degree
As the physical layer within the system, the resource layer also exhibits the most obvious forms of coupling. On the one hand, coupling may result from the functional/logical coupling generated by resource elements serving the same function; on the other hand, it may result from direct material or information transfer between resource elements. Both coupling forms can be defined as direct coupling. In contrast, resource sharing creates a more complex form of indirect association between groups of coupled resource elements, which can be defined as indirect/cascading coupling. In order to quantitatively describe the direct and indirect coupling relationships within the system hierarchy, this study takes the resource layer as an example to define and distinguish the two coupling concepts.

Direct coupling degree matrix
The direct coupling degree characterizes the direct coupling relationship between elements. It covers the situations in which elements within a layer interact directly through information or material exchange, or are occupied by the same element of another layer. The direct coupling degree matrix $C^{d}$ is represented in Matrix (4):

$$C^{d} = \left[ C^{d}_{ij} \right]_{n \times n} \quad (4)$$

where $C^{d}_{ij} = 0$ means no direct correlation, and $C^{d}_{ij} = 1$ means that the correlation degree between resource element $r_i$ and resource element $r_j$ is 1, that is, fault propagation from $r_i$ to $r_j$ needs only 1 step.
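One of the direct coupling sources named above, resources occupied by the same function element, can be derived mechanically from a function-resource mapping matrix. The sketch below covers only that case and treats the coupling as symmetric, which is an assumption; direct information or material exchange between resources would have to be added by hand. The function name and the example matrix are hypothetical.

```python
def direct_coupling_from_sharing(m_fr):
    """Mark two resources as directly coupled when one function uses both.

    m_fr: 0/1 matrix, rows are functions and columns are resources.
    Returns a symmetric 0/1 matrix with a zero diagonal.
    """
    n = len(m_fr[0])
    c_d = [[0] * n for _ in range(n)]
    for row in m_fr:
        used = [j for j, flag in enumerate(row) if flag]
        for i in used:
            for j in used:
                if i != j:
                    c_d[i][j] = 1
    return c_d

# Hypothetical mapping: f1 uses r1 and r2, f2 uses r2 and r3, f3 uses r4 only.
M_fr = [[1, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 1]]

C_d = direct_coupling_from_sharing(M_fr)
for row in C_d:
    print(row)
```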

Indirect coupling degree matrix
According to the fault propagation theory and cascading failure theory, the fault of a single element will not only affect the element itself, but also cause a cascading effect by the correlation between elements, causing the fault propagation and diffusion, and the more serious situation may affect the normal operation of the whole system. Thus, simply establishing the concept of direct coupling degree is insufficient to assess the potential risk introduced by multiple coupling correlation of elements. In contrast, the indirect coupling degree is more efficient to reflect the degree of such risk.
The indirect coupling degree $C^{c}_{ij}$ characterizes the indirect coupling relationship between elements and is an extension of the direct coupling degree. It is collected in the indirect coupling degree matrix $C^{c}$, as presented in Matrix (5):

$$C^{c} = \left[ C^{c}_{ij} \right]_{n \times n} \quad (5)$$

Starting from the direct coupling degree matrix $C^{d}$, the order of the shortest fault propagation path is calculated with the Floyd algorithm, which yields the indirect coupling degree matrix $C^{c}$. Each element $C^{c}_{ij}$ of the matrix is a natural number. $C^{c}_{ij} = 0$ means there is no indirect coupling relationship. $C^{c}_{ij} = n$ means that the coupling correlation degree between resource element $r_i$ and resource element $r_j$ is $n$, that is, fault propagation from $r_i$ to $r_j$ needs $n$ steps.
The Floyd algorithm starts from the direct coupling matrix $C^{d}$ and iteratively updates it $n$ times. Each update introduces a new transition node and checks whether routing through it shortens a path, until all nodes have been introduced. The Floyd algorithm also yields the shortest path matrix $C^{r}$, where $C^{r}_{ij}$ is the next resource element that fault propagation from resource element $r_i$ to resource element $r_j$ should pass through. The sequence of resource elements on the shortest fault propagation path from $r_i$ to $r_j$ can then be deduced step by step.
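The Floyd procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: indices are 0-based (the paper numbers elements from 1), an entry of 0 in the result means "no coupling" as in the text, and the four-element chain used as input is hypothetical.

```python
INF = float("inf")

def floyd(c_d):
    """Return (c_c, nxt): step counts and next-hop matrix from a 0/1 matrix."""
    n = len(c_d)
    dist = [[INF] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and c_d[i][j] == 1:
                dist[i][j] = 1
                nxt[i][j] = j
    # Introduce each node k in turn as a possible transition node.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i != j and dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    # Unreachable pairs and the diagonal are reported as 0.
    c_c = [[0 if d == INF else d for d in row] for row in dist]
    return c_c, nxt

def path(nxt, i, j):
    """Rebuild the shortest propagation path from i to j ([] if none)."""
    if nxt[i][j] is None:
        return []
    nodes = [i]
    while i != j:
        i = nxt[i][j]
        nodes.append(i)
    return nodes

# Hypothetical chain of four resources: 0 - 1 - 2 - 3.
C_d = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
C_c, C_r = floyd(C_d)
print(C_c)
print(path(C_r, 0, 3))
```

Storing the next hop rather than the full path keeps the second matrix the same size as the first, and the full path is recovered by following next hops until the target is reached.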

Potential Severity
Further, when risk quantification is required in view of coupling correlations, a potential severity matrix $S^{p}$ can be established as Matrix (6):

$$S^{p} = \left[ S^{p}_{ij} \right]_{n \times n} \quad (6)$$

The potential severity between resource elements decreases non-linearly with the indirect coupling degree (as is the case for the impact of radio waves, noise, etc.). In other words, due to the natural elasticity and robustness of the system, the more propagation steps a potential fault needs, the lower its effect will be. Therefore, a functional relationship between the potential severity and the indirect coupling degree needs to be established. According to the shape of the membership relationship between the two factors, typical mapping functions are divided into normal type, Γ type, and Cauchy type [49][50][51], and each type is further divided into smaller-type, middle-type, and larger-type [52,53]. Because the degree of the propagation effect decreases non-linearly with the coupling degree, a typical smaller-type Cauchy membership function [54,55] $S^{p}(C^{c})$ is used for fitting in this paper, as shown in Equation (7).
In Equation (7), $C^{c}$ is the coupling degree (a positive integer), which can be obtained from the indirect coupling degree matrix; $S^{p}$ is the potential severity; and $a$ and $c$ are constants that need to be further quantified. Moreover, the safety critical degree $SCG = \{g_1, g_2, \ldots, g_n\}$ of the resource elements needs to be considered, because the strength of the fault effect differs between resource elements. Therefore, the $SCG$ factor is added to the potential severity matrix, $S = S^{p} \times SCG$, forming the updated potential severity matrix $S$.
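To make the decreasing severity relationship concrete, the sketch below uses a textbook form of the smaller-type Cauchy membership function, 1 for x ≤ x0 and 1 / (1 + alpha·(x − x0)^beta) otherwise. This form, the parameter names `x0`, `alpha`, `beta`, their default values, and the choice to return 0 when there is no coupling are all assumptions made for illustration; they are not the paper's Equation (7) or its fitted constants a and c.

```python
def cauchy_smaller(x, x0=1.0, alpha=0.5, beta=2.0):
    """Smaller-type Cauchy membership: 1 up to x0, then decaying toward 0."""
    if x <= x0:
        return 1.0
    return 1.0 / (1.0 + alpha * (x - x0) ** beta)

def potential_severity(coupling_degree):
    """Map an indirect coupling degree (step count) to a severity in [0, 1].

    Assumption: a coupling degree of 0 means "no coupling", so severity is 0.
    """
    if coupling_degree == 0:
        return 0.0
    return cauchy_smaller(coupling_degree)

# Severity falls off non-linearly as more propagation steps are needed.
for steps in range(0, 5):
    print(steps, round(potential_severity(steps), 3))
```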

Propagation Probability and Propagation Time
Ideally, the original data should be determined by experimental statistics. However, when experimental data are insufficient, the expected data can be obtained by simulating the complex system or adjusted by referring to expert experience. For example, for the direct propagation probability and direct propagation time, fault correlation effect (simulation) tests can be carried out based on an analysis of the fault effect mechanism between elements. From the test data, the frequency and average time of fault propagation are calculated and taken as the expected values of the direct propagation probability matrix $C^{P}_{d}$ and the direct propagation time matrix $C^{T}_{d}$. In this paper, fault injection [56,57] is applied in the simulation system a large number of times (usually 10,000) to record and obtain the average propagation probability and propagation time [58,59]. In general, if the sample size is large enough, the average value can be regarded as the actual value [60][61][62].
If fault propagation from resource element $r_i$ to resource element $r_j$ needs $n$ steps (which can be obtained from the shortest path matrix), and the probabilities of the individual propagation steps are $p_1, p_2, \ldots, p_n$ (which can be obtained from the direct propagation probability matrix $C^{P}_{d}$), the indirect propagation probability $C^{P}_{c(ij)}$ of a fault in element $r_i$ propagating to element $r_j$ can be calculated by Equation (8), and the indirect propagation probability matrix $C^{P}_{c}$ is then formed.

$$C^{P}_{c(ij)} = \prod_{k=1}^{n} p_k \quad (8)$$

Similarly, if fault propagation from resource element $r_i$ to resource element $r_j$ needs $n$ steps (which can be obtained from the shortest path matrix), and the times of the individual propagation steps are $t_1, t_2, \ldots, t_n$ (which can be obtained from the direct propagation time matrix $C^{T}_{d}$), the indirect propagation time $C^{T}_{c(ij)}$ that fault propagation takes from element $r_i$ to element $r_j$ can be calculated by Equation (9), and the indirect propagation time matrix $C^{T}_{c}$ is then formed.

$$C^{T}_{c(ij)} = \sum_{k=1}^{n} t_k \quad (9)$$
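The two path quantities can be computed together once a shortest path is known. The sketch below assumes that the indirect probability is the product of the per-step probabilities and the indirect time is the sum of the per-step times along the path; the function name, the 0-based indices, and the example values are hypothetical.

```python
from math import prod

def indirect_along_path(path, p_direct, t_direct):
    """Return (probability, time) of fault propagation along a node path.

    path: list of element indices, e.g. [0, 1, 2].
    p_direct, t_direct: direct propagation probability and time matrices.
    """
    steps = list(zip(path, path[1:]))  # consecutive (from, to) pairs
    probability = prod(p_direct[i][j] for i, j in steps)
    time = sum(t_direct[i][j] for i, j in steps)
    return probability, time

# Hypothetical direct propagation data for three resource elements.
P_d = [[0.0, 0.5, 0.0],
       [0.0, 0.0, 0.4],
       [0.0, 0.0, 0.0]]
T_d = [[0.0, 2.0, 0.0],
       [0.0, 0.0, 3.0],
       [0.0, 0.0, 0.0]]

p_02, t_02 = indirect_along_path([0, 1, 2], P_d, T_d)
print(p_02, t_02)
```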

Multi-Dimensional Safety Risk Model
Generally, the safety risk of a system is quantified in two dimensions, the probability of a dangerous event and the severity of its potential effect, as shown in Equation (10):

$$R = f(P, S) \quad (10)$$

where $P$ is the probability of a dangerous event and $S$ is the severity of the potential effect. However, two dimensions alone cannot fully characterize the safety risk of the system. Therefore, this paper takes a third dimension, propagation time, into consideration and proposes a new theory that quantifies safety risk in three dimensions (probability, severity, and time) and compares the effect weight and correlation of each element, in order to analyze and evaluate the safety risk comprehensively. Combined with the risk concept of Terje Aven [63], the multi-dimensional safety risk model can be formalized as presented in Equation (11):

$$R = f(P, S, T) \quad (11)$$

where $P$ is the fault propagation probability, $S$ is the potential severity, and $T$ is the fault propagation time.

Calculation of Multi-Dimensional Safety Risk Model
Traditional risk assessment often adopts qualitative or semi-quantitative methods. The basic rule is to classify risk factors into levels qualitatively based on experience, and then to refer to a risk assessment model for semi-quantitative risk assessment. The core reason for using such methods is that the risk factors have different dimensional units, so the resulting risk values can be regarded as normalized results after empirical classification. In GJB 900A [64], probability and severity are classified into five levels and four levels, respectively, and based on the expert scoring method [65,66], the risk values corresponding to the different probability and severity levels are shown in Table 1. The risk values in Table 1 are mapped to a two-dimensional space, and the Euclidean distance between the risk assessment point $R(P, S)$, shown in Figure 2, and the space origin is introduced to calculate risk evaluation values as shown in Equation (12), where $a$ and $b$ are preference correction factors. According to Table 1, $R = f(P, S)$ with $1 = f(1, 1)$, $2 = f(2, 1)$, $3 = f(4, 1)$, and so on; based on these values and Equation (12), the 'regress' function in MATLAB is applied to implement multiple linear regression fitting.
Similarly, based on the multi-dimensional safety risk theory, this paper uses a five-level risk factor classification based on expert experience [67], in which the degree from light to heavy runs from level 1 to level 5. Based on GJB 900A and the expert scoring method, the risk values corresponding to different propagation probabilities, severities, and propagation times are obtained as shown in Table 2. Therefore, the actual parameter values of the safety risk factors, propagation probability $P$, potential severity $S$, and propagation time $T$, can be quantified into risk factor levels.

Table 2 columns: Risk, Probability Level, Severity Level, Propagation Time.

Based on the multi-dimensional safety risk model, the risk factors $P$, $S$, $T$ are mapped to the three-dimensional space shown in Figure 3. The improved Euclidean distance between the risk assessment point $R(p, s, t)$ and the space origin is introduced to calculate risk evaluation values, as shown in Equation (13).
where $f_1(p)$, $f_2(s)$, $f_3(t)$ are the risk factor levels into which the actual parameter values of the risk factors $P$, $S$, $T$ are classified, respectively, and $a$, $b$, $c$ are the preference correction factors. Based on Table 2 and Equation (13), the 'regress' function in MATLAB is applied to implement multiple linear regression fitting, obtaining $a = 2.2$, $b = 3.3$, $c = 2.7$.
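A weighted distance of this kind can be sketched as follows. The exact form of Equation (13) is not reproduced in this text, so the expression used here, the square root of a·f1² + b·f2² + c·f3², is an assumption chosen because it is linear in a, b, c after squaring and therefore compatible with a linear regression fit. Only the values a = 2.2, b = 3.3, c = 2.7 come from the source; the function name is hypothetical.

```python
from math import sqrt

# Preference correction factors reported in the text.
A, B, C = 2.2, 3.3, 2.7

def risk_value(p_level, s_level, t_level, a=A, b=B, c=C):
    """Weighted distance of the point (p, s, t) from the origin.

    Inputs are risk factor levels 1..5; the weighted form is an assumption.
    """
    for level in (p_level, s_level, t_level):
        if level not in (1, 2, 3, 4, 5):
            raise ValueError("risk factor levels must be integers from 1 to 5")
    return sqrt(a * p_level ** 2 + b * s_level ** 2 + c * t_level ** 2)

# Lowest and highest combinations of levels under this assumed form.
print(round(risk_value(1, 1, 1), 3))
print(round(risk_value(5, 5, 5), 3))
```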
In addition, the total safety risk value $R_N$ of the system and the safety risk ratio $\eta_i$ of element $i$ are calculated as shown in Equations (14) and (15).

Pareto rule
The safety risk ratio characterizes the extent to which each element in the system contributes to the total safety risk value of the system, so the critical safety factors in the system can be identified intuitively from it. According to the Pareto rule [68,69], when distinguishing safety-critical links, it can be considered that 80% of accidents originate from 20% of the dangerous sources. Therefore, the values of the safety risk ratio $\eta_i$ are sorted in descending order; the elements with the top 20% of the $\eta_i$ values are defined as safety-critical elements, and the remaining 80% are defined as general safety elements.
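The ranking step can be sketched as below. The sketch assumes that the safety risk ratio of an element is its risk value divided by the sum of all element risk values, and that the 20% cut-off is rounded up to a whole number of elements; both are assumptions, and the element names and risk values are hypothetical.

```python
def risk_ratios(risk_values):
    """Share of the total risk contributed by each element."""
    total = sum(risk_values.values())
    return {name: value / total for name, value in risk_values.items()}

def safety_critical(ratios, percent=20):
    """Names of the elements in the top `percent` of ratios, largest first."""
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    # Integer ceiling of len * percent / 100, avoiding float rounding issues.
    count = (len(ranked) * percent + 99) // 100
    return ranked[:count]

# Hypothetical risk values for five resource elements.
risks = {"r1": 10.0, "r2": 40.0, "r3": 20.0, "r4": 5.0, "r5": 25.0}
ratios = risk_ratios(risks)
critical = safety_critical(ratios)
print(critical)
```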

ALARP principle
As a project risk criterion generally adopted by domestic and foreign institutions, the ALARP (As Low As Reasonably Practicable) principle [70,71] sets two risk "boundaries" based on the value of safety risk and related experience: the intolerable boundary and the negligible boundary [70]. These boundaries form three risk regions and levels: the serious risk region, the ALARP region, and the negligible region. The top extreme of the principle is "accident", and the bottom extreme is "safety". The ALARP rule is shown in Figure 4. The values of the regions and boundaries of the ALARP principle are all relative, and there is no standard definition [72,73]. In practice, an expert evaluation method that considers potential severity, propagation probability, and propagation time can be applied to determine the boundary values [70,74,75], and alternative boundary values are obtained as well. The final results are determined by comparing and analyzing both sets of values. A risk value in the ALARP region is reasonably acceptable. Therefore, according to the ALARP principle, this paper classifies the risk value of each element into the different regions as a basis for proposing preventive measures that reduce the risk level and improve system safety.

Figure 4. As Low As Reasonably Practicable (ALARP) model.
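Once the two boundaries have been fixed, the classification itself is a pair of threshold comparisons. In the sketch below, the boundary values 3.0 and 8.0 and the treatment of values exactly on a boundary are assumptions made only for illustration; as noted above, the boundaries have no standard definition.

```python
def alarp_region(risk, negligible_boundary, intolerable_boundary):
    """Classify a risk value into one of the three ALARP regions."""
    if negligible_boundary >= intolerable_boundary:
        raise ValueError("negligible boundary must lie below intolerable boundary")
    if risk >= intolerable_boundary:
        return "serious risk region"
    if risk <= negligible_boundary:
        return "negligible region"
    return "ALARP region"

# Hypothetical boundaries and element risk values.
NEGLIGIBLE, INTOLERABLE = 3.0, 8.0
for name, risk in {"r1": 9.0, "r2": 5.0, "r3": 1.0}.items():
    print(name, alarp_region(risk, NEGLIGIBLE, INTOLERABLE))
```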

Safety risk entropy
Entropy [76,77] is essentially a measure of the degree of disorder in a system. Currently, there are three typical definitions: Clausius entropy, Boltzmann entropy, and Shannon entropy. In this paper, safety risk entropy is defined as the measure of all random factors in system safety risk. The preceding safety risk analysis shows that the randomness mainly derives from the probabilistic characteristics of each step of the fault propagation process. Therefore, following the definition of Shannon entropy, it is assumed that fault propagation from resource element $r_i$ to resource element $r_j$ requires $n$ steps, and the probabilities of fault propagation for the individual steps are $p_1, p_2, \ldots, p_n$ (based on the direct propagation probability matrix $C^{P}_{d}$). Then, based on Shannon entropy, $H_{ij}$ denotes the safety risk entropy from resource element $r_i$ to resource element $r_j$, as shown in Equation (16). In other words, $H_{ij}$ represents the uncertainty risk of fault propagation from $r_i$ to $r_j$. Moreover, the total safety risk entropy $H_i$ of resource element $r_i$, which affects the overall system, is calculated as in Equation (17). The higher the safety risk entropy value, the greater the uncertainty risk that a fault in this element poses to the system.
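A Shannon-style entropy over the step probabilities can be sketched as follows. The sketch assumes that the path entropy is the negative sum of p·ln(p) over the propagation steps and that the total entropy of an element is the sum of its path entropies over all target elements; these forms, the natural logarithm, and the example probabilities are assumptions, since Equations (16) and (17) are not reproduced in this text.

```python
from math import log

def path_entropy(step_probabilities):
    """Negative sum of p * ln(p) over the propagation steps of one path."""
    return -sum(p * log(p) for p in step_probabilities if p > 0)

def element_entropy(paths_from_element):
    """Total entropy of one element: sum over its paths to other elements."""
    return sum(path_entropy(steps) for steps in paths_from_element)

# Hypothetical step probabilities for the paths starting at one element.
paths = [
    [0.5, 0.5],  # two-step path to one target
    [1.0],       # certain one-step path contributes no uncertainty
]
print(round(element_entropy(paths), 4))
```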
Based on a comprehensive analysis of the results of the Pareto rule, the ALARP principle, and safety risk entropy, the coupling correlations of the elements in the serious risk region and of the critical risk factors are studied further to propose preventive measures that reduce the risk level and improve system safety.

Hierarchical model
Integrated modular avionics (IMA) [78,79] is a shared set of flexible, reusable, and interoperable hardware and software resources. When integrated, these resources can form a platform that provides service, designed and verified to a defined set of safety and performance requirements, to host applications performing aircraft functions [80]. Based on ASAAC criterion [81], IMA system is managed by a three-layer model: Aircraft Level (AL), Integration Area Level (IAL) and Resource Element Level (REL). This three-level hierarchy of IMA is typical task-function-resource model.
On the basis of the initial design plan of a certain aircraft, this integrated modular avionics (IMA) system is modeled as shown in Figure 5 and Table 3.

It is generally considered that, in the IMA system, the top-level system functional entity is unique, so the IMA system task set contains only one element, $T = \{t_1\}$, whose role is to manage the entire IMA system and support its operation. It can therefore be ignored. The function element set is then $F = \{f_1, f_2, f_3\}$, and the resource element set is $R = \{r_1, r_2, \ldots, r_9\}$.

Coupling degree Matrix
The function-resource element mapping coupling matrix $M_{fr}$ is presented in Matrix (18), and the direct coupling matrix $C_d$ in Matrix (19).

Indirect coupling degree
Based on the Floyd algorithm, the fault propagation path order is calculated from the direct coupling matrix $C_d$, which yields the indirect coupling degree matrix $C_c$ and the shortest path matrix $C^r$ of fault propagation, shown in Matrix (20) and Matrix (21), respectively. The entry $C^r_{ij}$ is the next element that a fault propagating from element $i$ to element $j$ passes through. For instance, for fault propagation from element 1 to element 9, $C^r_{19} = 3$ indicates that a fault in element 1 first propagates to element 3. Then $C^r_{39} = 5$ indicates that the fault next propagates from element 3 to element 5. Finally, $C^r_{59} = 9$ indicates that the fault propagates from element 5 to element 9. The fault propagation path from element 1 to element 9 is therefore 1 → 3 → 5 → 9. The other fault propagation paths can be deduced in the same way.
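The next-element lookup described above can be sketched with a textbook Floyd-Warshall pass that records, for each pair, the first hop of the shortest path. The 4-element matrix `direct` is a hypothetical toy example, not Matrix (19), and the function names are illustrative only. Treating the step count as the coupling degree is an assumption based on direct coupling having degree 1.

```python
import math


def floyd_next_hop(direct):
    """Floyd-Warshall on a 0/1 direct coupling matrix (0-based indices).

    Returns (dist, nxt): dist[i][j] is the number of propagation steps on a
    shortest path from i to j, and nxt[i][j] is the next element on that
    path (None if j cannot be reached from i).
    """
    n = len(direct)
    dist = [[math.inf] * n for _ in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and direct[i][j]:
                dist[i][j] = 1
                nxt[i][j] = j
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i != j and dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]  # first hop is inherited from i -> k
    return dist, nxt


def propagation_path(nxt, src, dst):
    """Follow next-hop entries from src until dst is reached."""
    if nxt[src][dst] is None:
        return []
    path = [src]
    while path[-1] != dst:
        path.append(nxt[path[-1]][dst])
    return path


# HYPOTHETICAL direct couplings: 1->2, 1->3, 2->3, 3->4 (1-based labels).
direct = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
dist, nxt = floyd_next_hop(direct)
path_1_to_4 = [e + 1 for e in propagation_path(nxt, 0, 3)]  # back to 1-based labels
print(dist[0][3], path_1_to_4)
```

Here `nxt` plays the role of the shortest path matrix and `dist` that of the indirect coupling degree matrix; repeated lookups in `nxt` reproduce the kind of chain traced in the 1 → 3 → 5 → 9 example.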

Potential severity
The potential severity effect is greatest when the elements are directly coupled (coupling degree 1), so the corresponding potential severity value is set to 1. When the coupling degree is 5 or above, the effect is smallest, and the corresponding potential severity value is set to 0.1. Based on Equation (7), this gives equation set (22):

$$S_p(1) = \frac{1}{1 + a(1 - c)^2} = 1, \qquad S_p(5) = \frac{1}{1 + a(5 - c)^2} = 0.1 \tag{22}$$

Solving equation set (22) gives $c = 1$ and $a = 0.56$. Then, from Equation (7), $S_p(2) = 0.64$, $S_p(3) = 0.31$, and $S_p(4) = 0.17$. The resulting potential severity matrix $S_p$ is shown as Matrix (23).
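The calibration above can be checked numerically. The sketch below solves the two boundary conditions in closed form and then evaluates the severity function with the rounded value $a = 0.56$ used in the text; the exact solution is $a = 9/16 = 0.5625$. The function names are illustrative.

```python
def potential_severity(x, a, c):
    """Severity function from the text: S_p(x) = 1 / (1 + a * (x - c)**2)."""
    return 1.0 / (1.0 + a * (x - c) ** 2)


def fit_parameters(x_max=1, x_min=5, s_min=0.1):
    """Solve S_p(x_max) = 1 and S_p(x_min) = s_min for (a, c)."""
    # S_p(x_max) = 1 requires a * (x_max - c)**2 = 0, so c = x_max for a != 0.
    c = float(x_max)
    # S_p(x_min) = s_min  =>  a = (1 / s_min - 1) / (x_min - c)**2
    a = (1.0 / s_min - 1.0) / (x_min - c) ** 2
    return a, c


a_exact, c = fit_parameters()
a = 0.56  # rounded value used in the text
severity = {x: round(potential_severity(x, a, c), 2) for x in range(1, 6)}
print(a_exact, c, severity)
```

Evaluating with the unrounded $a = 0.5625$ instead gives $S_p(4) \approx 0.165$, so the reported 0.17 depends on rounding $a$ first.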

Propagation probability
Fault injection is applied 10,000 times in a simulation of the IMA system, and the direct propagation probability matrix $C^P_d$ and the direct propagation time matrix $C^T_d$ are obtained, as shown in Matrix (25) and Matrix (26).
Here, the units of $C^P_d$ and $C^T_d$ are percentage (%) and seconds (s), respectively. Based on the direct propagation probability matrix $C^P_d$ and the propagation path, the indirect propagation probability is calculated by Equation (8). For example, the fault propagation path from element 1 to element 9 is 1 → 3 → 5 → 9. The propagation probabilities from element 1 to element 3, from element 3 to element 5, and from element 5 to element 9 are 0.8, 0.8, and 0.9, respectively, according to the entries $C^P_{13} = 0.8$, $C^P_{35} = 0.8$, and $C^P_{59} = 0.9$ of the direct propagation probability matrix $C^P_d$. The fault propagation probability from element 1 to element 9 is then 0.8 × 0.8 × 0.9 = 0.576. The whole indirect propagation probability matrix $C^P_c$ is formed in the same way, as shown in Matrix (27).

Propagation time
Similarly, based on the direct propagation time matrix $C^T_d$ and the propagation path, the indirect propagation time is calculated by Equation (9). For instance, the fault propagation path from element 1 to element 9 is 1 → 3 → 5 → 9. The fault propagation times from element 1 to element 3, from element 3 to element 5, and from element 5 to element 9 are 0.7 s, 0.7 s, and 0.4 s, respectively, according to the entries $C^T_{13} = 0.7$ s, $C^T_{35} = 0.7$ s, and $C^T_{59} = 0.4$ s of the direct propagation time matrix $C^T_d$. The propagation time from element 1 to element 9 is then 0.7 + 0.7 + 0.4 = 1.8 s. The whole indirect propagation time matrix $C^T_c$ is formed in the same way, as shown in Matrix (28).
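The two path-based calculations in the worked examples, a product of step probabilities and a sum of step times, can be sketched as follows. Only the three direct entries quoted in the text are used, since Matrices (25) and (26) are not reproduced here. The dictionary layout and the helper name `along_path` are illustrative assumptions.

```python
from math import prod


def along_path(path, direct):
    """Direct-matrix entries for each consecutive step of a propagation path."""
    return [direct[(a, b)] for a, b in zip(path, path[1:])]


path = [1, 3, 5, 9]  # propagation path from element 1 to element 9

# Only the entries quoted in the text, keyed by (source, target) element.
direct_prob = {(1, 3): 0.8, (3, 5): 0.8, (5, 9): 0.9}
direct_time = {(1, 3): 0.7, (3, 5): 0.7, (5, 9): 0.4}  # seconds

# Product of step probabilities, as in the worked example for Equation (8).
indirect_prob = prod(along_path(path, direct_prob))
# Sum of step times, as in the worked example for Equation (9).
indirect_time = sum(along_path(path, direct_time))

print(f"P(1 -> 9) = {indirect_prob:.3f}, T(1 -> 9) = {indirect_time:.1f} s")
```

Multiplying the step probabilities treats the steps as independent, which is what the worked example implies.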

Classification of risk factors
Considering the numerical ranges of the indirect propagation probability matrix $C^P_c$, the potential severity matrix $S_p$, and the indirect propagation time matrix $C^T_c$, qualitative risk factor levels (1 to 5, from light to heavy) are defined on the basis of experience, as shown in Table 4.

Results of risk
• Pareto rule
The results of $R_i$ and $\eta_i$ are sorted in descending order, as shown in Table 5. The Pareto charts are presented in Figures 6 and 7.

• ALARP principle
Under the ALARP principle, two risk boundaries are set based on expert experience. The intolerable boundary and the negligible boundary are set to 145 and 130, respectively; the alternative boundary values considered are 140 and 130. With the alternative values, elements 3, 4, 5, 6, and 8 all fall in the serious risk region. According to the Pareto rule, when distinguishing safety-critical links, about 80% of accidents can be considered to originate from 20% of the hazard sources. With the alternative values, however, the accumulated risk value of the elements in the serious risk region is well over 20%, which violates the Pareto rule. Therefore, the boundary values 145 and 130 are adopted. The risk level of each element is presented in Figure 8.
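The effect of the two candidate boundary pairs can be illustrated with a small classifier. The risk values below are hypothetical numbers invented for this sketch, chosen only to reproduce the region memberships reported in the text; the actual $R_i$ values are in Table 5 and are not reproduced here. The handling of values exactly on a boundary is also an assumption.

```python
def alarp_region(risk, intolerable, negligible):
    """Classify one risk value into the three ALARP regions.

    ASSUMPTION: values exactly on a boundary fall in the ALARP region.
    """
    if risk > intolerable:
        return "serious"
    if risk < negligible:
        return "negligible"
    return "ALARP"


def elements_in(region, risks, intolerable, negligible):
    """Sorted element ids whose risk value falls in the given region."""
    return sorted(
        e for e, r in risks.items()
        if alarp_region(r, intolerable, negligible) == region
    )


# HYPOTHETICAL risk values per element (not the values from Table 5).
risks = {1: 120, 2: 125, 3: 142, 4: 141, 5: 150, 6: 148, 7: 135, 8: 143, 9: 133}

serious_adopted = elements_in("serious", risks, intolerable=145, negligible=130)
serious_alternative = elements_in("serious", risks, intolerable=140, negligible=130)
print(serious_adopted, serious_alternative)
```

With these invented values, lowering the intolerable boundary from 145 to 140 moves elements 3, 4, and 8 into the serious region. This is the enlargement the text rejects as inconsistent with the Pareto rule.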
• Safety risk entropy
Based on Equation (16), the entropy matrix $H$ is presented as Matrix (32).
Based on Equation (17), the total safety risk entropy $H_i$ of element $i$, which describes its effect on the overall system, is calculated as shown in Matrix (33). The safety risk entropy of each element, in descending order, and the accumulated safety risk entropy are presented in Figure 9.

Discussion
1. According to the sorting results in Figures 6 and 7 and the Pareto rule, elements 5 and 6 can be defined as the safety-critical elements, and elements 1, 2, 3, 4, 7, 8, and 9 as safety-general elements.
2. Based on Figure 8, elements 5 and 6 are in the serious risk region; elements 3, 4, 7, 8, and 9 are in the ALARP region; and elements 1 and 2 are in the negligible region.
3. From Figure 9, elements 1 and 2 have higher uncertainty; elements 6, 7, and 9 have moderate uncertainty; and elements 3, 4, 5, and 8 have lower uncertainty.
4. In summary, elements 5 and 6 are the safety-critical elements and are located in the serious risk region, so they have a serious effect on the overall system. They also have a certain degree of uncertainty. Therefore, corresponding measures must be taken to ensure the safety of elements 5 and 6 in order to decrease the system risk. In other words, more attention must be paid to the Signal Processing Module in this avionics system. Elements 3, 4, 7, 8, and 9 are in the ALARP region and have lower uncertainty, which means that the risk caused by these elements is acceptable. Consequently, the Data Processing Module, Power Conversion Module, and Network Support Module should be given due attention if conditions permit. Although elements 1 and 2 have higher uncertainty, they are located in the negligible region, so the Graphics Processing Module can be ignored under limited conditions. If conditions permit, in view of the higher uncertainty of elements 1 and 2 (Graphics Processing Module), their reliability should be increased and the reliability of their correlations with other elements ensured, for example by ensuring the reliability of the data transmission channels between elements 1 and 2 and the other elements, and of the information transmission bus. With these measures, fault propagation from elements 1 and 2 to other elements can be reduced, which in turn reduces the risk that their high uncertainty poses to the overall system.

Conclusions
To address the insufficiency of traditional safety analysis and risk assessment techniques in handling coupling between components in complex systems, this study proposed a novel risk assessment and analysis method for correlation in complex systems based on multi-dimensional theory. First, a matrix-based hierarchical model of the complex system was presented, and the correlation relationships between elements in the system were established. Then, based on these correlation relationships, the multi-dimensional theory and model were proposed in order to evaluate risk more objectively. Finally, based on the Pareto rule, the ALARP principle, and safety risk entropy, the critical risk elements were identified, which provides a theoretical basis for preventive measures that ensure and improve system safety. Compared with current methods and technologies, the proposed method has two main advantages. On the one hand, the hierarchical model is expressed in matrix form, so the association relationships among the elements of the complex system can be analyzed quickly and accurately, which reduces the skill requirements for analysts. On the other hand, it provides a feasible, multi-faceted analysis method for the risk assessment of systems with respect to fault propagation, which serves as the core criterion for identifying critical risk factors and is of great significance for ensuring system safety.