Article

Multiple Correlation Analysis of Operational Safety of Long-Distance Water Diversion Project Based on Copula Bayesian Network

1
School of Water Conservancy, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
2
Central China Regional Headquarters of Powerchina Road-Bridge Co., Ltd., Zhengzhou 450000, China
3
Western Regional Headquarters of Powerchina Road-Bridge Group Co., Ltd., Chengdu 610000, China
4
Southeast Regional Headquarters of Powerchina Road-Bridge Group Co., Ltd., Hangzhou 310000, China
*
Author to whom correspondence should be addressed.
Water 2025, 17(16), 2389; https://doi.org/10.3390/w17162389
Submission received: 23 June 2025 / Revised: 2 August 2025 / Accepted: 7 August 2025 / Published: 12 August 2025

Abstract

Based on the Copula theory, a multiple correlation analysis model for the operation safety risks of long-distance water diversion projects was established. Combined with Bayesian network reasoning, a polynomial regression analysis, and other techniques, a dynamic analysis method for the operation safety of long-distance water diversion projects based on a Copula Bayesian network model was proposed, providing decision support for the operation safety risk management of long-distance water diversion projects. We took the Middle Route Project of the South-to-North Water Diversion Project as an example to verify the validity and practicability of the model. The results show that this method can capture the nonlinear mapping relationship when the probability of risk occurrence changes dynamically on the basis of considering the risk correlation, and realize the dynamic analysis of risk correlation.

1. Introduction

Long-distance water diversion projects are an effective way to address the uneven distribution of water resources: they alleviate the imbalance between water supply and demand in water-deficient areas, optimize the spatial allocation of water resources, and serve as important infrastructure for promoting sustainable social and economic development [1]. At present, many long-distance water diversion projects are in operation worldwide, such as the California State Water Transfer Project and the Central Valley Water Transfer Project in California, USA, the Great Manmade River Project in southern Libya, the Quebec Water Transfer Project in Canada, the West-to-East Water Transfer Project in Pakistan, the Karakum Canal Project in the south of Turkmenistan, and China’s South-to-North Water Transfer Project [2,3]. Safety management during the operation period of long-distance water diversion projects involves engineering, social, and environmental factors. The risk factors are numerous and interrelated [4,5,6]. For instance, the rainstorm and flood in Beijing on 21 July 2012 exposed the culvert of the Beijuma River and caused severe erosion [7]. The torrential rain on 20 July 2021 in Zhengzhou caused large-scale landslides in the Wenxian section of the Yellow River Crossing Project and the collapse of the inverted siphon embankment of the Dashahe River [8]. Studying the multiple correlations among risks is therefore important for reducing risk impacts and improving project management.
At present, some scholars have conducted safety risk analyses and assessments of long-distance water diversion projects during their operation period from different perspectives, such as natural disasters, the environment, hydrology, ecology, and structure. In terms of natural disasters, Jin et al. [9] analyzed the flood risk of inverted siphon structures through the sensitivity analysis method. Feng Ping et al. [10] proposed a novel risk combination model based on two-dimensional composite events and a method for estimating flood control risks in water diversion projects. Chen Xinjian et al. [11] accurately predicted and deeply analyzed the geological disaster risks of the Han River to the Wei River Water Diversion Project. Based on a study of the spatial distribution of ice damage risks in the Middle Route Project of the South-to-North Water Diversion Project, Li Fen et al. [12] quantitatively evaluated the risks of different canal sections using the fuzzy comprehensive evaluation method. In terms of environmental risk research, Yan et al. [13] studied the risk of algal sludge deposition during the operation period of long-distance water diversion projects and proposed corresponding preventive measures. Fang Shanqi et al. [14] studied the spatial distribution of chemical fertilizer use in the water sources of the South-to-North Water Diversion Project and evaluated its degree of environmental pollution. In the field of hydrological risks, Chao et al. [15] proposed an emergency water supply plan for long-distance water diversion projects and verified its feasibility by taking the Danjiangkou Reservoir as an example. Lin et al. [16] studied the safety risk factors of pipeline water supply systems in long-distance water diversion projects and formulated risk response measures. Zhai Jiaqi et al. [17] conducted a water supply hydrological risk simulation for the South-to-North Water Diversion Project using the SWAT model. 
In terms of research on the risks of engineering structures, Jia Chao et al. [18] analyzed the failure risk of the aqueduct of the South-to-North Water Diversion Project and evaluated the reliability of the aqueduct body under different working conditions using the three-dimensional finite element method. Song Xuan et al. [19] identified the risks of crossing structures in the Middle Route Project of the South-to-North Water Diversion Project. Liu Kang et al. [20] used a dynamic Bayesian network model to evaluate the structural safety of water diversion tunnels. Wen Shiyu et al. [21] systematically sorted out the geological disaster-causing structures of water diversion tunnels based on an analysis of the formation mechanism of geological disasters. Zhang Shuhao et al. [22] constructed a surrounding rock stability evaluation model based on a matter-element extension. Zhang et al. [23] proposed a multi-source sensor data fusion method based on improved D-S evidence theory and a BP neural network to evaluate the structural safety of diversion structures. Gong Li et al. [24] evaluated the winter operation safety of long-distance open channels in cold areas based on game-improved extension theory. Chen et al. [25] conducted safety tests and evaluations of the Dabeishan aqueduct by using UAV aerial photography, BIM, 3D laser scanning, and other digital technologies. Zhang Zhen et al. [26] established a risk assessment index system for beam-type aqueducts, comprehensively evaluated six typical projects by using game theory and a cloud model, and proposed targeted improvement measures.
Risk identification has traditionally relied on expert experience, using approaches such as the Delphi method, checklist method, risk decomposition method, fault tree analysis method, and graphical method. With the accumulation of data from engineering operation and management processes, risk identification has gradually shifted toward objective data mining, such as spatio-temporal data mining models, text mining, and machine learning. The risk factors obtained through data mining can be used to further sort out and measure the interrelationships among risk factors and to estimate the possibility of risk occurrence [27]. Ma et al. [28] combined a directed weighted complex network with an improved risk matrix to analyze the risk evolution of marine accidents caused by typhoons and evaluate their serious consequences. Zhang et al. [29] employed a multi-layer complex network approach to abstract and model the development process from drivers’ historical traffic violations to their subsequent accidents, revealing the inherent laws. Lu et al. [30] identified the weights of the impact of different events on accidents by constructing a structural equation model. Luo et al. [31] modeled and analyzed the event state transition process by the Monte Carlo method, so as to improve the accuracy of risk occurrence probability estimation. Zhang Yexiang et al. [32] combined fault trees with dynamic Bayesian networks to construct a fuzzy dynamic risk assessment model and identified key risk factors through reverse reasoning. Zhang Jiechao et al. [33] determined the weights by using the OTM-IAHP entropy weight combined weighting method and comprehensively evaluated the risk level in combination with the two-dimensional cloud model. Fu L. et al. [34] proposed a risk interaction modeling and analysis program based on association rule mining and a weighted network, and carried out a risk assessment of a subway’s deep foundation pit engineering. Davide et al. [35] used CPNs to model the risks in ERP projects and analyze the correlation of risks. The above-mentioned research reveals the correlation mechanisms among the risks of the respective research objects, but cannot analyze them quantitatively and dynamically.
Currently, there are various methods for dynamically analyzing the correlation between variables, such as artificial neural networks (including LSTM networks) and Bayesian networks. These methods can establish the correlation between variables using a large amount of historical data, transforming dynamic temporal modeling problems into static spatial modeling problems. However, conventional artificial neural networks have difficulty describing the autocorrelation of time series, and their prediction accuracy is limited. An LSTM neural network has high prediction accuracy, but the resulting model is complex and difficult to train. A Bayesian network, as a risk inference and fault diagnosis tool integrating probability theory and graph theory, has been widely used in many fields, such as medicine, system reliability analysis, and risk analysis [36]. However, Bayesian networks rely on discrete data for node inference, which may complicate the calculation of operational safety risk probabilities for long-distance diversion projects [37,38]. The construction of a Bayesian network should be based on an accurate analysis of the correlation between risks. The Copula function is a mathematical tool for solving multivariate probability problems. It connects the joint distribution function with the marginal distribution functions to reflect the correlation among variables without restricting the marginal distributions, and it does not change the information of the original variables during the transformation, thus avoiding the information loss and increased computational cost caused by discretizing continuous variables. In recent years, some scholars have introduced Copula theory into Bayesian networks and used simulations to determine the correlation between parameters [39]. Zha X et al. [40] proposed a new framework for risk analysis by coupling Bayesian networks and Copula theory, which can handle the dependency of nodes better. Sun Y et al. [41] introduced Bayesian networks into reliability analysis and proposed a Copula Bayesian network (CBN), verifying the rationality of the CBN model. Ghosh A et al. [42] verified the stability of a Bayesian network based on Copula theory. Lasserre M et al. [43] utilized a Copula Bayesian network model to address the issue of the lack of continuous variables in general models.
Although existing research has made significant progress in the risk assessment of long-distance water diversion projects, its methodology still has limitations. First, tools dominated by static paradigms, such as fault trees and risk matrices, cannot reveal the dynamic coupling of risk factors that evolve with the seasons; for example, the correlation intensity between flood disasters and channel engineering risks changes significantly over time. Second, when Bayesian networks encounter high-dimensional interaction scenarios, the discretized conditional probability table not only loses continuous-variable information but also triggers a combinatorial explosion as the number of parent nodes increases, and it cannot depict nonlinear tail dependencies, resulting in a sharp decline in the modeling accuracy of risk interactions above five dimensions. Third, the systematic absence of spatial dimensions makes it impossible for models to capture the cascading effects of local risks, such as landslides being transmitted downstream through the water system network. In response to these bottlenecks, this paper proposes a hybrid framework that embeds the R-Vine Copula into Bayesian networks. On the one hand, the R-Vine is used to model the nonlinear dependence of high-dimensional continuous variables without discretization, and the correlation strength is quantified with the Kendall rank correlation coefficient while fully retaining the marginal distribution features. On the other hand, the time-varying deduction of risk probability is achieved by constructing a dynamic Copula Bayesian network, thereby incorporating temporal evolution, spatial propagation, and high-dimensional interaction into a unified assessment system for the panoramic analysis of risks in long-distance water diversion projects.

2. Methodology

2.1. Identification of Risk Factors

Text mining technology analyzes the operation safety text of long-distance water diversion projects through four steps: selecting a corpus, text normalization processing, constructing a keyword dictionary, and constructing a synonym dictionary. It determines the set of risk factors for the operation safety of long-distance water diversion projects as $R = \{R_1, R_2, R_3, \ldots, R_S\}$, where $R_m, R_n \in R$ represent risk factors; $m, n = 1, 2, \ldots, S$ index the risk factors; $S$ is the number of risk factors; and $m \neq n$.

2.2. Marginal Probability Distribution Determination

To obtain an accurate statistical model, a theoretical probability distribution must be fitted to the risk samples, and the probability distribution model is determined after testing. The choice of marginal probability distribution directly affects the accuracy of Copula function modeling, and a reasonable marginal probability distribution is the premise of a correct estimation of the Copula function. Safety risks in long-distance water diversion projects are of many kinds with different characteristics, and their probability distributions may take different forms. In this paper, common univariate probability distribution models, such as the Normal, Exponential, Weibull, Log-logistic, and Gamma distributions, are used to determine the probability distribution of safety risk factors for long-distance water diversion project operation. The maximum likelihood method is used to estimate the parameters.
For the marginal probability distribution preliminarily calculated for each risk, strict mathematical tests are required to determine the marginal probability distribution function that best fits the actual distribution. Goodness-of-fit tests are usually adopted to judge whether the fitted distribution is consistent with the actual distribution. Common goodness-of-fit test methods include the K-S test, chi-square test, and A-D test. In this paper, the K-S test is used. The K-S test reflects the deviation between the theoretical distribution function and the empirical distribution function, and its statistic is defined as follows:
$$T = \max_{x_{R_m}} \left| F^{*}(x_{R_m}) - F(x_{R_m}) \right|$$
where $F^{*}(x_{R_m})$ is the empirical cumulative distribution function of risk $R_m$, $F(x_{R_m})$ is the theoretical distribution function of risk $R_m$, and $T$ is the maximum absolute deviation between the two.
If there are multiple distributions that pass the K-S test, the goodness-of-fit test is performed using the AIC, BIC, and OLS methods to determine the optimal probability distribution function for each risk factor.
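The fit-test-rank procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes strictly positive samples, uses SciPy's `fisk` as the log-logistic family, fixes the location of the positive-support families at zero, and ranks by AIC only (the OLS criterion is omitted). The function name `fit_marginal` is hypothetical.

```python
import numpy as np
from scipy import stats

# Candidate families named in the text; scipy's `fisk` is the log-logistic distribution.
CANDIDATES = {
    "normal": stats.norm,
    "exponential": stats.expon,
    "weibull": stats.weibull_min,
    "log-logistic": stats.fisk,
    "gamma": stats.gamma,
}

def fit_marginal(sample, alpha=0.05):
    """Fit each candidate by maximum likelihood, keep those that pass the
    K-S test at level alpha, and return them sorted by AIC (best first)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    rows = []
    for name, dist in CANDIDATES.items():
        # Assumption: positive-support families are fitted with location fixed at 0.
        fixed_loc = name != "normal"
        params = dist.fit(sample, floc=0) if fixed_loc else dist.fit(sample)
        ks_stat, p_value = stats.kstest(sample, dist.cdf, args=params)
        log_lik = np.sum(dist.logpdf(sample, *params))
        k = len(params) - (1 if fixed_loc else 0)  # count free parameters only
        rows.append({
            "name": name, "params": params, "ks": ks_stat, "p": p_value,
            "aic": 2 * k - 2 * log_lik,
            "bic": k * np.log(n) - 2 * log_lik,
        })
    passed = [r for r in rows if r["p"] > alpha]
    # If nothing passes the K-S test, fall back to ranking every candidate.
    return sorted(passed or rows, key=lambda r: r["aic"])
```

Because the parameters are estimated from the same sample that is tested, the K-S p-values here are optimistic; they serve as a screening step before the information criteria decide.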

2.3. Construction of a Risk-Associated Network Hierarchical Structure

2.3.1. Multidimensional R-Vine Copula Function

The basic theory of Vine Copula is as follows:
For $d$-dimensional random variables $x_1, x_2, x_3, \ldots, x_d$, the joint probability density function $f(x_1, x_2, x_3, \ldots, x_d)$ is
$$f(x_1, x_2, x_3, \ldots, x_d) = f_d(x_d) \cdot f(x_{d-1} \mid x_d) \cdot f(x_{d-2} \mid x_{d-1}, x_d) \cdots f(x_1 \mid x_2, x_3, \ldots, x_d)$$
According to Copula theory, the multivariate joint density function can be described as
$$f(x_1, x_2, x_3, \ldots, x_d) = c_{12 \ldots d}\left(F_1(x_1), F_2(x_2), \ldots, F_d(x_d)\right) \cdot \prod_{s=1}^{d} f_s(x_s)$$
where $c_{12 \ldots d}\left(F_1(x_1), F_2(x_2), \ldots, F_d(x_d)\right)$ is the $d$-dimensional Copula density function and $f_s(x_s)$ is the marginal density function.
A $d$-dimensional Vine is decomposed into $d-1$ trees. The $j$-th tree has $d-j+1$ nodes and $d-j$ edges, and each edge represents a two-dimensional Pair Copula function. Vine Copulas include the regular Vine (R-Vine) Copula, the D-Vine Copula, the C-Vine Copula, and other types. In the C-Vine Copula model, each tree has a root node, and the selection of root nodes needs to reflect the strongest correlation between the variables. The D-Vine Copula model focuses on arranging the node order of the first layer, while the R-Vine Copula model fully considers the association relationship between each pair of nodes. The structure of the four-dimensional R-Vine Copula is shown in Figure 1.
The corresponding joint density is
$$f(x_1, x_2, x_3, x_4) = f(x_1) \cdot f(x_2) \cdot f(x_3) \cdot f(x_4) \cdot c_{12} \cdot c_{13} \cdot c_{14} \cdot c_{23 \mid 1} \cdot c_{34 \mid 1} \cdot c_{24 \mid 13}$$
The R-Vine Copula model expresses the correlation problem of multidimensional random variables as the correlation problem of multiple two-dimensional random variables through a regular tree structure. The R-Vine Copula model with $d$-dimensional random variables consists of $d-1$ trees, denoted $T_1, T_2, T_3, \ldots, T_{d-1}$. For the $k$-th tree $T_k$, $e$ denotes an edge, $E_k$ is the edge set, and $N_k$ is the node set. Once the structure of the R-Vine Copula tree with $d$-dimensional random variables is determined, the $d$-dimensional joint probability density is as follows:
$$f(x_1, x_2, x_3, \ldots, x_d) = \prod_{s=1}^{d} f_s(x_s) \cdot \prod_{k=1}^{d-1} \prod_{e \in E_k} c_{i,j \mid D_e}\left(F(x_i \mid x_{D_e}), F(x_j \mid x_{D_e})\right)$$
where $D_e = \{c_1, c_2, \ldots, c_n\}$ is the set of conditioning indices, and the corresponding variable set is $x_{D_e} = \{x_{c_1}, x_{c_2}, \ldots, x_{c_n}\}$; $F(x_i \mid x_{D_e})$ is the conditional probability distribution of $x_i$ when the variables in $x_{D_e}$ are known; $(i, j \mid D_e)$ denotes the edge $e$ in $E_k$, where $i$ and $j$ denote the variables $x_i$ and $x_j$, respectively, and $D_e$ denotes the common conditioning set of these two variables. $(i, j \mid D_e)$ satisfies the following properties:
(1)
When $k = 1$, $D_e = \varnothing$; $i$ and $j$ are the two end nodes of edge $e$;
(2)
When $k > 1$, denote the element sets of the two adjacent nodes of the $(k-1)$-th tree that are joined by $e$ as $S_1$ and $S_2$, respectively; then $D_e = S_1 \cap S_2$. Let $T = (S_1 \cup S_2) \setminus (S_1 \cap S_2)$; then $i, j \in T$ and $i \neq j$.
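As a worked illustration of the two properties, consider the smallest non-trivial case, $d = 3$, with first-tree edges $\{1,2\}$ and $\{2,3\}$ (this structure is chosen here only as an example and is not taken from the case study). The second tree has a single edge joining the nodes $S_1 = \{1,2\}$ and $S_2 = \{2,3\}$, so $D_e = S_1 \cap S_2 = \{2\}$ and the conditioned pair is $\{1,3\}$:

```latex
\begin{aligned}
f(x_1, x_2, x_3) ={}& f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
&\cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\,
       c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
&\cdot c_{13 \mid 2}\bigl(F(x_1 \mid x_2), F(x_3 \mid x_2)\bigr)
\end{aligned}
```

The first line contains the marginal densities, the second the unconditional Pair Copulas of the first tree, and the third the single conditional Pair Copula of the second tree.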

2.3.2. Two-Dimensional Copula Function

Currently, the Copula functions commonly used in engineering risk analysis can be categorized into two major families: elliptical and Archimedean. The elliptical family is represented by the Gaussian Copula and t-Copula, while the Archimedean family focuses on three classical models—Gumbel, Clayton, and Frank. This study addresses the multi-source risk coupling problem in the operational safety of long-distance water diversion projects by selecting the aforementioned five Copulas for bivariate joint modeling.
The Gumbel Copula, with its upper tail dependence characteristic, accurately captures the synergistic amplification effect of extreme risk events. The Clayton Copula, relying on lower tail dependence, demonstrates superior performance at characterizing the correlation of low-probability, high-consequence risks. Meanwhile, the Frank Copula excels at symmetric dependence structures, making it suitable for robust descriptions of general risk associations. The final selection of the Copula functions was determined not only based on classical information criteria, such as AIC and BIC, but also through a combined diagnostic approach involving Kendall’s rank correlation coefficient heatmaps and tail dependence coefficients. This approach validated the rationality of model selection from both statistical significance and engineering interpretability perspectives.
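The tail behaviour that separates these families can be made concrete with the standard closed-form relations between Kendall's rank correlation coefficient and the Copula parameter. The sketch below is illustrative only; the function names are hypothetical, and the formulas hold for positive dependence ($0 < \tau < 1$).

```python
def gumbel_theta_from_tau(tau):
    """Gumbel parameter implied by Kendall's tau: theta = 1 / (1 - tau)."""
    return 1.0 / (1.0 - tau)

def clayton_theta_from_tau(tau):
    """Clayton parameter implied by Kendall's tau: theta = 2 tau / (1 - tau)."""
    return 2.0 * tau / (1.0 - tau)

def gumbel_upper_tail(theta):
    """Upper tail dependence coefficient of the Gumbel Copula."""
    return 2.0 - 2.0 ** (1.0 / theta)

def clayton_lower_tail(theta):
    """Lower tail dependence coefficient of the Clayton Copula."""
    return 2.0 ** (-1.0 / theta)

# Same rank correlation, different tail behaviour.
tau = 0.5
lambda_upper = gumbel_upper_tail(gumbel_theta_from_tau(tau))    # 2 - sqrt(2)
lambda_lower = clayton_lower_tail(clayton_theta_from_tau(tau))  # 1 / sqrt(2)
```

The Frank Copula, and the Gaussian Copula with correlation below one in absolute value, have zero tail dependence in both tails, which is why tail dependence coefficients complement AIC and BIC when choosing among families.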
For the parameter estimation of the two-dimensional Copula function, this paper adopts the maximum likelihood estimation method, which maximizes the joint probability density function of a given data set. Theoretically, as the number of samples increases, the maximum likelihood estimate approaches the true value. Maximum likelihood estimates in the Copula function can be obtained as follows:
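$$\left(\hat{\alpha}_{R_m}, \hat{\alpha}_{R_n}; \hat{\rho}_{R_m R_n}\right) = \arg\max \sum_{m,n=1}^{a} \ln c\left(\alpha_{R_m}, \alpha_{R_n}; \rho_{R_m R_n}\right)$$
where $\alpha_{R_m}$ and $\alpha_{R_n}$ are parameters of the probability distribution functions $F_{R_m}(y)$ and $F_{R_n}(y)$, respectively, with $m \neq n$, and $c(\cdot)$ is the Copula density function.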
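A minimal sketch of Copula parameter estimation by maximum likelihood is given below for the Clayton family. It is a simplification of the estimator above: the marginal parameters are assumed to be already fitted, so the inputs are pseudo-observations in (0, 1) and only the dependence parameter is optimized. All function names and the search bounds are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import kendalltau

def clayton_logpdf(u, v, theta):
    """Log-density of the Clayton Copula for theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (np.log1p(theta)
            - (1.0 + theta) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(s))

def fit_clayton(u, v):
    """Maximum likelihood estimate of theta from pseudo-observations."""
    def negative_log_likelihood(theta):
        return -np.sum(clayton_logpdf(u, v, theta))
    result = minimize_scalar(negative_log_likelihood,
                             bounds=(1e-3, 20.0), method="bounded")
    return result.x

def clayton_theta_from_kendall(u, v):
    """Moment-type estimate via Kendall's tau, useful as a cross-check."""
    tau, _ = kendalltau(u, v)
    return 2.0 * tau / (1.0 - tau)

def sample_clayton(n, theta, rng):
    """Draw n pairs from a Clayton Copula by conditional inversion."""
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v
```

Fitting on samples of increasing size drawn with `sample_clayton` shows the estimate tightening around the true parameter, which is the consistency property the text refers to.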

2.3.3. Hierarchical Structure of Risk Correlation Network

The CBN model consists of nodes and arcs. An integrated DEMATEL-ISM approach is adopted to construct the hierarchical structure of risk factor correlations, enabling a more accurate investigation of interdependencies among risk factors and determination of the direction of arcs in the CBN model. To reduce the computational complexity of identifying the optimal bivariate Copula functions, cross-layer dependency relationships are eliminated, transforming the originally constructed multi-level hierarchical structure into a simplified model containing only direct dependencies. The resulting multi-level hierarchical structure is then used as a candidate R-Vine Copula structure, within which the optimal bivariate Copula functions are determined.
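The text does not spell out the DEMATEL-ISM computations, so the following is only a generic sketch of the two standard steps: DEMATEL's total-relation matrix and ISM's level partition. The normalization rule, the threshold value, and the function names are assumptions for illustration and may differ from the authors' settings.

```python
import numpy as np

def dematel_total_relation(direct):
    """DEMATEL: normalise the direct-influence matrix to N and
    return the total-relation matrix T = N (I - N)^-1."""
    direct = np.asarray(direct, dtype=float)
    scale = max(direct.sum(axis=1).max(), direct.sum(axis=0).max())
    n = direct / scale
    return n @ np.linalg.inv(np.eye(len(direct)) - n)

def ism_levels(total, threshold):
    """ISM: threshold T into an adjacency matrix, take its transitive
    closure (reachability matrix), and partition the factors into levels."""
    size = len(total)
    reach = (np.asarray(total) >= threshold) | np.eye(size, dtype=bool)
    for k in range(size):  # Warshall's algorithm
        reach |= np.outer(reach[:, k], reach[k, :])
    remaining, levels = set(range(size)), []
    while remaining:
        level = []
        for i in remaining:
            reachable = {j for j in remaining if reach[i, j]}
            antecedent = {j for j in remaining if reach[j, i]}
            if reachable <= antecedent:  # reachability set inside antecedent set
                level.append(i)
        levels.append(sorted(level))
        remaining -= set(level)
    return levels

# Hypothetical three-factor chain: factor 0 drives 1, which drives 2.
direct = [[0, 3, 0], [0, 0, 3], [0, 0, 0]]
total = dematel_total_relation(direct)
levels = ism_levels(total, threshold=0.5)  # top level (final effects) first
```

Dropping the cross-layer links, as the text describes, would amount to keeping only the edges between consecutive entries of `levels`.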

2.4. Copula Bayesian Network Model Construction

2.4.1. Bayesian Network

In a traditional Bayesian network (BN), each node carries a conditional probability table (CPT), and the probability distribution of every node is obtained by propagating the given initial conditions along the directed edges according to these conditional probabilities. For a set of random variables $X = \{x_1, x_2, x_3, \ldots, x_d\}$, the joint probability distribution $F(x_1, x_2, x_3, \ldots, x_d)$ is represented by the product of the CPTs associated with each node:
$$F(x_1, x_2, x_3, \ldots, x_d) = \prod_{s=1}^{d} F\left(x_s \mid pa(x_s)\right)$$
where $pa(x_s) \subseteq X$ is the set of parents of node $x_s$, and $F\left(x_s \mid pa(x_s)\right)$ represents the probability distribution of $x_s$ given $pa(x_s)$.
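The factorization can be illustrated with a toy discrete network. The three nodes and every probability below are invented for illustration and do not come from the case study.

```python
# Hypothetical chain: flood (F) -> channel damage (C) -> supply interruption (S).
p_f = {True: 0.10, False: 0.90}
p_c_given_f = {True: {True: 0.60, False: 0.40},
               False: {True: 0.05, False: 0.95}}
p_s_given_c = {True: {True: 0.70, False: 0.30},
               False: {True: 0.02, False: 0.98}}

STATES = (True, False)

def joint(f, c, s):
    """Joint probability as the product of each node's CPT entry."""
    return p_f[f] * p_c_given_f[f][c] * p_s_given_c[c][s]

def marginal_s(s=True):
    """Forward inference: sum the joint over the unobserved nodes."""
    return sum(joint(f, c, s) for f in STATES for c in STATES)

def posterior_f_given_s(f=True, s=True):
    """Backward (diagnostic) inference by Bayes' rule."""
    return sum(joint(f, c, s) for c in STATES) / marginal_s(s)
```

With two states per node the tables stay small, but each additional parent multiplies the number of CPT rows, which is the growth problem that motivates replacing the tables with Copulas.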

2.4.2. Topological Model Construction

Classical Bayesian network structures are obtained mainly from expert experience or from structure learning. Directly mapping the risk factor association hierarchy built by the DEMATEL-ISM model into a Bayesian network leads to high complexity and parameter uncertainty, and the resulting model is strongly subjective. Structure learning, in contrast, selects the network that best suits the sample data through an algorithm. Such algorithms have the disadvantages of complicated calculation, high time cost, and insufficient effectiveness when modeling a complex system with high-dimensional correlation. For example, the score-and-search-based K2 algorithm requires prior information and therefore has limited applicability; the Markov chain Monte Carlo (MCMC) algorithm can guarantee learning accuracy, but converges slowly; and the hill-climbing (HC) algorithm is simple, but is prone to local optima and is not suited to finding the global optimum.
This study adopts a hybrid method for constructing the Bayesian network structure that combines expert knowledge with the R-Vine Copula function. The operational safety risks of long-distance water diversion projects are taken as the network nodes. Based on the determined risk correlation relationships, the first-layer structure of the optimal R-Vine Copula tree is used as the topological structure. Nodes are linked with directed arrows that represent the correlation between risks. Each arrow points from the influencing risk to the influenced risk, so the network topology is constructed with the influencing risk as the parent node and the influenced risk as the child node.

2.4.3. Determination of Model Parameters

To construct a complete dynamic Copula Bayesian network model, it is necessary to obtain the occurrence probability of risks in the associated situation. Traditional Bayesian networks require clear probabilities or data as the prior probabilities for input. They allocate marginal distributions to the source nodes and conditional probability tables to the child nodes. However, due to the need for a complete database, the nodes cannot have too many states. Moreover, the child nodes can only have a small number of parent nodes, and traditional Bayesian networks are not suitable for complex systems. Meanwhile, the conditional probability table is discrete data and cannot represent continuous variables, which may lead to certain deviations between the analysis results and reality. The CBN model constructed in this study uses the Copula function to represent the correlation between variables and define the joint density, which makes up for the insufficient calculation accuracy of the traditional Bayesian model and combines it with expert experience to reduce the calculation error.
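To show how a Copula can stand in for a conditional probability table, the sketch below evaluates the conditional distribution of a child node given its parent under a bivariate Gaussian Copula, for which a closed form exists. The Gaussian family is chosen here only for that reason; the function name and the numbers are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_conditional(v, u, rho):
    """P(V <= v | U = u) for a bivariate Gaussian Copula with correlation rho.

    u and v are the parent and child values on the uniform (probability) scale,
    i.e. after applying each node's marginal distribution function."""
    z_u, z_v = norm.ppf(u), norm.ppf(v)
    return norm.cdf((z_v - rho * z_u) / np.sqrt(1.0 - rho ** 2))

# Parent observed at its 90th percentile, dependence rho = 0.7:
# probability that the child stays below its own median.
p_child_below_median = gaussian_copula_conditional(0.5, 0.9, 0.7)
```

Because the inputs live on the probability scale, any continuous marginal can be attached to either node without discretizing it, which is the property the CBN model relies on.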

2.5. Simulation Analysis

At present, there are many kinds of Bayesian network simulation tools at home and abroad. In this paper, UNINET3.3.0 software designed by the risk and environment modeling team of Technische Universiteit Delft is used. Different from Netica, GeNIe, Unicet, and other Bayesian network simulation software, the main function of UNINET is to model the structure of high-dimensional distribution with correlation, and random variables can be combined with the Vine Copula function and BN.

3. Example Analysis

3.1. Project Overview

The main canal of the Middle Route Project of the South-to-North Water Diversion starts from the Taocha Canal Head Hub in Xichuan County, Henan Province. Most of the canal route runs along the piedmont of Songshan Mountain, Funiu Mountain, and Taihang Mountain, west of the Beijing–Guangzhou Railway. The canal passes through four provinces and municipalities (Henan, Hebei, Beijing, and Tianjin) and spans four major river basins: the Yangtze River, Huaihe River, Yellow River, and Haihe River. The total length of the line is 1431.945 km, with an annual water conveyance capacity of 14.14 billion cubic meters. The project is characterized by its large scale, complex geological conditions along the line, a complex composition of structures, and complex dispatching operations. The safety of the project during the operation period is therefore of vital importance.

3.2. Identification of Risk Factors

This article used the “China South-to-North Water Diversion Project Construction Yearbook”, the project operation safety risk analysis report, and network data as the main data sources for risk identification. The text processing was carried out using the batch file processor and Ultra-replace tool in ROST CM6.0 software to ensure the integrity and accuracy of the risk identification data. After text processing, a total of six primary risks and 19 secondary risks were obtained. The operation safety risk index system of the Middle Route Project of the South-to-North Water Diversion Project is shown in Table 1.
This risk classification system was constructed by systematically reviewing the literature on similar water diversion projects [4,5,6,9,10,11,12,13,14] and verified in combination with the relevant data. The system encompasses six core dimensions (natural risk, engineering risk, water quality risk, operational risk, economic risk, and social security risk), covering not only traditional engineering risks but also integrating the crucial non-traditional factors in long-distance water transmission systems. This framework has been verified through both expert interviews and case analyses, ensuring its scientific nature and practicality.

3.3. Marginal Probability Distribution Determination

Under the condition of a limited sample size, this study adopted the overall trend diffusion technique to expand the sample size. This method first extracts the multi-scale features (including annual cycle, seasonal cycle, and emergency event features) from the original monitoring data through a wavelet analysis, and then establishes a trend diffusion model in combination with the hydraulic characteristics of long-distance water diversion projects. This technology ensures that the generated virtual samples not only maintain the statistical characteristics of the original data (verified by the K-S test, p > 0.05), but also conform to the laws of engineering physics. Ultimately, the limited samples were expanded into training samples that met the research requirements, effectively supporting the subsequent research work.
The overall trend diffusion technique was combined with Monte Carlo virtual sample generation to expand the sample size for calculating the marginal distributions of the occurrence probabilities of each risk. The fitted distributions were tested with the K-S test at a significance level of 0.05, and goodness of fit was then compared using the AIC, BIC, and OLS methods. The distribution function with the smallest statistic was taken as the optimal marginal distribution function; the marginal probability distribution functions of the risks are shown in Table 2.
For extreme risk events, such as social security risks that have the characteristics of “low occurrence frequency and high impact consequences”, the generalized extreme value distribution (GEV) demonstrates unique modeling advantages: it includes three types of extreme values, namely Gumbel, Frechet, and Weibull. By fitting the actual sample data and using various goodness-of-fit test methods, such as the K-S test, it was found that the GEV distribution could optimally reflect the tail characteristics of the observed data in the probability distribution modeling of these risks and accurately describe the occurrence probability of extreme events. The empirical analysis showed that the GEV distribution not only well fit the probability distributions with obvious heavy tails and asymmetry, such as social security risks, but its prediction results were also highly consistent with the occurrence patterns of extreme events observed in engineering practice. This provides a solid theoretical basis for accurately assessing low-probability–high-impact risks and significantly enhances the engineering applicability and decision support value of the risk model. Therefore, choosing the GEV distribution can enhance the scientificity and rationality of risk modeling.
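A GEV fit of this kind can be sketched with SciPy's `genextreme`. Note that SciPy parameterizes the shape as `c = -xi`, so a heavy (Fréchet-type) tail corresponds to a negative `c`. The helper names are hypothetical and any data passed in would be the analyst's own, not values from this study.

```python
import numpy as np
from scipy import stats

def fit_gev(sample):
    """Fit a GEV by maximum likelihood and return (xi, loc, scale),
    where xi = -c converts SciPy's shape to the usual sign convention."""
    c, loc, scale = stats.genextreme.fit(sample)
    return -c, loc, scale

def exceedance_probability(x, xi, loc, scale):
    """P(X > x) under the fitted GEV (survival function)."""
    return stats.genextreme.sf(x, -xi, loc=loc, scale=scale)

def gev_type(xi, tol=1e-6):
    """Name the sub-family implied by the shape parameter."""
    if xi > tol:
        return "Frechet"
    if xi < -tol:
        return "Weibull"
    return "Gumbel"
```

For a low-frequency, high-consequence risk, `exceedance_probability` evaluated at a critical threshold gives the tail probability that the lighter-tailed candidate distributions tend to underestimate.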

3.4. Construction of Hierarchical Structure of Risk Correlation Network

(1) Construct a hierarchical structure of the associated network.
The research group invited key researchers, renowned scholars, and front-line technical staff from research institutes, universities, and operation management units, all engaged in risk management, emergency management, and facility operation management of water conservancy projects, to form an expert group for interviews. Based on their experience, the experts judged the degree of correlation among the risks and provided judgment information. The hierarchical structure of the associated network was determined according to the DEMATEL-ISM method. The multi-level hierarchical structure model of the operation safety risk association of the Middle Route Project of the South-to-North Water Diversion Project is shown in Figure 2.
It can be seen from Figure 2 that there are 151 sets of correlation relationships within the operation safety risks of the Middle Route Project of the South-to-North Water Diversion Project. To reduce the calculation amount of the two-dimensional optimal Copula function, it is necessary to preliminarily determine the R-Vine Copula structure. The cross-layer association relationship is deleted, and the already constructed multi-level hierarchical structure model is transformed into a multi-level model with a direct association relationship. The multi-level hierarchical structure model of the direct association relationship of the operation safety risk of the Middle Route Project of the South-to-North Water Diversion is shown in Figure 3. The transformed multi-level hierarchical structure model of the operation safety risk of the Middle Route Project of the South-to-North Water Diversion Project is taken as a possible structure, and the two-dimensional optimal Copula function therein is determined.

3.5. Copula Bayesian Network Model Construction

Based on the sample data and the multi-level hierarchical structure model of the direct correlation relationships among the operational safety risks of the Middle Route Project of the South-to-North Water Diversion Project, the Gumbel Copula, Clayton Copula, Frank Copula, t-Copula, and Gaussian Copula functions were each fitted to every pair of directly associated risks, with the parameters estimated by maximum likelihood. Goodness of fit was assessed using the AIC, BIC, and OLS criteria (the last reported as RMSE in Table 3), and the best-fitting function was selected for each pair. The two-dimensional optimal Copula distribution functions, their parameters, and the goodness-of-fit test results are shown in Table 3, and the heat map of the Kendall rank correlation coefficients is shown in Figure 4. In the Q-Q plots corresponding to the fitted probability curves, the points for all risk pairs lie close to the reference line, and the selected functions give the best goodness-of-fit results among the candidates. The optimal two-dimensional Copula functions were then used to draw the joint probability density functions. The Q-Q plots corresponding to the probability curves fitted by some of the two-dimensional optimal Copula distribution functions, together with the joint probability density functions, are shown in Figure 5.
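The pairwise selection procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: only the one-parameter Clayton and Frank families are covered (the paper also considers Gumbel, t, and Gaussian Copulas), positive dependence is assumed, the function names are hypothetical, and the data are simulated from a Clayton Copula with θ = 2.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata


def pseudo_obs(x):
    """Map raw observations to (0, 1) using rescaled ranks."""
    return rankdata(x) / (len(x) + 1.0)


def clayton_loglik(theta, u, v):
    """Log-likelihood of the Clayton copula density (theta > 0)."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return np.sum(np.log1p(theta)
                  - (theta + 1.0) * (np.log(u) + np.log(v))
                  - (2.0 + 1.0 / theta) * np.log(s))


def frank_loglik(theta, u, v):
    """Log-likelihood of the Frank copula density (theta > 0)."""
    a = -np.expm1(-theta)                              # 1 - exp(-theta)
    b = np.expm1(-theta * u) * np.expm1(-theta * v)    # product of (1 - e^{-theta u}) terms
    return np.sum(np.log(theta) + np.log(a)
                  - theta * (u + v) - 2.0 * np.log(a - b))


def fit_copulas(u, v, bounds=(1e-3, 30.0)):
    """Fit each family by maximum likelihood and rank the fits by AIC."""
    n = len(u)
    families = {"clayton": clayton_loglik, "frank": frank_loglik}
    results = {}
    for name, loglik in families.items():
        res = minimize_scalar(lambda t: -loglik(t, u, v),
                              bounds=bounds, method="bounded")
        ll = -res.fun
        # One free parameter per family: AIC = 2k - 2lnL, BIC = k ln(n) - 2lnL.
        results[name] = {"theta": res.x,
                         "aic": 2.0 - 2.0 * ll,
                         "bic": np.log(n) - 2.0 * ll}
    best = min(results, key=lambda k: results[k]["aic"])
    return best, results


# Simulate a dependent pair from a Clayton copula by conditional inversion.
rng = np.random.default_rng(1)
theta_true = 2.0
u_raw = rng.uniform(size=1000)
w = rng.uniform(size=1000)
v_raw = ((w ** (-theta_true / (1.0 + theta_true)) - 1.0)
         * u_raw ** (-theta_true) + 1.0) ** (-1.0 / theta_true)

best, fits = fit_copulas(pseudo_obs(u_raw), pseudo_obs(v_raw))
print(best, fits)
```

Lower AIC and BIC values indicate a better fit, which is why strongly negative values appear in Table 3.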

3.6. Optimal R-Vine Copula Structure Selection

The standard deviation of the correlation degree was used as the threshold to screen the risk correlation relationships with lower correlation degrees. The heat map of the screened risk correlation degree is shown in Figure 6, and the Kendall rank correlation coefficient heat map of the risks with direct correlation relationships is shown in Figure 7. The sum of the screened risk correlation degree and the Kendall rank correlation coefficient was taken as the edge weight. Based on the multi-level hierarchical structure model of direct risk correlation, combined with the construction of the R-Vine Copula structure, the MST-PRIM algorithm was used to determine the optimal first-layer tree structure of the R-Vine Copula model for the operational safety risk of the Middle Route Project of the South-to-North Water Diversion Project, as shown in Figure 8.
The first-layer tree of the R-Vine Copula model for the operation safety risks of long-distance water diversion projects is built on the 19 risks. The structure of the second-layer tree is constrained by that of the first-layer tree: the edges of the first-layer tree become the nodes of the second-layer tree. Because a tree connecting 19 nodes has 18 edges, the second-layer tree has 18 nodes. The optimal two-dimensional Copula function between each pair of nodes is then determined and the second-layer tree is constructed; repeating this step generates the topological structure of all the trees and completes the R-Vine Copula model for the operation safety risk association of long-distance water diversion projects.
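The first-layer tree selection can be illustrated with a short sketch of Prim's algorithm. It assumes that the tree is chosen to maximise the total edge weight, as is usual in vine structure selection; the function name and the four-node weights below are hypothetical and are not taken from Figures 6 to 8.

```python
import heapq


def maximum_spanning_tree(weights):
    """Prim's algorithm for a maximum spanning tree.

    `weights` maps an undirected edge (a, b) to its weight, for example the
    sum of the screened correlation degree and Kendall's tau.
    Returns the selected edges as (parent, child, weight) tuples.
    """
    adjacency = {}
    for (a, b), w in weights.items():
        adjacency.setdefault(a, []).append((w, b))
        adjacency.setdefault(b, []).append((w, a))

    start = next(iter(adjacency))
    visited = {start}
    # heapq is a min-heap, so weights are negated to pop the largest first.
    heap = [(-w, start, b) for w, b in adjacency[start]]
    heapq.heapify(heap)

    tree = []
    while heap and len(visited) < len(adjacency):
        neg_w, a, b = heapq.heappop(heap)
        if b in visited:
            continue
        visited.add(b)
        tree.append((a, b, -neg_w))
        for w, c in adjacency[b]:
            if c not in visited:
                heapq.heappush(heap, (-w, b, c))
    return tree


# Hypothetical edge weights for four risks (illustration only).
example_weights = {
    ("R1", "R2"): 0.3,
    ("R1", "R3"): 0.9,
    ("R2", "R3"): 0.5,
    ("R3", "R4"): 0.7,
    ("R2", "R4"): 0.2,
}
tree = maximum_spanning_tree(example_weights)
print(tree)
```

A spanning tree over n nodes always has n − 1 edges, so the four-node example returns three edges; these edges would become the nodes of the next tree in the vine.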

4. Simulation Analysis

The 19 safety risks are taken as the network nodes, and the optimal marginal probability distribution determined for the occurrence probability of each risk is assigned to the corresponding node. Risks R2, R3, R6, R7, R10, R15, R17, and R18 follow GEV distributions; these are converted into cumulative marginal distribution functions and stored in ASCII format as “*.dis” files for input. The first-layer tree of the R-Vine Copula structure with the optimal risk correlation for the operation safety of the Middle Route Project of the South-to-North Water Diversion Project is taken as the basic structure of the initial CBN network. The initial CBN network is constructed with the risk correlation relationships as directed edges, and the Kendall rank correlation coefficients are added to the edges. The initial CBN network is shown in Figure 9. The nodes in the figure are drawn as rectangles. The optimal marginal probability distribution of each risk factor is shown in the center of its rectangle, and the mean and standard deviation of the risk occurrence probability are marked below it, indicating the average level and the degree of fluctuation of the risk. The lines connecting the nodes represent the degree of correlation and the mutual relationships between the risk occurrence probabilities.

5. Discussion

5.1. Qualitative Analysis

A spider web diagram is regarded as one of the best tools for visualizing a qualitative analysis and is mainly used for the qualitative analysis of high-dimensional joint distributions. The horizontal axis represents each risk, and the vertical axis represents the probability of risk occurrence. Each sample is drawn as a jagged line that crosses the vertical axis of every node. With a sample size of 500, the range of the risk occurrence probability is divided uniformly into four value intervals, which are set as the initial conditions, and the multiple correlations of the risks are analyzed for the occurrence of flood disaster R1, ice disaster R4, and terrorist attack risk R16 in turn. The spider web diagram of risk occurrence probability is shown in Figure 10, where the blue, green, red, and black lines represent samples in the four value intervals of the risk occurrence probability.
The occurrence probability of flood disaster R1 follows the Weibull distribution. The sampled values of the risk occurrence probability fall mainly in the middle two intervals. In this case, the occurrence probabilities of risks R5, R6, R7, R8, R9, R12, R13, and R18 are strongly affected and take high values. Among them, the occurrence probability of R5 is relatively high, and its values lie at the top of the spider web.
The occurrence probability of ice disaster R4 follows the Normal distribution. The sampled values of the risk occurrence probability fall mainly in the middle two intervals. In this case, the occurrence probabilities of R3, R5, R6, R7, R8, R9, R12, R13, and R18 are strongly affected and are relatively high. Among them, the occurrence probabilities of R5 and R18 are relatively high, and their values lie at the top of the spider web.
The occurrence probability of terrorist attack risk R16 follows the Weibull distribution. The sampled values of the risk occurrence probability fall mainly in the three intervals with relatively high occurrence probabilities. In this case, the occurrence probabilities of R5, R6, R7, R8, R9, R12, R13, and R18 are strongly affected and are relatively high. Among them, the occurrence probabilities of R5 and R18 are relatively high, and their values lie at the top of the spider web.
Under multiple risk correlations, the spider web diagram shows that the engineering risks (R5, R6, R7, R8, and R9), the dispatching operation risks (R12 and R13), and the social public opinion risk (R18) are highly correlated. These risks may be the most significantly correlated variables, so this observation is taken as a hypothesis for further verification.

5.2. Forward Reasoning Analysis

(1) Dynamic change in the probability of each risk after a single risk is controlled in the correlated case.
Flood disasters (R1), earthquake disasters (R2), geological disasters (R3), and freezing disasters (R4) are natural risks and belong to causative factors. Apart from earthquake disasters (R2) being affected by flood disasters (R1), and geological disasters (R3) being affected by flood disasters (R1) and freezing disasters (R4), they are not influenced by the other types of risks. Instead, they serve as causal factors influencing the occurrence probability of other risks. Therefore, it is necessary to strengthen control to reduce the occurrence probability of these five types of risks. The probability distribution of occurrence of each risk after the occurrence probability of five causes is reduced is shown in Figure 11. The black probability distribution is the probability distribution of each risk after the flood disaster (R1) is reduced, the gray is the original probability distribution, and the number below the rectangle is the mean and standard deviation of the occurrence probability of each risk.
(2) Dynamic change in the probability of each risk after multiple risks are controlled in the correlated case.
The distribution of the occurrence probabilities of each risk after the occurrence probabilities of the five types of causative risks are all decreased is shown in Figure 12. The occurrence probabilities of R5, R6, R7, R8, R9, R11, R12, R13, and R14 have all decreased significantly.
The forward inference analysis shows that the probability distributions of risks R5, R6, R7, R8, R9, and R11 change the most. Controlling only a single causative risk has little influence on the occurrence probabilities of the other risks. After the causative risk factors are controlled comprehensively, however, the occurrence probabilities of the other risks decrease significantly, especially those of the engineering risks (R5, R6, R7, R8, and R9), which decrease by more than 30%.
Natural risks are causative risks and are strongly affected by seasonal factors. The areas along the main trunk canal of the Middle Route Project face flood disaster risk (R1) during the flood season. Rainfall in this region is concentrated in the rainy season, and rainstorms are intense and short; the region is also one of the key rainstorm areas in China. During the flood season, floods may exceed the discharge capacity of aqueducts, drainage aqueducts, drainage culverts, and river siphons, thus threatening the safety of the trunk canal. In addition, floods and the debris they carry may damage engineering structures through scour and impact, and may even lead to sewage contamination of the main water channel. In the fourth quarter, low temperatures may cause a freezing disaster (R4), which may lead to frost heave and thaw settlement of the foundation soil and thereby damage buildings, or to erosion of concrete as moisture and carbon dioxide in the air increase, which accelerates the aging of materials and affects the safety of project operation. A flood disaster (R1) may trigger a geological disaster (R3), such as a debris flow or landslide, which in turn causes impact damage to the channel's buildings. Therefore, risk management and control must comprehensively consider the dynamic changes in the occurrence probability of causative risks. By inputting the possible values or probability distributions of different risk occurrence probabilities into the constructed CBN model, operation managers can deduce the trend of the risk occurrence probabilities of long-distance water diversion projects and take timely control measures to reduce them.
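The idea of forward reasoning, in which one risk is controlled and the shift in a correlated risk is observed, can be illustrated with a small sampling sketch. It is a simplified stand-in for the CBN software, not the authors' model: the Weibull marginals of R1 and R5 are taken from Table 2, but the Gaussian copula, the correlation value `rho = 0.6`, and the rule "R1 controlled = R1 in its lowest quartile" are assumptions made purely for illustration.

```python
import numpy as np
from scipy import stats


def sample_pair(n, rho, seed=0):
    """Sample (R1, R5) with Weibull marginals joined by a Gaussian copula.

    The marginal parameters follow Table 2; the copula family and rho are
    illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    u = stats.norm.cdf(z)                     # uniform scores on (0, 1)
    r1 = stats.weibull_min.ppf(u[:, 0], c=4.00641, scale=0.10581)
    r5 = stats.weibull_min.ppf(u[:, 1], c=40.063, scale=0.28573)
    return r1, r5


r1, r5 = sample_pair(20000, rho=0.6)

# "Controlling" R1 is mimicked by keeping only samples in its lowest quartile.
controlled = r1 <= np.quantile(r1, 0.25)
baseline_mean = r5.mean()
controlled_mean = r5[controlled].mean()
print(f"mean R5: {baseline_mean:.4f} -> {controlled_mean:.4f} after controlling R1")
```

With a positive dependence between the two risks, the conditional mean of R5 falls when R1 is restricted to low values, which mirrors the qualitative behaviour reported for the engineering risks.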

5.3. Polynomial Regression Analysis

The CBN model constructed in this paper uses the Clayton Copula, Frank Copula, and t-Copula functions to model the dependence among variables. A linear correlation can reflect the correlation between variables and the mutual influence of risks. Therefore, UNISENSE 2.0.0.0 software is used in this paper to conduct a polynomial regression analysis. The expected value, mean value, and regression coefficient of each risk factor are calculated by fitting a polynomial to the joint scatter distribution of the original data.
According to the risk change trend, the change trend of the associated risk can be predicted when the risk changes dynamically. Taking the channel engineering risk R5 with a high degree of association as an example, the channel engineering risk R5 is associated with risks R1, R2, R3, R4, R6, R7, R8, R9, and R16. When the occurrence probability of associated risks increases, the mean level of R5 shows an upward trend, and the upward trend of fitting is basically consistent. The change trend of the occurrence probability of channel engineering risk R5 is shown in Figure 13.
In multiple regression, the multiple regression coefficient describes the degree of influence of the independent variables. Taking the engineering risks as an example, the multiple association regression coefficient between risk R5 and risks R1, R2, R3, R4, R6, R7, R8, R9, and R16 is 0.6858, and that between risk R6 and risks R1, R2, R3, R4, R5, R7, R8, R9, and R16 is 0.6655. The multiple association regression coefficient of risk R7 with risks R1, R2, R3, R4, R5, R6, R9, and R16 is 0.7935, and that of risk R8 with risks R1, R2, R3, R4, R5, R6, R9, and R16 is 0.8806. The risks in the resulting nearest causal layer are those associated with the largest number of other risks. Their multiple correlation regression coefficients and correlated risks are shown in Table 4.
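A polynomial regression of one risk on another can be sketched with NumPy as follows. This is a minimal illustration, not the analysis performed in the paper: the function name, the quadratic degree, and the noise-free synthetic data are assumptions, and the coefficient of determination computed here is not claimed to be the same quantity as the multiple association regression coefficient reported above.

```python
import numpy as np


def fit_polynomial(x, y, degree=2):
    """Least-squares polynomial fit with its coefficient of determination."""
    coeffs = np.polyfit(x, y, degree)          # highest power first
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return coeffs, 1.0 - ss_res / ss_tot


# Synthetic example: occurrence probability of one risk as a quadratic
# function of an associated risk (illustrative values only).
x = np.linspace(0.05, 0.15, 50)
y = 0.25 + 0.4 * x + 1.5 * x ** 2
coeffs, r_squared = fit_polynomial(x, y)
print(coeffs, r_squared)
```

With real samples, the fitted curve gives the expected trend of the dependent risk as the associated risk's occurrence probability changes, as in Figure 13.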

6. Conclusions

The main conclusions of this paper are as follows:
(1)
Through data mining technology, based on the analysis of the construction yearbook of the long-distance water diversion project, the project operation safety risk analysis report, monitoring data, network data, etc., the batch file processor and the Ultra-replace tool in ROST CM6 software were used for text processing to obtain the operation safety risk index system of the Middle Route Project of the South-to-North Water Diversion Project. We ensured the completeness and accuracy of the retrieval.
(2)
Based on a Monte Carlo simulation combined with the overall diffusion technique, the occurrence probabilities of risks with little data were preprocessed, which compensated for the missing samples caused by the lack of historical data. The parameters of the two-dimensional Copula functions were estimated by maximum likelihood, and the optimal functions were selected using the AIC, BIC, and RMSE criteria. By constructing the R-Vine Copula model, the computational complexity of determining the two-dimensional optimal Copula functions was reduced.
(3)
By integrating Bayesian network reasoning, a polynomial regression analysis, and other techniques, a dynamic analysis method for the operation safety of long-distance water diversion projects under the correlation situation based on the CBN model is proposed. This method takes into account the correlation among risks; captures the nonlinear mapping relationship when the probability of risk occurrence changes dynamically; realizes the dynamic analysis of risks in correlated situations; and provides scientific, effective, and timely decision-making information support for the dynamic operation and maintenance of safety risks in long-distance water diversion projects.
(4)
Taking the Middle Route Project of the South-to-North Water Diversion Project as an example for analysis, the characteristics of the probability of risk occurrence under the associated situation can be obtained through the calculation of the constructed model. The research results show that there are significant differences in the correlation intensity among different risks. Different risk prevention and control strategies should be adopted. According to the determined direction of risk correlation, the transmission and diffusion patterns of risks can be revealed. The higher the correlation degree, the stronger the transmission and diffusion ability. The higher the correlation degree, the stronger the uncertainty. After the comprehensive control of causative risk factors, the occurrence probabilities of other risks significantly decreased, especially the occurrence probabilities of engineering risks (R5, R6, R7, R8, and R9), which dropped by more than 30%.

Author Contributions

Conceptualization, P.L. and G.L.; Methodology, P.L.; Software, P.L. and F.C.; Validation, F.D. and F.C.; Formal analysis, F.D.; Investigation, F.D. and Y.S.; Resources, G.L.; Data curation, G.L. and F.C.; Writing—original draft, P.L.; Writing—review & editing, Y.W. and B.W.; Visualization, Y.W.; Supervision, Y.S. and B.W.; Project administration, B.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Power CHINA Roadbridge Group CO. LTD. (DJLQKJ2023); the Power Construction Corporation of China, Ltd. (DJKJ2023), Research on Digitalization and Intelligentization-Driven Collaborative Governance in the Wen’anwa Flood Storage; and the Training Programme for Young Backbone Teachers of Higher Education Institutions in Henan Province (2024GGJS061).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors Pengyuan Li, Yuansen Wang, Yongguo Sheng and Feng Cheng were employed by Central China Regional Headquarters of Powerchina Road-Bridge Co., Ltd. The author Fudong Dong was employed by Western Regional Headquarters of Powerchina Road-Bridge Group Co., Ltd. The author Guibin Lv was employed by Southeast Regional Headquarters of Powerchina Road-Bridge Group Co., Ltd. The remaining authors state that the study was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Yan, H.; Lin, Y.; Chen, Q.; Zhang, J.; He, S.; Feng, T.; Wang, Z.; Chen, C.; Ding, J. A Review of the Eco-Environmental Impacts of the South-to-North Water Diversion: Implications for Interbasin Water Transfers. Engineering 2023, 30, 161–169. [Google Scholar] [CrossRef]
  2. Langford, J.; Man, D.C.; Hirsch, S.; Reiter, P.D. Adapting to Drought in Australia and California: Creative Water Transfers in a Water-Scarce World. J.-Am. Water Work. Assoc. 2015, 107, 20–24. [Google Scholar] [CrossRef]
  3. Li, Y.; Chen, X. The water diversion project of the waterway in California, USA. Water Resour. Dev. Res. 2002, 45–48. [Google Scholar]
  4. Wang, B.; Fan, T.; Cui, Y.; Nie, X. Diagnosis of key safety risk sources of long-distance water diversion engineering operation based on sub-constraint theory with constant weight. Desalination Water Treat. 2019, 168, 374–383. [Google Scholar] [CrossRef]
  5. Nie, X.; Zhao, T.; Zhang, P.; Fan, T.; Wang, B. Study on Risk Correlation Analysis and Risk Transmission of Long-distance Water Diversion Project. J. N. China Univ. Water Resour. Electr. Power 2022, 43, 45–53. [Google Scholar]
  6. Zhang, J.Y.; Shu, Z.K.; Wang, H.J.; Li, W.J.; Zhang, X.L. A discussion on several hydrological issues of “7·20” rainstorm and flood in Zhengzhou. Acta Geogr. Sin. 2023, 78, 1618–1626. [Google Scholar]
  7. Lu, J.; Zhang, W.; Fan, L.; Guo, J.; Miao, C. Installation and Operation Practice of the Pumping Station for the Emergency Temporary Water Conveyance Project of the Beijuma River Underground Channel in the South-to-North Water Diversion Project. City Town Water Supply 2023, 31–37. [Google Scholar] [CrossRef]
  8. Xu, L.; Wu, Q.; Wang, Y. Thoughts on the Operation Safety of the Middle Route Main Project of the South-to-North Water Diversion Project and the “7·20” Torrential Rain in Zhengzhou, Henan Province. Harnessing Huaihe River 2024, 52–54. [Google Scholar]
  9. Jin, S.; Liu, H.; Ding, W.; Shang, H.; Wang, G. Sensitivity Analysis for the Inverted Siphon in a Long Distance Water Transfer Project: An Integrated System Modeling Perspective. Water 2018, 10, 292. [Google Scholar] [CrossRef]
  10. Feng, P.; Yan, D.; Geng, L.; Tian, W. Study on flood risk assessment of the main channel in middle route of the water transfer project from south to north. Shuili Xuebao 2003, 40–45. [Google Scholar]
  11. Chen, X.; Zhao, X.; Duan, Z. Risk Prediction of Geological Hazards in Hanjiang-to-Weihe River Diversion Project. J. Catastrophology 2011, 26, 47–51. [Google Scholar]
  12. Li, F.; Li, Y.; Li, M.; Zhang, C. Spatial distribution of ice hazards in middle route of South-to-North Water Transfer Project based on fuzzy evaluation model. South-North Water Transf. Water Sci. Technol. 2017, 15, 132–137. [Google Scholar]
  13. Long, Y.; Yang, T.; Gao, W.; Liu, Y.; Xu, C.; Yang, Y. Prevention and control of algae residue deposition in long-distance water conveyance project. Environ. Pollut. 2024, 344, 123294. [Google Scholar] [CrossRef]
  14. Fang, S.; Yang, J.; Qiang, Y.; Wang, Y.; Xi, J.; Feng, Y.; Yang, G.; Ren, G. Distribution and environmental risk assessment of fertilizer application on farmland in the water source of the middle route of the South-to-North Water Transfer Project. J. Agro-Environ. Sci. 2018, 37, 124–136. [Google Scholar]
  15. Ma, C.; Liu, Z.; He, W.; Zhang, Y.; Jiang, A.; Zhang, J.; Lian, J. Reliability of Emergency Water Supply for a Reservoir and Enhancement through Floating Photovoltaics in a Long-Distance Water Diversion Project. J. Water Resour. Plan. Manag. 2023, 149, 04023021. [Google Scholar] [CrossRef]
  16. Shi, L.; Zhang, J.; Yu, X.-D.; Chen, S.; Zhao, W.-L.; Chen, X.-Y. Water hammer protection for diversion systems in front of pumps in long-distance water supply projects. Water Sci. Eng. 2023, 16, 211–218. [Google Scholar] [CrossRef]
  17. Zhai, J.Q.; Zhao, Y.; Pei, Y.S. Research on Hydrological Risk Factors of Water Supply of the Source of Middle Route of the South-to-North Water Transfer Project. S.-N. Water Transf. Water Sci. Technol. 2010, 8, 13–16, 22. [Google Scholar]
  18. Jia, C.; Liu, N.; Chen, J. Risk analysis for aqueduct structure of the South-North Water Transfer Project (Central Route). J. Hydroelectr. Eng. 2003, 29, 23–27. [Google Scholar]
  19. Song, X.; Liu, H.; Geng, L.; Jiang, B.; Li, A. Risk Identification for Crossing Structures in the Middle Route of the South-to-North Water Transfer Project. South-North Water Transf. Water Sci. Technol. 2009, 7, 13–15. [Google Scholar]
  20. Liu, K.; Liu, Z.; Chen, Y.; Ma, F.; Wang, H.; Huang, H.; Xie, H. Dynamic Bayesian network model for the safety risk evaluation of a diversion tunnel structure. J. Tsinghua Univ. (Sci. Technol.) 2023, 63, 1041–1049. [Google Scholar]
  21. Wen, S.; Xiao, Y.; Qin, Y.; Li, F. Analysis of Typical Disaster-causing Geological Structures in Long and Large Tunnels of Central Yunnan Water Diversion Project. Mod. Tunn. Technol. 2022, 59, 719–726. [Google Scholar]
  22. Zhang, S.; Ai, Y.; Chen, J.; Jin, C.; Ji, Z. Application of matter element extension method in stability evaluation of surrounding rock of diversion tunnel. J. Saf. Environ. 2024, 24, 10–18. [Google Scholar]
  23. Zhang, S.; Liu, T.; Wang, C. Multi-source data fusion method for structural safety assessment of water diversion structures. J. Hydroinform. 2021, 23, 249–266. [Google Scholar] [CrossRef]
  24. Gong, L.; Lu, R.; Jin, C.; Wu, M. Winter operation safety evaluation of long distance water diversion channels in cold areas based on game-improved extension theory. J. Nat. Disasters 2019, 28, 81–92. [Google Scholar]
  25. Chen, W.L.; Chen, X.L.; Wu, W.D.; Xie, Z.K. Application of Digital Technology in Safety Evaluation of Dabeishan Aqueduct. IOP Conf. Ser. Earth Environ. Sci. 2021, 787, 012156. [Google Scholar] [CrossRef]
  26. Zhang, Z.; Chen, H. Risk Assessment of Beam-type Aqueduct Based on Game Theory-Cloud Model. Haihe Water Resour. 2024, 3, 92–98. [Google Scholar] [CrossRef]
  27. Ouache, R.; Chhipi-Shrestha, G.; Hewage, K.; Sadiq, R. An integrated risk assessment and prediction framework for fire ignition sources in smart-green multi-unit residential buildings. Int. J. Syst. Assur. Eng. Manag. 2021, 12, 1262–1295. [Google Scholar] [CrossRef]
  28. Ma, L.; Ma, X.; Chen, L.; Zhang, R.; Zhang, J. A methodology to quantify risk evolution in typhoon-induced maritime accidents based on directed-weighted CN and improved RM. Ocean. Eng. 2025, 319, 120303. [Google Scholar] [CrossRef]
  29. Zhang, R.; Shuai, B.; Gao, P.; Zhang, Y. Driver’s journey from historical traffic violations to future accidents: A China case based on multilayer complex network approach. Accid. Anal. Prev. 2024, 211, 107901. [Google Scholar] [CrossRef]
  30. Lu, Y.; Liu, J.; Yu, W. Social risk analysis for mega construction projects based on structural equation model and Bayesian network: A risk evolution perspective. Eng. Constr. Arch. Manag. 2023, 31, 2604–2629. [Google Scholar] [CrossRef]
  31. Luo, P.; Hu, Y. System risk evolution analysis and risk critical event identification based on event sequence diagram. Reliab. Eng. Syst. Saf. 2013, 114, 36–44. [Google Scholar] [CrossRef]
  32. Zhang, Y.; Zhang, J.; Yang, Y.; Li, D.; Wang, B. Dynamic Evaluation of Highway Engineering Construction Safety Risk Based on Fuzzy Dynamic Bayesian Network. Henan Sci. 2024, 42, 653–659. [Google Scholar]
  33. Zhang, J.; Zhang, Y.; Yang, Y.; Li, D.; Wang, B. Quantitative Analysis of Highway Engineering Construction Safety Risk Based on Combinatorial Weighted Two-Dimensional Cloud Model. Henan Sci. 2024, 42, 1458–1466. [Google Scholar]
  34. Fu, L.; Wang, X.; Zhao, H.; Li, M. Interactions among safety risks in metro deep foundation pit projects: An association rule mining-based modeling framework. Reliab. Eng. Syst. Saf. 2022, 221, 108381. [Google Scholar] [CrossRef]
  35. Aloini, D.; Dulmin, R.; Mininno, V. Modelling and assessing ERP project risks: A Petri Net approach. Eur. J. Oper. Res. 2012, 220, 484–495. [Google Scholar] [CrossRef]
  36. Quinci, G.; Paolacci, F.; Fragiadakis, M.; Bursi, O.S. A machine learning framework for seismic risk assessment of industrial equipment. Reliab. Eng. Syst. Saf. 2024, 254, 110606. [Google Scholar] [CrossRef]
  37. Zhang, Y.; Weng, W.G. Bayesian network model for buried gas pipeline failure analysis caused by corrosion and external interference. Reliab. Eng. Syst. 2020, 203, 107089. [Google Scholar] [CrossRef]
  38. Li, X.; Liu, T.; Liu, Y. Cause Analysis of Unsafe Behaviors in Hazardous Chemical Accidents: Combined with HFACs and Bayesian Network. Int. J. Environ. Res. Public Health 2020, 17, 11. [Google Scholar] [CrossRef]
  39. Siavash, G.; Esmatullah, N.; Saied, Y. BIM-based solution to enhance the performance of public-private partnership construction projects using copula bayesian network. Expert Syst. Appl. 2023, 216, 119501. [Google Scholar]
  40. Zha, X.; Sun, H.; Jiang, H.; Cao, L.; Xue, J.; Gui, D.; Yan, D.; Tuo, Y. Coupling Bayesian Network and copula theory for water shortage assessment: A case study in source area of the South-to-North Water Division Project (SNWDP). J. Hydrol. 2023, 620, 129434. [Google Scholar] [CrossRef]
  41. Sun, Y.; Chen, K.; Liu, C.; Zhang, Q.; Qin, X. Research on reliability analytical method of complex system based on CBN model. J. Mech. Sci. Technol. 2021, 35, 107–120. [Google Scholar] [CrossRef]
  42. Ghosh, A.; Ahmed, S.; Khan, F.; Rusli, R. Process safety assessment considering multivariate non-linear dependence among process variables. Process Saf. Environ. Prot. 2020, 135, 70–80. [Google Scholar] [CrossRef]
  43. Lasserre, M.; Lebrun, R.; Wuillemin, P.H. Constraint-based learning for non-parametric continuous bayesian networks. Ann. Math. Artif. Intell. 2021, 89, 1035–1052. [Google Scholar] [CrossRef]
Figure 1. The structure of the four-dimensional R-Vine Copula.
Figure 2. Multi-level hierarchical structure model of the operation safety risk association of the Middle Route Project of the South-to-North Water Diversion Project.
Figure 3. Multi-level hierarchical structure model of the direct correlation relationships between the operational safety risks of the Middle Route Project of the South-to-North Water Diversion Project.
Figure 4. Heat map of the Kendall rank correlation coefficients.
Figure 5. Q-Q plots of the fitted probability curves and the joint probability density functions for some of the two-dimensional optimal Copula distribution functions.
Figure 6. Heat map of screened risk interdependence.
Figure 7. Heat map of Kendall rank correlation coefficients between risks with direct interdependency.
Figure 8. The optimal first-layer tree structure of the R-Vine Copula model.
Figure 9. CBN initial network.
Figure 10. Spider web diagram of risk occurrence probability.
Figure 11. Probability distribution of each risk after the occurrence probability of the five causal-type risks is reduced, respectively.
Figure 12. Probability distribution of each risk after the occurrence probabilities of all five causal-type risks are reduced.
Figure 13. Linear regression analysis of channel engineering risk R5 correlated with other risks.
Table 1. Safety risk index system for the Middle Route Project of the South-to-North Water Diversion Project.

| Risk category | Risk | Symbol | Marginal Probability Distribution |
|---|---|---|---|
| Natural risk | Flood disaster | R1 | Weibull |
| | Earthquake disaster | R2 | GEV |
| | Geological disaster | R3 | GEV |
| | Freezing disaster | R4 | Normal |
| Engineering risk | Channel engineering risk | R5 | Weibull |
| | Pipeline engineering risks | R6 | GEV |
| | The buildings crossing the channel were damaged | R7 | GEV |
| | The water conveyance cross structure was damaged | R8 | Normal |
| | Control the risks of buildings | R9 | Weibull |
| Risk of water quality pollution | Water quality pollution in water sources | R10 | GEV |
| | Water quality pollution during the water transportation process | R11 | Normal |
| Dispatch operation risk | The internal dispatching system malfunctioned | R12 | Normal |
| | The external dispatching system malfunctioned | R13 | Normal |
| Economic risk | The operating cost of the project increased | R14 | Normal |
| | The operating income of the project decreased | R15 | GEV |
| Social security risk | Terrorist attack incident | R16 | Weibull |
| | Group incident | R17 | GEV |
| | Social public opinion events | R18 | GEV |
| | Cyber security incident | R19 | Normal |
Table 2. Marginal probability distribution functions of the risks.

| Risk | Distribution | Shape Parameter σ | Position Parameter μ | Scale Parameter λ |
|---|---|---|---|---|
| R1 | Weibull | 4.00641 | n/a | 0.10581 |
| R2 | GEV | −0.34818 | 0.07531 | 0.00043 |
| R3 | GEV | −0.31309 | 0.07489 | 0.00059 |
| R4 | Normal | n/a | 0.07441 | 0.00086 |
| R5 | Weibull | 40.063 | n/a | 0.28573 |
| R6 | GEV | −0.31211 | 0.12856 | 0.00146 |
| R7 | GEV | −0.30424 | 0.12731 | 0.01175 |
| R8 | Normal | n/a | 0.13156 | 0.00948 |
| R9 | Weibull | 5.59093 | n/a | 0.2226 |
| R10 | GEV | −0.31798 | 0.00213 | 0.13824 |
| R11 | Normal | n/a | 0.10094 | 0.01310 |
| R12 | Normal | n/a | 0.12317 | 0.00258 |
| R13 | Normal | n/a | 0.05933 | 0.00103 |
| R14 | Normal | n/a | 0.12855 | 0.00002 |
| R15 | GEV | −0.30425 | 0.00001 | 0.06107 |
| R16 | Weibull | 7.61098 | n/a | 0.00017 |
| R17 | GEV | −0.30878 | 0.01216 | 0.00001 |
| R18 | GEV | −0.30841 | 0.01883 | 0.001 |
| R19 | Normal | n/a | 0.02689 | 0.00061 |
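As a reading aid for Table 2, the following minimal Python sketch evaluates the cumulative distribution functions of three of the listed marginals (R1, R2, and R4). It rests on assumptions that the table does not state: the Weibull is taken as the two-parameter form, the GEV shape value is taken as ξ in exp(−(1 + ξz)^(−1/ξ)) (sign conventions for this parameter differ between software packages), and the "scale parameter" of the normal distribution is taken as its standard deviation. The function names are illustrative only.

```python
import math


def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF: 1 - exp(-(x / scale) ** shape) for x > 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def normal_cdf(x, loc, scale):
    """Normal CDF; `scale` is ASSUMED to be the standard deviation."""
    return 0.5 * (1.0 + math.erf((x - loc) / (scale * math.sqrt(2.0))))


def gev_cdf(x, shape, loc, scale):
    """GEV CDF with xi = shape (ASSUMED sign convention)."""
    z = (x - loc) / scale
    if shape == 0.0:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1.0 + shape * z
    if t <= 0.0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if shape > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / shape)))


# Parameter values copied from Table 2.
R1 = {"shape": 4.00641, "scale": 0.10581}                  # Weibull
R2 = {"shape": -0.34818, "loc": 0.07531, "scale": 0.00043}  # GEV
R4 = {"loc": 0.07441, "scale": 0.00086}                     # Normal

for x in (0.05, 0.075, 0.10):
    print(
        f"x={x:.3f}  "
        f"F_R1={weibull_cdf(x, **R1):.4f}  "
        f"F_R2={gev_cdf(x, **R2):.4f}  "
        f"F_R4={normal_cdf(x, **R4):.4f}"
    )
```

Transforming each risk series through its fitted marginal CDF in this way yields values on [0, 1], which is the form a Copula function takes as input.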
Table 3. The two-dimensional optimal Copula distribution function and its parameters and the goodness-of-fit test results.

| Serial Number | Risk Pair | Optimal Function Type | Parameter | AIC | BIC | RMSE | Kendall’s Rank Correlation Coefficient |
|---|---|---|---|---|---|---|---|
| 1 | R1, R2 | Frank | 0.22609 | −10,586.90 | −10,581.30 | 0.00875 | 0.02562 |
| 2 | R1, R3 | Frank | 5.20840 | −8110.23 | −8104.63 | 0.00930 | 0.48249 |
| 3 | R2, R7 | Frank | 8.66611 | −7848.54 | −7842.94 | 0.00966 | 0.63835 |
| 4 | R2, R8 | Frank | 12.13522 | −7648.05 | −7642.45 | 0.01039 | 0.72976 |
| 5 | R2, R13 | Frank | 2.01124 | −10,048.63 | −10,043.03 | 0.01111 | 0.22396 |
| 6 | R3, R7 | Frank | 12.11366 | −7792.08 | −7786.48 | 0.00841 | 0.48341 |
| 7 | R3, R8 | Frank | 8.40377 | −8435.53 | −8429.93 | 0.00931 | 0.45692 |
| 8 | R4, R3 | Frank | 4.77837 | −8810.03 | −8804.43 | 0.00974 | 0.45692 |
| 9 | R5, R9 | Clayton | 1.50496 | −1394.14 | −1388.54 | 0.04868 | 0.46416 |
| 10 | R5, R11 | t | [1,0.66940;0.66940,1] | −1707.98 | −1696.78 | 0.03553 | 0.48295 |
| 11 | R5, R12 | Frank | 2.07867 | −6808.12 | −6802.52 | 0.01124 | 0.22930 |
| 12 | R5, R14 | t | [1,0.89495;0.89495,1] | −5394.58 | −5383.37 | 0.01554 | 0.71962 |
| 13 | R5, R15 | t | [1,0.3551;0.3551,1] | −6627.74 | −6616.54 | 0.01299 | 0.23939 |
| 14 | R5, R18 | t | [1,0.33689;0.33689,1] | −6070.92 | −6059.72 | 0.01457 | 0.22847 |
| 15 | R6, R11 | Clayton | 0.47285 | −3314.81 | −3309.21 | 0.03347 | 0.20237 |
| 16 | R6, R14 | Frank | 12.15076 | −7172.22 | −7166.62 | 0.01005 | 0.73022 |
| 17 | R6, R15 | Frank | 2.17077 | −10,336.32 | −10,330.72 | 0.00773 | 0.23938 |
| 18 | R6, R18 | Frank | 1.92754 | −8942.52 | −8936.92 | 0.01129 | 0.21631 |
| 19 | R7, R5 | Frank | 5.06104 | −5426.67 | −5421.07 | 0.01273 | 0.47558 |
| 20 | R7, R6 | Frank | 5.26073 | −8131.92 | −8126.32 | 0.01007 | 0.48747 |
| 21 | R7, R9 | Clayton | 0.64112 | −3707.05 | −3701.45 | 0.04169 | 0.23823 |
| 22 | R8, R5 | t | [1,0.66581;0.66581,1] | −5972.23 | −5961.03 | 0.01761 | 0.48244 |
| 23 | R8, R6 | Frank | 5.07164 | −8552.33 | −8546.73 | 0.01108 | 0.47658 |
| 24 | R8, R9 | Clayton | 0.66013 | −3933.82 | −3928.22 | 0.04175 | 0.24442 |
| 25 | R9, R10 | Clayton | 0.65592 | −3384.94 | −3379.34 | 0.04037 | 0.24222 |
| 26 | R9, R11 | Clayton | 2.07050 | −2299.61 | −2294.01 | 0.05568 | 0.48338 |
| 27 | R9, R12 | Frank | 5.12378 | −1673.20 | −1667.60 | 0.03804 | 0.47808 |
| 28 | R9, R14 | Frank | 11.57450 | −400.15 | −394.54 | 0.03899 | 0.73721 |
| 29 | R9, R15 | Clayton | 0.58240 | −3880.59 | −3874.99 | 0.03956 | 0.21559 |
| 30 | R10, R18 | Frank | 2.00107 | −8200.72 | −8195.12 | 0.01112 | 0.22418 |
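Each Copula family in Table 3 has a standard textbook relation between its parameter and Kendall's rank correlation coefficient: τ = θ/(θ + 2) for Clayton, τ = (2/π)·arcsin(ρ) for the t Copula, and τ = 1 − (4/θ)(1 − D₁(θ)) for Frank, where D₁ is the first Debye function. The sketch below computes the τ implied by a fitted parameter so that it can be set beside the tabulated value. The source does not say whether the coefficients in Table 3 are empirical estimates or parameter-implied values, so some differences should be expected; the function names and the Simpson-rule step count are illustrative choices, not part of the original method.

```python
import math


def clayton_tau(theta):
    """Kendall's tau implied by a Clayton parameter: theta / (theta + 2)."""
    return theta / (theta + 2.0)


def t_tau(rho):
    """Kendall's tau implied by a t-Copula correlation: (2 / pi) * asin(rho)."""
    return 2.0 / math.pi * math.asin(rho)


def _debye1(theta, n=2000):
    """First Debye function D1(theta) = (1/theta) * int_0^theta t/(e^t - 1) dt,
    evaluated with the composite Simpson rule (n must be even)."""
    def f(t):
        # The integrand tends to 1 as t -> 0.
        return 1.0 if t == 0.0 else t / math.expm1(t)

    h = theta / n
    total = f(0.0) + f(theta)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return (total * h / 3.0) / theta


def frank_tau(theta):
    """Kendall's tau implied by a Frank parameter (theta != 0)."""
    return 1.0 - 4.0 / theta * (1.0 - _debye1(theta))


# (serial number, family, parameter, tabulated tau) copied from Table 3.
rows = [
    (1, "Frank", 0.22609, 0.02562),
    (9, "Clayton", 1.50496, 0.46416),
    (10, "t", 0.66940, 0.48295),
]
implied = {"Frank": frank_tau, "Clayton": clayton_tau, "t": t_tau}

for serial, family, param, tabulated in rows:
    tau = implied[family](param)
    print(f"row {serial:2d} {family:8s} implied tau={tau:.5f} tabulated={tabulated:.5f}")
```

For the t Copula only the off-diagonal entry of the listed correlation matrix enters the relation; the degrees of freedom do not affect Kendall's τ.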
Table 4. Multiple regression coefficients and associated risks for the risks in the neighbor causal layer.

| Neighbor Causal Layer Risk | Associated Risks | Multiple Regression Coefficient |
|---|---|---|
| R12 | R1, R2, R3, R5, R6, R7, R8, R9, R16 | 0.4906 |
| R14 | R1, R2, R3, R4, R5, R6, R7, R8, R9, R13, R16 | 0.7466 |
| R15 | R5, R6, R7, R8, R9, R10, R11, R16 | 0.2419 |
| R17 | R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R13, R16 | 0.2202 |
| R18 | R1, R2, R3, R4, R5, R6, R7, R8, R9, R10, R11, R13, R16 | 0.3414 |
| R19 | R1, R2, R3, R4, R9, R13, R16 | 0.0729 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, P.; Dong, F.; Lv, G.; Wang, Y.; Sheng, Y.; Cheng, F.; Wang, B. Multiple Correlation Analysis of Operational Safety of Long-Distance Water Diversion Project Based on Copula Bayesian Network. Water 2025, 17, 2389. https://doi.org/10.3390/w17162389

