2.2. An Ecological and Environmental Risk Assessment Framework for Inter-Basin Water Transfer Projects Based on the DPSIR Model
Risk assessment serves as the foundation of risk management. In contrast to comprehensive scoring methods that rely on subjective weights and fixed ranking rules, the approach proposed in this study determines the optimal projection direction adaptively using an optimization algorithm, generates a comprehensive projection value objectively, and then classifies the risk levels. The detailed procedure of this method is illustrated in Figure 2.
The selection of risk evaluation indicators is a crucial step in the assessment process. In this study, a hierarchical analysis of the risk assessment indicator system was performed based on the DPSIR framework. For example, the enhancement of response measures can effectively alleviate pressures and improve the state, thereby mitigating negative impacts. Moreover, the deterioration of the state can further promote stronger response measures. The DPSIR framework was adopted as a logical structure for indicator system construction and classification, through which the interconnections among the different dimensions are clearly demonstrated. The logical relationships among the various factors are illustrated in Figure 3.
To ensure the scientific validity and rationality of the evaluation, adherence to the principles of scientific rigor, sustainability, comprehensiveness, and operational feasibility was maintained. An integrated approach combining the DPSIR model, literature review methodology, and expert interviews was employed to establish the indicator system. The specific development process is illustrated in Figure 4.
In this study, driving forces, pressures, state, impacts, and responses were adopted as the first-level indicators for the eco-environmental risk assessment of the inter-basin water diversion project. The conceptual basis of the DPSIR framework is derived from the environmental system causal chain analysis method proposed by institutions such as the European Environment Agency [18]. This framework has been widely used for the construction of indicator systems in fields such as water resources management [19] and ecological security [20].

To determine the specific second-level indicators, a literature search was conducted in databases, including Web of Science, China National Knowledge Infrastructure (CNKI), and Wanfang Data, using keywords such as “inter-basin water diversion”, “eco-environmental assessment of water diversion projects”, and “eco-environmental risk indicator system for water diversion projects”. Only empirical research articles published from 1 January 2015 to 1 March 2026 were selected. The collected literature was read and analyzed, and candidate indicators were preliminarily summarized. After removing duplicates and refining the selection, the indicators were grouped according to the first-level categories, and a preliminary indicator library was established.

Subsequently, interviews were conducted with a total of 20 experts from relevant professional fields, including environmental science and engineering, hydraulic engineering, hydrology and water resources, landscape ecology, and geographic information science, thereby ensuring the diversity of the expert panel. Indicators were retained if over 60% of experts deemed them important; those with significant disagreement underwent further analysis to determine their inclusion. Concurrently, indicators not covered by the current system but with substantial practical impact were identified and supplemented. For indicators that were difficult to quantify or for which data were hard to obtain, expert opinions were solicited regarding their handling. Finally, all expert opinions were integrated to further refine and optimize the preliminary indicator library. The final ecological and environmental risk assessment indicator system for inter-basin water diversion projects is presented in Table 1.
Driving force is defined as the underlying causes and momentum of ecological and environmental change, primarily driven by socioeconomic and demographic factors. Therefore, the “Indicators for Building Ecological Counties, Cities, and Provinces” was referenced in this study, and the five driving force indicators, items (1)–(5) in the list below, were selected. Among these indicators, population density (1) is primarily focused on the spatial agglomeration intensity of the population and is regarded as a key driving force indicator for measuring the intensity of human activities and ecological disturbance in the region. The natural population growth rate (3) reveals the characteristics of natural population replacement and age structure. In Hefei, a rapidly urbanizing area, the natural growth rate is lower than the migration rate, but it still exerts an indirect and non-negligible effect on the balance between water supply and demand through its influence on household size and long-term water use habits. The urbanization rate (5) characterizes the stage of urban–rural structural transformation, reflecting the overall changes in land use and industrial patterns. Taken together, these three indicators describe the socioeconomic driving forces from three perspectives: “spatial intensity”, “natural replacement”, and “structural transformation”.

Pressure is defined as the impact exerted on the environment under the influence of driving forces, encompassing natural disasters and risks from human activities. The pressure indicators were selected under the guidance of the Technical Specifications for Ecological Environment Status Evaluation (HJ 192-2015) [40]. Urban industrial activities emit large amounts of air pollutants; after entering water bodies via dry and wet deposition, these pollutants exacerbate water acidification, thereby exerting indirect but long-term negative impacts on the water quality of the water diversion project and on the aquatic ecosystems along the canal. Although the Chuhe Main Canal is not classified as a high-grade waterway, certain risks are posed by local navigation, maintenance operations, and the land-based transport of dangerous goods via bridges crossing the canal. In the event of an oil spill or a hazardous material leakage accident, the open water conveyance channel would be directly contaminated, leading to devastating consequences for the safety of downstream water supply and for the aquatic organisms inhabiting the canal. Therefore, it is of practical significance to regard these events as low-probability but high-consequence risk sources. Additionally, rapid urban development has increased water resource consumption, converted vast ecological lands into urban construction sites, disrupted hydrological cycles, and exacerbated soil erosion, ultimately damaging ecological landscapes. In summary, the five pressure indicators, items (6)–(10) in the list below, are identified as significant risk sources.

The state S directly reflects the current environmental condition under pressure. It is primarily characterized by indicators (11), (14), and (13), representing regional water, air, and noise quality states, while indicator (12) is used to characterize the functional state of landscape ecosystems. Impacts I encompass socioeconomic and ecological dimensions, with indicators (15)–(18) being selected. Responses R are defined as countermeasures implemented in response to driving forces, pressures, and states, primarily through economic and ecological–environmental management and policy responses, with indicators (19)–(24) being selected. The calculation methods and data acquisition approaches for each indicator are detailed as follows:
(1) Population density. This indicator is defined as the ratio of the permanent population in the study area to its total land area. A higher value indicates greater demand for water resources, which affects the supply–demand balance and water quality safety of water diversion projects.
(2) GDP growth rate. The growth rate of GDP during the evaluation period relative to the previous period. Economic expansion drives increased demand for industrial and domestic water use, potentially intensifying water resource exploitation and pollution discharge.
(3) Natural population growth rate. The ratio of natural population growth to the average total population over a given period. Population growth is considered to directly increase water demand, placing long-term pressure on the water supply capacity of water diversion projects. In rapidly urbanizing areas, population increase is mainly driven by migration, whereas the natural growth rate mainly reflects population structure characteristics and serves as an indicator of the stage of population development. Consequently, it indirectly influences the structure of water demand and the transmission pathways of ecological risks.
(4) Per capita water resources. This is defined as the ratio of a region’s annual total water resources to its annual resident population, and it reflects how a region’s water supply and demand capacity influences the ecological risks of water diversion projects. The lower the value of this indicator, the greater the region’s dependence on the water diversion project, and the higher the project’s operational pressure and ecological disturbance risk.
(5) Urbanization rate. This is defined as the ratio of the urban resident population to the total resident population in the study area. Data can be obtained directly from the National Bureau of Statistics. Accelerated urbanization drives the expansion of built-up land, converting cropland and ecological land into impervious surfaces. Consequently, the surface runoff coefficient increases, and the runoff processes along channel corridors are altered. Concurrently, the agglomeration of urban populations and industrial activities intensifies regional water demand and raises the loads of domestic and industrial pollutants. As a result, the water quality security of water diversion projects and the stability of ecosystem functions are significantly affected.
(6) Pollutant emissions per unit of GDP. This is defined as the ratio of the annual total pollutant emissions in the region to its annual GDP. Data are primarily obtained from the Pollutant Discharge Statistics Bulletin issued by the ecological environment authorities. Urban industrial emissions affect air quality and acid rain formation, indirectly affecting water acidification and ecosystem health.
(7) Oil spill accident frequency. This indicator is typically calculated as the ratio of the total number of oil spills in the region over the past decade to the number of years covered in the statistics. Although the Chuhe Main Canal is not classified as a high-grade navigable waterway, water conservancy patrol vessels, dredging boats, and a limited number of agricultural boats regularly pass through the canal, and multiple highway bridges cross it along its route. A traffic accident involving a hazardous material transport vehicle on one of these bridges could cause oil to leak into the water body. Although the probability of such an event is considered extremely low, the consequences would be severe, directly threatening aquatic ecological integrity and water supply security. For specific details, see [33].
(8) Water resource utilization rate. This is defined as the proportion of water consumption in a study area relative to its total water resources. The higher the indicator, the more pronounced the water supply–demand imbalance is considered to be, and the greater the ecological stress risk.
(9) Proportion of cultivated and construction land. This is defined as the proportion of cultivated land and construction land combined relative to the total land area within the study region. Data should be obtained primarily from land use status records maintained by national land and spatial planning authorities. Land use changes are recognized to affect surface runoff, soil erosion, and ecological connectivity, thereby increasing the risk of ecological disturbance from engineering projects:
$$LU = \frac{A_c + A_b}{A} \times 100\%$$
where $A_c$ represents the area of cultivated land, $A_b$ represents the area of construction land, and $A$ represents the total land area of the region.
(10) Soil erosion intensity. This is defined as the regional average soil erosion intensity level, which can be determined by referring to the “Soil Erosion Classification and Grading Standard” (SL 190-2007) [41] issued by the Ministry of Water Resources. Increased soil erosion is considered to accelerate sediment input, affecting water quality, causing channel siltation, and damaging riparian ecosystems:
$$E = \frac{\sum_{i} G_i A_i}{A}$$
where $G_i$ represents the soil erosion intensity grade value for the $i$-th type, $A_i$ denotes the area corresponding to that grade, and $A$ signifies the total area of the region.
(11) Water quality compliance rate. The ratio of compliant monitoring instances to total monitoring instances during the evaluation period. It is obtained through routine monitoring by ecological and environmental authorities and directly reflects the water quality status of the water diversion project.
(12) Biodiversity index. This indicator is used to measure biodiversity levels within water diversion areas. A decline in biodiversity is considered to indicate an increased risk of ecological degradation:
$$H = -\sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N}$$
where $S$ denotes the number of species, $n_i$ represents the number of individuals in the $i$-th species, and $N$ is the total number of individuals across all species, $N = \sum_{i=1}^{S} n_i$.
(13) Acoustic environment quality. This indicator is determined based on the “Environmental Noise Quality Standards” (GB 3096-2008) [42], with data being obtained through fixed-point monitoring and specialized reports. It reflects the current quality of the acoustic environment along the route. Noise pollution is considered to disrupt wildlife behavior and habitats, particularly affecting sensitive species such as birds and mammals, thereby compromising ecosystem stability.
(14) Air quality compliance rate. This is defined as the ratio of the number of days meeting air quality standards to the total number of valid monitoring days in the study area, which is determined based on the Environmental Air Quality Standards (GB 3095-2012) [43]. Air pollutants are considered to enter water bodies through wet and dry deposition, affecting water quality and the health of aquatic organisms.
(15) Biomass loss rate. This indicator measures the relative reduction in biomass within aquatic ecosystems under environmental stress, which is used to quantify the direct negative consequences experienced by these ecosystems:
$$BL = \frac{B_0 - B_t}{B_0} \times 100\%$$
where $B_0$ represents the initial biomass, and $B_t$ denotes the current biomass. In this study, $B_0$ was defined as the average biomass over three consecutive years before the operation of the Chuhe Main Canal project, with data sourced from reports published by the Hefei Municipal Bureau of Ecology and Environment and from field sampling surveys. $B_t$ is obtained from standardized sampling data taken in the same season and at the same monitoring section each year, and consistency in sampling method, section location, and analytical procedure was ensured.
(16) Engel coefficient. This is defined as the ratio of annual household food expenditures to total annual consumption expenditures in a given region, which is considered to indirectly influence public awareness of ecological environments and participation in environmental protection efforts.
(17) Water supply security rate. This is defined as the ratio of actual water supply to total water demand in the study area. Data can be obtained from the China Water Resources Bulletin. A low security rate is considered to trigger conflicts between socioeconomic and ecological water use.
(18) Public satisfaction. This indicator is generally obtained through questionnaires, which are used to reflect the public’s perception and acceptance of the environmental impacts of water diversion projects.
(19) Proportion of ecological and environmental investment in GDP. This is defined as the proportion of ecological and environmental-related investments in the study area relative to the total GDP during the same period. This indicator reflects the national emphasis on safeguarding water security and restoring the ecology of water sources, as well as the level of fiscal investment. It is considered to influence risk prevention and control capabilities and ecological restoration capacity.
(20) Proportion of tertiary industry in GDP. This is defined as the proportion of the tertiary industry’s value-added output in the total GDP of the study area during the same period, which can be obtained from the National Bureau of Statistics. A higher value of this indicator is considered to help reduce industrial pollution emissions and alleviate environmental pressure.
(21) Centralized wastewater treatment rate. The ratio of the centralized treatment volume at wastewater treatment plants to the total wastewater discharge volume. Centralized treatment is considered a core control measure, and its treatment rate directly reflects the effectiveness of the response:
$$WT = \frac{W_c}{W_t} \times 100\%$$
where $W_c$ represents the centralized treatment volume of the wastewater treatment plant, and $W_t$ represents the total volume of wastewater discharged.
(22) Industrial wastewater discharge compliance rate. The ratio of annual compliant industrial wastewater discharge volume to total annual industrial wastewater discharge volume, as referenced in the China Ecological Environment Statistical Yearbook. Compliant discharge is considered a key measure for reducing point source pollution and safeguarding water quality.
(23) . Data for this indicator is typically evaluated and scored by experts through engineering document reviews and field surveys, directly impacting the effectiveness of biodiversity conservation efforts.
(24) Green coverage rate. This is defined as the proportion of green coverage area within a study region relative to its total land area, which is considered to effectively mitigate engineering disturbances and enhance ecosystem service functions.
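Several of the indicators above reduce to simple ratio arithmetic. The following is a minimal Python sketch of four of them. The function names, argument names, and input values are hypothetical and do not come from the study's data, and the erosion example assumes that the graded areas together cover the whole region.

```python
def land_share(cultivated_area, construction_area, total_area):
    """Indicator (9): share of cultivated plus construction land in total land area."""
    return (cultivated_area + construction_area) / total_area


def mean_erosion_grade(grades, areas, total_area=None):
    """Indicator (10): area-weighted average soil erosion grade.

    Assumption: if total_area is omitted, the graded areas are taken
    to partition the region, so their sum is used as the total area.
    """
    if total_area is None:
        total_area = sum(areas)
    return sum(g * a for g, a in zip(grades, areas)) / total_area


def biomass_loss_rate(initial_biomass, current_biomass):
    """Indicator (15): relative biomass reduction against the baseline."""
    return (initial_biomass - current_biomass) / initial_biomass


def centralized_treatment_rate(treated_volume, discharged_volume):
    """Indicator (21): centrally treated share of discharged wastewater."""
    return treated_volume / discharged_volume


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(land_share(30.0, 20.0, 100.0))
    print(mean_erosion_grade([1, 2, 3], [50.0, 30.0, 20.0]))
    print(biomass_loss_rate(200.0, 150.0))
    print(centralized_treatment_rate(90.0, 100.0))
```

Each function returns a fraction; multiplying by 100 gives the percentage form.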
To validate the scientific rigor and reliability of the indicator system, this study conducted reliability and validity tests upon completion of the data collection process. For specific verification, refer to Section 2.5.1 on data collection.
- (1) Reliability test
Cronbach’s alpha coefficient is a commonly used measure of internal consistency for multidimensional scales or indicator systems. A higher value indicates stronger correlations among indicators and greater reliability of results [44]. The formula for assessing internal consistency using Cronbach’s alpha coefficient is as follows:
$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_T^2}\right)$$
where $k$ represents the number of indicators, $\sigma_i^2$ denotes the variance of the $i$-th indicator, and $\sigma_T^2$ signifies the variance of the total score. Generally,
indicates a good level, while
signifies an excellent level.
- (2) Validity test
The validity of the indicator system was assessed through KMO and Bartlett’s sphericity tests. The KMO test compares simple and partial correlation coefficients among variables; a KMO value closer to 1 indicates stronger inter-indicator correlations, making the system suitable for risk analysis [45]. Bartlett’s sphericity test examines whether the correlation matrix is an identity matrix. A statistically significant test result indicates substantial correlations among variables, warranting further analysis [46].
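The reliability test described above can be illustrated with a short computation. The sketch below implements the standard Cronbach's alpha formula using sample variances; the function name and the toy score matrix are hypothetical, not data or code from this study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows (one per sample or respondent), each row holding
    the values of the k indicators. Sample variances (n - 1) are used.
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Toy data: two indicators that move together perfectly.
    print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

The function needs at least two indicators and a non-constant total score; otherwise the denominator is zero.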
2.4. Risk Assessment Model
Unlike traditional water engineering projects such as reservoir operation and single-river channel regulation, inter-basin water diversion projects are characterized by large spatial spans, the coupling of multiple risks, limited sample data, and high-dimensional indicators. To scientifically evaluate the ecological and environmental risks of inter-basin water diversion projects, this study developed an evaluation model based on the Sparrow Search Algorithm with Projection Pursuit (SWSSA-PP). The model encompasses the construction of the projection pursuit model, with optimization via the improved Sparrow Search Algorithm, projection value calculation, and risk level determination.
2.4.1. Projection Pursuit Model Construction
- (1) Data collection and preprocessing
Let the indicator value sample set be $\{x_{ij} \mid i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, p\}$, where $x_{ij}$ represents the $j$th indicator value of the $i$th sample, and $n$ and $p$ denote the number of samples and indicators, respectively. To remove the influence of differing indicator units and scales on the results, the data are standardized. For benefit-type and cost-type indicators, the standardization formulas are given by Equations (9) and (10), respectively:
$$x'_{ij} = \frac{x_{ij} - \min_i x_{ij}}{\max_i x_{ij} - \min_i x_{ij}} \tag{9}$$
$$x'_{ij} = \frac{\max_i x_{ij} - x_{ij}}{\max_i x_{ij} - \min_i x_{ij}} \tag{10}$$
After preprocessing, the set $\{x'_{ij}\}$ is obtained. Given the limited number of evaluation objects in this study, direct data mining calculations exhibit poor stability. Therefore, a random sample generation method based on risk level classification was employed to augment the dataset. Uniformly distributed random sampling was applied to the value ranges of each indicator, generating 200 combinations to form the synthetic data. The synthetic samples simulate typical combinations of indicators across different risk levels, covering the feature space of each level. Their value ranges align with the level intervals of real data. Together with the evaluation data, they form the computational data for data mining and exploration of weight information.
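The preprocessing step can be sketched as follows. This is an illustrative Python example under stated assumptions: min-max standardization is used for benefit-type and cost-type indicators, and synthetic samples are drawn uniformly inside per-level indicator intervals. The function names, the equal split of samples across levels, and the interval values are hypothetical and are not taken from the study.

```python
import random


def standardize(column, benefit=True):
    """Min-max standardization of one indicator column to [0, 1]."""
    low, high = min(column), max(column)
    span = high - low
    if span == 0:
        return [0.0 for _ in column]  # constant column carries no information
    if benefit:
        return [(x - low) / span for x in column]
    return [(high - x) / span for x in column]


def synthetic_samples(level_intervals, per_level, seed=0):
    """Draw uniform random samples inside each risk level's intervals.

    level_intervals[k][j] is the (low, high) range of indicator j at
    risk level k + 1. Assumption: the same number of samples per level.
    """
    rng = random.Random(seed)
    samples = []
    for level, intervals in enumerate(level_intervals, start=1):
        for _ in range(per_level):
            row = [rng.uniform(low, high) for low, high in intervals]
            samples.append((level, row))
    return samples


if __name__ == "__main__":
    print(standardize([2.0, 4.0, 6.0]))
    print(standardize([2.0, 4.0, 6.0], benefit=False))
    # Two hypothetical indicators, five levels, 40 samples each (200 total).
    intervals = [[(k * 0.2, (k + 1) * 0.2)] * 2 for k in range(5)]
    print(len(synthetic_samples(intervals, per_level=40)))
```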
- (2) Projection Indicator Function Construction
The standardized dataset is projected onto one dimension along the projection direction $a = (a_1, a_2, \ldots, a_p)$, where the projection values are
$$z_i = \sum_{j=1}^{p} a_j x'_{ij}, \quad i = 1, 2, \ldots, n \tag{11}$$
and the direction satisfies $\sum_{j=1}^{p} a_j^2 = 1$.
Here, $z_i$ is the comprehensive evaluation value of the $i$th sample in this study, directly representing the level of ecological and environmental risk of the water diversion project. The larger the projection value, the higher the risk.
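The projection indicator function $Q(a)$ is defined as follows:
$$Q(a) = S_z D_z \tag{12}$$
$$S_z = \sqrt{\frac{\sum_{i=1}^{n} (z_i - \bar{z})^2}{n - 1}} \tag{13}$$
$$D_z = \sum_{i=1}^{n} \sum_{k=1}^{n} (R - r_{ik})\, u(R - r_{ik}) \tag{14}$$
where $S_z$ denotes the inter-class distance; $D_z$ represents the local density of $z_i$; $\bar{z}$ is the mean of $z_i$; $r_{ik} = |z_i - z_k|$ is the distance between two projection values; $R$ is the local density window radius; and $u(\cdot)$ is the unit step function:
$$u(t) = \begin{cases} 1, & t \ge 0 \\ 0, & t < 0 \end{cases}$$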
- (3) Projection Indicator Function Optimization
Building upon prior research findings [47, 48], the optimal projection direction $a^*$ is computed by maximizing the projection indicator function. The optimization problem is defined as follows:
$$\max_{a}\ Q(a) = S_z D_z \quad \text{s.t.} \quad \sum_{j=1}^{p} a_j^2 = 1$$
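To make the objective concrete, the sketch below evaluates the projection index for a given direction in the usual projection pursuit form: the standard deviation of the projection values multiplied by a local density term. The function names are hypothetical, and the default window radius of 0.1 times the standard deviation is a common choice assumed here for illustration, not a setting confirmed by this study.

```python
import math


def unit_direction(a):
    """Scale a direction so that its squared components sum to 1."""
    norm = math.sqrt(sum(v * v for v in a))
    return [v / norm for v in a]


def projection_values(X, a):
    """One-dimensional projection of each standardized sample onto a."""
    return [sum(aj * xj for aj, xj in zip(a, row)) for row in X]


def projection_index(X, a, radius=None):
    """Projection index Q(a) = S_z * D_z for the direction a."""
    z = projection_values(X, a)
    n = len(z)
    mean_z = sum(z) / n
    s_z = math.sqrt(sum((v - mean_z) ** 2 for v in z) / (n - 1))
    if radius is None:
        radius = 0.1 * s_z  # assumed default window radius
    d_z = 0.0
    for zi in z:
        for zk in z:
            gap = radius - abs(zi - zk)
            if gap >= 0:  # unit step: only pairs inside the window count
                d_z += gap
    return s_z * d_z


if __name__ == "__main__":
    X = [[0.0, 0.0], [1.0, 1.0]]
    a = unit_direction([1.0, 1.0])
    print(projection_index(X, a))
```

An optimizer then searches over unit-length directions for the one that maximizes this value.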
2.4.2. Improving the Sparrow Search Algorithm
The Sparrow Search Algorithm (SSA) [49], proposed in 2020, is an optimization algorithm characterized by few parameters, a simple structure, and a relatively fast convergence speed. Because the real sample size of this study is relatively small, the conventional SSA struggles to converge stably within a limited iteration budget and is easily trapped in local optima during later stages as population diversity declines. Consequently, multiple improvement strategies need to be incorporated. This study follows the multi-strategy improved SSA proposed by Liu [50], which increases population diversity, balances global search against local exploitation, and is especially applicable to small-sample, high-dimensional optimization problems. The improvements are as follows:
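- (1) Adaptive population proportion strategy
Fixing the proportion of individuals in each category of the sparrow population may reduce the efficiency of the global search. This study employs the following formula to adjust the ratio coefficient between producers and followers [50], written here in the notation of Algorithm 1:
$$a = 0.15\left(\frac{2t}{T_{\max}}\right) - 0.1\,\mathrm{rand} + 0.1, \qquad PD_{num} = \mathrm{round}(aN), \qquad Follow_{num} = N - PD_{num}$$
where $PD_{num}$ and $Follow_{num}$ represent the numbers of producers and followers, respectively; $a$ denotes the ratio of $PD_{num}$ to $N$; $t$ and $T_{\max}$ denote the current and maximum iteration counts; $N$ denotes the population size; and $\mathrm{rand}$ denotes a random number in the range [0, 1].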
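The adaptive split between producers and followers can be sketched directly from lines 5 to 7 of Algorithm 1. The Python function below is illustrative only; the function name and the injectable random source are hypothetical conveniences for testing.

```python
import random


def adaptive_counts(t, t_max, n, rng=random):
    """Producer and follower counts at iteration t, following Algorithm 1.

    a = 0.15 * (2 * t / t_max) - 0.1 * rand + 0.1
    PD_num = max(1, round(a * n)); Follow_num = n - PD_num
    """
    a = 0.15 * (2 * t / t_max) - 0.1 * rng.random() + 0.1
    pd_num = max(1, round(a * n))
    follow_num = n - pd_num
    return pd_num, follow_num


if __name__ == "__main__":
    for t in (0, 50, 100):
        print(t, adaptive_counts(t, 100, 40))
```

With this rule the producer share grows over the iterations, from at most 10% of the population at the start to between 30% and 40% at the final iteration.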
- (2) Integrating the bubble-net predation strategy of the Whale Optimization Algorithm
When sparrows sense danger, they relocate to safe zones. The shrinking encircling mechanism and spiral position update mechanism of the Whale Optimization Algorithm are employed to update the sparrow producer positions, enhancing SSA’s resistance to local optima [49]. The formula for updating producer positions is [50]:
where
denotes the position of the ith sparrow in the jth dimension,
;
;
is a constant;
;
;
;
are random numbers in
;
denotes the optimal position of the producer at the current iteration; and
and
represent the warning threshold and safety threshold, respectively.
- (3) Integrating the improved Seagull Optimization Algorithm operator
The spiral attack strategy of the Seagull Optimization Algorithm simulates the hierarchical structure and foraging behavior of seagulls [51], addressing the local search limitations of SSA. The follower position formula is updated to [50]:
where
denotes the radius of the spiral motion;
;
and
are constants;
linearly decreases from 2 to 0;
is a random number in [0, 1];
and
represent the current position and optimal position of the seagull; and
and
denote collision-free new positions and optimal positions, respectively.
The linear decay of the convergence factor is excessively slow. This study employs a quadratic decay strategy to adjust the convergence factor, with the specific formula as follows [50]:
The adaptive weight is a crucial parameter requiring adjustment in the algorithm [52]. It is set larger in early iterations to broaden the global search scope and is reduced in later iterations to enhance the algorithm’s local exploitation capability. The formula is as follows [50]:
The large stride length of the Lévy flight facilitates expanding the search range in the early stages, reducing the risk of becoming trapped in local optima; a smaller stride length in the later stages improves the quality of the solution. The updated follower position formula is [50]:
where $\mathrm{Levy}(\lambda)$ denotes the random search path; $s$ represents the random stride, where $s = \mu / |v|^{1/\lambda}$, with $\mu$ and $v$ following normal distributions, $\mu \sim N(0, \sigma_\mu^2)$ and $v \sim N(0, \sigma_v^2)$.
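A Lévy stride of this form is commonly generated with Mantegna's algorithm. The sketch below assumes that algorithm, with unit standard deviation for $v$ and an exponent of 1.5; these choices and the function names are illustrative assumptions, since the exact settings used with the improved SSA are not recoverable from the text.

```python
import math
import random


def mantegna_sigma(lam):
    """Standard deviation of mu in Mantegna's algorithm (sigma_v = 1)."""
    numerator = math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
    denominator = math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2)
    return (numerator / denominator) ** (1 / lam)


def levy_step(lam=1.5, rng=random):
    """One random stride s = mu / |v|^(1/lam)."""
    mu = rng.gauss(0.0, mantegna_sigma(lam))
    v = rng.gauss(0.0, 1.0)
    return mu / abs(v) ** (1 / lam)


if __name__ == "__main__":
    rng = random.Random(42)
    print([round(levy_step(rng=rng), 3) for _ in range(5)])
```

Most strides are small, with occasional large jumps, which is the heavy-tailed behavior that helps the search escape local optima.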
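Assuming that 10–20% of sparrows are at risk, with initial positions randomly generated, the position update formula for the early-warning sparrow is [50]:
$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \left| X_{i,j}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,j}^{t} + K \dfrac{\left| X_{i,j}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon}, & f_i = f_g \end{cases} \tag{23}$$
where $X_{best}^{t}$ and $X_{worst}^{t}$ are the current global best and worst positions; $\beta$ is the step size control parameter; $K$ is the sparrow step size control parameter; $f_i$ is the current sparrow’s fitness value; $f_g$ and $f_w$ represent the global optimal and worst fitness values, respectively; and $\varepsilon$ is a constant chosen to be as small as possible.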
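The early-warning update can be illustrated with the form used in the original SSA. The sketch below assumes that form, drawing the step parameter from a standard normal distribution and the second control parameter uniformly from [-1, 1]. The function name and these sampling choices are assumptions; for simplicity, every sparrow whose fitness differs from the global best is treated as the first case.

```python
import random


def scout_update(x, best, worst, f_i, f_g, f_w, rng=random, eps=1e-50):
    """Early-warning (scout) position update in the style of the original SSA.

    x, best, worst: position vectors (lists of floats).
    f_i, f_g, f_w: current, global best, and global worst fitness values.
    """
    if f_i == f_g:
        # The sparrow is at the best position: step relative to the worst.
        k = rng.uniform(-1.0, 1.0)
        return [xj + k * abs(xj - wj) / ((f_i - f_w) + eps)
                for xj, wj in zip(x, worst)]
    # Otherwise: jump to a point around the current best position.
    beta = rng.gauss(0.0, 1.0)
    return [bj + beta * abs(xj - bj) for xj, bj in zip(x, best)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(scout_update([0.2, 0.8], [0.5, 0.5], [0.9, 0.1], 1.2, 2.0, 0.3, rng))
```

After such an update, positions would still need to be clipped to the feasible bounds and renormalized to unit length before being used as projection directions.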
2.4.3. Projection Pursuit Model Based on the Improved Sparrow Search Algorithm (SWSSA-PP)
Step 1. Initialize SWSSA parameters and the sparrow population.
Based on the optimization requirements, key parameters are set and an initial sparrow population is generated, where each individual corresponds to a projection direction. The population size N is set within the range of 30 to 50; the iteration count t and maximum iteration limit are controlled between 100 and 500 to prevent premature convergence; the producer ratio is set to 0.2 and dynamically adjusted according to adaptive strategies; the warning threshold is , with a safety threshold of 0.8; the Whale Algorithm parameter ; the seagull operator parameters ; and the Lévy flight parameter . Dynamically adjust and based on iteration count. Each sparrow individual is represented as a P-dimensional vector , (), which is randomly generated within and then normalized.
Step 2. Read the indicator data from the evaluation metric system and standardize it using Equations (9) and (10).
Step 3. Calculate the fitness values for the initial population.
Employ the improved SSA to determine the optimal projection direction, using Equation (12) as the fitness function. For each individual $a$, compute the projection values $z_i$ for all samples according to Equation (11). Calculate $S_z$ and $D_z$ using Equations (13) and (14); record the global optimal fitness and the corresponding direction $a^*$ for the initial population.
Step 4. Iterative optimization.
(1) Calculate fitness values for the initial population using Equation (12) and sort them.
(2) Update the positions of producers, followers, and early-warning agents using Equations (18), (22) and (23).
(3) Calculate the fitness for all individuals after updating; if , update the global optimum; and if and , then .
(4) Repeat steps (2) to (3) until the maximum number of iterations is reached, and then output the optimal projection direction and projection value.
The pseudocode is presented as follows (Algorithm 1).
| Algorithm 1. The framework of the SWSSA |
| Input: |
| T_max: maximum iterations; N: number of sparrows; |
| SD_ratio: proportion of scouts; |
| Initialize population and set relevant parameters; |
| t = 0; SD_num = N × SD_ratio; PD_num = N × 0.2; |
| Output: x(best), f(g) |
| 1: While (t < T_max) |
| 2: Evaluate fitness values of all sparrows and find the current global best position |
| 3: Sort sparrows by fitness |
| 4: //Adaptive population proportion strategy |
| 5: a = 0.15 × (2 × t/T_max) − 0.1 × rand() + 0.1 |
| 6: PD_num = max(1, round(a × N)) |
| 7: Follow_num = N − PD_num |
| 8: SD_num = round(N × SD_ratio) |
| 9: //Update producers (the first PD_num best individuals) |
| 10: for i = 1 to PD_num |
| 11: Update the sparrow’s location according to Equation (18) |
| 12: end for |
| 13: //Update followers |
| 14: for i = PD_num + 1 to PD_num + Follow_num |
| 15: Update the sparrow’s location by using Equation (22) |
| 16: end for |
| 17: //Randomly select SD_num scouts to update |
| 18: rand_index = random sample of size SD_num from {1, 2, …, N} |
| 19: for each i in rand_index |
| 20: Update the sparrow’s location according to Equation (23) |
| 21: end for |
| 22: //Boundary handling and greedy selection |
| 23: Clip all positions to the feasible bounds |
| 24: Keep better solutions |
| 25: t = t + 1 |
| 26: end while |
| 27: return x(best), f(g) |
2.4.4. Determination of Risk Levels
- (1) Data preparation and initialization
Extract the optimal projection values for all samples, the projection values for the standard sample set, and the projection values for the evaluation sample set. Integrate these into the input dataset $\{z_i\}$. The projection values derived from the SWSSA-PP model are continuous one-dimensional data obtained by reducing the high-dimensional data, and they preserve the relationship that a higher risk corresponds to a larger projection value. Based on the numerical distribution of the projection values, K-means clustering enables objective and automated risk level classification. This method suits the mixed dataset of standard and evaluation samples in this study and supports a stable risk level classification. The steps are as follows.
1. Select one projection value at random from the dataset as the first initial center $c_1$.
2. Calculate the Euclidean distance between each remaining $z_i$ and $c_1$, and then select the second center $c_2$ using the following formula:
$$P(z_i) = \frac{D(z_i)^2}{\sum_{l} D(z_l)^2}$$
where $D(z_i) = |z_i - c_1|$.
3. Repeat step 2 to calculate, in turn, the minimum distance $D(z_i)$ between the projection values and the selected centers $c_1$ and $c_2$. Select $c_3$, $c_4$, and $c_5$ by calculating $P(z_i)$, ultimately obtaining the initial center set $\{c_1, c_2, c_3, c_4, c_5\}$.
- (2) Iterative clustering and convergence testing
Calculate the distance between each projection value and the five cluster centers, and then assign $z_i$ to the cluster $C_j$ with the smallest distance, i.e., $j = \arg\min_{m} |z_i - c_m|$. Based on the samples within each cluster, calculate a new cluster center using the following formula:
$$c_j = \frac{1}{n_j} \sum_{z_i \in C_j} z_i$$
where $n_j$ denotes the number of samples in the $j$th cluster.
Calculate the maximum change in cluster centers before and after the update, $\Delta = \max_j |c_j^{new} - c_j^{old}|$. If $\Delta$ falls below the convergence tolerance or the iteration count reaches 100, terminate the iteration and output the cluster centers and the final cluster partition.
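The clustering procedure above can be sketched for one-dimensional projection values as follows. This is a minimal Python illustration that assumes distance-squared weighted seeding for the initial centers; the function name, the tolerance value, and the sample data are hypothetical. It also assumes the data contain at least as many distinct values as clusters.

```python
import random


def kmeans_1d(z, k=5, max_iter=100, tol=1e-6, seed=0):
    """K-means on scalar projection values with distance-weighted seeding.

    Returns the centers in ascending order and a 1-based level per value,
    so that level 1 has the smallest center and level k the largest.
    """
    rng = random.Random(seed)
    centers = [rng.choice(z)]
    while len(centers) < k:
        # Weight each value by its squared distance to the nearest center.
        d2 = [min((v - c) ** 2 for c in centers) for v in z]
        centers.append(rng.choices(z, weights=d2, k=1)[0])
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in z:
            j = min(range(k), key=lambda m: abs(v - centers[m]))
            clusters[j].append(v)
        # Keep the old center if a cluster happens to be empty.
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    centers.sort()
    levels = [min(range(k), key=lambda m: abs(v - centers[m])) + 1 for v in z]
    return centers, levels


if __name__ == "__main__":
    values = [0.0, 0.1, 5.0, 5.1]
    print(kmeans_1d(values, k=2))
```

Sorting the final centers in ascending order gives the mapping from cluster index to risk level used in the next step.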
- (3)
Risk level mapping and classification
Sort the clusters in ascending order of their centers so that they correspond to the five risk levels, as detailed in Table 3.
Substitute each evaluation sample projection value $z_i$ into the clustering model one by one, and then calculate the distance between $z_i$ and the final centers $c_j$. The risk level of each sample is then determined according to the risk value intervals corresponding to each risk level.
To further validate the applicability of K-means clustering, it was compared with the natural breaks method and fuzzy C-means clustering, which are commonly used in ecological environment assessment. The ratio of the sum of squares within clusters to the sum of squares between clusters (SSE/SST) and the silhouette coefficient were employed as quantitative evaluation metrics. A lower SSE/SST indicates higher intra-cluster cohesion and greater inter-cluster discrimination, while a silhouette coefficient closer to 1 signifies superior clustering performance. Detailed validation procedures are presented in Section 2.5.2 on risk level calculation.
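The two comparison metrics can be sketched for one-dimensional projection values as follows. This Python example rests on an assumption about the definition: SSE is taken as the within-cluster sum of squares and SST as the total sum of squares around the overall mean, which may differ from the exact definition used in the study. The function names and data are hypothetical.

```python
def group_by_label(z, labels):
    """Collect the values belonging to each cluster label."""
    groups = {}
    for value, label in zip(z, labels):
        groups.setdefault(label, []).append(value)
    return groups


def sse_sst_ratio(z, labels):
    """Within-cluster sum of squares divided by total sum of squares."""
    overall_mean = sum(z) / len(z)
    sst = sum((v - overall_mean) ** 2 for v in z)
    sse = 0.0
    for members in group_by_label(z, labels).values():
        center = sum(members) / len(members)
        sse += sum((v - center) ** 2 for v in members)
    return sse / sst


def silhouette(z, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    groups = group_by_label(z, labels)
    scores = []
    for value, label in zip(z, labels):
        own = groups[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(abs(value - u) for u in own) / (len(own) - 1)
        b = min(sum(abs(value - u) for u in g) / len(g)
                for other, g in groups.items() if other != label)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    values = [0.0, 1.0, 10.0, 11.0]
    levels = [1, 1, 2, 2]
    print(sse_sst_ratio(values, levels), silhouette(values, levels))
```

Applying both functions to the partitions produced by each candidate method gives a like-for-like comparison on the same projection values.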