Article

Assessing Surface Water Quality Risks Under Climate Stress and Geopolitical Instability: An Information Systems Approach

by Florentina Loredana Dragomir-Constantin 1 and Alina Bărbulescu 2,*
1 Department of Information Systems and Cyber Operations, Carol I National Defence University, 050662 Bucharest, Romania
2 Faculty of Civil Engineering, Transilvania University of Brașov, 5 Turnului Street, 500152 Brașov, Romania
* Author to whom correspondence should be addressed.
Water 2026, 18(9), 996; https://doi.org/10.3390/w18090996
Submission received: 20 March 2026 / Revised: 18 April 2026 / Accepted: 21 April 2026 / Published: 22 April 2026
(This article belongs to the Special Issue Climate Change and Hydrological Processes, 3rd Edition)

Abstract

Surface water systems are increasingly exposed to multiple pressures generated by climate variability, intensified water resource exploitation, and evolving geopolitical dynamics. This study provides a novel contribution by identifying critical threshold effects and non-linear interactions that influence nitrate concentrations. It develops an integrated information-system-based analytical framework that combines hydrological, climatic, geopolitical, and strategic indicators. Within this framework, the geopolitical and strategic indicators shape the broader context in which hydrological and climatic pressures operate, rather than serving as direct predictors. Considering the nitrate concentration in rivers as a key parameter of water quality, the paper goes beyond univariate analysis of nitrate concentration and examines its relationship with four explanatory variables: the Water Exploitation Index Plus (WEI+), the number of heat stress days (Heat_Stress), the Geopolitical Risk Index (GPR), and a proxy variable representing the presence of strategic infrastructure (Nuclear_State). The relationship is modeled using a Reduced Error Pruning Tree (REPTree) decision tree algorithm with 10-fold cross-validation. The results indicate that climatic stress emerges as the primary predictor, with a critical threshold of approximately 7.83 heat stress days, beyond which nitrate concentrations increase significantly. Under conditions of high climatic stress and intensive water exploitation (WEI+ ≥ 67.39), predicted nitrate levels exceed 20 mg/L and can reach extreme values of up to 58.82 mg/L. In contrast, low hydrological pressure (WEI+ < 0.39) combined with moderate climatic stress is associated with very low nitrate concentrations, around 2.75 mg/L.
The model demonstrates strong predictive performance, with a correlation coefficient of 0.976, a Mean Absolute Error (MAE) of 0.593, a Root Mean Squared Error (RMSE) of 2.046, and a Receiver Operating Characteristic (ROC) area exceeding 0.94 for classification tasks. While geopolitical and strategic variables do not act as direct predictors, they contribute to shaping the contextual framework influencing water resource management and environmental vulnerability. Overall, the study highlights the non-linear and systemic nature of water quality dynamics and demonstrates the effectiveness of decision tree-based models within integrated information systems for supporting environmental monitoring and decision-making under conditions of climate stress and geopolitical uncertainty.

1. Introduction

Surface water quality represents a fundamental component of water resource security and the functioning of aquatic ecosystems, carrying major implications for human health, agriculture, and sustainable economic development. In recent decades, hydrological systems have been subjected to increasing pressures generated by the intensification of anthropogenic activities, rising global water demand, and the effects of climate change [1,2,3,4,5]. These processes have contributed to the deterioration of water quality in numerous river basins by increasing pollutant concentrations in fluvial ecosystems [6,7,8].
Nitrate is a primary reactive form of dissolved nitrogen and occurs naturally, but elevated nitrate concentrations in surface waters present a critical threat to both public health and ecological integrity. Environmentally, nitrogen enrichment triggers eutrophication, a process characterized by toxic algal blooms, oxygen depletion (anoxia), and a subsequent loss of aquatic biodiversity [9,10].
From a public health perspective, excessive human exposure to nitrate occurs primarily through the consumption of contaminated drinking water and through specific dietary sources. According to [11], when nitrate concentrations in drinking water reach 47 mg/L, daily nitrate intake is more than doubled; at 84 mg/L, it nearly triples. Acute exposure is strongly linked to methemoglobinemia, particularly in infants [12,13]. Under anaerobic conditions in the digestive tract, nitrate is reduced to nitrite, which binds to hemoglobin and forms methemoglobin, impairing the blood’s oxygen-carrying capacity [14]. Clinical symptoms range from cyanosis and tachycardia to asphyxia, becoming life-threatening when methemoglobin levels exceed 10% [13,15]. Consequently, regulatory bodies such as the WHO have established a 50 mg/L limit to mitigate these acute risks [14].
Chronic ingestion is increasingly associated with long-term risks, including thyroid disorders, various cancers, and birth defects [16,17]. Consequently, monitoring the natural and anthropogenic drivers of nitrate dynamics—such as agricultural runoff and wastewater discharge—is essential for maintaining water security and socio-economic stability [18,19].
Climate change amplifies these pressures on hydrological systems. Rising temperatures and intensified drought episodes can reduce river flow and the natural dilution capacity for pollutants, leading to nutrient accumulation in surface waters [20,21,22,23] and affecting the transport and transformation of nitrogen compounds in aquatic ecosystems [21,24]. Conversely, extreme precipitation events can increase surface runoff and the transport of chemical fertilizers and nutrients from agricultural areas into river systems [25,26,27,28]. In addition to agricultural pressures, inefficient nutrient management and the intensive use of water resources play an important role in modifying hydrological regimes and affecting water quality [29,30].
Many studies [31,32,33] highlight that the relationship between water resource pressure and water quality is often characterized by non-linear processes that depend on the specific characteristics of river basins and the interactions between hydrological processes and human activities. Therefore, understanding the mechanisms that drive variations in nutrient concentrations in rivers requires analytical approaches capable of integrating multiple categories of factors [7,22,30].
Different indicators are used to evaluate water quality for specific uses [34,35,36,37,38,39,40,41,42,43]. They are built on various water parameters and reflect the cumulative impact of anthropogenic activities on drinking water quality and aquatic ecosystems. In recent years, the specialized literature has begun to emphasize the need for an interdisciplinary approach to analysing hydrological systems that integrates social, institutional, and geopolitical dimensions of water resource management. In this context, beyond its environmental significance, surface water quality is increasingly recognized as a critical component of water security and a key dimension of national security and societal resilience. This recognition highlights the interdependence between hydrological processes, economic development, and geopolitical stability. Water is a strategic element in international relations, particularly in transboundary river basins [44]. Accordingly, the concept of water geopolitics emphasizes the links between hydrological resources, food security, and geopolitical stability.
In many regions of the world, water resources are shared between multiple states, necessitating institutional cooperation mechanisms for the sustainable management of watersheds. Studies on water geopolitics emphasize that hydrological resources can become factors of cooperation or conflict between states, especially in regions characterized by high water stress and dependence on transboundary resources. In such cases, competition for access to water resources can generate conflict or political tensions between states [45]. A lack of transboundary cooperation or geopolitical instability can affect states’ capacity to implement effective environmental protection policies and water quality monitoring [46].
The development of digital technologies and information systems has generated new opportunities for the integrated analysis of environmental data. Machine learning (ML) methods are increasingly used in hydrology to identify complex patterns in large datasets and to model non-linear relationships between hydrological, climatic, and anthropogenic variables [47]. Recent studies highlight the potential of ML-based methods in water quality analysis, pollution prediction, and the optimization of hydrological resource management in the context of climate change [48,49,50].
Given the complexity of hydrological systems, traditional statistical analysis methods [51,52,53,54,55,56] may face limitations in identifying complex relationships between environmental variables. Therefore, ML algorithms, such as decision trees, artificial neural networks (ANNs), Random Forest, Support Vector Machines (SVM), Gradient Boosting, and ensemble methods, have been increasingly utilized to model different water parameters (e.g., nutrient concentrations, dissolved oxygen, turbidity), and analyse water quality [21,48,50,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71]. Deep Learning methods (DL), such as Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNNs), are used to capture complex temporal patterns in long-term monitoring data [61,62,71]. For instance, a key application is the prediction of nitrate concentrations, where ML models have shown high accuracy in predicting temporal and spatial variations in complex basins [72,73]. These methods can integrate large volumes of data from heterogeneous sources, including climatic, hydrological, and socio-economic data [49]. Furthermore, they contribute to the development of intelligent environmental monitoring systems and the improvement of decision-making processes in water management.
Additionally, digital information platforms allow for the development of predictive models and decision support tools [74,75].
The integration of Remote Sensing and Geographic Information Systems (GIS) has expanded ML capabilities to analyze processes at regional or global scales, using satellite imagery to monitor water quality in areas where in situ data are limited [76]. Hybrid models offer more robust results [62,77]. However, challenges remain regarding data quality, model interpretability, and transferability. Recent studies highlight the need for Explainable AI (XAI) to interpret the relationships between hydrological variables and environmental factors [78,79,80,81]. Despite these advancements, specialized literature continues to focus either on hydrological and climatic factors or on water parameters. The integration of hydrological, climatic, geopolitical, and strategic factors into a unified analytical framework for assessing surface water quality risks remains insufficiently explored in hydrological literature, despite their importance in environmental governance, transboundary cooperation, and institutional capacity for water resource management. This limitation reduces the ability to understand the systemic vulnerabilities of aquatic ecosystems in a global context characterized by climatic and geopolitical uncertainty.
Therefore, the novelty of the present study lies in proposing an integrated analytical framework for assessing risks affecting surface water quality by combining hydrological, climatic, geopolitical, and strategic indicators within an approach based on information systems and machine learning methods. The analysis focuses on nitrate concentration in rivers as a parameter of surface water quality and investigates its relationship with water resource pressure, climate stress, and the geopolitical context. By integrating heterogeneous datasets from international sources and using ML models to identify relationships between variables, this study contributes to the development of interdisciplinary approaches in hydrological system analysis and the improvement of analytical tools used in water resource management. This interdisciplinary integration allows for a more comprehensive understanding of water security by linking environmental processes with institutional and geopolitical dynamics.
To achieve this general objective, the research pursues the following specific objectives:
  • Integration of heterogeneous datasets from environmental monitoring systems, climate services, and geopolitical/strategic databases into a unified analytical dataset organized at the country–year observation level.
  • Assessment of the influence of hydrological pressure and climate stress on nitrate concentrations in rivers, utilizing the WEI+ and Heat_Stress indicators.
  • Investigation of the role of the geopolitical context and strategic infrastructures in shaping the vulnerability of water resource management systems, using GPR and Nuclear_State/Warheads indicators.
  • Proposal of an analytical framework based on information systems capable of supporting integrated monitoring and decision-making processes for water resource management under conditions of climate stress and geopolitical uncertainty.

2. Materials and Methods

2.1. Data Series

The analysis utilizes an integrated set of indicators sourced from international environmental, climate, and strategic security databases. The variables were selected to capture the interaction between hydrological pressures, climatic factors, and the geopolitical context that can influence the dynamics of surface water quality. The data were harmonized within a common analytical framework to enable the modeling of the relationships between these factors. We utilized the following variables:
  • Dependent Variable (Nitrate): The dependent variable is the concentration of nitrates in rivers, extracted from the Eurostat environmental statistics database [82]. The indicator is expressed in mg/L. In the analytical model, this variable is treated as a continuous numerical variable and represents the target for the ML algorithm.
  • Hydrological Pressure: To capture the pressure on water resources, we used WEI+, which is available in the Eurostat sustainable development indicators database [83]. This indicator expresses the ratio (percentage) between the total volume of water abstracted and the available renewable freshwater resources, serving as a widely used tool in assessing hydrological stress. Within the dataset, the variable is treated as a continuous numerical variable expressed in percentages.
  • Climatic Dimension: The climatic dimension is represented by the variable Heat_Stress, defined as the annual number of days characterized by thermal stress, based on data provided by the Copernicus Climate Change Service—European State of the Climate [84]. This variable reflects the intensity of extreme temperature episodes that can influence hydrological processes, evapotranspiration, and the concentration of pollutants in surface waters. The variable is treated as a discrete numerical indicator, expressed as the number of days per year.
  • Geopolitical Context: The geopolitical context is captured through the GPR, developed by Caldara and Iacoviello [85]. This index measures the intensity of global geopolitical tensions based on the frequency of press articles about military conflicts, international tensions, and geopolitical risks. In the dataset [85], GPR is treated as a continuous numerical variable that reflects the level of geopolitical instability.
  • Strategic Variable (Nuclear_State): The model also includes a strategic variable represented by the estimated number of nuclear warheads, based on data provided by the Stockholm International Peace Research Institute [86] and the Federation of American Scientists [87]. The variable is treated as a discrete numerical variable. It does not represent a direct environmental pressure. It serves as a structural proxy reflecting the scale and complexity of national strategic infrastructures and the security context, which can indirectly influence environmental governance and natural resource management.
  • Temporal Variable (Year): In addition to the core variables, a temporal indicator (Year) was included in the analytical dataset to capture the evolution of nitrate concentration over time. This variable is treated as a discrete numerical variable. It allows the model to identify potential temporal patterns and structural changes in water quality dynamics. The inclusion of the Year enables the decision tree algorithm to detect time-dependent thresholds and shifts in the relationships between climatic stress, hydrological pressure, and nitrate concentration.
Table 1 summarizes the model variables, their types, and measurement units.
The resulting dataset is organized as a country–year panel, where each observation corresponds to a specific state–year pair. The analysis includes data for 30 European states for which all variables were available in the databases utilized during the study period 2013–2023, e.g., Austria, Belgium, Bulgaria, Croatia, the Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Italy, the Netherlands, Poland, Romania, Spain, and Sweden. We selected European states given (i) the availability and high standardization of Eurostat indicators for European states; (ii) the relevance of the European institutional and public policy framework in the field of water management; (iii) the necessity of maintaining comparability between the variables used.
The final structure of the dataset enables comparative analysis across states and investigation of temporal variations in the relationships among hydrological pressures, climatic factors, and the geopolitical context that may influence surface water quality.

2.2. Conceptual Framework

The study proposes an integrated analytical framework designed to evaluate the pressures affecting surface water quality by analyzing the combined influence of hydrological, climatic, geopolitical, and strategic factors. From a security perspective, the analytical framework can be interpreted through a Threat–Vulnerability–Indicator logic. Climatic stress and geopolitical instability represent potential drivers of environmental threats, hydrological pressure reflects structural vulnerabilities in water resource systems, while nitrate concentration functions as an indicator of environmental degradation affecting water security.
The conceptual model is based on the hypothesis that nitrate concentration in rivers—used as an indicator of surface water quality—is influenced by the interaction of several types of pressures acting simultaneously on hydrological systems. These pressures are associated with water resource exploitation, climate variability, the geopolitical context, and the presence of strategic infrastructures.
The relationship analyzed within the model can be expressed as:
Nitrate = f(WEI+, Heat_Stress, GPR, Nuclear_State),
where Nuclear_State is interpreted as a contextual variable reflecting structural and institutional conditions rather than a direct causal driver.
Within this conceptual model, it is assumed that increasing pressure on water resources and intensifying climatic stress can lead to higher pollutant concentrations in rivers by reducing natural dilution capacity and modifying hydrological processes within river basins. At the same time, geopolitical instability and the presence of strategic infrastructures can indirectly influence surface water quality through their impact on environmental governance, international cooperation regarding water resource management, and the efficiency of environmental monitoring systems. Consequently, the proposed conceptual framework integrates variables from different fields into an interdisciplinary approach, allowing for the analysis of complex relationships between hydrological, climatic, geopolitical, and strategic pressures and the dynamics of surface water quality.
The following research hypotheses are formulated:
H1. Hydrological pressure on water resources significantly influences nitrate concentration in rivers.
H2. Climatic stress contributes to variations in surface water quality. An increase in the number of days characterized by thermal stress (Heat_Stress) can influence hydrological and biochemical processes in river basins, affecting the transport and transformation of nutrients in aquatic ecosystems.
H3. The geopolitical context can indirectly influence the dynamics of surface water quality. High levels of GPR can affect transboundary cooperation in water resource management and the efficiency of institutional environmental monitoring mechanisms, generating additional vulnerabilities in water quality management. These variables are expected to exert indirect effects, shaping the context in which hydrological and climatic processes influence water quality.
H4. The presence of strategic infrastructures is associated with indirect pressures on the environment and surface water quality.
H5. The relationships between hydrological, climatic, geopolitical, and strategic factors and nitrate concentration are characterized by non-linear interactions.
The Research Questions are:
RQ1. To what extent do pressure on water resources and climatic stress influence nitrate concentrations in rivers?
RQ2. What role do geopolitical and strategic factors (represented by GPR and Nuclear_State) play in explaining variations in surface water quality?
RQ3. Can a decision tree model (REPTree) implemented in WEKA identify non-linear relationships and interaction patterns among hydrological, climatic, geopolitical, and strategic variables influencing nitrate concentrations?

2.3. Data Integration

Given the heterogeneous nature of the data sources used in this study, constructing the analytical dataset required a structured data integration process. This process aimed to ensure methodological consistency, indicator comparability, and compatibility of the final dataset with machine-learning-based analytical tools. Aligning the temporal coverage of the sources led to selecting the 2013–2023 interval as the optimal period, ensuring data completeness across all variables.
The process was conducted in four distinct stages:
  • Data collection: Indicators were gathered from multiple international sources, presented in Section 2.1. The various data structures and formats necessitated a standardization process for integration into a unified analytical system.
  • Harmonization of units and formats: Given that indicators from different sources often use distinct units of measurement or reporting formats, variables were standardized to ensure comparability across different states and time periods. This stage involved verifying units, converting variables where necessary, and unifying attribute names.
  • Temporal alignment: As databases offer different temporal coverages, we identified the common interval where all variables were available. Observations were only retained for country–year combinations where values for all indicators in the model could be identified or estimated.
  • Final Compilation: The harmonized variables were combined into an integrated country–year panel dataset. This structure allows for the simultaneous analysis of temporal variations and cross-state differences regarding the relationships between hydrological pressures, climatic stress, and the geopolitical context. The final dataset was initially structured in CSV (Comma-Separated Values) format for verification and cleaning. Subsequently, for the implementation of ML algorithms in WEKA, the dataset was converted into ARFF (Attribute–Relation File Format), the standard format used by this platform. We adopted the country-level aggregation to ensure compatibility with geopolitical and strategic indicators, which are inherently defined at the national level.
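The four stages above can be illustrated with a minimal sketch, offered under stated assumptions rather than as the authors' actual pipeline. It joins per-indicator tables on a (country, year) key, keeps only keys present in every predictor table, allows the target to be missing, and serializes the result as ARFF text. The function names `build_panel` and `to_arff`, the attribute name `WEI_plus`, the placeholder states "A" and "B", and all numeric values are hypothetical.

```python
def build_panel(predictors, target):
    """Join predictor tables on (country, year); the target may be missing."""
    names = sorted(predictors)
    # Temporal alignment: keep only keys present in every predictor table.
    keys = set.intersection(*(set(predictors[n]) for n in names))
    rows = []
    for country, year in sorted(keys):
        row = {"Year": year}
        for name in names:
            row[name] = predictors[name][(country, year)]
        row["Nitrate"] = target.get((country, year))  # None marks a missing value
        rows.append(row)
    return rows


def to_arff(rows, relation, attributes):
    """Serialize rows as ARFF text with numeric attributes; '?' marks missing."""
    lines = ["@relation " + relation, ""]
    lines += ["@attribute " + a + " numeric" for a in attributes]
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if row[a] is None else str(row[a])
                              for a in attributes))
    return "\n".join(lines)


# Invented values for two hypothetical states, for illustration only.
predictors = {
    "WEI_plus": {("A", 2013): 12.5, ("A", 2014): 13.0, ("B", 2013): 40.2},
    "Heat_Stress": {("A", 2013): 5, ("A", 2014): 9, ("B", 2013): 14},
    "GPR": {("A", 2013): 95.1, ("A", 2014): 101.3},  # no entry for B in 2013
}
nitrate = {("A", 2013): 3.1}  # the 2014 value is missing

panel = build_panel(predictors, nitrate)
arff_text = to_arff(panel, "water_quality",
                    ["Year", "WEI_plus", "Heat_Stress", "GPR", "Nitrate"])
print(arff_text)
```

In this sketch the observation for state "B" is dropped because one predictor is unavailable, while the row with a missing Nitrate value is kept and written with `?`, the ARFF marker for a missing value.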

2.4. Missing Data Treatment

Datasets used in climatic and hydrological analyses frequently contain missing values due to limitations in monitoring and reporting systems. In this study, the proportion of missing data was evaluated using MissingRate, defined as:
$$\text{Missing rate} = \frac{N_{\text{missing}}}{N_{\text{total}}},$$
where $N_{\text{missing}}$ represents the total number of missing values identified, and $N_{\text{total}}$ represents the total number of observations for the analyzed variables. Missing values were limited to 77 observations (approximately 4.7%), all associated with the Nitrate variable.
Given the relatively low proportion of missing data and their concentration within a single variable, the risk of systematic bias is considered minimal.
To avoid reducing the data volume, this study employed the ReplaceMissingValues method available in WEKA. This method is a standard preprocessing step that maintains all observations and ensures structural consistency. The algorithm applies the following rules:
  • For numerical variables: Missing values are replaced with the mean of the attribute.
  • For nominal variables: Missing values are replaced with the mode (the most frequent category).
After this preprocessing, we obtained 1650 data points across five variables.
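As a worked check of the figures above, the following sketch computes the missing rate and applies mean and mode replacement. It assumes that the 1650 data points reported in the text form the denominator of the missing rate. The helper names and the short example series are hypothetical, and the code only mirrors the replacement rules described for the WEKA filter, not its implementation.

```python
from statistics import mean, mode


def missing_rate(n_missing, n_total):
    """Share of missing values among all data points."""
    return n_missing / n_total


def impute_mean(values):
    """Replace None in a numeric series with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_mode(values):
    """Replace None in a nominal series with the most frequent category."""
    fill = mode(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


rate = missing_rate(77, 1650)      # counts reported in the text
print(round(100 * rate, 1))        # 4.7

nitrate = [2.0, None, 4.0, 6.0]    # invented numeric series
print(impute_mean(nitrate))        # [2.0, 4.0, 4.0, 6.0]

regime = ["low", "high", None, "low"]  # invented nominal series
print(impute_mode(regime))             # ['low', 'high', 'low', 'low']
```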
To further assess the robustness of the results, an alternative imputation approach based on the k-nearest neighbors (k-NN) method was applied. The k-NN estimates missing values based on similarity between observations in the feature space. The model was re-estimated using the imputed dataset, and the results showed no significant changes in the structure of the decision tree or in the hierarchy of the main predictors. This confirms that the identified relationships are stable and not sensitive to the choice of imputation method.
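The k-NN idea can be sketched as follows. The text does not report the value of k, the distance metric, or any feature scaling, so the Euclidean distance on unscaled features and k = 2 used here are assumptions, and the rows are invented. In practice, features on different scales would usually be standardized first.

```python
import math


def knn_impute(rows, target_idx, k=3):
    """Fill a missing target with the mean target of the k nearest complete rows."""
    complete = [r for r in rows if r[target_idx] is not None]
    filled = []
    for r in rows:
        if r[target_idx] is not None:
            filled.append(list(r))
            continue

        def dist(other, r=r):
            # Euclidean distance over the predictor columns only.
            return math.sqrt(sum((a - b) ** 2
                                 for i, (a, b) in enumerate(zip(r, other))
                                 if i != target_idx))

        nearest = sorted(complete, key=dist)[:k]
        estimate = sum(n[target_idx] for n in nearest) / len(nearest)
        new_row = list(r)
        new_row[target_idx] = estimate
        filled.append(new_row)
    return filled


# Columns: Heat_Stress, WEI+, Nitrate (invented values, last target missing).
rows = [
    (1.0, 10.0, 5.0),
    (1.2, 11.0, 6.0),
    (9.0, 60.0, 30.0),
    (1.1, 10.5, None),
]
imputed = knn_impute(rows, target_idx=2, k=2)
print(imputed[3])  # [1.1, 10.5, 5.5]
```

The missing value is estimated from the two rows with similar predictor values, so the distant third row does not influence it.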

2.5. Decision Tree Algorithm

Reduced Error Pruning Tree (REPTree) is a fast decision tree learner that builds a regression (or classification) tree using information gain (for classification) or variance reduction (for regression) and prunes it using reduced-error pruning (REP).
The algorithm operates through a series of distinct phases [88]:
  • Initial splitting: REPTree builds a decision tree by identifying the attribute that best splits the data. For the numerical “Nitrate” target, it uses variance reduction. The goal is to choose a split that minimizes the squared error (variance) of the values within each resulting branch.
  • Handling continuous and discrete variables: Since the dataset includes both continuous and discrete variables, REPTree treats them as follows:
    Continuous: It sorts the values and tests different threshold points, creating binary splits.
    Discrete: It treats them as numerical values to identify the most significant break points in the data distribution.
  • Pruning (The “REP” Component): To prevent overfitting (where the model learns “noise” in the country–year dataset), the algorithm uses Reduced Error Pruning. It holds back a portion of the data (the pruning set) to evaluate the tree. It then replaces subtrees with leaves (representing the average value) if the simplification does not increase the error on the pruning set.
  • Missing values: As already noted, REPTree can handle missing values by using the distributed method (splitting the instances with missing values among the branches) or by utilizing the pre-imputed means.
In the context of this research, the algorithm is particularly effective because it is designed to handle numerical targets and can capture the non-linear interactions between the hydrological, climatic, and geopolitical variables while maintaining a simplified, interpretable structure.
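The variance-reduction split search for a numerical target can be sketched in a few lines. This is an illustrative simplification, not WEKA's implementation: candidate thresholds are taken as midpoints between consecutive sorted values (an assumption), and the Heat_Stress and Nitrate values are invented to show a threshold effect.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, reduction) for the best binary split x < threshold."""
    pairs = sorted(zip(xs, ys))
    total = sse([y for _, y in pairs])
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        reduction = total - sse(left) - sse(right)
        if reduction > best[1]:
            best = (threshold, reduction)
    return best


heat_days = [2, 4, 6, 9, 11, 14]               # invented
nitrate = [3.0, 3.5, 4.0, 20.0, 22.0, 25.0]    # invented
threshold, reduction = best_split(heat_days, nitrate)
print(threshold)  # 7.5
```

A tree learner applies this search to every attribute at each node and splits on the attribute with the largest reduction, which is how a single dominant predictor ends up at the root.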
The key technical advantages are [89,90,91]:
  • Speed: It is often faster than standard CART or M5P algorithms because it only builds the tree once and uses simple pruning logic.
  • Interpretability: By pruning the tree, REPTree produces a model that is easy to visualize, making it possible to see which factors lead to higher nitrate concentrations.
  • Non-linearity: Unlike linear regression, REPTree can identify “threshold effects”; for instance, it can detect whether Geopolitical Risk only impacts water quality once it passes a certain critical value.
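The pruning step that underlies the interpretability advantage can be sketched as a bottom-up pass over a small tree. The dictionary-based tree layout, the function names, and all values below are hypothetical, and the sketch omits details of the WEKA implementation. A subtree is replaced by a leaf whenever the leaf's squared error on the held-out pruning set is not larger than that of the subtree.

```python
def predict(node, x):
    """Follow splits until a leaf is reached and return its value."""
    while "feature" in node:
        branch = "left" if x[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["value"]


def sq_error(node, rows):
    return sum((predict(node, x) - y) ** 2 for x, y in rows)


def prune(node, prune_rows):
    """Bottom-up reduced-error pruning against a held-out pruning set."""
    if "feature" not in node:
        return node
    f, t = node["feature"], node["threshold"]
    left_rows = [(x, y) for x, y in prune_rows if x[f] < t]
    right_rows = [(x, y) for x, y in prune_rows if x[f] >= t]
    node = dict(node,
                left=prune(node["left"], left_rows),
                right=prune(node["right"], right_rows))
    leaf = {"value": node["value"]}  # mean training target stored at the node
    if sq_error(leaf, prune_rows) <= sq_error(node, prune_rows):
        return leaf
    return node


# Features: (Heat_Stress, WEI+). The split on WEI+ mimics a noisy branch.
tree = {
    "feature": 0, "threshold": 7.5, "value": 12.0,
    "left": {"value": 3.5},
    "right": {
        "feature": 1, "threshold": 50.0, "value": 22.0,
        "left": {"value": 21.0},
        "right": {"value": 23.0},
    },
}
pruning_set = [((3, 10), 3.0), ((5, 60), 4.0), ((10, 20), 23.0), ((12, 70), 21.0)]
pruned = prune(tree, pruning_set)
print(pruned["right"])  # {'value': 22.0}
```

In this invented case the root split survives because removing it would raise the error on the pruning set, while the lower split is collapsed into a single leaf.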

2.6. Modeling Strategy and Analytical Design

To capture the complexity of the relationships between variables, the study employs a multi-model analytical strategy based on three complementary decision tree structures:
(i) An initial classification tree designed to identify ecological regimes of nitrate concentration based on threshold segmentation;
(ii) An optimized regression tree aimed at predicting nitrate concentration as a continuous variable and identifying the hierarchy of predictors;
(iii) An extended classification tree integrating geopolitical variables (GPR), used to assess the contextual influence of geopolitical instability on water quality dynamics.
These models are not independent but represent successive analytical layers of the same framework, allowing for the identification of both direct (climatic and hydrological) and indirect (geopolitical and strategic) influences on surface water quality.
During the classification tasks aimed at identifying distinct water quality regimes, the continuous variable Nitrate was transformed into categorical classes based on threshold values identified during exploratory analysis. This transformation enables the application of classification trees, which are used to identify clusters of country–year observations characterized by similar environmental and hydrological profiles.
The dual approach (regression and classification) allows for both quantitative prediction of nitrate levels and qualitative segmentation of water quality regimes.
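The transformation of the continuous Nitrate variable into classes can be sketched as a simple binning step. The cut points (5 and 20 mg/L) and the class labels below are hypothetical placeholders, since the thresholds identified during the exploratory analysis are not listed in this section.

```python
from bisect import bisect_right

CUTS = [5.0, 20.0]                     # hypothetical cut points in mg/L
LABELS = ["low", "moderate", "high"]   # hypothetical class names


def to_class(value, cuts=CUTS, labels=LABELS):
    """Map a concentration to a class; a value equal to a cut goes to the upper class."""
    return labels[bisect_right(cuts, value)]


classes = [to_class(v) for v in (2.75, 12.0, 20.0, 58.82)]
print(classes)  # ['low', 'moderate', 'high', 'high']
```

The same series can then feed a regression tree in its numeric form and a classification tree in its binned form.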

2.7. Model Validation

To optimize the predictive accuracy of the REPTree model and prevent overfitting, a 10-fold cross-validation approach was employed during the training phase. This iterative process allowed for the robust tuning of model parameters—specifically the pruning depth and leaf constraints—by ensuring that the selection of the final model was based on its consistent performance across ten distinct data partitions. By averaging the error metrics over these iterations, the cross-validation provided a stable foundation for identifying the optimal tree structure capable of capturing non-linear relationships within the heterogeneous dataset.
The tuning process implicitly controls model complexity through pruning and leaf size optimization, ensuring a balance between model accuracy and generalization. In the 10-fold cross-validation procedure, the dataset is randomly divided into ten approximately equal subsets. In each round, nine subsets are used for training, and the remaining subset is used for testing. This is repeated ten times so that every subset serves as the test set exactly once. The final performance is calculated as the average of the results across all iterations.
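The partitioning logic can be sketched as follows, assuming 330 country–year observations (30 states over 11 years, as described in Section 2.1). The function names, the random seed, the mean-only baseline model, and the synthetic target values are hypothetical and serve only to exercise the loop.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; every index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


def cross_validate(xs, ys, fit, score, k=10, seed=0):
    """Average a score over the k held-out folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in test]
        scores.append(score([ys[i] for i in test], preds))
    return sum(scores) / len(scores)


def fit_mean(train_x, train_y):
    """Baseline model that always predicts the training mean."""
    m = sum(train_y) / len(train_y)
    return lambda x: m


def mae(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


xs = list(range(330))                   # placeholder predictor values
ys = [float(i % 7) for i in range(330)] # synthetic target values
splits = list(k_fold_indices(len(xs), k=10))
cv_mae = cross_validate(xs, ys, fit_mean, mae, k=10)
print(len(splits), len(splits[0][1]), round(cv_mae, 3))
```

With 330 observations, each of the ten folds holds 33 test observations and leaves 297 for training.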
The performance of regression models was evaluated using the following statistical indicators:
  • Mean Absolute Error (MAE): Expresses the average absolute error between observed and estimated values, providing a direct measure of prediction accuracy.
  • Root Mean Squared Error (RMSE): Reflects the average magnitude of prediction errors while giving higher weight to larger errors. This is essential for understanding the model’s sensitivity to outliers.
  • Correlation Coefficient: Measures the degree of association between the actual values of the dependent variable and the model’s predictions.
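The three regression indicators can be computed as in the following sketch, which uses only the Python standard library. The observed and predicted values are made-up numbers chosen for illustration, not data from the study.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error; squaring gives large errors more weight."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Illustrative values only (mg/L); the last prediction is a deliberate outlier.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 12.0]
```

In this toy example the single large error raises the RMSE above the MAE, which illustrates why the two indicators are reported together.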
The performance of the classification models was evaluated using the following indicators [88,92]:
  • Accuracy: Represents the proportion of correctly classified instances out of the total number of observations. It provides an overall measure of the model’s ability to correctly identify different classes.
  • Precision: Expresses the proportion of correctly predicted positive instances out of all instances predicted as positive. It reflects the model’s ability to avoid false positive classifications.
  • Recall: Measures the proportion of actual positive instances that are correctly identified by the model. It indicates the model’s ability to detect relevant cases and minimize false negatives.
  • F1-score: Represents the harmonic mean of Precision and Recall, providing a balanced measure of model performance when both false positives and false negatives are important.
  • ROC Area: Reflects the model’s ability to distinguish between classes across different classification thresholds. A higher value indicates better discriminative performance, with values close to 1 suggesting excellent classification capability.
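For a single class treated as "positive" (one-vs-rest), the indicators listed above can be sketched as follows. The labels and scores are invented for illustration, and the function names are hypothetical; for multi-class problems the per-class values would still need to be averaged, which this sketch does not do.

```python
def classification_metrics(actual, predicted, positive):
    """Accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(actual, scores, positive):
    """ROC area as the probability that a random positive instance receives
    a higher score than a random negative one (ties count as one half)."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and class scores, not data from the study.
actual = ["high", "high", "high", "low", "low", "low", "low", "low"]
predicted = ["high", "high", "low", "high", "low", "low", "low", "low"]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.5]

metrics = classification_metrics(actual, predicted, positive="high")
auc = roc_auc(actual, scores, positive="high")
```

The pairwise formulation of the ROC area is quadratic in the number of observations, which is acceptable for a dataset of a few hundred country–year records.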

3. Results

3.1. Exploratory Analysis of Variable Relationships

The first stage of the analysis involved exploring the relationships between the variables included in the model using a scatter plot matrix generated in the WEKA environment. This graphical representation allows for the simultaneous examination of variable distributions and potential correlations. The results highlight several relevant patterns. First, the heat_stress variable shows a strong visual correlation with the geopolitical indicator GPR, suggesting that periods of high climatic stress are associated with more volatile geopolitical contexts or periods of global instability. Furthermore, an increasing trend is observed between heat_stress and year, reflecting the intensification of extreme temperature events over the analyzed timeframe.
The analysis of the relationship between WEI+ and nitrate indicates a relatively high dispersion of observations, suggesting the existence of non-linear relationships or intermediary factors influencing nitrate concentrations. Similarly, the nuclear_state variable presents a discrete distribution, concentrated around values specific to states with nuclear capabilities; this indicates that the indicator acts more as a structural proxy than as a continuous predictor (Figure 1).
These exploratory results suggest that the relationships between hydrological, climatic, and geopolitical variables are not strictly linear, justifying the use of a decision tree algorithm to identify complex patterns within the dataset.

3.2. Initial Structure of the Decision Tree

In the second stage of the analysis, the REPTree algorithm was applied to generate an initial decision tree structure aimed at modeling variations in nitrate concentrations based on the explanatory variables. In this preliminary structure, the Nitrate variable appears as the starting node for the segmentation process, with the first data split occurring at a threshold of approximately 16.09 mg/L. This initial segmentation suggests the existence of two distinct water quality regimes: one characterized by moderate nitrate levels, and another associated with higher pollutant concentrations. For observations below the 16.09 mg/L threshold, heat_stress emerged as the primary differentiating factor. The tree indicates that lower levels of thermal stress are associated with relatively lower nitrate concentrations, while higher values of this climatic indicator lead to further segmentation based on water resource pressure (WEI+). This stage of the analysis (Figure 2) allowed for the identification of specific country groupings. For instance, countries such as Finland, Estonia, Ireland, the Netherlands, and Iceland were clustered within branches characterized by distinct hydrological profiles based on the interaction between climatic stress and water resource pressure. For observations above the initial threshold, the tree confirms that high levels of climatic stress, interacting with hydrological pressure, amplify pollutant concentrations in surface waters.
The indicators presented in Table 2 confirm the REPTree model’s ability to identify relevant patterns within the analyzed dataset. The correct classification rate of 85.76% indicates a high capacity of the decision tree to distinguish between different nitrate concentration regimes generated by the interaction between climatic stress and pressure on water resources. Furthermore, the high Kappa coefficient (0.853) indicates a very good level of agreement between the model’s classifications and the actual distribution of observations. The Precision (0.815) and Recall (0.893) indicators, along with the F1 score of 0.866, highlight a balanced model performance in correctly identifying relevant classes.
At the same time, the high ROC area (0.935) suggests an excellent ability of the model to separate observations belonging to different hydrological profiles. Overall, these results confirm the robustness of the initial decision tree structure and the relevance of climatic and hydrological variables in explaining variations in nitrate concentrations.
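Because the Kappa coefficient is reported alongside the classification rate, a short sketch of how it corrects accuracy for chance agreement may be useful. The labels below are hypothetical and the function name is an assumption made for illustration.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Agreement expected by chance if labels were assigned independently
    # with the same marginal frequencies.
    expected = sum(actual_counts[c] * predicted_counts[c]
                   for c in actual_counts) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class example with 8 of 10 labels in agreement.
actual = ["A"] * 5 + ["B"] * 5
predicted = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "A"]
kappa = cohen_kappa(actual, predicted)
```

In this example the raw agreement is 0.8, but half of it would be expected by chance, so the coefficient drops to 0.6. Kappa therefore gives a more conservative reading than the classification rate when classes are few or unevenly distributed.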

3.3. Final Decision Tree Structure and Predictive Logic

The final predictive analysis utilized the REPTree algorithm with a 10-fold cross-validation procedure on 330 country–year observations to establish a refined hierarchy of predictors (Figure 3).
In this optimized model, Heat_Stress is identified as the primary predictor influencing nitrate concentrations, followed by hydrological pressure (WEI+) and temporal variation (Year). The root node of this final tree is represented by the variable Heat_Stress, splitting the dataset at a critical threshold of 7.83 days. This confirms that climatic stress is the most significant driver influencing nitrate dynamics in the analyzed dataset. For observations characterized by lower climatic stress (Heat_Stress < 7.83), the model refines the predictions based on secondary thresholds of stress and hydrological pressure. Specifically, when Heat_Stress < 4.28, the model predicts a moderate nitrate concentration of approximately 16.99 mg/L (Rule 1). When heat stress is between 4.28 and 7.83, the influence of WEI+ becomes a determining factor. Low water exploitation (WEI+ < 0.39) leads to the lowest predicted nitrate level of 2.75 mg/L (Rule 2), while higher exploitation (WEI+ ≥ 0.39) results in 5.73 mg/L (Rule 3).
Under conditions of high climatic stress (Heat_Stress ≥ 7.83), hydrological pressure becomes the main differentiating factor. Moderate water exploitation (WEI+ < 67.39) results in a predicted concentration of 17.05 mg/L (Rule 4). Intensive water exploitation (WEI+ ≥ 67.39) increases predicted levels to 20.34 mg/L (Rule 5). The model further identifies a critical temporal shift for cases of extreme hydrological pressure: before 2019, the predicted concentration reached a peak of 58.82 mg/L (Rule 6), whereas after 2019, the model shows a relative reduction to 19.31 mg/L (Rule 7) (Table 3).
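The branches whose thresholds are stated explicitly above can be written as a short rule function, which makes the predictive logic easy to inspect. This is a sketch reconstructed from the text rather than an export of the fitted tree, and the function name is hypothetical. The two year-dependent leaves (58.82 and 19.31 mg/L) are deliberately left out, because the WEI+ cut-off that separates "extreme" from "intensive" hydrological pressure is not given in this section.

```python
def predict_nitrate(heat_stress, wei_plus):
    """Predicted nitrate (mg/L) for the branches with thresholds stated in
    the text. The year-dependent leaves for extreme hydrological pressure
    are omitted because their WEI+ cut-off is not reported here.
    """
    if heat_stress < 7.83:            # lower climatic stress
        if heat_stress < 4.28:
            return 16.99
        if wei_plus < 0.39:           # low water exploitation
            return 2.75
        return 5.73
    if wei_plus < 67.39:              # high stress, moderate exploitation
        return 17.05
    return 20.34                      # high stress, intensive exploitation


examples = {
    (3.0, 10.0): predict_nitrate(3.0, 10.0),
    (5.0, 0.2): predict_nitrate(5.0, 0.2),
    (9.0, 70.0): predict_nitrate(9.0, 70.0),
}
```

Expressed this way, each leaf corresponds to one explicit combination of conditions, which is the interpretability property that motivates the use of a single tree.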
The robustness of this hierarchical structure is confirmed by the performance metrics in Table 4. The high correlation coefficient (0.976) indicates a very strong agreement between observed and predicted values. Furthermore, the low MAE (0.593) and RMSE (2.046) confirm that the model accurately captures the complex, non-linear relationships between climatic stress, hydrological pressure, and nitrate concentrations in rivers.
The decision tree analysis highlights a progressive evolution of the predictive structure used to explain variations in nitrate concentrations. The initial tree structure, presented in the previous section, allowed for the identification of ecological segmentation mechanisms associated with the interaction between climatic stress and pressure on water resources. At this stage, the analysis revealed the existence of distinct water quality regimes and groupings of countries characterized by similar hydrological profiles.
The optimized model presented in Figure 3 refines this initial structure and establishes a clearer predictive hierarchy of the factors influencing nitrate concentrations. The results indicate that climatic stress represents the primary predictor of observed variations, while pressure on water resources and the temporal variable contribute to explaining the differences between observations within the analyzed dataset.
This structure underscores the climatic–hydrological mechanism underlying the dynamics of nitrate pollution and provides an analytical basis for integrating additional factors that may indirectly influence the functioning of the hydrological system. The results emphasize that variations in nitrate concentration are primarily determined by the interaction between climatic stress and pressure on hydrological resources. However, these processes do not manifest in isolation; rather, they are influenced by the institutional and socio-political framework in which water resource management occurs. Consequently, the analysis can be extended by integrating contextual variables capable of reflecting these external conditions, offering a broader perspective on the factors influencing water quality dynamics.

3.4. Geopolitical and Strategic Context of Water Quality Risks

The third model generated by the REPTree algorithm introduces the geopolitical context into the previously identified climatic–hydrological structure and yields a new hierarchy of predictors influencing nitrate concentration. In this structure, the GPR appears as the initial splitting variable, indicating that the general geopolitical context may play an indirect role in shaping the conditions that influence surface water quality.
The tree (Figure 4) separates observations based on the GPR threshold, distinguishing two main scenarios characterized by different levels of geopolitical risk.
For lower values of GPR, nitrate concentrations are primarily influenced by pressure on water resources (WEI+) and climatic stress (Heat_Stress). Under these conditions, the decision tree identifies several clusters of countries characterized by distinct combinations of these factors, including Finland, Latvia, Estonia, Iceland, and Bulgaria. For higher values of the geopolitical index, the model highlights a more complex interaction between hydrological and climatic variables. Within this branch of the tree, WEI+ and Heat_Stress contribute to segmenting observations into distinct country groups, such as Ireland, Croatia, Greece, Malta, France, and Italy, each characterized by varying levels of water resource pressure and climatic stress. Furthermore, in scenarios characterized by very high nitrate concentrations, the tree identifies terminal nodes associated with countries such as Belgium. This suggests the presence of significant hydrological and anthropogenic pressures on surface water systems in these regions.
The results indicate that:
  • The inclusion of Finland, Estonia, and Latvia (Northern-Baltic group) in the low GPR branch suggests that in politically stable contexts with lower water exploitation, nitrate levels are more directly a function of natural climate variability.
  • The inclusion of Italy, France, and Greece (Mediterranean/Western group) under higher GPR values indicates that in these regions, water quality is managed (or stressed) within more volatile socio-economic frameworks, where irrigation and heat stress play a compounding role.
  • The inclusion of Belgium in the “very high concentration” terminal branches aligns with known data regarding high population density and intensive livestock farming, which create the “anthropogenic pressure” the model detected.
The performance indicators (Table 5) confirm the extended model’s ability to identify relevant patterns within the analyzed dataset.
The correct classification rate of 86.16% indicates a high performance of the decision tree in distinguishing observations characterized by different combinations of geopolitical context, pressure on water resources, and climatic stress. The Kappa coefficient (0.853) highlights a very good level of agreement between the model’s classifications and the actual distribution of the data. Simultaneously, the high values for Precision (0.875), Recall (0.927), and F1-score (0.867) indicate a balanced performance in identifying relevant classes. The ROC area of 0.942 confirms the model’s excellent capacity to separate observations belonging to different hydrological and geopolitical profiles, supporting the hypothesis that geopolitical factors can indirectly influence surface water quality dynamics. Overall, the results obtained by applying the REPTree algorithm highlight a hierarchical structure of factors influencing nitrate concentration dynamics in surface waters.

3.5. Comparative Analysis of the Decision Tree Models

The three decision structures presented in Figure 2, Figure 3 and Figure 4 do not represent independent models, but rather complementary analytical levels of the same explanatory framework. The initial tree highlights the ecological segmentation mechanism of nitrate concentrations, where the interaction between climatic stress and water resource pressure determines the differentiation of water quality regimes. The optimized tree refines this structure and identifies climatic stress as the primary predictor of nitrate variations, while introducing the temporal dimension of the phenomenon’s evolution. Finally, the integration of the geopolitical risk index extends the climatic–hydrological model by introducing the socio-political context in which these processes manifest.
To highlight the complementary role of the three decision structures identified in the analysis, Table 6 summarizes the main characteristics of each model, i.e., the type of tree used, the root variable identified by the algorithm, and the analytical role of each structure within the analyzed information system.
The performance of the models generated by the REPTree algorithm highlights the role of various decision structures in analyzing the relationships between hydrological, climatic, and geopolitical variables.
It is important to note that the three analyzed trees correspond to different types of predictive models, which justifies the use of distinct evaluation metrics. As such, the model presented in Figure 3 represents a regression tree used for estimating numerical values of nitrate concentrations; therefore, its performance is evaluated through specific regression indicators, such as the correlation coefficient, MAE, and RMSE. In contrast, the trees presented in Figure 2 and Figure 4 represent classification models, where the target variable is the distribution of observations at the state level. In these cases, model performance is evaluated using classification-specific metrics such as Accuracy, the Kappa coefficient, as well as Precision, Recall, and the F1-score.
The differences between these values do not reflect methodological inconsistencies, but rather the distinct nature of the analytical tasks addressed by each model. The initial tree (Figure 2) serves to identify distinct ecological water quality regimes and to segment observations based on the interaction between climatic stress and water resource pressure. The optimized tree (Figure 3) refines this analysis by constructing a predictive model capable of explaining nitrate concentration variations as a function of climatic and hydrological variables. Finally, the geopolitical contextual model (Figure 4) introduces the GPR variable and highlights the fact that the previously identified hydrological and climatic processes are indirectly influenced by the geopolitical context and institutional structure of states. Consequently, the three models must be interpreted as complementary analytical levels within the same explanatory framework, where climatic and hydrological pressures directly determine nitrate concentration dynamics, while geopolitical factors provide a contextual framework that influences how these pressures manifest at a regional level.
The choice of REPTree was motivated by its high interpretability, which is essential for identifying threshold effects and supporting decision-making processes in environmental management. For comparison, a Random Forest model was also applied. For Heat_Stress, the model achieved a correlation coefficient of 0.9979, MAE = 0.218, and RMSE = 0.359. For WEI+, the performance remained strong but lower than for REPTree, with a correlation coefficient of 0.9801, MAE = 1.1594, and RMSE = 2.7965.
At the same time, the comparative analysis highlighted that the primary predictors identified remain consistent with those resulting from the REPTree models, specifically the dominant role of the Heat_Stress and WEI+ variables. This confirms the robustness of the identified relationship structures.
To assess the robustness of the model, a sensitivity analysis was conducted by examining the influence of extreme values and data preprocessing procedures. The analysis included the identification and temporary removal of outliers, as well as a comparison between different missing data imputation approaches. The results showed that the structure of the decision tree and the hierarchy of the main predictors remained stable, indicating a high level of model robustness.

4. Discussion

The results obtained from the analysis highlight that the dynamics of surface water quality are determined by the interaction between several categories of pressures, including hydrological, climatic, geopolitical, and strategic factors. Integrating these variables into a common analytical framework allows for a broader understanding of the mechanisms that influence nitrate concentrations in rivers.
The study indicates that the pressure on water resources, reflected by WEI+, represents a significant factor in explaining variations in nitrate concentrations. High values of this indicator suggest an intensive use of water resources, which can lead to a reduction in the natural dilution capacity of pollutants within river systems. When the available water volume decreases or is heavily exploited, the concentrations of nutrients and other pollutants can rise, even in the absence of a significant increase in pollutant loading. This result is consistent with existing literature, which emphasizes that pressure on water resources can amplify the effects of diffuse pollution and contribute to the deterioration of aquatic ecosystem quality [7,20,25,33,36,38].
The variable Heat_Stress, used as a climatic stress indicator, reflects the impact of extreme meteorological conditions on hydrological systems. An increase in the frequency of days characterized by high temperatures can affect both hydrological and biogeochemical processes within watersheds. High temperatures influence evaporation processes, surface runoff regimes, and the transformation of nitrogen in aquatic ecosystems. Thus, climatic variability represents a critical factor that must be considered when evaluating the vulnerability of surface water systems. The observed influence of climatic stress on nitrate concentrations is also supported by empirical evidence showing that increasing temperatures, drought episodes, and altered runoff regimes can intensify nutrient accumulation processes in surface waters. Such mechanisms have been documented in studies on catchment nutrient dynamics under climate change and water security stress [25,33]. The Random Forest results confirm the robustness of the identified relationships, particularly the central role of Heat_Stress and WEI+.
The consistency between the present results and previously reported empirical findings strengthens the validity of the proposed analytical framework. Although this study does not rely on primary experimental measurements, it integrates and extends existing empirical knowledge through a systemic, data-driven, and interpretable modeling approach.
A distinctive element of this study is the integration of GPR into the analysis of surface water quality. Although this variable does not act as a direct hydrological determinant, the results suggest that geopolitical factors can indirectly influence water resource management. This distinction highlights that geopolitical factors operate at a systemic level, influencing institutional conditions and cooperation mechanisms rather than directly determining water quality outcomes. Geopolitical instability can affect international cooperation in water resource management, particularly in the case of transboundary river basins [93,94,95]. In such contexts, institutional environmental monitoring mechanisms and the implementation of water protection policies may become less effective, thereby increasing the vulnerability of aquatic ecosystems [96].
The integration of the Nuclear_State variable introduces a strategic dimension to the institutional and infrastructural environment governing water resources. While not a direct hydrological determinant, this variable functions as a robust proxy indicator for the presence of highly complex and large-scale strategic infrastructures [97]. The absence of this variable in the final decision rules suggests that its influence is not direct but mediated through hydrological and climatic pressures, reinforcing its role as a contextual rather than predictive factor. The identified thresholds (e.g., Heat_Stress and WEI+) can support the development of early warning systems by signaling critical conditions under which water quality deterioration is likely to occur. These thresholds provide actionable insights for policymakers, enabling targeted interventions in water resource management.
States possessing such capabilities are typically characterized by extensive industrial, energy, and logistic systems that necessitate significant resource throughput [98]. These systems can generate profound indirect pressures on the environment, particularly through the intensification of economic activities and the high-volume utilization of natural resources required to maintain strategic autonomy.
Furthermore, the strategic and geopolitical context often shifts governance priorities, whereby “administrative rationalism” may prioritize industrial or energy security over ecological health [99]. Such shifts in the allocation of institutional resources can have significant implications for the implementation of environmental protection policies, potentially compromising the efficiency and transparency of water quality monitoring systems [100]. In this analytical framework, the Nuclear_State indicator captures these underlying structural and institutional pressures that traditional biogeophysical variables fail to account for.
Interestingly, the strategic variable capturing nuclear capabilities does not appear in the final decision rules, suggesting that environmental pressures related to water quality are not primarily driven by strategic or military factors, but rather by climatic stress and water resource exploitation. This supports the interpretation that geopolitical and strategic variables act as background conditions influencing the system, rather than as immediate drivers of water quality variation.
A key result of this study is demonstrating the utility of integrating heterogeneous datasets into a unified analytical framework. The use of a decision tree model allows for the identification of non-linear relationships and complex interactions between explanatory variables and water quality indicators. This approach highlights that the dynamics of surface water quality cannot be explained exclusively by hydrological or climatic variables; instead, they must be analyzed within a broader context that includes institutional, geopolitical, and strategic factors. Consequently, the integration of machine learning methods into hydrological studies can contribute to the development of more robust analytical tools for assessing water system vulnerabilities and supporting decision-making processes in water resource management.
This study proposes a data-driven analytical framework that differs fundamentally from traditional applications of Remote Sensing (RS) and Geographic Information Systems (GIS), while remaining complementary to them. In addition, the methodological choice of a single decision tree model is critically positioned against ensemble learning approaches, highlighting trade-offs between interpretability and predictive performance.
RS has been widely applied in water quality studies due to its ability to provide synoptic and repetitive observations of large water bodies. Recent RS-based studies have demonstrated strong performance in monitoring spatial patterns of water quality, particularly through the estimation of optically detectable parameters such as chlorophyll-a, turbidity, and dissolved organic matter [101,102]. For example, Deng et al. [76] highlighted the effectiveness of satellite platforms such as Sentinel-2 and MODIS in capturing large-scale variations in water quality, emphasizing their role in extending monitoring coverage in data-scarce regions. Similarly, Pan et al. [78] showed that RS and artificial intelligence approaches significantly improve prediction accuracy for river water quality, especially when integrated with time-series modeling techniques.
Operational Earth observation programs such as the Copernicus Programme further enhance the availability of standardized datasets for environmental monitoring and assessment. Despite its advantages, RS is constrained by atmospheric conditions such as cloud cover and by the indirect nature of water quality estimation, which often requires calibration with in situ measurements to ensure accuracy and reliability. Moreover, RS is inherently limited to observable physical variables and does not provide mechanisms for modeling causal relationships involving non-environmental drivers.
GIS extends the capabilities of RS by enabling the integration, management, and visualization of spatial data [103]. GIS supports spatial analysis through overlay techniques and spatial modeling, playing a central role in environmental assessment and water resource management [104]. GIS-based studies have focused on the spatial representation and classification of water quality indicators. For instance, Chabuk et al. [6] applied Water Quality Index (WQI) methods combined with GIS tools to assess spatial variations in river systems, providing detailed cartographic insights into pollution distribution and regional water quality patterns.
Although GIS is a powerful tool for spatial analysis, its effectiveness is dependent on the quality and completeness of input data, as it does not independently generate environmental information but rather integrates and processes existing datasets [103]. These limitations highlight the need for complementary analytical approaches capable of modeling complex system dynamics.
The literature emphasizes that the integration of RS and GIS enhances the effectiveness of water quality monitoring systems. Remote sensing provides large-scale environmental observations, while GIS enables spatial structuring and analytical interpretation of these datasets. This integration is particularly effective for identifying spatial hotspots of pollution and assessing temporal changes in water quality [105]. However, both approaches require validation through ground-based measurements to ensure reliability and accuracy.
Nevertheless, GIS-based approaches are typically limited in their ability to capture complex non-linear relationships without integration with advanced analytical methods.
In contrast, the present study does not aim to retrieve water quality parameters from satellite imagery or to generate high-resolution spatial maps. Instead, it develops an integrated information systems framework that combines hydrological pressure, climatic stress, geopolitical instability, and strategic context within a unified analytical model. This approach allows for the identification of non-linear relationships, critical thresholds, and systemic interactions that are not directly observable through Remote Sensing or GIS-based methods.
Decision tree models are widely recognized for their ability to capture non-linear relationships and interaction effects while maintaining interpretability [106,107]. While ensemble learning methods such as Random Forest and Gradient Boosting often achieve higher predictive accuracy, they are frequently characterized as “black-box” models due to their limited interpretability [108,109]. In contrast, the single-tree approach adopted in this study provides transparent decision rules that allow for the identification of threshold effects and conditional dependencies. For example, RS-GIS literature often focuses on identifying spatial patterns of pollution, whereas the present approach identifies why such patterns emerge under specific combinations of hydrological and climatic stress.
The country–year panel structure (2013–2023) enables simultaneous analysis of temporal dynamics and cross-national variability. While this aggregation reduces spatial resolution compared to RS and GIS-based approaches, it enhances comparability and facilitates the identification of systemic drivers operating at the national level.
From a methodological perspective, robustness is ensured through structured preprocessing, including harmonization, temporal alignment, and missing data imputation. The application of k-nearest neighbors (k-NN) imputation is particularly relevant, as it preserves multivariate relationships and has been widely used in environmental datasets [110].
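To illustrate how k-NN imputation preserves multivariate relationships, the sketch below fills a missing value with the mean of that variable over the k most similar observations. It is a simplified illustration, not the procedure used in the study: the number of neighbors, the distance definition, the function name, and the example rows are all assumptions, and in practice the variables would usually be standardized first.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root mean squared difference over the columns that
    both rows have observed (an assumed, simplified metric).
    """
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a donor must have the missing column observed
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                completed[i][j] = sum(nearest) / len(nearest)
    return completed


# Hypothetical rows: (heat stress days, WEI+, nitrate in mg/L).
rows = [
    [2.0, 1.0, 10.0],
    [2.5, 1.2, 12.0],
    [9.0, 70.0, 40.0],
    [2.2, 1.1, None],
]
completed = knn_impute(rows, k=2)
```

The missing nitrate value is filled from the two observations with similar climatic and hydrological conditions, while the dissimilar high-stress observation is ignored. A simple column mean would not make this distinction.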
A primary aspect revealed by the study is the need to develop integrated monitoring systems capable of correlating environmental indicators with climatic variables and strategic–contextual indicators. In many cases, existing monitoring systems are confined to a single domain, which may reduce their ability to identify interactions between different types of pressures affecting water quality. Integrating data from diverse sources can contribute to the development of more robust information platforms to support the vulnerability analysis of hydrological systems. From a policy perspective, this kind of integration into a unified information system enhances the capacity of monitoring frameworks to detect emerging vulnerabilities and to support adaptive management strategies. The decision rules presented in Table 3 provide an interpretable structure that can be directly translated into monitoring indicators and policy-relevant thresholds.
The results suggest that hydrological pressure and climatic variability can amplify risks associated with nutrient pollution in surface waters. In this context, water resource management strategies should adopt adaptive approaches that account for evolving climatic conditions and increasing pressure on water resources. Tools such as the WEI+ indicator can be utilized to assess hydrological stress levels and to substantiate policies for sustainable water use. Simultaneously, integrating climatic indicators into early warning systems can help anticipate periods of high vulnerability for aquatic ecosystems.
A second aspect is represented by the importance of institutional and transboundary cooperation. In the case of transboundary river basins, the geopolitical context can influence cooperation mechanisms regarding water resource management. Institutional stability and international cooperation are essential for ensuring effective management and preventing the degradation of aquatic ecosystems. Integrating geopolitical indicators into water management analyses can help identify potential institutional vulnerabilities and support the development of more robust cooperation strategies for managing shared resources.
The third aspect is the potential of information systems and machine learning (ML) methods for the integrated analysis of the factors influencing surface water quality. By combining heterogeneous datasets with advanced analytical models, such systems can support decision-making processes in water management.
Sensitivity tests supported the robustness of the model, indicating that the identified relationships are not driven by extreme observations or preprocessing choices. The results highlight the added value of interdisciplinary approaches, which enable the analysis of water quality not only as an environmental issue, but also as a component of broader water security frameworks. Consequently, integrated information platforms for environmental monitoring can be a valuable tool for public authorities and institutions responsible for water management, facilitating data-driven decisions and interdisciplinary analyses.

5. Conclusions

This study proposed an integrated analytical framework for assessing surface water quality by combining hydrological, climatic, geopolitical, and strategic indicators within a machine learning-based information system. From a broader perspective, this framework serves as an analytical tool for the early detection of environmental risks affecting water security. By integrating these heterogeneous datasets, the system identifies patterns of increasing environmental pressure, enabling decision-makers to anticipate emerging risks and strengthen the resilience of water governance under conditions of climatic uncertainty and geopolitical instability. The results highlight that nitrate dynamics in rivers are driven by the interaction of multiple pressures. Hydrological stress, reflected by the WEI+, and climatic variability, expressed through Heat_Stress, directly alter hydrological processes and reduce the natural dilution capacity of aquatic ecosystems.
The decision tree analysis identified climatic stress (Heat_Stress) as the primary predictor, with a critical threshold of approximately 7.83 days beyond which nitrate concentrations increase significantly. Under conditions of high climatic stress and intensive water exploitation (WEI+ ≥ 67.39), predicted nitrate levels exceed 20 mg/L and may reach extreme values of up to 58.82 mg/L in specific temporal contexts. In contrast, low hydrological pressure (WEI+ < 0.39) combined with moderate climatic stress is associated with very low nitrate concentrations, around 2.75 mg/L. These findings confirm the existence of non-linear relationships and threshold effects governing water quality dynamics.
The regression model shows high predictive accuracy, with a correlation coefficient of 0.976, a Mean Absolute Error (MAE) of 0.593, and a Root Mean Squared Error (RMSE) of 2.046. The classification models also performed well, with accuracies above 85% (85.76% and 86.16%) and ROC areas of 0.935 and 0.942, indicating strong discriminative capacity between different water quality regimes.
The integration of geopolitical factors further extends the analytical framework by highlighting the contextual role of institutional and socio-political conditions. Although the GPR does not act as a direct predictor, the results show that it influences the structure of relationships between climatic and hydrological variables, contributing to the differentiation of regional vulnerability profiles.
Simultaneously, the strategic dimension of the institutional environment, captured by the Nuclear_State variable, acts as a proxy for complex industrial infrastructures. Together with geopolitical factors, it may indirectly influence water resource governance and the efficiency of environmental monitoring systems, for example by shifting institutional priorities away from ecological protection.
Overall, the study demonstrates that surface water quality is the result of complex interactions between climatic stress, hydrological pressure, and broader geopolitical conditions. The proposed information system-based framework, supported by decision tree models, provides a robust and interpretable tool for monitoring, predicting, and managing water quality risks under conditions of climate variability and geopolitical uncertainty. This research extends traditional hydrology by including contextual factors that influence environmental management systems, offering a unified model for strategic monitoring and early warning.
However, the study has limitations. The analysis relies on a limited set of indicators and on data aggregated at the country–year level, which may not fully capture local or sub-basin variability in water pressure and hydrological processes. Future research should therefore incorporate higher-resolution datasets at the river basin or sub-basin level to improve the spatial accuracy of the analysis. The framework could also be expanded with variables such as land-use indicators and specific agricultural pressures. Such enhancements would support the development of more robust information platforms for identifying vulnerabilities and ensuring the long-term stability of aquatic ecosystems in the face of global environmental and geopolitical change.

Author Contributions

Conceptualization, F.L.D.-C. and A.B.; methodology, F.L.D.-C.; software, F.L.D.-C.; validation, A.B.; formal analysis, A.B.; investigation, F.L.D.-C. and A.B.; resources, F.L.D.-C. and A.B.; data curation, F.L.D.-C.; writing—original draft preparation, F.L.D.-C. and A.B.; writing—review and editing, F.L.D.-C. and A.B.; visualization, F.L.D.-C.; supervision, A.B.; project administration, A.B.; funding acquisition, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. The dataset is available upon reasonable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. World Health Organization. Strong Systems and Sound Investments: Evidence on and Key Insights into Accelerating Progress on Sanitation, Drinking-Water and Hygiene, UN-Water Global Analysis and Assessment of Sanitation and Drinking-Water (GLAAS) 2022 Report. Available online: https://www.who.int/publications/i/item/9789240065031 (accessed on 15 January 2026).
  2. World Health Organization. Water, Sanitation, Hygiene and Health: A Primer for Health Professionals; World Health Organization: Switzerland, Geneva, 2019; Available online: https://www.who.int/publications/i/item/WHO-CED-PHE-WSH-19.149 (accessed on 15 January 2026).
  3. Bărbulescu, A.; Barbes, L.; Dumitriu, C.S. Statistical assessment of the water quality using water quality indicators—Case study from India. In Water Safety, Security and Sustainability. Advanced Sciences and Technologies for Security Applications; Vaseashta, A., Maftei, C., Eds.; Springer: Cham, Switzerland; pp. 599–613.
  4. Nichita, C.; Voinea, S. Removal of the pharmaceutical pollutants from water using natural filter materials-experimental lab. Rom. Rep. Phys. 2024, 76, 706. [Google Scholar] [CrossRef]
  5. Tăban, C.I.; Sandu, A.; Oancea, S.; Stoia, M. Gross alpha/beta radioactivity of drinking water and relationships with quality parameters of water from Alba County, Romania. Rom. J. Phys. 2024, 69, 806. [Google Scholar]
  6. Chabuk, A.; Al-Madhlom, Q.; Al-Maliki, A.; Al-Ansari, N.; Hussain, H.; Knutsson, S. Water quality assessment along Tigris River (Iraq) using water quality index (WQI) and GIS software. Arab. J. Geosci. 2020, 13, 654. [Google Scholar] [CrossRef]
  7. Pueppke, S.G.; Nurtazin, S.T.; Graham, N.A.; Qi, J. Central Asia’s Ili River ecosystem as a wicked problem: Unraveling complex interrelationships at the interface of water, energy, and food. Water 2018, 10, 541. [Google Scholar] [CrossRef]
  8. Rosianu, A.-M.; Leru, P.M.; Stefan, S.; Iorga, G.; Marmureanu, L. Six-year monitoring of atmospheric pollen and major air pollutant concentrations in relation with meteorological factors in Bucharest, Romania. Rom. Rep. Phys. 2022, 74, 703. [Google Scholar]
  9. Grizzetti, B.; Bouraoui, F.; Billen, G.; van Grinsven, H.; Cardoso, A.C.; Thieu, V.; Garnier, J.; Curtis, C.; Howarth, R.; Johnes, P. Nitrogen as a threat to European water quality. In The European Nitrogen Assessment; Sutton, M.A., Ed.; Cambridge University Press: Cambridge, UK, 2011; pp. 379–404. [Google Scholar]
  10. Yousefi, H.; Karimi Douna, B. Risk of Nitrate Residues in Food Products and Drinking Water. Asian Pac. J. Environ. Cancer 2023, 6, 69–79. [Google Scholar] [CrossRef]
  11. Møller, H.; Landt, J.; Jensen, P.E.R.; Pedersen, E.; Autrup, H.; Jensen, O.L.E.M. Nitrate exposure from drinking water and diet in a Danish rural population. Int. J. Epidemiol. 1989, 18, 206–212. [Google Scholar] [CrossRef]
  12. Wolfe, A.H.; Patz, J.A. Reactive nitrogen and human health: Acute and long-term implications. AMBIO A J. Hum. Environ. 2002, 31, 120–125. [Google Scholar] [CrossRef]
  13. Comly, H.H. Cyanosis in infants caused by nitrates in well water. J. Am. Med. Assoc. 1945, 129, 112–116. [Google Scholar] [CrossRef]
  14. WHO. Guidelines for Drinking-Water Quality, Incorporating the First and Second Addenda, 4th ed.; World Health Organization: Switzerland, Geneva, 2022; p. 631. [Google Scholar]
  15. Ward, M.H.; deKok, T.M.; Levallois, P.; Brender, J.; Gulis, G.; Nolan, B.T.; VanDerslice, J. Workgroup Report: Drinking-Water Nitrate and Health─Recent Findings and Research Needs. Environ. Health Perspect. 2005, 113, 1607–1614. [Google Scholar] [CrossRef]
  16. Fewtrell, L. Drinking-water nitrate, methemoglobinemia, and global burden of disease: A discussion. Environ. Health Perspect. 2004, 112, 1371–1374. [Google Scholar] [CrossRef] [PubMed]
  17. Wang, J.; Liu, X.; Beusen, A.H.W.; Middelburg, J.J. Surface-Water Nitrate Exposure to World Populations Has Expanded and Intensified during 1970–2010. Environ. Sci. Technol. 2023, 57, 19395–19406. [Google Scholar] [CrossRef] [PubMed]
  18. Wanderi, E.W.; Gettel, G.M.; Singer, G.A.; Masese, F.O. Drivers of water quality in Afromontane-savanna rivers. Front. Environ. Sci. 2022, 10, 972153. [Google Scholar] [CrossRef]
  19. Alharbi, T.; El-Sorogy, A.S. Health Risk Assessment of Nitrate and Fluoride in the Groundwater of Central Saudi Arabia. Water 2023, 15, 2220. [Google Scholar] [CrossRef]
  20. Birsan, M.-V.; Sfîcă, L.; Amihăesei, V.-A.; Nita, I.-A.; Dogaru, D.; Lupu, L. Centennial Trends in Precipitation, Air Temperature, Evapotranspiration and Water Balance over Romania from Observational Data (1924–2023). Rom. J. Phys. 2025, 70, 805. [Google Scholar] [CrossRef]
  21. Barbulescu, A.; Maftei, C. Modeling the climate in the area of Techirghiol Lake (Romania). Rom. J. Phys. 2015, 60, 1163–1170. [Google Scholar]
  22. La Jeunesse, I.; Cirelli, C.; Aubin, D.; Larrue, C.; Deidda, R. Is climate change a threat for water uses in the Mediterranean region? Results from a survey at local scale. Sci. Total Environ. 2016, 543, 981–996. [Google Scholar] [CrossRef]
  23. Chiritescu, R.-V.; Luca, E.; Iorga, G. Observational study of major air pollutants over urban Romania in 2020 in comparison with 2019. Rom. Rep. Phys. 2024, 76, 702. [Google Scholar]
  24. Ene, A.; Bogdevich, O.; Culicov, A.S. Metals and natural radioactivity investigation of Danube River water in the lower sector. Rom. J. Phys. 2024, 69, 802. [Google Scholar] [CrossRef]
  25. Costa, D.; Sutter, C.; Shepherd, A.; Jarvie, H.; Wilson, H.; Elliott, J.; Liu, J.; Macrae, M. Impact of climate change on catchment nutrient dynamics: Insights from around the world. Environ. Rev. 2023, 31, 4–25. [Google Scholar] [CrossRef]
  26. Eekhout, J.P.C.; Hunink, J.E.; Terink, W.; de Vente, J. Why increased extreme precipitation under climate change negatively affects water security. Hydrol. Earth Syst. Sci. 2018, 22, 5935–5946. [Google Scholar] [CrossRef]
  27. Wang, Y.; Xu, H.; Zhao, X.; Kang, L.; Qiu, Y.; Paerl, H.; Zhu, G.; Li, H.; Zhu, M.; Qin, B.; et al. Rainfall impacts on nonpoint nitrogen and phosphorus dynamics in an agricultural river in subtropical montane reservoir region of southeast China. J. Environ. Sci. 2025, 149, 551–563. [Google Scholar] [CrossRef] [PubMed]
  28. Yang, X.; Li, T.; Hua, K.; Zhang, Y. Investigation of First Flushes in a Small Rural-Agricultural Catchment. Pol. J. Environ. Stud. 2015, 24, 381–389. [Google Scholar] [CrossRef] [PubMed]
  29. Krishnaswamy, U.R.; Moruzzi, R.B. Water pollution: How human activities have shaped the XXI century water crisis. In Water Resources and Environmental Sustainability; CRC Press: Boca Raton, FL, USA, 2022. [Google Scholar]
  30. Brears, R.C. Urban Water Security; Routledge: Abingdon, UK, 2016. [Google Scholar]
  31. Zhang, S.; Yan, X.; Feng, T.; Zhang, X.; Qiao, R.; Ren, Y.; Chen, Q. Unraveling nonlinear impacts of land use change on riverine water quality under future scenarios. Ecol. Indic. 2025, 179, 114258. [Google Scholar] [CrossRef]
  32. Khatri, N.; Tyagi, S. Influences of natural and anthropogenic factors on surface and groundwater quality in rural and urban areas. Front. Life Sci. 2015, 8, 23–39. [Google Scholar] [CrossRef]
  33. Best, J. Anthropogenic stresses on the world’s big rivers. Nat. Geosci. 2019, 12, 7–21. [Google Scholar] [CrossRef]
  34. Almeida, C.; Gonzalez, S.O.; Mallea, M.; Gonzalez, P. A recreational water quality index using chemical, physical and microbiological parameters. Environ. Sci. Pollut. Res. 2012, 19, 3400–3411. [Google Scholar] [CrossRef]
  35. Sutadian, A.D.; Muttil, N.; Yilmaz, A.G.; Perera, B.J.C. Development of a water quality index for rivers in West Java Province. Indonesia. Ecol. Indic. 2018, 85, 966–982. [Google Scholar] [CrossRef]
  36. Dojlido, J.A.N.; Raniszewski, J.; Woyciechowska, J. Water quality index applied to rivers in the Vistula River basin in Poland. Environ. Monit. Assess. 1994, 33, 33–42. [Google Scholar] [CrossRef]
  37. Rocchini, R.; Swain, L.G. The British Columbia Water Quality Index; Water Quality Branch, Environmental Protection Department British Columbia Ministry of Environment, Lands and Parks: Williams Lake, BC, Canada, 1995; 13p.
  38. Cude, C.G. Oregon water quality index: A tool for evaluating water quality management effectiveness. J. Am. Water Resour. Assoc. 2001, 37, 125–137. [Google Scholar] [CrossRef]
  39. Liou, S.-M.; Lo, S.-L.; Wang, S.-H. A Generalized Water Quality Index for Taiwan. Environ. Monit. Assess. 2004, 96, 35–52. [Google Scholar] [CrossRef] [PubMed]
  40. Uddin, M.G.; Nash, S.; Olbert, A.I. A review of water quality index models and their use for assessing surface water quality. Ecol. Indic. 2021, 122, 107218. [Google Scholar] [CrossRef]
  41. European Environment Agency (EEA). European Bathing Water Quality in 2024; EEA Publications: Copenhagen, Denmark, 2025. [Google Scholar]
  42. World Health Organization (WHO). Guidelines for Safe Recreational Water Environments. In Volume 1: Coastal and Fresh Waters; WHO: Geneva, Switzerland, 2021. [Google Scholar]
  43. Bărbulescu, A.; Barbeș, L. Integrated Assessment of Bathing Water Quality Along the Romanian Black Sea Coast. Water 2026, 18, 439. [Google Scholar] [CrossRef]
  44. du Plessis, A. Water as a source of conflict and global risk. In Freshwater Challenges of South Africa and its Upper Vaal River; du Plessis, A., Ed.; Springer: Cham, Switzerland, 2018; pp. 67–84. [Google Scholar]
  45. Chellaney, B. Water, Peace, and War: Confronting the Global Water Crisis; Rowman & Littlefield: Lanham, MD, USA, 2015. [Google Scholar]
  46. Daoudy, M.; Al-Saidi, M.; Al-Manji, A.; Ayoub, J.; Bateh, F. Troubled Waters in Conflict and a Changing Climate: Transboundary Basins Across the Middle East and North Africa. 2024. Available online: https://carnegieendowment.org/research/2025/05/troubled-waters-in-conflict-and-a-changing-climate-transboundary-basins-across-the-middle-east-and-north-africa (accessed on 10 March 2026).
  47. Elmotawakkil, A.; Enneya, N.; Bhagat, S.K.; Ouda, M.M.; Kumar, V. Advanced machine learning models for robust prediction of water quality index and classification. J. Hydroinform. 2025, 27, 299–319. [Google Scholar] [CrossRef]
  48. Pimenow, S.; Pimenowa, O.; Prus, P.; Niklas, A. The impact of artificial intelligence on the sustainability of regional ecosystems: Current challenges and future prospects. Sustainability 2025, 17, 4795. [Google Scholar] [CrossRef]
  49. Maldonado-Benitez, V.M.; Morales-Matamoros, O.; Hernández-Castillo, G. Towards resilient cities: Systematic review of artificial intelligence applications for water management. Water 2025, 17, 1978. [Google Scholar]
  50. León-Ovelar, R. Data science and public policies: Towards water security. In Smart Water Quality Monitoring: Artificial Intelligence Applications; Gutiérrez, D., Millán, P., Blasco, J., Eds.; Springer: Cham, Switzerland, 2026; pp. 1–28. [Google Scholar]
  51. Bărbulescu, A.; Dumitriu, C.Ș. Assessing Water Quality by Statistical Methods. Water 2021, 13, 1026. [Google Scholar] [CrossRef]
  52. Ahmadpour, A.; Mirhashemi, S.H.; Haghighatjou, P.; Foroughi, F. Comparison of the monthly streamflow forecasting in Maroon dam using HEC-HMS and SARIMA models. Sustain. Water Resour. Manag. 2022, 8, 158. [Google Scholar] [CrossRef]
  53. Barbulescu, A.; Nazzal, Y.; Howari, F. Assessing the Groundwater Quality in the Liwa Area, the United Arab Emirates. Water 2020, 12, 2816. [Google Scholar] [CrossRef]
  54. Bărbulescu, A.; Barbeş, L. Assessing the water quality of the Danube River (at Chiciu, Romania) by statistical methods. Environ. Earth Sci. 2020, 79, 122. [Google Scholar] [CrossRef]
  55. Zhang, X.; Wu, X.; Zhu, G.; Lu, X.; Wang, K. A seasonal ARIMA model based on the gravitational search algorithm (GSA) for runoff prediction. Water Supply 2022, 22, 6959–6977. [Google Scholar] [CrossRef]
  56. Lokman, A.; Ismail, W.Z.W.; Aziz, N.A.A. A review of water quality forecasting and classification using machine learning models and statistical analysis. Water 2025, 17, 2243. [Google Scholar] [CrossRef]
  57. Tyralis, H.; Papacharalampous, G.; Langousis, A. A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources. Water 2019, 11, 910. [Google Scholar] [CrossRef]
  58. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204. [Google Scholar] [CrossRef] [PubMed]
  59. Schmidt, L.; Heße, F.; Attinger, S.; Kumar, R. Challenges in applying machine learning models for hydrological inference: A case study for flooding events across Germany. Water Resour. Res. 2020, 56, e2019WR025924. [Google Scholar] [CrossRef]
  60. Bărbulescu, A.; Zhen, L. Forecasting the River Water Discharge by Artificial Intelligence Methods. Water 2024, 16, 1248. [Google Scholar] [CrossRef]
  61. Yan, X.; Zhang, T.; Du, W.; Meng, Q.; Xu, X. A comprehensive review of machine learning for water quality prediction over the past five years. J. Mar. Sci. Eng. 2024, 12, 159. [Google Scholar] [CrossRef]
  62. Xie, Z.; Liu, W.; Chen, S.; Yao, R.; Yang, C.; Zhang, X.; Li, J.; Wang, Y.; Zhang, Y. Machine learning approaches to identify hydrochemical processes and predict drinking water quality. J. Hydrol. Reg. Stud. 2025, 58, 102227. [Google Scholar] [CrossRef]
  63. Simian, D.; Șerban, M.E.; Bărbulescu, A. Machine Learning-Based Multifaceted Analysis Framework for Comparing and Selecting Water Quality Indices. Water Resour. Manag. 2025, 39, 847–863. [Google Scholar] [CrossRef]
  64. Sit, M.; Demiray, B.Z.; Xiang, Z.; Ewing, G.J.; Sermet, Y.; Demir, I. A comprehensive review of deep learning applications in hydrology and water resources. arXiv 2020, arXiv:2007.12269. [Google Scholar] [CrossRef]
  65. Li, Y.; Han, F.; Zheng, Y. Artificial intelligence in surface water quality research and management: Recent progress and future directions. Ecosyst. Health Sustain. 2026, 12, 0474. [Google Scholar] [CrossRef]
  66. Das, A. Prediction of urban surface water quality scenarios using water quality index, multivariate techniques, and machine learning models. Earth Syst. Environ. 2025, 10, 605–641. [Google Scholar] [CrossRef]
  67. Dorado-Guerra, D.Y.; Corzo-Pérez, G. Machine learning models to predict nitrate concentration in a river basin. Environ. Res. Commun. 2022, 4, 125012. [Google Scholar] [CrossRef]
  68. He, M.; Qian, Q.; Liu, X.; Zhang, J.; Curry, J. Recent progress on surface water quality models utilizing machine learning techniques. Water 2024, 16, 3616. [Google Scholar] [CrossRef]
  69. Helaly, M.A.; Rady, S.; Mabrouk, M.; Aref, M. Advancements in water quality prediction: A practical review of machine learning and deep learning approaches. Clust. Comput. 2025, 28, 598. [Google Scholar] [CrossRef]
  70. El-Magd, A.S.; Masoud, A.M.; Brink, H.G. Groundwater vulnerability under climate change: A machine learning framework. Earth Syst. Environ. 2025. [Google Scholar] [CrossRef]
  71. Zhen, L.; Bărbulescu, A. Comparative Analysis of Convolutional Neural Network-Long Short-Term Memory, Sparrow Search Algorithm-Backpropagation Neural Network, and Particle Swarm Optimization-Extreme Learning Machine Models for the Water Discharge of the Buzău River, Romania. Water 2024, 16, 289. [Google Scholar] [CrossRef]
  72. Das, B.K.; Paul, S.; Mandal, B.; Gogoi, P.; Paul, L. Integrating machine learning models for optimizing ecosystem health assessments through prediction of nitrate-N concentrations in the lower stretch of the Ganga River. Environ. Sci. Pollut. Res. 2025, 32, 4670–4689. [Google Scholar] [CrossRef]
  73. Karunanidhi, D.; Raj, M.R.H.; Subramani, T.; Wu, J. Source apportionment and prediction of groundwater nitrate using hydrochemistry and machine learning approaches. Environ. Geochem. Health 2026, 48, 138. [Google Scholar] [CrossRef]
  74. Dragomir, F.L.; Alexandrescu, G.; Postolache, F. Tools for hierarchical security modeling. In Proceedings of the 14th International Scientific Conference eLearning and Software for Education, Bucharest, Romania, 19–20 April 2018; Volume 4, pp. 34–38. [Google Scholar]
  75. Dragomir-Constantin, F.-L.; Beldiman, C.M.; Zlati, M. Informational approaches in modelling social and economic relations: Study on migration and access to services in the European Union. Systems 2025, 13, 469. [Google Scholar] [CrossRef]
  76. Deng, Y.; Zhang, Y.; Pan, D.; Yang, S.X.; Gharabaghi, B. Review of recent advances in remote sensing and machine learning methods for lake water quality management. Remote Sens. 2024, 16, 4196. [Google Scholar] [CrossRef]
  77. Pan, D.; Deng, Y.; Yang, S.X.; Gharabaghi, B. Recent advances in remote sensing and artificial intelligence for river water quality forecasting. Environments 2025, 12, 158. [Google Scholar] [CrossRef]
  78. Pandit, A.; Golden, H.E.; Christensen, J.R.; Lane, C.R.; Husic, A. Deep learning prediction and interpretation of riverine nitrate export across the Mississippi River Basin. Water Resour. Res. 2025, 61, e2024WR039207. [Google Scholar] [CrossRef]
  79. Muñoz-Alegría, J.A.; Núñez, J.; Oyarzún, R.; Chávez, C. A bibliometric systematic literature review of machine learning-based water quality prediction. Water 2025, 17, 2994. [Google Scholar] [CrossRef]
  80. Zhang, Z.; Wang, D.; Mei, Y.; Zhu, J.; Xiao, X. Developing an explainable deep learning module based on the LSTM framework for flood prediction. Front. Water 2025, 7, 1562842. [Google Scholar] [CrossRef]
  81. Zounemat-Kermani, M.; Kheimi, M. Explainable Artificial Intelligence in Hydrology: A Review. Water Resour. Manag. 2026, 40, 106. [Google Scholar] [CrossRef]
  82. European Environment Agency. Status of Nitrates in Rivers in European Countries. Available online: https://www.eea.europa.eu/en/analysis/indicators/nutrients-in-freshwater-in-europe-1763998761/status-of-nitrates-in-rivers-in-european-countries?activeTab=570bee2d-1316-48cf-adde-4b640f92119b (accessed on 1 February 2026).
  83. Eurostat. Water Exploitation Index, Plus (WEI+). Available online: https://ec.europa.eu/eurostat/databrowser/view/SDG_06_60/default/table?lang=en (accessed on 1 February 2026).
  84. Copernicus Climate Change Service—European State of the Climate (ESOTC). Available online: https://climate.copernicus.eu/esotc/2024 (accessed on 1 February 2026).
  85. Caldara, D.; Iacoviello, M. Measuring Geopolitical Risk. Am. Econ. Rev. 2022, 112, 1194–1225. [Google Scholar] [CrossRef]
  86. SIPRI Databases. Available online: https://www.sipri.org/databases (accessed on 1 February 2026).
  87. Status of World Nuclear Forces. Available online: https://fas.org/initiative/status-world-nuclear-forces/ (accessed on 1 February 2026).
  88. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann: Cambridge, MA, USA, 2016. [Google Scholar]
  89. Elomaa, T.; Kaariainen, M. An Analysis of Reduced Error Pruning. J. Mach. Learn. Res. 2001, 1, 163–185. [Google Scholar] [CrossRef]
  90. Quinlan, J.R. Simplifying Decision Trees. Int. J. Man-Mach. Stud. 1987, 27, 221–234. [Google Scholar] [CrossRef]
  91. Esposito, F.; Malerba, D.; Semeraro, G. A Comparative Analysis of Methods for Pruning Decision Trees. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 476–493. [Google Scholar] [CrossRef]
  92. Bekkar, M.; Djemaa, H.K.; Alitouche, T.A. Evaluation Measures for Models Assessment over Imbalanced Data Sets. J. Inform. Eng. Appl. 2013, 3, 27–38. [Google Scholar]
  93. Mirumachi, N. Transboundary Water Politics in the Developing World; Routledge: London, UK, 2015. [Google Scholar]
  94. Zeitoun, M.; Warner, J.F. Hydro-hegemony: A framework for analysis of trans-boundary water conflicts. Water Policy 2006, 8, 435–460. [Google Scholar] [CrossRef]
  95. Grey, D.; Sadoff, C. Beyond the river: The benefits of cooperation on international rivers. Water Sci. Technol. 2003, 47, 91–96. [Google Scholar] [CrossRef]
  96. Azizi, M.A.; Leandro, J. Factors Affecting Transboundary Water Disputes: Nile, Indus, and Euphrates–Tigris River Basins. Water 2025, 17, 525. [Google Scholar] [CrossRef]
  97. Hecht, G. The Radiance of France: Nuclear Power and National Identity After World War II; MIT Press: Cambridge, MA, USA, 2009. [Google Scholar]
  98. Joerges, B.; Shinn, T. Instrumentation Between Science, State and Industry; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2001. [Google Scholar]
  99. Dryzek, J.S. The Politics of the Earth: Environmental Discourses, 4th ed.; Oxford University Press: Oxford, UK, 2021. [Google Scholar]
  100. Khan, K.; Khurshid, A.; Cifuentes-Faura, J. Is geopolitics a new risk to environmental policy in the European union? J. Environ. Manag. 2023, 345, 118868. [Google Scholar] [CrossRef]
  101. Jensen, J.R. Remote Sensing of the Environment: An Earth Resource Perspective, 2nd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2007. [Google Scholar]
  102. Lillesand, T.M.; Kiefer, R.W.; Chipman, J.W. Remote Sensing and Image Interpretation, 7th ed.; Wiley: New York, NY, USA, 2015. [Google Scholar]
  103. Longley, P.A.; Goodchild, M.F.; Maguire, D.J.; Rhind, D.W. Geographic Information Systems and Science, 3rd ed.; Wiley: Chichester, UK, 2015. [Google Scholar]
  104. Maguire, D.J.; Batty, M.; Goodchild, M.F. GIS, Spatial Analysis, and Modeling; ESRI Press: Redlands, CA, USA, 2005. [Google Scholar]
  105. Usali, N.; Ismail, M.H. Use of Remote Sensing and GIS in Monitoring Water Quality. J. Sustain. Dev. 2010, 3, 228. [Google Scholar] [CrossRef]
  106. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: San Mateo, CA, USA, 1993. [Google Scholar]
  107. Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984. [Google Scholar]
  108. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  109. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
  110. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]
Figure 1. Scatter plot matrix illustrating the relationships between the analyzed variables.
Figure 2. Initial tree.
Figure 3. Decision tree model explaining nitrate concentration based on climatic and hydrological predictors.
Figure 4. The second tree: Geopolitical and Strategic Context of Water Quality Risks.

Table 1. Summary of variables.

| Category | Indicator | Data Type | Unit |
|---|---|---|---|
| Strategic | Nuclear warheads | Discrete numeric | Number of warheads |
| Geopolitical | GPR | Continuous | Index |
| Climatic | Heat stress days | Discrete numeric | Days/year |
| Hydrological Pressure | WEI+ | Continuous | % |
| Water Quality (Target) | Nitrate concentration in rivers | Continuous | mg/L |

Table 2. Performance metrics for the initial tree.

| Indicator | Value |
|---|---|
| Accuracy | 85.76% |
| Kappa statistic | 0.853 |
| Precision | 0.815 |
| Recall | 0.893 |
| F1-score | 0.866 |
| ROC Area | 0.935 |

Table 3. Decision rules table.

| Rule | Conditions | Nitrate (mg/L) | Interpretation |
|---|---|---|---|
| R1 | Heat_Stress < 4.28 | 16.99 | Moderate nitrate level under low climate stress conditions |
| R2 | 4.28 ≤ Heat_Stress < 7.83 and WEI+ < 0.39 | 2.75 | Very low nitrate level under low hydrological pressure conditions |
| R3 | 4.28 ≤ Heat_Stress < 7.83 and WEI+ ≥ 0.39 | 5.73 | Low nitrate level under moderate hydrological pressure conditions |
| R4 | Heat_Stress ≥ 7.83 and WEI+ < 67.39 | 17.05 | Moderate nitrate level under high climate stress |
| R5 | Heat_Stress ≥ 7.83 and WEI+ ≥ 67.39 | 20.34 | High nitrate level associated with intense hydrological pressure |
| R6 | Heat_Stress ≥ 7.83 and WEI+ ≥ 67.39 and Year < 2019 | 58.82 | Very high nitrate levels under specific temporal conditions. This temporal threshold was identified by the algorithm and may reflect structural changes in environmental or policy conditions during the analyzed period. |
| R7 | Heat_Stress ≥ 7.83 and WEI+ ≥ 67.39 and Year ≥ 2019 | 19.31 | Relative reduction in concentrations after 2019 |
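
The rules in Table 3 can be written as a short chain of conditionals. The sketch below is illustrative rather than the authors' implementation: the function name and the optional `year` argument are assumptions, and R5 is treated as the value returned when no year is supplied, since R6 and R7 refine the same branch.

```python
def predict_nitrate(heat_stress, wei_plus, year=None):
    """Return the nitrate level (mg/L) that Table 3 assigns to the inputs.

    heat_stress: number of heat stress days per year
    wei_plus:    Water Exploitation Index Plus, in percent
    year:        optional; only used in the high-stress, high-WEI+ branch
    """
    if heat_stress < 4.28:
        return 16.99                                   # R1
    if heat_stress < 7.83:
        return 2.75 if wei_plus < 0.39 else 5.73       # R2, R3
    if wei_plus < 67.39:
        return 17.05                                   # R4
    if year is None:
        return 20.34                                   # R5 (assumed fallback)
    return 58.82 if year < 2019 else 19.31             # R6, R7
```

Written this way, each threshold maps to one comparison, which is what makes the rules straightforward to reuse as monitoring indicators.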

Table 4. Performance metrics of the second tree.

| Indicator | Value |
|---|---|
| Correlation coefficient (r) | 0.976 |
| Mean Absolute Error (MAE) | 0.593 |
| Root Mean Squared Error (RMSE) | 2.046 |
| Relative Absolute Error | 9.79% |
| Root Relative Squared Error | 21.72% |
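
As a reading aid for Table 4, the following sketch shows how the correlation coefficient, MAE, RMSE, and relative absolute error are commonly computed from paired observed and predicted values. It assumes the relative absolute error is measured against a baseline that always predicts the mean of the observed values; the function name and sample numbers are hypothetical and unrelated to the study's dataset.

```python
import math


def regression_metrics(actual, predicted):
    """Compute r, MAE, RMSE, and relative absolute error (in percent)."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    abs_errors = [abs(p - a) for a, p in zip(actual, predicted)]
    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

    # Total absolute error relative to always predicting the observed mean.
    baseline = sum(abs(a - mean_a) for a in actual)
    rae = 100.0 * sum(abs_errors) / baseline

    # Pearson correlation between observed and predicted values.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)

    return {"r": r, "mae": mae, "rmse": rmse, "rae_percent": rae}


# Made-up values for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Because RMSE squares each error before averaging, a few large misses raise it well above the MAE, which is one way to read the gap between the two values reported in the table.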

Table 5. Performance metrics for the third tree.

| Indicator | Value |
|---|---|
| Accuracy | 86.16% |
| Kappa statistic | 0.853 |
| Precision | 0.875 |
| Recall | 0.927 |
| F1-score | 0.867 |
| ROC Area | 0.942 |
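
Accuracy and the kappa statistic in Table 5 are both derived from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation on a made-up two-class matrix; the function name and the counts are hypothetical and do not come from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of class-i instances predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: share of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only.
accuracy, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
```

Kappa discounts the agreement a classifier would reach by chance, so it is a stricter summary than raw accuracy when class sizes are uneven.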

Table 6. Comparative structure of the decision tree models.

| Model | Figure No. | Tree Type | Root Variable | Main Predictors | Analytical Role |
|---|---|---|---|---|---|
| Initial environmental segmentation model | Figure 2 | Classification tree | Nitrate threshold | Heat_Stress, WEI+ | Identifies ecological regimes of water quality and clusters of countries with similar hydrological profiles |
| Optimized predictive model | Figure 3 | Regression tree | Heat_Stress | WEI+, Year | Explains variations in nitrate concentration based on climatic stress and hydrological pressure |
| Geopolitical contextual model | Figure 4 | Classification tree | GPR | Nitrate, WEI+, Heat_Stress | Integrates geopolitical risk into the analysis, highlighting how environmental processes operate within broader socio-political contexts |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Dragomir-Constantin, F.L.; Bărbulescu, A. Assessing Surface Water Quality Risks Under Climate Stress and Geopolitical Instability: An Information Systems Approach. Water 2026, 18, 996. https://doi.org/10.3390/w18090996
