Article

Integration of Satellite-Derived Meteorological Inputs into SWAT, XGBoost, WGAN, and Hybrid Modelling Frameworks for Climate Change-Driven Streamflow Simulation in a Data-Scarce Region

by Sefa Nur Yeşilyurt 1,* and Gülay Onuşluel Gül 2
1 Graduate School of Natural and Applied Sciences, Dokuz Eylül University, Izmir 35390, Türkiye
2 Department of Civil Engineering, Dokuz Eylül University, Izmir 35390, Türkiye
* Author to whom correspondence should be addressed.
Water 2026, 18(2), 239; https://doi.org/10.3390/w18020239
Submission received: 16 December 2025 / Revised: 8 January 2026 / Accepted: 14 January 2026 / Published: 16 January 2026
(This article belongs to the Special Issue Application of Hydrological Modelling to Water Resources Management)

Abstract

The pressure of climate change on water resources has made the development of reliable hydrological models increasingly important, especially for data-scarce regions. However, the limited availability of ground-based observations considerably reduces the accuracy of models developed using these inputs and limits the ability to investigate future hydrological behavior. Satellite-based data sources have emerged as an alternative to address this challenge and have received significant attention. However, the transferability of these datasets across different model classes has not been widely explored. This paper evaluates the transferability of satellite-derived inputs across eleven model configurations, including a process-based model (SWAT), data-driven methods (XGBoost and WGAN), and hybrid structures that feed SWAT outputs into AI models. SHAP was applied to mitigate the black-box limitations of the AI models and to gain insight into fundamental hydrometeorological processes. In addition, uncertainty analysis was performed for all models, enabling a more comprehensive evaluation of performance. The results indicate that hybrid models combining SWAT with WGAN can achieve better predictive accuracy than the SWAT model based on ground observations. While the baseline SWAT model achieved satisfactory performance during the validation period (NSE ≈ 0.86, KGE ≈ 0.80), the hybrid SWAT + WGAN framework improved simulation skill, reaching NSE ≈ 0.90 and KGE ≈ 0.89 during validation. Models forced with satellite-derived meteorological inputs performed comparably to those forced with station-based observations, supporting the feasibility of using satellite products as alternative data sources. The future hydrological status of the basin was assessed using the best-performing hybrid model and CMIP6 climate projections, which showed a clear drought signal in the flows and long-term reductions in average flows of up to 58%.
Overall, the findings indicate that the proposed framework provides a consistent approach for data-scarce basins. Future applications may benefit from integrating spatio-temporal learning frameworks and ensemble-based uncertainty quantification to enhance robustness under changing climate conditions.

1. Introduction

As extremes in hydrological events increase, robust analysis of hydrological systems is becoming more important, and with it the need for reliable streamflow estimates to support sustainable watershed management under changing climate conditions [1,2]. The most crucial factor in developing accurate streamflow models is the availability of high-quality data. However, ground-based meteorological observations are insufficient across many regions of the world. This limits the performance of hydrological models, because models trained on limited data often fail to generalize well, and it has pushed researchers to seek alternative sources [3,4,5,6,7,8,9]. Among these alternatives, satellite-based meteorological datasets have emerged as the most promising option, providing systematic and continuous temporal and spatial coverage [10,11,12,13]. Of the numerous datasets available in the literature, CHIRPS (Climate Hazards Group InfraRed Precipitation with Station data), GPM IMERG (Global Precipitation Measurement, Integrated Multi-satellitE Retrievals for GPM), and the ERA5 reanalysis product of the European Centre for Medium-Range Weather Forecasts (ECMWF) are among the most widely used to provide the necessary inputs for modeling in regions with insufficient observational data [4,5]. However, these data sources typically cover only key meteorological variables such as precipitation and temperature, so they cannot provide all the variables required for comprehensive hydrological modeling. Developing a hydrological model requires a large number of meteorological variables, and this is best achieved with a single dataset that provides all of them at the same spatial and temporal resolution.
Among the globally available products, NASA POWER (Prediction of Worldwide Energy Resources) stands out because it provides multiple variables, including precipitation, temperature, humidity, wind speed, and solar radiation, at the same spatial resolution [14,15]. Existing research indicates that NASA POWER data are well suited to hydrological modeling in data-scarce basins and that the project provides multiple reliable meteorological variables [14,15,16,17]. Therefore, NASA POWER was selected in this study as the primary satellite-based meteorological data source.
The choice of modelling paradigm is as important as the availability of data, in terms of both performance and interpretability. Physically based models such as SWAT (Soil and Water Assessment Tool, version 2012) generally mimic hydrological behaviors such as runoff generation, infiltration, evaporation, and groundwater processes through empirical expressions [1,18]. These models are physically interpretable and offer insight into the hydrological cycle, but they are not always straightforward to apply in practice and are highly sensitive to parameter uncertainty [19,20]. AI algorithms such as ANN (Artificial Neural Networks), SVR (Support Vector Regression), XGBoost (eXtreme Gradient Boosting), and WGAN (Wasserstein Generative Adversarial Networks) learn non-linear relationships from data alone, without any explicit representation of the underlying hydrological processes. Although they can in principle achieve higher accuracy than physically based models, these black-box systems cannot provide researchers with sufficient insight into the hydrological processes occurring in the catchment under study, which hinders their application [21,22,23,24]. To address this lack of explainability, hybrid approaches that combine the strengths of process-based and data-driven models have been developed. In such hybrid models, the physical component acts as an encoder that produces intermediate outputs, while the artificial intelligence component acts as a decoder that reduces residual errors in the simulated discharge [25,26]. In this study, advanced algorithms such as XGBoost and WGAN were selected to improve hybrid model performance while preserving process consistency [27,28].
Accordingly, three model families were created: the physically based SWAT configurations, the data-driven XGBoost and WGAN configurations, and hybrid configurations (SWAT–XGBoost and SWAT–WGAN) in which SWAT output is post-processed by AI algorithms. Despite these improvements, a major drawback remains: the limited interpretability of the data-driven components. While data-driven models can yield higher predictive accuracy than physically based models, their lack of transparency can hinder scientific understanding and undermine model reliability. Explainable artificial intelligence (XAI) techniques, such as the SHapley Additive exPlanations (SHAP) method, have recently made it possible to quantify how much each input contributes to a specific model output, even when the underlying relationships are nonlinear [29,30]. SHAP analysis has been shown to be useful for identifying the leading hydrological drivers, characterizing the interactions between meteorological variables, and checking that data-driven models remain consistent with the processes within the watershed [31]. SHAP has received significant attention in recent years, but its integration into hybrid or physically based hydrological models, as well as its ability to capture fundamental hydrological processes, has not yet been sufficiently explored in the literature [32,33]. To address this gap, the study extends SHAP analysis to advanced machine learning models, including XGBoost and WGAN, as well as to SWAT–AI hybrid frameworks, aiming to relate gains in predictive skill to clearer and more process-consistent interpretations.
In the final stage of this study, the response to future hydro-climatic conditions is examined using bias-corrected CMIP6 (Coupled Model Intercomparison Project Phase 6) projections under two Shared Socioeconomic Pathways (SSP2-4.5 and SSP5-8.5). This assessment was carried out using three global climate models: MPI-ESM1-2-HR (Max Planck Institute Earth System Model, version 1.2, High Resolution), GFDL-ESM4 (Geophysical Fluid Dynamics Laboratory Earth System Model, version 4), and EC-Earth3 (EC-Earth Earth System Model, version 3) [34,35]. Specifically, this study addresses four central research questions: (i) how accurately satellite-derived meteorological inputs can substitute for observed data, (ii) whether machine learning techniques can improve the performance of physically based models such as SWAT, (iii) whether independent machine learning models can simulate process-based characteristics, and (iv) how the future hydro-meteorological regime of the basin is projected to change under CMIP6 climate scenarios.
To address these research questions in a unified manner, this study establishes a consistent experimental design. Unlike previous studies that examined only satellite data performance [5,15], hybrid post-processing [18,22,36], or climate projections [35,37], this study integrates all three: it evaluates satellite data performance, hybridizes the models, and quantifies changes in streamflow under climate projections. To ensure a fair comparison, all models were developed and tested under the same boundary conditions. The study builds an end-to-end and fully reproducible framework that jointly assesses the hydrological usability of satellite-derived meteorological datasets, compares the accuracy of process-based, data-driven, and hybrid modelling paradigms, and explores the future hydroclimatic sensitivity of the Büyük Menderes drainage basin under CMIP6 scenarios. By integrating the physical interpretability of process-based modelling with the flexibility of data-driven models and the transparency of explainable AI, this work presents an innovative, data-efficient, and physically consistent hydrological modelling framework. Beyond improving predictive accuracy, the framework strengthens interpretability and offers a transparent, resilient, and transferable approach to the sustainable management of data-scarce watersheds.

2. Materials

2.1. Study Area

The Büyük Menderes Basin in western Türkiye was selected as the study area (see Figure 1). One of the largest basins in the country, it has complex hydrological dynamics and is heterogeneous in climate, topography, and land use. It exhibits a typical Mediterranean climate, with hot, dry summers and mild, rainy winters. The basin covers an area of approximately 25,987 km² and contains surface water bodies that are highly sensitive to climate change. These water bodies, which are of great importance for the agricultural needs of the basin, provide a suitable setting for evaluating the performance of satellite-based data and hybrid hydrological models [38]. The specific sub-basin was selected intentionally to represent a reference Mediterranean and semi-arid hydrological setting with limited availability of long-term in situ observations. The lack of large hydraulic regulation structures at the outlet of the catchment means that natural rainfall–runoff processes can be studied without anthropogenic disruption. In addition, the daily discharge record at the outlet gauging station provides a valuable basis for testing model transferability and uncertainty in a data-scarce environment. The results for this sub-basin may therefore be transferable to other Mediterranean and semi-arid areas characterized by similar climatic variability and data scarcity. The selected sub-basin is shown in Figure 1.

2.2. Observed Data

The datasets used to build the models were compiled from both national and global sources. Precipitation, minimum, maximum, and mean temperature, relative humidity, and wind speed data for the SWAT model were obtained from station 17886, operated by the Turkish State Meteorological Service (MGM). Since no hourly direct solar radiation (sol) measurements were available, hourly ERA5 reanalysis data were obtained and aggregated into daily values to ensure temporal consistency. The streamflow data were acquired from station E07A001 of the General Directorate of State Hydraulic Works (DSI). The locations of the stations are shown in Figure 1. The period 1994–2018 was taken as the observation period. The first 70% of the data was used for calibration and the remaining 30% for validation.
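The two data-preparation steps described above, aggregating hourly values to daily values and splitting the record chronologically 70/30, can be sketched as follows. This is a minimal illustration using only the Python standard library. The function names and the toy data are hypothetical, and summing the hourly values is an assumption (an average may be more appropriate depending on the units of the radiation variable); the paper does not state which was used.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_to_daily_sum(records):
    """Sum (timestamp, value) hourly records into a {date: daily total} dict."""
    daily = defaultdict(float)
    for stamp, value in records:
        daily[stamp.date()] += value
    return dict(sorted(daily.items()))


def chronological_split(series, calib_fraction=0.7):
    """Split an ordered sequence into calibration and validation parts, no shuffling."""
    cut = int(round(len(series) * calib_fraction))
    return series[:cut], series[cut:]


# Hypothetical input: two days of constant hourly radiation (1.0 per hour).
start = datetime(1994, 1, 1)
hourly = [(start + timedelta(hours=h), 1.0) for h in range(48)]
daily = hourly_to_daily_sum(hourly)

# 25 years x 12 months (1994-2018), represented here by month indices only.
months = list(range(300))
calib, valid = chronological_split(months)
```

Keeping the split chronological, rather than random, means the validation period always lies after the calibration period, which matches the calibration and validation convention used for SWAT.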

2.3. Satellite-Based Meteorological Data

NASA POWER (Prediction of Worldwide Energy Resources) is a globally available satellite-based meteorological product developed by NASA to support hydrological, agricultural, and renewable energy applications [39]. It is derived from the MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2) and GEOS (Goddard Earth Observing System) products of NASA’s Global Modeling and Assimilation Office (GMAO), and it was chosen because it provides spatially and temporally consistent data for all the meteorological variables the models require. Having these variables in a single product simplifies the setup of complex, multivariable models. The dataset is openly accessible through NASA’s web portal (https://power.larc.nasa.gov/), which supports automated queries and standardized time-series downloads for any global coordinate.
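As an illustration of the automated queries mentioned above, the sketch below builds a request URL for daily point data. The endpoint path, the `community` value, and the parameter codes are assumptions based on the public POWER API and should be checked against the current API documentation; the coordinates are placeholders, not the station locations used in the study. No request is sent.

```python
from urllib.parse import urlencode

# Assumed endpoint for daily time series at a single point.
POWER_DAILY_POINT = "https://power.larc.nasa.gov/api/temporal/daily/point"

# Assumed POWER parameter codes for the seven inputs listed in the paper:
# precipitation, max/min/mean temperature, relative humidity, wind speed, solar radiation.
POWER_PARAMETERS = (
    "PRECTOTCORR", "T2M_MAX", "T2M_MIN", "T2M",
    "RH2M", "WS2M", "ALLSKY_SFC_SW_DWN",
)


def build_power_url(lat, lon, start, end, parameters=POWER_PARAMETERS):
    """Return a query URL for daily data between start and end (YYYYMMDD strings)."""
    query = {
        "parameters": ",".join(parameters),
        "community": "AG",
        "latitude": lat,
        "longitude": lon,
        "start": start,
        "end": end,
        "format": "JSON",
    }
    # safe="," keeps the comma-separated parameter list readable in the URL.
    return f"{POWER_DAILY_POINT}?{urlencode(query, safe=',')}"


# Placeholder coordinates inside western Türkiye (hypothetical).
url = build_power_url(37.8, 28.5, "19940101", "20181231")
```

The resulting URL can be passed to any HTTP client; the JSON response would then need to be reshaped into the daily input files that SWAT expects.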

3. Methods

The aim of this study is to investigate the usability of satellite-derived meteorological data instead of ground observations in hydrological models established in data-limited regions. Therefore, three different modeling approaches are proposed. First, a frequently used process-based model is employed, followed by data-driven models, which are increasingly utilized and offer greater ease of use than process-based models. Finally, hybrid versions of these models are applied (Figure 2).

3.1. SWAT Hydrological Modelling Framework

The Soil and Water Assessment Tool (SWAT) is a physically based, semi-distributed model developed by the United States Department of Agriculture (USDA). The model was developed to simulate hydrological processes at the basin scale [40]. The open-access model has a wide range of applications. Therefore, it has been widely used in numerous studies on general hydrological modeling, sediment transport modeling, land use change modeling, and climate change simulation [7,27,41].
SWAT represents the hydrological cycle by considering multiple components together, and it requires topography, land use, soil, and meteorological data. The basin is divided into Hydrologic Response Units (HRUs), which allow detailed analysis at the HRU level. Surface runoff, evapotranspiration, infiltration, and baseflow are included in the model. In addition to these fundamental components, SWAT explicitly represents the interactions between surface and subsurface hydrological processes through conceptual storages and flows. Incoming precipitation is partitioned into surface runoff, soil water storage, and infiltration into shallow groundwater. Lateral flow and return flow contribute to stream discharge depending on soil properties, topography, and antecedent moisture conditions. Evaporation from the soil and vegetation layers regulates water availability in the root zone. With this physically oriented framework, SWAT can capture both the fast and the slow hydrological responses of the watershed (Figure 3) [40]. Although model outputs were evaluated at a monthly time scale in this study, all meteorological inputs were processed at daily resolution in accordance with SWAT requirements, so that rainfall–runoff generation was resolved at the native daily time step.
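The partitioning described above corresponds to the land-phase water balance given in the SWAT theoretical documentation. It is reproduced here for orientation only; the symbols follow that documentation rather than any notation defined in this paper.

```latex
SW_t = SW_0 + \sum_{i=1}^{t} \left( R_{day,i} - Q_{surf,i} - E_{a,i} - w_{seep,i} - Q_{gw,i} \right)
```

Here $SW_t$ is the final soil water content, $SW_0$ the initial soil water content, $R_{day}$ the precipitation, $Q_{surf}$ the surface runoff, $E_a$ the evapotranspiration, $w_{seep}$ the water leaving the soil profile towards the vadose zone, and $Q_{gw}$ the return flow, all for day $i$ and expressed in mm of water.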
The SWAT-CUP (SWAT Calibration and Uncertainty Programs) interface, which is integrated with the SWAT model and offers the SUFI-2 (Sequential Uncertainty Fitting, version 2), GLUE (Generalized Likelihood Uncertainty Estimation), ParaSol (Parameter Solution), and GA (Genetic Algorithm) algorithms, is used for model calibration and uncertainty analysis. The SUFI-2 method, which is used in this study, adjusts parameter ranges to minimize differences between observed and simulated runoff. It is a semi-automated calibration and uncertainty analysis approach that accounts for multiple sources of uncertainty by iteratively updating parameter ranges based on model performance and uncertainty bounds. It was selected for its computational efficiency, its robustness in handling parameter uncertainty, and its proven applicability in large-scale and data-scarce hydrological basins. The procedure uses both global and local sensitivity analyses to identify the most influential hydrological parameters and improve model performance. The selected sensitive parameters, their optimized values, and calibration settings follow the standard definitions and physical interpretations provided in the SWAT-CUP user manual [20].

3.2. Artificial Intelligence Methods

Chen and Guestrin [42] introduced the eXtreme Gradient Boosting (XGBoost) algorithm, an ensemble learning method based on decision trees. The model builds a sequence of decision trees (n_estimators), each fitted to the errors left by the trees before it and, optionally, trained on a different sub-sample of the dataset, so the ensemble moves step by step toward a better solution. Unlike a single conventional decision tree, XGBoost regularizes tree growth and can thereby control the number of leaf nodes, the terminal nodes where the model generates its final predictions. It also includes pruning, which removes unnecessary branches or leaf nodes to reduce model complexity and prevent overfitting, as well as system-level optimization and built-in handling of missing values. It can therefore model both numerical and categorical variables effectively. These characteristics have made XGBoost an appealing machine learning technique for dealing with nonlinear relationships in complex hydrological and climate systems [36,42,43,44].
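The step-by-step idea behind boosting can be illustrated with a deliberately small sketch: each new tree, here a one-split "stump", is fitted to the residuals of the current ensemble. This is plain squared-error gradient boosting written with the standard library, not the XGBoost implementation; it omits XGBoost's regularization, second-order updates, sub-sampling, and missing-value handling. All names and the toy data are hypothetical.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) for the best single split on x."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(x, y, n_estimators=50, learning_rate=0.3):
    """Fit stumps sequentially, each one to the residuals of the ensemble so far."""
    base = sum(y) / len(y)
    prediction = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        prediction = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, prediction)
        ]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, xi):
    """Sum the shrunken stump outputs on top of the base prediction."""
    total = sum(left if xi <= threshold else right for threshold, left, right in model["stumps"])
    return model["base"] + model["learning_rate"] * total


def mse(y, predicted):
    return sum((a - b) ** 2 for a, b in zip(y, predicted)) / len(y)


# Toy nonlinear relation between a single predictor and the target (hypothetical data).
rain = [float(v) for v in range(10)]
flow = [v ** 2 for v in rain]
model = fit_boosted_stumps(rain, flow)
baseline_error = mse(flow, [model["base"]] * len(flow))
boosted_error = mse(flow, [predict(model, v) for v in rain])
```

Comparing `boosted_error` with `baseline_error` shows how the accumulated stumps reduce the training error relative to predicting the mean alone.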
The Wasserstein Generative Adversarial Network (WGAN), an improved variant of the original Generative Adversarial Network (GAN), was introduced by Arjovsky et al. [45] and later refined by Gulrajani et al. [46]. It trains the model by minimizing the Wasserstein distance between the real and generated distributions. As a result, WGAN not only stabilizes the training process but also produces output distributions that more clearly reflect the underlying differences in the data. The model is composed of two core networks: a Generator that produces synthetic data resembling real observations, and a Critic that scores how closely the generated samples match the real distribution. This architecture may help the model learn complex patterns more reliably and avoid frequently seen issues such as training instability or mode collapse [6]. WGAN has been used in hydrological studies for realistic hydroclimatic data generation, rainfall-runoff modeling, and bias correction of model outputs [6,47]. In this study, the WGAN model was implemented using a carefully selected configuration that provided stable training behavior, consisting of fully connected generator and critic networks with two hidden layers and 256 neurons per layer, LeakyReLU activation functions, five critic updates per generator iteration (Ncritic = 5), and a gradient penalty coefficient of λ = 12.0 to enforce the Lipschitz constraint.
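For reference, the gradient-penalty objective of Gulrajani et al. [46], to which the coefficient λ above refers, can be written as follows. The notation is that of the WGAN-GP literature, not of this paper, and the exact loss used by the authors may differ in detail.

```latex
L_{critic} = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
           - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
           + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right],
\qquad
L_{generator} = - \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
```

Here $D$ is the critic, $P_r$ and $P_g$ are the real and generated distributions, and $\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}$ with $\epsilon \sim U[0, 1]$ is a random interpolate between a real and a generated sample. The critic loss is minimized Ncritic times for every generator update.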
Hyperparameter optimization for both the XGBoost and WGAN configurations was performed using Optuna, an automatic hyperparameter optimization tool that searches for the best parameter combinations using adaptive, data-driven search algorithms, which makes it especially useful for complex machine learning models. In this study, the Optuna search improved model skill and helped limit overfitting [48]. The optimization used blocked time-series cross-validation, in which validation was always performed on forward-in-time segments to preserve temporal dependencies and avoid information leakage. In addition, early stopping was applied during both the cross-validation stage and the final model training to prevent excessive model complexity. Model performance was ultimately evaluated on an independent chronological holdout dataset, which was excluded from the optimization process, providing a robust assessment of the generalization capability of both the XGBoost and WGAN models.
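One common way to implement forward-in-time validation is an expanding-window split, sketched below with the standard library. This is an interpretation of "blocked time-series cross-validation", not the authors' code: the number of folds, the block size, and the function name are hypothetical.

```python
def forward_chaining_splits(n_samples, n_folds=4):
    """Yield (train_indices, validation_indices); validation always follows training in time."""
    block = n_samples // (n_folds + 1)
    for fold in range(1, n_folds + 1):
        train_end = block * fold
        # The last fold absorbs any remainder so that no sample is dropped.
        valid_end = n_samples if fold == n_folds else train_end + block
        yield list(range(train_end)), list(range(train_end, valid_end))


# Example with 100 monthly samples reserved for tuning (the final holdout is kept separate).
splits = list(forward_chaining_splits(100, n_folds=4))
```

In a hyperparameter search, the objective function would train the model on each training block, score it on the following validation block, and return the average score, so that no configuration is ever evaluated on data that precedes its training period.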

3.3. SHapley Additive exPlanations (SHAP)

The SHAP technique was applied to explain the machine learning and hybrid models. SHAP quantifies how much each feature contributes to a given model output by assigning a value to every input attribute. In the present work, SHAP analysis was used to estimate the relative importance and physical relevance of meteorological conditions (precipitation, temperature, and solar radiation) in both the data-driven and the hybrid models. Recent research also indicates that SHAP analysis helps data-driven models convey hydrological knowledge. Nevertheless, SHAP-based interpretations may be influenced by multicollinearity among input variables, particularly when lagged and rolling features are used. In such cases, the contribution of correlated predictors may be distributed across multiple features rather than attributed to a single variable. Therefore, SHAP results in this study are interpreted in a relative sense, focusing on dominant patterns and physical consistency rather than strict causal importance [6,9,36,47].
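The attribution idea behind SHAP can be made concrete with an exact Shapley computation on a tiny model. The sketch below enumerates all feature subsets, which is only feasible for a handful of features; the SHAP library relies on faster approximations and a background dataset instead of the single baseline vector assumed here. The toy model, its coefficients, and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for predict(x) relative to predict(baseline)."""
    n = len(x)
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi += weight * (predict(with_i) - predict(without_i))
        values.append(phi)
    return values


def toy_model(features):
    """Hypothetical linear response to (pcp, tmax, sol); coefficients are made up."""
    pcp, tmax, sol = features
    return 2.0 * pcp - 0.5 * tmax + 0.1 * sol


x = [10.0, 30.0, 20.0]
baseline = [5.0, 20.0, 15.0]
phi = shapley_values(toy_model, x, baseline)
```

For a linear model, each value reduces to the coefficient times the departure from the baseline, and the values sum to the difference between the prediction and the baseline prediction. With correlated inputs, such as lagged copies of the same variable, this credit is shared among the correlated features, which is the caveat raised in the text.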

3.4. Climate Change Scenarios and Downscaling

High-resolution climate projection datasets were utilized to assess the potential effects of climate change on hydrological processes in the Büyük Menderes River Basin. These projections, integrated into the SWAT framework, were obtained from the NEX-GDDP-CMIP6 (NASA Earth Exchange Global Daily Downscaled Projections—Coupled Model Intercomparison Project Phase 6) archive, which provides statistically downscaled outputs from the Coupled Model Intercomparison Project Phase 6 (CMIP6). Prior to integration, the datasets were bias-corrected using the Empirical Quantile Mapping (EQM) method to reduce systematic deviations between observed and modeled climate variables. EQM was selected in this study because, considering the input sensitivity of hydrological models, it is known to preserve the underlying probability distribution of climate variables, including extreme events, while introducing minimal distortion compared to alternative correction techniques. Compared to simpler approaches such as the delta change method, which primarily adjusts mean climate signals, and quantile delta mapping, which may affect the temporal characteristics of extremes, EQM enables a more consistent representation of the full empirical distribution. This approach provided a more reliable depiction of the climatic signal and strengthened the robustness of future hydro-climatic projections [33]. The selected datasets enable detailed regional-scale hydrological impact assessments due to their high spatial and temporal resolution. Three Global Climate Models (GCMs) were chosen based on their proven skill in representing regional hydro-climatic conditions: MPI-ESM1-2-HR, which offers high-resolution land–atmosphere interaction representation; GFDL-ESM4, known for consistent long-term simulations; and EC-Earth3, which realistically captures temperature and precipitation dynamics under varying emission pathways. 
Two Shared Socioeconomic Pathways (SSPs) were applied to represent distinct emission trajectories: SSP2-4.5 (medium stabilization) and SSP5-8.5 (high-emission scenario). Comparing these pathways provides a comprehensive perspective on moderate versus severe climate futures, thereby improving understanding of potential hydrological shifts under different socioeconomic and policy conditions [34,49,50].
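The empirical quantile mapping step described above can be sketched as follows: each projected value is located in the empirical distribution of the model's historical run and replaced by the observed value at the same non-exceedance probability. This is a minimal standard-library illustration under simplifying assumptions: values outside the historical range are clamped to the observed extremes, no seasonal stratification is applied, and the function names and numbers are hypothetical. The authors' implementation may differ.

```python
from bisect import bisect_right


def probability_of(sorted_values, value):
    """Empirical non-exceedance probability of value, interpolated between sample points."""
    n = len(sorted_values)
    if value <= sorted_values[0]:
        return 0.0
    if value >= sorted_values[-1]:
        return 1.0
    upper = bisect_right(sorted_values, value)  # first index holding an element > value
    lower = upper - 1
    fraction = (value - sorted_values[lower]) / (sorted_values[upper] - sorted_values[lower])
    return (lower + fraction) / (n - 1)


def quantile(sorted_values, probability):
    """Linearly interpolated quantile of a sorted sample for probability in [0, 1]."""
    position = probability * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def quantile_map(observed, model_historical, model_future):
    """Map each projected value onto the observed distribution at the same probability."""
    obs_sorted = sorted(observed)
    hist_sorted = sorted(model_historical)
    return [quantile(obs_sorted, probability_of(hist_sorted, v)) for v in model_future]


# Toy example: the model's historical run has a narrower, shifted range than the observations.
observed = [0.0, 2.0, 4.0, 6.0, 8.0]
model_historical = [1.0, 2.0, 3.0, 4.0, 5.0]
corrected = quantile_map(observed, model_historical, [3.0, 5.0, 1.5])
```

In this toy case the historical median (3.0) is mapped to the observed median (4.0), which shows how the correction adjusts the whole distribution rather than only the mean.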

3.5. Model Evaluation and Performance Metrics

Model performance was analyzed using four widely used statistical metrics, whose formulations are presented in Table 1. The overall accuracy and correlation between simulated and observed discharge were assessed with the Nash–Sutcliffe Efficiency (NSE), the Kling–Gupta Efficiency (KGE), and the coefficient of determination (R²). The root mean square error (RMSE) was calculated to quantify the deviations of simulated values from observed ones. NSE and KGE measure the overall fit, correlation, bias, and variability of simulated streamflow relative to observed streamflow, while RMSE measures simulation error and penalizes large errors. The joint application of these criteria enables a robust assessment of model accuracy, bias, and hydrological consistency. In hydrological modeling applications, however, error statistics such as RMSE or NSE alone may be insufficient, because uncertainties in input data, model structure, and parameterization can significantly affect the calibrated streamflow. Thus, a prediction uncertainty analysis was conducted in the current study to enhance the interpretability and reliability of the results. In contrast to parameter-based uncertainty methods, this approach does not assume any distribution for the model residuals and hence allows a fair comparison of uncertainty across models and datasets. For each model setup, residuals between observed and simulated streamflow were computed, and a 95% prediction band was determined from the 2.5th and 97.5th percentiles of the residual distribution. To describe model uncertainty quantitatively, the p-factor and r-factor were derived: the p-factor is the fraction of observations falling within this uncertainty band, and the r-factor is the width of the band normalized by the standard deviation of observed streamflow.
By jointly considering these two indicators, this approach balances coverage and magnitude of uncertainty estimation compared to traditional accuracy metrics, thus offering additional information regarding the reliability of model predictions [51].
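The metrics and the residual-based uncertainty band described above can be sketched with the standard library as follows. The formulas are the commonly used definitions (the KGE follows the original three-component form) and are assumptions insofar as Table 1 is not reproduced here; the use of the population standard deviation, the percentile interpolation, and the toy series are likewise illustrative choices.

```python
from math import sqrt
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    obs_mean = mean(obs)
    return 1 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum((o - obs_mean) ** 2 for o in obs)


def rmse(obs, sim):
    return sqrt(mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def pearson_r(obs, sim):
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    return cov / sqrt(sum((o - mo) ** 2 for o in obs) * sum((s - ms) ** 2 for s in sim))


def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)
    beta = mean(sim) / mean(obs)
    return 1 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def percentile(values, q):
    """Linearly interpolated percentile, q in [0, 100]."""
    ordered = sorted(values)
    position = q / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def residual_band_factors(obs, sim):
    """p-factor and r-factor of a 95% band built from residual percentiles."""
    residuals = [o - s for o, s in zip(obs, sim)]
    low, high = percentile(residuals, 2.5), percentile(residuals, 97.5)
    inside = sum(1 for o, s in zip(obs, sim) if s + low <= o <= s + high)
    p_factor = inside / len(obs)
    r_factor = (high - low) / pstdev(obs)
    return p_factor, r_factor


# Hypothetical monthly flows (m3/s).
observed = [10.0, 20.0, 35.0, 15.0, 5.0, 8.0]
simulated = [12.0, 18.0, 30.0, 16.0, 6.0, 9.0]
scores = {"NSE": nse(observed, simulated), "KGE": kge(observed, simulated),
          "RMSE": rmse(observed, simulated)}
p_factor, r_factor = residual_band_factors(observed, simulated)
```

A p-factor close to 1 together with a small r-factor indicates a band that covers most observations while remaining narrow, which is the balance between coverage and magnitude referred to in the text.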

4. Results

4.1. Evaluation of Meteorological Data for the Climate Scenario

Meteorological datasets for climate scenarios were evaluated by comparing them with observed records on a monthly scale. For this purpose, the evaluation was carried out using precipitation, maximum and minimum temperatures, and relative humidity values which are the main variables in the hydrologic modeling study. The results showed that in some months, the maximum and minimum values in climate scenarios occurred at different times compared to the observations, and significant differences were observed in the monthly distribution of the variables.
As shown in Figure 4, precipitation values show a general trend toward higher values over several months, with particularly pronounced increases and irregular patterns observed in the EC-Earth3 model. Similarly, Figure 5, Figure 6 and Figure 7 illustrate seasonal shifts in temperature and relative humidity values.
This situation relates to the structural changes expected to occur in the temporal distribution of atmospheric processes under climate change. Shifts in the start and end times of seasons, especially in variables such as precipitation and relative humidity, indicate that the hydrometeorological regime may change in the future not only in terms of magnitude but also in terms of interseasonal transitions and temporal continuity. Therefore, it is extremely important to consider the impact of seasonal shifts in future assessments. These results are consistent with those reported in similar studies in the literature [52,53,54,55,56].

4.2. Performance Results of the SWAT Model

Within this study, 11 different model configurations were developed to assess streamflow dynamics within the Büyük Menderes River Basin. At this stage, three different modelling strategies were adopted, and their structural characteristics and fundamental differences are summarized in Table 2.
First, the M1 model, driven by ground-based observations, was set up using the ArcSWAT interface. Model calibration was performed in SWAT-CUP using the SUFI-2 algorithm. The calibration procedure started with a sensitivity analysis, in which the 20 most influential parameters were identified and optimized. The optimal values of the most sensitive SWAT parameters identified through the SUFI-2 calibration procedure for the M1 and M3 configurations are summarized in Table 3, supporting the transparency and reproducibility of the calibration process. The model achieved very good performance, with an NSE of 0.80 indicating a close fit to the monthly observed streamflow, and improved further during the validation period. Moreover, the time-series comparison showed that the model preserved both the magnitude and the timing of observed flow. Subsequently, the ArcSWAT model was re-implemented using the satellite meteorological inputs. In the M2 configuration, the parameter set obtained from M1 was kept unchanged and reused in SWAT-CUP with the satellite inputs. Performance was close to that of M1 (NSE ≈ 0.78), despite some discrepancies during high-flow periods. This result suggests that satellite-based meteorological data can be an effective complement to, or surrogate for, ground observations in cases of missing records or measurement errors, without disrupting the model structure. However, the transfer of parameter values between different datasets may introduce structural inconsistencies and therefore requires careful and controlled application. Lastly, a fully satellite-driven configuration (M3) was established in which model parameters were independently calibrated using only satellite inputs.
It should be noted that within the SUFI-2 framework, parameter values are explored within predefined uncertainty ranges, and the reported values represent optimal parameter combinations that yield the best model performance rather than unique or fixed physical constants. Results indicated that M3 was able to achieve performances similar to M1, with NSE values close to 0.8 and reduced error in peak flows with respect to M2. Thus, the M3 configuration represents the reference framework for the evaluation of the feasibility of satellite-based meteorological data as an input surrogate to ground observation data in the hydrological modelling chain. Figure 8 provides a visual summary of the three SWAT configurations (M1–M3), highlighting their input structures and the differences in calibration settings.
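The NSE and KGE scores quoted for the configurations can be reproduced from paired observed and simulated monthly series. The sketch below uses the standard Nash–Sutcliffe definition and, as an assumption, the 2009 form of the Kling–Gupta efficiency, since the text does not state which KGE variant was used. The function names are illustrative only.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form, assumed here): combines
    correlation, variability ratio and bias ratio into one score."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

A simulation equal to the observations scores 1 on both metrics, while a simulation that always returns the observed mean scores an NSE of 0.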

4.3. Performance Results of AI-Based Models

In the second part of this study, attention was given to the second research question: whether AI methods can replace the process-based SWAT model. To answer this, the XGBoost algorithm, which is widely recognized for its strong performance among classical ML methods, was implemented. The Optuna framework was used to optimize the hyperparameters of the model, which helped to limit overfitting, and the model demonstrated strong generalization capability. To ensure a fair evaluation, the naming convention was aligned with the evaluation structure used in the SWAT model, and the terms ‘calibration’ and ‘validation’ were adopted. The chronological structure of the time series was maintained, which keeps the comparison with the SWAT calibration–validation framework feasible. Lastly, similar to the SWAT experiment, a three-stage modelling strategy was employed in XGBoost. The structural characteristics of the models and their fundamental differences are summarized in Table 4.
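Keeping the chronological structure means the series is cut at a single point in time instead of being shuffled. A minimal sketch of such a split is given below; the function name and the default fraction are hypothetical, as the split point used in the study is not stated in this section.

```python
def chronological_split(records, calibration_fraction=0.7):
    """Split a time-ordered sequence into calibration and validation parts
    without shuffling, so validation data always follow calibration data."""
    if not 0.0 < calibration_fraction < 1.0:
        raise ValueError("calibration_fraction must lie strictly between 0 and 1")
    cut = int(len(records) * calibration_fraction)
    return records[:cut], records[cut:]
```

Because no record from the validation period leaks into calibration, the resulting scores are comparable with a SWAT calibration–validation setup.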
XGBoost and WGAN are built upon an identical framework regarding data structure and the formulation of input features. The input list includes precipitation (pcp), maximum temperature (tmax), minimum temperature (tmin), mean temperature (tmean), relative humidity (relhum), wind speed (wnd), and solar radiation (sol), while the output is the monthly streamflow (Q). Temporal dependencies are introduced in both configurations by a set of lagged variables defined as LAGS = (1, 2, 3, 12, 24) and three-month moving averages computed with ROLLS = (3). This setting allows both models to learn the short-term (1–3 months), medium-term (12 months), and long-term (24 months) hydro-meteorological effects simultaneously, while the moving average filter smooths out short-term fluctuations in the input series. The chosen lag structure is hydrologically motivated, representing different memory components of the catchment system. Short-term lags (1–3 months) represent direct runoff generation and short-term persistence of soil moisture, while the 12-month lag is indicative of seasonal storage and release mechanisms linked to the annual hydroclimatic cycle. The 24-month lag reflects long-term groundwater response and multi-seasonal memory effects that contribute to delayed streamflow. The three-month moving average additionally represents accumulated antecedent wetness, which is often associated with the autocorrelation of monthly discharge. Thus, both XGBoost and WGAN perform the streamflow prediction based on the same feature matrix but under different learning paradigms.
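The lagged and smoothed predictors described above could be assembled along the following lines. This is a sketch under stated assumptions, not the authors' code: the monthly inputs are assumed to sit in a pandas DataFrame in chronological order, the moving average is assumed to include the current month, and the column naming (for example pcp_lag1, pcp_roll3) follows the feature names that appear in the SHAP results.

```python
import pandas as pd

LAGS = (1, 2, 3, 12, 24)  # lag lengths in months, as listed in the text
ROLLS = (3,)              # moving-average windows; written as a tuple here
INPUTS = ["pcp", "tmax", "tmin", "tmean", "relhum", "wnd", "sol"]

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return the raw inputs plus lagged and rolling-mean columns.

    Assumes df holds one row per month in chronological order.
    """
    out = df[INPUTS].copy()
    for col in INPUTS:
        for lag in LAGS:
            # Value of the variable `lag` months earlier.
            out[f"{col}_lag{lag}"] = df[col].shift(lag)
        for win in ROLLS:
            # Mean over the current month and the (win - 1) preceding months.
            out[f"{col}_roll{win}"] = df[col].rolling(window=win).mean()
    # Rows without a complete 24-month history contain NaN and are dropped.
    return out.dropna()
```

With these settings the first 24 months of any record are consumed as history, which matters for short series in data-scarce basins.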
The M4 model, despite relying solely on ground-based meteorological inputs and having a relatively simple structure, delivered strong predictive performance. XGBoost, with an NSE of 0.87, slightly outperformed SWAT, which achieved an NSE of 0.86, indicating a marginally higher predictive capability in this setting. Next, to define the M5 configuration, the satellite meteorological data were introduced while retaining the previously optimized hyperparameters. M5 reproduced the M4 results quite well; however, minor discrepancies were observed under extreme flow conditions, most probably related to inconsistencies between the ground and satellite data, as already observed in the SWAT M2 setup. Finally, the M6 configuration consisted of the XGBoost model independently recalibrated using only satellite data. This version of the model reached an NSE of about 0.87, in very good agreement with observations, showing robustness under data-scarce conditions as well (Figure 9).
Even though XGBoost showed that ML models can replace process-based approaches, their “black-box” nature remained a key limitation. SHAP was therefore applied to interpret the models. In this way, the contribution of SWAT-derived streamflow targets, the influence of antecedent hydrological conditions represented through lagged inputs, the role of weighted moving averages, and the seasonal signal encoded via sine–cosine transformations were quantitatively interpreted, making the hybrid model substantially more explainable. Figure 10 shows the SHAP-based interpretability results of the XGBoost model. The feature importance ranking indicated that pcp_roll3 (the 3-month moving average of precipitation) was by a significant margin the most important predictor of streamflow, followed by total precipitation, maximum temperature, and relative humidity. The SHAP dependence plot reveals a strongly nonlinear response: for low values of pcp_roll3, the SHAP values are close to zero or even slightly negative, showing that the predictor has a limited impact on the predicted discharge. Beyond a threshold of accumulated precipitation, SHAP values grow rapidly, indicating that the model responds strongly to increased cumulative rainfall. Here, rolls denote the moving average operation (temporal smoothing), while lags denote the lagged values of each input variable (past observations capturing memory effects).
These results hydrologically imply that the model captures the essence of threshold-driven runoff generation, where limited precipitation under dry antecedent conditions yields extremely low flow and sustained or intense precipitation over several consecutive months leads to a sudden increase in streamflow due to the saturation of soil and increased surface runoff. On the other hand, the effect of temperature and humidity demonstrates the indirect influence of evapotranspiration and atmospheric moisture conditions on available water. The SHAP analysis confirmed that XGBoost emulates a physically consistent rainfall–runoff relationship, with precipitation persistence being the primary cause of streamflow variability.
Following the same framework, the WGAN model was developed and evaluated (M7–M9). At this stage, another three different modelling strategies were formed, and their characteristics are given in Table 5.
The WGAN model trained on observed data (M7) performed comparably to SWAT, while the satellite-based configurations (M8 and M9) exhibited lower accuracy. This behavior can be explained by the probabilistic learning mechanism of WGAN, which is optimized to approximate the overall data distribution rather than to fit rare extreme events explicitly. Notably, mode averaging can occur during adversarial training, where the generator focuses on high-density parts of the target distribution and smooths the tails. Furthermore, the loss function emphasizes global distributional similarity, which can reduce sensitivity to low-frequency, high-magnitude extremes and lead to a systematic underestimation of peak flows (Figure 11).
Figure 12 shows the SHAP-derived interpretability results for the WGAN. According to the feature importance ranking, precipitation (pcp) is the dominant predictor of streamflow, followed by its lagged and smoothed versions (pcp_lag1, pcp_lag3, pcp_roll3). This dominant dependence on rainfall persistence reflects hydrological memory. The SHAP plot shows that SHAP values remain around zero during low rainfall conditions but increase once a threshold is exceeded, indicating an intense rainfall–runoff response. The SHAP analysis confirms that the WGAN captures the cumulative and lag-dependent effects of precipitation in a physically consistent manner.

4.4. Results of Hybrid Modelling Approaches (SWAT + XGBoost and SWAT + WGAN)

To overcome the structural limitations and systematic biases of the process-based SWAT model, such as parametrization-induced uncertainty and calibration-related biases in runoff generation, the third stage of the study implemented an integrated hybrid correction framework. This was achieved by hybridizing the SWAT output with machine learning models, namely XGBoost and WGAN. The structural details of the hybrid modelling configurations are presented in Table 6.
In the M10 setting, the SWAT-simulated streamflow was used as an input to an XGBoost model. To make the model more stable and to ensure that it correctly reflects month-to-month changes in discharge, lagged streamflow values were also included as additional predictors. Furthermore, sine–cosine seasonality transformations were added as predictors to explicitly represent the periodic hydrological patterns. The resulting hybrid model improved on SWAT’s performance, yielding an NSE of 0.86 upon validation (Figure 13). This hybrid configuration demonstrated substantial improvements in reproducing both peak and low-flow conditions, providing clear evidence that it effectively corrected the systematic biases inherent in the SWAT simulations.
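The sine–cosine seasonality terms can be illustrated with a short sketch. It assumes a 12-month period and maps January to angle zero; the exact phase convention used in the study is not given, so this choice and the function name are assumptions.

```python
import math

def seasonal_terms(month):
    """Map a calendar month (1-12) to a (sin, cos) pair on the unit circle.

    Assumes a 12-month period with January at angle zero.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    angle = 2.0 * math.pi * (month - 1) / 12.0
    return math.sin(angle), math.cos(angle)
```

With this encoding December and January are neighbours on the circle, which a raw month index (12 versus 1) would not convey to the model.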
Figure 14 displays the SHAP-derived interpretability results for the SWAT + XGBoost hybrid model. The feature importance ranking demonstrates that the SWAT-simulated discharge (SWAT_output, Q) is by far the strongest predictor within the post-processing system, followed by the seasonal cosine term and the discharge lag variables (Q_lag3, Q_lag12 and Q_lag24). The results show that the hybrid model primarily relies on the streamflow simulated by SWAT. Lagged values and the seasonal sin–cos terms then help the model adjust and refine this initial estimate. The SHAP dependence plots support this behavior. As the SWAT-predicted discharge increases, the corresponding SHAP values also rise almost linearly. This means that higher SWAT estimates are consistently translated into higher corrected discharge outputs in the hybrid model. This behavior illustrates that the XGBoost component serves as a bias-correction and nonlinear calibration layer, learning systematic deviations in the SWAT output while retaining the overall physical consistency.
This behavior suggests that the hybrid retains the process-based strengths of SWAT while benefiting from the flexibility of XGBoost to reduce residual biases due to parameter uncertainty or limitations in the input data. In practice, the SWAT + XGBoost hybrid thus retains the physical basis of the simulation but enhances prediction accuracy, especially in high-flow regimes and seasonal transition periods.
Using the same procedure, the M11 configuration was created by merging the SWAT and WGAN models. This configuration achieved the best overall performance in this research, with a validation NSE of 0.90. Figure 15 shows the SHAP-based interpretation results for the SWAT + WGAN hybrid model. Panel (a) shows the feature importance ranking and indicates that the SWAT discharge output (Q) is the most dominant predictor within the hybrid configuration. This suggests that the WGAN model relies strongly on the physical model output to produce precise streamflow predictions. The cosine and sine transformations, which were included to represent seasonal cycles, were also among the dominant contributors, indicating the strong influence of seasonal patterns on the model’s predictions. The discharge values at various lags (Q_lag3, Q_lag2, Q_lag12) provided supplementary temporal information, capturing the short- to medium-term dependence on antecedent flow. Panel (b) is the SHAP dependence plot for the SWAT output. A sharp, almost linear relationship exists between the SWAT discharge and its corresponding SHAP effect, indicating that the hybrid retains an explicit and constant sensitivity to the physical model’s output. When a secondary seasonality-related feature is added, the relationship is captured much better, resulting in stronger SHAP effects.
Overall, the results in Figure 15 affirm that the SWAT + WGAN model not only takes advantage of the predictive capability of the physical model but also effectively incorporates temporal memory and seasonal signals, thereby producing physically consistent streamflow simulations.
These outcomes demonstrate that the integration of physically based and AI models can significantly enhance hydrological simulation performance. By combining the process-based knowledge of SWAT with the data-driven flexibility of XGBoost and WGAN, the hybrid models successfully corrected SWAT’s inherent structural and input-related biases. This combination not only improved prediction performance but also maintained physical consistency and enabled the models to represent nonlinear rainfall–runoff responses and long-term hydrological behavior more reliably.

4.5. Overall Comparison and Determination of the Most Effective Modelling Framework

Based on the comparison of the 11 models with various implementation strategies, all models performed generally well. This finding illustrates that both physically based and AI-based methods can effectively simulate hydrological processes. The hybrid architectures, particularly the SWAT + WGAN setting, substantially enhanced the accuracy of streamflow simulation. These hybrid models effectively corrected the SWAT model’s input-related uncertainty and structural biases by integrating physically based process information with data-driven learning. Furthermore, the XGBoost model matched and sometimes exceeded the SWAT model’s accuracy, suggesting that data-driven approaches can rival physically based models in performance. Another critical result is that the models developed using satellite-based meteorological data showed remarkably similar performance to the corresponding models developed with ground observations. This implies that satellite data can serve as a reliable substitute where observational coverage within a basin is limited and can partially replace ground-based inputs in hydrological modelling research (Figure 16). In Figure 16a, the black-framed cells indicate the best-performing results across the evaluated performance metrics, highlighting the models that achieved superior overall accuracy during both calibration and validation phases.
In addition, the box plots of the calibration and validation metrics offered further information on the stability and distribution of model performance. The narrow interquartile ranges obtained for the hybrid models, especially at the validation stage, confirmed their robustness and ability to generalize. On the other hand, the wider distributions for the purely data-driven models reflect greater sensitivity to input data quality and calibration. Overall, the performance metrics corroborated the insights derived from the heatmap analysis, further emphasizing that hybrid models and well-calibrated AI approaches achieve higher accuracy and stability across multiple simulation time scales (Figure 16b,c).
A comprehensive uncertainty analysis was performed for all modelling configurations to complement the performance-based evaluation. The results demonstrate that the uncertainty metrics are consistent with the conventional hydrological performance indicators. In particular, hybrid models exhibited narrower uncertainty ranges, confirming their superior robustness. Among all configurations, the SWAT–WGAN hybrid model yielded the lowest r-factor values, indicating the narrowest predictive uncertainty while maintaining high observational coverage. As an example, the uncertainty band for the M11 model, which has the highest performance, is shown in Figure 17. The observed streamflow is largely encompassed by the 95% uncertainty band, while the band widens during high-flow events, reflecting increased uncertainty under peak-flow conditions and a hydrologically consistent uncertainty behavior. The uncertainty bands generated for the other models are presented in Figures A1–A10. These findings corroborate the performance metrics and highlight the ability of physically informed hybrid frameworks to reduce predictive uncertainty compared to standalone physically based or purely data-driven models (Figure 18).
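The two quantities referred to here, band width relative to observed variability (r-factor) and observational coverage (commonly called the p-factor in SUFI-2), can be computed from an uncertainty band as sketched below. The lower and upper bounds are taken as given; how the 95% band was derived for each model is not described in this section, and the use of the population standard deviation is an assumption.

```python
import numpy as np

def p_factor(obs, lower, upper):
    """Fraction of observations that fall inside the uncertainty band."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    return float(np.mean((obs >= lower) & (obs <= upper)))

def r_factor(obs, lower, upper):
    """Mean band width divided by the standard deviation of the observations
    (population standard deviation assumed)."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    return float(np.mean(upper - lower) / np.std(obs))
```

A narrow band (low r-factor) is only meaningful if coverage stays high, which is why the two values are read together.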

4.6. Model Integration of Climate Change Scenarios

With the top-performing configuration, SWAT + WGAN, the last phase of the study examined the effect of future climate change scenarios on streamflow in the Büyük Menderes River Basin. Simulations were generated using three climate models (MPI-ESM1-2, GFDL-ESM4, and EC-Earth3) and two Shared Socioeconomic Pathways (SSP245 and SSP585) for three future intervals (2018–2040, 2041–2060, and 2061–2099). The statistical properties of the projected flows for each scenario are summarized in Figure 19.
Generally, the results show a decreasing trend in mean streamflow across most scenarios. For example, compared to the reference period (1995–2017), the mean discharge is projected to decrease by up to 58% under MPI-ESM1-2 SSP245 and by about 42% under SSP585, indicating a discernible drying trend in the hydrological regime of the basin. Moderate increases were found in some instances (particularly EC-Earth3 SSP585 in the 2018–2040 period), which are a consequence of the seasonal shift in meteorological inputs. Alongside the decreasing flow magnitudes, the increasing skewness and kurtosis indicate more frequent extreme events, longer low-flow periods, and more rapidly rising peaks. This anticipated growth in flow irregularity underscores the basin’s elevated sensitivity to climatic forcing. These results are significant as they reflect the potential future changes in the availability and irregularity of surface waters in the basin and serve as a guide for water resources planning efforts in the region (Figure 19).
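The summary statistics discussed here (relative change in mean flow, skewness, kurtosis) follow from standard moment formulas. The sketch below uses biased moment estimators and excess kurtosis as assumptions, since the estimators used in the study are not specified; the function names are illustrative.

```python
import numpy as np

def flow_statistics(q):
    """Mean, skewness and excess kurtosis of a flow series
    (biased moment estimators, assumed here)."""
    q = np.asarray(q, dtype=float)
    mean = q.mean()
    dev = q - mean
    m2 = np.mean(dev ** 2)                       # second central moment
    skewness = np.mean(dev ** 3) / m2 ** 1.5
    kurtosis = np.mean(dev ** 4) / m2 ** 2 - 3.0  # 0 for a normal distribution
    return {"mean": float(mean), "skewness": float(skewness), "kurtosis": float(kurtosis)}

def percent_change(reference_mean, future_mean):
    """Relative change of the future mean flow with respect to the reference period, in %."""
    return 100.0 * (future_mean - reference_mean) / reference_mean
```

A negative percent change indicates drying relative to the reference period, while rising skewness and kurtosis point to a flow distribution with a longer upper tail and more pronounced extremes.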

5. Discussions

The results of this study show that the choice of meteorological data source and modeling method are dominant factors causing considerable variability in hydrological model performance. The comparison went beyond the customary evaluation of a single model type by systematically comparing three types of models, namely observation-driven, satellite-driven, and parameter-transferred models. The results show that satellite data are a suitable alternative to station-based data for physically based hydrological models. Unlike most previous studies, which are based on ANN or SVR, this study applies more advanced algorithms, including XGBoost and WGAN, which use meteorological inputs identical to those of SWAT. These findings indicate that model performance depends not only on the algorithm but also on data structure and process representation, implying that even complex models like SWAT can be further improved when supported by well-organized algorithms. Most of the model configurations exhibited robust prediction abilities, with NSE values between 0.78 and 0.98 and KGE values between 0.76 and 0.93, supporting the credibility of the framework presented here. SWAT-XGBoost and SWAT-WGAN, which are proposed as encoder-decoder-based hybrid models, also proved capable of enhancing forecasting accuracy beyond that of their physically based or data-driven counterparts, with the hybrid AI-augmented structures providing the highest accuracy gains. Of these models, the SWAT + WGAN structure exhibited the most robust performance at validation (NSE = 0.90, KGE = 0.89, R² = 0.90, RMSE = 1.87), while SWAT + XGBoost achieved near-perfect calibration accuracy (NSE = 0.98, KGE = 0.93, R² = 0.98). This evidence highlights that combining physical process representation with AI-derived bias correction can effectively reduce structural error and improve hydrological realism.
Another notable finding is that satellite-derived meteorological data can bridge the ground observation gap, which is highly relevant for data-scarce basins. Additionally, climate projections from an ensemble of GCMs (MPI-ESM1-2-HR, GFDL-ESM4, and EC-Earth3) under the SSP2-4.5 and SSP5-8.5 scenarios showed a decrease in average flow and a higher skewness in future periods. Increases in average flow were nevertheless observed during certain periods in the EC-Earth3 scenario, as a result of seasonal shifts in the scenario’s meteorological data. This combined evidence supports the ability of the proposed framework to combine physical consistency with AI flexibility when predicting future climate impacts, within an integrated modelling paradigm that remains functional under varying data availability and climate regimes. In addition to the performance-based evaluation and interpretability analyses such as SHAP, the physical consistency of the proposed hybrid framework was carefully considered. Physical meaning within the proposed hybrid models is ensured to a large extent by the explicit integration of physically grounded SWAT simulations as the core of the modelling framework. In such hybrid frameworks, AI models are not substitutes for hydrological process representations but correctors that capture residual nonlinear relationships and systematic biases. As a result, the basic hydrological behaviors represented by the original SWAT model, in terms of seasonal flow variations, runoff generation mechanisms, and mass balance, are retained, while prediction accuracy is increased through data-driven adjustments.

6. Conclusions

In this study, we propose a modeling approach that effectively combines process-based and data-driven modeling. It systematically explores the consistency of input data, model generalization, and projection skill for climate change impacts across physically based, statistical/machine learning-based, and hybrid approaches. This study is one of the most comprehensive comparative analyses to date. The results enrich the literature on hydrological model comparison by investigating eleven different model structures that include different meteorological inputs, calibration approaches, and hybrid models.
The study yielded the following main findings: (i) the SWAT model performs adequately and reliably when calibrated with observational data; (ii) using satellite-derived meteorological inputs allows the models to retain satisfactory performance, indicating that satellite products can serve as practical alternatives in data-limited basins; (iii) data-driven approaches such as XGBoost and WGAN also produced high predictive accuracy under the available data conditions, showing that these models may be deliberately selected as alternatives to SWAT or other process-based frameworks, depending on the user’s expertise, the purpose of the study, and the structure or volume of the available data; (iv) when a physical model such as SWAT is hybridized with data-driven components, its predictive performance increases, showing that physically based models can be further enhanced through algorithmic support; and (v) under different future climate scenarios, most simulations reveal a tendency toward reduced mean flows. In the presented study, multiple AI, hydrological, and hybrid models were evaluated together, demonstrating their applicability to basins at different scales and their extensibility. The next step in developing this work is to adopt spatio-temporal AI architectures, including transformers or graph neural networks, and to implement uncertainty quantification through ensemble hybridization strategies. Overall, this study advances hydrological modeling beyond its traditional scope by demonstrating that AI-assisted hybridization can achieve both higher predictive performance and greater physical interpretability. This framework is considered data-efficient and technology-compatible, suitable for resilient and adaptable basin management under changing climate conditions.

Author Contributions

Conceptualization, S.N.Y. and G.O.G.; methodology, S.N.Y.; software, S.N.Y.; validation, S.N.Y. and G.O.G.; formal analysis, S.N.Y.; investigation, S.N.Y.; resources, S.N.Y. and G.O.G.; data curation, S.N.Y.; writing—original draft preparation, S.N.Y. and G.O.G.; writing—review and editing, G.O.G.; visualization, S.N.Y.; supervision, G.O.G.; project administration, G.O.G.; funding acquisition, G.O.G. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Scientific and Technological Research Council of Türkiye (TÜBİTAK), Directorate for Scientist Support Programs (BİDEB), under the 2211-E Domestic Direct PhD Scholarship Program. The application number of the scholarship is 1649B032205678.

Data Availability Statement

The DEM data used in this study were obtained from the GLO-30 dataset provided by the Copernicus programme. All libraries and input data used for the SWAT model setup were sourced from the official SWAT website (https://swat.tamu.edu/). Ground-based meteorological and hydrological observation data were acquired from the Turkish State Meteorological Service (MGM) and the General Directorate of State Hydraulic Works (DSI), respectively. The Hydrological Soil Group (HSG) data were obtained from the Global Hydrological Soil Groups (HYSOGs250m) dataset. Satellite-based meteorological data were retrieved from the NASA POWER database (https://power.larc.nasa.gov/). Future climate projections were based on the NASA NEX-GDDP-CMIP6 dataset.

Acknowledgments

The authors extend their sincere appreciation to the institutions and platforms that supported this study by providing access to the required datasets. We express our thanks to the Copernicus programme for the GLO-30 elevation data, to NASA’s POWER project for the satellite-based meteorological inputs, and to the SWAT development team for making their modelling resources and documentation publicly available. We also thank the Turkish State Meteorological Service (MGM) and the General Directorate of State Hydraulic Works (DSI) for providing ground-based observation data. We further thank NASA for providing access to the NEX-GDDP-CMIP6 climate projections. The authors also extend their appreciation to the anonymous reviewers for their valuable feedback and suggestions, which helped improve the quality of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN         Artificial Neural Networks
AI          Artificial Intelligence
CHIRPS      Climate Hazards Group InfraRed Precipitation with Station data
CMIP6       Coupled Model Intercomparison Project Phase 6
DSI         General Directorate of State Hydraulic Works, Türkiye
EC-Earth3   EC-Earth Earth System Model, version 3
GMAO        Global Modeling and Assimilation Office
GFDL-ESM4   Geophysical Fluid Dynamics Laboratory Earth System Model, version 4
SHAP        SHapley Additive exPlanations
SSP         Shared Socioeconomic Pathways
SVR         Support Vector Regression
SWAT        Soil and Water Assessment Tool
MGM         General Directorate of Meteorology, Türkiye
MPI-ESM1-2  Max Planck Institute Earth System Model, version 1.2
XGBoost     Extreme Gradient Boosting
WGAN        Wasserstein Generative Adversarial Network

Appendix A

Figure A1. Uncertainty band of M1 model output and observed flow.
Figure A2. Uncertainty band of M2 model output and observed flow.
Figure A3. Uncertainty band of M3 model output and observed flow.
Figure A4. Uncertainty band of M4 model output and observed flow.
Figure A5. Uncertainty band of M5 model output and observed flow.
Figure A6. Uncertainty band of M6 model output and observed flow.
Figure A7. Uncertainty band of M7 model output and observed flow.
Figure A8. Uncertainty band of M8 model output and observed flow.
Figure A9. Uncertainty band of M9 model output and observed flow.
Figure A10. Uncertainty band of M10 model output and observed flow.

References

  1. Goswami, G.; Prasad, R.K.; Mandal, S. Streamflow variability under SSP2-4.5 and SSP5-8.5 climate scenarios using QSWAT plus for Subansiri River Basin in Arunachal Pradesh, India. Theor. Appl. Climatol. 2025, 156, 260. [Google Scholar] [CrossRef]
  2. Kundzewicz, Z.W. Climate change impacts on the hydrological cycle. Ecohydrol. Hydrobiol. 2008, 8, 195–203. [Google Scholar] [CrossRef]
  3. Anand, V.; Oinam, B.; Wieprecht, S.; Singh, S.K.; Srinivasan, R. Enhancing hydrological model calibration through hybrid strategies in data-scarce regions. Hydrol. Process. 2024, 38, 15084. [Google Scholar] [CrossRef]
  4. Benkirane, M.; Amazirh, A.; Laftouhi, N.E.; Khabba, S.; Chehbouni, A. Assessment of GPM satellite precipitation performance after bias correction, for hydrological modeling in a semi-arid watershed (High Atlas Mountain, Morocco). Atmosphere 2023, 14, 794. [Google Scholar] [CrossRef]
  5. Brocca, L.; Massari, C.; Pellarin, T.; Filippucci, P.; Ciabatta, L.; Camici, S.; Kerr, Y.H.; Fernández-Prieto, D. River flow prediction in data scarce regions: Soil moisture integrated satellite rainfall products outperform rain gauge observations in West Africa. Sci. Rep. 2020, 10, 12517. [Google Scholar] [CrossRef]
  6. Lee, S.; Kim, J.; Lee, G.; Hong, J.; Bae, J.H.; Lim, K.J. Prediction of aquatic ecosystem health indices through machine learning models using the WGAN-based data augmentation method. Sustainability 2021, 13, 10435. [Google Scholar] [CrossRef]
  7. Nyeko, M. Hydrologic modelling of data scarce basin with SWAT model: Capabilities and limitations. Water. Resour. Manag. 2015, 29, 81–94. [Google Scholar] [CrossRef]
  8. Tran, T.N.D.; Lakshmi, V. Visualization-driven hydrologic assessment using gridded precipitation products. Hydrol. Process. 2024, 38, e15286. [Google Scholar] [CrossRef]
  9. Wang, K.; Shi, H.; Chen, J.; Li, T. An improved operation-based reservoir scheme integrated with variable infiltration capacity model for multiyear and multipurpose reservoirs. J. Hydrol. 2019, 571, 365–375. [Google Scholar] [CrossRef]
  10. Alfieri, L.; Avanzi, F.; Delogu, F.; Gabellani, S.; Bruno, G.; Campo, L.; Libertino, A.; Massari, C.; Tarpanelli, A.; Rains, D.; et al. High resolution satellite products improve hydrological modeling in northern Italy. Hydrol. Earth Syst. Sci. 2022, 26, 3921–3939. [Google Scholar] [CrossRef]
  11. Hrour, Y.; Thomas, Z.; Rousseau-Gueutin, P.; Ait-Brahim, Y.; Fovet, O. Enhancing hydrological modeling with bias-corrected satellite weather data in data-scarce catchments: A comparative analysis of SWAT and GR4J models. Front. Water 2025, 7, 1582589. [Google Scholar] [CrossRef]
  12. Wongchuig, S.; Paiva, R.; Siqueira, V.; Papa, F.; Fleischmann, A.; Biancamaria, S.; Al Bitar, A. Multi-satellite data assimilation for large-scale hydrological-hydrodynamic prediction: Proof of concept in the Amazon basin. Water Resour. Res. 2024, 60, e2024WR037155. [Google Scholar] [CrossRef]
  13. Massari, C.; Crow, W.; Brocca, L. An assessment of the performance of global rainfall estimates without ground-based observations. Hydrol. Earth Syst. Sci. 2017, 21, 4347–4361. [Google Scholar] [CrossRef]
  14. Bloom, S.; da Silva, A.; Dee, D.; Bosilovich, M.; Chern, J.D.; Pawson, S.; Schubert, S.; Wu, M.L.; Sienkiewicz, M.; Stajner, I. Documentation and Validation of the Goddard Earth Observing System (GEOS) Data Assimilation System, Version 4; NASA/TM; NASA: Washington, DC, USA, 2005; Volume 26.
  15. Bosilovich, M.G.; Robertson, F.R.; Takacs, L.; Molod, A.; Mocko, D. Atmospheric water balance and variability in the MERRA-2 reanalysis. J. Clim. 2017, 30, 1177–1196. [Google Scholar] [CrossRef]
  16. Ebert, E.E. Methods for verifying satellite precipitation estimates. In Measuring Precipitation from Space: EURAINSAT and the Future; Springer: Berlin/Heidelberg, Germany, 2007; pp. 345–356. [Google Scholar]
  17. Hegyi, B.; Stackhouse, P.W.; Taylor, P.; Patadia, F. NASA POWER: Providing present and future climate services based on NASA data for the energy, agricultural, and sustainable buildings communities. In Proceedings of the 104th American Meteorological Society (AMS) Annual Meeting, Baltimore, MD, USA, 28 January–1 February 2024. [Google Scholar]
  18. Yang, S.; Yang, D.; Chen, J.; Santisirisomboon, J.; Lu, W.; Zhao, B. A physical process and machine learning combined hydrological model for daily streamflow simulations of large watersheds with limited observation data. J. Hydrol. 2020, 590, 125206. [Google Scholar] [CrossRef]
  19. Abbaspour, K.C.; Yang, J.; Maximov, I.; Siber, R.; Bogner, K.; Mieleitner, J.; Srinivasan, R. Modeling hydrology and water quality in the pre-alpine/alpine Thur watershed using SWAT. J. Hydrol. 2007, 333, 413–430. [Google Scholar] [CrossRef]
  20. Abbaspour, K.C. SWAT Calibration and Uncertainty Programs—A User Manual; Swiss Federal Institute of Aquatic Science and Technology: Eawag, Switzerland, 2015. [Google Scholar]
  21. Núñez, J.; Cortés, C.B.; Yáñez, M.A. Explainable artificial intelligence in hydrology: Interpreting black-box snowmelt-driven streamflow predictions in an arid Andean basin of north-central Chile. Water 2023, 15, 3369. [Google Scholar] [CrossRef]
  22. Wu, S.; Dong, Z.; Guzmán, S.M.; Conde, G.; Wang, W.; Zhu, S.; Meng, J. Two-step hybrid model for monthly runoff prediction utilizing integrated machine learning algorithms and dual signal decompositions. Ecol. Inform. 2024, 84, 102914. [Google Scholar] [CrossRef]
  23. Westra, S.; Brown, C.; Lall, U.; Sharma, A. Modeling multivariable hydrological series: Principal component analysis or independent component analysis? Water Resour. Res. 2007, 43, W06429. [Google Scholar] [CrossRef]
  24. Zhou, S.; Liu, Z.; Wang, M.; Gan, W.; Zhao, Z.; Wu, Z. Impacts of building configurations on urban stormwater management at a block scale using XGBoost. Sustain. Cities Soc. 2022, 87, 104235. [Google Scholar] [CrossRef]
  25. Höge, M.; Scheidegger, A.; Baity-Jesi, M.; Albert, C.; Fenicia, F. Improving hydrologic models for predictions and process understanding using neural ODEs. Hydrol. Earth Syst. Sci. 2022, 26, 5085–5102. [Google Scholar] [CrossRef]
  26. Yin, H.; Zhao, L.; Zhu, M.; Zhang, Y. Runoff prediction in gauged and ungauged basins using transformer-XAJ model. J. Hydrol. 2025, 662, 133954. [Google Scholar] [CrossRef]
  27. Kassem, A.A.; Raheem, A.M.; Khidir, K.M.; Alkattan, M. Predicting daily Khazir basin flow using SWAT and hybrid SWAT-ANN models. Ain Shams Eng. J. 2020, 11, 435–443. [Google Scholar] [CrossRef]
  28. Xiao, C.; Mohammaditab, M. Evaluation of the impact of hydrological changes on reservoir water management: A comparative analysis of the CanESM5 model and the optimized SWAT-SVR-LSTM. Heliyon 2024, 10, e37208. [Google Scholar] [CrossRef]
  29. Basakın, E.E.; Stoy, P.C.; Demirel, M.C.; Ozdogan, M.; Otkin, J.A. Combined drought index using high-resolution hydrological models and explainable artificial intelligence techniques in Türkiye. Remote Sens. 2024, 16, 3799. [Google Scholar] [CrossRef]
  30. Mushtaq, H.; Akhtar, T.; Masood, A.; Saeed, F. Hydrologic interpretation of machine learning models for 10-daily streamflow simulation in climate-sensitive upper Indus catchments. Theor. Appl. Climatol. 2024, 155, 5525–5542. [Google Scholar] [CrossRef]
  31. Heydarizad, M.; Pumijumnong, N.; Minaei, M.; Salari, P.; Sorí, R.; Mohammadabadi, H.G. Exploring stable isotope patterns in monthly precipitation across Southeast Asia using contemporary deep learning models and SHapley Additive exPlanations (SHAP) techniques. Isot. Environ. Health Stud. 2025, 61, 547–568. [Google Scholar] [CrossRef] [PubMed]
  32. Asadi, S.; Jimeno-Sáez, P.; López-Ballesteros, A.; Senent-Aparicio, J. Comparison and integration of physical and interpretable AI-driven models for rainfall-runoff simulation. Results Eng. 2024, 24, 103048. [Google Scholar] [CrossRef]
  33. Parasar, P.; Krishna, A.P. Explainable AI-driven assessment of hydroclimatic interactions shaping river discharge dynamics in a monsoonal basin. Sci. Rep. 2025, 15, 27302. [Google Scholar] [CrossRef]
  34. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2021; p. 3949. [Google Scholar]
  35. Karimazadeh, K.; Yi, J. Modeling hydrological responses of watershed under climate change scenarios using machine learning techniques. Water Resour. Manag. 2023, 37, 5235–5254. [Google Scholar] [CrossRef]
  36. Deng, C.; Jiang, X.; Jiang, C.; Nie, T.; Lei, Y.; Yang, A. Insights into teleconnection mechanism of extreme precipitation events based on the SHAP-XGBoost model: Evidence from Hekou-Longmen section in China. Nat. Hazards 2025, 121, 7447–7468. [Google Scholar] [CrossRef]
  37. Trenberth, K.E. The impact of climate change and variability on heavy precipitation, floods, and droughts. Encycl. Hydrol. Sci. 2008, 17. [Google Scholar] [CrossRef]
  38. Ministry of Agriculture and Forestry; General Directorate of Water Management. Büyük Menderes River Basin Management Plan. In Project: Technical Assistance for the Conversion of River Basin Protection Action Plans into River Basin Management Plans (TR2011/0327.21-05-01-001); Ministry of Agriculture and Forestry: Ankara, Turkey, 2018. [Google Scholar]
  39. Stackhouse, P.W.; Westberg, D.; Hoell, J.M.; Chandler, W.S.; Zhang, T. Prediction of Worldwide Energy Resource (POWER)–Agroclimatology Methodology (1.0° Latitude by 1.0° Longitude Spatial Resolution). NASA POWER Project Documentation. 2015. Available online: https://power.larc.nasa.gov/ (accessed on 15 December 2025).
  40. Arnold, J.G.; Allen, P.M.; Bernhardt, G. A comprehensive surface-groundwater flow model. J. Hydrol. 1993, 142, 47–69. [Google Scholar] [CrossRef]
  41. Onuşluel Gül, G.; Rosbjerg, D. Modelling of hydrologic processes and potential response to climate change through the use of a multisite SWAT. Water Environ. J. 2010, 24, 21–31. [Google Scholar] [CrossRef]
  42. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  43. Kruk, M. SHAP-NET, a network based on Shapley values as a new tool to improve the explainability of the XGBoost-SHAP model for the problem of water quality. Environ. Model. Softw. 2025, 188, 106403. [Google Scholar] [CrossRef]
  44. Liu, X.; Zhou, P.; Lin, Y.; Sun, S.; Zhang, H.; Xu, W.; Yang, S. Influencing factors and risk assessment of precipitation-induced flooding in Zhengzhou, China, based on random forest and XGBoost algorithms. Int. J. Environ. Res. Public Health 2022, 19, 16544. [Google Scholar] [CrossRef] [PubMed]
  45. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
  46. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2017. [Google Scholar]
  47. Zhu, B.; Hu, X. GWGAN-based realization process of gravel soil for hydraulic property simulation. Appl. Sci. 2024, 14, 9873. [Google Scholar] [CrossRef]
  48. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631. [Google Scholar]
  49. O’Neill, B.C.; Tebaldi, C.; Van Vuuren, D.P.; Eyring, V.; Friedlingstein, P.; Hurtt, G.; Sanderson, B.M. The Scenario Model Intercomparison Project (ScenarioMIP) for CMIP6. Geosci. Model Dev. 2016, 9, 3461–3482. [Google Scholar] [CrossRef]
  50. Eyring, V.; Bony, S.; Meehl, G.A.; Senior, C.A.; Stevens, B.; Stouffer, R.J.; Taylor, K.E. Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev. 2016, 9, 1937–1958. [Google Scholar] [CrossRef]
  51. Gupta, H.V.; Kling, H.; Yilmaz, K.K.; Martinez, G.F. Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modeling. J. Hydrol. 2009, 377, 80–91. [Google Scholar] [CrossRef]
  52. Guyasa, A.K.; Zhang, D.; Guan, Y.; Niyongabo, A.; Ziyuan, W.; Yang, Y. Climate change impacts on river flow and extreme hydrological events in the Tendaho Catchment, Ethiopia. J. Water Clim. Change 2025, 16, 3243–3274. [Google Scholar] [CrossRef]
  53. Li, B.; Tan, L.; Zhang, X.; Qi, J.; Marek, G.W.; Li, Y.; Dong, X.; Zhao, W.; Chen, T.; Feng, P.; et al. Modeling streamflow response under changing environment using a modified SWAT model with enhanced representation of CO2 effects. J. Hydrol. Reg. Stud. 2023, 50, 101547. [Google Scholar] [CrossRef]
  54. Chakilu, G.G.; Sándor, S.; Zoltán, T.; Phinzi, K. Climate change and the response of streamflow of watersheds under the high emission scenario in Lake Tana sub-basin, upper Blue Nile basin, Ethiopia. J. Hydrol. Reg. Stud. 2022, 42, 101175. [Google Scholar] [CrossRef]
  55. Ebodé, V.B.; Dzana, J.G.; Nkiaka, E.; Nnomo, B.N.; Braun, J.J.; Riotte, J. Effects of climate and anthropogenic changes on current and future variability in flows in the So’o River Basin (south of Cameroon). Hydrol. Res. 2022, 53, 1203–1220. [Google Scholar] [CrossRef]
  56. Aruho Tusingwiire, M.; Tumutungire, M.D.; Sempewo, J.I.; Semiyaga, S. Impacts of climate and land use/cover change on mini-hydropower generation in River Kyambura watershed in South Western part of Uganda. Water Pract. Technol. 2023, 18, 1576–1597. [Google Scholar] [CrossRef]
Figure 1. Büyük Menderes Basin, selected subbasin and observation stations.
Figure 2. Overall methodological framework of the study.
Figure 3. SWAT hydrological cycle.
Figure 4. Monthly comparison of precipitation values.
Figure 5. Monthly comparison of maximum temperature values.
Figure 6. Monthly comparison of minimum temperature values.
Figure 7. Monthly comparison of relative humidity values.
Figure 8. Time-series comparison of observed and simulated flows for M1–M3.
Figure 9. Time-series comparison of observed and simulated flows for M4–M6.
Figure 10. (a) SHAP feature importance for the XGBoost model (b) SHAP dependence plot for the XGBoost model.
Figure 11. Time-series comparison of observed and simulated flows for M7–M9.
Figure 12. (a) SHAP feature importance for the WGAN model (b) SHAP dependence plot for the WGAN model.
Figure 13. Performance of Hybrid models M10–M11.
Figure 14. (a) SHAP feature importance for the SWAT + XGBoost model (b) SHAP dependence plot for the SWAT + XGBoost model.
Figure 15. (a) SHAP feature importance for the SWAT + WGAN model (b) SHAP dependence plot for the SWAT + WGAN model.
Figure 16. (a) Performance heatmap of all models (b) Calibration metric distributions (c) Validation metric distributions.
Figure 17. Uncertainty band of M11 model output and observed flow.
Figure 18. Uncertainty metrics (p-factor and r-factor) for all model configurations.
Figure 19. Statistical characteristics of projected flows under six GCM-SSP scenarios.
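Table 1. Model Performance Metrics.
| Metric | Formula | Symbols |
|---|---|---|
| NSE | $NSE = 1 - \dfrac{\sum \left(Q_{obs} - Q_{sim}\right)^{2}}{\sum \left(Q_{obs} - \overline{Q_{obs}}\right)^{2}}$ | $Q_{obs}$: observed discharge; $Q_{sim}$: simulated discharge; $\overline{Q_{obs}}$: mean of observed values |
| KGE | $KGE = 1 - \sqrt{(r - 1)^{2} + (\alpha - 1)^{2} + (\beta - 1)^{2}}$ | $r$: correlation coefficient; $\alpha$: variability ratio; $\beta$: bias ratio |
| R2 | $R^{2} = \left[\dfrac{\sum \left(Q_{obs} - \overline{Q_{obs}}\right)\left(Q_{sim} - \overline{Q_{sim}}\right)}{\sqrt{\sum \left(Q_{obs} - \overline{Q_{obs}}\right)^{2} \sum \left(Q_{sim} - \overline{Q_{sim}}\right)^{2}}}\right]^{2}$ | $\overline{Q_{sim}}$: mean of simulated values |
| RMSE | $RMSE = \sqrt{\dfrac{1}{n} \sum \left(Q_{obs} - Q_{sim}\right)^{2}}$ | $n$: number of observations |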
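As a rough illustration of the Table 1 metrics, the following pure-Python sketch computes NSE, KGE, R2 and RMSE for two equal-length discharge series. The function names are illustrative and do not come from the study. The sketch also assumes the usual KGE definitions α = σ_sim/σ_obs and β = μ_sim/μ_obs, which the table names but does not spell out.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - squared error / spread of obs about its mean."""
    obs_mean = _mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_r(obs, sim):
    """Linear correlation coefficient between observed and simulated series."""
    obs_mean, sim_mean = _mean(obs), _mean(sim)
    cov = sum((o - obs_mean) * (s - sim_mean) for o, s in zip(obs, sim))
    var_o = sum((o - obs_mean) ** 2 for o in obs)
    var_s = sum((s - sim_mean) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def kge(obs, sim):
    """Kling-Gupta efficiency.

    Assumption: alpha = std(sim) / std(obs) and beta = mean(sim) / mean(obs).
    """
    obs_mean, sim_mean = _mean(obs), _mean(sim)
    std_o = math.sqrt(_mean([(o - obs_mean) ** 2 for o in obs]))
    std_s = math.sqrt(_mean([(s - sim_mean) ** 2 for s in sim]))
    r = pearson_r(obs, sim)
    alpha = std_s / std_o
    beta = sim_mean / obs_mean
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def r_squared(obs, sim):
    """Coefficient of determination as the squared correlation coefficient."""
    return pearson_r(obs, sim) ** 2


def rmse(obs, sim):
    """Root mean squared error, in the units of the discharge series."""
    return math.sqrt(_mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


# Synthetic example (not data from the study): a constant bias of +1.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [2.0, 3.0, 4.0, 5.0, 6.0]
print(nse(observed, simulated), kge(observed, simulated),
      r_squared(observed, simulated), rmse(observed, simulated))
```

In this synthetic case the correlation is perfect, so R2 stays at 1 while NSE, KGE and RMSE all register the bias. This is one reason to report the metrics together, not R2 alone.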
Table 2. Overview of the modelling strategies (SWAT).
| Model | Definition |
|---|---|
| M1 | SWAT forced with ground-based meteorological data. |
| M2 | SWAT forced with satellite-based meteorological data using M1-calibrated parameters. |
| M3 | SWAT fully recalibrated using satellite-based meteorological data only. |
Table 3. Optimal values of the most sensitive SWAT parameters obtained through SUFI-2 calibration for the M1 and M3 configurations.
| Parameter | Best Value (M1) | Best Value (M3) |
|---|---|---|
| r__CN2.mgt | −0.095 | 0.012 |
| v__EPCO.bsn | 0.26 | 0.99 |
| v__FFCB.bsn | 0.33 | 0.15 |
| v__TIMP.bsn | 0.17 | 0.37 |
| v__EVLAI.bsn | 3.47 | −0.36 |
| r__SLSUBBSN.hru | −0.45 | - |
| r__OV_N.hru | 156.69 | 18.44 |
| v__LAT_TTIME.hru | 5.60 | 176.31 |
| v__SLSOIL.hru | 103.83 | 93.97 |
| v__MSK_X.bsn | 0.20 | 0.07 |
| v__EVRCH.bsn | 0.41 | 0.22 |
| v__TRNSRCH.bsn | 0.29 | 0.28 |
| v__DEPIMP_BSN.bsn | 1899.40 | 4639.60 |
| v__TDRAIN_BSN.bsn | 23.32 | 23.52 |
| v__CH_K2.rte | 49.50 | 103.32 |
| v__GW_DELAY.gw | 452.73 | 444.90 |
| v__RCHRG_DP.gw | 0.24 | 0.12 |
| v__GWQMN.gw | 1918.62 | 2383.70 |
| v__SURLAG.bsn | 14.78 | 8.90 |
| v__SURLAG.hru | 0.12 | 0.27 |
| r__HRU_SLP.hru | - | 4.33 |
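The `r__` and `v__` prefixes in Table 3 follow the SWAT-CUP naming convention: `v__` replaces the existing parameter value with the fitted one, whereas `r__` scales the existing value by (1 + fitted value). The sketch below illustrates that convention only. The helper `apply_change` and the initial CN2 of 80 are hypothetical and are not taken from the study. The `a__` (additive) type is included for completeness although it does not appear in the table.

```python
def apply_change(identifier, fitted_value, current_value):
    """Apply a SWAT-CUP style parameter change to a single value.

    identifier    -- e.g. "r__CN2.mgt" or "v__EPCO.bsn"
    fitted_value  -- calibrated value reported for that identifier
    current_value -- value currently stored in the model input file
    """
    change_type, separator, _parameter = identifier.partition("__")
    if not separator:
        raise ValueError(f"missing change-type prefix in {identifier!r}")
    if change_type == "v":
        # Replace: the fitted value becomes the parameter value.
        return fitted_value
    if change_type == "r":
        # Relative: scale the existing value by (1 + fitted value).
        return current_value * (1.0 + fitted_value)
    if change_type == "a":
        # Absolute: add the fitted value to the existing value.
        return current_value + fitted_value
    raise ValueError(f"unknown change type {change_type!r}")


# Hypothetical starting values, used only to show the arithmetic.
print(apply_change("r__CN2.mgt", -0.095, 80.0))   # roughly 72.4
print(apply_change("v__EPCO.bsn", 0.26, 1.0))     # 0.26
```

Under this reading, a negative `r__CN2.mgt` lowers every curve number by the same percentage and so preserves the spatial pattern of CN2 across HRUs, while `v__` parameters take one basin-wide value.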
Table 4. Overview of the modelling strategies (XGBoost).
| Model | Definition |
|---|---|
| M4 | XGBoost model trained with ground-based meteorological data. |
| M5 | XGBoost forced with satellite-based meteorological data using M1-calibrated parameters. |
| M6 | XGBoost model trained with satellite-based meteorological data. |
Table 5. Overview of the modelling strategies (WGAN).
| Model | Definition |
|---|---|
| M7 | WGAN model trained with ground-based meteorological data. |
| M8 | WGAN forced with satellite-based meteorological data using M1-calibrated parameters. |
| M9 | WGAN model trained with satellite-based meteorological data. |
Table 6. Overview of the two modelling strategies.
| Model | Definition |
|---|---|
| M10 | The hybrid model in which the residual errors of the SWAT outputs are corrected using XGBoost. |
| M11 | The hybrid model in which the residual errors of the SWAT outputs are corrected using WGAN. |
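The residual-correction idea behind M10 and M11 can be sketched in a few lines: a learner is fitted to the SWAT residuals (observed minus simulated flow), and its prediction is added back to the SWAT output. The sketch below is a simplified stand-in, not the study's implementation. It swaps XGBoost and WGAN for a one-variable least-squares fit, assumes the SWAT flow itself is the only predictor, and uses synthetic numbers. All function names are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least squares for y ~ a + b * x (stand-in for the ML corrector)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda value: intercept + slope * value


def fit_residual_corrector(swat_flow, observed_flow):
    """Train the corrector on the SWAT residuals (observed - simulated)."""
    residuals = [o - s for o, s in zip(observed_flow, swat_flow)]
    return fit_linear(swat_flow, residuals)


def corrected_flow(swat_flow, corrector):
    """Hybrid output: SWAT simulation plus the predicted residual."""
    return [s + corrector(s) for s in swat_flow]


# Synthetic flows (not data from the study): SWAT underestimates by 1 + 10 %.
swat = [10.0, 20.0, 30.0, 40.0]
observed = [12.0, 23.0, 34.0, 45.0]

corrector = fit_residual_corrector(swat, observed)
print(corrected_flow(swat, corrector))
```

In practice the corrector would be trained on the calibration period and evaluated on a held-out validation period, and a nonlinear learner could take meteorological inputs alongside the SWAT output.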
