Application of Artificial Intelligence in Hydrological Modeling for Streamflow Prediction in Ungauged Watersheds: A Review

Gacu, Jerome G.; Monjardin, Cris Edward F.; Mangulabnan, Ronald Gabriel T.; Mendez, Jerime Chris F.

doi:10.3390/w17182722


by Jerome G. Gacu ^1,2, Cris Edward F. Monjardin ^3,*, Ronald Gabriel T. Mangulabnan ^3,4 and Jerime Chris F. Mendez ^3,5

¹ Department of Civil Engineering, Romblon State University, Odiongan 5505, Romblon, Philippines
² Disaster Prevention Research Institute, Kyoto University, Kyoto 611-0011, Japan
³ School of Civil, Environmental and Geological Engineering, Mapua University, Manila 1002, Philippines
⁴ Department of Civil Engineering, National University, Philippines, Manila 1008, Philippines
⁵ Engineering Department, College of Engineering and Computational Sciences, Partido State University, Goa 4422, Camarines Sur, Philippines
* Author to whom correspondence should be addressed.

Water 2025, 17(18), 2722; https://doi.org/10.3390/w17182722

Submission received: 3 August 2025 / Revised: 28 August 2025 / Accepted: 12 September 2025 / Published: 14 September 2025

(This article belongs to the Special Issue Application of Machine Learning in Hydrologic Sciences)


Abstract

Streamflow prediction in ungauged watersheds remains a critical challenge in hydrological science due to the absence of in situ measurements, particularly in remote, data-scarce, and developing regions. This review synthesizes recent advancements in artificial intelligence (AI) for streamflow modeling, focusing on machine learning (ML), deep learning (DL), and hybrid modeling frameworks. Three core methodological domains are examined: regionalization techniques that transfer models from gauged to ungauged basins using physiographic similarity and transfer learning; synthetic data generation through proxy variables such as NDVI, soil moisture, and digital elevation models; and model performance evaluation using both deterministic and probabilistic metrics. Findings from recent literature consistently demonstrate that AI-based models, especially Long Short-Term Memory (LSTM) networks and hybrid attention-based architectures, outperform traditional conceptual and physically based models in capturing nonlinear hydrological responses across diverse climatic and physiographic settings. The integration of AI with remote sensing enhances generalizability, particularly in ungauged and human-impacted basins. This review also addresses several persistent research gaps, including inconsistencies in model evaluation protocols, limited transferability across heterogeneous regions, a lack of reproducibility and open-source tools, and insufficient integration of physical hydrological knowledge into AI models. To bridge these gaps, future research should prioritize the development of physics-informed AI frameworks, standardized benchmarking datasets, uncertainty quantification methods, and interpretable modeling tools to support robust, scalable, and operational streamflow forecasting in ungauged watersheds.

Keywords:

streamflow prediction; ungauged basins; artificial intelligence; machine learning; remote sensing; regionalization; proxy variables; hydrological modeling; model evaluation; deep learning

1. Introduction

Hydrological models are vital tools in understanding and managing water resources, with streamflow prediction as one of their critical applications [1,2]. Accurate streamflow prediction is fundamental in water resource management, including flood management, irrigation design, reservoir operation, drought mitigation, and ecosystem protection [3,4]. Traditional hydrological models, such as the Soil and Water Assessment Tool (SWAT) [5], the Variable Infiltration Capacity (VIC) model, the Hydrologic Engineering Center’s Hydrologic Modeling System (HEC-HMS) [6,7], and Hydrologiska Byråns Vattenbalansavdelning (HBV) [8], are widely applied for such purposes and typically require extensive physical data inputs, including meteorological, topographical, and hydrological information [9,10,11,12,13,14]. These models range from empirical to physically based, but their performance often suffers when applied to new scenarios due to the complex interactions among the Earth’s systems [1]. Thus, their effectiveness diminishes significantly in ungauged or poorly monitored watersheds where data are scarce or unavailable [15,16,17].

In recent years, the increasing impacts of climate variability, extreme weather events, and land-use changes have further underscored the need for reliable streamflow prediction models across diverse hydrological contexts. Traditional models, while grounded in physical principles, often require intensive calibration and may struggle with non-stationary behaviors in basins experiencing rapid environmental transformation [13]. Furthermore, operational deployment of these models is frequently constrained by computational cost, data availability, and expert intervention. These limitations have driven growing interest in data-driven approaches, particularly artificial intelligence (AI), which can offer adaptable, scalable, and near-real-time solutions for hydrological forecasting without depending solely on dense ground-based monitoring networks [18,19]. This paradigm shift positions AI not merely as a stopgap for data scarcity but as a transformative complement to traditional hydrological science.

Ungauged watersheds are catchment areas lacking direct hydrological measurements, particularly streamflow data, which are essential for understanding and managing water resources [20]. A significant barrier to effective water resource planning is the limited availability of gauging stations, a problem that is especially prevalent in developing regions [21,22]. For instance, in Canada, nearly 90% of the terrestrial area is under-monitored, with almost 40% classified as ungauged based on the surface water monitoring density standards developed by the World Meteorological Organization [23,24]. In the Sahel region of Africa, many watersheds are still ungauged or poorly gauged, leading to challenges in assessing lake inflows and the overall runoff over their watersheds [25]. Similarly, in the Congo River Basin, despite its abundant freshwater resources, there is a lack of comprehensive hydrological data, making it challenging to assess the impacts of climate change on water availability [26]. The implications of this data scarcity are far-reaching: it not only hinders local water resource planning but also affects transboundary water governance, climate change adaptation, and disaster risk reduction efforts. The lack of reliable hydrological information in many parts of the world is primarily attributed to institutional challenges, including insufficient funding for monitoring infrastructure, limited technical capacity, and fragmented data management systems [27]. In some regions, geopolitical or regulatory barriers also impede the sharing and standardization of hydrological data. These systemic issues highlight the urgent need for scalable, cost-effective modeling approaches that can operate with limited in situ data. In this context, AI and satellite remote sensing have emerged as promising tools that can bridge data gaps, enhance model accuracy, and inform decision-making in ungauged and under-monitored basins [22,28].

A recent global study compiled daily streamflow data for 48,651 basins, yet this accounts for only a small fraction of the world’s watersheds, indicating that most remain ungauged or poorly gauged [29]. As reported by the International Association of Hydrological Sciences (IAHS), little progress was made from 2003 to 2012 in resolving the lack of data in ungauged or poorly gauged basins [27]. This widespread data scarcity poses a challenge to traditional hydrological modeling. Moreover, only a small percentage of the world’s watersheds are equipped with stream gauges, and these gauges are unevenly distributed globally [30]. In addition, among the existing gauged catchments worldwide, only 31.7% of gauging stations could accurately monitor total catchment discharge [31], which further compounds the difficulty of obtaining accurate data. The absence of monitoring infrastructure in ungauged watersheds leads to significant difficulties in understanding their hydrological behavior. Without streamflow data, it becomes challenging to calibrate hydrological models accurately, assess water availability, or predict flood events [32]. Considering these challenges, the hydrological community has increasingly turned to alternative data sources and modeling techniques to overcome the limitations of sparse ground observations. Satellite-based datasets, such as Global Precipitation Measurement (GPM) [33], Soil Moisture Active Passive (SMAP) [34], Moderate Resolution Imaging Spectroradiometer (MODIS) [35], and Gravity Recovery and Climate Experiment (GRACE) [36], now provide consistent, spatially distributed inputs for rainfall, soil moisture, vegetation cover, and water storage at regional to global scales.
Moreover, large-sample open-access hydrological databases, including Catchment Attributes and Meteorology for Large-Sample Studies (CAMELS) [37], the Hydrometeorological Sandbox - École de technologie supérieure (HYSETS) [38], and the Global Runoff Data Centre (GRDC) [39], have enabled the training of generalized models and benchmarking of AI-based approaches [40,41,42]. These resources have paved the way for the application of machine learning and deep learning techniques that can infer complex hydrological behavior from indirect indicators, estimate streamflow in ungauged locations, and simulate water cycle dynamics with minimal in situ calibration. This paradigm shift toward scalable, data-driven modeling is not only helping fill observational gaps but is also fostering reproducible and transferable workflows in global hydrological research [43].

In response, recent advances in AI present alternatives for hydrological modeling, specifically for streamflow modeling. These data-driven methods are capable of learning complex and nonlinear relationships between hydrological inputs and outputs without explicitly modeling the underlying physical processes [44]. Meanwhile, regional learning and transfer techniques provide a viable approach by training models on data from gauged watersheds and applying them to ungauged ones [43,45]. Developments in neural networks, support vector machines, ensemble models, and long short-term memory (LSTM) networks also show promising results in hydrological applications [46,47,48]. Machine learning algorithms, such as LSTM networks [49] and Extreme Gradient Boosting (XGBoost) [47], have demonstrated superior performance in streamflow prediction compared to traditional models. For instance, Arsenault et al. (2023) applied LSTM networks for continuous streamflow prediction in ungauged basins and found that they outperformed traditional hydrological models in most cases [50]. Similarly, Alipour (2023) utilized XGBoost for streamflow prediction in data-scarce regions, highlighting the importance of feature engineering and model explainability [51]. While specific percentages of AI adoption in hydrological modeling are not readily available, bibliometric analyses indicate a significant increase in AI applications within the field after 2018, particularly in areas like streamflow [52] and groundwater prediction [53], water quality assessment [54], and remote sensing [55]. As a result, AI-based models are increasingly recognized for their potential to overcome data scarcity limitations in ungauged basins and enhance streamflow prediction capabilities. However, despite the growing application of AI in hydrology, several gaps in the literature remain unaddressed.
While previous reviews have highlighted the use of AI in general hydrological modeling, few have explicitly focused on its role in ungauged or data-scarce watersheds, where it holds the greatest potential impact. In addition, limited attention has been given to recent advances in remote sensing integration, hybrid frameworks that combine physical models with machine learning, and explainable AI techniques that improve model transparency. Moreover, the challenge of transferability and generalization across diverse hydrological regions remains inadequately explored. These research gaps underscore the need for a focused and up-to-date synthesis of AI applications in ungauged basins, particularly in the context of real-world water resource management.

Thus, this review aims to synthesize the growing body of literature on the application of AI for streamflow prediction in ungauged watersheds. Specifically, it seeks to: (1) identify and categorize key AI techniques used in hydrological modeling; (2) assess their performance and adaptability under data-scarce conditions; (3) compare them with traditional physically based models in terms of accuracy, generalizability, and scalability; (4) examine the integration of satellite remote sensing and proxy datasets into AI workflows; and (5) discuss methodological limitations and emerging research directions. By addressing these objectives, particularly in the context of ungauged basins, this review responds to several critical gaps in the current literature, including the limited focus on transferability, remote sensing integration, and explainable AI. Ultimately, the review aims to provide a comprehensive reference for researchers, modelers, and decision-makers working to improve hydrological predictions in ungauged and poorly monitored regions.

2. Materials and Methods

This review followed a systematic literature search protocol aligned with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [56] to identify and evaluate relevant studies on the application of artificial intelligence (AI) in streamflow prediction for ungauged watersheds. The search was conducted across major academic databases, namely Scopus, ScienceDirect, IEEE Xplore, MDPI, and Google Scholar, as well as additional credible sources such as books, conference proceedings, and institutional repositories. The structured screening and selection process is illustrated in Figure 1.

The search strategy was developed using six primary keywords: Artificial Intelligence, Hydrological Modeling, Streamflow Prediction, Ungauged Watersheds, Machine Learning, and Data-Driven Models. These terms were used in various Boolean combinations (e.g., “Artificial Intelligence” AND “Ungauged Watersheds”; “Machine Learning” AND “Streamflow Prediction”) to ensure a comprehensive search. Filters were applied to include only peer-reviewed journal articles written in English, published between 2000 and 2025, and relevant to hydrology, water resources engineering, environmental science, or data-driven modeling.

Studies were included if they employed AI or machine learning (ML) techniques, such as artificial neural networks (ANN), support vector machines (SVM), random forests (RF), long short-term memory (LSTM) networks, or hybrid/ensemble models, for streamflow prediction specifically in ungauged or data-scarce basins. Research integrating remote sensing, transfer learning, or other advanced data-driven methods within the context of ungauged watersheds was also considered. Exclusion criteria comprised studies lacking an AI-based modeling component, as well as non-peer-reviewed and non-English-language publications (see Figure 1, Eligibility section).

A total of 239 records were initially identified. After removing duplicates and screening titles and abstracts, 163 full-text articles were assessed for eligibility. Based on the inclusion criteria, 120 studies were ultimately included in this review. An additional 33 studies were also cited to support specific ideas and contextual discussions. The screening and selection were independently conducted by two reviewers to minimize selection bias, with disagreements resolved through discussion or consultation with a third reviewer. The final set of studies was further classified by AI model type (e.g., ML, DL, hybrid), data sources (e.g., meteorological, remote sensing, topographic), geographic location, application context (e.g., flood prediction, drought modeling), and model evaluation methods (e.g., cross-validation, split-sample testing, NSE, KGE, RMSE). Results were synthesized narratively and thematically to identify trends, compare approaches, and highlight methodological gaps.

To address the challenges of synthesizing heterogeneous studies, this review employed bibliometric analysis (co-citation networks, keyword clustering, and thematic mapping) as an evidence-based structuring tool. This systematic mapping approach reduces narrative bias by revealing objective relationships among AI techniques, application domains, and research gaps. The narrative synthesis was guided by these bibliometric clusters, ensuring transparent and replicable categorization. The keyword co-occurrence network generated with VOSviewer version 1.6.20 (Figure 2) shows several clustered themes, in which larger nodes such as artificial intelligence, machine learning, streamflow prediction, and hydrological model dominate the landscape. The density of links between these nodes indicates strong interconnections, suggesting that recent studies increasingly combine traditional hydrological modeling with advanced AI approaches. Although the dataset covers 2000–2025, most publications are concentrated from 2018 to 2023, reflecting the surge of interest in applying deep learning and hybrid models during this period. This clustering highlights not only the rapid growth of AI-driven hydrology but also the thematic evolution of research priorities in the last five years.

To provide a temporal perspective, the reviewed studies were grouped by year of publication. As illustrated in Figure 3, a substantial share of the works appeared between 2021 and 2025, reflecting the increasing emphasis on recent applications of artificial intelligence in hydrology. The figure also shows that research activity accelerated markedly during this period compared to earlier years, highlighting a surge of interest in AI-driven modeling for ungauged watersheds. While this review emphasizes contemporary advancements, earlier studies were also incorporated when methodologically significant or foundational to the evolution of AI approaches. These earlier contributions serve as important baselines and offer methodological insights that continue to inform current research. Reference management software was used to organize citations and eliminate duplicates.

The bibliometric (Figure 2) and geographic analysis (Figure 3) reveal a significant regional imbalance: approximately 47% of reviewed studies originate from Asia and 35% from North America, while Africa and Oceania together represent less than 5%. This underrepresentation likely reflects persistent data scarcity, limited computational resources, and research funding disparities in developing regions. Such bias has practical implications, as AI models trained on data-rich catchments may not generalize to ungauged or data-poor basins common in the Global South. Future research should prioritize transfer learning, parameter regionalization, and collaborative data-sharing frameworks to enhance model applicability across diverse hydroclimatic conditions.

3. AI in Hydrology/Watershed Management

3.1. Hydrological Modeling for Streamflow Prediction in Ungauged Watersheds

Streamflow information is essential in a wide range of applications, including the planning and design of water resource projects, dry season water management, climate forecasting, early warning systems, and the assessment of hydropower potential at both regional and local scales [8]. Hydrological modeling is therefore central to streamflow prediction. Over the years, advancements in weather radar technology have enabled the generation of high-resolution spatiotemporal meteorological datasets, facilitating more accurate modeling [57,58].

Two broad classes are relevant here: distributed and lumped hydrological models. Distributed models subdivide a watershed into smaller units to capture the spatial heterogeneity of watershed features and atmospheric inputs [59,60]. Lumped models, although simpler, fail to capture such fine-scale variability, which is particularly problematic in ungauged basins where spatial inference is crucial [61].

However, these traditional hydrological models, whether distributed or lumped, often struggle to deliver reliable predictions in ungauged basins due to a lack of calibration data and the high uncertainty associated with model parameterization. This has motivated the integration of AI techniques that can learn directly from available data and complement or replace physically based approaches in such data-scarce environments [37,43].

3.1.1. Distributed Hydrologic Modeling (DHM)

One approach involves the use of distributed hydrologic modeling, which leverages spatial data and physically based models to simulate hydrological processes [62,63,64]. This class includes physically based distributed models, which simulate hydrological processes by representing the spatial variability of watershed characteristics [8,41]. Such models were conceived to assess the feasibility of representing a complete hydrologic system with a physically based mathematical model [65]. Models like MIKE Système Hydrologique Européen (SHE) [63], Distributed Hydrology Soil Vegetation Model (DHSVM), and Visualizing Ecosystem Land Management Assessments (VELMA) have been employed to capture the effects of land use changes on streamflow regimes across various scales [66]. These models require detailed input data, including topography, soil properties, and land cover, to accurately represent hydrological processes. For example, Ibrahim and Cordery [67] developed a physically based hydrological model designed to realistically simulate runoff generation using commonly available geophysical data from ungauged basins. The model utilized monthly rainfall, climatic data, and representative soil characteristics as input parameters. Nyabeze [68], in turn, applied a distributed parameter estimation approach to simulate runoff across various poorly gauged catchments in Zimbabwe. Runoff was modeled independently for each segment, with each segment representing a distinct combination of rainfall patterns, soil conditions, vegetation cover, and land use, all key factors influencing runoff behavior. However, physically based DHMs are computationally intensive due to their detailed representation of hydrological processes [69]. In contrast, AI-based or hybrid models can reduce computational load by learning statistical relationships among hydrological variables without explicitly modeling each physical process.
Studies suggest that combining DHM frameworks with AI components, such as neural networks for parameter estimation or surrogate modeling, can improve efficiency while preserving spatial fidelity [70,71]. The computational complexity of physically based DHMs can limit their applicability, especially for real-time forecasting or large-scale applications [41]. Findings indicated that, although current modeling techniques are capable of handling specific components (such as one- or two-dimensional unsteady soil moisture flow in heterogeneous soils, three-dimensional steady-state groundwater flow in non-homogeneous anisotropic formations, and one-dimensional open channel flow with lateral inflow), the present level of sophistication remains inadequate for constructing a fully integrated, physically based hydrologic model [65]. Efforts to simplify models or leverage cloud computing resources aim to address these challenges [72].

Incorporating real-time observations into DHMs through data assimilation techniques enhances their predictive capabilities [73]. The Ensemble Kalman Filter (EnKF) has been utilized to assimilate streamflow observations, update model states, and improve forecasts. Wang and Babovic proposed a hybrid data assimilation method that combines the Kalman Filter with local linear models to improve water level forecasts in nonlinear systems, enhancing accuracy even in data-scarce regions [74]. Similarly, variational assimilation approaches have been applied to update soil moisture states, benefiting both streamflow and soil moisture predictions [75]. These advances illustrate the potential for AI-assisted data assimilation, where machine learning algorithms are used to update model states or correct biases in observation data dynamically. Deep learning-based surrogate models have also been proposed to approximate the behavior of complex data assimilation frameworks, improving computational efficiency in real-time forecasting systems [76].
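
As an illustration of the EnKF idea mentioned above, the following is a minimal sketch of one analysis step, not an implementation taken from any of the cited studies. It assumes the stochastic (perturbed-observation) variant, a single scalar streamflow observation, and a toy two-variable state; the function name `enkf_update`, the state layout, and all numbers are hypothetical.

```python
import random
from statistics import fmean


def enkf_update(ensemble, obs, obs_var, obs_index, rng):
    """One stochastic EnKF analysis step for a single scalar observation.

    ensemble  : list of state vectors (one list of floats per member)
    obs       : observed value of the state component at obs_index
    obs_var   : observation error variance
    """
    n = len(ensemble)
    n_state = len(ensemble[0])
    predicted = [member[obs_index] for member in ensemble]  # model equivalent of the observation
    pred_mean = fmean(predicted)
    state_mean = [fmean(member[i] for member in ensemble) for i in range(n_state)]

    # Sample variance of the predicted observation and its covariance with each state.
    pred_var = sum((p - pred_mean) ** 2 for p in predicted) / (n - 1)
    cross_cov = [
        sum((member[i] - state_mean[i]) * (p - pred_mean)
            for member, p in zip(ensemble, predicted)) / (n - 1)
        for i in range(n_state)
    ]
    gain = [c / (pred_var + obs_var) for c in cross_cov]  # Kalman gain, one entry per state

    updated = []
    for member, p in zip(ensemble, predicted):
        # Each member assimilates its own noisy copy of the observation.
        perturbed_obs = obs + rng.gauss(0.0, obs_var ** 0.5)
        updated.append([x + k * (perturbed_obs - p) for x, k in zip(member, gain)])
    return updated


rng = random.Random(42)
# Hypothetical forecast ensemble: [soil storage (mm), streamflow (m3/s)].
forecast = []
for _ in range(50):
    storage = rng.gauss(100.0, 10.0)
    forecast.append([storage, 0.1 * storage + rng.gauss(0.0, 0.5)])

analysis = enkf_update(forecast, obs=12.0, obs_var=0.25, obs_index=1, rng=rng)
```

Because storage and streamflow are correlated across the ensemble, the streamflow observation also adjusts the unobserved storage state, which is the mechanism by which assimilation updates model states rather than only the forecast variable.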

Another development is the integration of remote sensing into DHMs. Satellite remote sensing (SRS) provides valuable data for model calibration, especially in data-scarce regions [76]. Recent studies have explored the integration of SRS-derived precipitation (GPM; Climate Hazards Group InfraRed Precipitation with Station data, CHIRPS), evapotranspiration, and soil moisture (SMAP, MODIS, GRACE) with AI-based models to enhance predictive accuracy in ungauged basins [77,78,79]. These combinations enable indirect observation of hydrological variables that would otherwise be unavailable, making them highly relevant for machine learning applications. Studies have explored the simultaneous calibration of hydrological models using streamflow and SRS data, aiming to reduce parameter uncertainty and improve model realism. However, integrating multiple data sources can introduce conflicting information, necessitating careful consideration of data quality and relevance [75]. Calibration is further complicated by equifinality, the presence of numerous parameter sets yielding similar model outputs. Additional data sources such as SRS can help constrain parameter estimates, but because they may also conflict with one another, robust calibration strategies are needed [66]. While distributed models offer detailed spatial representation, their practical limitations in ungauged contexts highlight the growing need for data-driven and hybrid modeling frameworks, many of which leverage the learning capacity of AI to address data scarcity and uncertainty in hydrological prediction.
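
To make the link between equifinality and multi-source calibration concrete, the sketch below shows one possible way to combine a streamflow term and a soil moisture term into a single calibration objective. It is an assumption-laden illustration rather than a method from the cited studies: the weighting scheme, the use of NSE for both variables, the function `joint_objective`, and all data values are hypothetical, and the satellite soil moisture is assumed to be already rescaled to the model's units.

```python
from statistics import fmean


def nse(obs, sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_obs = fmean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def joint_objective(q_obs, q_sim, sm_obs, sm_sim, weight_q=0.5):
    """Weighted calibration loss over streamflow (q) and soil moisture (sm); lower is better."""
    loss_q = 1.0 - nse(q_obs, q_sim)
    loss_sm = 1.0 - nse(sm_obs, sm_sim)
    return weight_q * loss_q + (1.0 - weight_q) * loss_sm


# Hypothetical observations and simulations from two candidate parameter sets (A and B).
q_obs = [2.0, 4.0, 6.0, 4.0]
sm_obs = [0.20, 0.30, 0.35, 0.28]
q_sim_a, sm_sim_a = [2.5, 3.5, 6.5, 3.5], [0.21, 0.29, 0.34, 0.28]
q_sim_b, sm_sim_b = [1.5, 4.5, 5.5, 4.5], [0.30, 0.25, 0.25, 0.35]

# Both candidates fit streamflow equally well (equifinal on streamflow alone) ...
nse_q_a, nse_q_b = nse(q_obs, q_sim_a), nse(q_obs, q_sim_b)
# ... but the soil moisture term separates them.
loss_a = joint_objective(q_obs, q_sim_a, sm_obs, sm_sim_a)
loss_b = joint_objective(q_obs, q_sim_b, sm_obs, sm_sim_b)
```

The choice of `weight_q` is exactly where conflicting information between data sources surfaces: a weight near 1 ignores the satellite product, while a weight near 0 lets a biased product dominate the calibration.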

3.1.2. Conceptual and Lumped Hydrological Modeling

Conceptual and lumped rainfall–runoff models, such as Hydrologiska Byråns Vattenbalansavdelning (HBV), Génie Rural à 4 paramètres Journalier (GR4J), and Nedbør-Afstrømnings-Model (NAM), continue to play a crucial role in streamflow prediction within ungauged basins, owing to their moderate complexity and low data demands. A recent study employing nested cross-validation across seven semi-arid sub-catchments in Morocco found Nash–Sutcliffe efficiencies (NSE) ranging from 0.5 to 0.8 and Kling–Gupta efficiencies (KGE) between 0.1 and 0.9, highlighting the HBV model’s spatial transferability and moderate-to-high performance in data-scarce regions [80,81,82,83,84]. Meanwhile, GR4J and NAM have demonstrated similarly promising results: Kuana et al. (2024), for example, reported that GR4J parameters regionalized via spatial proximity and similarity criteria achieved daily NSEs above 0.7 in over 65% of test basins, underscoring the method’s feasibility [85].
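
For readers unfamiliar with the two scores quoted above, the sketch below computes NSE and KGE for a pair of series. It assumes the original 2009 formulation of KGE (correlation, variability ratio, and bias ratio combined as a Euclidean distance from the ideal point); the reviewed studies may use modified variants, and the flow values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = fmean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    sd_obs, sd_sim = pstdev(obs), pstdev(sim)
    cov = fmean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = cov / (sd_obs * sd_sim)   # linear correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mean_sim / mean_obs    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily flows in m3/s, for illustration only.
observed = [1.2, 3.4, 8.9, 5.1, 2.7, 1.9]
simulated = [1.0, 3.0, 7.5, 5.8, 3.1, 2.0]
nse_value = nse(observed, simulated)
kge_value = kge(observed, simulated)
```

Both scores equal 1 for a perfect simulation, but they penalize errors differently, which is one reason the same model can report a narrow NSE range and a much wider KGE range across basins.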

Despite their operational simplicity, lumped models often face criticism for their inability to capture the spatial heterogeneity of hydrological processes, particularly in basins with significant variations in land use, topography, and rainfall distribution. However, their reduced computational demands and ease of implementation continue to make them attractive for large-scale or rapid assessments, especially when paired with modern parameter estimation or regionalization techniques [86,87].

Parameter regionalization methods are essential to adapting lumped models for ungauged contexts. Spatial proximity, the most straightforward approach, relies on donor catchments close to the target basin and often provides surprisingly robust performance in homogeneous terrains. Physical similarity methods, which match basins based on physiographic and climatic attributes such as size, elevation, soil, and land cover, offer enhanced parameter transfer accuracy, as validated in regionalization case studies [84]. Regression and clustering approaches, including ensemble methods like Quantile Random Forests and general-purpose time-series feature selection, have further improved estimation of model parameters by capturing nonlinear dependencies and uncertainty bounds when calibrating in ungauged catchments [88].
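
The physical similarity approach described above can be sketched as a nearest-donor search in attribute space. The example is a simplified illustration under stated assumptions: attributes are scaled by their standard deviation across donors, similarity is Euclidean distance, and the parameters of the `k` closest donors are averaged with equal weight. The function `regionalize_by_similarity`, the attribute set, and the parameter values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def regionalize_by_similarity(donors, target_attrs, k=2):
    """Transfer parameters to an ungauged basin from its k most similar gauged donors.

    donors       : dict mapping name -> (attribute list, calibrated parameter list)
    target_attrs : attribute list of the ungauged basin, same order as the donors
    Returns the chosen donor names (most similar first) and the averaged parameters.
    """
    names = list(donors)
    n_attr = len(target_attrs)
    # Scale each attribute by its spread across donors so that units do not dominate.
    scales = [pstdev([donors[name][0][j] for name in names]) or 1.0 for j in range(n_attr)]

    def distance(attrs):
        return math.sqrt(sum(((a - t) / s) ** 2 for a, t, s in zip(attrs, target_attrs, scales)))

    chosen = sorted(names, key=lambda name: distance(donors[name][0]))[:k]
    n_par = len(donors[chosen[0]][1])
    params = [fmean(donors[name][1][p] for name in chosen) for p in range(n_par)]
    return chosen, params


# Hypothetical donors: attributes = [area (km2), mean elevation (m), forest fraction];
# parameters = two illustrative conceptual-model parameters.
donors = {
    "A": ([120.0, 450.0, 0.60], [250.0, 0.30]),
    "B": ([900.0, 1500.0, 0.20], [400.0, 0.10]),
    "C": ([150.0, 500.0, 0.55], [270.0, 0.26]),
}
chosen, params = regionalize_by_similarity(donors, [140.0, 480.0, 0.58], k=2)
```

Replacing the attribute vectors with catchment centroid coordinates turns the same routine into a spatial proximity scheme, which makes the two strategies easy to compare on the same donor set.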

Recent studies have explored coupling conceptual models with machine learning algorithms to optimize parameter calibration, improve forecasting accuracy, and reduce uncertainty. For example, support vector machines and Bayesian optimization have been used to calibrate HBV parameters [63] under sparse data conditions, achieving improved generalization in comparative tests [89]. Such hybrid approaches bridge the gap between empirical transferability and model-based understanding, offering practical solutions for ungauged settings.

Long Short-Term Memory (LSTM) networks have emerged as a dominant deep learning approach in streamflow modeling, owing to their ability to capture long-range temporal dependencies in hydrological processes. Applications across different temporal scales (daily [43], hourly, and monthly) show that data resolution and dataset length substantially influence model accuracy and generalization capacity. Typically, dataset partitioning follows a training-validation-testing scheme, with proportions such as 70–15–15% or 80–10–10% being standard in hydrological AI applications [46]. The performance of LSTMs is also sensitive to hyperparameter selection, including the number of hidden layers, learning rate, batch size, and dropout regularization, which must be optimized through cross-validation or automated tuning techniques [19]. To address limitations in purely data-driven models, hybrid algorithms have been increasingly applied, combining LSTM with physical models, feature extraction methods, or uncertainty quantification frameworks, thereby enhancing robustness in ungauged or data-scarce basins. These considerations position LSTM and its variants as a bridge between conceptual lumped models (HBV, GR4J, NAM) and emerging AI-driven approaches, highlighting their adaptability across different hydrological contexts.
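
The 70–15–15% partitioning mentioned above can be sketched as follows. The split is done chronologically, without shuffling, which is an assumption on our part: it is a common precaution for time series so that the test period lies strictly after the training period, but the reviewed studies do not all state how their partitions were drawn. The function name and the synthetic record are hypothetical.

```python
def chronological_split(series, train_frac=0.70, val_frac=0.15):
    """Split a time-ordered sequence into train, validation, and test blocks without shuffling."""
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]


# Hypothetical 1000-day record; integers stand in for daily samples.
record = list(range(1000))
train, val, test = chronological_split(record)
```

Passing `train_frac=0.80, val_frac=0.10` gives the 80–10–10% scheme with the same function.
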
Hydrological signature transfer adds a deeper behavioral dimension to model regionalization by aligning indicators such as Flow Duration Curves (FDC) [83], Baseflow Index (BFI), and seasonality across basins. An innovative study framework simultaneously predicted FDCs using process-based and regionalized data, proving effective in ungauged basins [90]. Studies have compared rainfall–runoff model performance calibrated against FDCs and found that maintaining hydrograph timing often yields better predictions than using FDCs alone. However, combining FDC slope with rising-limb metrics helps improve signature-based regionalization [81]. A novel hydrologic efficiency metric (SHE) further refined signature regionalization by aligning FDC slope, BFI, and seasonal amplitude. SHE matched parameter-estimation accuracy within one signature unit in 78% of catchments [91]. Additionally, a study introduced a model-free method to estimate FDCs in partially gauged basins by conditioning donor-site FDCs on precipitation indices, emphasizing that FDC shape reflects both basin and climate influences [92].
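
As a concrete reference for the signatures discussed above, the sketch below builds an empirical flow duration curve and derives a slope signature from it. Two choices are assumptions rather than details taken from the cited studies: exceedance probabilities use the Weibull plotting position, rank / (n + 1), and the slope is taken on log-transformed flows between the 33% and 66% exceedance points. The flow values are hypothetical.

```python
import math


def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs, highest flow first."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [((rank + 1) / (n + 1), q) for rank, q in enumerate(ranked)]


def flow_at_exceedance(fdc, prob):
    """Linearly interpolate the flow that is exceeded with probability `prob`."""
    for (p0, q0), (p1, q1) in zip(fdc, fdc[1:]):
        if p0 <= prob <= p1:
            return q0 + (q1 - q0) * (prob - p0) / (p1 - p0)
    raise ValueError("probability outside the range covered by the record")


def fdc_slope(fdc, low=0.33, high=0.66):
    """Slope of the log-transformed FDC between two exceedance probabilities."""
    q_low = flow_at_exceedance(fdc, low)
    q_high = flow_at_exceedance(fdc, high)
    return (math.log(q_low) - math.log(q_high)) / (high - low)


# Hypothetical daily flows in m3/s.
flows = [0.8, 1.1, 1.5, 2.0, 2.6, 3.5, 5.0, 7.5, 12.0]
fdc = flow_duration_curve(flows)
slope = fdc_slope(fdc)
```

A steeper slope indicates a flashier flow regime, so comparing this value between a donor and a target basin gives a simple check on whether a signature-based transfer is plausible.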

Table 1 outlines the comparative overview of distributed and conceptual hydrological models, highlighting their respective strengths, limitations, and suitability for ungauged watershed applications. In summary, conceptual and lumped models remain essential tools in hydrology, particularly when adapted through advanced regionalization strategies or hybridized with AI methods. While their structural simplicity may limit representation of spatial processes, ongoing innovations, such as behavioral signature alignment and data-driven parameter estimation, demonstrate their evolving potential. These developments support the broader transition toward integrated modeling approaches that can achieve robust performance in both gauged and ungauged contexts.

3.2. Applications and Evolution of AI in Hydrology

The integration of AI into hydrology represents a transformative shift in how hydrological processes are understood, modeled, and predicted. AI, broadly defined as the ability of machines to mimic human cognitive functions, has been adopted across environmental sciences due to its capacity to handle complex, nonlinear, and high-dimensional data. The main AI approaches in hydrology, including their strengths and limitations, are summarized in Table 2. The evolution of AI in hydrology can be traced from early machine learning (ML) applications to recent advances in deep learning (DL) and hybrid AI frameworks. These tools have enabled researchers to analyze vast hydrometeorological datasets, often derived from satellites, ground sensors, and remote sensing technologies, to produce more accurate streamflow predictions and other hydrological forecasts. They can also automatically extract patterns from large-scale datasets, such as satellite-based precipitation, soil moisture, and topographic information (digital elevation models or DEMs), thereby improving flood prediction and water availability assessments in ungauged regions.

Early implementations of AI in hydrology used Artificial Neural Networks (ANNs), which were adequate but limited in scope and interpretability. Over time, more advanced architectures like Support Vector Machines (SVM), Random Forests (RF), and Gradient Boosting Machines (GBM) gained traction due to their improved performance in classification and regression tasks. A review by Mosavi et al. (2019) documented the broad adoption of ML in flood forecasting and hydrological modeling [44]. Concurrently, Shen (2018) argued for a data-driven revolution in hydrology, highlighting the role of AI in capturing spatiotemporal dependencies often missed by conventional methods [19].

AI applications in hydrology can be broadly classified into three methodological categories: classical machine learning (ML), deep learning (DL), and hybrid AI or physics-informed models. ML algorithms such as RF and SVM have demonstrated excellent capabilities in predicting floods, mapping drought susceptibility, and estimating evapotranspiration. For example, Seleem et al. (2022) found that RF models outperformed other techniques in flood susceptibility mapping [94]. Likewise, Li et al. (2021) successfully used ML to forecast droughts in data-sparse regions [95]. AI techniques such as neuro-fuzzy systems have been applied for rainfall-runoff modeling in local river basins [104]. These models are generally easier to interpret than DL methods, but they still face challenges with extrapolation beyond the training range.

Deep learning, a subset of ML, has been particularly influential in modeling time-series hydrological data. Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTM), and Convolutional Neural Networks (CNNs) dominate this domain. Kratzert et al. (2019) [43] demonstrated the power of LSTMs in rainfall-runoff modeling across thousands of basins, outperforming physically based benchmarks. These networks are capable of learning long-term dependencies in data, which is critical for hydrological forecasting.

While NSE and R² remain common in hydrological model evaluation, they are not sufficient alone to capture error characteristics, particularly for deep learning models such as LSTM. Recent studies recommend incorporating additional error indices, including the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Relative Error (MARE), to provide a more comprehensive assessment. RMSE emphasizes large deviations and is sensitive to peak flow errors, making it useful for flood-related applications. MAE, by contrast, offers a scale-dependent but more robust measure of overall prediction accuracy across flow conditions. MARE further normalizes the absolute error by observed values, allowing comparison across basins with different hydrological regimes. When jointly applied, these metrics provide complementary insights into model robustness, error distribution, and applicability in data-scarce or ungauged basins, strengthening the interpretability and reliability of LSTM-based predictions [43].
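
The three error indices described above follow directly from their definitions. The sketch below is a minimal standard-library implementation. The function names are illustrative, and time steps with zero observed flow are rejected for MARE because the relative error is undefined there.

```python
import math


def rmse(obs, sim):
    """Root Mean Square Error: squaring gives large deviations more weight."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean Absolute Error: average error magnitude in the units of the flow."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def mare(obs, sim):
    """Mean Absolute Relative Error: absolute error scaled by the observed value."""
    if any(o == 0 for o in obs):
        raise ValueError("MARE is undefined when an observed value is zero")
    return sum(abs(s - o) / abs(o) for o, s in zip(obs, sim)) / len(obs)


# Hypothetical observed and simulated flows (same units).
observed = [1.0, 2.0, 4.0]
simulated = [2.0, 2.0, 2.0]
print(rmse(observed, simulated), mae(observed, simulated), mare(observed, simulated))
```

In this toy case the largest error occurs at the highest flow, so RMSE exceeds MAE, while MARE is dominated by the error at the lowest flow.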

Despite these successes, purely data-driven models often lack interpretability and physical meaning. This has led to a growing interest in hybrid and physics-informed AI models. These approaches combine domain knowledge with machine learning, ensuring that outputs adhere to physical laws while still leveraging data-driven insights. Karpatne et al. (2017) introduced the concept of theory-guided data science to bridge the gap between empirical learning and physical modeling [96]. Chen et al. (2006) reviewed hybrid models that fuse conceptual rainfall-runoff models with neural networks for more robust hydrological simulations [97].

Comparisons between AI and traditional physically based models reveal fundamental differences. Physically based models such as SWAT, VIC, and HEC-HMS simulate hydrological processes through governing equations. While they provide mechanistic insights, they require extensive input data and calibration and are computationally expensive. In contrast, AI models, particularly LSTMs and CNNs, can model complex relationships directly from data. Arsenault (2023) showed that regional LSTM models trained on catchment datasets outperformed traditional hydrological models in ungauged basins [50]. However, concerns remain about their lack of transparency and generalization capabilities. Bhasme (2021) emphasized that while AI models often match or exceed traditional models in predictive accuracy, they fail to capture hydrological realism without additional constraints [98]. There are also efforts to increase interpretability through explainable AI (XAI). Núñez (2023) introduced interpretable neural networks tailored for hydrological forecasting, offering insights into model decision-making [99]. Others, such as Lipton (2018), provide general frameworks for understanding the internal workings of deep networks [100].

Beyond conventional machine learning and deep learning methods, physics-informed neural networks (PINNs) and graph neural networks (GNNs) have recently gained traction in hydrological applications. PINNs integrate governing physical equations (for example, continuity or momentum) directly into the neural network loss function, thereby improving model interpretability, reducing physically inconsistent predictions, and addressing the “black-box” criticism often directed at AI models [105]. In parallel, GNNs provide a natural framework for hydrology by representing river basins as graphs, where nodes correspond to sub-catchments and edges represent flow connectivity. This architecture allows effective learning of spatial dependencies across large, interconnected watersheds, with recent studies demonstrating superior performance in streamflow prediction and runoff routing compared to traditional RNNs.
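
To make the idea of a physics-informed loss concrete, the sketch below adds a water-balance (continuity) penalty to an ordinary data-fit term. It is a simplified numerical illustration under stated assumptions: a lumped storage series is available, all fluxes are depths per time step, and the continuity residual is evaluated with forward differences rather than the automatic differentiation a real PINN would use. All names and the weighting factor are hypothetical.

```python
def mean_squared_error(a, b):
    """Average squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def water_balance_residual(storage, precip, evap, q_sim, dt=1.0):
    """Residual of dS/dt = P - ET - Q, evaluated with forward differences.

    `storage` holds T + 1 values; `precip`, `evap`, and `q_sim` hold T values.
    """
    return [
        (storage[t + 1] - storage[t]) / dt - (precip[t] - evap[t] - q_sim[t])
        for t in range(len(storage) - 1)
    ]


def physics_informed_loss(q_obs, q_sim, storage, precip, evap, weight=1.0):
    """Data misfit plus a weighted penalty on water-balance violations."""
    data_term = mean_squared_error(q_obs, q_sim)
    residual = water_balance_residual(storage, precip, evap, q_sim)
    physics_term = sum(r * r for r in residual) / len(residual)
    return data_term + weight * physics_term


# Hypothetical two-step example (all values in mm per time step).
storage = [10.0, 12.0, 11.0]
precip = [5.0, 2.0]
evap = [1.0, 1.0]
print(physics_informed_loss([2.0, 2.0], [3.0, 2.0], storage, precip, evap))
```

A simulated flow that fits the observations but violates the water balance is still penalized, which is the mechanism by which such losses discourage physically inconsistent predictions.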

Recent advances further emphasize the importance of hybrid physics–AI approaches, particularly PINNs, which embed hydrological governing equations into neural network loss functions to ensure physical consistency. These methods, alongside hybrid frameworks such as LSTM–SWAT couplings, have demonstrated improved generalization in ungauged basins. In parallel, explainable AI (XAI) techniques such as SHAP, LIME, and attention mechanisms are increasingly used to interpret black-box models, helping identify key hydrological drivers and enhancing model transparency. Another promising development is the application of GNNs, which naturally capture river network connectivity by representing basins as nodes and flow paths as edges, thereby improving basin-scale predictions. Collectively, these innovations broaden the technical landscape of AI in hydrology, making models both more robust and interpretable.

Moreover, the lack of standardized benchmarking across AI hydrology studies often hinders fair performance comparison and model reproducibility. Several challenges must be addressed. First, many AI models are prone to overfitting, especially in data-limited regions. Second, extrapolation beyond historical data, critical under changing climate or land use, is a well-known weakness. Third, there is a need for large, labeled datasets to train deep models effectively. Efforts like CAMELS-US [101] and CAMELS-GB [42] have helped mitigate these limitations by offering standardized, high-resolution hydrological data across catchments. Despite these challenges, ongoing advances in hybrid modeling, data availability, and cloud-based computing are paving the way for more robust AI-based hydrological tools.

As we move forward, the future of AI in hydrology will likely lie in hybridization—combining the strengths of data-driven learning with the interpretability and robustness of physics-based modeling. This shift is supported by studies such as Zhong (2024) [102], which show that hybrid AI systems perform better in basins with limited observations. Similarly, Eythorsson (2025) [103] warns that blind trust in black-box models may erode scientific understanding unless AI is embedded within a physically meaningful framework.

3.3. AI for Streamflow Prediction in Ungauged Watersheds

Accurate streamflow prediction is vital for water resources planning, flood risk mitigation, and ecosystem management. However, many watersheds remain ungauged or poorly monitored, limiting the efficacy of traditional modeling methods. To address this, artificial intelligence, particularly ML models like RF, SVM, and LSTM networks, has emerged as a powerful tool that can capture nonlinear hydrological dynamics using alternative data inputs [93,106].

This section delves into three key themes. First, the concept of regionalization involves transferring trained models from gauged to ungauged sites based on catchment similarities, leveraging transfer learning and basin descriptors [93,106]. Second, the synthetic generation of data using proxy variables, such as DEM-based features, satellite-derived soil moisture, and hydrological signatures, enables AI models to be trained despite missing flow observations [37,88]. Third, model performance assessment examines how AI regionalization compares with conventional hydrological models, using metrics like Nash–Sutcliffe efficiency (NSE) [49] and Kling–Gupta efficiency (KGE); studies report significant accuracy gains in diverse ungauged contexts [38,107]. Together, these developments underscore AI’s shifting role in transforming streamflow prediction in data-scarce basins.

3.3.1. Regionalization: Transferring Models from Gauged to Ungauged Sites

Among various AI strategies, two major data-driven pathways have emerged: transferring models from gauged to ungauged sites and training models on synthetic or proxy datasets. The regionalization of hydrological models is a widely used method for extending streamflow prediction to ungauged basins, particularly in regions lacking in situ measurements. Classical techniques transfer calibrated model parameters or structures from gauged catchments to similar ungauged ones, using metrics based on spatial proximity, physiographic attributes, and climate similarity. These descriptors, such as catchment slope, drainage area, and precipitation regime, serve as proxies for runoff-generating processes [36,69]. With the growing use of machine learning and data-driven models, regionalization approaches have shifted toward generalized frameworks that leverage large hydrological datasets. Among these, LSTM neural networks have emerged as a prominent tool. When trained across hundreds of basins using meteorological time series and static catchment attributes, LSTM models have demonstrated enhanced predictive capabilities for ungauged locations [46].
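
The classical similarity-based step described above, choosing donor catchments whose descriptors resemble those of the ungauged target, can be sketched as a nearest-neighbor search in standardized attribute space. This is only an illustration. The attribute set (slope, drainage area, mean annual precipitation), the Euclidean distance, and the catchment values are assumptions, not a method taken from a specific cited study.

```python
import math


def standardize(rows):
    """Z-score each attribute column so that no single unit dominates the distance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def nearest_donors(gauged, target, k=1):
    """Return the k gauged catchments most similar to the ungauged target."""
    names = list(gauged)
    scaled = standardize([gauged[name] for name in names] + [target])
    target_scaled = scaled[-1]
    distance = {
        name: math.dist(row, target_scaled) for name, row in zip(names, scaled)
    }
    return sorted(names, key=distance.get)[:k]


# Hypothetical descriptors: [slope (%), drainage area (km2), precipitation (mm/yr)].
gauged = {
    "G1": [5.0, 120.0, 900.0],
    "G2": [25.0, 40.0, 2400.0],
    "G3": [6.0, 150.0, 1000.0],
}
ungauged = [5.5, 130.0, 950.0]
print(nearest_donors(gauged, ungauged, k=2))
```

LSTM-based regionalization replaces this explicit donor search by feeding the same static attributes to a single model trained across many basins, but the descriptors play a comparable role in both cases.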

Model performance improves further when training involves diverse datasets that represent a wide range of hydroclimatic and physiographic conditions [43]. Recent applications of LSTM-based models have shown significant improvements over traditional conceptual models in predicting streamflow in ungauged basins. These models effectively encode spatial information through catchment descriptors and temporal dynamics through sequential rainfall and temperature inputs [38,50]. Training LSTM models using large-sample hydrology datasets has proven essential in enhancing their generalization ability. Unlike models trained on individual basins, those exposed to a broader spectrum of conditions are more robust when transferred to new locations [108].

Data-driven models have also been used to create gridded runoff datasets in regions with limited monitoring infrastructure. By integrating satellite-derived meteorological inputs, LSTM networks are capable of generating continuous runoff predictions across sparsely monitored catchments [109]. Further developments have explored the use of reanalysis data and global-scale hydrological datasets to train generalized models capable of performing in various climatic and geographic settings [101]. While the performance gains are notable, several challenges persist in transferring models to ungauged basins across diverse hydro-climatic contexts. These include inconsistencies in metadata, variations in data quality, and anthropogenic influences such as land use changes, reservoir operations, and water withdrawals. Moreover, the absence of standardized similarity metrics complicates the process of identifying appropriate donor catchments. Improving transferability in future regionalization efforts depends on expanding large-sample training datasets, refining catchment descriptors, and integrating human-altered basin characteristics into model development [28].

3.3.2. Synthetic Generation of Data and Model Training Using Proxy Variables

The absence of direct streamflow measurements in ungauged watersheds presents a significant challenge in hydrological modeling. To address this, researchers have turned to synthetic data generation and proxy-variable modeling, using inputs such as soil moisture, vegetation indices, terrain slope, and precipitation. These proxies, obtained from satellite products, DEMs, and reanalysis datasets, provide the basis for training AI models that simulate streamflow dynamics where gauges are unavailable [45,80]. Recent innovations in machine learning have made it possible to simulate runoff using models trained exclusively on proxy variables. One study used LSTM networks with gridded meteorological inputs and physiographic variables to develop a regional runoff dataset. This approach demonstrated high spatial accuracy and generalization potential, even in areas lacking ground measurements [109]. Soil moisture is another widely used proxy. A study applied ANNs using satellite-based soil moisture as input and achieved robust streamflow forecasts across multiple ungauged catchments. This was particularly effective for seasonal predictions, where antecedent soil conditions played a critical role [110]. Satellite-derived NDVI and soil texture have also been shown to enhance discharge simulation in regional models. For instance, integrating these variables into a large-scale hydrological model yielded significant improvements during dry-season events when vegetation and surface conditions strongly influence runoff generation [111]. Synthetic streamflow series can also be generated using Monte Carlo simulations and physically based models. These datasets, often calibrated using regional flow statistics (e.g., baseflow index or peak runoff percentiles), are used to pre-train machine learning models, improving accuracy in predicting both average and extreme flows [112]. 
To optimize input selection, feature importance techniques such as SHAP analysis and mutual information have been employed. These methods consistently highlight proxies like precipitation, NDVI, soil moisture, and terrain slope as top predictors for runoff generation across varying hydro-climatic conditions [109,111].

However, several limitations remain. Proxy data vary in spatial and temporal resolution, and differences in satellite retrieval algorithms can affect their reliability. Furthermore, some catchments exhibit hydrologic behaviors that general proxy-based models may not capture well. Continued efforts are needed to develop region-specific proxy datasets, incorporate advanced feature engineering, and deploy transfer learning approaches to enhance model adaptability in diverse environments [110].

3.3.3. Model Performance Assessment

Evaluating the performance of AI models for streamflow prediction in ungauged basins is essential to ensure their reliability, especially when applied to regions with limited or no in situ observations. Because ground-truth discharge data are typically unavailable in such catchments, model validation relies on indirect assessment techniques, analog gauged catchments, synthetic benchmarking frameworks, and performance metrics tailored to hydrological behavior across scales.

One of the most common validation strategies is leave-one-out cross-validation (LOOCV) [1], in which one gauged basin is excluded from the training dataset and treated as ungauged during testing. LOOCV is widely used in regionalization studies and large-sample hydrology to simulate ungauged conditions, because it evaluates model robustness and transferability across spatial domains with differing hydrometeorological characteristics [43,108]. The approach is especially useful in large-sample hydrology, where a wide variety of climatic and physiographic conditions are represented, and it provides insight into the generalization capabilities of machine learning models. Model skill is typically quantified using metrics such as the NSE, KGE, Root Mean Squared Error (RMSE), and bias, which offer complementary perspectives on model performance. NSE, while widely used, is particularly sensitive to peak flows and may mask low-flow errors. KGE combines correlation, variability ratio, and bias components into a single, interpretable score. RMSE penalizes large errors, and bias quantifies the systematic over- or underestimation of streamflow [43].
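
The leave-one-basin-out procedure and the two efficiency scores can be sketched as follows. The KGE shown is the original three-component form (linear correlation, ratio of standard deviations, ratio of means). Basin identifiers and flow values are hypothetical, and model training itself is omitted.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - error / variance


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    std_obs, std_sim = _std(obs), _std(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (std_obs * std_sim)
    alpha = std_sim / std_obs
    beta = mean_sim / mean_obs
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def leave_one_basin_out(basin_ids):
    """Yield (training basins, held-out basin), treating each basin once as ungauged."""
    for held_out in basin_ids:
        yield [b for b in basin_ids if b != held_out], held_out


for training, pseudo_ungauged in leave_one_basin_out(["A", "B", "C"]):
    # A regional model would be fitted on `training` and scored on `pseudo_ungauged`.
    print(training, pseudo_ungauged)
```

A simulation that doubles every observed flow keeps a perfect correlation but is penalized by both scores, which illustrates why correlation alone is not an adequate skill measure.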

In recent years, probabilistic forecasting has gained attention in hydrological AI. Deep learning models such as Bayesian LSTM networks, which incorporate stochastic variational inference, can estimate uncertainty in both model structure and inputs. This allows for the generation of predictive intervals, providing more informative outputs for decision-making in water resource management [113]. Similarly, Bayesian deep learning methods have been applied in multi-step ahead streamflow forecasting, showing strong potential for probabilistic accuracy across multiple catchments [114]. Model benchmarking is increasingly supported by open, large sample datasets such as CAMELS. These datasets offer standardized physical, climatic, and hydrologic attributes across hundreds of basins, enabling reproducibility and consistent model comparison [115].

Finally, the integration of uncertainty quantification frameworks into streamflow prediction is now considered best practice in hydrological AI. Probabilistic forecasting not only enhances model transparency but also mitigates overconfidence in deterministic outputs, particularly vital for extreme event forecasting and adaptation under climate variability [116].

3.3.4. Comparative Analysis of Advanced AI Approaches for Ungauged Basins

Recent advances such as Physics-Informed Neural Networks (PINNs), Graph Neural Networks (GNNs), foundation models, and Uncertainty Quantification (UQ) frameworks represent a paradigm shift in hydrological modeling for ungauged basins. Unlike earlier machine learning models that function as black boxes, these approaches aim to enhance physical consistency, spatial representation, and predictive reliability under uncertainty. Table 3 summarizes these approaches.

PINNs integrate governing hydrological equations into the loss function of neural networks, ensuring predictions adhere to physical laws and reducing physically inconsistent outputs [117,118]. This approach improves generalization in data-scarce environments by embedding prior knowledge, addressing issues such as equifinality [22]. However, their computational complexity and reliance on accurate formulation of physical constraints limit operational adoption in real-time forecasting [117].

GNNs offer a unique advantage for representing river basins as graphs, where nodes represent sub-catchments and edges denote connectivity, allowing explicit modeling of spatial dependencies [119,120]. Compared to sequence-based models like LSTM [43], GNNs better capture networked hydrological behavior, particularly in large river systems. However, performance depends heavily on the availability and accuracy of catchment descriptors such as topography, soil characteristics, and land cover [80], which can be sparse in ungauged basins. Interpretability remains an active research challenge.
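
The graph representation described above can be illustrated with a small river network and a single neighborhood-aggregation step. This is a sketch under simplifying assumptions: each node carries one scalar feature, the mixing weights are fixed rather than learned, and the network layout is hypothetical. A trained GNN would learn these weights and use vector-valued features.

```python
def upstream_neighbors(edges):
    """Map every node to the nodes that drain directly into it."""
    neighbors = {}
    for upstream, downstream in edges:
        neighbors.setdefault(upstream, [])
        neighbors.setdefault(downstream, []).append(upstream)
    return neighbors


def message_passing_step(features, edges, w_self=1.0, w_up=1.0):
    """Update each node from its own feature and the mean of its upstream features."""
    neighbors = upstream_neighbors(edges)
    updated = {}
    for node, value in features.items():
        ups = neighbors.get(node, [])
        aggregated = sum(features[u] for u in ups) / len(ups) if ups else 0.0
        updated[node] = w_self * value + w_up * aggregated
    return updated


# Hypothetical network: sub-catchments A and B drain into C, which drains into D.
edges = [("A", "C"), ("B", "C"), ("C", "D")]
features = {"A": 1.0, "B": 3.0, "C": 2.0, "D": 0.0}

one_step = message_passing_step(features, edges)
two_steps = message_passing_step(one_step, edges)
print(one_step, two_steps)
```

After one step the outlet D only reflects its direct tributary C. After two steps it also carries information from the headwaters A and B, which is how stacked graph layers propagate upstream signals through the network.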

Adapted from natural language processing and computer vision paradigms, foundation models leverage pretraining on extensive hydrological datasets (CAMELS, GRDC) to enable transfer learning across basins. These models significantly reduce training effort for site-specific applications and support adaptability to new hydrological contexts. Nevertheless, the large computational resources required, dependency on diverse pretraining datasets, and ethical considerations in data sharing pose barriers to widespread adoption [121,122].

Bayesian deep learning, ensemble neural networks, and stochastic LSTM architectures have emerged as effective UQ strategies for streamflow prediction, providing predictive intervals essential for risk-informed water management [30,123]. These frameworks mitigate overconfidence in deterministic models and improve decision-making under uncertainty [124]. However, they remain computationally demanding and lack standardized implementation protocols for hydrology [117]. Combining UQ with PINNs or GNNs is an emerging direction that promises enhanced robustness under climate and land-use variability [121].
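
Whatever the source of the ensemble (several networks, repeated stochastic forward passes, or posterior samples), predictive intervals can be derived from the spread of the members and checked against observations. The sketch below assumes the members are already available as simulated series. The empirical-quantile interval and the coverage check are generic illustrations, not the procedure of any cited study, and all values are hypothetical.

```python
def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def prediction_intervals(ensemble, lower=0.05, upper=0.95):
    """Per-time-step interval from an ensemble given as a list of member series."""
    return [
        (quantile(step, lower), quantile(step, upper)) for step in zip(*ensemble)
    ]


def coverage(obs, intervals):
    """Fraction of observations that fall inside their predictive interval."""
    inside = sum(low <= o <= high for o, (low, high) in zip(obs, intervals))
    return inside / len(obs)


# Hypothetical ensemble: five members, two time steps.
ensemble = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
intervals = prediction_intervals(ensemble, lower=0.25, upper=0.75)
print(intervals, coverage([3.0, 45.0], intervals))
```

Comparing the empirical coverage with the nominal level (here 50%) is a simple way to detect the overconfidence that deterministic outputs cannot reveal.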

3.3.5. Comparative Analysis of Advanced AI Approaches for Ungauged Basins

The application of AI-based models in hydrology has often been compared with traditional conceptual and physically based models. However, a critical comparative framework is necessary to contextualize these differences beyond raw performance metrics. Traditional models such as HBV, GR4J, and SWAT rely on predefined hydrological equations, requiring calibration and sensitivity analysis for parameter estimation [13,125]. In contrast, AI models such as LSTM, PINNs, and GNNs learn patterns directly from data, reducing dependency on process-based assumptions [22,43].

Table 4 presents a structured comparison of these modeling paradigms across key dimensions, including interpretability, data dependency, uncertainty handling, scalability, and adaptability to ungauged basins. This tabular synthesis provides a more transparent framework for evaluating methodological trade-offs and suitability for specific hydrological contexts.

AI-based models outperform traditional models in scalability and adaptability, particularly in data-rich regions and for large-scale applications [43,121]. However, interpretability and computational burden remain key limitations, making hybrid approaches that embed physical constraints (for example, PINNs, physics-guided LSTM) promising alternatives for ungauged basins. Conversely, traditional models retain advantages in interpretability and computational efficiency, making them suitable for operational forecasting where transparency is critical. Future research should explore integrated frameworks that combine the robustness of physical models with the adaptability of AI to enhance predictive accuracy under climate and land-use change.

3.4. Integration of AI in Remote Sensing for Streamflow Prediction

The integration of AI with satellite remote sensing has significantly advanced streamflow prediction, particularly in ungauged and human-altered watersheds. Remote sensing technologies provide spatially continuous, near-real-time observations of key hydrological variables, such as precipitation, vegetation indices (NDVI), land cover, soil moisture, and surface temperature, from sensors onboard platforms like MODIS, SMAP, and GPM. When paired with AI algorithms capable of modeling complex and nonlinear processes, these data sources facilitate accurate and scalable hydrological forecasting in data-scarce regions [55,126].

Traditional physically based models often underperform in ungauged basins due to limited ground data and their inability to fully capture non-stationary dynamics associated with land use changes, anthropogenic alterations, and climate variability. In contrast, AI approaches, particularly DL, ensemble learning, and hybrid models, demonstrate a heightened ability to extract patterns from multi-source remote sensing inputs and predict hydrological behavior under dynamic conditions [127,128].

In regions lacking dense observation networks, AI methods have successfully used remotely sensed precipitation (e.g., CHIRPS, GSMaP, PERSIANN-CDR, and IMERG) and topographic features (derived from DEM) as surrogate inputs for streamflow modeling. For instance, bias correction of satellite-based precipitation using neural networks and regression methods has led to substantial improvements in streamflow simulation over mountainous and tropical catchments [55]. Similarly, models trained on anthropogenic similarity indices, accounting for factors such as dam presence, irrigation, and urbanization, have shown greater predictive skill in human-regulated rivers compared to traditional parameter transfer methods [127].

Recent reviews underscore the expanding utility of remote sensing–AI fusion for diverse hydrological applications, including flood detection, drought monitoring, and near-real-time rainfall-runoff forecasting. Studies using data from MODIS, SMAP, and GPM demonstrate the capacity of AI-based models to operate effectively at basin, regional, and even global scales [126]. Importantly, these models are increasingly supported by harmonized, open-access datasets such as CAMELS and HYSETS, which enable transfer learning and regional generalization across varying hydroclimatic zones [115].

As these hybrid systems mature, future research should focus on (i) improving the spatial and temporal resolution of remote sensing inputs, (ii) integrating explainable AI to enhance model interpretability, and (iii) accounting for human interventions, such as land management, reservoir operations, and climate adaptation measures, in training datasets. These directions are key to ensuring reliable, context-aware AI models for streamflow prediction in ungauged and dynamic watersheds.

3.4.1. Remote Sensing Data

Remote sensing platforms such as the MODIS and the Tropical Rainfall Measuring Mission (TRMM) [129], along with its successor, the GPM mission, have become foundational tools in hydrological modeling, especially in data-scarce and ungauged basins (see Table 3). These systems offer high-resolution, continuous, and near-real-time environmental observations critical for understanding and predicting hydrologic processes.

MODIS, aboard NASA’s Terra and Aqua satellites, provides global coverage of land surface properties including vegetation indices (NDVI, EVI), surface temperature, albedo, and leaf area index (LAI) [102], at spatial resolutions ranging from 250 m to 1 km with a daily revisit time. These variables are widely used in hydrological studies for estimating actual evapotranspiration, soil moisture dynamics, and monitoring land cover change, which are essential for assessing runoff generation and water availability [130,131]. As the use of AI in hydrology grows, MODIS-derived variables have increasingly been integrated into machine learning models, such as random forest, support vector regression, and convolutional neural networks (CNNs), to improve predictions of streamflow, drought patterns, and catchment behavior in both natural and human-impacted watersheds [126,132].

On the other hand, TRMM and GPM focus on precipitation measurement. TRMM, which operated from 1997 to 2015, provided 3 h rainfall estimates at a spatial resolution of approximately 25 km. GPM, which continues to operate today, offers enhanced precipitation estimates via the IMERG (Integrated Multi-satellite Retrievals for GPM) product, with a spatial resolution of about 10 km and a temporal frequency of 30 min. These datasets are essential inputs in rainfall–runoff modeling, flood forecasting, and streamflow simulation, especially in basins lacking dense ground-based rain gauge networks [133,134].
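
Because the products differ in temporal resolution, half-hourly estimates are often aggregated before being compared with 3 h or daily data. The sketch below assumes the half-hourly values are precipitation rates in mm/h, which should be checked against the documentation of the product actually used. The function name and the sample values are hypothetical.

```python
def accumulate_depth(rates_mm_per_h, step_hours=0.5, steps_per_window=48):
    """Convert rates to depths and sum them over consecutive, complete windows.

    With 0.5 h steps, 48 steps give daily totals and 6 steps give 3 h totals.
    An incomplete window at the end of the record is dropped.
    """
    depths = [rate * step_hours for rate in rates_mm_per_h]
    last_start = len(depths) - steps_per_window
    return [
        sum(depths[start:start + steps_per_window])
        for start in range(0, last_start + 1, steps_per_window)
    ]


# Hypothetical two days of constant 2 mm/h rainfall at half-hourly resolution.
half_hourly = [2.0] * 96
daily_mm = accumulate_depth(half_hourly, steps_per_window=48)
three_hourly_mm = accumulate_depth(half_hourly, steps_per_window=6)
print(daily_mm, len(three_hourly_mm))
```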

In tropical watersheds such as the Lesti sub-watershed in Indonesia, TRMM-derived rainfall data, after undergoing bias correction, were found to be statistically reliable, achieving correlation coefficients above 0.70 and NSE values exceeding 0.50 for streamflow simulations [135]. Similarly, in Chinese monsoonal basins, deep learning models incorporating GPM precipitation as input significantly outperformed traditional hydrological models, offering better accuracy in both peak and baseflow conditions [132].

The integration of remote sensing with artificial intelligence has ushered in new modeling paradigms. For instance, deep learning-based bias correction of satellite precipitation data has proven highly effective in improving streamflow forecasts in mountainous regions [55]. MODIS and GPM datasets are also being used to develop hybrid frameworks for downscaling and error correction, with CNNs and LSTM networks enhancing the spatial and temporal resolution of forecasts. Such approaches not only address the scale mismatch between satellite data and catchment processes but also improve the reliability of projections under changing climatic and land use conditions.

Recent reviews further emphasize the practicality of combining MODIS and GPM with AI to build scalable and transferable hydrological models across diverse climatic zones [126]. These integrated frameworks are especially valuable in regions with limited infrastructure or access to real-time monitoring data, where AI-driven use of satellite observations can support water resource planning, flood preparation, and climate adaptation strategies. As the field progresses, satellite remote sensing, when combined with AI techniques, is expected to remain central to next-generation hydrologic modeling and watershed management systems (refer to Table 5 for the summary).

3.4.2. AI for Downscaling, Bias Correction, and Temporal Interpolation

Satellite-based datasets provide valuable large-scale hydrometeorological information, particularly for ungauged basins. However, their application in fine-resolution hydrological modeling is constrained by limitations such as coarse spatial resolution, systematic biases, and temporal discontinuities. AI has increasingly been adopted to address these challenges through three key preprocessing strategies: spatial downscaling, bias correction, and temporal interpolation. These techniques aim to enhance the quality, precision, and consistency of remote sensing products before they are used in watershed-scale models.

AI-based spatial downscaling focuses on improving the resolution of coarse satellite data such as GPM precipitation or MODIS-derived variables. For instance, daily GPM precipitation data, which typically have a native resolution of 10 km, can be downscaled to 1 km using convolutional neural networks and other deep learning architectures [136]. Similar approaches have demonstrated significant improvements in capturing spatial heterogeneity by leveraging terrain features, vegetation indices, and land surface parameters as predictors [137]. These downscaling models enable a more accurate representation of localized rainfall dynamics and support hydrological modeling in small or topographically complex catchments.
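The role of terrain predictors in downscaling can be illustrated with a much simpler stand-in for the deep learning models described above: a linear regression on elevation with residual correction. This is explicitly not the CNN approach of the cited studies. It is a sketch that assumes each coarse cell is covered by a known set of fine-resolution elevation pixels, and all names and values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def downscale(coarse_precip, fine_elev_by_cell):
    """Distribute each coarse-cell precipitation value over its fine pixels."""
    coarse_elev = [sum(e) / len(e) for e in fine_elev_by_cell]
    # Relationship learned at the coarse scale
    a, b = fit_line(coarse_elev, coarse_precip)
    fine = []
    for p, elev, e_mean in zip(coarse_precip, fine_elev_by_cell, coarse_elev):
        # The residual keeps the fine-pixel mean equal to the coarse value
        residual = p - (a + b * e_mean)
        fine.append([a + b * e + residual for e in elev])
    return fine


# Hypothetical data: three coarse cells, four fine pixels each (elevation in m)
coarse_precip = [10.0, 14.0, 20.0]
fine_elev = [
    [100.0, 200.0, 300.0, 400.0],
    [500.0, 600.0, 700.0, 800.0],
    [900.0, 1000.0, 1100.0, 1200.0],
]
fine_precip = downscale(coarse_precip, fine_elev)
```

The residual step conserves the coarse-cell mean, a property that learned downscaling models are often also checked against. A linear predictor can produce negative values in other data sets, so a real application would need a non-negativity safeguard.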

Bias correction is another essential application of AI, especially in regions where satellite measurements are prone to error due to atmospheric disturbances or retrieval algorithm limitations [138]. Physically constrained machine learning frameworks have been used to adjust precipitation estimates by learning from the spatial and statistical patterns of biases between satellite data and in situ observations [79]. Compared to traditional statistical techniques, AI-based methods such as deep learning, gradient boosting, and support vector regression can better handle nonlinearities and perform well even in regions with limited calibration data [52].
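For context, the kind of traditional statistical technique against which AI-based correction is compared can be sketched as empirical quantile mapping. The example assumes concurrent satellite and gauge calibration samples of equal length; the function name and values are hypothetical, and values outside the calibration range are simply clamped.

```python
from bisect import bisect_right


def quantile_map(value, sat_calib, gauge_calib):
    """Map a satellite value onto the gauge distribution by matching ranks."""
    sat_sorted = sorted(sat_calib)
    gauge_sorted = sorted(gauge_calib)
    if len(sat_sorted) != len(gauge_sorted):
        raise ValueError("calibration samples must have equal length")
    n = len(sat_sorted)
    # Index of the last calibration value that does not exceed the input
    i = bisect_right(sat_sorted, value) - 1
    if i < 0:
        return gauge_sorted[0]       # below calibration range: clamp
    if i >= n - 1:
        return gauge_sorted[-1]      # at or above calibration range: clamp
    x0, x1 = sat_sorted[i], sat_sorted[i + 1]
    frac = (value - x0) / (x1 - x0)  # x1 > x0 is guaranteed by bisect_right
    return gauge_sorted[i] + frac * (gauge_sorted[i + 1] - gauge_sorted[i])


# Hypothetical case in which the satellite overestimates rainfall by a factor of two
satellite = [2.0, 4.0, 6.0, 8.0, 10.0]
gauge = [1.0, 2.0, 3.0, 4.0, 5.0]
corrected = [quantile_map(v, satellite, gauge) for v in [5.0, 6.0, 12.0]]
```

Quantile mapping corrects the marginal distribution only. It cannot account for bias that depends on terrain, season, or storm type, which is where the nonlinear learners discussed above are reported to help.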

Temporal interpolation and gap filling are necessary for creating continuous time series from remote sensing datasets that may be incomplete due to cloud cover, satellite malfunctions, or infrequent temporal resolution. AI models such as recurrent neural networks (RNNs), Gaussian process regression (GPR), and k-nearest neighbors (k-NN) are effective in reconstructing missing values by modeling temporal dependencies and spatial correlations [139]. These gap-filling techniques restore consistency in satellite time series, which is crucial for hydrological simulations that require high-frequency input data.
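As a minimal illustration of neighbor-based gap filling, the sketch below fills missing time steps with an inverse-distance-weighted mean of the k nearest observed values in time. It uses temporal proximity only and ignores the spatial correlations that the cited methods also exploit; the function name and the choice of weights are assumptions.

```python
def knn_fill(series, k=2):
    """Fill None entries using the k nearest (in time) observed values."""
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series contains no observed values")
    filled = list(series)
    for t, v in enumerate(series):
        if v is not None:
            continue
        # Nearest observed time steps; the distance is never zero for a gap
        nearest = sorted(observed, key=lambda iv: abs(iv[0] - t))[:k]
        weights = [1.0 / abs(i - t) for i, _ in nearest]
        weighted_sum = sum(w * val for w, (_, val) in zip(weights, nearest))
        filled[t] = weighted_sum / sum(weights)
    return filled


# Hypothetical NDVI-like series with a two-step gap (for example, cloud cover)
gappy = [0.0, None, None, 3.0]
complete = knn_fill(gappy, k=2)
```

With k = 2 and one observation on each side of the gap, the result coincides with linear interpolation; larger k smooths over more of the record.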

Recent advancements have also integrated multiple preprocessing steps into hybrid AI frameworks that perform downscaling, bias correction, and temporal smoothing simultaneously. These combined systems improve overall data quality and reduce the need for extensive manual calibration in modeling workflows [132]. For example, machine learning algorithms have been employed to downscale and refine satellite-derived soil moisture using MODIS inputs such as land surface temperature and vegetation indices [140]. The resulting datasets exhibit improved spatial resolution and accuracy, supporting both agricultural and hydrological decision-making.

Collectively, these AI-enhanced preprocessing techniques represent a transformative shift in the way remote sensing data is utilized for watershed modeling. By increasing spatial precision, reducing systematic errors, and filling temporal gaps, AI enables the development of robust and transferable hydrological models that are especially useful in ungauged or poorly monitored regions. These tools support more informed water resource management, flood forecasting, and climate adaptation planning. A schematic representation of AI-enhanced preprocessing of remote sensing data for hydrological modeling is shown in Figure 4. Satellite-derived inputs undergo AI-based techniques such as downscaling, bias correction, and temporal interpolation to produce high-resolution, bias-adjusted, and gap-filled datasets suitable for streamflow prediction in ungauged watersheds.

3.4.3. AI-Remote Sensing Synergy in Ungauged Catchment Mapping

Ungauged catchments, regions without in situ hydrological measurements, pose substantial challenges in water resource management, flood forecasting, and ecosystem assessment. The synergy of AI and remote sensing technologies offers a scalable and data-informed solution by extracting hydrologically relevant information from satellite datasets and applying machine learning models to simulate watershed behavior where ground data are absent.

High-resolution DEMs, vegetation indices (NDVI), precipitation products, and surface water proxies captured via remote sensing serve as key inputs for AI algorithms such as CNNs, SVMs, LSTMs, clustering techniques, and hybrid frameworks. These approaches enable tasks such as catchment boundary delineation, basin classification, streamflow estimation, runoff or soil moisture mapping, and flood inundation modeling across ungauged or poorly gauged regions.

For instance, CNN-based DEM analysis has been shown to automatically delineate river networks and watershed extents in ungauged basins with comparable accuracy to manual methods [35]. Unsupervised classification techniques such as PCA and clustering applied to physiographic and hydroclimate variables derived from remote sensing enable grouping of catchments with similar hydrological behavior, facilitating regionalization and transfer learning [141,142]. Machine learning regression models using satellite-derived proxies, such as vegetation indices, precipitation estimates, and terrain features, have effectively predicted streamflow and discharge in large ungauged basins, including the Amazon, achieving robust performance characterized by high correlations and NSE [143].
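The grouping of catchments by similarity can be sketched with a plain k-means clustering on standardized attributes. The PCA step mentioned above is omitted for brevity, the attribute set and values are hypothetical, and the deterministic initialization (the first k catchments) is a simplification chosen to make the example reproducible.

```python
import math


def standardize(rows):
    """Scale each attribute column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def kmeans(points, k, iters=50):
    """Basic k-means; returns a cluster label for each point."""
    centers = [list(p) for p in points[:k]]  # deterministic initialization
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Hypothetical attributes: aridity index, mean slope (%), forest fraction
catchments = [
    [0.5, 20.0, 0.80],
    [2.5, 3.0, 0.10],
    [0.6, 18.0, 0.75],
    [2.8, 2.0, 0.05],
    [0.4, 22.0, 0.85],
    [2.2, 4.0, 0.15],
]
groups = kmeans(standardize(catchments), k=2)
```

Standardization matters here because the attributes have very different ranges; without it, slope would dominate the distance calculation.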

Moreover, flood inundation mapping in ungauged watersheds benefits significantly from integrated AI–remote sensing frameworks. Coupled hydrometeorological–hydraulic modeling, informed by satellite precipitation and terrain data, has successfully reconstructed the catastrophic 2006 flash flood in Volos City, Greece [144]. Meanwhile, satellite-derived discharge estimates across the Amazon basin produced the first continuous mapping of river discharge using remote sensing data combined with machine learning models [143].

Finally, hybrid AI systems combining attention-based deep neural networks and geospatial remote sensing have enabled simultaneous estimation of runoff, evapotranspiration, and soil moisture over multi-basin settings, delivering high accuracy while minimizing the need for local calibration [145]. These methods also support the sustainable development of urban–environmental systems through high-resolution water-budget analysis in riverine megacities [142]. To synthesize the key contributions of AI and remote sensing in ungauged watershed studies, Table 6 presents a summary of major hydrological applications, corresponding remote sensing inputs, AI techniques used, and associated outcomes based on recent literature.

4. Discussion

This review analyzes the evolution and current state of hydrological modeling for streamflow prediction in ungauged watersheds, encompassing physically based distributed models, conceptual lumped models, and data-driven approaches, including AI and remote sensing integration.

Recent literature consistently demonstrates the strong performance of AI models in simulating streamflow in ungauged watersheds across various climatic and physiographic settings. Recurrent neural networks such as LSTM and GRU have shown superior capabilities in capturing temporal dependencies in rainfall–runoff processes [43,95,98]. Deep learning architectures have proven especially valuable in data-scarce regions when paired with remote sensing inputs like GPM precipitation, MODIS vegetation indices, and DEM-based terrain data [1,37,101,130]. In tropical and monsoonal basins, models integrating AI and RS outperform traditional hydrological models in both peak and low-flow simulations [106,143].

Hybrid frameworks that combine ML models with physical or conceptual constraints, such as vegetation-constrained ecohydrological models, further enhance model generalizability in ungauged scenarios [1,16,141]. Transfer learning and regionalization techniques have also been employed to adapt AI models trained on gauged basins to ungauged locations using physiographic similarity and clustering methods [9,14,26,65,150]. Moreover, AI has enabled advanced hydrological tasks such as flood mapping [135,151,152], discharge estimation via satellite altimetry [143], catchment delineation from DEMs [35], and bias correction of RS inputs [116,133]. Despite these advances, gaps remain in the integration of AI with physically based hydrological understanding and in addressing long-term robustness and uncertainty. Nonetheless, the synergy of AI and RS holds transformative potential for operational hydrology in ungauged and poorly monitored basins [37,145,149].

Despite the promising advancements, several inconsistencies and limitations persist across AI-based hydrological modeling studies. One major issue is the variability in model performance across geographic and climatic regions: AI models that perform well in temperate or data-rich basins often underperform in arid, snow-dominated, or tropical mountainous areas due to input data sparsity and regional heterogeneity [13,46,82]. Model transferability also remains challenging. While regionalization and transfer learning strategies attempt to adapt models from gauged to ungauged basins, their success is often limited by physiographic dissimilarities and lack of calibration data [9,25,65]. There is also a lack of consensus on the optimal AI approach: some studies report LSTM outperforming traditional ANNs or SVMs [97,98], whereas others find that simpler models such as RF or hybrid wavelet models provide more stable results under noisy or irregular data conditions [8,88,108]. Furthermore, traditional conceptual or physically based models still occasionally outperform AI models, particularly when expert knowledge, long-term memory of hydrological processes, or robust calibration is essential [44,51].

A significant limitation is the inconsistent reporting of model validation procedures and uncertainty quantification. Few studies comprehensively evaluate model robustness, overfitting risk, or bias, especially under changing climatic or land-use conditions [68,79,114]. Moreover, evaluation metrics are not always standardized, making it difficult to compare model performance across studies [75,96]. Data dependency remains a core limitation; many AI models rely on high-resolution satellite inputs or long time-series data, which are not uniformly available across global ungauged basins, thereby limiting generalizability and operational scalability [101,130,145]. These challenges underscore the need for more transparent benchmarking, improved uncertainty handling, and hybrid approaches that integrate physical knowledge with AI flexibility.
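One simple form of uncertainty reporting is the empirical coverage of ensemble prediction intervals, that is, the fraction of observations falling inside a nominal interval. The sketch below assumes that an ensemble of predictions is available for each time step (for example, from repeated training runs) and uses nearest-rank percentiles with integer arithmetic; all names and values are hypothetical.

```python
def nearest_rank(sorted_vals, pct):
    """Nearest-rank percentile, with pct given as an integer from 1 to 100."""
    n = len(sorted_vals)
    rank = (pct * n + 99) // 100  # integer form of ceil(pct * n / 100)
    return sorted_vals[max(rank, 1) - 1]


def interval_from_ensemble(members, lower_pct=5, upper_pct=95):
    """Lower and upper bounds of the central prediction interval."""
    ranked = sorted(members)
    return nearest_rank(ranked, lower_pct), nearest_rank(ranked, upper_pct)


def coverage(obs, ensembles, lower_pct=5, upper_pct=95):
    """Fraction of observations that fall inside their prediction interval."""
    hits = 0
    for o, members in zip(obs, ensembles):
        lo, hi = interval_from_ensemble(members, lower_pct, upper_pct)
        hits += lo <= o <= hi
    return hits / len(obs)


# Hypothetical 20-member ensemble, reused for three time steps
ensemble = list(range(1, 21))
bounds = interval_from_ensemble(ensemble)
observed = [10.0, 25.0, 0.5]
cov = coverage(observed, [ensemble] * 3)
```

A well-calibrated 90% interval should contain roughly 90% of observations over a long record; markedly lower coverage points to overconfident predictions.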

While AI has significantly advanced hydrological modeling, several notable research gaps remain, particularly in its application to ungauged watersheds. First, there is a geographic imbalance in research coverage: studies are concentrated in data-rich regions such as North America, Europe, and parts of China, while large portions of Africa, Southeast Asia, and South America remain underrepresented [16,45,72,143]. This spatial bias limits the global applicability and equity of AI-based solutions in hydrology. Second, many studies rely on similar satellite-derived datasets, which, although useful, may not fully capture local hydrological dynamics or may introduce common errors across models [67,116,144]. This creates a dependency on specific remote sensing products, raising questions about the generalizability of AI models across diverse hydrological conditions and data infrastructures.

Moreover, there is a limited integration of AI with physically based or conceptual hydrological models, which could offer enhanced interpretability and physical consistency [26,63,92]. Hybrid models that combine the strengths of data-driven learning with physical laws remain underexplored, especially for modeling runoff generation, baseflow separation, or extreme event prediction in ungauged regions [109,130,145]. Another critical gap is the lack of studies addressing long-term robustness and sustainability of AI models, particularly under non-stationary conditions caused by land-use change or climate variability [55,96,135]. Additionally, few studies have explored real-time operationalization of AI models for water resource management or flood warning systems in data-scarce settings [113,149]. Reproducibility and open science practices are still emerging, with limited availability of standardized benchmark datasets, source codes, or model architectures [75,111,137].

In AI-based streamflow modeling, evaluation and comparison of model performance remain essential yet inconsistently addressed across studies. Many works adopt standard statistical metrics, such as NSE, coefficient of determination, RMSE, and MAE, to evaluate accuracy [67,92,107,143]. However, these metrics are often applied without clear justification for their relevance to specific hydrological objectives or thresholds. Uncertainty quantification, bias assessment, and overfitting detection are still limited, despite their critical role in ungauged settings where ground-truth validation is lacking [130,137]. Although cross-validation techniques such as LOOCV or k-fold validation are sometimes applied [113], their implementation varies in rigor and transparency.
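Leave-one-out cross-validation can be made explicit with a small example in which each basin is withheld in turn, treated as ungauged, and predicted from the remaining basins. The sketch uses a deliberately simple linear model relating mean annual precipitation to mean annual runoff; the model choice, names, and values are assumptions for illustration only.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def loocv_errors(x, y):
    """Return (RMSE, MAE) from leave-one-out predictions."""
    errors = []
    for i in range(len(x)):
        # Withhold basin i and fit on the remaining basins
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        errors.append((a + b * x[i]) - y[i])
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae


# Hypothetical basins: mean annual precipitation and runoff (mm/yr)
precip = [800.0, 1000.0, 1200.0, 1500.0, 2000.0]
runoff = [310.0, 390.0, 520.0, 640.0, 900.0]
rmse, mae = loocv_errors(precip, runoff)
```

Reporting how the folds were formed (by basin, by year, or by random sample) is part of the transparency that the reviewed studies often lack, since random splits within a basin can leak information between training and test sets.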

Furthermore, few studies conduct systematic comparisons across multiple AI models under the same dataset or hydrological conditions, which limits our ability to generalize findings across regions [116,136]. Comparative benchmarks involving traditional physically based models, conceptual models, and deep learning approaches are rare and often suffer from inconsistent preprocessing or spatial resolutions [58,109,143]. Additionally, model explainability and interpretability metrics are seldom reported, making it difficult to understand how models reach predictions, particularly in catchments lacking ground validation [79,130]. To ensure reliability and transferability, future work must promote transparent reporting, uncertainty analysis, and harmonized evaluation protocols, especially when working in data-scarce environments. Table 7 summarizes key practices in assessing AI model performance for streamflow prediction in ungauged basins, including commonly used evaluation metrics, uncertainty assessment approaches, validation techniques, benchmarking efforts, model interpretability, and reproducibility based on recent literature.

Despite the promising capabilities of AI models in streamflow prediction, reproducibility and transferability remain critical challenges in ungauged watershed applications. Many studies demonstrate strong performance in specific case studies, yet fail to provide publicly accessible source codes, training data, or model configurations, limiting reproducibility across research groups [31,73,106]. Furthermore, while transfer learning techniques and regionalization approaches have been explored to adapt models trained in gauged basins to ungauged ones [37,97,141], the success of these approaches often depends heavily on the similarity in climatic, physiographic, and hydrological characteristics between donor and target catchments.
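The dependence on donor and target similarity can be illustrated with a nearest-donor selection in standardized attribute space. The attribute set, basin names, and values below are hypothetical, and Euclidean distance is only one of several similarity measures that could be used.

```python
import math


def nearest_donor(target, donors):
    """Return the name of the gauged basin most similar to the target."""
    names = list(donors)
    rows = [donors[n] for n in names] + [target]
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def z(row):
        # Standardize so that no single attribute dominates the distance
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    z_target = z(target)
    return min(names, key=lambda n: math.dist(z(donors[n]), z_target))


# Hypothetical attributes: mean annual precipitation (mm), slope (%), forest fraction
donors = {
    "A": [1200.0, 15.0, 0.60],
    "B": [400.0, 3.0, 0.10],
    "C": [1100.0, 12.0, 0.50],
}
target = [1120.0, 13.0, 0.52]
best = nearest_donor(target, donors)
```

In practice, the distance to the selected donor is itself informative: a large distance signals that the transfer extrapolates beyond the conditions represented in the gauged set.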

For instance, clustering methods and unsupervised classification have been used to group hydrologically similar basins, enabling more effective model transfer [103,104,141,142]. However, transferability tends to degrade when applied across heterogeneous regions or under extreme hydrological conditions [65,82]. Moreover, there is a lack of consistent reporting on how models are adapted, retrained, or fine-tuned when applied to new regions, making it challenging to assess generalizability. In addition, only a limited number of studies leverage open-source platforms or benchmark datasets to facilitate reproducibility and cross-comparison [30,58,83]. This lack of transparency and standardized protocols hinders the broader adoption and trust in AI-driven hydrological tools, especially for decision-making in data-scarce regions.

A critical challenge in AI-based streamflow modeling, particularly in developing regions, lies in the delayed availability of hydrological data. Institutional datasets often undergo lengthy verification and evaluation processes before being released, sometimes several years after measurement. These delays limit the timely calibration and validation of hydrological and AI-based models, especially for operational forecasting. Moreover, data-sharing restrictions across agencies exacerbate the issue, resulting in fragmented hydrological records that hinder reproducibility and the real-time application of findings.

Another important consideration is the complexity of modeling transboundary rivers, where hydrological data may be governed by multiple institutions across political boundaries. The lack of standardized monitoring, differing data access policies, and political sensitivities often restrict open data exchange, creating significant uncertainty for AI-based modeling in such contexts. Studies such as that of Yaseen et al. (2016) [153] suggest that incorporating temporal cycles and large-scale climatic signals may provide a way forward in data-constrained and transboundary settings. Future research should explore how AI-driven regionalization, proxy data, and multi-source data fusion can be adapted for shared river basins, ensuring equitable and reliable streamflow prediction across political borders.

The present review makes a significant and timely contribution to the evolving discourse at the intersection of AI, RS, and hydrological modeling, with a focused lens on ungauged watersheds, a domain often constrained by data scarcity and modeling uncertainties. Unlike earlier reviews, which often focus either on AI methods in general hydrology or on remote sensing applications in water science, this study jointly evaluates physically based, conceptual, and data-driven models and provides a structured synthesis explicitly tailored to streamflow prediction in ungauged or poorly gauged catchments. By organizing the literature across these modeling paradigms, the review bridges traditional hydrology with AI-enhanced frameworks and illustrates how these approaches can be harmonized in data-limited environments.

One of the key novel aspects of this review lies in its integration of AI and remote sensing applications across all stages of the hydrological modeling pipeline, from data preprocessing (bias correction, downscaling, interpolation) to model training, regionalization, and uncertainty assessment. Furthermore, this review advances the field by systematically identifying evaluation inconsistencies, data dependencies, and transferability limitations, providing an evidence-based agenda for future research. The inclusion of comparative tables summarizing model performance, evaluation practices, and AI–RS synergy in ungauged basins offers valuable resources for researchers and practitioners aiming to deploy scalable, replicable, and operational hydrological tools.

Ultimately, this review serves not only as a scientific synthesis but also as a practical roadmap for enhancing hydrological modeling under uncertainty, advocating for hybrid modeling strategies, reproducibility standards, and the development of benchmark datasets. In doing so, it supports the broader hydrology community in leveraging AI’s potential to address global water challenges in both data-rich and data-poor contexts. In the context of climate change, this synthesis supports the development of resilient water management systems aligned with UN SDG 6 (Clean Water and Sanitation) and SDG 13 (Climate Action) by enabling predictive capabilities in vulnerable, data-scarce regions.

5. Challenges and Future Directions

Despite the promising advancements in AI-based streamflow prediction for ungauged watersheds, several critical challenges must be addressed to realize their potential fully. A significant obstacle is the lack of high-quality, spatially representative, and diverse training datasets. Many existing AI models are developed using data biased toward well-gauged regions in developed countries, which limits their transferability and performance in data-scarce or hydrologically complex basins. Expanding standardized, globally distributed datasets, such as CAMELS, HYSETS, and new satellite-based observations, is, therefore, crucial, especially for underrepresented tropical, mountainous, and arid regions.

Another critical challenge is the integration of AI with hydrological process understanding. Hybrid approaches that combine machine learning with physical principles can help reduce equifinality, improve extrapolation to extreme events, and maintain physical consistency under limited calibration. Recent advances in physics-informed neural networks (PINNs) further illustrate how embedding physical constraints directly into learning architectures can improve generalization and model reliability. Similarly, graph neural networks (GNNs) offer promise for river network modeling by leveraging basin connectivity. At the same time, explainable AI (XAI) tools such as SHAP, LIME, and attention mechanisms can address the opacity of deep learning models, building trust among decision-makers.
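The general idea of adding physical constraints to a training objective can be sketched as a soft penalty on physically implausible predictions. The example below is a simplified illustration and not a PINN in the strict sense, which would typically embed differential-equation residuals. It assumes that all quantities are depths in mm per time step and that, over the evaluation period, total runoff from a closed basin should not exceed total precipitation; the function name and the weight lam are hypothetical.

```python
def physics_guided_loss(q_sim, q_obs, precip, lam=1.0):
    """Mean squared error plus penalties for physically implausible output."""
    n = len(q_obs)
    mse = sum((s - o) ** 2 for s, o in zip(q_sim, q_obs)) / n
    # Constraint 1: discharge cannot be negative
    negative = sum(min(0.0, s) ** 2 for s in q_sim) / n
    # Constraint 2 (assumed): cumulative runoff should not exceed cumulative precipitation
    excess = max(0.0, sum(q_sim) - sum(precip)) ** 2
    return mse + lam * (negative + excess)


# Hypothetical values (mm per time step)
q_obs = [1.0, 2.0, 3.0]
precip = [2.0, 4.0, 6.0]
loss_perfect = physics_guided_loss(q_obs, q_obs, precip)
loss_negative = physics_guided_loss([-1.0, 2.0, 3.0], q_obs, precip)
```

Because the penalties act on predictions rather than on labels, a term of this kind can also be evaluated at locations without discharge observations, which is part of the appeal of physics-guided training for ungauged basins.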

Remote sensing remains indispensable for supporting AI-based hydrology, but integration challenges persist due to spatial–temporal mismatches, noise, and multi-resolution data sources. Robust data fusion techniques, combining satellite, reanalysis, and sparse in situ data, are needed to maximize reliability. At the same time, practical issues of reproducibility, computational efficiency, and deployment in resource-limited environments highlight the importance of standardized benchmarking protocols and open science practices. Cloud-based platforms like Google Earth Engine and Microsoft Planetary Computer also offer promising avenues for real-time, large-scale applications in remote or resource-limited regions.

To translate these technical advances into practice, greater collaboration between hydrologists, data scientists, and policy-makers is essential. Embedding AI-driven tools into operational workflows, such as flood early warning systems, real-time decision support tools, and water allocation strategies, can bridge the gap between model development and impact-driven implementation.

Looking ahead, future research should focus on integrating innovative approaches that bridge the gap between AI-based models and sustainable water management. Nature-based solutions (NbS), such as river restoration, wetland conservation, and green infrastructure, can be coupled with hydrological and AI models to improve flow regulation while enhancing ecosystem services. Additionally, the concept of digital twins, virtual replicas of water systems continuously updated with real-time data, presents a powerful tool for dynamic monitoring and scenario testing. By linking physical models with real-time sensor networks, researchers can achieve higher predictive accuracy, adaptive forecasting, and proactive management of extreme events. Embedding these approaches into AI-based hydrology will advance the reliability, sustainability, and resilience of flow modeling in the face of climate and land-use change.

To systematically organize these future directions and ensure actionable guidance, we propose a structured roadmap (Figure 5) that categorizes challenges and solutions into four pillars: Model Development and Benchmarking, Data Integration and Digital Twins, Operational Deployment and Reproducibility, and Governance and Ethics. The roadmap outlines strategies across technical, operational, and ethical dimensions. Key priorities include the development of community-driven benchmarking protocols and open-access datasets to standardize model evaluation and improve transferability across diverse hydrological settings. Integrating digital twins with cloud platforms and IoT-enabled sensor networks can facilitate real-time data assimilation, adaptive forecasting, and dynamic water system management. Addressing operational deployment barriers requires containerized workflows and reproducible pipelines, ensuring transparency, portability, and reliable model execution across platforms. Finally, governance and ethics must be embedded in the AI lifecycle through explainable AI frameworks and clear policy guidelines, promoting fairness, accountability, and stakeholder trust. Together, these strategies bridge the gap between technical innovation and operational impact.

Beyond its practical orientation, the roadmap also serves as a theory-building framework for reconceptualizing the role of AI in hydrology. By aligning technical innovations with methodological and ethical considerations, the roadmap highlights a shift from purely data-driven prediction toward a more integrated paradigm that combines physical consistency, interpretability, and decision relevance. This reflects a critical interpretive synthesis of the reviewed literature: rather than simply summarizing individual studies, the framework identifies higher-order themes, such as prediction–interpretation trade-offs, cross-basin transferability, and governance of AI tools, that collectively redefine how hydrological knowledge is produced and applied. In this sense, the roadmap contributes not only to guiding future research but also to reframing AI in hydrology as both a predictive and theory-generating paradigm, advancing the scientific understanding of hydrological processes under uncertainty.

These multi-dimensional insights are visually synthesized in Figure 6, which presents a conceptual framework summarizing the key components and knowledge pathways outlined in this review. The framework consists of five interconnected domains: (1) Challenges, including data scarcity, geographic bias, limited transferability, and weak uncertainty handling; (2) AI Approaches, such as machine learning, deep learning, hybrid models, and XAI techniques; (3) Remote Sensing Integration, which enables spatial modeling; (4) Hydrological Applications, ranging from streamflow prediction and catchment delineation to bias correction and flood mapping; and (5) Research Gaps and Future Directions, which call for improved physics-informed modeling, open science practices, and interdisciplinary collaboration. This framework not only encapsulates the review’s findings but also serves as a strategic guide for future research and operational adoption of AI in hydrology, particularly in ungauged and vulnerable river basins.

6. Conclusions

This review comprehensively examined the evolution, state-of-the-art methodologies, and future directions of AI applications in hydrological modeling for streamflow prediction in ungauged watersheds. Ungauged basins, characterized by a lack of in situ measurements, pose persistent challenges to traditional hydrological modeling frameworks, particularly due to difficulties in calibration, parameter regionalization, and data sparsity. In response, AI, ranging from conventional machine learning algorithms (random forests, support vector machines) to advanced deep learning models (LSTM, GRU, CNN), has emerged as a transformative tool capable of learning complex nonlinear hydrologic relationships and leveraging remotely sensed and proxy data inputs.

The review highlights that AI models frequently outperform traditional physically based and conceptual models across a range of hydroclimatic settings, particularly when enhanced with satellite-derived variables such as precipitation (GPM, TRMM), NDVI, DEMs, and soil moisture products. Hybrid approaches that integrate AI with physical constraints or embed domain knowledge (ecohydrological rules, attention mechanisms, regionalization) show particular promise in improving model transferability and robustness in data-scarce basins. Furthermore, advances in bias correction, downscaling, and gap filling using AI have significantly enhanced the utility of remote sensing products for hydrological applications. Beyond technical applications, these developments suggest a reconceptualization of hydrological AI as not merely predictive, but as a theory-building paradigm that integrates physical laws, data-driven insights, and interpretability for sustainable water management.

What sets this review apart from prior literature is its holistic synthesis of AI and remote sensing integration for ungauged streamflow prediction, spanning from hydrological model evolution, AI-enhanced pre-processing techniques, hybrid frameworks, and model evaluation protocols to reproducibility and transferability challenges. Unlike earlier studies that focus narrowly on algorithm performance, this review identifies cross-cutting themes such as proxy variable synthesis, explainability, regionalization strategies, and open science practices. It also introduces a structured comparison of model evaluation techniques, highlights underexplored geographies like Southeast Asia and tropical basins, and proposes a forward-looking research agenda that emphasizes scalability, equity, and sustainability.

Nonetheless, challenges persist. Model performance remains inconsistent across diverse climatic zones, catchment dissimilarities still constrain model transferability, and uncertainty quantification is underutilized. Moreover, the lack of standardized benchmarking protocols and limited access to reproducible workflows hamper broader adoption. To address these gaps, future research should focus on:

Developing physics-informed AI models that enhance interpretability and maintain hydrological realism;
Creating region-specific proxy datasets for ungauged conditions;
Promoting open-source datasets, reproducible code, and model transparency;
Embedding long-term robustness testing under non-stationary land use and climate conditions.

Ultimately, this review contributes a timely, integrative, and forward-looking roadmap for advancing AI-enhanced hydrological modeling in ungauged basins. Its findings are expected to inform water resource planners, researchers, and AI developers seeking to build resilient, data-efficient, and transferable prediction systems in a rapidly changing hydrological world.

Author Contributions

Conceptualization, C.E.F.M. and R.G.T.M.; methodology, C.E.F.M. and R.G.T.M.; software, J.G.G., R.G.T.M. and J.C.F.M.; validation, R.G.T.M. and J.C.F.M.; formal analysis, R.G.T.M. and J.G.G.; investigation, R.G.T.M. and J.C.F.M.; resources, J.G.G. and C.E.F.M.; data curation, R.G.T.M., J.G.G. and J.C.F.M.; writing—original draft preparation, R.G.T.M. and J.C.F.M.; writing—review and editing, J.G.G. and C.E.F.M.; visualization, J.G.G.; supervision, C.E.F.M.; project administration, C.E.F.M.; funding acquisition, C.E.F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

This review did not generate or analyze new empirical data. All findings are based on previously published, peer-reviewed literature, which is cited throughout the manuscript. Readers seeking additional details should consult the sources listed in the References section.

Acknowledgments

The authors would like to acknowledge Mapúa University for providing financial support for the Article Processing Charges (APC) associated with this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
ANN	Artificial Neural Network
BFI	Baseflow Index
CAMELS	Catchment Attributes and Meteorology for Large-Sample Studies
CNN	Convolutional Neural Network
DEM	Digital Elevation Model
DHM	Distributed Hydrological Model
DL	Deep Learning
FDC	Flow Duration Curve
GEE	Google Earth Engine
GPM	Global Precipitation Measurement
GR4J	Génie Rural à 4 paramètres Journalier (GR4J rainfall-runoff model)
GPR	Gaussian Process Regression
HBV	Hydrologiska Byråns Vattenbalansavdelning model
KGE	Kling–Gupta Efficiency
LOOCV	Leave-One-Out Cross-Validation
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
ML	Machine Learning
MODIS	Moderate Resolution Imaging Spectroradiometer
NAM	Nedbør-Afstrømnings-Model
NDVI	Normalized Difference Vegetation Index
NSE	Nash–Sutcliffe Efficiency
PCA	Principal Component Analysis
R²	Coefficient of Determination
RF	Random Forest
RMSE	Root Mean Square Error
RS	Remote Sensing
RNN	Recurrent Neural Network
SVM	Support Vector Machine
TRMM	Tropical Rainfall Measuring Mission
VELMA	Visualizing Ecosystem Land Management Assessments
DHSVM	Distributed Hydrology Soil Vegetation Model
MIKE SHE	(Commercially available) Integrated Hydrological Modeling System by DHI

References

Panda, C.; Panda, K.C.; Singh, R.M.; Singh, R.; Singh, V.P. A Generalised Hydrological Model for Streamflow Prediction Using Wavelet Ensembling. J. Hydrol. 2025, 655, 132883.
Dewangan, M.; Vishvakarma, P. Hydrological Modeling and Its Applications: A Review. Int. J. Early Child. Spec. Educ. 2020, 12, 966–971.
Ibrahim, U.A.; Dan’azumi, S. An Overview of Some Hydrological Models in Water Resources Engineering Systems. Arid. Zone J. Eng. Technol. Environ. 2020, 16, 285–292.
Baran-Gurgul, K.; Rutkowska, A. Water Resource Management: Hydrological Modelling, Hydrological Cycles, and Hydrological Prediction. Water 2024, 16, 3689.
Singson, C.L.; Alejo, L.A.; Balderama, O.F.; Bareng, J.L.R.; Kantoush, S.A. Modeling Climate Change Impact on the Inflow of the Magat Reservoir Using the Soil and Water Assessment Tool (SWAT) Model for Dam Management. J. Water Clim. Change 2023, 14, 633–650.
Zhao, Q.; Ding, S.; Ji, X.; Hong, Z.; Lu, M.; Wang, P. Relative Contribution of the Xiaolangdi Dam to Runoff Changes in the Lower Yellow River. Land 2021, 10, 521.
Gacu, J.G.; Monjardin, C.E.F.; Senoro, D.B.; Tan, F.J. Flood Risk Assessment Using GIS-Based Analytical Hierarchy Process in the Municipality of Odiongan, Romblon, Philippines. Appl. Sci. 2022, 12, 9456.
Budhathoki, B.R.; Adhikari, T.R.; Shrestha, S.; Awasthi, R.P.; Dawadi, B.; Gao, H.; Dhital, Y.P. Application of Hydrological Models to Streamflow Estimation at Ungauged Transboundary Himalayan River Basin, Nepal. Hydrol. Res. 2024, 55, 859–872.
Belina, Y.; Kebede, A.; Masinde, M. Comparative Analysis of HEC-HMS and Machine Learning Models for Rainfall-Runoff Prediction in the Upper Baro Watershed, Ethiopia. Hydrol. Res. 2024, 55, 873–889.
de Paula Netto, M.O.; Coimbra, V.S.; Lagarez Junior, M.L.; Ferreira, A.A.; Rocha, C.H.B. Comparative Analysis of SWAT and HEC-HMS Models for Efficient Watershed Management. Rev. Gestão Soc. e Ambient. 2024, 18, e09931.
Ferreira, R.G.; Dias, R.L.S.; de Siqueira Castro, J.; dos Santos, V.J.; Calijuri, M.L.; da Silva, D.D. Performance of Hydrological Models in Fluvial Flow Simulation. Ecol. Inform. 2021, 66, 101453.
Dadson, S.J.; Hall, J.W.; Murgatroyd, A.; Acreman, M.; Bates, P.; Beven, K.; Holden, J.; Holman, I.P.; Lane, S.N.; O’Connell, E.; et al. A Restatement of the Natural Science Evidence Concerning Flood Management in the UK. Proc. R. Soc. A 2017, 473, 20160706.
Beven, K. Rainfall-Runoff Modelling: The Primer, 2nd ed.; John Wiley & Sons, Ltd.: Oxford, UK; Chichester, UK; Hoboken, NJ, USA, 2012; ISBN 9781119951001.
Arnold, J.G.; Srinivasan, R.; Muttiah, R.S.; Williams, J.R. Large Area Hydrologic Modeling and Assessment Part I: Model Development. J. Am. Water Resour. Assoc. 1998, 34, 73–89.
Özdoğan-Sarıkoç, G.; Dadaser-Celik, F. Physically Based vs. Data-Driven Models for Streamflow and Reservoir Volume Prediction at a Data-Scarce Semi-Arid Basin. Environ. Sci. Pollut. Res. 2024, 31, 39098–39119.
Tran Tuan, T. Multiple Conceptual Hydrological Models for Simulating Streamflow in Data-Sparse River Basins: An Application of the Vietnamese Cau River Basin. Water Pract. Technol. 2024, 19, 2944–2958.
Monjardin, C.E.F.; Uy, F.A.A.; Tan, F.; Cruz, F.R.G. Automated Real-Time Monitoring System (ARMS) of Hydrological Parameters for Ambuklao, Binga and San Roque Dams Cascade in Luzon Island, Philippines. In Proceedings of the 2017 IEEE Conference on Technologies for Sustainability (SusTech), Phoenix, AZ, USA, 12–14 November 2017.
Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep Learning and Process Understanding for Data-Driven Earth System Science. Nature 2019, 566, 195–204.
Shen, C. A Transdisciplinary Review of Deep Learning Research and Its Relevance for Water Resources Scientists. Water Resour. Res. 2018, 54, 8558–8593.
Loukas, A.; Vasiliades, L. Streamflow Simulation Methods for Ungauged and Poorly Gauged Watersheds. Nat. Hazards Earth Syst. Sci. 2014, 14, 1641–1661.
Fischer, S.; Dallan, E.; Fiori, A.; Grimaldi, S.; Kochanek, K.; Prieto, C.; Reis, D.S.; Volpi, E. Hydrological Design in the HELPING Decade–Inspiring the Community to Innovate the Hydrological Design Concept. Hydrol. Sci. J. 2025, 70, 375–389.
Nearing, G.S.; Kratzert, F.; Sampson, A.K.; Pelissier, C.S.; Klotz, D.; Frame, J.M.; Prieto, C.; Gupta, H.V. What Role Does Hydrological Science Play in the Age of Machine Learning? Water Resour. Res. 2021, 57, e2020WR028091.
Coulibaly, P.; Samuel, J.; Pietroniro, A.; Harvey, D. Evaluation of Canadian National Hydrometric Network Density Based on WMO 2008 Standards. Can. Water Resour. J. 2013, 38, 159–167.
Kovacek, D.; Weijs, S. BCUB—A Large-Sample Ungauged Basin Attribute Dataset for British Columbia, Canada. Earth Syst. Sci. Data 2025, 17, 259–275.
Gal, L.; Grippa, M.; Hiernaux, P.; Peugeot, C.; Mougin, E.; Kergoat, L. Changes in Lakes Water Volume and Runoff over Ungauged Sahelian Watersheds. J. Hydrol. 2016, 540, 1176–1188.
Tourian, M.J.; Papa, F.; Elmi, O.; Sneeuw, N.; Kitambo, B.; Tshimanga, R.M.; Paris, A.; Calmant, S. Current Availability and Distribution of Congo Basin’s Freshwater Resources. Commun. Earth Environ. 2023, 4, 174.
Sivapalan, M.; Takeuchi, K.; Franks, S.W.; Gupta, V.K.; Karambiri, H.; Lakshmi, V.; Liang, X.; McDonnell, J.J.; Mendiondo, E.M.; O’Connell, P.E.; et al. IAHS Decade on Predictions in Ungauged Basins (PUB), 2003–2012: Shaping an Exciting Future for the Hydrological Sciences. Hydrol. Sci. J. 2003, 48, 857–880.
Hrachowitz, M.; Savenije, H.H.G.; Blöschl, G.; McDonnell, J.J.; Sivapalan, M.; Pomeroy, J.W.; Arheimer, B.; Blume, T.; Clark, M.P.; Ehret, U.; et al. A Decade of Predictions in Ungauged Basins (PUB)-a Review. Hydrol. Sci. J. 2013, 58, 1198–1255.
Xie, J.; Liu, X.; Jasechko, S.; Berghuijs, W.R.; Wang, K.; Liu, C.; Reichstein, M.; Jung, M.; Koirala, S. Majority of Global River Flow Sustained by Groundwater. Nat. Geosci. 2024, 17, 770–777.
Nearing, G.; Cohen, D.; Dube, V.; Gauch, M.; Gilon, O.; Harrigan, S.; Hassidim, A.; Klotz, D.; Kratzert, F.; Metzger, A.; et al. Global Prediction of Extreme Floods in Ungauged Watersheds. Nature 2024, 627, 559–563.
Huang, P.; Wang, G.; Guo, L.; Mello, C.R.; Li, K.; Ma, J.; Sun, S. Most Global Gauging Stations Present Biased Estimations of Total Catchment Discharge. Geophys. Res. Lett. 2023, 50, e2023GL104253.
Monjardin, C.E.F.; Transfiguracion, K.M.; Mangunay, J.P.J.; Paguia, K.M.; Uy, F.A.A.; Tan, F.J. Determination of River Water Level Triggering Flood in Manghinao River in Bauan, Batangas, Philippines. J. Mech. Eng. 2021, 18, 181–192.
Tran, T.N.D.; Le, M.H.; Zhang, R.; Nguyen, B.Q.; Bolten, J.D.; Lakshmi, V. Robustness of Gridded Precipitation Products for Vietnam Basins Using the Comprehensive Assessment Framework of Rainfall. Atmos. Res. 2023, 293, 106923.
Lu, Z.; Kim, J. A Framework for Studying Hydrology-Driven Landslide Hazards in Northwestern US Using Satellite InSAR, Precipitation and Soil Moisture Observations: Early Results and Future Directions. GeoHazards 2021, 2, 17–40.
Idowu, D.; Peter, B.G.; Boakye, J.; Cohen, S.; Carter, E. Evaluating Earth Observation Products for Catchment-Scale Operational Flood Monitoring and Risk Management in a Sparsely Gauged to Ungauged River Basin in Nigeria. Int. J. Appl. Earth Obs. Geoinf. 2025, 138, 104445.
Dembélé, M.; Hrachowitz, M.; Savenije, H.H.G.; Mariéthoz, G.; Schaefli, B. Improving the Predictive Skill of a Distributed Hydrological Model by Calibration on Spatial Patterns With Multiple Satellite Data Sets. Water Resour. Res. 2020, 56, e2019WR026085.
Feng, D.; Lawson, K.; Shen, C. Mitigating Prediction Error of Deep Learning Streamflow Models in Large Data-Sparse Regions With Ensemble Modeling and Soft Data. Geophys. Res. Lett. 2021, 48, e2021GL092999.
Arsenault, R.; Brissette, F.; Martel, J.L.; Troin, M.; Lévesque, G.; Davidson-Chaput, J.; Gonzalez, M.C.; Ameli, A.; Poulin, A. A Comprehensive, Multisource Database for Hydrometeorological Modeling of 14,425 North American Watersheds. Sci. Data 2020, 7, 243.
Fekete, B.M.; Vörösmarty, C.J.; Grabs, W. High-Resolution Fields of Global Runoff Combining Observed River Discharge and Simulated Water Balances. Glob. Biogeochem. Cycles 2002, 16, 15-1–15-10.
Newman, A.J.; Clark, M.P.; Sampson, K.; Wood, A.; Hay, L.E.; Bock, A.; Viger, R.J.; Blodgett, D.; Brekke, L.; Arnold, J.R.; et al. Development of a Large-Sample Watershed-Scale Hydrometeorological Data Set for the Contiguous USA: Data Set Characteristics and Assessment of Regional Variability in Hydrologic Model Performance. Hydrol. Earth Syst. Sci. 2015, 19, 209–223.
Ouyang, W.; Ye, L.; Chai, Y.; Ma, H.; Chu, J.; Peng, Y.; Zhang, C. A Differentiable, Physics-Based Hydrological Model and Its Evaluation for Data-Limited Basins. J. Hydrol. 2025, 649, 132471.
Leeming, K.A.; Bloomfield, J.P.; Coxon, G.; Zheng, Y. Functional Data Analysis to Investigate Controls on and Changes in the Seasonality of UK Baseflow. Hydrol. Sci. J. 2025, 70, 522–534.
Kratzert, F.; Klotz, D.; Herrnegger, M.; Sampson, A.K.; Hochreiter, S.; Nearing, G.S. Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning. Water Resour. Res. 2019, 55, 11344–11354.
Mosavi, A.; Ozturk, P.; Chau, K.W. Flood Prediction Using Machine Learning Models: Literature Review. Water 2018, 10, 1536.
Palmer, D.; Koubli, E.; Cole, I.; Betts, T.; Gottschalg, R. Satellite or Ground-Based Measurements for Production of Site Specific Hourly Irradiance Data: Which Is Most Accurate and Where? Sol. Energy 2018, 165, 240–255.
Gauch, M.; Kratzert, F.; Klotz, D.; Nearing, G.; Lin, J.; Hochreiter, S. Rainfall-Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network. Hydrol. Earth Syst. Sci. 2021, 25, 2045–2062.
Tyralis, H.; Papacharalampous, G.; Langousis, A. Super Ensemble Learning for Daily Streamflow Forecasting: Large-Scale Demonstration and Comparison with Multiple Machine Learning Algorithms. Neural Comput. Appl. 2021, 33, 3053–3068.
Li, W.; Kiaghadi, A.; Dawson, C. High Temporal Resolution Rainfall–Runoff Modeling Using Long-Short-Term-Memory (LSTM) Networks. Neural Comput. Appl. 2021, 33, 1261–1278.
Farfán-Durán, J.F.; Cea, L. Streamflow Forecasting with Deep Learning Models: A Side-by-Side Comparison in Northwest Spain. Earth Sci. Inform. 2024, 17, 5289–5315.
Arsenault, R.; Martel, J.L.; Brunet, F.; Brissette, F.; Mai, J. Continuous Streamflow Prediction in Ungauged Basins: Long Short-Term Memory Neural Networks Clearly Outperform Traditional Hydrological Models. Hydrol. Earth Syst. Sci. 2023, 27, 139–157.
Alipour, M.H. Streamflow Prediction in Ungauged Basins Located within Data-Scarce Areas Using XGBoost: Role of Feature Engineering and Explainability. Int. J. River Basin Manag. 2025, 23, 71–92.
Gacu, J.G.; Monjardin, C.E.F.; Mangulabnan, R.G.T.; Pugat, G.C.E.; Solmerin, J.G. Artificial Intelligence (AI) in Surface Water Management: A Comprehensive Review of Methods, Applications, and Challenges. Water 2025, 17, 1707.
Rahman, A.; Jahan, S.; Yildirim, G.; Alim, M.A.; Haque, M.; Rahman, M.M.; Kausher, A.H.M. A Review and Analysis of Water Research, Development, and Management in Bangladesh. Water 2022, 14, 1834.
Kamyab, H.; Khademi, T.; Chelliapan, S.; SaberiKamarposhti, M.; Rezania, S.; Yusuf, M.; Farajnezhad, M.; Abbas, M.; Hun Jeon, B.; Ahn, Y. The Latest Innovative Avenues for the Utilization of Artificial Intelligence and Big Data Analytics in Water Resource Management. Results Eng. 2023, 20, 101566.
Biazar, S.M.; Golmohammadi, G.; Nedhunuri, R.R.; Shaghaghi, S.; Mohammadi, K. Artificial Intelligence in Hydrology: Advancements in Soil, Water Resource Management, and Sustainable Development. Sustainability 2025, 17, 2250.
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. Int. J. Surg. 2021, 88, 105906.
Wijayarathne, D.; Boodoo, S.; Coulibaly, P.; Sills, D. Evaluation of Radar Quantitative Precipitation Estimates (QPEs) as an Input of Hydrological Models for Hydrometeorological Applications. J. Hydrometeorol. 2020, 21, 1847–1864.
Thorndahl, S.; Einfalt, T.; Willems, P.; Ellerbæk Nielsen, J.; Ten Veldhuis, M.C.; Arnbjerg-Nielsen, K.; Rasmussen, M.R.; Molnar, P. Weather Radar Rainfall Data in Urban Hydrology. Hydrol. Earth Syst. Sci. 2017, 21, 1359–1380.
Vivoni, E.R.; Mascaro, G.; Mniszewski, S.; Fasel, P.; Springer, E.P.; Ivanov, V.Y.; Bras, R.L. Real-World Hydrologic Assessment of a Fully-Distributed Hydrological Model in a Parallel Computing Environment. J. Hydrol. 2011, 409, 483–496.
Sezen, C.; Šraj, M. Improving the Simulations of the Hydrological Model in the Karst Catchment by Integrating the Conceptual Model with Machine Learning Models. Sci. Total Environ. 2024, 926, 171684.
Lee, J.; Chung, E.S.; Kim, S.; Kim, D. Streamflow Forecasting in Ungauged Basins with CNN-LSTM and Radar-Based Precipitation. J. Hydro-Environ. Res. 2025, 60–61, 100666.
Gui, Z.; Zhang, F.; Chang, D.; Xie, A.; Yue, K.; Wang, H. A General Method to Improve Runoff Prediction in Ungauged Basins Based on Remotely Sensed Actual Evapotranspiration Data. Water 2023, 15, 3307.
Devia, G.K.; Ganasri, B.P.; Dwarakish, G.S. A Review on Hydrological Models. Aquat. Procedia 2015, 4, 1001–1007.
Vieux, B.E. Distributed Hydrologic Modeling Using GIS; Springer: Dordrecht, The Netherlands, 2001; pp. 1–17.
Islam, Z. A Review on Physically Based Hydrologic Modeling; Department of Civil and Environmental Engineering, University of Alberta: Edmonton, AB, Canada, 2011; p. 46.
Sidle, R.C. Strategies for Smarter Catchment Hydrology Models: Incorporating Scaling and Better Process Representation. Geosci. Lett. 2021, 8, 24.
Ibrahim, A.B.; Cordery, I. Estimation of Recharge and Runoff Volumes from Ungauged Catchments in Eastern Australia. Hydrol. Sci. J. 1995, 40, 499–515.
Nyabeze, W.R. Calibrating a Distributed Model to Estimate Runoff for Ungauged Catchments in Zimbabwe. Phys. Chem. Earth 2005, 30, 625–633.
Razavi, T.; Coulibaly, P. Streamflow Prediction in Ungauged Basins: Review of Regionalization Methods. J. Hydrol. Eng. 2013, 18, 958–975.
Asadi, S.; Jimeno-Sáez, P.; López-Ballesteros, A.; Senent-Aparicio, J. Comparison and Integration of Physical and Interpretable AI-Driven Models for Rainfall-Runoff Simulation. Results Eng. 2024, 24, 103048.
Ng, K.W.; Huang, Y.F.; Koo, C.H.; Chong, K.L.; El-Shafie, A.; Najah Ahmed, A. A Review of Hybrid Deep Learning Applications for Streamflow Forecasting. J. Hydrol. 2023, 625, 130141.
Souffront Alcantara, M.A.; Nelson, E.J.; Shakya, K.; Edwards, C.; Roberts, W.; Krewson, C.; Ames, D.P.; Jones, N.L.; Gutierrez, A. Hydrologic Modeling as a Service (HMaaS): A New Approach to Address Hydroinformatic Challenges in Developing Countries. Front. Environ. Sci. 2019, 7, 158.
Clark, M.P.; Rupp, D.E.; Woods, R.A.; Zheng, X.; Ibbitt, R.P.; Slater, A.G.; Schmidt, J.; Uddstrom, M.J. Hydrological Data Assimilation with the Ensemble Kalman Filter: Use of Streamflow Observations to Update States in a Distributed Hydrological Model. Adv. Water Resour. 2008, 31, 1309–1324.
Wang, X.; Babovic, V. Application of Hybrid Kalman Filter for Improving Water Level Forecast. J. Hydroinformatics 2016, 18, 773–790.
Liu, Y.; Wang, W.; Hu, Y.; Cui, W. Improving the Distributed Hydrological Model Performance in Upper Huai River Basin: Using Streamflow Observations to Update the Basin States via the Ensemble Kalman Filter. Adv. Meteorol. 2016, 2016, 4921616.
Zhao, S.; Liu, M.; Tao, M.; Zhou, W.; Lu, X.; Xiong, Y.; Li, F.; Wang, Q. The Role of Satellite Remote Sensing in Mitigating and Adapting to Global Climate Change. Sci. Total Environ. 2023, 904, 166820.
Mu, Y.; Biggs, T.; Shen, S.S.P. Satellite-Based Precipitation Estimates Using a Dense Rain Gauge Network over the Southwestern Brazilian Amazon: Implication for Identifying Trends in Dry Season Rainfall. Atmos. Res. 2021, 261, 105741.
Aryastana, P.; Liu, C.Y.; Jong-Dao Jou, B.; Cayanan, E.; Punay, J.P.; Chen, Y.N. Assessment of Satellite Precipitation Data Sets for High Variability and Rapid Evolution of Typhoon Precipitation Events in the Philippines. Earth Space Sci. 2022, 9, e2022EA002382.
Zhang, G.; Xu, T.; Yin, W.; Bateni, S.M.; Jun, C.; Kim, D.; Liu, S.; Xu, Z.; Ming, W.; Wang, J. A Machine Learning Downscaling Framework Based on a Physically Constrained Sliding Window Technique for Improving Resolution of Global Water Storage Anomaly. Remote Sens. Environ. 2024, 313, 114359.
Beck, H.E.; van Dijk, A.I.J.M.; de Roo, A.; Miralles, D.G.; McVicar, T.R.; Schellekens, J.; Bruijnzeel, L.A. Global-Scale Regionalization of Hydrologic Model Parameters. Water Resour. Res. 2016, 52, 3599–3622.
Dal Molin, M.; Kavetski, D.; Albert, C.; Fenicia, F. Exploring Signature-Based Model Calibration for Streamflow Prediction in Ungauged Basins. Water Resour. Res. 2023, 59, e2022WR031929.
El Garnaoui, M.; Boudhar, A.; Nifa, K.; El Jabiri, Y.; Karaoui, I.; El Aloui, A.; Midaoui, A.; Karroum, M.; Mosaid, H.; Chehbouni, A. Nested Cross-Validation for HBV Conceptual Rainfall–Runoff Model Spatial Stability Analysis in a Semi-Arid Context. Remote Sens. 2024, 16, 3756.
Kim, D.; Jung, I.W.; Chun, J.A. A Comparative Assessment of Rainfall-Runoff Modelling against Regional Flow Duration Curves for Ungauged Catchments. Hydrol. Earth Syst. Sci. 2017, 21, 5647–5661.
Song, Z.; Xia, J.; Wang, G.; She, D.; Hu, C.; Hong, S. Regionalization of Hydrological Model Parameters Using Gradient Boosting Machine. Hydrol. Earth Syst. Sci. 2022, 26, 505–524.
Kuana, L.A.; Almeida, A.S.; Mercuri, E.G.F.; Noe, S.M. Regionalization of GR4J Model Parameters for River Flow Prediction in Paraná, Brazil. Hydrol. Earth Syst. Sci. 2024, 28, 3367–3390.
Beck, H.E.; Van Dijk, A.I.J.M.; De Roo, A.; Dutra, E.; Fink, G.; Orth, R.; Schellekens, J. Global Evaluation of Runoff from 10 State-of-the-Art Hydrological Models. Hydrol. Earth Syst. Sci. 2017, 21, 2881–2903.
Parajka, J.; Merz, R.; Blöschl, G. A Comparison of Regionalisation Methods for Catchment Model Parameters. Hydrol. Earth Syst. Sci. 2005, 9, 157–171.
Papacharalampous, G.; Tyralis, H. Time Series Features for Supporting Hydrometeorological Explorations and Predictions in Ungauged Locations Using Large Datasets. Water 2022, 14, 1657.
Bastola, S. The Regionalization of a Parameter of HYMOD, a Conceptual Hydrological Model, Using Data from across the Globe. HydroResearch 2022, 5, 13–21.
Lan, T.; Zhang, J.; Li, H.; Zhang, H.; Gong, X.; Sun, J.; Chen, Y.D.; Xu, C.Y. Flow Duration Curve Prediction: A Framework Integrating Regionalization and Copula Model. J. Hydrol. 2025, 647, 132364.
Kiraz, M.; Coxon, G.; Wagener, T. A Signature-Based Hydrologic Efficiency Metric for Model Calibration and Evaluation in Gauged and Ungauged Catchments. Water Resour. Res. 2023, 59, e2023WR035321.
Ridolfi, E.; Kumar, H.; Bárdossy, A. A Methodology to Estimate Flow Duration Curves at Partially Ungauged Basins. Hydrol. Earth Syst. Sci. 2020, 24, 2043–2060.
Guo, Y.; Zhang, Y.; Zhang, L.; Wang, Z. Regionalization of Hydrological Modeling for Predicting Streamflow in Ungauged Catchments: A Comprehensive Review. Wiley Interdiscip. Rev. Water 2021, 8, e1487.
Seleem, O.; Ayzel, G.; de Souza, A.C.T.; Bronstert, A.; Heistermann, M. Towards Urban Flood Susceptibility Mapping Using Data-Driven Models in Berlin, Germany. Geomat. Nat. Hazards Risk 2022, 13, 1640–1662.
Li, J.; Wang, Z.; Wu, X.; Xu, C.Y.; Guo, S.; Chen, X.; Zhang, Z. Robust Meteorological Drought Prediction Using Antecedent SST Fluctuations and Machine Learning. Water Resour. Res. 2021, 57, e2020WR029413.
Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
Chen, J.; Adams, B.J. Integration of Artificial Neural Networks with Conceptual Models in Rainfall-Runoff Modeling. J. Hydrol. 2006, 318, 232–249.
Bhasme, P.; Vagadiya, J.; Bhatia, U. Enhancing Predictive Skills in Physically-Consistent Way: Physics Informed Machine Learning for Hydrological Processes. J. Hydrol. 2022, 615, 128618.
Núñez, J.; Cortés, C.B.; Yáñez, M.A. Explainable Artificial Intelligence in Hydrology: Interpreting Black-Box Snowmelt-Driven Streamflow Predictions in an Arid Andean Basin of North-Central Chile. Water 2023, 15, 3369.
Lipton, Z.C. The Mythos of Model Interpretability. Commun. ACM 2018, 61, 35–43.
Wilbrand, K.; Taormina, R.; ten Veldhuis, M.C.; Visser, M.; Hrachowitz, M.; Nuttall, J.; Dahm, R. Predicting Streamflow with LSTM Networks Using Global Datasets. Front. Water 2023, 5, 1166124.
Zhong, L.; Lei, H.; Li, Z.; Jiang, S. Advancing Streamflow Prediction in Data-Scarce Regions through Vegetation-Constrained Distributed Hybrid Ecohydrological Models. J. Hydrol. 2024, 645, 132165.
Eythorsson, D.; Clark, M. Toward Automated Scientific Discovery in Hydrology: The Opportunities and Dangers of AI Augmented Research Frameworks. Hydrol. Process. 2025, 39, e70065.
Monjardin, C.E.F.; Uy, F.A.A.; Tan, F.J.; Carpio, R.C.; Javate, K.C.P.; Laquindanum, J.P. Application of Artificial Neuro-Fuzzy Interference System in Rainfall-Runoff Modelling at Imus River, Cavite. In Proceedings of the 2020 IEEE Conference on Technologies for Sustainability (SusTech), Santa Ana, CA, USA, 23–25 April 2020.
Zhang, Y.; Zhou, Z.; Deng, Y.; Pan, D.; Van Griensven Thé, J.; Yang, S.X.; Gharabaghi, B. Daily Streamflow Forecasting Using Networks of Real-Time Monitoring Stations and Hybrid Machine Learning Methods. Water 2024, 16, 1284.
Nogueira Filho, F.J.M.; Souza Filho, F.D.A.; Porto, V.C.; Rocha, R.V.; Estácio, Á.B.S.; Martins, E.S.P.R. Deep Learning for Streamflow Regionalization for Ungauged Basins: Application of Long-Short-Term-Memory Cells in Semiarid Regions. Water 2022, 14, 1318.
Singh, L.; Mishra, P.K.; Pingale, S.M.; Khare, D.; Thakur, H.P. Streamflow Regionalisation of an Ungauged Catchment with Machine Learning Approaches. Hydrol. Sci. J. 2022, 67, 886–897.
Kratzert, F.; Gauch, M.; Klotz, D.; Nearing, G. HESS Opinions: Never Train a Long Short-Term Memory (LSTM) Network on a Single Basin. Hydrol. Earth Syst. Sci. 2024, 28, 4187–4201.
Ayzel, G.; Kurochkina, L.; Abramov, D.; Zhuravlev, S. Development of a Regional Gridded Runoff Dataset Using Long Short-Term Memory (LSTM) Networks. Hydrology 2021, 8, 6.
Rouse, R.E.; Khamis, D.; Hosking, S.; McRobie, A.; Shuckburgh, E. Streamflow Prediction Using Artificial Neural Networks and Soil Moisture Proxies. Environ. Data Sci. 2025, 4, e5.
Wanders, N.; Bierkens, M.F.P.; de Jong, S.M.; de Roo, A.; Karssenberg, D. The Benefits of Using Remotely Sensed Soil Moisture in Parameter Identification of Large-Scale Hydrological Models. Water Resour. Res. 2014, 50, 6874–6891.
Sun, W.C.; Ishidaira, H.; Bastola, S. Towards Improving River Discharge Estimation in Ungauged Basins: Calibration of Rainfall-Runoff Models Based on Satellite Observations of River Flow Width at Basin Outlet. Hydrol. Earth Syst. Sci. 2010, 14, 2011–2022.
Li, D.; Marshall, L.; Liang, Z.; Sharma, A.; Zhou, Y. Bayesian LSTM With Stochastic Variational Inference for Estimating Model Uncertainty in Process-Based Hydrological Models. Water Resour. Res. 2021, 57, e2021WR029772.
Ghobadi, F.; Kang, D. Multi-Step Ahead Probabilistic Forecasting of Daily Streamflow Using Bayesian Deep Learning: A Multiple Case Study. Water 2022, 14, 3672.
Addor, N.; Newman, A.J.; Mizukami, N.; Clark, M.P. The CAMELS Data Set: Catchment Attributes and Meteorology for Large-Sample Studies. Hydrol. Earth Syst. Sci. 2017, 21, 5293–5313.
Liu, S.; Lu, D.; Painter, S.L.; Griffiths, N.A.; Pierce, E.M. Uncertainty Quantification of Machine Learning Models to Improve Streamflow Prediction under Changing Climate and Environmental Conditions. Front. Water 2023, 5, 1150126.
Di, D.; Bai, Y.; Fang, H.; Sun, B.; Wang, N.; Li, B. Intelligent Siltation Diagnosis for Drainage Pipelines Using Weak-Form Analysis and Theory-Guided Neural Networks in Geo-Infrastructure. Autom. Constr. 2025, 176, 106246.
Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-Informed Machine Learning. Nat. Rev. Phys. 2021, 3, 422–440.
Cui, Z.; Chen, Q.; Luo, J.; Ma, X.; Liu, G. Characterizing Subsurface Structures From Hard and Soft Data With Multiple-Condition Fusion Neural Network. Water Resour. Res. 2024, 60, e2024WR038170.
Jiang, S.; Sweet, L.B.; Blougouras, G.; Brenning, A.; Li, W.; Reichstein, M.; Denzler, J.; Shangguan, W.; Yu, G.; Huang, F.; et al. How Interpretable Machine Learning Can Benefit Process Understanding in the Geosciences. Earth’s Futur. 2024, 12, e2024EF004540.
Liu, G.; Jiang, E.; Li, D.; Li, J.; Wang, Y.; Zhao, W.; Yang, Z. Annual Multi-Objective Optimization Model and Strategy for Scheduling Cascade Reservoirs on the Yellow River Mainstream. J. Hydrol. 2025, 659, 133306.
Song, J.; Ma, C.; Ran, M. AirGPT: Pioneering the Convergence of Conversational AI with Atmospheric Science. npj Clim. Atmos. Sci. 2025, 8, 179.
Gal, Y.; Ghahramani, Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. Proc. 33rd Int. Conf. Mach. Learn. 2016, 48, 1050–1059.
Farzana, S.Z. Uncertainty in Hydrological Modelling: A Review. Int. J. Hydrol. Res. 2023, 8, 1–13.
Perrin, C.; Michel, C.; Andréassian, V. Improvement of a Parsimonious Model for Streamflow Simulation. J. Hydrol. 2003, 279, 275–289.
Janga, B.; Asamani, G.P.; Sun, Z.; Cristea, N. A Review of Practical AI for Remote Sensing in Earth Sciences. Remote Sens. 2023, 15, 4112.
Tursun, A.; Xie, X.; Wang, Y.; Peng, D.; Liu, Y.; Zheng, B.; Wu, X.; Nie, C. Streamflow Prediction in Human-Regulated Catchments Using Multiscale Deep Learning Modeling With Anthropogenic Similarities. Water Resour. Res. 2024, 60, e2023WR036853.
Wang, J.; Li, Z.; Zhou, L.; Ma, C.; Sun, W. Ensemble Streamflow Simulations in a Qinghai–Tibet Plateau Basin Using a Deep Learning Method with Remote Sensing Precipitation Data as Input. Remote Sens. 2025, 17, 967.
Gado, T.A.; Zamzam, D.H.; Guo, Y.; Zeidan, B.A. Evaluation of Satellite-Based Rainfall Estimates in the Upper Blue Nile Basin. J. Earth Syst. Sci. 2024, 133, 27.
Pan, S.; Pan, N.; Tian, H.; Friedlingstein, P.; Sitch, S.; Shi, H.; Arora, V.K.; Haverd, V.; Jain, A.K.; Kato, E.; et al. Evaluation of Global Terrestrial Evapotranspiration Using State-of-the-Art Approaches in Remote Sensing, Machine Learning and Land Surface Modeling. Hydrol. Earth Syst. Sci. 2020, 24, 1485–1509.
Liou, Y.A.; Kar, S.K. Evapotranspiration Estimation with Remote Sensing and Various Surface Energy Balance Algorithms-a Review. Energies 2014, 7, 2821–2849.
Wang, J.; Wu, Y.; Hu, Z.; Zhang, J. Remote Sensing of Watershed: Towards a New Research Paradigm. Remote Sens. 2023, 15, 2569.
Zhang, Z.; Tian, J.; Huang, Y.; Chen, X.; Chen, S.; Duan, Z. Hydrologic Evaluation of TRMM and GPM IMERG Satellite-Based Precipitation in a Humid Basin of China. Remote Sens. 2019, 11, 431.
Monjardin, C.E.; Cabundocan, C.; Ignacio, C.; Tesnado, C.J. Impact of Climate Change on the Frequency and Severity of Floods in the Pasig-Marikina River Basin. E3S Web Conf. 2019, 117, 5.
Lufi, S.; Ery, S.; Rispiningtati, R. Hydrological Analysis of TRMM (Tropical Rainfall Measuring Mission) Data in Lesti Sub Watershed. Civ. Environ. Sci. 2020, 3, 18–30.
Sun, T.; Yan, N.; Zhu, W.; Zhuang, Q. Assessing a Machine Learning-Based Downscaling Framework for Obtaining 1km Daily Precipitation from GPM Data. Heliyon 2024, 10, e36368.
Meema, T.; Wattanasetpong, J.; Wichakul, S. Integrating Machine Learning and Zoning-Based Techniques for Bias Correction in Gridded Precipitation Data to Improve Hydrological Estimation in the Data-Scarce Region. J. Hydrol. 2025, 646, 132356.
Gyasi-Agyei, Y.; Obuobie, E.; Yu, B.; Addi, M.; Yahaya, B. Optimal Selection of Daily Satellite Precipitation Product Based on Structural Similarity Index at 1 Km Resolution for the Pra Catchment, Ghana. Sci. Rep. 2023, 13, 16702.
Gyawali, B.; Ahmed, M.; Murgulet, D.; Wiese, D. Filling Temporal Gaps Within and Between GRACE and GRACE-FO Records: Advances, Challenges, and Future Opportunities. Earth Sci. Rev. 2021. in review.
Siabi, N.; Sanaeinejad, S.H.; Ghahraman, B. Effective Method for Filling Gaps in Time Series of Environmental Remote Sensing Data: An Example on Evapotranspiration and Land Surface Temperature Images. Comput. Electron. Agric. 2022, 193, 106619.
Dasgupta, R.; Das, S.; Banerjee, G.; Mazumdar, A. Revisit Hydrological Modeling in Ungauged Catchments Comparing Regionalization, Satellite Observations, and Machine Learning Approaches. HydroResearch 2024, 7, 15–31. [Google Scholar] [CrossRef]
Mumtaz, M.; Jahanzaib, S.H.; Hussain, W.; Khan, S.; Youssef, Y.M.; Qaysi, S.; Abdelnabi, A.; Alarifi, N.; Abd-Elmaboud, M.E. Synergy of Remote Sensing and Geospatial Technologies to Advance Sustainable Development Goals for Future Coastal Urbanization and Environmental Challenges in a Riverine Megacity. ISPRS Int. J. Geo-Information 2025, 14, 30. [Google Scholar] [CrossRef]
Pellet, V.; Aires, F.; Yamazaki, D.; Zhou, X.; Paris, A. A First Continuous and Distributed Satellite-Based Mapping of River Discharge over the Amazon. J. Hydrol. 2022, 614, 128481. [Google Scholar] [CrossRef]
Papaioannou, G.; Varlas, G.; Terti, G.; Papadopoulos, A.; Loukas, A.; Panagopoulos, Y.; Dimitriou, E. Flood Inundation Mapping at Ungauged Basins Using Coupled Hydrometeorological-Hydraulic Modelling: The Catastrophic Case of the 2006 Flash Flood in Volos City, Greece. Water 2019, 11, 2328. [Google Scholar] [CrossRef]
Nguyen, N.T.; Du, T.L.T.; Park, H.; Chang, C.H.; Choi, S.; Chae, H.; Nelson, E.J.; Hossain, F.; Kim, D.; Lee, H. Estimating the Impacts of Ungauged Reservoirs Using Publicly Available Streamflow Simulations and Satellite Remote Sensing. Remote Sens. 2023, 15, 4563. [Google Scholar] [CrossRef]
Adeyinka Idowu, A.; Olamide Abigael, A.; Babatunde Kazeem, A.; Modupe Beatrice, O. Design, Fabrication and Evaluation of Electrically-Operated Groundnut Roasting Machine. Am. J. Eng. Technol. Manag. 2020, 5, 48. [Google Scholar] [CrossRef]
Li, J.; Zhao, Y.; Bates, P.; Neal, J.; Tooth, S.; Hawker, L.; Maffei, C. Digital Elevation Models for Topographic Characterisation and Flood Flow Modelling along Low-Gradient, Terminal Dryland Rivers: A Comparison of Spaceborne Datasets for the Río Colorado, Bolivia. J. Hydrol. 2020, 591, 125617. [Google Scholar] [CrossRef]
Santillan, J.R.; Makinano-Santillan, M.; Makinano, R.M. Vertical Accuracy Assessment of ALOS World 3D—30M Digital Elevation Model over Northeastern Mindanao, Philippines. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 5374–5377. [Google Scholar] [CrossRef]
Chatrabhuj; Meshram, K.; Mishra, U.; Omar, P.J. Integration of Remote Sensing Data and GIS Technologies in River Management System. Discov. Geosci. 2024, 2, 67. [Google Scholar] [CrossRef]
Islam, N.; Irshad, K. Artificial Ecosystem Optimization with Deep Learning Enabled Water Quality Prediction and Classification Model. Chemosphere 2022, 309, 136615. [Google Scholar] [CrossRef]
Gacul, L.-A.; Ferrancullo, D.; Gallano, R.; Fadriquela, K.J.; Mendez, K.J.; Morada, J.R.; Morgado, J.K.; Gacu, J. GIS-Based Identification of Flood Risk Zone in a Rural Municipality Using Fuzzy Analytical Hierarchy Process (FAHP). Rev. Int. Géomatique 2024, 33, 295–320. [Google Scholar] [CrossRef]
Gacu, J.G.; Monjardin, C.E.F.; de Jesus, K.L.M.; Senoro, D.B. GIS-Based Risk Assessment of Structure Attributes in Flood Zones of Odiongan, Romblon, Philippines. Buildings 2023, 13, 506. [Google Scholar] [CrossRef]
Yaseen, Z.M.; Kisi, O.; Demir, V. Enhancing Long-Term Streamflow Forecasting and Predicting Using Periodicity Data Component: Application of Artificial Intelligence. Water Resour. Manag. 2016, 30, 4125–4151. [Google Scholar] [CrossRef]

Figure 1. Structured literature review methodology used to identify and select studies on the application of artificial intelligence in streamflow prediction for ungauged watersheds. The workflow follows four main stages (identification, screening, eligibility, and inclusion) in accordance with PRISMA guidelines.

Figure 2. Keyword co-occurrence network of AI applications in hydrology (2018–2023), created using VOSviewer version 1.6.20. The color gradient shows the temporal evolution of research, with newer topics such as artificial intelligence, deep learning, and LSTM emerging in recent years.

Figure 3. Global distribution and temporal trends of the 143 reviewed studies. The map shows the geographic spread of research (with the mentioned country or location), while the bar charts present the number of studies by continent and by year. A sharp increase in publications is observed from 2021 to 2025, highlighting the growing focus on applying artificial intelligence for streamflow prediction in ungauged watersheds.

Figure 4. Visual summary of AI applications for pre-processing remote sensing data before integration into hydrological modeling.

Figure 5. Structured roadmap for advancing AI in hydrological modeling, highlighting four key pillars and corresponding future directions.

Figure 6. Conceptual framework illustrating key elements in AI-based streamflow prediction for ungauged watersheds, including core challenges, AI techniques, remote sensing inputs, modeling approaches, evaluation metrics, applications, limitations, and future directions.

Table 1. Overview of hydrological modeling approaches used for streamflow prediction in ungauged catchments, highlighting their key features, data requirements, and applicability in data-scarce environments.

Model Type	Key Characteristics	Strengths	Limitations	Key References
Distributed Models	Physically based; spatially explicit (e.g., MIKE SHE, DHSVM)	High accuracy; captures heterogeneity	Data- and compute-intensive	[8,41,65,66,67,68]
Data Assimilation	Real-time updating (e.g., EnKF, variational)	Improves short-term forecasts	Requires real-time data; high complexity	[73,74,75]
Remote Sensing Integration	Uses satellite data (e.g., rainfall, NDVI) for calibration	Enhances model realism in data-scarce areas	Data conflicts; equifinality issues	[66,75]
Lumped Models	Simplified, catchment-scale (e.g., HBV, GR4J)	Low data needs; widely used	Ignores spatial variability	[80,81,82,83,85,93]
Regionalization Methods	Parameter transfer using proximity or similarity	Enables lumped models in ungauged basins	Less accurate in heterogeneous regions	[85,88]
Signature Transfer	Aligns flow metrics (e.g., FDC, BFI) across basins	Improves behavioral realism	Sensitive to timing mismatches	[81,90,91,92]
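
The similarity-based parameter transfer listed under Regionalization Methods in Table 1 can be illustrated with a minimal sketch. This is not a method taken from any of the reviewed studies: the basin names, attribute values, and the `nearest_donor` helper are hypothetical, and standardized Euclidean distance is only one of several plausible similarity measures. The idea is that the ungauged basin borrows calibrated parameters from the gauged basin that is closest in attribute space.

```python
import math

def standardize(rows):
    """Scale each attribute column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def nearest_donor(gauged, target):
    """Return the name of the gauged basin most similar to the target.

    gauged: dict mapping basin name -> list of attributes
    target: list of the same attributes for the ungauged basin
    """
    names = list(gauged)
    scaled = standardize([gauged[n] for n in names] + [target])
    target_scaled = scaled[-1]
    distances = {n: math.dist(row, target_scaled)
                 for n, row in zip(names, scaled)}
    return min(distances, key=distances.get)

# Hypothetical attributes: area (km2), mean slope (%), annual precipitation (mm)
gauged_basins = {
    "A": [120.0, 5.0, 1800.0],
    "B": [950.0, 1.2, 900.0],
    "C": [140.0, 6.5, 2100.0],
}
ungauged = [130.0, 5.2, 1850.0]

donor = nearest_donor(gauged_basins, ungauged)
print(donor)  # parameters calibrated for this basin would be transferred
```

As Table 1 notes, this kind of transfer tends to be less accurate in heterogeneous regions, where no gauged basin is a close match.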

Table 2. Summary of the evolution, classification, and applications of AI in hydrology. The table categorizes key AI approaches (machine learning, deep learning, and hybrid models) and highlights their principal applications, advantages, limitations, and emerging research directions as identified in recent studies.

Aspect	Details/Highlights	Research Remarks	References
AI Evolution in Hydrology	From early use of ANNs to advanced ML (e.g., RF, SVM, GBM) and DL (e.g., LSTM, CNN), with a growing trend toward hybrid and physics-informed models.	Demonstrates the trajectory from empirical black-box learning to physically constrained approaches, highlighting the need for model frameworks that balance accuracy and interpretability.	[19,44]
Machine Learning (ML)	Includes RF, SVM, GBM; used for flood prediction, drought forecasting, evapotranspiration estimation; interpretable but limited in extrapolation.	Robust for pattern recognition and moderately interpretable, but struggles with extrapolation under non-stationary climate conditions.	[94,95]
Deep Learning (DL)	RNN, LSTM, and CNN dominate; excellent for time-series rainfall-runoff modeling and learning long-term dependencies.	LSTM models outperform many benchmarks but remain sensitive to hyperparameter tuning (sequence length, learning rate) and require large datasets for reliable generalization.	[43]
Hybrid/Physics-Informed AI	Combines data-driven methods with physical laws; more robust and physically consistent; bridges empirical and mechanistic modeling.	Offers stronger robustness and physical consistency; a promising direction to mitigate black-box limitations while maintaining predictive power.	[96,97]
Comparison with Physical Models	AI models like LSTM outperform SWAT or VIC in ungauged basins but lack transparency and physical realism; physically based models remain more interpretable but data-intensive.	Reveals trade-offs: AI excels in accuracy but lacks physical transparency; physically based models remain more interpretable but demand high-quality input data.	[38,98]
Explainable AI (XAI)	XAI improves the interpretability of black-box models; interpretable networks and frameworks are emerging.	Essential for bridging trust gaps between stakeholders and model users; emerging tools allow the identification of influential hydrological drivers in predictions.	[99,100]
Challenges in AI Hydrology	Overfitting, poor extrapolation in climate-change contexts, and large data requirements, mitigated by standardized datasets.	Persistent limitations necessitate standardized datasets, benchmarking protocols, and improved regularization for operational deployment.	[42,101]
Future Directions	Emphasis on hybrid models that merge AI with physics; caution against overreliance on black-box models.	Encourages blending AI’s predictive power with hydrological reasoning; cautions against blind reliance on data-driven models without physical grounding.	[102,103]

Table 3. Comparative overview of advanced AI approaches for ungauged basins.

Approach	Strengths	Weaknesses	Applicability in Ungauged Basins
PINNs	Physically consistent; reduces equifinality	High computational cost; complex PDE integration	Suitable for data-scarce basins requiring process fidelity
GNNs	Captures spatial connectivity; scalable for large networks	Needs accurate static descriptors; interpretability issues	Effective for regionalization and routing
Foundation Models	Transferable across basins; reduces training effort	Requires large pretraining datasets; ethical concerns	Emerging for global hydrology
UQ Frameworks	Provides predictive uncertainty; supports decision-making	Computationally demanding; lacks standardization	Essential for flood/drought risk forecasting

Table 4. Comparative analysis of AI-based models and traditional hydrological models, highlighting key differences in interpretability, data dependency, uncertainty handling, scalability, and applicability to ungauged basins.

Dimension	Traditional Models	AI-Based Models
Theoretical Basis	Based on physical or conceptual equations of hydrology	Data-driven; may integrate physics (PINNs)
Interpretability	High (parameters linked to physical processes)	Low to Moderate (often black-box; improved in hybrid/physics-guided models)
Data Requirements	Moderate (needs discharge & climate inputs; limited spatial resolution)	High (requires large datasets for training; sensitive to input quality)
Calibration Effort	Significant manual calibration; sensitive to initial parameters	Automated optimization; less reliance on manual calibration
Uncertainty Handling	Handled via probabilistic calibration (GLUE, Bayesian)	Advanced UQ via Bayesian DL, ensembles, stochastic LSTMs
Transferability	Limited (requires recalibration for each new basin)	High (with pretraining or transfer learning); strong for regionalization
Computational Demand	Low to Moderate	High (depending on model architecture and training resources)
Scalability	Moderate (constrained by physical assumptions)	High (suitable for global/regional datasets)
Adaptability to Ungauged Basins	Moderate; relies on parameter regionalization	High with GNNs, foundation models, and hybrid AI

Table 5. Key characteristics of MODIS, TRMM, and GPM remote sensing datasets and their applications in hydrological modeling.

Dataset	Provider	Key Variables	Spatial Resolution	Temporal Resolution	Common Hydrological Applications
MODIS (Terra & Aqua)	NASA	NDVI, EVI, LST, albedo, LAI, land cover	250 m–1 km	Daily	ET estimation, vegetation dynamics, drought detection, runoff partitioning
TRMM (3B42)	NASA & JAXA	Precipitation (3 h)	~0.25° (~25 km)	3 hourly	Rainfall-runoff modeling, flood analysis, and model forcing in ungauged basins
GPM (IMERG)	NASA & JAXA	Precipitation (half-hourly)	~0.1° (~10 km)	30 min	Real-time flood forecasting, streamflow simulation, precipitation downscaling

Table 6. Summary of AI–Remote Sensing Integration for Catchment Mapping in Ungauged Watersheds.

Task	Remote Sensing Inputs	AI Techniques	Hydrological Outcome	Reference
Catchment boundary mapping	DEM (SRTM/ALOS)	Convolutional Neural Networks (CNN)	Automatic delineation of drainage networks	[146,147,148]
Catchment classification	NDVI, precipitation proxies, elevation, land cover	PCA, clustering, k-means	Grouping of catchments by hydrologic similarity	[141,142]
Streamflow estimation	Satellite rainfall, NDVI, LST, DEM slopes	LSTM, SVR, Random Forest	Discharge simulation without gauge station data	[70,142]
Flood mapping	Precipitation, DEM, land cover, hydraulic parameters	Hybrid ML + hydrodynamic modeling	Event-scale inundation mapping in ungauged watersheds	[144,149]

Table 7. Overview of evaluation metrics, validation strategies, and benchmarking practices in AI-based streamflow modeling for ungauged catchments.

Aspect	Observations	References
Evaluation Metrics	Standard metrics include NSE, R², RMSE, and MAE; however, use is inconsistent and not always tied to hydrological significance.	[67,92,107,143]
Uncertainty and Bias Assessment	Rarely addressed; only a few studies report confidence intervals or probabilistic outputs in ungauged basins.	[130,137]
Validation Techniques	LOOCV and k-fold are used inconsistently; limited transparency in split strategies and validation logic.	[113,130]
Comparative Benchmarks	Lack of systematic comparison between AI and traditional models under identical conditions.	[58,109,143]
Interpretability and Explainability	Rare reporting of SHAP values, feature attribution, or model interpretability measures.	[79,130]
Reproducibility of Evaluation	Few studies provide access to datasets, code, or model parameters used in evaluation.	[75,111,137]
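
For reference, the deterministic metrics named in Table 7 (NSE, RMSE, and MAE) can be computed as in the minimal sketch below. The function names and the sample discharge values are illustrative assumptions, not data from any reviewed study; the formulas are the standard definitions, with NSE equal to one minus the ratio of the squared simulation error to the variance of the observations about their mean.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    if variance == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - residual / variance

def rmse(obs, sim):
    """Root mean square error, in the units of the flow series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error, in the units of the flow series."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

# Illustrative daily discharge values (m3/s); not taken from any study.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 5.0]

scores = {
    "NSE": nse(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "MAE": mae(observed, simulated),
}
print(scores)  # NSE = 0.8, RMSE = 0.5, MAE = 0.25 for these values
```

Reporting all three together, computed on the same held-out basins, is one simple way to address the inconsistency in metric use noted in the table.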

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gacu, J.G.; Monjardin, C.E.F.; Mangulabnan, R.G.T.; Mendez, J.C.F. Application of Artificial Intelligence in Hydrological Modeling for Streamflow Prediction in Ungauged Watersheds: A Review. Water 2025, 17, 2722. https://doi.org/10.3390/w17182722
