Water
  • Editor’s Choice
  • Review
  • Open Access

3 July 2024

Advancing Hydrology through Machine Learning: Insights, Challenges, and Future Directions Using the CAMELS, Caravan, GRDC, CHIRPS, PERSIANN, NLDAS, GLDAS, and GRACE Datasets

1 Department of Civil & Environmental Engineering, FAMU-FSU College of Engineering, 2525 Pottsdamer Street, Tallahassee, FL 32310, USA
2 Center for Spatial Ecology & Restoration, Florida A&M University, 407 Frederick S. Humphries Science Research Center, 1515 S. Martin Luther King Jr. Blvd., Tallahassee, FL 32307, USA
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Water Resource Management in Artificial Intelligence and Big Data Analytics

Abstract

Machine learning (ML) applications in hydrology are revolutionizing our understanding and prediction of hydrological processes, driven by advancements in artificial intelligence and the availability of large, high-quality datasets. This review explores the current state of ML applications in hydrology, emphasizing the utilization of extensive datasets such as CAMELS, Caravan, GRDC, CHIRPS, NLDAS, GLDAS, PERSIANN, and GRACE. These datasets provide critical data for modeling various hydrological parameters, including streamflow, precipitation, groundwater levels, and flood frequency, particularly in data-scarce regions. We discuss the types of ML methods used in hydrology and the significant successes achieved through those ML models, highlighting their enhanced predictive accuracy and the integration of diverse data sources. The review also addresses the challenges inherent in hydrological ML applications, such as data heterogeneity, spatial and temporal inconsistencies, issues in downscaling large-sample hydrology (LSH) datasets, and the need for incorporating human activities. In addition to discussing the limitations, this article highlights the benefits of utilizing high-resolution datasets compared to traditional ones. We also examine the emerging trends and future directions, including the integration of real-time data and the quantification of uncertainties to improve model reliability, and we place a strong emphasis on incorporating citizen science and the Internet of Things (IoT) for data collection in hydrology. By synthesizing the latest research, this paper aims to guide future efforts in leveraging large datasets and ML techniques to advance hydrological science and enhance water resource management practices.

1. Introduction

Machine learning applications in hydrology have been gaining momentum, transforming how we understand and predict various hydrological processes []. The rapid advancement of artificial intelligence technologies is fostering increased research and applications of machine learning in this field, promising significant advancements in the near future [,]. These technologies are being used to improve the accuracy and efficiency of hydrological models, addressing complex problems such as climate change impacts, water resource management, and disaster preparedness.
One of the fundamental requirements for effective machine learning models is access to large, high-quality datasets, which provide the necessary data for accurate predictions and robust model training [,,]. In hydrology, machine learning models have seen tremendous successes in predicting different parameters, including streamflow, precipitation, groundwater level, and flood frequency, especially in data-scarce regions [,,,,,,].
The CAMELS dataset, for instance, integrates meteorological and hydrological data across multiple catchments, providing valuable insights into catchment behavior and facilitating the development of robust machine learning models [,]. Its detailed and coherent description of catchment characteristics makes it a valuable resource for exploring interrelationships among different attributes and understanding their influence on hydrological processes. Similarly, the Caravan dataset offers a comprehensive archive of hydrological responses, making it an invaluable resource for modeling and prediction tasks [,]. The increasing availability of high-resolution datasets like CHIRPS has enabled significant improvements in precipitation and drought monitoring, particularly in regions where traditional observational data are sparse []. This dataset, combined with advanced machine learning techniques, has demonstrated superior performance in various applications, from drought assessment in East Africa to streamflow forecasting in India [,]. The integration of satellite observations with ground-based measurements has also enhanced the reliability and accuracy of hydrological predictions, supporting better water resource management and climate impact assessments. Similarly, the PERSIANN dataset has significantly advanced hydrological research by improving flood prediction, precipitation estimation, and runoff simulation, demonstrating its versatility and critical role in enhancing predictive accuracy and supporting drought assessments [,,,,,,,,]. Moreover, datasets like GRDC provide an extensive archive of river discharge data, which is critical for global water resource management, climate impact studies, and hydrological modeling. The GRDC dataset’s extensive coverage and long record length facilitate comprehensive analyses of hydrological patterns and trends, aiding in the development of robust hydrological models and enhancing our understanding of global water cycles [,].
Other datasets, such as the NLDAS (North American land data assimilation system), GLDAS (global land data assimilation system), and GRACE (gravity recovery and climate experiment), have been pivotal in advancing the understanding of hydrological processes. These datasets offer diverse and comprehensive data, ranging from catchment attributes and meteorological time series to river discharge and precipitation estimates, which are essential for developing and validating hydrological models. The selection of these datasets is driven by their extensive spatial and temporal coverage, high-quality data, broad applicability, and community acceptance. Specifically, CAMELS offers detailed catchment characteristics, while Caravan provides a global perspective with consistent data quality. GRDC, CHIRPS, PERSIANN, NLDAS, GLDAS, and GRACE offer reliable and comprehensive hydrological data. Their widespread use and acceptance within the hydrology research community ensure that our review covers the most relevant and impactful data sources. By focusing on these well-established datasets, we aim to provide insights that are both scientifically robust and widely applicable, supporting the development of accurate and reliable machine learning models in hydrology.
Despite the advances, challenges, such as data heterogeneity, spatial and temporal inconsistencies, and the need for the integration of human activities remain prevalent [,,]. Addressing these challenges is crucial for advancing the field and ensuring that machine learning models can provide reliable and actionable insights.
This review paper aims to explore the current state of machine learning applications in hydrology, with a particular focus on the utilization of large datasets, such as CAMELS, Caravan, GRDC, CHIRPS, PERSIANN, NLDAS, GLDAS, and GRACE. We will discuss the successes, challenges, and future directions in this rapidly evolving field, highlighting the key studies and emerging trends. By synthesizing the latest research, this article seeks to guide future efforts in leveraging these datasets to advance hydrological science and improve water resource management practices. The paper will also address the integration of human impact on data, the quantification of uncertainties, and the potential of real-time data integration to enhance the accuracy and applicability of machine learning models in hydrology.

3. Machine Learning Methods in Hydrology

3.1. Long Short-Term Memory (LSTM)

LSTM networks, a type of recurrent neural network (RNN), are effective for time series prediction tasks such as streamflow prediction, rainfall-runoff modeling, and groundwater level forecasting [,,]. They capture temporal patterns and dependencies in hydrological data, offering improved predictive accuracy over traditional methods []. However, LSTMs require significant computational resources and are prone to overfitting and interpretability issues [,,].
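To make the idea of "capturing temporal dependencies" concrete, the following is a minimal sketch of a single LSTM cell applied to a short precipitation series. It assumes a scalar input, a scalar hidden state, and hand-picked, untrained weights (the names `lstm_step` and `PARAMS` and the synthetic series are illustrative assumptions, not taken from the reviewed studies). It only shows how the gating equations let the cell state retain the signal of a storm after the rain has stopped.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state, and cell state.

    params maps a gate name to (input_weight, hidden_weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # candidate memory content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked (untrained) weights, chosen only to make the memory effect visible.
PARAMS = {
    "forget": (0.0, 0.0, 4.0),
    "input": (1.0, 0.0, -1.0),
    "output": (0.0, 0.0, 4.0),
    "candidate": (0.5, 0.0, 0.0),
}

precip_mm = [0.0, 12.0, 0.0, 0.0, 0.0]  # synthetic series: one storm on day 2
h, c = 0.0, 0.0
hidden_states, cell_states = [], []
for x in precip_mm:
    h, c = lstm_step(x, h, c, PARAMS)
    hidden_states.append(h)
    cell_states.append(c)
```

With these illustrative weights, the cell state stays at zero before the storm, jumps on the storm day, and then decays only slowly over the dry days that follow. In a trained model the weights are learned from data and the states are vectors rather than scalars.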

3.2. Random Forests (RFs)

Random forests combine multiple decision trees to enhance predictive accuracy and control overfitting [,]. They are used for flood susceptibility mapping, drought assessment, precipitation downscaling, and forecasting [,,,]. RF models are robust to noisy data and provide feature importance insights. However, they may introduce bias in small datasets.

3.3. Support Vector Machines (SVMs)

SVMs classify data by finding the optimal hyperplane that separates different classes, making them suitable for streamflow prediction, groundwater level forecasting, and precipitation downscaling [,,,]. They are effective in high-dimensional spaces and robust to overfitting. However, they can be sensitive to noise [,,].
The machine learning techniques, their applications in hydrology, their advantages, and their potential limitations are summarized in Table 1.
Table 1. Applications, advantages, and disadvantages of machine learning techniques.

3.4. Artificial Neural Networks (ANNs)

ANNs model complex non-linear relationships [] and are used for rainfall-runoff modeling, flood forecasting, and water quality prediction [,,,]. They are flexible for various tasks but are prone to overfitting and are often seen as black boxes [,,].

3.5. Gradient Boosting Machines (GBMs)

GBMs sequentially build multiple decision trees, improving prediction tasks like flood prediction, soil moisture estimation, and groundwater level prediction [,,,]. They offer high predictive accuracy and feature importance insights, but require careful parameter tuning to avoid overfitting [,].
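The sequential nature of boosting can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: squared-error loss, one-split regression stumps instead of full trees, a single predictor, and a synthetic rainfall and peak-flow sample. The function names (`fit_stump`, `fit_gbm`, `predict_gbm`) are hypothetical and do not correspond to any specific library used in the reviewed studies.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump (one split) on a single feature."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_rounds=20, learning_rate=0.3):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gbm(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Synthetic sample: event rainfall (mm) versus peak flow (m3/s).
rainfall_mm = [5.0, 10.0, 20.0, 35.0, 50.0, 80.0]
peak_flow = [1.0, 2.0, 6.0, 15.0, 30.0, 70.0]

model = fit_gbm(rainfall_mm, peak_flow)
baseline_mse = mse(peak_flow, [sum(peak_flow) / len(peak_flow)] * len(peak_flow))
boosted_mse = mse(peak_flow, [predict_gbm(model, x) for x in rainfall_mm])
```

The number of rounds and the learning rate are the kind of parameters that need tuning: more rounds keep lowering the training error, which is exactly how overfitting arises when no validation data are held out.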

3.6. Convolutional Neural Networks (CNNs)

CNNs, primarily used for spatial data analysis, excel at recognizing spatial patterns in remote sensing data analysis, precipitation estimation, and flood mapping [,,,]. They handle large-scale datasets effectively and learn features automatically but require large amounts of labeled data [,].

3.7. Transformers

Originally developed for natural language processing, transformers have been adapted for hydrological applications due to their ability to handle sequential data and capture long-range dependencies []. They have shown superior performance in streamflow prediction and flood forecasting [,,,].
Some traditional models have also been used for hydrological research along with machine learning models. For instance, a five-level nested experimental watershed was developed to study the water cycle at multiple scales, and it was found that in humid regions, surface runoff constitutes a significant portion of the total runoff [].

4. Key Datasets

The CAMELS (catchment attributes and meteorology for large-sample studies) dataset offers comprehensive data for 671 minimally impacted catchments across the contiguous United States (CONUS), encompassing various attributes, such as topography, climate, streamflow, land cover, soil, and geology. This diversity facilitates extensive hydrological research and aids in understanding the interrelationships among catchment characteristics []. Similarly, the Caravan dataset aggregates data from seven large-sample hydrology datasets, covering 6830 catchments globally over nearly four decades. It includes meteorological forcing, streamflow data, and static catchment attributes, promoting accessible, high-quality hydrological research []. The datasets extensively used in hydrological ML applications, their spatial and temporal coverage, data resolution, key attributes, and primary applications in hydrology are summarized in Table 2.
Table 2. Key datasets used in hydrological ML applications.
The Global Runoff Data Centre (GRDC) archives river discharge data from over 9800 stations worldwide, with some records dating back 200 years. This extensive archive supports global water resource management, climate studies, and hydrological modeling []. CHIRPS (climate hazards group infrared precipitation with stations) provides high-resolution precipitation data from 1981 to the present by combining satellite observations with station data, which is essential for monitoring climate extremes and drought forecasting []. The PERSIANN (precipitation estimation from remotely sensed information using artificial neural networks) suite includes several high-resolution precipitation products. For example, PERSIANN-CCS provides near-global, high-resolution (0.04°) estimates at multiple temporal resolutions from 2003 to the present, which are ideal for real-time weather monitoring and severe weather analyses []. PERSIANN-CDR offers daily estimates at a 0.25° resolution from 1983 to the present, supporting long-term climatological and hydrological studies []. PERSIANN-CCS-CDR combines both, offering three-hourly estimates at a 0.04° resolution from 1983 to the present for extreme weather analyses and climatological studies []. The NLDAS (North American land data assimilation system) offers high-resolution, gridded datasets for North America from 1979 onwards, supporting water resource management, drought monitoring, and flood forecasting. On the other hand, the GLDAS (global land data assimilation system) generates high-resolution land surface states and fluxes using satellite and ground-based data from 1948 to the present, aiding global land surface condition monitoring and hydrological modeling. The GRACE (gravity recovery and climate experiment) and its follow-on mission (GRACE-FO) provide monthly data on Earth’s gravitational field variations from 2002 onwards, crucial for studying groundwater depletion, glacial melting, and sea-level rise. 
This dataset enhances our understanding of global water distribution and climate dynamics.

5. Case Studies

5.1. CAMELS

The CAMELS dataset has been instrumental in streamflow forecasting through various machine learning approaches. Studies have demonstrated the effectiveness of LSTM networks, transfer learning, and other advanced models, consistently showing improvements over traditional models and significant regional performance variations [,,,,,,,,,,,,]. In rainfall-runoff modeling, CAMELS has enhanced predictive accuracy and robustness, with LSTM and transformer-based models outperforming traditional approaches [,,,,,,,]. For flood forecasting, machine learning frameworks have achieved high accuracy in storm classification and flood peak estimation [,,,,]. Groundwater level forecasting, though less explored, has seen improved model performance through the integration of regional characteristics [,]. The dataset also advances various hydrological modeling techniques, including knowledge-guided frameworks, hybrid models, and AI-enhanced parameter learning, showcasing its versatility and robustness in hydrological research [,,,,,,,].
Notable studies using the CAMELS-GB dataset include investigations of urbanization’s impact on river discharge and hybrid hydroclimatic forecasting, while CAMELS-CL has seen the development of LSTM and random forest models for enhanced hydrological predictions [,,,,,]. CAMELS-BR has applied the FS-LSTM model for streamflow prediction [], and CAMELS-AUS has focused on hybrid models for streamflow prediction and global water flux partitioning analyses [,].
The key case studies and findings using the chosen datasets are shown in Table 3.
Table 3. Key case studies and findings using the datasets.

5.2. Caravan

The Caravan dataset, despite being relatively new, shows significant potential in hydrology and machine learning. It has demonstrated superior performance in streamflow prediction, flood forecasting, and catchment model instance prediction through advanced models like temporal fusion transformers and latent factor models [,,,,].

5.3. GRDC

The GRDC dataset has been extensively used for streamflow and water balance studies, improving monthly runoff reconstructions and enhancing streamflow and water storage predictions [,,,]. It has also been pivotal in flood prediction, hydrological modeling, and simulation, demonstrating strong performance in data-scarce areas [,,,].

5.4. CHIRPS

CHIRPS has been widely utilized in drought assessment, runoff estimation, flood modeling, and improving precipitation models. Studies have shown its superior performance in various regions, enhancing drought monitoring and flood prediction accuracy [,,,,,,,,,,,,,].

5.5. PERSIANN

The PERSIANN dataset has significantly contributed to hydrological modeling, flood prediction, and precipitation estimation. It has been used for streamflow and sediment load simulation, rainfall-runoff modeling, and reliable flood forecasting. Advanced techniques like cGANs and deep neural networks have enhanced precipitation estimation, supporting drought assessment and runoff simulation, demonstrating PERSIANN’s versatility and importance in hydrological research [,,,,,,,,].

5.6. NLDAS

NLDAS data has advanced hydrological modeling, runoff and flood prediction, and evapotranspiration and soil moisture estimation. It has shown improved accuracy in predicting lake water temperatures, precipitation, soil moisture, and runoff, demonstrating the dataset’s utility in diverse hydrological applications [,,,,,,,,,,].

5.7. GLDAS

GLDAS data has also significantly advanced hydrological research by improving hydrological modeling, soil moisture and evapotranspiration estimation, and groundwater and storage data predictions. It has enhanced the accuracy of terrestrial water storage variations and streamflow simulations and provided detailed spatial and temporal resolution for global land surface conditions [,,,,,,,,].

5.8. GRACE

GRACE data has been used in groundwater and water storage anomaly studies, groundwater level prediction, enhancing spatial resolution, and filling temporal gaps. Machine learning techniques have successfully downscaled GRACE data, improving groundwater monitoring and providing high-resolution predictions, showcasing the dataset’s critical role in hydrological research [,,,,,,,,,,,].

6. Data Challenges in the ML Approach

6.1. Spatial and Temporal Resolution

One of the primary limitations of these datasets is their spatial and temporal resolution. The GLDAS dataset offers data at 0.25° × 0.25° and 1° × 1° resolutions, which are too coarse for detailed local studies such as urban hydrology or small watershed modeling. The NLDAS dataset, with a finer spatial resolution of 1/8th degree (~12.5 km), still may not suffice for applications demanding even higher granularity.
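To relate angular grid resolutions to ground distances, a quick back-of-the-envelope conversion can help. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name `cell_size_km` is illustrative. The east-west extent of a cell shrinks with the cosine of latitude, so any single kilometre figure quoted for a grid is only nominal.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-Earth assumption


def cell_size_km(resolution_deg, latitude_deg):
    """Approximate (north-south, east-west) extent of a grid cell in km."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = resolution_deg * km_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Resolutions mentioned above, evaluated at 40 degrees north as an example.
nldas_ns, nldas_ew = cell_size_km(0.125, 40.0)
gldas_fine_ns, gldas_fine_ew = cell_size_km(0.25, 40.0)
gldas_coarse_ns, gldas_coarse_ew = cell_size_km(1.0, 40.0)
```

Under these assumptions, a 1/8th degree cell at 40°N spans roughly 13.9 km north-south and about 10.6 km east-west, which brackets the nominal ~12.5 km figure. A 1° cell spans more than 100 km north-south, far larger than a typical urban catchment.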
The challenges of current LSH datasets are shown in Figure 2.
Figure 2. Summary of the challenges of current LSH datasets.
Temporal resolution is another critical factor. The CAMELS dataset, for instance, has a spatial focus on the contiguous United States (CONUS) and provides daily data from 1980 to 2015. While this temporal span is useful, the daily resolution might not capture the finer temporal variations necessary for short-term forecasting or real-time applications. While datasets like the NLDAS provide hourly data, enabling more detailed temporal analysis, others like CHIRPS offer daily data, which may miss significant sub-daily variations crucial for certain applications, such as flash flood forecasting. PERSIANN-CCS-CDR faces challenges in accurately representing spatial distribution patterns, especially in high temporal resolutions []. Furthermore, NLDAS precipitation data shows discrepancies with observations at hourly timescales, as shown in Figure 3. This is attributed to the inherent variability of precipitation and the analysis scheme used by the NLDAS [].
Figure 3. Comparison of NLDAS forcing with local forcing for precipitation at Station EF-4 (ARM/CART, Plevna, Kansas), which is representative of other stations. Each point in the hourly panel represents one hour during the period from 0000 UT on 1 January 1998 to 2300 UT on 30 September 1999. The averaging period for the other panels is indicated accordingly [].
CHIRPS accuracy is influenced by complex local factors like geography and topography. Daily data are less accurate than monthly or interannual data [], which is shown in Figure 4. The GRDC dataset’s temporal resolution varies widely, with some stations providing daily data and others offering only monthly or annual data, impacting the consistency and utility of the dataset for machine learning models that require uniform temporal granularity.
Figure 4. Mean rainfall data from rain gauge and CHIRPS: (a) daily and (b) monthly.
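The point that daily data can miss sub-daily variations relevant to flash floods can be illustrated with synthetic numbers. The sketch below uses invented hourly precipitation for two days (not taken from any of the datasets discussed) and hypothetical helper names. Both days have the same daily total, yet their peak hourly intensities differ by a factor of 24.

```python
def daily_totals(hourly_mm):
    """Sum consecutive blocks of 24 hourly values into daily totals."""
    if len(hourly_mm) % 24 != 0:
        raise ValueError("expected a whole number of days of hourly data")
    return [sum(hourly_mm[i:i + 24]) for i in range(0, len(hourly_mm), 24)]


def daily_peak_intensity(hourly_mm):
    """Maximum hourly value within each day (mm/h)."""
    if len(hourly_mm) % 24 != 0:
        raise ValueError("expected a whole number of days of hourly data")
    return [max(hourly_mm[i:i + 24]) for i in range(0, len(hourly_mm), 24)]


# Day 1: steady drizzle. Day 2: the same total falling within a single hour.
steady_day = [1.0] * 24
burst_day = [0.0] * 17 + [24.0] + [0.0] * 6
hourly_series = steady_day + burst_day

totals = daily_totals(hourly_series)
peaks = daily_peak_intensity(hourly_series)
```

A model trained only on the daily totals cannot distinguish these two days, although their runoff responses would likely differ considerably.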

6.2. Data Quality and Consistency

The quality and consistency of data vary significantly across different regions and periods within these datasets. For instance, the GRDC dataset suffers from inconsistent data quality due to variations in measurement techniques and station maintenance over time. To address this, data homogenization and quality control protocols can be used to standardize measurements and reduce inconsistencies over time [,,]. Similarly, the CAMELS and Caravan datasets face challenges with data gaps and missing values, which can introduce noise and biases into machine learning models. The accurate imputation of these gaps is necessary but can introduce further uncertainties. Researchers have developed advanced imputation techniques (e.g., multiple imputation, principal component analysis, and autoregressive conditional heteroscedasticity (ARCH)), machine learning-based methods (e.g., K-nearest neighbors and neural networks), and robust random regression imputation (RRRI), which can be applied to better handle missing data while minimizing bias [,]. CHIRPS tends to overestimate precipitation in most areas (69% of stations), particularly during El Niño and drier periods. Biases in CHIRPS data can be significant for high-detail studies like flood or drought risk analyses [,]. PERSIANN products generally underestimate precipitation and exhibit low correlation and efficiency metrics when compared to ground-based observations in the Kelani River Basin, suggesting limited reliability []. NLDAS data often show a warm bias in solar radiation and a cool bias in longwave radiation. Furthermore, the NLDAS may not capture the exact amount of precipitation for individual events, especially small-scale convective precipitation events, though the total amount over a longer period (21 months in the study) can be within 10% of the observed values [].
Bias correction methods, including quantile mapping, the convolutional autoencoder (ConvAE) neural network, non-linear power bias correction, and power transformation, have been proven to be effective in reducing these biases, helping to adjust the precipitation estimates to more closely match the observed values [,,,]. GLDAS-1 forcing data is not suitable for detecting long-term changes as the forcing data sources were switched several times in the past, which created discontinuity. In GLDAS-2 data, the continuity is much better; however, as the dataset was bias-corrected, the bias correction makes GLDAS-2 precipitation less correlated with observed precipitation [].
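Of the bias correction methods listed, quantile mapping is the simplest to sketch. The example below is a minimal empirical variant under stated assumptions: a historical sample of satellite estimates and a matching sample of gauge observations are available, both have at least two values, and values outside the historical range are clamped. The names `interpolate` and `quantile_map` and the numbers are illustrative only.

```python
def interpolate(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the range of xs (ascending)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def quantile_map(value, satellite_history, gauge_history):
    """Map a satellite value to the gauge value at the same empirical quantile."""
    sat_sorted = sorted(satellite_history)
    obs_sorted = sorted(gauge_history)
    sat_probs = [i / (len(sat_sorted) - 1) for i in range(len(sat_sorted))]
    obs_probs = [i / (len(obs_sorted) - 1) for i in range(len(obs_sorted))]
    # Step 1: quantile of the value within the satellite climatology.
    p = interpolate(value, sat_sorted, sat_probs)
    # Step 2: observed value at that same quantile.
    return interpolate(p, obs_probs, obs_sorted)


# Synthetic example: the satellite product reads twice the gauge value.
satellite_history = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
gauge_history = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

corrected_mid = quantile_map(5.0, satellite_history, gauge_history)
corrected_high = quantile_map(20.0, satellite_history, gauge_history)
```

In this synthetic case, a satellite value of 5 mm maps to about 2.5 mm, and a value beyond the historical range is clamped to the largest observed value. The clamping is one reason why simple quantile mapping can handle extremes poorly.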
Many datasets, such as GRDC and Caravan, have gaps in their records due to various reasons like equipment failure or data loss. These gaps pose significant challenges for machine learning applications that require continuous and complete datasets. Techniques for data imputation can mitigate this issue but often introduce additional uncertainties.
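As a simple illustration of gap handling, the sketch below fills only short gaps and flags every imputed value so that the added uncertainty remains traceable. It uses linear interpolation, which is a deliberately simple stand-in rather than one of the advanced methods named above. The function name `fill_short_gaps`, the `max_gap` threshold, and the discharge values are assumptions made for this example.

```python
def fill_short_gaps(series, max_gap=3):
    """Linearly interpolate runs of None no longer than max_gap.

    Returns (filled, imputed_flags). Leading, trailing, and long gaps
    are left as None so they are not silently invented.
    """
    filled = list(series)
    flags = [False] * len(series)
    n = len(series)
    i = 0
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        start = i
        while i < n and series[i] is None:
            i += 1
        end = i  # first index after the gap
        gap_length = end - start
        if start == 0 or end == n or gap_length > max_gap:
            continue
        left, right = series[start - 1], series[end]
        for k in range(start, end):
            t = (k - start + 1) / (gap_length + 1)
            filled[k] = left + t * (right - left)
            flags[k] = True
    return filled, flags


# Synthetic daily discharge (m3/s) with one short and one long gap.
discharge = [10.0, None, None, 16.0, None, None, None, None, 5.0]
filled, imputed = fill_short_gaps(discharge, max_gap=2)
```

Here, the two-day gap is interpolated and flagged, while the four-day gap is left missing. Keeping the flags allows imputed values to be down-weighted or excluded during model training and evaluation.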

6.3. Regional and Climatic Representation

The regional focus of datasets like Caravan and CHIRPS can limit their applicability. Caravan, which primarily focuses on the Neotropics, and CHIRPS, which is designed to perform well in diverse climatic regions, may not generalize well to other areas with different climatic conditions. CHIRPS underestimates precipitation mainly in western and southern Antioquia (31% of stations) [] and in areas of complex topography []. Furthermore, the dataset differs from observed data at higher elevations, and this finding is limited to the specific region studied []. This regional bias can affect the representativeness and generalizability of machine learning models trained on these datasets. PERSIANN-CCS-CDR shows good performance in monthly precipitation assessment. However, it tends to slightly underestimate observed precipitation, and its accuracy varies depending on the region and season [].
Datasets like the NLDAS and GLDAS rely on specific parameterizations to represent land surface processes. These parameterizations may not accurately capture local conditions, particularly in areas with complex terrain or unique land surface characteristics. NLDAS models (like MOSAIC and NOAH36) tend to overestimate ET in mountainous regions []. This is attributed to potential errors in both NLDAS data and the way ET is derived in these areas. The assumption of homogeneity within each catchment or grid cell, as seen in CAMELS and CHIRPS, can also lead to inaccuracies in models that depend on the spatial variability of environmental processes.
Furthermore, accurately capturing extreme events such as floods, droughts, and hurricanes is a common challenge. Datasets like CHIRPS may not adequately represent these events due to their spatial resolution and data recording practices. For example, CHIRPS may not accurately represent the intensity of rainfall events like storms due to overestimation or underestimation biases []. Previous research also shows that the data has difficulty in detecting extremely high precipitation events []. Similarly, PERSIANN-CDR has good detection abilities for small precipitation events but struggles with extreme precipitation events, often underestimating them. This limitation can affect its accuracy in long-term drought trend analyses [].
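Statements about over- or underestimation and about efficiency are commonly quantified with metrics such as percent bias and the Nash-Sutcliffe efficiency (NSE). The sketch below gives minimal implementations with synthetic values. The sign convention for percent bias varies between authors; this example assumes that positive values indicate overestimation. The function names are illustrative.

```python
def percent_bias(observed, simulated):
    """Percent bias; positive means the estimate exceeds the observations."""
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / sum(observed)


def nash_sutcliffe(observed, simulated):
    """NSE: 1 is a perfect fit, 0 matches the skill of the observed mean."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / variance


# Synthetic gauge observations and a product that overestimates by 10%.
gauge = [2.0, 4.0, 6.0, 8.0]
product = [1.1 * value for value in gauge]

bias = percent_bias(gauge, product)
nse_product = nash_sutcliffe(gauge, product)
nse_mean = nash_sutcliffe(gauge, [sum(gauge) / len(gauge)] * len(gauge))
```

A product can score well on percent bias over a long period while still having a low NSE for individual events, which mirrors the distinction drawn above between long-term totals and extreme events.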

6.4. Downscaling of LSH

Downscaling is essential for large-sample hydrology (LSH) datasets to improve spatial and temporal resolution, capture local variability, and support accurate regional impact assessments and decision-making. It translates coarse-resolution data into detailed, actionable information for local water resource management, climate change adaptation, and integration with regional models. However, the downscaling of LSH datasets such as CAMELS, Caravan, GRDC, CHIRPS, PERSIANN, NLDAS, GLDAS, and GRACE involves multiple challenges, primarily due to the inherent complexities in translating coarse-resolution data to finer scales. Inconsistent measurements or missing data in datasets like GRDC can lead to inaccuracies in the downscaled product.
Ensuring the physical plausibility and climate realism of downscaled outputs is crucial for accurate regional impact assessments []. For instance, downscaling datasets like the GLDAS or NLDAS to a resolution suitable for urban hydrology requires detailed information that may not be present in the original dataset. Furthermore, datasets like PERSIANN, which offer hourly to daily data, may not accurately capture sub-hourly precipitation events when downscaled. The variability in downscaling methodologies, from statistical to dynamical approaches, adds another layer of complexity. Each method comes with its own assumptions and uncertainties, which need a thorough evaluation to ensure their validity for specific applications [,]. A downscaling method that is effective in one region (e.g., temperate climates) may not work as well in another (e.g., arid or tropical regions) due to different hydrological processes and data characteristics. Uncertainties in the original CHIRPS dataset can become more pronounced when downscaled, affecting the reliability of precipitation estimates at local scales. Moreover, the computational costs associated with running high-resolution models over long periods can be prohibitive, making the process resource-intensive []. Applying downscaling techniques to a global dataset like GRACE that measures terrestrial water storage requires extensive computational power to ensure accuracy. Additionally, the integration of multiple datasets, each with different resolutions and temporal spans, requires sophisticated techniques to maintain consistency and reliability in the downscaled data.
Effectively downscaling large-scale hydrological datasets involves combining statistical and dynamical methods, integrating multi-source data, and enhancing computational capabilities. Statistical techniques, such as regression models and machine learning, refine coarse data cost-effectively by identifying relationships between large-scale and local variables. Dynamical downscaling with regional climate models (RCMs) simulates physical processes at finer scales. Additionally, hybrid models leverage both methods for improved accuracy and robustness. Advances in computational power facilitate high-resolution simulations, making downscaling more feasible. This requires collaboration among climatologists, hydrologists, data scientists, and policymakers to ensure the reliability and applicability of downscaled data.
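The statistical route described above, identifying a relationship between large-scale and local variables, can be reduced to its simplest form as a one-predictor linear regression. The sketch below assumes paired historical values of a coarse grid-cell estimate and a co-located station measurement. All numbers and function names are invented for illustration, and operational downscaling typically uses many more predictors, such as elevation.

```python
def fit_linear(coarse_values, local_values):
    """Ordinary least squares for local = slope * coarse + intercept."""
    n = len(coarse_values)
    mean_x = sum(coarse_values) / n
    mean_y = sum(local_values) / n
    sxx = sum((x - mean_x) ** 2 for x in coarse_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(coarse_values, local_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def downscale(coarse_value, slope, intercept):
    """Apply the fitted relationship to a new coarse-resolution value."""
    return slope * coarse_value + intercept


# Synthetic monthly precipitation (mm): coarse grid cell versus a station in it.
grid_cell = [10.0, 20.0, 30.0, 40.0]
station = [14.0, 26.0, 38.0, 50.0]

slope, intercept = fit_linear(grid_cell, station)
local_estimate = downscale(25.0, slope, intercept)
```

A relationship fitted this way is only as transferable as the training region allows, which is the regional limitation of downscaling methods noted earlier in this section.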

6.5. Data Accessibility

To advance large-sample hydrology (LSH), it is crucial to make datasets more FAIR (findable, accessible, interoperable, and reusable) []. Currently, many datasets are stored in obscure local repositories, making them hard to find. Accessibility is limited in many regions, biasing studies toward areas with better data availability. Interoperability issues arise from inconsistent maintenance practices, and restrictive licensing hampers data reuse. Global disparities in streamflow records present significant barriers. While North America and Europe have extensive records, many regions lack data, often due to a lack of stations or unprocessed, non-digitized data. Issues such as paywalls and cumbersome retrieval processes further complicate data access. Addressing these issues involves standardizing data storage and metadata, ensuring open access, and revising licensing practices to promote data sharing. Making LSH datasets FAIR will enhance hydrological research, enabling comprehensive global studies and better water management strategies.
Figure 5 compares the eight datasets in terms of temporal and spatial resolution, data accessibility, and data coverage.
Figure 5. Comparison of dataset limitations in hydrology.
Comparative hydrology requires consistent data processing across different catchments for meaningful comparisons. While it is relatively easy to compare catchments within the same large-sample hydrology (LSH) dataset, cross-dataset comparisons are challenging due to varying naming conventions, data sources, and calculation methods []. Several efforts aim to standardize measurement techniques and data management across LSH datasets. For instance, the Caravan project addresses this by creating a globally consistent and open dataset using sources such as ERA5-Land and HydroATLAS, which are processed in the cloud to reduce the burden of handling large datasets []. In satellite-based datasets like PERSIANN, GLDAS, and GRACE, ongoing efforts aim to harmonize data products and reduce discrepancies. PERSIANN combines satellite and ground-based observations for high-resolution precipitation estimates, refining algorithms and validation techniques for consistency []. The GLDAS and GRACE data processing methodologies are continually updated to enhance resolution and integration with other datasets [,]. Despite these initiatives, full standardization across LSH datasets is still limited. These advancements and collaborative approaches are crucial for overcoming the challenges of hydrological data heterogeneity. Standardizing measurement techniques and data management practices will improve the reliability and comparability of LSH datasets, enabling more robust research and better-informed water management strategies.

7. Benefits of High-Resolution Datasets over Traditional Methods

Although high-resolution datasets have their own limitations in spatial and temporal resolution and in data accessibility, their integration has significantly improved the monitoring and assessment of various hydrological and meteorological parameters, addressing several shortcomings inherent in traditional methods. Table 4 highlights the comparative advantages of high-resolution datasets over conventional approaches across different aspects, such as precipitation monitoring, streamflow assessment, and general data issues. Traditional methods, such as rain gauges and empirical models, often suffer from limitations like sparse distribution, point-specific data, and extensive calibration requirements. In contrast, high-resolution satellite data (e.g., PERSIANN-CDR and CHIRPS) provide comprehensive spatial and temporal coverage, enhancing the accuracy and reliability of precipitation and streamflow estimates. Additionally, projects like NLDAS and GLDAS offer detailed temporal data and integrate advanced observational inputs, significantly improving model accuracy. The table also underscores the benefits of standardized, high-resolution datasets in ensuring consistent data quality, filling data gaps, and improving the detection and modeling of extreme weather events.
Table 4. Comparison of the advantages of high-resolution datasets over conventional approaches.

8. Future Directions

8.1. Focusing on Specific Hydrologic Regimes

There is a growing need for large sample hydrology datasets that cater to the specific needs of researchers studying different hydrologic regimes. These datasets should include relevant data tailored to the dominant hydrologic processes in each regime. For example, datasets for snow-dominated catchments might include information on the snowpack, snow water equivalent (SWE), snowmelt rates, and freezing and thawing cycles. Permafrost regions are particularly sensitive to climate change [,,], and datasets should include data on permafrost extent, temperature, and thaw depth. Similarly, datasets for arid regions could encompass data on soil moisture, evapotranspiration, and groundwater recharge. Furthermore, urbanization significantly alters hydrological processes [,,], and datasets should include data on impervious cover, drainage networks, and water use patterns. Datasets for monsoon-influenced catchments could include precipitation data with high temporal resolution to capture the intense bursts of rainfall that occur during these events, as well as data on soil infiltration capacity and surface runoff processes. By focusing on specific hydrologic regimes, researchers can develop more accurate and transferable machine learning models for different hydrological settings.
The future directions of current LSH datasets are summarized in Figure 6.
Figure 6. Summary of the future directions of current LSH datasets.

8.2. Incorporating Human Impacts

Current large-sample hydrology datasets offer a wealth of information for machine learning applications. However, as noted in the limitations discussed earlier in this review, a critical gap exists in capturing the influence of human activities on hydrological systems. While datasets such as CAMELS, NLDAS, and GLDAS excel at capturing natural climatic and geographic drivers of streamflow, precipitation, and other hydrological variables, they often lack data on how human actions such as irrigation, urbanization, and water management practices are altering the water cycle []. This gap can limit the applicability of machine learning models in regions heavily influenced by anthropogenic factors, potentially leading to inaccurate predictions and analyses. Integrating data on water use, infrastructure development, and land management practices into large-sample hydrology datasets is therefore crucial for a more comprehensive picture. By accounting for human activities, machine learning models for hydrological applications can achieve better accuracy and generalizability. Models that account for the complex interplay between human water use and natural climate variability will be more reliable for forecasting future streamflow patterns and water availability. Datasets that encompass human impacts also allow researchers to develop models that simulate the combined effects of climate change and human water use, and that predict future water availability under different water management strategies and climate change projections.
Data on human activities that impact hydrology, such as water withdrawals, irrigation practices, and reservoir operations, are often limited in spatial and temporal coverage, and data quality and consistency can vary significantly across regions. Human impact data also come in various formats and units, requiring careful standardization and harmonization before integration with hydrological data. Furthermore, data on water use, especially for agricultural or industrial purposes, may be subject to privacy restrictions. Nevertheless, several existing datasets offer valuable information on human activities that impact hydrology. The Socioeconomic Data and Applications Center (SEDAC) [], the Global Water Resources Modeling Coalition (GWRC) [], the Food and Agriculture Organization (FAO), and Aquastat (FAO’s global information system on water and agriculture) [] provide data on water use, infrastructure, and land management practices. Integrating data from these sources with traditional hydrological datasets could significantly enhance the information available for machine learning applications.

8.3. Uncertainty Quantification

A crucial element often missing from these datasets is a clear account of their associated uncertainties. Like any scientific measurement, hydrological data are inherently uncertain due to various factors [,]. Ignoring these uncertainties can lead to misleading results and unreliable predictions from machine learning models. Uncertainty quantification allows researchers to assess the reliability and limitations of the data used to train machine learning models [,]. By understanding the range of potential values and the likelihood of errors, more robust and informative models can be built. Models trained on data without uncertainty estimates may perform well on historical data but struggle when applied to new scenarios. Uncertainty quantification is also needed to express a model’s confidence in its predictions, providing a more realistic picture of its generalizability [,]. Table 5 summarizes the methods of uncertainty quantification in hydrology, with their descriptions, applications, and impacts on hydrological models.
Table 5. Impacts of uncertainty quantification methods on hydrological models.
Real-world water resource management decisions often involve inherent uncertainties. By incorporating uncertainty estimates into machine learning models, a more complete picture of the risks and potential outcomes associated with different water management strategies can be provided to decision-makers. Uncertainty quantification can help identify potential biases in the data. For example, systematic errors in precipitation measurements might lead to the underestimation of streamflow in certain regions [,,]. Identifying these biases allows for data correction or the development of models that are less sensitive to them. Propagating known measurement errors through models used to generate data allows for estimating the overall uncertainty in the derived variables. Running multiple hydrological models with different parameterizations on the same data can also provide an ensemble of potential outcomes. The spread of these outcomes indicates the uncertainty associated with the model predictions. Furthermore, Bayesian methods allow for incorporating prior knowledge about the uncertainties associated with the data into the analysis []. This can be particularly useful when dealing with limited data or missing information.
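The three ideas in the paragraph above (propagating measurement error, reading uncertainty from ensemble spread, and Bayesian updating with prior knowledge) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from the reviewed literature: `toy_runoff` is a hypothetical one-parameter stand-in for a hydrological model, the precipitation error is assumed to be Gaussian and multiplicative, and the Bayesian step uses the textbook normal-normal conjugate update.

```python
import random
import statistics


def toy_runoff(precip_mm: float, runoff_coeff: float) -> float:
    """Hypothetical stand-in for a hydrological model (assumption)."""
    return runoff_coeff * precip_mm


def propagate_measurement_error(precip_mm, rel_sigma, runoff_coeff,
                                n_samples=5000, seed=0):
    """Monte Carlo propagation of a relative precipitation error.

    Returns the mean simulated runoff and an empirical 5%-95% interval.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Perturb the observation; clip at zero since rainfall is non-negative.
        perturbed = max(0.0, precip_mm * (1.0 + rng.gauss(0.0, rel_sigma)))
        samples.append(toy_runoff(perturbed, runoff_coeff))
    samples.sort()
    lower = samples[int(0.05 * (n_samples - 1))]
    upper = samples[int(0.95 * (n_samples - 1))]
    return statistics.fmean(samples), lower, upper


def ensemble_spread(precip_mm, runoff_coeffs):
    """Run the model under several parameterizations; return mean and spread."""
    predictions = [toy_runoff(precip_mm, c) for c in runoff_coeffs]
    return statistics.fmean(predictions), statistics.stdev(predictions)


def bayes_update_normal(prior_mean, prior_var, obs, obs_var):
    """Normal-normal conjugate update: combine a prior with one observation."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var


if __name__ == "__main__":
    print(propagate_measurement_error(20.0, 0.10, 0.4))
    print(ensemble_spread(20.0, [0.3, 0.4, 0.5]))
    print(bayes_update_normal(0.0, 1.0, 2.0, 1.0))
```

In the Bayesian step, a vague prior (large `prior_var`) lets the observation dominate, while an uncertain observation (large `obs_var`) leaves the estimate close to the prior, which is why the approach suits sparse or gap-filled records.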

8.4. Real-Time Data Integration

For real-time decision making and proactive water resource management, real-time data integration is becoming increasingly crucial []. This involves continuously collecting, processing, and incorporating the latest hydrological observations into machine learning models, so that the models can adapt to changing hydrological conditions. By leveraging this constant flow of data, machine learning models can evolve from historical analysis tools into real-time decision support systems. For example, real-time data on precipitation, river stages, and soil moisture allow for more accurate and timely flood forecasts []. Machine learning models can be continuously updated with the latest observations, leading to earlier warnings and more effective evacuation measures. Real-time data on streamflow, groundwater levels, and evapotranspiration can be used to monitor drought conditions and identify areas at risk. Early detection allows for proactive water management strategies, such as water restrictions or targeted conservation efforts. Real-time data on inflows, outflows, and downstream water demands can be used to optimize reservoir operations. Machine learning models can be trained to predict future water availability and suggest release strategies that balance competing needs for hydropower generation, irrigation, and environmental flows. Several challenges remain. Real-time data must arrive with minimal delay, because slow or unreliable transmission undermines models that rely on the most up-to-date information. Real-time data streams may also contain errors or inconsistencies [], so robust quality control measures are essential to ensure the accuracy and reliability of the data used by machine learning models. Finally, processing large volumes of real-time data requires significant computational power, and machine learning models need to be optimized to handle real-time data streams efficiently without compromising accuracy.
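To make the quality control requirement concrete, the following sketch screens incoming river-stage readings one at a time with two simple checks: a plausible-range check and a rate-of-change (spike) check against the last accepted value. The class name `StageQC`, the thresholds, and the flag labels are all assumptions chosen for illustration; operational systems typically use more elaborate rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageQC:
    """Screen a stream of river-stage readings (hypothetical thresholds)."""
    min_valid: float              # lowest physically plausible stage
    max_valid: float              # highest physically plausible stage
    max_step: float               # largest plausible change between readings
    last_good: Optional[float] = None

    def check(self, value: float) -> str:
        # Range check; a NaN reading also fails this comparison.
        if not (self.min_valid <= value <= self.max_valid):
            return "out_of_range"
        # Spike check against the most recent accepted reading.
        if self.last_good is not None and abs(value - self.last_good) > self.max_step:
            return "spike"
        self.last_good = value
        return "ok"


if __name__ == "__main__":
    qc = StageQC(min_valid=0.0, max_valid=10.0, max_step=0.5)
    for reading in [1.0, 1.2, 9.9, 1.3, -1.0]:
        print(reading, qc.check(reading))
```

One design caveat: because a flagged reading never updates `last_good`, a genuine rapid rise (for example during a flash flood) would keep being flagged as a spike, so a real deployment would need an additional rule, such as accepting a change after several consecutive consistent readings.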
Several technologies facilitate real-time data integration in hydrology. Dense networks of sensors deployed across catchments can collect real-time data on precipitation, river stages, soil moisture, and other variables []. These networks provide a continuous stream of observations for machine learning models. Modern sensors with enhanced accuracy and durability can monitor water quality, streamflow, soil moisture, and groundwater levels in real time. Coupled with IoT devices and robust 5G networks, these sensors enable extensive data collection and rapid transmission, even from remote areas. This integration supports dynamic modeling and predictive analytics, allowing water managers to respond swiftly to issues like droughts, floods, and contamination, thereby enhancing sustainability and resilience in water management practices. Moreover, cloud-based platforms offer the scalable computational resources needed to process and analyze large volumes of data in real time [,]. Machine learning models can be deployed on these platforms for continuous training and prediction. Furthermore, streaming analytics techniques are specifically designed to handle continuous data streams [,]. These techniques allow for the real-time processing and analysis of hydrological data, enabling near-instantaneous insights and predictions.
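A core pattern in streaming analytics is to keep a small, fixed-size summary of the recent past and update it incrementally as each observation arrives, instead of reprocessing the full record. The sketch below maintains a rolling rainfall total over the last few readings and raises a flag when it reaches a threshold. The class name, window length, and alert threshold are hypothetical values chosen for illustration.

```python
from collections import deque
from typing import Tuple


class RollingRainfall:
    """Rolling rainfall total over the last `window` readings."""

    def __init__(self, window: int, alert_mm: float):
        self.window = window
        self.alert_mm = alert_mm
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, rain_mm: float) -> Tuple[float, bool]:
        # When the window is full, the oldest reading is about to be evicted,
        # so remove its contribution from the running total first.
        if len(self._values) == self.window:
            self._total -= self._values[0]
        self._values.append(rain_mm)
        self._total += rain_mm
        return self._total, self._total >= self.alert_mm


if __name__ == "__main__":
    monitor = RollingRainfall(window=3, alert_mm=30.0)
    for rain in [5.0, 10.0, 20.0, 0.0, 0.0]:
        print(monitor.update(rain))
```

Each update costs constant time regardless of how long the stream has been running. Over very long streams the running total can accumulate floating-point drift, so periodically recomputing it from the stored window is a reasonable safeguard.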

8.5. Data Collection

The future of data collection in hydrology and climate science holds exciting potential, particularly with the integration of citizen science initiatives. Engaging the public in scientific research democratizes data collection and significantly enhances the scope and scale of gathered data [,]. Citizen science can involve the public in reporting local precipitation levels, streamflow measurements, and groundwater levels using mobile apps, complementing traditional monitoring systems, especially in under-monitored regions []. Additionally, citizen science enhances data density and spatial coverage, fosters community engagement and awareness, and provides timely, localized insights into hydrological events, thereby improving the accuracy and responsiveness of flood warnings and water resource management [,]. However, the potential negative impacts of citizen science should also be considered, including over-burdening participants, health and safety risks, decreased self-reliance, exclusion, technology barriers, decentralizing monitoring and risk, conflict creation, data privacy concerns, and demotivational impacts []. Mobile and wearable technology, equipped with GPS, cameras, and environmental sensors, enables real-time, location-based data collection []. Social media and online platforms can gather real-time reports from users experiencing hydrological events, providing timely georeferenced data [,]. The Internet of Things (IoT) offers automated, continuous monitoring through smart sensors in rivers and lakes that transmit real-time data on water levels and flow rates [], making these data available for hydrological analyses. Educational programs and community engagement can train citizens in scientific data collection methods, enhancing data quality and reliability. Integration with traditional methods and robust validation protocols, including machine learning techniques, can ensure the accuracy of citizen-collected data. Supportive policies and governance frameworks are essential in recognizing the value of such data and providing necessary tools. Ethical considerations, including data privacy and ownership, must be addressed to build trust and encourage participation. Leveraging citizen science and modern technology can lead to more comprehensive and inclusive data collection, driving innovation in hydrology and climate science.
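As one simple example of what a validation step for citizen-collected data might look like, the sketch below screens a batch of reports of the same quantity (for instance, rainfall readings from neighboring volunteers for the same day) using a robust outlier test based on the median and the median absolute deviation (MAD). This is a plain statistical screen rather than a machine learning technique, and the function name, the sample values, and the commonly used cutoff of 3.5 on the modified z-score are assumptions for illustration.

```python
import statistics
from typing import List, Sequence


def flag_outlier_reports(values: Sequence[float], threshold: float = 3.5) -> List[bool]:
    """Flag reports whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, which is less
    sensitive to a few bad reports than a mean/standard-deviation test.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # More than half of the reports are identical; the score is undefined,
        # so this sketch flags nothing rather than guess.
        return [False] * len(values)
    return [0.6745 * abs(v - median) / mad > threshold for v in values]


if __name__ == "__main__":
    # Hypothetical daily rainfall reports (mm) from nearby volunteers.
    reports = [12.0, 11.5, 12.4, 11.9, 45.0, 12.1]
    print(flag_outlier_reports(reports))
```

Flagged reports are better routed to review than silently dropped, since an unusual value may reflect a genuinely localized event rather than an error.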

9. Conclusions

The integration of machine learning (ML) in hydrology has significantly advanced our understanding and prediction of various hydrological processes. The availability of extensive datasets such as CAMELS, Caravan, GRDC, CHIRPS, NLDAS, GLDAS, and GRACE has been crucial in supporting these advancements. These datasets provide diverse and comprehensive data necessary for developing robust ML models that can predict streamflow, groundwater levels, precipitation, and flood frequencies, even in data-scarce regions. The CAMELS dataset, with its detailed catchment attributes and meteorological data, has been instrumental in enhancing streamflow and rainfall-runoff modeling. Similarly, the Caravan dataset’s standardization and aggregation of global hydrology data facilitate large-scale hydrological studies. GRDC’s extensive river discharge data, along with CHIRPS’ high-resolution precipitation records, provide invaluable inputs for accurate hydrological modeling. Despite the significant progress, challenges remain. The datasets often face issues with spatial and temporal resolution, data quality, and consistency. For example, the CAMELS dataset’s daily resolution may not capture finer temporal variations, while the GLDAS dataset’s coarse spatial resolution can limit its application in detailed local studies. Additionally, the integration of human activities and impacts into these datasets is still lacking, although it is crucial for comprehensive hydrological modeling. Nevertheless, high-resolution datasets have a number of advantages, including improved accuracy in hydrological modeling, enhanced spatial and temporal detail, and the ability to capture finer-scale processes that are often missed by coarser datasets. Future directions should focus on improving dataset resolution, integrating human impact data, enhancing real-time data integration, and including citizen science and the IoT in data collection. Developing datasets tailored to specific hydrologic regimes and incorporating uncertainty quantification will further refine ML models and their applications in hydrology. By addressing these challenges and leveraging the strengths of these datasets, the field of hydrology can continue to benefit from the transformative potential of machine learning, leading to more accurate predictions, better water resource management, and improved resilience to climatic extremes.

Funding

The research was supported by Florida State University Council on Research + Creativity (CRC): Sustainability through funding number 046725.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lange, H.; Sippel, S. Machine Learning Applications in Hydrology; Springer: Cham, Switzerland, 2020; pp. 233–257. [Google Scholar] [CrossRef]
  2. Raschka, S.; Patterson, J.; Nolet, C. Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence. Information 2020, 11, 193. [Google Scholar] [CrossRef]
  3. Xu, Y.; Liu, X.; Cao, X.; Huang, C.; Liu, E.; Qian, S.; Liu, X.; Wu, Y.; Dong, F.; Qiu, C.W.; et al. Artificial Intelligence: A Powerful Paradigm for Scientific Research. Innovation 2021, 2, 100179. [Google Scholar] [CrossRef] [PubMed]
  4. Zhou, L.; Pan, S.; Wang, J.; Vasilakos, A.V. Machine Learning on Big Data: Opportunities and Challenges. Neurocomputing 2017, 237, 350–361. [Google Scholar] [CrossRef]
  5. Kratzert, F.; Klotz, D.; Herrnegger, M.; Sampson, A.K.; Hochreiter, S.; Nearing, G.S. Toward Improved Predictions in Ungauged Basins: Exploiting the Power of Machine Learning. Water Resour. Res. 2019, 55, 11344–11354. [Google Scholar] [CrossRef]
  6. Arriagada, P.; Karelovic, B.; Link, O. Automatic Gap-Filling of Daily Streamflow Time Series in Data-Scarce Regions Using a Machine Learning Algorithm. J. Hydrol. 2021, 598, 126454. [Google Scholar] [CrossRef]
  7. Lu, D.; Konapala, G.; Painter, S.L.; Kao, S.C.; Gangrade, S. Streamflow Simulation in Data-Scarce Basins Using Bayesian and Physics-Informed Machine Learning Models. J. Hydrometeorol. 2021, 22, 1421–1438. [Google Scholar] [CrossRef]
  8. Yang, C.; Xu, M.; Kang, S.; Fu, C.; Hu, D. Improvement of Streamflow Simulation by Combining Physically Hydrological Model with Deep Learning Methods in Data-Scarce Glacial River Basin. J. Hydrol. 2023, 625, 129990. [Google Scholar] [CrossRef]
  9. Rafik, A.; Ait Brahim, Y.; Amazirh, A.; Ouarani, M.; Bargam, B.; Ouatiki, H.; Bouslihim, Y.; Bouchaou, L.; Chehbouni, A. Groundwater Level Forecasting in a Data-Scarce Region through Remote Sensing Data Downscaling, Hydrological Modeling, and Machine Learning: A Case Study from Morocco. J. Hydrol. Reg. Stud. 2023, 50, 101569. [Google Scholar] [CrossRef]
  10. Guzman, S.M.; Paz, J.O.; Tagert, M.L.M.; Mercer, A.E. Evaluation of Seasonally Classified Inputs for the Prediction of Daily Groundwater Levels: NARX Networks Vs Support Vector Machines. Environ. Model. Assess. 2019, 24, 223–234. [Google Scholar] [CrossRef]
  11. Zhu, H.; Zhou, Q. Advancing Satellite-Derived Precipitation Downscaling in Data-Sparse Area Through Deep Transfer Learning. IEEE Trans. Geosci. Remote Sens. 2024, 62, 4102513. [Google Scholar] [CrossRef]
  12. Mangukiya, N.K.; Sharma, A. Alternate Pathway for Regional Flood Frequency Analysis in Data-Sparse Region. J. Hydrol. 2024, 629, 130635. [Google Scholar] [CrossRef]
  13. Newman, A.J.; Clark, M.P.; Sampson, K.; Wood, A.; Hay, L.E.; Bock, A.; Viger, R.J.; Blodgett, D.; Brekke, L.; Arnold, J.R.; et al. Development of a Large-Sample Watershed-Scale Hydrometeorological Data Set for the Contiguous USA: Data Set Characteristics and Assessment of Regional Variability in Hydrologic Model Performance. Hydrol. Earth Syst. Sci. 2015, 19, 209–223. [Google Scholar] [CrossRef]
  14. Addor, N.; Newman, A.J.; Mizukami, N.; Clark, M.P. The CAMELS Data Set: Catchment Attributes and Meteorology for Large-Sample Studies. Hydrol. Earth Syst. Sci. 2017, 21, 5293–5313. [Google Scholar] [CrossRef]
  15. Clerc-Schwarzenbach, F.M.; Selleri, G.; Neri, M.; Toth, E.; van Meerveld, I.; Seibert, J. HESS Opinions: A Few Camels or a Whole Caravan? EGUsphere 2024, 2024, 1–29. [Google Scholar] [CrossRef]
  16. Kratzert, F.; Nearing, G.; Addor, N.; Erickson, T.; Gauch, M.; Gilon, O.; Gudmundsson, L.; Hassidim, A.; Klotz, D.; Nevo, S.; et al. Caravan-A Global Community Dataset for Large-Sample Hydrology. Sci. Data 2023, 10, 61. [Google Scholar] [CrossRef] [PubMed]
  17. Funk, C.; Peterson, P.; Landsfeld, M.; Pedreros, D.; Verdin, J.; Shukla, S.; Husak, G.; Rowland, J.; Harrison, L.; Hoell, A.; et al. The Climate Hazards Infrared Precipitation with Stations—A New Environmental Record for Monitoring Extremes. Sci. Data 2015, 2, 150066. [Google Scholar] [CrossRef] [PubMed]
  18. Adem, E.; Elfeki, A.; Chaabani, A.; Alwegdani, A.; Hussain, S.; Elhag, M. Impact of Satellite Precipitation Estimation Methods on the Hydrological Response: Case Study Wadi Nu’man Basin, Saudi Arabia. Theor. Appl. Climatol. 2024, 155, 3907–3925. [Google Scholar] [CrossRef]
  19. Wang, M.; Rezaie-Balf, M.; Naganna, S.R.; Yaseen, Z.M. Sourcing CHIRPS Precipitation Data for Streamflow Forecasting Using Intrinsic Time-Scale Decomposition Based Machine Learning Models. Hydrol. Sci. J. 2021, 66, 1437–1456. [Google Scholar] [CrossRef]
  20. Khan, M.A.; Stamm, J. Assessment of the Hydrological and Coupled Soft Computing Models, Based on Different Satellite Precipitation Datasets, to Simulate Streamflow and Sediment Load in a Mountainous Catchment. J. Water Clim. Change 2023, 14, 610–632. [Google Scholar] [CrossRef]
  21. Bhusal, A.; Parajuli, U.; Regmi, S.; Kalra, A. Application of Machine Learning and Process-Based Models for Rainfall-Runoff Simulation in DuPage River Basin, Illinois. Hydrology 2022, 9, 117. [Google Scholar] [CrossRef]
  22. Yeditha, P.K.; Kasi, V.; Rathinasamy, M.; Agarwal, A. Forecasting of Extreme Flood Events Using Different Satellite Precipitation Products and Wavelet-Based Machine Learning Methods. Chaos 2020, 30, 063115. [Google Scholar] [CrossRef]
  23. Chancay, J.E.; Espitia-Sarmiento, E.F. Improving Hourly Precipitation Estimates for Flash Flood Modeling in Data-Scarce Andean-Amazon Basins: An Integrative Framework Based on Machine Learning and Multiple Remotely Sensed Data. Remote Sens. 2021, 13, 4446. [Google Scholar] [CrossRef]
  24. Hayatbini, N.; Kong, B.; Hsu, K.L.; Nguyen, P.; Sorooshian, S.; Stephens, G.; Fowlkes, C.; Nemani, R.; Ganguly, S. Conditional Generative Adversarial Networks (CGANs) for near Real-Time Precipitation Estimation from Multispectral GOES-16 Satellite Imageries-PERSIANN-CGAN. Remote Sens. 2019, 11, 2193. [Google Scholar] [CrossRef]
  25. Tao, Y.; Hsu, K.; Ihler, A.; Gao, X.; Sorooshian, S. A Two-Stage Deep Neural Network Framework for Precipitation Estimation from Bispectral Satellite Information. J. Hydrometeorol. 2018, 19, 393–408. [Google Scholar] [CrossRef]
  26. Das, P.; Zhang, Z.; Ren, H. Evaluating the Accuracy of Two Satellite-Based Quantitative Precipitation Estimation Products and Their Application for Meteorological Drought Monitoring over the Lake Victoria Basin, East Africa. Geo-Spat. Inf. Sci. 2022, 25, 500–518. [Google Scholar] [CrossRef]
  27. Yu, C.; Hu, D.; Shao, H.; Dai, X.; Liu, G.; Wu, S. Runoff Simulation Driven by Multi-Source Satellite Data Based on Hydrological Mechanism Algorithm and Deep Learning Network. J. Hydrol. Re.g Stud. 2024, 52, 101720. [Google Scholar] [CrossRef]
  28. Khajehali, M.; Safavi, H.R.; Nikoo, M.R.; Fooladi, M. A Fusion-Based Framework for Daily Flood Forecasting in Multiple-Step-Ahead and near-Future under Climate Change Scenarios: A Case Study of the Kan River, Iran. In Natural Hazards; Springer: Berlin/Heidelberg, Germany, 2024. [Google Scholar] [CrossRef]
  29. Ayzel, G.; Kurochkina, L.; Zhuravlev, S. The Influence of Regional Hydrometric Data Incorporation on the Accuracy of Gridded Reconstruction of Monthly Runoff. Hydrol. Sci. J. 2022, 67, 2429–2440. [Google Scholar] [CrossRef]
  30. Wang, C.; Jiang, S.; Zheng, Y.; Han, F.; Kumar, R.; Rakovec, O.; Li, S. Distributed Hydrological Modeling With Physics-Encoded Deep Learning: A General Framework and Its Application in the Amazon. Water Resour. Res. 2024, 60, e2023WR036170. [Google Scholar] [CrossRef]
  31. Jiang, S.; Zheng, Y.; Solomatine, D. Improving AI System Awareness of Geoscience Knowledge: Symbiotic Integration of Physical Approaches and Deep Learning. Geophys. Res. Lett. 2020, 47, e2020GL088229. [Google Scholar] [CrossRef]
  32. Xu, T.; Liang, F. Machine Learning for Hydrologic Sciences: An Introductory Overview. WIREs Water 2021, 8, e1533. [Google Scholar] [CrossRef]
  33. Rasheed, Z.; Aravamudan, A.; Gorji Sefidmazgi, A.; Anagnostopoulos, G.C.; Nikolopoulos, E.I. Advancing Flood Warning Procedures in Ungauged Basins with Machine Learning. J. Hydrol. 2022, 609, 127736. [Google Scholar] [CrossRef]
  34. Zhou, F.; Chen, Y.; Liu, J. Application of a New Hybrid Deep Learning Model That Considers Temporal and Feature Dependencies in Rainfall–Runoff Simulation. Remote Sens. 2023, 15, 1395. [Google Scholar] [CrossRef]
  35. Ehteram, M.; Ghanbari-Adivi, E. Self-Attention (SA) Temporal Convolutional Network (SATCN)-Long Short-Term Memory Neural Network (SATCN-LSTM): An Advanced Python Code for Predicting Groundwater Level. Environ. Sci. Pollut. Res. 2023, 30, 92903–92921. [Google Scholar] [CrossRef] [PubMed]
  36. Arsenault, R.; Martel, J.L.; Brunet, F.; Brissette, F.; Mai, J. Continuous Streamflow Prediction in Ungauged Basins: Long Short-Term Memory Neural Networks Clearly Outperform Traditional Hydrological Models. Hydrol. Earth Syst. Sci. 2023, 27, 139–157. [Google Scholar] [CrossRef]
  37. Sabzipour, B.; Arsenault, R.; Troin, M.; Martel, J.L.; Brissette, F.; Brunet, F.; Mai, J. Comparing a Long Short-Term Memory (LSTM) Neural Network with a Physically-Based Hydrological Model for Streamflow Forecasting over a Canadian Catchment. J. Hydrol. 2023, 627, 130380. [Google Scholar] [CrossRef]
  38. Shen, C.; Lawson, K. Applications of Deep Learning in Hydrology. Deep Learning for the Earth Sciences: A Comprehensive Approach to Remote Sensing, Climate Science and Geosciences; John Wiley & Sons: Hoboken, NJ, USA, 2021; pp. 283–297. [Google Scholar] [CrossRef]
  39. Tripathy, K.P.; Mishra, A.K. Deep Learning in Hydrology and Water Resources Disciplines: Concepts, Methods, Applications, and Research Directions. J. Hydrol. 2024, 628, 130458. [Google Scholar] [CrossRef]
  40. Hegelich, S. Decision Trees and Random Forests: Machine Learning Techniques to Classify Rare Events. Eur. Policy Anal. 2016, 2, 98–120. [Google Scholar] [CrossRef]
  41. Ali, J.; Khan, R.; Ahmad, N.; Maqsood, I. Random Forests and Decision Trees. Int. J. Comput. Sci. Issues 2012, 9, 272. [Google Scholar]
  42. He, X.; Chaney, N.W.; Schleiss, M.; Sheffield, J. Spatial Downscaling of Precipitation Using Adaptable Random Forests. Water Resour. Res. 2016, 52, 8217–8237. [Google Scholar] [CrossRef]
  43. Liang, Z.; Tang, T.; Li, B.; Liu, T.; Wang, J.; Hu, Y. Long-Term Streamflow Forecasting Using SWAT through the Integration of the Random Forests Precipitation Generator: Case Study of Danjiangkou Reservoir. Hydrol. Res. 2018, 49, 1513–1527. [Google Scholar] [CrossRef]
  44. Elbeltagi, A.; Pande, C.B.; Kumar, M.; Tolche, A.D.; Singh, S.K.; Kumar, A.; Vishwakarma, D.K. Prediction of Meteorological Drought and Standardized Precipitation Index Based on the Random Forest (RF), Random Tree (RT), and Gaussian Process Regression (GPR) Models. Environ. Sci. Pollut. Res. 2023, 30, 43183–43202. [Google Scholar] [CrossRef] [PubMed]
  45. Saber, M.; Boulmaiz, T.; Guermoui, M.; Abdrabo, K.I.; Kantoush, S.A.; Sumi, T.; Boutaghane, H.; Hori, T.; Binh, D.V.; Nguyen, B.Q.; et al. Enhancing Flood Risk Assessment through Integration of Ensemble Learning Approaches and Physical-Based Hydrological Modeling. Geomat. Nat. Hazards Risk 2023, 14, 2203798. [Google Scholar] [CrossRef]
  46. Anandhi, A.; Srinivas, V.V.; Nanjundiah, R.S.; Nagesh Kumar, D. Downscaling Precipitation to River Basin in India for IPCC SRES Scenarios Using Support Vector Machine. Int. J. Climatol. 2008, 28, 401–420. [Google Scholar] [CrossRef]
  47. Sudheer, C.; Shrivastava, N.A.; Panigrahi, B.K.; Mathur, S. Groundwater Level Forecasting Using SVM-QPSO. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2011; pp. 731–741. [Google Scholar] [CrossRef]
  48. Sudheer, C.; Maheswaran, R.; Panigrahi, B.K.; Mathur, S. A Hybrid SVM-PSO Model for Forecasting Monthly Streamflow. Neural. Comput. Appl. 2014, 24, 1381–1389. [Google Scholar] [CrossRef]
  49. Raghavendra, S.; Deka, P.C. Support Vector Machine Applications in the Field of Hydrology: A Review. Appl. Soft Comput. 2014, 19, 372–386. [Google Scholar] [CrossRef]
  50. Pappu, V.; Pardalos, P.M. High-Dimensional Data Classification. Springer Optim. Its Appl. 2014, 92, 119–150. [Google Scholar] [CrossRef]
  51. Xu XUHUAN, H.; Caramanis, C.; Mannor, S.; Smola, A. Robustness and Regularization of Support Vector Machines. J. Mach. Learn. Res. 2009, 10, 1485–1510. [Google Scholar]
  52. Li, H.X.; Yang, J.L.; Zhang, G.; Fan, B. Probabilistic Support Vector Machines for Classification of Noise Affected Data. Inf. Sci. 2013, 221, 60–71. [Google Scholar] [CrossRef]
  53. Tan, C.O.; Beklioglu, M. Modeling Complex Nonlinear Responses of Shallow Lakes to Fish and Hydrology Using Artificial Neural Networks. Ecol. Model. 2006, 196, 183–194. [Google Scholar] [CrossRef]
  54. Kouadri, S.; Pande, C.B.; Panneerselvam, B.; Moharir, K.N.; Elbeltagi, A. Prediction of Irrigation Groundwater Quality Parameters Using ANN, LSTM, and MLR Models. Environ. Sci. Pollut. Res. 2022, 29, 21067–21091. [Google Scholar] [CrossRef]
  55. Wu, W.; Dandy, G.C.; Maier, H.R. Protocol for Developing ANN Models and Its Application to the Assessment of the Quality of the ANN Model Development Process in Drinking Water Quality Modelling. Environ. Model. Softw. 2014, 54, 108–127. [Google Scholar] [CrossRef]
  56. Chang, L.C.; Amin, M.Z.M.; Yang, S.N.; Chang, F.J. Building ANN-Based Regional Multi-Step-Ahead Flood Inundation Forecast Models. Water 2018, 10, 1283. [Google Scholar] [CrossRef]
  57. Nourani, V.; Komasi, M.; Mano, A. A Multivariate ANN-Wavelet Approach for Rainfall-Runoff Modeling. Water Resour. Manag. 2009, 23, 2877–2894. [Google Scholar] [CrossRef]
  58. Carabantes, M. Black-Box Artificial Intelligence: An Epistemological and Critical Analysis. AI Soc. 2020, 35, 309–317. [Google Scholar] [CrossRef]
  59. Khalaf Jabbar Rafiqul Zaman Khan, H.D. Methods to Avoid Over-Fitting and under-Fitting in Supervised Machine Learning (Comparative Study). Comput. Sci. Commun. Instrum. Devices 2015, 70, 978–981. [Google Scholar]
  60. Piotrowski, A.P.; Napiorkowski, J.J. A Comparison of Methods to Avoid Overfitting in Neural Networks Training in the Case of Catchment Runoff Modelling. J. Hydrol. 2013, 476, 97–111. [Google Scholar] [CrossRef]
  61. Pham, Q.B.; Kumar, M.; Di Nunno, F.; Elbeltagi, A.; Granata, F.; Islam, A.R.M.T.; Talukdar, S.; Nguyen, X.C.; Ahmed, A.N.; Anh, D.T. Groundwater Level Prediction Using Machine Learning Algorithms in a Drought-Prone Area. Neural. Comput. Appl. 2022, 34, 10751–10773. [Google Scholar] [CrossRef]
  62. Ibrahem Ahmed Osman, A.; Najah Ahmed, A.; Chow, M.F.; Feng Huang, Y.; El-Shafie, A. Extreme Gradient Boosting (Xgboost) Model to Predict the Groundwater Levels in Selangor Malaysia. Ain Shams Eng. J. 2021, 12, 1545–1556. [Google Scholar] [CrossRef]
  63. Chen, L.; Xing, M.; He, B.; Wang, J.; Shang, J.; Huang, X.; Xu, M. Estimating Soil Moisture over Winter Wheat Fields during Growing Season Using Machine-Learning Methods. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3706–3718. [Google Scholar] [CrossRef]
  64. Xu, K.; Han, Z.; Xu, H.; Bin, L. Rapid Prediction Model for Urban Floods Based on a Light Gradient Boosting Machine Approach and Hydrological–Hydraulic Model. Int. J. Disaster Risk Sci. 2023, 14, 79–97. [Google Scholar] [CrossRef]
  65. Natekin, A.; Knoll, A. Gradient Boosting Machines, a Tutorial. Front. Neurorobot. 2013, 7, 21. [Google Scholar] [CrossRef] [PubMed]
  66. Tao, H.; Awadh, S.M.; Salih, S.Q.; Shafik, S.S.; Yaseen, Z.M. Integration of Extreme Gradient Boosting Feature Selection Approach with Machine Learning Models: Application of Weather Relative Humidity Prediction. Neural Comput. Appl. 2022, 34, 515–533. [Google Scholar] [CrossRef]
  67. Wang, Y.; Fang, Z.; Hong, H.; Peng, L. Flood Susceptibility Mapping Using Convolutional Neural Network Frameworks. J. Hydrol. 2020, 582, 124482. [Google Scholar] [CrossRef]
  68. Sadeghi, M.; Asanjan, A.A.; Faridzad, M.; Nguyen, P.; Hsu, K.; Sorooshian, S.; Braithwaite, D. PERSIANN-CNN: Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks–Convolutional Neural Networks. J. Hydrometeorol. 2019, 20, 2273–2289. [Google Scholar] [CrossRef]
  69. Yang, F.; Feng, T.; Xu, G.; Chen, Y. Applied Method for Water-Body Segmentation Based on Mask R-CNN. J. Appl. Remote Sens. 2020, 14, 1. [Google Scholar] [CrossRef]
  70. Naganna, S.R.; Marulasiddappa, S.B.; Balreddy, M.S.; Yaseen, Z.M. Daily Scale Streamflow Forecasting in Multiple Stream Orders of Cauvery River, India: Application of Advanced Ensemble and Deep Learning Models. J. Hydrol. 2023, 626, 130320. [Google Scholar] [CrossRef]
  71. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional Neural Networks: An Overview and Application in Radiology. Insights Imaging 2018, 9, 611–629. [Google Scholar] [CrossRef] [PubMed]
  72. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of Deep Learning: Concepts, CNN Architectures, Challenges, Applications, Future Directions. J. Big Data 2021, 8, 53. [Google Scholar] [CrossRef] [PubMed]
  73. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11106–11115. [Google Scholar] [CrossRef]
  74. Castangia, M.; Grajales, L.M.M.; Aliberti, A.; Rossi, C.; Macii, A.; Macii, E.; Patti, E. Transformer Neural Networks for Interpretable Flood Forecasting. Environ. Model. Softw. 2023, 160, 105581. [Google Scholar] [CrossRef]
  75. Liu, C.; Liu, D.; Mu, L. Improved Transformer Model for Enhanced Monthly Streamflow Predictions of the Yangtze River. IEEE Access 2022, 10, 58240–58253. [Google Scholar] [CrossRef]
  76. Ghobadi, F.; Kang, D. Improving Long-Term Streamflow Prediction in a Poorly Gauged Basin Using Geo-Spatiotemporal Mesoscale Data and Attention-Based Deep Learning: A Comparative Study. J. Hydrol. 2022, 615, 128608. [Google Scholar] [CrossRef]
  77. Yin, L.; Wang, L.; Keim, B.D.; Konsoer, K.; Yin, Z.; Liu, M.; Zheng, W. Spatial and Wavelet Analysis of Precipitation and River Discharge during Operation of the Three Gorges Dam, China. Ecol. Indic. 2023, 154, 110837. [Google Scholar] [CrossRef]
  78. Zhang, K.; Li, Y.; Yu, Z.; Yang, T.; Xu, J.; Chao, L.; Ni, J.; Wang, L.; Gao, Y.; Hu, Y.; et al. Xin’anjiang Nested Experimental Watershed (XAJ-NEW) for Understanding Multiscale Water Cycle: Scientific Objectives and Experimental Design. Engineering 2022, 18, 207–217. [Google Scholar] [CrossRef]
  79. Global Runoff Data Centre (GRDC)-Dataset-Waterdata. Available online: https://wbwaterdata.org/dataset/global-runoff-data-centre-grdc (accessed on 17 May 2024).
  80. Nguyen, P.; Shearer, E.J.; Tran, H.; Ombadi, M.; Hayatbini, N.; Palacios, T.; Huynh, P.; Braithwaite, D.; Updegraff, G.; Hsu, K.; et al. The CHRS Data Portal, an Easily Accessible Public Repository for PERSIANN Global Satellite Precipitation Data. Sci. Data 2019, 6, 180296. [Google Scholar] [CrossRef] [PubMed]
  81. Ashouri, H.; Hsu, K.L.; Sorooshian, S.; Braithwaite, D.K.; Knapp, K.R.; Cecil, L.D.; Nelson, B.R.; Prat, O.P. PERSIANN-CDR: Daily Precipitation Climate Data Record from Multisatellite Observations for Hydrological and Climate Studies. Bull. Am. Meteorol. Soc. 2015, 96, 69–83. [Google Scholar] [CrossRef]
  82. Sadeghi, M.; Nguyen, P.; Naeini, M.R.; Hsu, K.; Braithwaite, D.; Sorooshian, S. PERSIANN-CCS-CDR, a 3-Hourly 0.04° Global Precipitation Climate Data Record for Heavy Precipitation Studies. Sci. Data 2021, 8, 157. [Google Scholar] [CrossRef] [PubMed]
  83. Ma, K.; Feng, D.; Lawson, K.; Tsai, W.P.; Liang, C.; Huang, X.; Sharma, A.; Shen, C. Transferring Hydrologic Data Across Continents–Leveraging Data-Rich Regions to Improve Hydrologic Prediction in Data-Sparse Regions. Water Resour. Res. 2021, 57, e2020WR028600. [Google Scholar] [CrossRef]
  84. Ouyang, W.; Lawson, K.; Feng, D.; Ye, L.; Zhang, C.; Shen, C. Continental-Scale Streamflow Modeling of Basins with Reservoirs: Towards a Coherent Deep-Learning-Based Strategy. J. Hydrol. 2021, 599, 126455. [Google Scholar] [CrossRef]
  85. Kratzert, F.; Klotz, D.; Brenner, C.; Schulz, K.; Herrnegger, M. Rainfall-Runoff Modelling Using Long Short-Term Memory (LSTM) Networks. Hydrol. Earth Syst. Sci. 2018, 22, 6005–6022. [Google Scholar] [CrossRef]
  86. Khand, K.; Senay, G.B. Evaluation of Streamflow Predictions from LSTM Models in Water- and Energy-Limited Regions in the United States. Mach. Learn. Appl. 2024, 16, 100551. [Google Scholar] [CrossRef]
  87. Xu, L.; Shi, P.; Wu, H.; Qu, S.; Li, Q.; Sun, Y.; Yang, X.; Jiang, P.; Qiu, C. Investigating the Potential of EMA-Embedded Feature Selection Method for ESVR and LSTM to Enhance the Robustness of Monthly Streamflow Forecasting from Local Meteorological Information. J. Hydrol. 2024, 636, 131230. [Google Scholar] [CrossRef]
  88. Duan, S.; Ullrich, P.; Shu, L. Using Convolutional Neural Networks for Streamflow Projection in California. Front. Water 2020, 2, 28. [Google Scholar] [CrossRef]
  89. Ren, K.; Fang, W.; Qu, J.; Zhang, X.; Shi, X. Comparison of Eight Filter-Based Feature Selection Methods for Monthly Streamflow Forecasting–Three Case Studies on CAMELS Data Sets. J. Hydrol. 2020, 586, 124897. [Google Scholar] [CrossRef]
  90. Feng, D.; Fang, K.; Shen, C. Enhancing Streamflow Forecast and Extracting Insights Using Long-Short Term Memory Networks With Data Integration at Continental Scales. Water Resour. Res. 2020, 56, e2019WR026793. [Google Scholar] [CrossRef]
  91. Sadler, J.M.; Appling, A.P.; Read, J.S.; Oliver, S.K.; Jia, X.; Zwart, J.A.; Kumar, V. Multi-Task Deep Learning of Daily Streamflow and Water Temperature. Water Resour. Res. 2022, 58, e2021WR030138. [Google Scholar] [CrossRef]
  92. Wi, S.; Steinschneider, S. Assessing the Physical Realism of Deep Learning Hydrologic Model Projections Under Climate Change. Water Resour. Res. 2022, 58, e2022WR032123. [Google Scholar] [CrossRef]
  93. Tyralis, H.; Papacharalampous, G.; Langousis, A. Super Ensemble Learning for Daily Streamflow Forecasting: Large-Scale Demonstration and Comparison with Multiple Machine Learning Algorithms. Neural Comput. Appl. 2021, 33, 3053–3068. [Google Scholar] [CrossRef]
  94. Frame, J.M.; Kratzert, F.; Raney, A.; Rahman, M.; Salas, F.R.; Nearing, G.S. Post-Processing the National Water Model with Long Short-Term Memory Networks for Streamflow Predictions and Model Diagnostics. JAWRA J. Am. Water Resour. Assoc. 2021, 57, 885–905. [Google Scholar] [CrossRef]
  95. Feng, D.; Liu, J.; Lawson, K.; Shen, C. Differentiable, Learnable, Regionalized Process-Based Models With Multiphysical Outputs Can Approach State-Of-The-Art Hydrologic Prediction Accuracy. Water Resour. Res. 2022, 58, e2022WR032404. [Google Scholar] [CrossRef]
  96. Kratzert, F.; Klotz, D.; Hochreiter, S.; Nearing, G.S. A Note on Leveraging Synergy in Multiple Meteorological Data Sets with Deep Learning for Rainfall-Runoff Modeling. Hydrol. Earth Syst. Sci. 2021, 25, 2685–2703. [Google Scholar] [CrossRef]
  97. Xie, K.; Liu, P.; Zhang, J.; Han, D.; Wang, G.; Shen, C. Physics-Guided Deep Learning for Rainfall-Runoff Modeling by Considering Extreme Events and Monotonic Relationships. J. Hydrol. 2021, 603, 127043. [Google Scholar] [CrossRef]
  98. Yin, H.; Guo, Z.; Zhang, X.; Chen, J.; Zhang, Y. RR-Former: Rainfall-Runoff Modeling Based on Transformer. J. Hydrol. 2022, 609, 127781. [Google Scholar] [CrossRef]
  99. Herath, H.M.V.V.; Chadalawada, J.; Babovic, V. Hydrologically Informed Machine Learning for Rainfall-Runoff Modelling: Towards Distributed Modelling. Hydrol. Earth Syst. Sci. 2021, 25, 4373–4401. [Google Scholar] [CrossRef]
  100. Yin, W.; Fan, Z.; Tangdamrongsub, N.; Hu, L.; Zhang, M. Comparison of Physical and Data-Driven Models to Forecast Groundwater Level Changes with the Inclusion of GRACE–A Case Study over the State of Victoria, Australia. J. Hydrol. 2021, 602, 126735. [Google Scholar] [CrossRef]
  101. Jin, J.; Zhang, Y.; Hao, Z.; Xia, R.; Yang, W.; Yin, H.; Zhang, X. Benchmarking Data-Driven Rainfall-Runoff Modeling across 54 Catchments in the Yellow River Basin: Overfitting, Calibration Length, Dry Frequency. J. Hydrol. Reg. Stud. 2022, 42, 101119. [Google Scholar] [CrossRef]
  102. Klotz, D.; Kratzert, F.; Gauch, M.; Keefe Sampson, A.; Brandstetter, J.; Klambauer, G.; Hochreiter, S.; Nearing, G. Uncertainty Estimation with Deep Learning for Rainfall-Runoff Modeling. Hydrol. Earth Syst. Sci. 2022, 26, 1673–1693. [Google Scholar] [CrossRef]
  103. Yin, H.; Zhang, X.; Wang, F.; Zhang, Y.; Xia, R.; Jin, J. Rainfall-Runoff Modeling Using LSTM-Based Multi-State-Vector Sequence-to-Sequence Model. J. Hydrol. 2021, 598, 126378. [Google Scholar] [CrossRef]
  104. Stein, L.; Clark, M.P.; Knoben, W.J.M.; Pianosi, F.; Woods, R.A. How Do Climate and Catchment Attributes Influence Flood Generating Processes? A Large-Sample Study for 671 Catchments Across the Contiguous USA. Water Resour. Res. 2021, 57, e2020WR028300. [Google Scholar] [CrossRef]
  105. Jarajapu, D.C.; Rathinasamy, M.; Agarwal, A.; Bronstert, A. Design Flood Estimation Using Extreme Gradient Boosting-Based on Bayesian Optimization. J. Hydrol. 2022, 613, 128341. [Google Scholar] [CrossRef]
  106. Liu, L.; Liu, X.; Bai, P.; Liang, K.; Liu, C. Comparison of Flood Simulation Capabilities of a Hydrologic Model and a Machine Learning Model. Int. J. Climatol. 2023, 43, 123–133. [Google Scholar] [CrossRef]
  107. Cai, H.; Shi, H.; Liu, S.; Babovic, V. Impacts of Regional Characteristics on Improving the Accuracy of Groundwater Level Prediction Using Machine Learning: The Case of Central Eastern Continental United States. J. Hydrol. Reg. Stud. 2021, 37. [Google Scholar] [CrossRef]
  108. Cai, H.; Liu, S.; Shi, H.; Zhou, Z.; Jiang, S.; Babovic, V. Toward Improved Lumped Groundwater Level Predictions at Catchment Scale: Mutual Integration of Water Balance Mechanism and Deep Learning Method. J. Hydrol. 2022, 613, 128495. [Google Scholar] [CrossRef]
  109. Ghosh, R.; Renganathan, A.; Tayal, K.; Li, X.; Khandelwal, A.; Jia, X.; Duffy, C.; Nieber, J.; Kumar, V. Robust Inverse Framework Using Knowledge-Guided Self-Supervised Learning: An Application to Hydrology. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14 August 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 465–474. [Google Scholar]
  110. Abbas, A.; Boithias, L.; Pachepsky, Y.; Kim, K.; Chun, J.A.; Cho, K.H. AI4Water v1.0: An Open-Source Python Package for Modeling Hydrological Time Series Using Data-Driven Methods. Geosci. Model Dev. 2022, 15, 3021–3039. [Google Scholar] [CrossRef]
  111. Feng, D.; Beck, H.; Lawson, K.; Shen, C. The Suitability of Differentiable, Physics-Informed Machine Learning Hydrologic Models for Ungauged Regions and Climate Change Impact Assessment. Hydrol. Earth Syst. Sci. 2023, 27, 2357–2373. [Google Scholar] [CrossRef]
  112. Frame, J.M.; Kratzert, F.; Gupta, H.V.; Ullrich, P.; Nearing, G.S. On Strictly Enforced Mass Conservation Constraints for Modelling the Rainfall-Runoff Process. Hydrol. Process. 2023, 37, e14847. [Google Scholar] [CrossRef]
  113. Tsai, W.P.; Feng, D.; Pan, M.; Beck, H.; Lawson, K.; Yang, Y.; Liu, J.; Shen, C. From Calibration to Parameter Learning: Harnessing the Scaling Effects of Big Data in Geoscientific Modeling. Nat. Commun. 2021, 12, 5988. [Google Scholar] [CrossRef] [PubMed]
  114. Papacharalampous, G.; Tyralis, H.; Langousis, A.; Jayawardena, A.W.; Sivakumar, B.; Mamassis, N.; Montanari, A.; Koutsoyiannis, D. Probabilistic Hydrological Post-Processing at Scale: Why and How to Apply Machine-Learning Quantile Regression Algorithms. Water 2019, 11, 2126. [Google Scholar] [CrossRef]
  115. Tyralis, H.; Papacharalampous, G.; Tantanee, S. How to Explain and Predict the Shape Parameter of the Generalized Extreme Value Distribution of Streamflow Extremes Using a Big Dataset. J. Hydrol. 2019, 574, 628–645. [Google Scholar] [CrossRef]
  116. Li, B.; Sun, T.; Tian, F.; Ni, G. Enhancing Process-Based Hydrological Models with Embedded Neural Networks: A Hybrid Approach. J. Hydrol. 2023, 625, 130107. [Google Scholar] [CrossRef]
  117. Han, S.; Slater, L.; Wilby, R.; Faulkner, D. Contribution of Urbanisation to Non-Stationary River Flow in the UK. J. Hydrol. 2022, 613, 128417. [Google Scholar] [CrossRef]
  118. Slater, L.J.; Arnal, L.; Boucher, M.A.; Chang, A.Y.Y.; Moulds, S.; Murphy, C.; Nearing, G.; Shalev, G.; Shen, C.; Speight, L.; et al. Hybrid Forecasting: Blending Climate Predictions with AI Models. Hydrol. Earth Syst. Sci. 2023, 27, 1865–1889. [Google Scholar] [CrossRef]
  119. Slater, L.; Coxon, G.; Brunner, M.; McMillan, H.; Yu, L.; Zheng, Y.; Khouakhi, A.; Moulds, S.; Berghuijs, W. Spatial Sensitivity of River Flooding to Changes in Climate and Land Cover Through Explainable AI. Earth's Future 2024, 12, e2023EF004035. [Google Scholar] [CrossRef]
  120. De la Fuente, L.A.; Gupta, H.V.; Condon, L.E. Toward a Multi-Representational Approach to Prediction and Understanding, in Support of Discovery in Hydrology. Water Resour. Res. 2023, 59, e2021WR031548. [Google Scholar] [CrossRef]
  121. Taheri, P.; Taheri, S.; Taheri, M.; Taheri, G. A Novel 24-Hour Deep Neural Network Based Streamflow Forecasting Method in Data-Scarce Regions. In Proceedings of the 2023 13th Smart Grid Conference (SGC), Tehran, Iran, 5–6 December 2023; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2023. [Google Scholar]
  122. Vega-Briones, J.; de Jong, S.; Galleguillos, M.; Wanders, N. Identifying Driving Processes of Drought Recovery in the Southern Andes Natural Catchments. J. Hydrol. Reg. Stud. 2023, 47, 101369. [Google Scholar] [CrossRef]
  123. Quiñones, M.P.; Zortea, M.; Martins, L.S.A. Fast-Slow Streamflow Model Using Mass-Conserving LSTM. arXiv 2021, arXiv:2107.06057. [Google Scholar]
  124. Kapoor, A.; Pathiraja, S.; Marshall, L.; Chandra, R. DeepGR4J: A Deep Learning Hybridization Approach for Conceptual Rainfall-Runoff Modelling. Environ. Model. Softw. 2023, 169, 105831. [Google Scholar] [CrossRef]
  125. Althoff, D.; Destouni, G. Global Patterns in Water Flux Partitioning: Irrigated and Rainfed Agriculture Drives Asymmetrical Flux to Vegetation over Runoff. One Earth 2023, 6, 1246–1257. [Google Scholar] [CrossRef]
  126. Yin, H.; Wang, F.; Zhang, X.; Zhang, Y.; Chen, J.; Xia, R.; Jin, J. Rainfall-Runoff Modeling Using Long Short-Term Memory Based Step-Sequence Framework. J. Hydrol. 2022, 610, 127901. [Google Scholar] [CrossRef]
  127. Koya, S.R.; Roy, T. Temporal Fusion Transformers for Streamflow Prediction: Value of Combining Attention with Recurrence. J. Hydrol. 2024, 637, 131301. [Google Scholar] [CrossRef]
  128. Bouri, I.; Lahariya, M.; Nivron, O.; Julia, E.P.; Backes, D.; Bilinski, P.; Schumann, G. ML Framework for Global River Flood Predictions Based on the Caravan Dataset. arXiv 2022, arXiv:2212.00719. [Google Scholar]
  129. Lima, M.; Deck, K.; Dunbar, O.R.A.; Schneider, T. Toward Routing River Water in Land Surface Models with Recurrent Neural Networks. arXiv 2024, arXiv:2404.14212. [Google Scholar]
  130. Yang, Y.; Chui, T.F.M. Profiling and Pairing Catchments and Hydrological Models With Latent Factor Model. Water Resour. Res. 2023, 59, e2022WR033684. [Google Scholar] [CrossRef]
  131. Renganathan, A.; Ghosh, R.; Khandelwal, A.; Kumar, V. Task Aware Modulation Using Representation Learning: An Approach for Few Shot Learning in Heterogeneous Systems. arXiv 2023, arXiv:2310.04727. [Google Scholar]
  132. Fischer, S.; Schumann, A.; Schumann, A.H. Dominant Flood Types in Europe and Their Role in Flood Statistics. Authorea 2024, Preprint. [Google Scholar] [CrossRef]
  133. Nearing, G.; Cohen, D.; Dube, V.; Gauch, M.; Gilon, O.; Harrigan, S.; Hassidim, A.; Klotz, D.; Kratzert, F.; Metzger, A.; et al. Global Prediction of Extreme Floods in Ungauged Watersheds. Nature 2024, 627, 559–563. [Google Scholar] [CrossRef] [PubMed]
  134. Murray, A.M.; Jørgensen, G.H.; Godiksen, P.N.; Anthonj, J.; Madsen, H. DHI-GHM: Real-Time and Forecasted Hydrology for the Entire Planet. J. Hydrol. 2023, 620, 129431. [Google Scholar] [CrossRef]
  135. Lin, Y.; Wang, D.; Jiang, T.; Kang, A. Assessing Objective Functions in Streamflow Prediction Model Training Based on the Naïve Method. Water 2024, 16, 777. [Google Scholar] [CrossRef]
  136. Constenla-Villoslada, S.; Liu, Y.; Wen, J.; Sun, Y.; Chonabayashi, S. Large-Scale Land Restoration Improved Drought Resilience in Ethiopia’s Degraded Watersheds. Nat. Sustain. 2022, 5, 488–497. [Google Scholar] [CrossRef]
  137. Zambrano, F.; Vrieling, A.; Nelson, A.; Meroni, M.; Tadesse, T. Prediction of Drought-Induced Reduction of Agricultural Productivity in Chile from MODIS, Rainfall Estimates, and Climate Oscillation Indices. Remote Sens. Environ. 2018, 219, 15–30. [Google Scholar] [CrossRef]
  138. Jalayer, S.; Sharifi, A.; Abbasi-Moghadam, D.; Tariq, A.; Qin, S. Assessment of Spatiotemporal Characteristic of Droughts Using In Situ and Remote Sensing-Based Drought Indices. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 1483–1502. [Google Scholar] [CrossRef]
  139. Sulugodu, B.; Deka, P.C. Evaluating the Performance of CHIRPS Satellite Rainfall Data for Streamflow Forecasting. Water Resour. Manag. 2019, 33, 3913–3927. [Google Scholar] [CrossRef]
  140. Riazi, M.; Khosravi, K.; Shahedi, K.; Ahmad, S.; Jun, C.; Bateni, S.M.; Kazakis, N. Enhancing Flood Susceptibility Modeling Using Multi-Temporal SAR Images, CHIRPS Data, and Hybrid Machine Learning Algorithms. Sci. Total Environ. 2023, 871, 162066. [Google Scholar] [CrossRef] [PubMed]
  141. Iamampai, S.; Talaluxmana, Y.; Kanasut, J.; Rangsiwanichpong, P. Enhancing Rainfall-Runoff Model Accuracy with Machine Learning Models by Using Soil Water Index to Reflect Runoff Characteristics. Water Sci. Technol. 2024, 89, 368–381. [Google Scholar] [CrossRef]
  142. Nakhaei, M.; Mohebbi Tafreshi, A.; Saadi, T. An Evaluation of Satellite Precipitation Downscaling Models Using Machine Learning Algorithms in Hashtgerd Plain, Iran. Model. Earth Syst. Environ. 2023, 9, 2829–2843. [Google Scholar] [CrossRef]
  143. Read, J.S.; Jia, X.; Willard, J.; Appling, A.P.; Zwart, J.A.; Oliver, S.K.; Karpatne, A.; Hansen, G.J.A.; Hanson, P.C.; Watkins, W.; et al. Process-Guided Deep Learning Predictions of Lake Water Temperature. Water Resour. Res. 2019, 55, 9173–9190. [Google Scholar] [CrossRef]
  144. Han, H.; Morrison, R.R. Data-Driven Approaches for Runoff Prediction Using Distributed Data. Stoch. Environ. Res. Risk Assess. 2021, 36, 2153–2171. [Google Scholar] [CrossRef]
  145. Alipour, A.; Ahmadalipour, A.; Abbaszadeh, P.; Moradkhani, H. Leveraging Machine Learning for Predicting Flash Flood Damage in the Southeast US. Environ. Res. Lett. 2020, 15, 024011. [Google Scholar] [CrossRef]
  146. Lee, J.; Park, S.; Im, J.; Yoo, C.; Seo, E. Improved Soil Moisture Estimation: Synergistic Use of Satellite Observations and Land Surface Models over CONUS Based on Machine Learning. J. Hydrol. 2022, 609, 127749. [Google Scholar] [CrossRef]
  147. Fang, B.; Lakshmi, V.; Bindlish, R.; Jackson, T.J. AMSR2 Soil Moisture Downscaling Using Temperature and Vegetation Data. Remote Sens. 2018, 10, 1575. [Google Scholar] [CrossRef]
  148. Wang, F.; Chen, Y.; Li, Z.; Fang, G.; Li, Y.; Wang, X.; Zhang, X.; Kayumba, P.M. Developing a Long Short-Term Memory (LSTM)-Based Model for Reconstructing Terrestrial Water Storage Variations from 1982 to 2016 in the Tarim River Basin, Northwest China. Remote Sens. 2021, 13, 889. [Google Scholar] [CrossRef]
  149. Chen, Z.; Zeng, Y.; Shen, G.; Xiao, C.; Xu, L.; Chen, N. Spatiotemporal Characteristics and Estimates of Extreme Precipitation in the Yangtze River Basin Using GLDAS Data. Int. J. Climatol. 2021, 41, E1812–E1830. [Google Scholar] [CrossRef]
  150. Greifeneder, F.; Notarnicola, C.; Wagner, W. A Machine Learning-Based Approach for Surface Soil Moisture Estimations with Google Earth Engine. Remote Sens. 2021, 13, 2099. [Google Scholar] [CrossRef]
  151. Li, C.; Yang, H.; Yang, W.; Liu, Z.; Jia, Y.; Li, S.; Yang, D. Error Characterization of Global Land Evapotranspiration Products: Collocation-Based Approach. J. Hydrol. 2022, 612, 128102. [Google Scholar] [CrossRef]
  152. Zhang, G.; Zheng, W.; Yin, W.; Lei, W. Improving the Resolution and Accuracy of Groundwater Level Anomalies Using the Machine Learning-Based Fusion Model in the North China Plain. Sensors 2020, 21, 46. [Google Scholar] [CrossRef]
  153. Seyoum, W.M.; Kwon, D.; Milewski, A.M. Downscaling GRACE TWSA Data into High-Resolution Groundwater Level Anomaly Using Machine Learning-Based Models in a Glacial Aquifer System. Remote Sens. 2019, 11, 824. [Google Scholar] [CrossRef]
  154. Agarwal, V.; Akyilmaz, O.; Shum, C.K.; Feng, W.; Yang, T.Y.; Forootan, E.; Syed, T.H.; Haritashya, U.K.; Uz, M. Machine Learning Based Downscaling of GRACE-Estimated Groundwater in Central Valley, California. Sci. Total Environ. 2023, 865, 161138. [Google Scholar] [CrossRef] [PubMed]
  155. Malakar, P.; Mukherjee, A.; Bhanja, S.N.; Ray, R.K.; Sarkar, S.; Zahid, A. Machine-Learning-Based Regional-Scale Groundwater Level Prediction Using GRACE. Hydrogeol. J. 2021, 29, 1027–1042. [Google Scholar] [CrossRef]
  156. Ali, S.; Liu, D.; Fu, Q.; Cheema, M.J.M.; Pal, S.C.; Arshad, A.; Pham, Q.B.; Zhang, L. Constructing High-Resolution Groundwater Drought at Spatio-Temporal Scale Using GRACE Satellite Data Based on Machine Learning in the Indus Basin. J. Hydrol. 2022, 612, 128295. [Google Scholar] [CrossRef]
  157. Liu, D.; Mishra, A.K.; Yu, Z.; Lü, H.; Li, Y. Support Vector Machine and Data Assimilation Framework for Groundwater Level Forecasting Using GRACE Satellite Data. J. Hydrol. 2021, 603, 126929. [Google Scholar] [CrossRef]
  158. Sun, A.Y.; Scanlon, B.R.; Save, H.; Rateb, A. Reconstruction of GRACE Total Water Storage Through Automated Machine Learning. Water Resour. Res. 2021, 57, e2020WR028666. [Google Scholar] [CrossRef]
  159. Yin, W.; Zhang, G.; Liu, F.; Zhang, D.; Zhang, X.; Chen, S. Improving the Spatial Resolution of GRACE-Based Groundwater Storage Estimates Using a Machine Learning Algorithm and Hydrological Model. Hydrogeol. J. 2022, 30, 947–963. [Google Scholar] [CrossRef]
  160. Senay, G.B.; Velpuri, N.M.; Bohms, S.; Demissie, Y.; Gebremichael, M. Understanding the Hydrologic Sources and Sinks in the Nile Basin Using Multisource Climate and Remote Sensing Data Sets. Water Resour. Res. 2014, 50, 8625–8650. [Google Scholar] [CrossRef]
  161. Wang, J.; Gao, H.; Liu, M.; Ding, Y.; Wang, Y.; Zhao, F.; Xia, J. Parameter Regionalization of the FLEX-Global Hydrological Model. Sci. China Earth Sci. 2021, 64, 571–588. [Google Scholar] [CrossRef]
  162. Ngoma, H.; Wen, W.; Ayugi, B.; Babaousmail, H.; Karim, R.; Ongoma, V. Evaluation of Precipitation Simulations in CMIP6 Models over Uganda. Int. J. Climatol. 2021, 41, 4743–4768. [Google Scholar] [CrossRef]
  163. Zhang, Y.; Ye, A.; Analui, B.; Nguyen, P.; Sorooshian, S.; Hsu, K.; Wang, Y. Comparing Quantile Regression Forest and Mixture Density Long Short-Term Memory Models for Probabilistic Post-Processing of Satellite Precipitation-Driven Streamflow Simulations. Hydrol. Earth Syst. Sci. 2023, 27, 4529–4550. [Google Scholar] [CrossRef]
  164. Neeti, N.; Arun Murali, C.M.; Chowdary, V.M.; Rao, N.H.; Kesarwani, M. Integrated Meteorological Drought Monitoring Framework Using Multi-Sensor and Multi-Temporal Earth Observation Datasets and Machine Learning Algorithms: A Case Study of Central India. J. Hydrol. 2021, 601, 126638. [Google Scholar] [CrossRef]
  165. Kolluru, V.; Kolluru, S.; Wagle, N.; Dev, T. Secondary Precipitation Estimate Merging Using Machine Learning: Development and Evaluation over Krishna River Basin, India. Remote Sens. 2020, 12, 3013. [Google Scholar] [CrossRef]
  166. Alquraish, M.M.; Khadr, M. Remote-Sensing-Based Streamflow Forecasting Using Artificial Neural Network and Support Vector Machine Models. Remote Sens. 2021, 13, 4147. [Google Scholar] [CrossRef]
  167. Bair, E.H.; Calfa, A.A.; Rittger, K.; Dozier, J. Using Machine Learning for Real-Time Estimates of Snow Water Equivalent in the Watersheds of Afghanistan. Cryosphere 2018, 12, 1579–1594. [Google Scholar] [CrossRef]
  168. Rasiya Koya, S.; Kar, K.K.; Srivastava, S.; Tadesse, T.; Svoboda, M.; Roy, T. An Autoencoder-Based Snow Drought Index. Sci. Rep. 2023, 13, 20664. [Google Scholar] [CrossRef]
  169. Gavahi, K.; Abbaszadeh, P.; Moradkhani, H. How Does Precipitation Data Influence the Land Surface Data Assimilation for Drought Monitoring? Sci. Total Environ. 2022, 831, 154916. [Google Scholar] [CrossRef] [PubMed]
  170. Lee, W.J.; Lee, E.H. Runoff Prediction Based on the Discharge of Pump Stations in an Urban Stream Using a Modified Multi-Layer Perceptron Combined with Meta-Heuristic Optimization. Water 2022, 14, 99. [Google Scholar] [CrossRef]
  171. Xu, T.; Guo, Z.; Xia, Y.; Ferreira, V.G.; Liu, S.; Wang, K.; Yao, Y.; Zhang, X.; Zhao, C. Evaluation of Twelve Evapotranspiration Products from Machine Learning, Remote Sensing and Land Surface Models over Conterminous United States. J. Hydrol. 2019, 578, 124105. [Google Scholar] [CrossRef]
  172. Kim, H.; Crow, W.T.; Wagner, W.; Li, X.; Lakshmi, V. A Bayesian Machine Learning Method to Explain the Error Characteristics of Global-Scale Soil Moisture Products. Remote Sens. Environ. 2023, 296, 113718. [Google Scholar] [CrossRef]
  173. Evans, S.; Williams, G.P.; Jones, N.L.; Ames, D.P.; Nelson, E.J. Exploiting Earth Observation Data to Impute Groundwater Level Measurements with an Extreme Learning Machine. Remote Sens. 2020, 12, 2044. [Google Scholar] [CrossRef]
  174. Elbeltagi, A.; Kumari, N.; Dharpure, J.K.; Mokhtar, A.; Alsafadi, K.; Kumar, M.; Mehdinejadiani, B.; Ramezani Etedali, H.; Brouziyne, Y.; Towfiqul Islam, A.R.M.; et al. Prediction of Combined Terrestrial Evapotranspiration Index (CTEI) over Large River Basin Based on Machine Learning Approaches. Water 2021, 13, 547. [Google Scholar] [CrossRef]
  175. Zhang, J.; Liu, K.; Wang, M. Downscaling Groundwater Storage Data in China to a 1-Km Resolution Using Machine Learning Methods. Remote Sens. 2021, 13, 523. [Google Scholar] [CrossRef]
  176. Rahaman, M.M.; Thakur, B.; Kalra, A.; Li, R.; Maheshwari, P. Estimating High-Resolution Groundwater Storage from GRACE: A Random Forest Approach. Environments 2019, 6, 63. [Google Scholar] [CrossRef]
  177. Khorrami, B.; Ali, S.; Gündüz, O. Investigating the Local-Scale Fluctuations of Groundwater Storage by Using Downscaled GRACE/GRACE-FO JPL Mascon Product Based on Machine Learning (ML) Algorithm. Water Resour. Manag. 2023, 37, 3439–3456. [Google Scholar] [CrossRef]
  178. Sahour, H.; Sultan, M.; Vazifedan, M.; Abdelmohsen, K.; Karki, S.; Yellich, J.A.; Gebremichael, E.; Alshehri, F.; Elbayoumi, T.M. Statistical Applications to Downscale GRACE-Derived Terrestrial Water Storage Data and to Fill Temporal Gaps. Remote Sens. 2020, 12, 533. [Google Scholar] [CrossRef]
  179. Satizábal-Alarcón, D.A.; Suhogusoff, A.; Ferrari, L.C. Characterization of Groundwater Storage Changes in the Amazon River Basin Based on Downscaling of GRACE/GRACE-FO Data with Machine Learning Models. Sci. Total Environ. 2024, 912, 168958. [Google Scholar] [CrossRef]
  180. Luo, L.; Robock, A.; Mitchell, K.E.; Houser, P.R.; Wood, E.F.; Schaake, J.C.; Lohmann, D.; Cosgrove, B.A.; Wen, F.; Sheffield, J.; et al. Validation of the North American Land Data Assimilation System (NLDAS) Retrospective Forcing over the Southern Great Plains. J. Geophys. Res. Atmos. 2003, 108, 8843. [Google Scholar] [CrossRef]
  181. López-Bermeo, C.; Montoya, R.D.; Caro-Lopera, F.J.; Díaz-García, J.A. Validation of the Accuracy of the CHIRPS Precipitation Dataset at Representing Climate Variability in a Tropical Mountainous Region of South America. Phys. Chem. Earth Parts A/B/C 2022, 127, 103184. [Google Scholar] [CrossRef]
  182. Venema, V.K.C.; Mestre, O.; Aguilar, E.; Auer, I.; Guijarro, J.A.; Domonkos, P.; Vertacnik, G.; Szentimrey, T.; Stepanek, P.; Zahradnicek, P.; et al. Benchmarking Homogenization Algorithms for Monthly Data. Clim. Past 2012, 8, 89–115. [Google Scholar] [CrossRef]
  183. Zhao, Q.; Zhu, Y.; Wan, D.; Yu, Y.; Cheng, X. Research on the Data-Driven Quality Control Method of Hydrological Time Series Data. Water 2018, 10, 1712. [Google Scholar] [CrossRef]
  184. Costa, A.C.; Soares, A. Homogenization of Climate Data: Review and New Perspectives Using Geostatistics. Math. Geosci. 2009, 41, 291–305. [Google Scholar] [CrossRef]
  185. Gao, Y.; Merz, C.; Lischeid, G.; Schneider, M. A Review on Missing Hydrological Data Processing. Environ. Earth Sci. 2018, 77, 1–12. [Google Scholar] [CrossRef]
  186. Hamzah, F.B.; Hamzah, F.M.; Razali, S.F.M.; Samad, H. A Comparison of Multiple Imputation Methods for Recovering Missing Data in Hydrological Studies. Civ. Eng. J. 2021, 7, 1608–1619. [Google Scholar] [CrossRef]
  187. Wu, W.; Li, Y.; Luo, X.; Zhang, Y.; Ji, X.; Li, X. Performance Evaluation of the CHIRPS Precipitation Dataset and Its Utility in Drought Monitoring over Yunnan Province, China. Geomat. Nat. Hazards Risk 2019, 10, 2145–2162. [Google Scholar] [CrossRef]
  188. Le, X.H.; Lee, G.; Jung, K.; An, H.U.; Lee, S.; Jung, Y. Application of Convolutional Neural Network for Spatiotemporal Bias Correction of Daily Satellite-Based Precipitation. Remote Sens. 2020, 12, 2731. [Google Scholar] [CrossRef]
  189. Katiraie-Boroujerdy, P.S.; Naeini, M.R.; Asanjan, A.A.; Chavoshian, A.; Hsu, K.L.; Sorooshian, S. Bias Correction of Satellite-Based Precipitation Estimations Using Quantile Mapping Approach in Different Climate Regions of Iran. Remote Sens. 2020, 12, 2102. [Google Scholar] [CrossRef]
  190. Goshime, D.W.; Absi, R.; Haile, A.T.; Ledésert, B.; Rientjes, T. Bias-Corrected CHIRP Satellite Rainfall for Water Level Simulation, Lake Ziway, Ethiopia. J. Hydrol. Eng. 2020, 25, 05020024. [Google Scholar] [CrossRef]
  191. Goshime, D.W.; Absi, R.; Ledésert, B. Evaluation and Bias Correction of CHIRP Rainfall Estimate for Rainfall-Runoff Simulation over Lake Ziway Watershed, Ethiopia. Hydrology 2019, 6, 68. [Google Scholar] [CrossRef]
  192. Wang, W.; Cui, W.; Wang, X.; Chen, X. Evaluation of GLDAS-1 and GLDAS-2 Forcing Data and Noah Model Simulations over China at the Monthly Scale. J. Hydrometeorol. 2016, 17, 2815–2833. [Google Scholar] [CrossRef]
  193. Mulungu, D.M.M.; Mukama, E. Evaluation and Modelling of Accuracy of Satellite-Based CHIRPS Rainfall Data in Ruvu Subbasin, Tanzania. Model. Earth Syst. Environ. 2023, 9, 1287–1300. [Google Scholar] [CrossRef]
  194. Najmi, A.; Igmoullan, B.; Namous, M.; El Bouazzaoui, I.; Brahim, Y.A.; El Khalki, E.M.; Saidi, M.E.M. Evaluation of PERSIANN-CCS-CDR, ERA5, and SM2RAIN-ASCAT Rainfall Products for Rainfall and Drought Assessment in a Semi-Arid Watershed, Morocco. J. Water Clim. Change 2023, 14, 1569–1584. [Google Scholar] [CrossRef]
  195. Zhang, B.; Xia, Y.; Long, B.; Hobbins, M.; Zhao, X.; Hain, C.; Li, Y.; Anderson, M.C. Evaluation and Comparison of Multiple Evapotranspiration Data Models over the Contiguous United States: Implications for the Next Phase of NLDAS (NLDAS-Testbed) Development. Agric. For. Meteorol. 2020, 280, 107810. [Google Scholar] [CrossRef]
  196. Du, H.; Tan, M.L.; Zhang, F.; Chun, K.P.; Li, L.; Kabir, M.H. Evaluating the Effectiveness of CHIRPS Data for Hydroclimatic Studies. Theor. Appl. Climatol. 2024, 155, 1519–1539. [Google Scholar] [CrossRef]
  197. Yang, N.; Yu, H.; Lu, Y.; Zhang, Y.; Zheng, Y.; Walter, R.C.; Bechtel, T.D. Evaluating the Applicability of PERSIANN-CDR Products in Drought Monitoring: A Case Study of Long-Term Droughts over Huaihe River Basin, China. Remote Sens. 2022, 14, 4460. [Google Scholar] [CrossRef]
  198. Ekström, M.; Grose, M.R.; Whetton, P.H. An Appraisal of Downscaling Methods Used in Climate Change Research. Wiley Interdiscip. Rev. Clim. Change 2015, 6, 301–319. [Google Scholar] [CrossRef]
  199. Schoof, J.T. Statistical Downscaling in Climatology. Geogr. Compass 2013, 7, 249–265. [Google Scholar] [CrossRef]
  200. Chen, J.; Brissette, F.P.; Leconte, R. Uncertainty of Downscaling Method in Quantifying the Impact of Climate Change on Hydrology. J. Hydrol. 2011, 401, 190–202. [Google Scholar] [CrossRef]
  201. Ferraro, R.; Waliser, D.; Peters-Lidard, C. NASA Downscaling Project: Final Report; JPL Open Repository; Jet Propulsion Laboratory: Pasadena, CA, USA, 1 February 2017. [Google Scholar]
  202. Addor, N.; Do, H.X.; Alvarez-Garreton, C.; Coxon, G.; Fowler, K.; Mendoza, P.A. Large-Sample Hydrology: Recent Progress, Guidelines for New Datasets and Grand Challenges. Hydrol. Sci. J. 2020, 65, 712–725. [Google Scholar] [CrossRef]
  203. Rodell, M.; Houser, P.R.; Jambor, U.; Gottschalck, J.; Mitchell, K.; Meng, C.J.; Arsenault, K.; Cosgrove, B.; Radakovich, J.; Bosilovich, M.; et al. The Global Land Data Assimilation System. Bull. Am. Meteorol. Soc. 2004, 85, 381–394. [Google Scholar] [CrossRef]
  204. Landerer, F.W.; Swenson, S.C. Accuracy of Scaled GRACE Terrestrial Water Storage Estimates. Water Resour. Res. 2012, 48, W04531. [Google Scholar] [CrossRef]
  205. Bai, L.; Shi, C.; Li, L.; Yang, Y.; Wu, J. Accuracy of CHIRPS Satellite-Rainfall Products over Mainland China. Remote Sens. 2018, 10, 362. [Google Scholar] [CrossRef]
  206. Miao, C.; Ashouri, H.; Hsu, K.L.; Sorooshian, S.; Duan, Q. Evaluation of the PERSIANN-CDR Daily Rainfall Estimates in Capturing the Behavior of Extreme Precipitation Events over China. J. Hydrometeorol. 2015, 16, 1387–1396. [Google Scholar] [CrossRef]
  207. Wang, K.; Zhang, T.; Clow, G.D. Permafrost Thermal Responses to Asymmetrical Climate Changes: An Integrated Perspective. Geophys. Res. Lett. 2023, 50, e2022GL100327. [Google Scholar] [CrossRef]
  208. Peng, X.; Zhang, T.; Frauenfeld, O.W.; Mu, C.; Wang, K.; Wu, X.; Guo, D.; Luo, J.; Hjort, J.; Aalto, J.; et al. Active Layer Thickness and Permafrost Area Projections for the 21st Century. Earth’s Future 2023, 11, e2023EF003573. [Google Scholar] [CrossRef]
  209. O’Driscoll, M.; Clinton, S.; Jefferson, A.; Manda, A.; McMillan, S. Urbanization Effects on Watershed Hydrology and In-Stream Processes in the Southern United States. Water 2010, 2, 605–648. [Google Scholar] [CrossRef]
  210. Fanelli, R.; Prestegaard, K.; Palmer, M. Evaluation of Infiltration-Based Stormwater Management to Restore Hydrological Processes in Urban Headwater Streams. Hydrol. Process. 2017, 31, 3306–3319. [Google Scholar] [CrossRef]
  211. Oswald, C.J.; Kelleher, C.; Ledford, S.H.; Hopkins, K.G.; Sytsma, A.; Tetzlaff, D.; Toran, L.; Voter, C. Integrating Urban Water Fluxes and Moving beyond Impervious Surface Cover: A Review. J. Hydrol. 2023, 618, 129188. [Google Scholar] [CrossRef]
  212. Socioeconomic Data and Applications Center|SEDAC. Available online: https://sedac.ciesin.columbia.edu/ (accessed on 25 May 2024).
  213. Global Water Research Coalition (GWRC). Available online: https://globalwaterresearchcoalition.net/ (accessed on 25 May 2024).
  214. AQUASTAT-FAO’s Global Information System on Water and Agriculture. Available online: https://www.fao.org/aquastat/en/databases/ (accessed on 25 May 2024).
  215. Moges, E.; Demissie, Y.; Larsen, L.; Yassin, F. Review: Sources of Hydrological Model Uncertainties and Advances in Their Analysis. Water 2021, 13, 28. [Google Scholar] [CrossRef]
  216. Renard, B.; Kavetski, D.; Kuczera, G.; Thyer, M.; Franks, S.W. Understanding Predictive Uncertainty in Hydrologic Modeling: The Challenge of Identifying Input and Structural Errors. Water Resour. Res. 2010, 46, W05521. [Google Scholar] [CrossRef]
  217. Abdar, M.; Pourpanah, F.; Hussain, S.; Rezazadegan, D.; Liu, L.; Ghavamzadeh, M.; Fieguth, P.; Cao, X.; Khosravi, A.; Acharya, U.R.; et al. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges. Inf. Fusion 2021, 76, 243–297. [Google Scholar] [CrossRef]
  218. Nemani, V.; Biggio, L.; Huan, X.; Hu, Z.; Fink, O.; Tran, A.; Wang, Y.; Zhang, X.; Hu, C. Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial. Mech. Syst. Signal Process. 2023, 205, 110796. [Google Scholar] [CrossRef]
  219. Dolezal, J.M.; Srisuwananukorn, A.; Karpeyev, D.; Ramesh, S.; Kochanny, S.; Cody, B.; Mansfield, A.S.; Rakshit, S.; Bansal, R.; Bois, M.C.; et al. Uncertainty-Informed Deep Learning Models Enable High-Confidence Predictions for Digital Histopathology. Nat. Commun. 2022, 13, 6572. [Google Scholar] [CrossRef] [PubMed]
  220. Kimani, M.W.; Hoedjes, J.C.B.; Su, Z. Bayesian Bias Correction of Satellite Rainfall Estimates for Climate Studies. Remote Sens. 2018, 10, 1074. [Google Scholar] [CrossRef]
  221. Abbasi, M.; Farokhnia, A.; Bahreinimotlagh, M.; Roozbahani, R. A Hybrid of Random Forest and Deep Auto-Encoder with Support Vector Regression Methods for Accuracy Improvement and Uncertainty Reduction of Long-Term Streamflow Prediction. J. Hydrol. 2021, 597, 125717. [Google Scholar] [CrossRef]
  222. Xie, X.; Xie, B.; Cheng, J.; Chu, Q.; Dooling, T. A Simple Monte Carlo Method for Estimating the Chance of a Cyclone Impact. Nat. Hazards 2021, 107, 2573–2582. [Google Scholar] [CrossRef]
  223. Hong, Y.; Hsu, K.L.; Moradkhani, H.; Sorooshian, S. Uncertainty Quantification of Satellite Precipitation Estimation and Monte Carlo Assessment of the Error Propagation into Hydrologic Response. Water Resour. Res. 2006, 42, W08421. [Google Scholar] [CrossRef]
  224. Greatrex, H.; Grimes, D.; Wheeler, T. Advances in the Stochastic Modeling of Satellite-Derived Rainfall Estimates Using a Sparse Calibration Dataset. J. Hydrometeorol. 2014, 15, 1810–1831. [Google Scholar] [CrossRef]
  225. Gan, Y.; Duan, Q.; Gong, W.; Tong, C.; Sun, Y.; Chu, W.; Ye, A.; Miao, C.; Di, Z. A Comprehensive Evaluation of Various Sensitivity Analysis Methods: A Case Study with a Hydrological Model. Environ. Model. Softw. 2014, 51, 269–285. [Google Scholar] [CrossRef]
  226. Song, X.; Zhang, J.; Zhan, C.; Xuan, Y.; Ye, M.; Xu, C. Global Sensitivity Analysis in Hydrological Modeling: Review of Concepts, Methods, Theoretical Framework, and Applications. J. Hydrol. 2015, 523, 739–757. [Google Scholar] [CrossRef]
  227. Mirzaei, M.; Huang, Y.F.; El-Shafie, A.; Shatirah, A. Application of the Generalized Likelihood Uncertainty Estimation (GLUE) Approach for Assessing Uncertainty in Hydrological Models: A Review. Stoch. Environ. Res. Risk Assess. 2015, 29, 1265–1273. [Google Scholar] [CrossRef]
  228. Galavi, H.; Mirzaei, M.; Yu, B.; Lee, J. Bootstrapped Ensemble and Reliability Ensemble Averaging Approaches for Integrated Uncertainty Analysis of Streamflow Projections. Stoch. Environ. Res. Risk Assess. 2023, 37, 1213–1227. [Google Scholar] [CrossRef]
  229. Duan, Q.; Ajami, N.K.; Gao, X.; Sorooshian, S. Multi-Model Ensemble Hydrologic Prediction Using Bayesian Model Averaging. Adv. Water Resour. 2007, 30, 1371–1386. [Google Scholar] [CrossRef]
  230. Ehsani, M.R.; Behrangi, A. A Comparison of Correction Factors for the Systematic Gauge-Measurement Errors to Improve the Global Land Precipitation Estimate. J. Hydrol. 2022, 610, 127884. [Google Scholar] [CrossRef]
  231. Horner, I.; Renard, B.; Le Coz, J.; Branger, F.; McMillan, H.K.; Pierrefeu, G. Impact of Stage Measurement Errors on Streamflow Uncertainty. Water Resour. Res. 2018, 54, 1952–1976. [Google Scholar] [CrossRef]
  232. Mizukami, N.; Smith, M.B. Analysis of Inconsistencies in Multi-Year Gridded Quantitative Precipitation Estimate over Complex Terrain and Its Impact on Hydrologic Modeling. J. Hydrol. 2012, 428–429, 129–141. [Google Scholar] [CrossRef]
  233. Van de Schoot, R.; Kaplan, D.; Denissen, J.; Asendorpf, J.B.; Neyer, F.J.; van Aken, M.A.G. A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research. Child Dev. 2014, 85, 842. [Google Scholar] [CrossRef] [PubMed]
  234. Kamyab, H.; Khademi, T.; Chelliapan, S.; SaberiKamarposhti, M.; Rezania, S.; Yusuf, M.; Farajnezhad, M.; Abbas, M.; Hun Jeon, B.; Ahn, Y. The Latest Innovative Avenues for the Utilization of Artificial Intelligence and Big Data Analytics in Water Resource Management. Results Eng. 2023, 20, 101566. [Google Scholar] [CrossRef]
  235. Ming, X.; Liang, Q.; Xia, X.; Li, D.; Fowler, H.J. Real-Time Flood Forecasting Based on a High-Performance 2-D Hydrodynamic Model and Numerical Weather Predictions. Water Resour. Res. 2020, 56, e2019WR025583. [Google Scholar] [CrossRef]
  236. Marz, N.; Warren, J. Big Data: Principles and Best Practices of Scalable Realtime Data Systems; Simon and Schuster: New York, NY, USA, 2015. [Google Scholar]
  237. Fersch, B.; Francke, T.; Heistermann, M.; Schrön, M.; Döpper, V.; Jakobi, J.; Baroni, G.; Blume, T.; Bogena, H.; Budach, C.; et al. A Dense Network of Cosmic-Ray Neutron Sensors for Soil Moisture Observation in a Highly Instrumented Pre-Alpine Headwater Catchment in Germany. Earth Syst. Sci. Data 2020, 12, 2289–2309. [Google Scholar] [CrossRef]
  238. Khan, Z.; Anjum, A.; Kiani, S.L. Cloud Based Big Data Analytics for Smart Future Cities. In Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing, Dresden, Germany, 9–12 December 2013; pp. 381–386. [Google Scholar] [CrossRef]
  239. Khan, S.; Shakil, K.A.; Alam, M. Big Data Computing Using Cloud-Based Technologies: Challenges and Future Perspectives. In Networks of the Future; CRC: Boca Raton, FL, USA, 2017; pp. 393–414. [Google Scholar] [CrossRef]
  240. Krishnamurthy, S.; Franklin, M.J.; Davis, J.; Farina, D.; Golovko, P.; Li, A.; Thombre, N. Continuous Analytics over Discontinuous Streams. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Indianapolis, IN, USA, 6–10 June 2010; pp. 1081–1091. [Google Scholar] [CrossRef]
  241. Kolajo, T.; Daramola, O.; Adebiyi, A. Big Data Stream Analysis: A Systematic Literature Review. J. Big Data 2019, 6, 47. [Google Scholar] [CrossRef]
  242. Sauermann, H.; Vohland, K.; Antoniou, V.; Balázs, B.; Göbel, C.; Karatzas, K.; Mooney, P.; Perelló, J.; Ponti, M.; Samson, R.; et al. Citizen Science and Sustainability Transitions. Res. Policy 2020, 49, 103978. [Google Scholar] [CrossRef]
  243. Buytaert, W.; Zulkafli, Z.; Grainger, S.; Acosta, L.; Alemie, T.C.; Bastiaensen, J.; De Bièvre, B.; Bhusal, J.; Clark, J.; Dewulf, A.; et al. Citizen Science in Hydrology and Water Resources: Opportunities for Knowledge Generation, Ecosystem Service Management, and Sustainable Development. Front. Earth Sci. 2014, 2, 104024. [Google Scholar] [CrossRef]
  244. Njue, N.; Stenfert Kroese, J.; Gräf, J.; Jacobs, S.R.; Weeser, B.; Breuer, L.; Rufino, M.C. Citizen Science in Hydrological Monitoring and Ecosystem Services Management: State of the Art and Future Prospects. Sci. Total Environ. 2019, 693, 133531. [Google Scholar] [CrossRef]
  245. Tran, H.N.; Rutten, M.; Prajapati, R.; Tran, H.T.; Duwal, S.; Nguyen, D.T.; Davids, J.C.; Miegel, K. Citizen Scientists’ Engagement in Flood Risk-Related Data Collection: A Case Study in Bui River Basin, Vietnam. Environ. Monit. Assess. 2024, 196, 280. [Google Scholar] [CrossRef]
  246. Paul, J.D.; Buytaert, W.; Allen, S.; Ballesteros-Canovas, J.A.; Bhusal, J.; Cieslik, K.; Clark, J.; Dugar, S.; Hannah, D.M.; Stoffel, M.; et al. Citizen Science for Hydrological Risk Reduction and Resilience Building. Wiley Interdiscip. Rev. Water 2018, 5, e1262. [Google Scholar] [CrossRef]
  247. Walker, D.W.; Smigaj, M.; Tani, M. The Benefits and Negative Impacts of Citizen Science Applications to Water as Experienced by Participants and Communities. Wiley Interdiscip. Rev. Water 2021, 8, e1488. [Google Scholar] [CrossRef]
  248. Salamone, F.; Masullo, M.; Sibilio, S. Wearable Devices for Environmental Monitoring in the Built Environment: A Systematic Review. Sensors 2021, 21, 4727. [Google Scholar] [CrossRef] [PubMed]
  249. Tavra, M.; Racetin, I.; Peroš, J. The Role of Crowdsourcing and Social Media in Crisis Mapping: A Case Study of a Wildfire Reaching Croatian City of Split. Geoenvironmental Disasters 2021, 8, 10. [Google Scholar] [CrossRef]
  250. Khan, Q.; Kalbus, E.; Zaki, N.; Mohamed, M.M. Utilization of Social Media in Floods Assessment Using Data Mining Techniques. PLoS ONE 2022, 17, e0267079. [Google Scholar] [CrossRef]
  251. Perumal, T.; Sulaiman, M.N.; Leong, C.Y. Internet of Things (IoT) Enabled Water Monitoring System. In Proceedings of the 2015 IEEE 4th Global Conference on Consumer Electronics, GCCE 2015, Osaka, Japan, 27–30 October 2015; pp. 86–87. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
