Geospatial Artificial Intelligence (GeoAI) in the Integrated Hydrological and Fluvial Systems Modeling: Review of Current Applications and Trends

Abstract: This paper reviews the current GeoAI and machine learning applications in hydrological and hydraulic modeling, hydrological optimization problems, water quality modeling, and fluvial geomorphic and morphodynamic mapping. GeoAI effectively harnesses the vast amount of spatial and non-spatial data collected with the new automatic technologies. The fast development of GeoAI provides multiple methods and techniques, although it also makes comparisons between different methods challenging. Overall, selecting a particular GeoAI method depends on the application’s objective, data availability, and user expertise. GeoAI has shown advantages in non-linear modeling, computational efficiency, integration of multiple data sources, highly accurate prediction capability


Introduction
Hydrology and fluvial research are inexact fields of science, with a large extent of epistemic uncertainty and limited knowledge about the system's complexity, structure, and functioning [1]. Both disciplines have been hindered by the limited quality and availability of data [2]. Nowadays, access to temporally and spatially high-resolution hydrological and fluvial data has substantially increased, mainly due to advances in the use of automatic sensors in monitoring, environmental 3D scanners, and high-resolution remote sensing from different sources, producing 'big data'. The use of big hydrological data requires developing and applying new geospatial tools for computational analytics and hydrological modeling. Technologies under the geospatial artificial intelligence (GeoAI) concept, such as machine learning (ML) and parallel computing, provide the means to utilize these spatial and non-spatial datasets effectively and also to enhance integrated hydrological and fluvial systems modeling [3].
Hydrological and fluvial modeling took a giant leap forward when the computer revolution started in the 1960s [4][5][6]. Since then, engineers and scientists have developed a wide range of hydrological models with different levels of complexity, including empirical and physical-based models [7][8][9][10]. Similarly, several 1D, 2D, and 3D numerical hydraulic models are available to model fluvial processes, such as flow characteristics, sediment transport, flood extent, and water depth [5,11]. Currently, despite the availability of a wide range of physical-based models, those models still have challenges in adequately and accurately modeling the complex and non-linear hydrological and hydraulic processes occurring in nature [5]. In addition, the application of these models has been restricted to small areas due to limited data availability, challenges in representing spatially varying parameters, and computational intensity [12]. Alternatively, GeoAI and ML data-driven models, such as artificial neural networks (ANNs) and long short-term memory (LSTM) deep learning, show promising results for hydrological and hydraulic prediction and forecasting in natural environments and at a large geographical scale [13]. They can represent the non-linear processes and provide high-accuracy predictions [14,15]. ML application in hydrological predictions dates to the 1990s [16], but the development of new GeoAI and ML algorithms, particularly the deep learning techniques, alongside new data collection technologies, has substantially increased in recent years [17,18]. Moreover, there are new studies on developing hybrid models (ML and physical-based models) [14,19,20] and physical process-guided ML methods [21][22][23][24]. Therefore, a review of the potential of the new GeoAI and ML methods for integrated hydrological and fluvial systems modeling is needed to guide scientists and practitioners in selecting the proper tools and to make them aware of current and potential future methodologies.
Existing reviews of GeoAI and ML applications in hydrological modeling and fluvial studies have covered specific topics, such as the prediction of runoff, floods, and water quality [25][26][27][28][29]. Other reviews have focused on applying a particular GeoAI and ML method [30][31][32][33]. However, an overarching review of GeoAI and ML in hydrology is lacking. We aim to review the most recent GeoAI and ML method applications in hydrological, hydraulic, water quality, and fluvial process modeling.
This broad review on using GeoAI in hydrological and fluvial processes modeling provides a critical assessment of the technical development, the potential, and the limitations of the models and the current research trends and gaps from the standpoint of hydrology and fluvial system researchers. The review identified more than 1300 publications over the last two decades, published mainly in water resources, civil and environmental engineering, geosciences, and environmental sciences journals.

Review Methodology and Outline
The application of GeoAI in hydrological and fluvial systems research has substantially increased and diversified in recent years, comprising a wide range of topics. Therefore, a systematic review is challenging. In this review, we adopted a scoping review methodology [34,35]. The scoping review supports consistent and structured literature searches to capture relevant information and provides a comprehensive overview of the current applications and research. We explored four categories of GeoAI applications: (1) hydrological and hydraulic modeling; (2) hydrological model calibration and modeling optimization problems; (3) water quality modeling; and (4) fluvial geomorphology and morphodynamic mapping. We searched the literature from the Web of Science, including Scopus, Springer Link, Wiley Online Library, and MDPI. We used the Boolean operators (AND and OR), the proximity operators (NEAR and PRE), and nested logic (use of parentheses) to constrain the literature search to those works containing topic keywords combined with GeoAI methods in the article's title, abstract, or keywords. Because the GeoAI terminologies are diverse and commonly phrase-named, e.g., deep neural networks, deep convolutional neural networks, and deep learning, we used proximity operators to find records containing all the terms within a defined number (n) of word neighbors. In the case of a more common GeoAI phrase name, such as machine learning or artificial intelligence, we used double quotation marks to indicate that the words should not be searched separately. In addition, we used left and right truncation or shortening to account for the GeoAI or topic names that vary in prefixes and suffixes, using the asterisk symbol, e.g., *morpho* to select papers containing the terms geomorphologic, geomorphometry, morphodynamic, or hydromorphological. An example of the search query construction in the Web of Science database for a GeoAI application in fluvial geomorphology and morphodynamic studies is:
(TS = (Deep NEAR/3 neural NEAR/3 learn*) OR TS = (Artificial NEAR/3 Neural NEAR/3 Network*) OR TS = "Artificial Intelligence" OR TS = AI OR TS = geoAI OR TS = "Machine learning") AND TS = fluvial AND TS = *morpho*
We used different keywords combined with GeoAI terminologies according to the hydrological subfields of interest, e.g., hydrological, optimization, calibration, water quality, nutrient, pollutant, sediment, etc. In some cases, we also used the NOT operator to further refine the search by excluding papers containing certain words from other topics or subfields. For example, we excluded water quality terminologies to search for papers focusing purely on hydrological modeling. Figure 1 shows the yearly publication statistics of the GeoAI and ML applications in the different hydrological subfields. We applied additional selection criteria in each database, including only peer-reviewed journal publications. After gathering the initial list, we briefly reviewed the papers to select only those within our review's scope. The publication list was further filtered to ensure that the selected publications provided relevant information about GeoAI applications in hydrological and fluvial studies. We thoroughly studied the selected papers to extract information about the GeoAI model performance, the software used, and the advantages and limitations. Based on this information, we discuss the comparison of GeoAI methods with the conventional and physical-based hydrological models (Section 5) and identify further opportunities and future trends in applying GeoAI methods in hydrological and fluvial studies (Section 6).
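The structure of the example search query (proximity chains, quoted phrases, and truncated terms joined by Boolean operators) can be sketched in a few lines of Python. This is only an illustration of how such a query string may be assembled; the helper names `near_chain`, `topic`, and `build_query` are hypothetical and are not part of any database API.

```python
def near_chain(terms, n=3):
    """Join terms with the NEAR/n proximity operator inside a TS field tag."""
    return "TS = (" + f" NEAR/{n} ".join(terms) + ")"


def topic(term):
    """Wrap a term in a TS field tag; multi-word phrases get double quotes."""
    return f'TS = "{term}"' if " " in term else f"TS = {term}"


def build_query(method_clauses, topic_terms):
    """OR the GeoAI method clauses together, then AND them with the topic terms."""
    methods = "(" + " OR ".join(method_clauses) + ")"
    topics = " AND ".join(topic(t) for t in topic_terms)
    return f"{methods} AND {topics}"


method_clauses = [
    near_chain(["Deep", "neural", "learn*"]),
    near_chain(["Artificial", "Neural", "Network*"]),
    topic("Artificial Intelligence"),
    topic("AI"),
    topic("geoAI"),
    topic("Machine learning"),
]
query = build_query(method_clauses, ["fluvial", "*morpho*"])
print(query)
```

Swapping the topic terms (for example, replacing fluvial and *morpho* with water quality keywords) would give the corresponding query for another subfield.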

Brief Introduction to Geospatial Artificial Intelligence
GeoAI is an emerging discipline that combines innovations in spatial data science, AI, ML, and big geospatial data [36]. GeoAI is the study, development, and application of intelligent computer programs for automatic geospatial and non-spatial data processing; it models geospatial association and interaction, predicts spatial dynamic phenomena, provides spatial reasoning, and discovers spatio-temporal patterns and trends [37,38]. GeoAI includes the methods, techniques, and tools of AI and ML to carry out geospatial modeling, such as spatial hydrological prediction and fluvial landform classifications. The GeoAI and ML methods (henceforth, GeoAI) can be grouped into unsupervised learning (clustering and dimension reduction), supervised learning (regression and classification), and modeling optimization problems (see Table 1). A detailed theoretical and mathematical description of the GeoAI and ML methods is given by Hastie et al. [39], Goodfellow et al. [40], and Lee et al. [41].
[42]. Several GeoAI clustering algorithms are used for geospatial and time-series data clustering (Table 1). A clustering algorithm does not require prior knowledge about the types and number of classes. More advanced dimension reduction clustering algorithms, such as autoencoders, can be used for data compression, reconstruction, and anomaly detection [43].
GeoAI regression techniques are oriented towards evaluating the relationship between response (dependent) variables and one or more causative/independent variables (predictors). There is a wide range of methods and techniques in this category, ranging from traditional regression methods to ensemble and boosting regression trees, e.g., random forest, boosted regression, SVM, the traditional ANN, and deep learning methods [39,40,44].
The GeoAI classification techniques are oriented towards identifying classes or categories. They learn from a given set of observations, called training data, and, based on that, classify new observations into predefined classes. Unlike regression-based ML algorithms, the output variable of the classification is a category. The values represent class names or labels [39]. Several GeoAI classification methods are available (Table 1), e.g., SVM, random forest, ANN, and deep learning. GeoAI classification is widely used in remote sensing image classification, landform pattern recognition, and change detection.
An ML optimization algorithm is applied to find the best solution out of the solution space [45]. The ML optimization algorithm plays an essential role in optimizing the objective function, e.g., identifying the optimal parameter values of a complex model. ML optimization algorithms can be broadly categorized into evolutionary computing and metaheuristic methods (Table 1). ML optimization shows a wide range of applications, e.g., catchment model parameter calibration by identifying the optimal set of parameter values and scale of analysis, identification of the best management scenarios for a multi-objective operation, etc. Additionally, reinforcement learning is another approach for problem optimization. It enables one or more agents to learn in a dynamic environment by defining states, actions, and maximum rewards, using feedback from an agent's actions and experiences [46].

Current GeoAI Applications in Integrated Hydrological and Fluvial Systems Modeling
GeoAI applications in hydrology and fluvial studies are rapidly increasing and replacing the traditional methods. A reason for rapid GeoAI adoption in hydrological sciences might be linked to the progress in collecting big hydrological datasets, using automatic sensors with internet transmission, or the internet of things (IoT). Similarly, the evolution and increase in earth observation satellites (conventional and nanosatellites), unmanned aerial vehicles (UAV), light detection and ranging (LiDAR), and other surveying technologies produce high-resolution geospatial data, allowing better landscape characterization. GeoAI allows the harnessing of big and high-dimensional data to better understand the hydrological processes in a particular system. Specifically, GeoAI provides new data analytic tools to the entire data processing cycle, such as sensor data fusion, hydrological modeling, data assimilation, multi-objective scenario optimization, smart decision support, evaluation of climate change impact, construction of early warning systems, and geo-visualization. Therefore, GeoAI greatly enhances and supports decision making in integrated water resources management (IWRM) and nexus approaches [47]. Figure 2 depicts a GeoAI application model for a smart IWRM support system.
Figure 2. GeoAI application model for a smart IWRM support system. (1) The internet of things (IoT) supports real-time, high-frequency hydrological monitoring. The data are stored in a cloud platform and accessed through an application programming interface (API). These data can be used for the real-time identification of problems in the system, e.g., a river basin. (2) GeoAI provides data analytic and online real-time modeling tools for hydrological system analysis and prediction. (3) GeoAI also supports multi-objective, multi-scenario optimization modeling, which in turn is the basis of smart decision support systems for IWRM. (4) Geovisualization in web mapping and mobile apps can be used for data dissemination, stakeholder engagement, and implementing early warning systems. The smart IWRM system can be closed with the evaluation and adjustment of the IWRM plan and the improvement of the hydrological monitoring system. Abbreviations: WQ (automatic water quality monitoring), ADCP (acoustic Doppler current profiler for water current velocity measurement and river bathymetry), GW (automatic groundwater monitoring in wells), UAV (unmanned aerial vehicle for very high-resolution land cover mapping and surface elevation models), EOS (earth observation system for environmental condition monitoring), LiDAR (LiDAR survey of high-resolution topography data), and GNSS (use of global navigation satellite systems for ground truth data collection).
This section provides an overview of the current applications of the GeoAI methods and techniques and their advantages and limitations in many hydrological subfields such as hydrological and hydraulic modeling, optimization problems for hydrological model calibration and decision-making support, surface water quality, and fluvial geomorphological studies.

Hydrological and Hydraulic Modeling
Water flow in the catchments and river networks is a complex and stochastic process, operating in different spatio-temporal scales and characterized by non-stationarity, dynamism, and non-linearity [48,49]. These properties have limited the development of a reliable hydrological and hydraulic prediction model that can be generalized to a large geographical area. The increasing sensor-based, high-frequency (sub-hourly) hydrological data collection and the high spatial and temporal resolution mapping of land cover and topography have enhanced the understanding of hydrological processes. This fact has led to the development of more sophisticated physical-based hydrological models. However, these models are computationally expensive and limited to small-scale applications. Alternatively, several data-driven GeoAI methods have emerged for hydrological and hydraulic classification and prediction at multiple spatial-temporal scales. Table 2 shows examples of the current applications of the GeoAI in hydrological and hydraulic modeling.

Objective: To study the RF model performance vs. hydromad conceptual models in USA and Canada.
Advantages: The RF model is simple and outperformed existing conceptual flood-forecasting models, predicting low and medium flood magnitudes.
Limitations: The RF models exhibit inaccuracies for higher flood events, and their performance depends on the catchment characteristics.

Limitations: The model seems to be under development. [62]

Hydrological System Classification
The classification of the different types of hydrological systems is one of the most widely applied modeling tasks in hydrology and ecohydrology. It aims to find similarities between different hydrological systems, e.g., those based on the hydrological response, the hydromorphological and climatic characteristics, and other variables. Unsupervised GeoAI algorithms, such as K-means clustering [63] and self-organizing maps (SOM) [64,65], have been applied to catchment classification. Both algorithms organize multidimensional input data through linear and non-linear techniques, depending on the intrinsic similarity of the data themselves. Several studies highlight the SOM non-linear techniques for producing robust and consistent hydrological classification [66][67][68], even though the classification consistency is highly influenced by the quality of the input variables [69]. Additionally, where training data are available, supervised GeoAI methods have produced highly accurate and biophysically meaningful catchment classification [70,71].
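As a minimal sketch of the unsupervised catchment classification described above, the following plain-Python K-means groups catchments by two descriptors. The descriptor values (a runoff ratio and a baseflow index per catchment) and the starting centroids are invented for illustration; real studies use many more variables and typically a library implementation.

```python
def kmeans(points, centroids, iterations=20):
    """Plain K-means: assign each point to its nearest centroid, then update centroids."""
    labels = []
    for _ in range(iterations):
        labels = []
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance from the point to every centroid
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            nearest = dists.index(min(dists))
            labels.append(nearest)
            clusters[nearest].append(p)
        # New centroid = mean of the members; keep the old one if a cluster is empty
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    return labels, centroids


# Hypothetical catchments described by (runoff ratio, baseflow index)
catchments = [(0.20, 0.80), (0.25, 0.75), (0.22, 0.78),
              (0.70, 0.20), (0.75, 0.25), (0.72, 0.30)]
labels, centers = kmeans(catchments, centroids=[catchments[0], catchments[3]])
print(labels)
```

No class labels are supplied: the two groups emerge only from the similarity of the input descriptors, which is also why the result depends strongly on the quality of those inputs.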

Hydrological Data Fusion and Geospatial Downscaling
Integrated hydrological modeling requires extensive data collection for the different components of the hydrological system at various spatial and temporal scales. Therefore, it is necessary to complete data and/or create new data by integrating several datasets that differ in source, resolution, and measurement noise [72]. This approach is called data fusion. Data fusion can increase the measurement quality and reliability, estimate unmeasured states, and increase spatial and temporal coverage. Several probabilistic and GeoAI data fusion techniques are available [73,74]. The commonly used GeoAI techniques in data fusion are non-linear Bayesian regression, ANN, RF, and deep learning [73][74][75][76]. These methods provide several advantages in representing non-linear, complex, and lagged relationships in different hydrological datasets. GeoAI data fusion is also applied to automatic data denoising, anomaly detection, and remote sensing data fusion [77], as well as to rain, soil moisture, and discharge data generation. Sist et al. [78] introduced the ANN-based data fusion of multispectral (visible and infrared) satellite data with radar (microwave) satellite data to improve rainy area mapping and the estimation of the precipitation amount. Zhuo and Han [79] used data fusion to generate soil moisture products from satellite data, land surface temperature, and multi-angle surface brightness reflectance and were able to significantly increase the availability of daily soil moisture products. Fehri et al. [80] used the best linear unbiased predictor data fusion technique to generate discharge data from crowdsourced data and existing monitoring systems. Most examples use data fusion for data integration rather than for improving process knowledge, which could be the next step to be further explored.
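To make the idea of fusing measurements of different noisiness concrete, the sketch below applies inverse-variance weighting, which is the best linear unbiased estimate when the measurements are independent and unbiased. It is a generic textbook illustration, not the procedure of any study cited above, and the discharge values and variances are invented.

```python
def fuse(estimates):
    """Fuse independent, unbiased estimates given as (value, variance) pairs.

    Each estimate is weighted by the inverse of its variance, so noisier
    sources contribute less. Returns (fused value, fused variance).
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Hypothetical discharge estimates in m3/s: (value, error variance)
gauge = (10.0, 1.0)          # automatic station, low noise
crowdsourced = (14.0, 4.0)   # citizen observation, higher noise
fused_value, fused_variance = fuse([gauge, crowdsourced])
print(fused_value, fused_variance)
```

The fused variance is smaller than that of either source, which reflects the gain in measurement reliability that data fusion aims for.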
Environmental geospatial data, particularly remote sensing data, are usually measured at different spatial and temporal scales; high-temporal resolution data are usually measured at coarse (low) spatial resolution, and fine (high) spatial resolution data are obtained with low temporal frequency [77,81]. Therefore, combining the different datasets by downscaling methods is necessary to generate spatio-temporal high-resolution data. GeoAI-based downscaling has shown several advantages. For example, CNN is frequently used for downscaling coarse-resolution to fine-resolution precipitation products, using different static and dynamic variables as predictors [82,83]. These studies have shown that CNN achieves different degrees of accuracy, depending on the precipitation rate and the condition complexity; it has, e.g., lower accuracy in extreme wet conditions [83]. Other studies have shown a higher downscaling accuracy of GeoAI methods by having a spatial component in the model, e.g., spatial RF vs. RF in downscaling daily fractional snow cover [84] and land surface temperature from MODIS data [85][86][87].

Spatial Prediction of Hydrological Variables
The application of GeoAI in hydrological spatial prediction is diverse; it can be used, for example, in the risk mapping of hydrological extremes such as flood and drought [88][89][90]. In particular, GeoAI is widely applied in flood mapping, using satellite imagery, UAVs, high-resolution LiDAR topographic data, and automatic water level sensors [91][92][93]. The common GeoAI algorithms used, e.g., in flood prediction are SVM, RF, ANN, and deep learning [92][93][94]. The selection of the methods is variable and depends on the mapping objective, the system complexity, and the data availability [91]. In areas with limited data and/or complex systems, where non-linear methods are not easily interpretable, adaptive neuro-fuzzy inference system (ANFIS) soft computing has been applied with good prediction accuracy and strong generalization ability [95]. ANFIS combines data and expert knowledge through a set of fuzzy semantic conditional rules [96][97][98].
Another GeoAI application is the spatial prediction of hydrological model variables, e.g., saturated hydraulic conductivity [99,100] and weather data [101]. This is particularly useful because spatially continuous measurements of these hydrological variables are often not available. Thus, they can only be predicted using point observations and surrogate spatial data such as remote sensing data. GeoAI spatial prediction has shown advantages in modeling non-linear processes. However, the prediction quality depends on the quality and quantity of the observed data points and the applied GeoAI method [102].

Hydrological Process Modeling
GeoAI has shown the potential for accurate hydrological modeling, such as for rainfall-runoff, river discharge, soil moisture dynamics, and groundwater table fluctuation [95,103,104]. The non-linear nature of these processes is challenging to model with simple empirical and physical-based models. Therefore, GeoAI methods such as ANNs have proved to be better for modeling complex hydrological processes and forecasting them in the short and long term and in different management scenarios [26]. However, traditional ANNs do not model sequentially ordered data such as time series. Therefore, a further development for the temporal dynamics of hydrological sequential events is the recurrent neural network (RNN) and the LSTM neural network. RNN and LSTM use the previous information in the sequence to produce the current output, although the RNN is suited to modeling short sequences only. In the case of long temporal sequences of the antecedent conditions, LSTM is preferred. Compared to the RNN, LSTM uses an additional 'memory cell' to maintain information for long sequences or periods of time [105,106]. This memory cell lets the model learn longer-term dependencies, e.g., the effects of antecedent soil moisture conditions on runoff generation [105,107]. LSTM is advantageous for modeling hydrological processes in regions with strong seasonality, such as a northern climate with varying winter conditions [91,105]. The LSTM model also allows the use of multiple time-series predictors, such as precipitation, temperature, discharge, and time [58,108]. A further extension of LSTM is created by combining it with CNN. In CNN, learning is achieved through convolving an input with filter layers to speed up parameter optimization [107,109]. Combining CNNs and LSTM encodes both the spatial and the temporal information [87,110]. LSTM techniques can also be coupled with other signal-processing algorithms such as wavelet transformation (WT). WT is applied to time-series data decomposition, e.g., the decomposition of high- and low-frequency flow signals, the identification of seasonality and trend, the decomposition of non-stationary signals, and data denoising [30]. Denoised data are used as inputs for the LSTM model [111].
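The role of the LSTM memory cell can be illustrated with a single scalar LSTM unit written in plain Python. This is a sketch of the standard LSTM gate equations under strong simplifications: one input, one hidden unit, and hand-picked (not trained) weights stored in a hypothetical `params` dictionary. The weights are chosen so that the forget gate stays close to one, which lets the cell state carry the effect of an early rainfall pulse through later dry time steps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM unit (standard gate equations)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g   # memory cell: old content kept by f, new content added by i
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c


# Illustrative, untrained weights; bf = 5 keeps the forget gate near 1
params = {"wf": 0.0, "uf": 0.0, "bf": 5.0,
          "wi": 0.0, "ui": 0.0, "bi": 0.0,
          "wo": 0.0, "uo": 0.0, "bo": 0.0,
          "wg": 1.0, "ug": 0.0, "bg": 0.0}

rainfall = [1.0, 0.0, 0.0, 0.0, 0.0]  # one wet step followed by dry steps
h, c = 0.0, 0.0
cell_states = []
for x in rainfall:
    h, c = lstm_step(x, h, c, params)
    cell_states.append(c)
print(cell_states)
```

In a trained model the gate weights are learned, so the network itself decides how long antecedent conditions should be remembered.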
Another approach is to use a physical-based model coupled with GeoAI, e.g., for runoff and flood prediction [19,112,113]. Overall, the output of the physical-based model is used as the input for GeoAI model training. For example, Noori and Kalin [14] used the SWAT model to simulate daily streamflow and estimate baseflow and stormflow, which were used as inputs for ANNs. The benefit of this approach is that once the model is trained, it can perform orders of magnitude faster than the original physical-based models without impairing prediction accuracy [17]. Another benefit of the hybrid modeling is that a trained model, e.g., in catchment hydrological modeling, can achieve better performance for other catchments than the uncalibrated process-based models [105,112].
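The coupling pattern described above, in which the output of a process model becomes the input of a data-driven model, can be sketched with deliberately simple stand-ins. Here a linear reservoir plays the role of the physical-based model and an ordinary least-squares line plays the role of the GeoAI model; both are assumptions made for brevity (the cited studies use models such as SWAT and ANNs), and the rainfall series and the synthetic observations are invented.

```python
def linear_reservoir(rain, k=0.3, storage=0.0):
    """Toy process model: outflow is a fixed fraction k of the current storage."""
    flow = []
    for r in rain:
        storage += r
        out = k * storage
        storage -= out
        flow.append(out)
    return flow


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rain = [0.0, 10.0, 5.0, 0.0, 0.0, 2.0, 0.0, 0.0]
simulated = linear_reservoir(rain)

# Synthetic observations: the process model is assumed to be biased in a known way
observed = [1.2 * q + 0.5 for q in simulated]

# Data-driven step: learn the mapping from simulated flow to observed flow
slope, intercept = fit_line(simulated, observed)
corrected = [slope * q + intercept for q in simulated]
print(slope, intercept)
```

The data-driven layer learns the systematic error of the process model, which is one simple way such hybrid schemes can improve on an uncalibrated simulation.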
Overall, most of the GeoAI models achieved higher prediction accuracy than the physical-based hydrological models. However, there are several types of GeoAI algorithms, with different architectures and mathematical formulations (e.g., ANN, CNN, and LSTM) to perform similar tasks. In addition, different types of predictor variables and data sampling sizes are used, making the GeoAI model performance comparison challenging. GeoAI models are less physically interpretable, as they do not explicitly represent the physical laws governing the hydrological processes. Therefore, their causal inference is still limited. GeoAI applications are currently oriented towards hydrological prediction. GeoAI has the potential to provide accurate and timely information which is applicable to large areas, and using data from IoT sensors and cloud computing, it can deliver real-time prediction [114].

Hydraulic Modeling
The new generation of very-high-resolution river bathymetry has improved the 1D, 2D, and 3D hydraulic modeling of rivers [115,116]. River hydraulic models have been widely used in the estimation of flood extent, water depth and velocity, sediment transport, and the assessment of fluvial morphodynamics [5,11,117]. However, very complex hydraulic models (3D) are data- and computationally demanding and restricted to small-scale applications. Hydraulic modeling is sometimes inconsistent and does not represent all the bio-physical processes occurring in the natural fluvial environment [118,119]. In addition, the numerical solving approach of the hydraulic model results in high numerical instability due to sensitivity to the initial and boundary conditions, model structure, and spatial and temporal discretization [120]. Thus, GeoAI methods have emerged as a promising tool for hydraulic modeling in large-scale and natural systems [19,119,121,122]. Emerging deep learning applications in computational fluid dynamics have also shown potential for the modeling of turbulent and complex flow structures [123][124][125]. Additionally, coupling the hydraulic model with the Bayesian GeoAI methods improves hydraulic modeling over a broad range of spatiotemporal scales and physical processes [126].

Hydrological Data Assimilation
Hydrological data assimilation (DA) is a state estimation theory that assumes that models are an imperfect representation of the system and that hydrological data might contain noise. Both can also contain different types of information and be complementary [127]. DA aims to harness the information in the hydrological model and in the observations to approximate the true state of the system, considering its uncertainty statistically [127][128][129]. DA methods include linear dynamics (e.g., Kalman filter, the most popular state estimation method) and non-linear dynamics [127]. The DA methods can be related to ML. Data fusion and DA use similar techniques, but the problem formulation differs [130].
In hydrological modeling, ML-based DA is the most common type of coupling of ML and physical-based models, the so-called loosely hybrid hydrological model [131]. DA updates the system state predicted by a physical-based model at a given time or place with observational data, using Bayesian approximation such as the ensemble Kalman filter (EnKF) [127] or ML methods, e.g., ANN, RNN, and LSTM [132]. Both the DA and the ML methods solve an inverse problem, expressed as the model y = h(x, w), where h is the model function, x represents the state (in DA) or feature (in ML) variable, w is the parameters or weights of the model, and y is the observations (in DA) or labels (in ML). DA is oriented to find the true state of the system (x) from the observations, and ML is commonly oriented to find the model parameters or weights (w) from the observations. DA holds w constant to estimate x; ML holds x constant to estimate w; see [133] for a detailed review.
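The two readings of y = h(x, w) can be contrasted in a small sketch that assumes the simplest possible model, h(x, w) = w * x. The first function is the analysis step of a scalar Kalman filter, which keeps w fixed and updates the state x; the second is a least-squares fit, which keeps the inputs x fixed and estimates w. The numbers are invented, and the scalar linear model is an assumption made for illustration only.

```python
def kalman_update(x_prior, p_prior, y_obs, w, r):
    """DA view: w is fixed; update the state x from one observation y = w * x + noise.

    p_prior is the prior state variance and r is the observation error variance.
    """
    gain = p_prior * w / (w * w * p_prior + r)
    x_post = x_prior + gain * (y_obs - w * x_prior)
    p_post = (1.0 - gain * w) * p_prior
    return x_post, p_post


def fit_weight(xs, ys):
    """ML view: the inputs xs are fixed; estimate w in y = w * x by least squares."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# DA: the model predicts a state of 2.0, while the observation suggests a higher value
x_post, p_post = kalman_update(x_prior=2.0, p_prior=1.0, y_obs=5.0, w=2.0, r=1.0)

# ML: known inputs and labels, unknown weight
w_hat = fit_weight([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(x_post, p_post, w_hat)
```

In the DA step the posterior variance is smaller than the prior variance, which expresses the statistical treatment of uncertainty mentioned above.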
Many studies have shown that ANN data assimilation outperforms conventional DA, particularly for complex and non-linear response systems [61]. An additional development of ML-based DA methods is the so-called deep DA [132], which trains deep learning neural networks such as LSTM for high dynamic systems. Deep DA has shown potential for accurate prediction for periods or sites where observations are unavailable and conventional DA cannot be applied to reduce the model error [132].

Hydrological Model Calibration
In hydrological modeling, the inverse modeling approach is widely applied. In inverse modeling, the model features and parameter values are unknown, and those are identified by minimizing the error between the model output and the observed data [134,135]. The model feature identification includes the definition of the main hydrological processes, the mathematical equations representing them, the boundary conditions, and the time regime [136]. The parameter identification encompasses identifying the optimal parameter values that reproduce the observed data acceptably [136]. In highly parameterized models, identifying the optimal values of the parameters is challenging and represents a substantial part of the modeling work. Usually, there is not a single set of optimal parameter values that can simulate the observed data well but several sets of parameter values that achieve similar model performance. This modeling phenomenon is called the non-uniqueness or equifinality problem [137]. The hydrological model calibration often requires specialized optimization algorithms, and several ML-based calibration algorithms have been developed to support model calibration (see Table 3 for examples).

The EPMODE and the MODE-ACM were both reliable and showed better performance than the NSGA-II and SPEA2 models.

Software: Not stated
Objective: Comparison of three multi-objective algorithms for hydrological model calibration.
Advantages: All three algorithms are able to find Pareto sets of solutions. The most uniform distribution of the solutions was derived with MOSCEM-UA, as the NSGA-II has the shortest Pareto optimal front and the MOPSO has the maximal extent of the obtained nondominated front.
Limitations: The rate of convergence with the optimal solutions varies across the three algorithms.

Software: Not stated
Objective: To calibrate hydrological models using more effective and efficient multi-objective algorithms called MOSCEM-UA.
Advantages: MOSCEM-UA allows multi-objective calibration, preventing the collapse of the algorithm to a single region of highest attraction. It combines the complex shuffling and the probabilistic covariance-annealing search procedure of the SCEM-UA algorithm.
Limitations: Challenging to make sure that a diverse and large initial population size is provided, which supports multiple objective approaches.

Limitations: Increasing the number of objective functions did not necessarily lead to a better performance than the bi-objective calibration. The increasing number of objective functions also introduces computational challenges. Insufficient data and flood events seriously affect the calibration performance. [145]

Objective: To implement a smart stormwater real-time system control based on RL.
Advantages: The RL-based model learned the control valve strategy in a distributed stormwater system by interacting with the system it controls under thousands of simulated storm scenarios. It effectively tries various control strategies until it achieves target water level and flow set points. [148]
Limitations: RL performance is highly sensitive to the RL agent reward formulation and requires a significant amount of computational resources to achieve a good performance.
Hydrological models are often calibrated with a single objective function, although adequate and fast multi-objective optimization techniques exist that better support several output variables [141]. There are many optimization algorithms, meta-heuristic and ML-based, for model parameter calibration, such as particle swarm optimization (PSO), grey wolf optimization (GWO), genetic algorithms (GAs), genetic programming (GP), strength Pareto evolutionary algorithms (SPEA), micro-genetic algorithms (micro-GA), and Pareto-archived evolution strategies (PAES). In the reviewed studies, the best-performing method varied depending on the selected model performance indicators. This is consistent with the no free lunch theorem [149], which states that no single method performs best across all problems, and it is not expected to change. In any case, all the methods performed well. See Yusoff et al. [150] and Ibrahim et al. [45] for specific reviews of optimization algorithms.
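The notion of a Pareto set in multi-objective calibration can be illustrated with a minimal Python sketch. The five candidate parameter sets and their two error scores are hypothetical values chosen only for illustration, and both objectives are assumed to be minimized; the sketch shows the non-dominance test itself, not any particular calibration algorithm.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated score tuples."""
    front = []
    for i, s in enumerate(scores):
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i):
            front.append(i)
    return front


# Hypothetical (high-flow error, low-flow error) pairs for five parameter sets.
candidate_scores = [
    (0.20, 0.90),
    (0.35, 0.40),
    (0.60, 0.25),
    (0.50, 0.50),  # dominated by (0.35, 0.40)
    (0.90, 0.95),  # dominated by all other sets
]

front = pareto_front(candidate_scores)
print(front)  # indices of the trade-off solutions
```

The first three candidates trade one error against the other, so none of them can be preferred without an additional criterion; this is the situation in which a modeler chooses among Pareto-optimal parameter sets.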
Meta-heuristic optimization algorithms, which are mostly inspired by the biological or behavioral strategies of animals, provide good solutions to optimization problems, particularly with incomplete or imperfect information or limited computational capacity [151]. An advantage of these algorithms is that they make relatively few assumptions about the optimization problem and reduce the computational demand by randomly sampling a subset of a solution space that would otherwise be too large to enumerate [151]. However, some meta-heuristic algorithms, such as PSO, do not guarantee that a globally optimal solution will be found, particularly when the number of decision variables or dimensions being optimized is large [45]. GAs, such as the non-dominated sorting genetic algorithm II (NSGA-II), are inspired by genetic evolutionary concepts. The genetically adaptive multi-objective method (AMALGAM) [152] has been applied for multi-objective, multi-site calibration and to solve highly non-linear optimization problems [144,153]. AMALGAM is a multi-algorithm method that blends the attributes of several optimization algorithms (NSGA-II, PSO, the adaptive Metropolis search, and differential evolution) [144]. GAs have been shown to be well suited for hydrological models, such as the semi-distributed SWAT model, which cannot be adequately calibrated by gradient-based calibration algorithms [144,153,154]. The objective function for each solution in a GA can be evaluated in parallel, providing computational efficiency [144]. Additional calibration methods based on deep learning have also been developed, outperforming many of the existing evolutionary and regionalization methods [20,146].
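As a rough illustration of meta-heuristic calibration, the following Python sketch applies a basic PSO to a toy linear-reservoir model. Everything in it is an assumption made for illustration: the toy model, the synthetic "observations" generated with k = 0.3 and s0 = 10, the parameter bounds, and the PSO settings (inertia w, acceleration coefficients c1 and c2). It is not the setup of any study cited here.

```python
import random
from typing import Callable, List, Sequence, Tuple


def linear_reservoir(rain: Sequence[float], k: float, s0: float) -> List[float]:
    """Toy rainfall-runoff model: outflow is a fixed fraction k of storage."""
    storage, flows = s0, []
    for p in rain:
        storage += p
        q = k * storage
        storage -= q
        flows.append(q)
    return flows


def sse(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Sum of squared errors between simulated and observed flows."""
    return sum((s - o) ** 2 for s, o in zip(sim, obs))


def pso(cost: Callable[[List[float]], float],
        bounds: Sequence[Tuple[float, float]],
        n_particles: int = 20, n_iter: int = 100,
        w: float = 0.7, c1: float = 1.5, c2: float = 1.5, seed: int = 0):
    """Basic global-best PSO; returns best position, best cost, cost history."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep particles inside the feasible parameter range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history


rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0]
observed = linear_reservoir(rain, k=0.3, s0=10.0)  # synthetic observations
bounds = [(0.01, 0.99), (0.0, 50.0)]               # ranges for k and s0

best, best_cost, history = pso(
    lambda p: sse(linear_reservoir(rain, p[0], p[1]), observed), bounds)
print("calibrated k, s0:", best, "SSE:", best_cost)
```

Because the search is stochastic, a different seed can end at a different parameter set with a similar error, which is a small-scale view of the equifinality problem; the cost evaluations of the particles are also independent of each other within an iteration, which is what makes parallel evaluation possible.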

Decision Support System for Integrated Water Resources Management
Integrated water resources management (IWRM) deals with multiple actors to consensually and communicatively integrate decisions in a hydrological unit to ensure equitable economic development and social welfare while assuring hydrological system sustainability [155]. IWRM demands quality and timely information. Hence, increasing automation with GeoAI-based decision support systems is thought to enhance IWRM [17,156]. Multi-objective and scenario analyses are typical applications of GeoAI techniques in IWRM to find solutions for conflicting objectives, forecast the impact of management strategies, and optimize hydrological system operation [157,158]. We found widespread applications of GeoAI in reservoir and water distribution optimization using ANN [159,160], ensemble and deep learning algorithms, and genetic programming [161,162]. Another application is building a smart irrigation decision support system [147], in which partial least squares regression and the adaptive network-based fuzzy inference system (ANFIS) are proposed as reasoning engines for automated decisions. An additional example of an artificial intelligence application is adaptive intelligent dynamic urban water resource planning [158]. It uses a Markov decision process to tackle complex water management problems: predicting water demand, scheduling management, financial planning, tariff adjustment, and optimizing water supply operations [158] (see Table 3). Overall, GeoAI-based IWRM integrates various types of algorithms to perform different tasks, such as prediction and forecasting using various types of geospatial data, and optimization for management scenarios with multiple objectives. Algorithms such as ANFIS are used for system reasoning to automate the decision support [157,158,163]. ANFIS mimics human reasoning and decision making based on a set of fuzzy IF-THEN rules. It can learn to approximate nonlinear functions and can self-improve by adjusting the membership function parameters directly from the data [164].
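The fuzzy IF-THEN reasoning that ANFIS builds on can be sketched in a few lines of Python. This is only a fixed zero-order Sugeno rule base with hand-picked triangular membership functions; the learning step of ANFIS, which tunes the membership parameters from data, is not shown. The variable (soil moisture as a fraction), the rule labels, and the irrigation depths are hypothetical.

```python
from typing import List, Tuple


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function: 0 at a and c, 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical rules:
# IF soil moisture is <label> THEN irrigation depth (mm) is <value>.
RULES: List[Tuple[str, Tuple[float, float, float], float]] = [
    ("dry",   (-0.2, 0.0, 0.4), 20.0),
    ("moist", (0.1, 0.4, 0.7),   8.0),
    ("wet",   (0.4, 0.8, 1.2),   0.0),
]


def irrigation_depth(soil_moisture: float) -> float:
    """Zero-order Sugeno inference: mean of the rule outputs weighted by
    how strongly each rule fires for the given input."""
    strengths = [triangular(soil_moisture, *mf) for _, mf, _ in RULES]
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    weighted = sum(s * out for s, (_, _, out) in zip(strengths, RULES))
    return weighted / total


for sm in (0.0, 0.25, 0.4, 1.0):
    print(sm, round(irrigation_depth(sm), 2))
```

An input of 0.25 partly belongs to both "dry" and "moist", so the recommended depth falls between the two rule outputs; in ANFIS, the breakpoints of the membership functions and the rule outputs would be adjusted during training rather than fixed by hand.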

Automatic Water Quality Monitoring
The collection of water quality (WQ) data with wireless sensor networks and internet of things (IoT) technologies is rapidly increasing and provides very-high-frequency (sub-hourly) WQ data [165,166]. There is evidence that high-frequency data better represent the dynamic variation of river discharge and of sediment and solute fluxes [167]. Such data enable the early mitigation of floods and drinking water problems [168,169]. High-frequency data can also lead to a more precise and accurate classification of the biochemical status of rivers and lakes [170]. However, such sensors and devices are subject to failures, poor calibration, and inaccurate data recording in certain conditions [171,172]. Therefore, automatic data quality control, error and anomaly detection, sensor drift compensation, and uncertainty assessment are important [171–173]. GeoAI has shown advantages in managing WQ sensor networks and sensor data fusion, such as fault detection, data correction, and upgrades from different monitoring sensors by data fusion [174]. See Table 4 for selected examples of GeoAI applications in WQ monitoring. Additional applications of GeoAI are in the detection, localization, and quantification of critical pollutant sources and critical periods of loading in monitoring networks [175,176]. The most common GeoAI algorithms for WQ sensor fusion are based on Bayesian algorithms, fuzzy set theory, genetic programming, ANN, and LSTM [177–180].

Advantages: Hybrid models showed improved predictive power compared to standalone algorithms. The hybrid bagging random tree (BA-RT) model showed the greatest predictive power and was able to produce reliable results despite a dataset spanning a short time period.
Limitations: The BA-RT model struggled to accurately predict extreme WQ index (WQI) values, while most models also overestimated WQI values. [190]

SVM and RF
Software: e1071 R package used to build SVM model and random forest R package used to build RF model.
Objective: SVM and RF algorithms compared to predict high-frequency variation in stream solutes in Hubbard Brook, New Hampshire, USA.
Advantages: Both ML algorithms were capable of effectively predicting concentrations of major ions.
Limitations: Solutes with atmospheric, episodic, or strong biotic and abiotic controls were much more poorly predicted than solutes least affected by ecosystem dynamics. [191]

Many WQ parameters cannot easily be measured in situ and in real time for various reasons, such as high-cost sensors, low sampling rates, multiple processing stages, and the need for frequent cleaning and calibration. Therefore, a common practice is to estimate a particular WQ parameter from other surrogate parameters, an approach known as soft sensors [181,183,184]. ML techniques have shown higher accuracy in implementing soft sensors than conventional regression-based models [181,183,184,192].
ML methods have also shown advantages in automatic hysteresis pattern analysis using high-frequency WQ data, e.g., with a restricted Boltzmann machine ANN [193]. A more detailed hysteresis pattern classification provides new insights into WQ pollutant sources and drivers, the influence of catchment and riverine features, the effect of antecedent conditions, and the influence of changes in rainfall and snowmelt patterns [193].

Spatio-Temporal Water Quality Prediction
We found diverse applications of GeoAI methods in WQ spatio-temporal pattern analysis, WQ classification, the prediction of WQ variables, and pollutant loading estimation. Detailed reviews of ML applications in WQ prediction are found in Rajaee et al. [27], Naloufi et al. [29], and Chen et al. [194]. Table 4 shows examples of GeoAI applications for this purpose. Commonly used GeoAI methods for WQ prediction and classification are unsupervised clustering, such as k-means, density-based spatial clustering of applications with noise (DBSCAN), and SOM, but also time-series segmentation, such as dynamic time warping [195]. Supervised ML classification and prediction algorithms for WQ include RF, SVM, Bayesian networks, and ANN; deep learning methods such as LSTM are also frequently used [190,196,197].
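Dynamic time warping, mentioned above for time-series analysis, compares two series while allowing shifts in timing. A minimal Python sketch of the classic dynamic-programming formulation follows; the three short series are hypothetical stand-ins for, e.g., turbidity responses to storm events.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping with absolute-difference local cost.
    Runs in O(len(a) * len(b)) time and memory."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


event_a = [1.0, 2.0, 3.0, 2.0, 1.0]
event_b = [1.0, 1.0, 2.0, 3.0, 2.0, 1.0]  # same shape, delayed by one step
event_c = [3.0, 3.0, 3.0, 3.0, 3.0]       # flat response

print(dtw_distance(event_a, event_b))  # delayed copy: warping absorbs the lag
print(dtw_distance(event_a, event_c))  # different shape: larger distance
```

A pairwise distance matrix built this way can then be passed to a clustering method, so that events with the same shape but different timing are grouped together.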
High-frequency WQ monitoring data contain noise due to random and systematic errors, which impairs WQ prediction accuracy. Hence, combining data denoising techniques, such as Fourier and wavelet transforms, with GeoAI improves WQ prediction. For example, Song et al. [198] found that combining a synchro-squeezed wavelet transform with an LSTM network substantially improved WQ parameter prediction. Similarly, Najah Ahmed et al. [28] integrated the wavelet discrete transform with the adaptive neuro-fuzzy inference system (WDT-ANFIS) to obtain high-accuracy predictions of river WQ parameters.
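The denoising step can be illustrated with a deliberately simple stand-in: a one-level Haar discrete wavelet transform with soft thresholding of the detail coefficients. This is not the synchro-squeezed transform used by Song et al. [198]; the function names, the example series, and the threshold value are assumptions for illustration, and the downstream predictor (e.g., an LSTM) is not shown.

```python
import math
from typing import List, Sequence, Tuple

SQRT2 = math.sqrt(2.0)


def haar_forward(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One-level Haar transform of an even-length series."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx: Sequence[float], detail: Sequence[float]) -> List[float]:
    """Invert haar_forward."""
    x: List[float] = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x


def soft_threshold(values: Sequence[float], t: float) -> List[float]:
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]


def denoise(x: Sequence[float], threshold: float) -> List[float]:
    """Suppress small high-frequency fluctuations in an even-length series."""
    if len(x) % 2:
        raise ValueError("series length must be even")
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Hypothetical sub-hourly sensor readings with small jitter and one real step.
readings = [5.0, 5.2, 4.9, 5.1, 9.0, 9.2, 8.9, 9.1]
print([round(v, 3) for v in denoise(readings, threshold=0.3)])
```

With the threshold above, the small within-pair jitter is removed while the step between the fourth and fifth readings is kept, because that step lies between pairs and is carried by the approximation coefficients. Practical applications typically use several decomposition levels and a data-driven threshold.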
Additionally, WQ data usually exhibit temporal autocorrelation and multicollinearity among the WQ parameters. To account for these characteristics in prediction models, Zhou et al. [199] proposed an ML model based on t-distributed stochastic neighbor embedding (t-SNE) and self-attention bidirectional LSTM (SA-Bi-LSTM), demonstrating a substantial improvement in WQ prediction. Another promising approach is uniform manifold approximation and projection (UMAP) for multidimensional WQ data ordination and classification. Unlike other dimension reduction methods, UMAP retains both the global and the local information structure, and the data ordination is bio-physically meaningful [200].
Inland waters naturally have high spatial variation, so their prediction requires complex spatial models and large datasets. GeoAI methods have shown breakthroughs in spatial WQ prediction by combining field observations, remote sensing data, or UAV imagery, for example, using deep learning, RF, genetic algorithm-RF, adaptive boosting (AdaBoost), genetic algorithm-AdaBoost, and genetic algorithm-extreme gradient boosting (GA-XGBoost) [183,194]. However, these models usually demand extensive training data, which are restricted to a few pilot or intensively monitored areas.
Another approach in WQ prediction is the application of hybrid models and the integration of physical-based models with GeoAI methods, such as SVM, RF, ANN, and LSTM.Hybrid models usually outperformed physical-based models.For example, Noori et al. [188] found substantial improvement in monthly nitrate, ammonium, and phosphate load prediction when using hybrid SWAT-ANN models.Hybrid models are also helpful for unmonitored catchment predictions [188].The hybrid model also improves GeoAI explanatory and generalization capability, although some disadvantages observed in the physical-based model, such as extreme values not being well predicted, persisted in the hybrid models.Similarly, the process-guided recurrent neural network (RNN), which combines the biophysical principles of the process-based model and RNN, modeled the seasonal variation of lake phosphorus loading with lower bias and better reproduced the long-term changes of phosphorus loading compared to using the physical-based model and RNN independently [21].
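One simple way to build such a hybrid is residual correction: a data-driven model is fitted to the errors of the process-based model and added back to its output. The Python sketch below is only an illustration under strong assumptions: `process_model` is a made-up stand-in for a physical-based model, ordinary least squares stands in for the ML component, and the discharge and load values are invented. It does not reproduce the SWAT-ANN setup of Noori et al. [188].

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def process_model(discharge: float) -> float:
    """Hypothetical stand-in for a process-based nutrient load estimate."""
    return 2.0 * discharge


# Invented monthly discharge and observed nutrient load values.
discharge = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed_load = [2.6, 4.9, 7.7, 10.1, 12.4, 15.2]

physical: List[float] = [process_model(q) for q in discharge]
residuals = [o - p for o, p in zip(observed_load, physical)]

# Data-driven component: learn the systematic error of the process model.
a, b = fit_line(discharge, residuals)
hybrid = [p + a + b * q for p, q in zip(physical, discharge)]

print("process-based MSE:", round(mse(physical, observed_load), 3))
print("hybrid MSE:", round(mse(hybrid, observed_load), 3))
```

The correction can only remove errors that are systematically related to its inputs, which is consistent with the observation that weaknesses of the physical-based model, such as poorly predicted extremes, may persist in the hybrid. A fair comparison would evaluate both models on data not used for fitting.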
Overall, the GeoAI water quality prediction depends not only on the selected algorithms and settings but also on the WQ parameters, data size, and training data quality for the learning models [183,188,191].

Machine Learning in Fluvial Geomorphic and Morphodynamic Mapping
Fluvial geomorphology triggered the quantitative dynamic paradigm [201] as an approach to quantifying and understanding the processes of the fluvial environment [5]. The simultaneous development of techniques such as multispectral satellite images, synthetic aperture radar (SAR), LiDAR, UAV imagery, structure from motion photogrammetry (SfM), and multibeam sonar (sound navigation and ranging), among others, has resulted in an unprecedented, seamless characterization and quantification of the fluvial environment and its dynamics [202–204]. This geospatial dataset explosion, as in many other disciplines, has resulted in the perfect foundation for applying GeoAI methods in fluvial geomorphology. Here, we reviewed the recent GeoAI applications in fluvial geomorphological studies. Table 5 shows selected examples of GeoAI applications in fluvial geomorphic studies.

Objective: To measure river wetted width (RWW) with a novel approach at the subpixel scale by using MODIS and Landsat OLI images in Bay of Bengal, India, and Landsat TM in Columbia River, USA.
Advantages: CNN-based sub-pixel scale classification resulted in a more accurate estimation of RWW than the conventional hard image classification.
Limitations: No full spectral unmixing was possible due to the spectral variations of land cover classes and the nonlinear mixture phenomenon. Misclassification issues were reported when shadows, bridges, or trees were located along riverbanks. In such situations, RWW was unmeasured.

GEOBIA, EL, RF, extra tree (ET), gradient tree boosting (GTB), extreme gradient boosting (XGB). Then, it was combined with a voting classifier.
Software: Python-based with scikit-learn package.
Objective: To map the main hydromorphological types that characterize fluvial landscapes in Europe by using Copernicus image mosaic and EU DEM.Target classes: water, sediment bars, riparian vegetation, other floodplain units.
Advantages: RF outperformed any other tested classifier, e.g., ET, GTB, and XGB. Hierarchical object-based segmentation is robust for combining spectral and topographical data at different spatial resolutions and enhancing low spectral resolutions. Area-based validations were the preferred method to validate the quality of the object-based maps.
Limitations: Vegetation units and sediment bars were not very well classified. The main source of error was related to the high mixture of riparian vegetation, sediment bars, and other floodplain features. [209]

Object-based RF and pixel-based RF, combined with recursive feature elimination and PCAs
Software: rpart and caret R packages, and EnMAP-Box (environmental mapping and analysis program) 2.1.1 software.
Objective: To reveal uncertainty, overfitting, and efficiency of terrain attribute identification in fluvial landforms using morphometric variables derived from a LiDAR DTM from Tisza River, Hungary.
Advantages: Object-based RF method had a better classification accuracy (95%) than the pixel-based RF method (78%) when identifying 4 different river landforms (crevasse channels, swales, point bars, levees). Overfitting was controlled in the study by limiting the number of input variables.
Limitations: Object-based RF classifications needed visual interpretation, field observations, and high-resolution data. PCAs did not help to select more efficient and important variables. [210]

RF and SVM
Software: eCognition developer 9 software.
Objective: Semi-automatic map of riverscape units and in-stream microhabitats, providing continuous, objective, and multi-scale classification using very-high-resolution near-infrared aerial imagery and LiDAR DTM from the Orco River, Italy.
Advantages: RF better identified riverscape elements, e.g., channel and bars, while SVM did better when classifying in-stream meso-habitats. Topographical data, in particular detrended DTM (DDTM), was a relevant data source for an accurate classification of the riverscape units. Near-infrared imagery combined with DDTM was the best predictor.
Limitations: Extensive expert-based training was necessary for detailed post-classification. Several subjective rules were added to the process. Most confusion in the classification was detected between the floodplain and sparse vegetation classes, where the DDTM was not helpful. Only a low number of the selected variables were used (slope and valley bottom width). The model was validated with a basin-scale expert mapping of valley types, which is time-consuming and not available for other areas. The model was only proposed as a preliminary assessment for further studies.

Modified Hebbian algorithms and k-means clustering.
Software: On-line batch Hebbian algorithm and CoSA (clustering of sparse approximations) packages.
Objective: To investigate the applicability of ML classifier in Arctic regions using DigitalGlobe Worldview-2 visible/near-infrared, high-resolution imagery from Mackenzie River, Canada, and Selawik and Barrow Rivers in Alaska (USA).
Advantages: Allows automatic discretization of landscape units in large areas. Useful as a preliminary method to learn which scale of clustering is suitable to study different processes or focuses of the study (e.g., hydrology versus vegetation). Capable of detecting vegetation changes as it recognizes vegetation levels in different classes.
Limitations: No error assessment was performed, nor was there ground truth validation. The selection of an appropriate number of clusters depends on the expert's decision.

SVM, RF, ANN, partial least squares, multivariate adaptive regression splines, flexible discriminant analysis, k-NN, regression tree, bagged trees, linear discriminant analysis, regularized linear discriminant analysis, and naive Bayes.
Software: Caret and h2o R packages.
Objective: To extrapolate a geomorphic classification of channel types to a regional stream network using DEMs and thematic maps (e.g., lithology, soil, stream network, etc.) from Sacramento River, USA.
Advantages: Multiple algorithms compared. RF outperformed other models with more accuracy and stability and lower entropy in reach-scale river type classification. Rigorous approach in model design and evaluation of performance with 20 × 10-fold cross validation used for clarification of some black box aspects of ML.
Limitations: Needs large expert-based field survey data. It is unclear if ML is able to integrate predictors at different scales and to show different uncertainty across the watershed. [214]

RF and RF combined with recursive feature elimination (RF-RFE).

Software: R program
Objective: To detect structural and/or neotectonic controls influencing the knickpoints of the drainage network using DEM and thematic maps (geology and geomorphology) from Abaeté Watershed, Brazil.
Advantages: Simple and reproducible methodology that provides a causal relationship between knickpoint formation and lithological contacts and neotectonic configuration and activity. RF succeeded in partially predicting geomorphic indices (e.g., stream length gradient index or normalized channel steepness index) and can be used to predict them in unsampled areas without overfitting.
Limitations: Low performance of the methods, obtaining R² = 0.38 as the highest correlation between predicted and directly estimated values of geomorphic indices. It may have been affected by the selection of the covariables. [215]

Template-matching (object-based) algorithm (TMA) and pixel-based SVM.
Software: Feature Analyst (Overwatch Systems Ltd.). Not specified for SVM.
Objective: To delineate water surface boundaries and assess the influence of river and bank characteristics in the efficacy of a template-matching compared to a pixel-based algorithm, using high-resolution images with false-color infrared, from the Brazos River, USA.
Advantages: Both algorithms adequately delineate the water surface. SVM performed better and handled complex and noisy class relationships. TMA performed better than SVM in spatially complex channel morphologies (e.g., partially submerged sediment deposits, sediment bar structures) due to its capability to incorporate both spectral and spatial information.
Limitations: Validation relies on expert knowledge and previous maps. Selection of ancillary data types depends on expert decision and the delineation accuracy of TMA. In addition, the low spectral dimension of the images limited the pixel-based classification. Both algorithms encounter problems when classifying multiple complex features (e.g., overhanging trees) and illumination conditions (e.g., shading). The TMA performance was less spatially consistent than that of SVM. [216]

SOM
Software: R based (R v3.5.1). Package "kohonen" v3.0.7.
Objective: To produce waterbody typology from 22 GIS-derived continuous catchment characteristics to capture the dominant controls that influence river reaches across England and Wales.
Advantages: SOM-based water body typology reflects catchment functional feature controls on river reaches. The method is extendable to other areas where reach-level monitoring is relevant. The SOM combined with hierarchical clustering can be applied over a wide range of catchments, e.g., for a national-level waterbody typology map.
Limitations: The method could not isolate individual effects from catchment controls as they are dependent on each other. It does not detect temporal change, and local controls such as dams, channelization, and others are not taken into account. [217]

Objective: To introduce the BathyNet framework, a photogrammetry and radiometric-based combined retrieval of water depth using U-Net CNNs.
Study area was Lech river, Augsburg, Bavaria, Germany.
Advantages: U-Net CNNs approximate arbitrary functions and include spatial context. The U-Net CNN-based depth retrieval outperformed traditional regression-based optical inversion methods.
Limitations: U-Net CNNs require large amounts of training data and their application might be unfeasible in areas with scarce water-depth field samples. [218]

The current state-of-the-art of GeoAI in fluvial geomorphology consists of an automatic extraction of fluvial features at a fine scale by integrating larger and multidimensional datasets, using unsupervised classifiers (e.g., K-means, SOM), supervised classifiers (e.g., RF, SVM, ANN, deep learning, CNN), or by combining both methods, e.g., K-means with ANN. Most of the reviewed articles focused on the development of methods and workflows, the testing of new applications, or the comparison of algorithm performances [205,207,209], rather than the study of fluvial processes and underlying dynamics. These applications of GeoAI provide the basis for the discovery of new fluvial patterns and trends and increase knowledge about fluvial environments (e.g., Ling et al. 2019; Guillon et al. 2020; Heasley et al. 2020) [208,214,217].
Overall, GeoAI outperforms conventional methods of fluvial landform classification, reaching a classification accuracy of over 80%. The most common applications are found in river channel and water body mapping [208,216], the classification of riverine landforms and vegetation successions [213,214,219,220], the estimation of catchment hydrogeomorphic characteristics (e.g., valley bottom, floodplain, and terrace) [212,221], and benthic and fish habitat mapping [207,211,222,223].
Another application of GeoAI is the integration of multiple techniques to provide more accurate and very-high-resolution data for fluvial studies. For example, the fluvial environment is highly dynamic and demands frequent bathymetry surveys to understand the changes and morphodynamic drivers in lakes and rivers. Emerging technologies, such as the acoustic Doppler current profiler (ADCP), green LiDAR, high-resolution image radiometric models, and 3D point cloud generation with SfM, allow more frequent and accurate bathymetry mapping [203,204]. However, each approach has limitations. For example, ADCP collects data only from areas where the sensor has passed and does not provide continuous spatial scanning; it does not measure near-bank areas, and it is subject to the acoustic side-lobe effect [224]. Photogrammetry and the green LiDAR method are sensitive to water turbidity and light penetration in the water column [225,226]. Therefore, multisource bathymetry modeling using GeoAI methods increases the bathymetric data accuracy and reduces uncertainties due to data quality in change detection. For example, ADCP data, image radiometric-based water depth, and SfM depth data can be integrated using U-Net convolutional neural networks [218,227].
The GeoAI approach, when using multi-temporal remote sensing data, allows the mapping of a broader fluvial landscape and its change, thereby revealing spatiotemporal scales of fluvial morphodynamics, as in e.g., Van Iersel et al. [228], Hemmelder et al. [229], and Boothroyd et al. [230].There are different GeoAI approaches for automatic change detection using multi-temporal images such as generative adversarial networks (GAN), autoencoder, CNN, and others, as presented by Shi et al. [231].
Although GeoAI has been rapidly adopted in fluvial geomorphological studies, a wide spectrum of workflows and software is found, and many GeoAI approaches seem to be under development or in the testing stage. Therefore, without a general, consistent, and robust workflow among them, it is difficult to generalize and compare the performance and overall accuracy of GeoAI methods, as well as the study results.
A current limitation of GeoAI methods in fluvial studies is that the classification quality is highly dependent on expert knowledge. The unsupervised classification output is often inconsistent, and the cluster classes do not have a direct geomorphic or fluvial process meaning and need post-classification labeling. Supervised GeoAI classifiers require large training samples, and the training data quality is highly dependent on expert knowledge. In addition, many of the studies using GeoAI to classify fluvial landforms or river typologies have been conducted in areas where an extensive body of previous studies and data collection exists [212,214]. Therefore, its application in poorly sampled areas is somewhat limited.
In many cases, GeoAI is enhanced with the use of fine-scale fluvial geomorphic mapping data, e.g., LiDAR or UAV-based images, which are still restricted to pilot areas, mostly in Western countries. In addition, several different class names are used for the same fine-scale fluvial landforms, and a standardized fluvial landform taxonomy is therefore lacking [232].
Another limitation of supervised GeoAI applications is the misclassification of elements out of the GeoAI training range, as presented, e.g., in Carbonneau et al. [205]. Moreover, the use of very different methods for assessing the GeoAI algorithm's performance and accuracy may lead to inconsistencies in the validity of results, e.g., map cross-tabulation often uses limited validation points rather than areal-based reference data, due to the lack of geomorphological reference maps at a very fine scale. Another issue with regard to performance and accuracy assessments is the use of scalar error statistics, such as root mean square error, which may not be reliable in fluvial mapping. Here, the resulting error is a complex combination of random and systematic components, and the isotropy and stationarity assumptions do not apply to fluvial processes [233]. It is also heavily influenced by a small percentage of classification errors, which lead to incorrect rankings of overall model performances or to prediction error [206]. Therefore, a more consistent and comparable GeoAI-based fluvial mapping accuracy assessment is needed.

5. GeoAI Causal and Predictive Inference Capability

Renewed Data-Driven Research
Observational and experimental studies have been the basis for understanding the empirical relationships of physical processes occurring on the Earth and for developing the mechanistic or physical-based models that predict them [234]. With the substantial increase in observational data and the development of GeoAI methods, empirical studies have been renewed with data-driven models [17,235]. Unlike traditional statistical models, GeoAI methods do not rely on formal assumptions about the data structure or the type of data distribution, such as normality. They are more flexible and adaptable for nonlinear and high-dimensional data. GeoAI methods automatically identify and exploit correlations and patterns (classification) in the data to make predictions. For example, an ANN, with many hidden layers and free parameters estimated by training and arbitrary fitting curves, converts inputs to outputs by simply minimizing error variances [39].
To date, in most GeoAI applications for hydrological studies, cause-effect inference has been limited because the multiple driving factors and the interactions between the variables and scales used are not explicitly represented in the models [50,123]. In addition, the internal hyper-parameter optimization of GeoAI and ML models is not explicitly stated in most modeling studies. For this reason, GeoAI methods are often called "black-box" models [236]. See Table 6 for the characteristics of physical-based and GeoAI hydrological models. Therefore, causal inferences might be questionable without robust assumptions and verification of the assumed data structure [237]. Thus, most GeoAI models are inductive approaches, mainly oriented toward operational prediction and forecasting work, such as early warning systems. Nevertheless, GeoAI models have the potential to reveal unknown associations and complex patterns of hydrological processes by integrating more high-dimensional and multi-source data than traditional methods. With proper model interpretation and explainability methods, GeoAI applications can also be extended to causal inference [236,237].
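One widely used model-agnostic explainability technique is permutation importance: each input is shuffled in turn, and the resulting loss of predictive accuracy indicates how much the model relies on it. The Python sketch below illustrates the idea; `black_box` is a hypothetical stand-in for a trained model, and the three inputs (rainfall, temperature, sensor ID) and their ranges are invented. Note that a high importance indicates reliance by the model, not a causal effect in the catchment.

```python
import random
from typing import Callable, List, Sequence


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def permutation_importance(model: Callable[[List[float]], float],
                           X: List[List[float]], y: Sequence[float],
                           n_repeats: int = 20, seed: int = 0) -> List[float]:
    """Mean increase in MSE when one input column is randomly shuffled."""
    rng = random.Random(seed)
    base = mse([model(row) for row in X], y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse([model(r) for r in shuffled], y) - base)
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row: List[float]) -> float:
    """Hypothetical trained model: strong rainfall effect, weak temperature
    effect, and no use of the third input (sensor ID)."""
    rainfall, temperature, _sensor_id = row
    return 0.8 * rainfall + 0.1 * temperature


data_rng = random.Random(1)
X = [[data_rng.uniform(0, 50), data_rng.uniform(0, 30), float(data_rng.randint(1, 5))]
     for _ in range(50)]
y = [black_box(row) for row in X]  # synthetic targets, for illustration only

importance = permutation_importance(black_box, X, y)
print([round(v, 3) for v in importance])
```

In this toy case, the ranking recovers how the model was constructed. With correlated inputs, as is common in hydrological data, shuffling one variable creates unrealistic input combinations, so the scores should be interpreted with care.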

Process-Based Model | GeoAI Model
Based on general physical laws. | A data-driven approach with inductive model building; may not fully respect physical laws.
All input variables and parameter ranges are well-structured and known. | Unstructured data; the role of some input variables in the model is not known, making the output less interpretable.
Limited to the current state-of-the-art. | Able to reveal unknown associations and patterns.
It is a deductive, hypothesis testing approach. It can be used for causal inference. | It is an inductive, exploratory approach. Its use in causal inference depends on the GeoAI model building and selected variables.
Non-uniqueness problem due to inverse modeling in model parameterization. | Non-uniqueness problem due to GeoAI hyper-parameter optimization.
Mostly deterministic; the system is represented by the average values of variables. | Deterministic or probabilistic, depending on the GeoAI method; variables can be treated as probabilistic.
Reductionist; considerable simplification of complex processes, which can result in prediction bias. | Integrative; GeoAI can integrate several types of observations and may be able to reveal patterns not represented by a physical-based model. Therefore, GeoAI prediction can be less biased.
It is assumed to be of general application. | It is assumed to be applied only within the range of the training data.
Fixed to the data requirements of the model basis, and unable to deal with multisource data. | Flexible to data input, from minimal input to big data. GeoAI can maximize the use of all types of available data, from different sources, types, and quality.
High computing demand for high-frequency and large-scale modeling. | High computing efficiency and suitable for high-frequency and large-scale modeling.
One-time calibration; once the model is calibrated, the parameters are usually fixed. | Continuous learning model; the model calibration is constantly updated with past and new data.
Well-defined framework for model performance, uncertainty, and error propagation evaluation. | Diverse and developing approaches for model performance, uncertainty, and error propagation evaluation.

Generalization of GeoAI Prediction
GeoAI models may only be applicable within their specific training data or calibrated ranges [238], unless the modeling scheme and variables used can be argued to be generally valid, e.g., representing general laws such as the conservation and momentum laws that govern natural processes [234]. GeoAI model generalization is also a challenging problem from the perspective of model performance assessment, since it depends on the model complexity, the variables, and the training dataset size. A very simple model cannot learn the problem being modeled (underfitting problem), whereas a highly complex model with a large dataset might overfit the training dataset (overfitting problem). Neither case generalizes to new datasets. Current GeoAI generalization approaches are based on finding an optimal tradeoff between training and validation accuracy, using regularization, weight decay, ensembles, and other approaches in the model training stage [40]. However, the decision boundary in complex models becomes sensitive to data size and outliers, model architecture, and hyper-parameter optimization. It has also been observed that different combinations of model architecture and hyper-parameters can produce similar model performance, leading to the non-uniqueness modeling problem [50,239].
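The tradeoff between training and validation accuracy can be illustrated with a minimal sketch. It does not reproduce any of the reviewed methods: a k-nearest-neighbour regressor is swapped in as a simple stand-in whose complexity is controlled by a single hyper-parameter k, and the data are synthetic (a noisy sine curve). All names (`knn_predict`, `mse`, `scores`) are hypothetical.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of the k-NN model (fitted on train) over data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


random.seed(0)
# Synthetic series (an assumption for illustration): y = sin(x) + noise.
points = [(i * 0.1, math.sin(i * 0.1) + random.gauss(0.0, 0.3)) for i in range(60)]
train, valid = points[::2], points[1::2]

# k = 1 memorises the training set (overfitting);
# k = len(train) always predicts the training mean (underfitting).
scores = {k: (mse(train, train, k), mse(train, valid, k))
          for k in range(1, len(train) + 1)}

# Pick the complexity with the lowest validation error.
best_k = min(scores, key=lambda k: scores[k][1])
for k in (1, best_k, len(train)):
    train_err, valid_err = scores[k]
    print(f"k={k:2d}  train MSE={train_err:.3f}  validation MSE={valid_err:.3f}")
```

The training error alone would always favour k = 1, which is why the selection is made on held-out data; regularization and weight decay play the analogous role in neural network training.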

GeoAI Data Requirement for Reliable Prediction
GeoAI models depend on the quality and quantity of the data. The amount of data they require depends on many factors, such as the complexity of the hydrological system and the applied GeoAI algorithm [47]. A complex system with more sophisticated GeoAI methods will demand a large and multidimensional dataset [42]. For example, deep/extreme learning algorithms usually demand large sample sizes to compute acceptable results [240]. Hydrological and geospatial data are currently increasing rapidly, fostered by the development of automatic monitoring systems and land surveying technologies. However, the data quantity (volume) and quality (veracity and value) vary; the data types are diverse (unstructured, structured, spatial, non-spatial, etc.), and the datasets usually come from different sources. Datasets with these characteristics are called big data [241,242] and require new and advanced methodologies to integrate them properly with GeoAI models.

GeoAI Capacity to Provide Novel Physical Insights
GeoAI data-driven research and data mining are increasingly used to gain information from data, elucidate system behavior, reveal new insights about system functioning, and detect changes in the system responses [17]. There are several examples of GeoAI applications in hydrological modeling [91,105,107,243]. Recent studies applying deep learning to rainfall-runoff simulation indicate that there is significantly more information in large-scale hydrological data sets than hydrologists have been able to translate into theory or models [129]. GeoAI has also revealed new hydrological patterns and trends, using heterogeneous data from different sources and of varying quality [244,245]. Therefore, novel data-driven modeling provides the potential to gain new information and knowledge and a better understanding of the hydrological system and its changes [129,235].

Toward Transdisciplinary GeoAI Research in Hydrological Modeling
Nowadays, earth science has mostly adopted GeoAI approaches developed in other fields, particularly computer science. GeoAI is also an active field of research in advanced hydrological modeling, providing new insights into hydrological system functioning and advantages in computational efficiency and prediction accuracy. Nevertheless, these benefits depend on how the user has set up the hydrological GeoAI model, the quantity and quality of the data, and the types and number of variables used.
GeoAI methods can be integrated with other data analysis techniques, e.g., Fourier and wavelet transformation, to remove noise and provide better hydrological feature extraction [30,198]. Hence, a transdisciplinary approach is needed to ensure insightful research on GeoAI applications in hydrological and fluvial studies [235]. This is particularly relevant as the complexity of GeoAI models is increasing continuously, and model parametrization and parallel computing solutions require expert knowledge for proper GeoAI technology adoption [18,240]. Similar issues arise when hydrological science principles are not explicitly integrated with the GeoAI data-driven models, resulting in limited explainability of the underlying physical laws that govern hydrological and hydraulic processes [50,129].
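As an illustration of wavelet-based noise removal before feature extraction, the following minimal sketch applies a full orthonormal Haar decomposition with soft thresholding to a synthetic series. It is not taken from the cited studies: the Haar wavelet, the universal threshold, the assumed known noise level `sigma`, and all function names are assumptions chosen to keep the example short and dependency-free.

```python
import math
import random


def haar_forward(signal):
    """Full orthonormal Haar decomposition; len(signal) must be a power of two."""
    approx, details, s = list(signal), [], math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / s for a, b in pairs])  # finest level first
        approx = [(a + b) / s for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest level to the finest."""
    current, s = list(approx), math.sqrt(2.0)
    for level in reversed(details):
        rebuilt = []
        for a, d in zip(current, level):
            rebuilt.extend([(a + d) / s, (a - d) / s])
        current = rebuilt
    return current


def soft_threshold(value, t):
    """Shrink a coefficient toward zero by t, zeroing it if |value| <= t."""
    return math.copysign(max(abs(value) - t, 0.0), value)


def haar_denoise(signal, threshold):
    """Shrink the detail coefficients and reconstruct the series."""
    approx, details = haar_forward(signal)
    shrunk = [[soft_threshold(d, threshold) for d in level] for level in details]
    return haar_inverse(approx, shrunk)


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


random.seed(42)
n, sigma = 64, 0.2  # sigma: assumed (known) noise standard deviation
clean = [1.0 if 20 <= i < 44 else 0.2 for i in range(n)]  # synthetic stage step
noisy = [v + random.gauss(0.0, sigma) for v in clean]
denoised = haar_denoise(noisy, sigma * math.sqrt(2.0 * math.log(n)))
print(f"RMSE noisy: {rmse(noisy, clean):.3f}  RMSE denoised: {rmse(denoised, clean):.3f}")
```

In practice the noise level is rarely known and would have to be estimated from the data, and smoother wavelet families are often preferred for continuous hydrological signals.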

Augmenting GeoAI Prediction Capability with Open Data and Crowdsourced Data
GeoAI models demand a large amount of training data. Although data collection technology has progressed substantially, only a few geographical areas or pilot hydrological systems are well equipped. For example, very few catchments have implemented IoT hydrological monitoring technology. GeoAI models will demand a rapid and massive increase in data collection. The current open-access policy of many governmental environmental agencies for climatological, hydrological, and environmental data enhances data-driven research and GeoAI applications, particularly in Western countries. Similarly, open access to high-resolution topographical and earth observation data (e.g., NASA and the ESA-EU Copernicus Programme) also accelerates the development of GeoAI-based hydrological models [241,246]. Additionally, the current trend of implementing open-access training libraries, e.g., training data for land cover classification, is valuable, but more specialized hydrogeomorphic labeled data are still under development.
Citizen science also plays a key role in complementing and increasing data collection worldwide. There are several examples of how hydrological crowdsourcing enhances hydrological data availability for scientific research, using images and social media data [247] and low-cost data loggers [248,249]. However, the success and quality of hydrological crowdsourcing are variable, depending on the region, the instrument used, and the variables reported [250]. GeoAI-based hydrological model development will benefit from crowdsourced data collection.

From Physical-Based and GeoAI Hybrid Models to Fully Integrated GeoAI-Physical-Based Models
Physical-based and GeoAI hydrological models have followed different paths of development. As discussed previously, a physical-based model is derived from empirical and experimental research, whereas a GeoAI model is derived from data science techniques.
Physical-based and GeoAI models are not complementary per se, but in many cases, the integration of both approaches has shown great potential to improve hydrological modeling [18,129]. Currently, there are different levels of integration; most are still so-called loosely integrated models, where the GeoAI and the physical-based models work independently. The GeoAI method is used for data preparation and for refining physical-based models, e.g., data fusion, ad hoc parameter optimization, and data assimilation. In some cases, the outputs of physical-based models are used to train GeoAI models [19,188,251]. Full GeoAI-physical integration is currently under development, either by embedding machine learning solutions into physical-based models or by developing physically guided GeoAI models; see, e.g., Hanson et al. [21]. Both approaches aim to overcome current GeoAI model limitations by providing more physical explanatory power, physically consistent and robust prediction, and a high level of generalization.
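A loosely integrated setup, in which a data-driven optimizer refines the parameters of an independent physical-based model, might look like the following minimal sketch. It is an illustration under strong assumptions rather than a method from the reviewed studies: a single linear reservoir stands in for the physical-based model, a plain grid search stands in for the genetic or swarm optimizers used in practice, and the rainfall, the "observed" discharge, and all names are synthetic or hypothetical.

```python
import random


def linear_reservoir(rain, k, storage=0.0):
    """Simulate discharge from one linear reservoir, Q_t = k * S_t."""
    flows = []
    for p in rain:
        storage += p        # rainfall fills the store
        q = k * storage     # outflow is proportional to storage
        storage -= q
        flows.append(q)
    return flows


def sum_squared_error(simulated, observed):
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(rain, observed, candidates):
    """Return the candidate k whose simulation best matches the observations."""
    return min(
        candidates,
        key=lambda k: sum_squared_error(linear_reservoir(rain, k), observed),
    )


random.seed(1)
# Synthetic forcing and observations (assumptions for illustration only).
rain = [random.choice([0.0, 0.0, 5.0, 12.0]) for _ in range(50)]
true_k = 0.3  # hypothetical parameter used to generate the observations
observed = [q + random.gauss(0.0, 0.1) for q in linear_reservoir(rain, true_k)]

grid = [i / 100 for i in range(1, 100)]  # candidate recession constants
best_k = calibrate(rain, observed, grid)
print(f"calibrated k = {best_k:.2f}")
```

The optimizer only sees the model through its simulated output, so the two components stay independent; replacing `calibrate` with a multi-objective algorithm would not require any change to the reservoir model.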

From Small-Scale to Global-Scale Hydrological Modeling
In recent years, substantial attention has been paid to large-scale and global-scale hydrological modeling [252][253][254]. Although only experimental catchments have sufficient data to perform a reliable hydrological prediction, the global availability of climatological, hydrological, and remote sensing data allows global-level hydrological models to be parameterized. This planet-wide dataset can only be handled thanks to combined advances in GeoAI applications and cloud computing, e.g., Google Earth Engine, CoLab, SEPAL [255], and many national high-performance computer clusters. Global-scale hydrological modeling still involves a high level of prediction uncertainty [256,257], but current progress in the development of physical-based GeoAI models and remote sensing data assimilation can improve global modeling accuracy.

Automation of Hydrological and Fluvial System Modeling
GeoAI applications are increasing the automation of hydrological prediction and forecasting [258]. Some hydrological models already apply internal self-calibration [259][260][261]. Similarly, there is also substantial progress in developing automated machine learning (autoML) that self-tunes the models' hyperparameters, e.g., autotune and AUTO-SKLEARN [262,263]. The hyperparameters drive both the efficiency of the model training process and the resulting model quality [262]. Therefore, a self-tuning module will support a more rapid adoption of GeoAI models in hydrological modeling, and the integration of physical-based and GeoAI models can improve autonomous hydrological prediction.
Similarly, self-supervised image classification, particularly that developed in the robotics field [264], is rapidly being adopted in hydrological studies, e.g., in satellite image classification, fluvial landform classification, and landform change detection. Self-supervised models use automatically generated pseudo-labels, significantly reducing manual labeling, one of the most time-consuming tasks in supervised classification [265]. Self-supervised image classification is enhanced by machine learning methods such as the autoencoder and the generative adversarial network (GAN). An autoencoder enhances image quality and reduces noise through dimension reduction while retaining latent features [266]. GAN is a promising technique to further automate high-dimensional image classification with limited training data. A GAN generates new data instances that resemble the existing training data through the competition between a generator and a discriminator [267]. Several examples show the advantages of incorporating GAN models in hydrological classification [267][268][269] or combining them with an autoencoder [270]. Integrating GAN with an LSTM network model [271][272][273] or combining GAN with an ANN fuzzy model [274] was also found to improve automated hydrological and weather prediction using satellite data.

GeoAI-Based Multi-Dimensional Geo-Visualization and Digital Twin
Hydrological systems are complex by nature, and conveying spatial and non-spatial hydrological information comprehensively and effectively has been challenging. The explosion of high-dimensional, multi-source, spatio-temporal hydrological data demands new ways of multi-dimensional geo-visualization [275]. The GeoAI model optimizes the transformation of multi-dimensional data not only into conventional 3D geo-visualization (x, y, and z features), but also into 4D (including the temporal dimension) and 5D geo-visualization (including geographical scale). The 4D and 5D visualization is crucial for dynamic and interactive web-based geo-visualization [276]. GeoAI also supports building hydrological digital twins, integrating IoT sensors and multi-scale satellite and close-range remote sensing data with web-based hydrological GeoAI models for real-time prediction and geo-visualization. A 'digital twin' is a comprehensive digital emulator of the real-world system that aims to optimize the design and operations of complex processes through a highly interconnected workflow [277]. Hydrological digital twins support the correct implementation of IWRM actions, including natural disaster response, nexus approaches, and adaptation to climate change. Those actions require approaches underpinned by a deeper analysis of river basin system functioning, scaling-up of field-based knowledge, and new digital solutions to provide real-time, high-resolution information [278]. Additionally, the advance in web-mapping services (WMS) and mobile app development with interactive geo-visualization [279] enhances the dissemination of hydrological information to decision-makers and stakeholders and supports general public engagement.

Conclusions
GeoAI applications in integrated hydrological and fluvial system modeling have steadily increased in recent years. We found plenty of GeoAI applications in hydrological and fluvial studies. The main applications were assessing GeoAI hydrological prediction and classification performance, comparing GeoAI methods with hydrological physical-based models, and integrating physical-based models with GeoAI. A wide range of GeoAI methods are currently applied in this field, e.g., RF, SVM, ANN, LSTM, GAN, GA, and meta-heuristic algorithms. The selection of a particular algorithm depends on the application objective, data availability, and user expertise.
Overall, GeoAI applications showed advantages in non-linear modeling, computational efficiency, integration of heterogeneous data sources, high-accuracy prediction, and the unraveling of new hydrological patterns or the detection of changes using high-dimensional, multi-source geospatial data. GeoAI methods seem particularly relevant for complex systems and large geographical-scale modeling. A significant disadvantage of GeoAI models is their low level of physical interpretability, explainability, and model generalization. Therefore, current research trends focus on integrating physical-based models with GeoAI methods to bridge data-driven and theory-driven knowledge generation. Several levels of model integration exist, but a full physical-based GeoAI model is still under development. GeoAI models have shown high potential for autonomous hydrological prediction and forecasting and for early warning systems.

Figure 1. The yearly number of publications found in Web of Science (2000-2021) on GeoAI and machine learning applications in the different hydrological subfields.

Figure 2. A GeoAI application model for a smart decision support system for integrated water resources management (IWRM). (1) Internet of things (IoT) supports real-time, high-frequency hydrological monitoring. The data are stored in a cloud platform and accessed by an application programming interface (API). These data can be used for the real-time identification of problems in the system, e.g., a river basin. (2) GeoAI provides data analytics and online real-time modeling tools for hydrological system analysis and prediction. (3) GeoAI also supports multi-objective, multi-scenario optimization modeling, which in turn is the basis of smart decision support systems for IWRM. (4) Geovisualization in web mapping and mobile apps can be used for data dissemination, stakeholder engagement, and implementing early warning systems. The smart IWRM system can be closed with the evaluation and adjustment of the IWRM plan and the improvement of the hydrological monitoring system. WQ (automatic water quality monitoring), ADCP (acoustic Doppler current profiler for water current velocity measurement and river bathymetry), GW (automatic groundwater monitoring in wells), UAV (unmanned aerial vehicle for very high-resolution land cover mapping and surface elevation models), EOS (earth observation system for environmental condition monitoring), LiDAR (LiDAR survey of high-resolution topography data), and GNSS (use of global navigation satellite systems for ground truth data collection).
To delineate valley bottom extent across large catchments and automatically classify valley bottom segments of variable length by using DEM-based derivatives from Richmond River, Australia.Advantages: The k-means successfully clustered the entire river network into 6 valley bottom segments of varying length.Limitations: The resulting cluster is unlabeled and needs expert recognition.

Table 1. General classification of geospatial artificial intelligence (GeoAI) and machine learning methods.

Table 2. Selected GeoAI applications in hydrological and hydraulic modeling.

Table 3. Selected GeoAI applications for model calibration and decision support systems.
Objective: To compare optimization techniques to calibrate conceptual hydrological models. Advantages: All techniques performed well, with better results than when using a single-objective algorithm. The NSGA-II with two indicators performed better than the MPSO with one indicator.

Table 4. Selected GeoAI applications in monitoring and spatio-temporal prediction of water quality.

and Software Objective, Advantages, and Limitations Reference
IBK algorithm can also be used within a low-cost system to allow incorporation into IoT-based WQ systems. Limitations: Overloading of servers may occur in IoT-based systems if prediction algorithms run on the cloud for a large number of sensing nodes.
Advantages: The best model performance depended on the predicted WQ variable. CEEMDAN-RF and CEEMDAN-XGBoost show better performance, fewer errors, and higher stability than simple RF and XGBoost. A new error metric is introduced for model performance evaluation and compared with conventional methods of model evaluation. Limitations: The prediction model only depends on time-series data, and no other explanatory variables were included.
Advantages: RF outperformed linear models when more than one predictor was included. Limitations: At least 3 predictors are required to identify a clear benefit of using RF compared to multiple linear regressions.
Advantages: LS-SVM and MARS models performed better in terms of external validation criteria and F test compared with multiple-regression-based models and ANN and ANFIS equations. Limitations: Intensive data collection is required for a wide variety of parameters.

Table 5. Selected GeoAI applications in fluvial geomorphic and morphodynamic mapping.

and Software Objective, Advantages and Limitations Reference
Objective: To predict vegetation, bare sediment, and water bodies at a subpixel scale with Sentinel-2 images, trained with high-resolution UAV images. Study areas were the Sesia, Po, and Paglia Rivers in Italy. Advantages: Fuzzy-CNN models were successfully used to provide continuous and crisp subpixel classification of Sentinel-2 imagery. The model was transferable to satellite images with different acquisition times. It can be used for annual change detection. [206]

Table 6. Characteristics of physical-based and GeoAI hydrological models.